Results 1 to 15 of 15

Thread: Database Tutorial - Part 1

  1. #1
    Join Date
    Nov 2010
    Location
    Australia
    Posts
    1,472
    Mentioned
    0 Post(s)
    Quoted
    8 Post(s)

    Lightbulb Database Tutorial - Part 1

    Database tutorial - Part 1
    Date: 08/NOV/2012

    Index:
    Basics
    Constraints
    SQL DDL
    SQL DML
    Queries and Aggregate functions

  2. #2
    Join Date
    Nov 2010
    Location
    Australia
    Posts
    1,472
    Mentioned
    0 Post(s)
    Quoted
    8 Post(s)

    Default

    Welcome to the first ever database tutorial on villavu which also happens to be my first ever tutorial here. So lets get right into it!

    What is a database you ask?
    Well a database is basically a “structured collection of data” that is related together. e.g. Data about a particular country and its weather over a period of time, or data about list of all items available in a game(RS :P) etc.


    The data can be stored in lots of different ways, such as inside filing cabinet as folders,
    comma separated text files(name, age, phone...etc.) also known as flat files or we can also store it as bunch of related tables. This tutorial will concentrate on databases that store data in relational tables(on a computer) a.k.a relational databases.

    Everything in a relational database is stored in a table, and each table can have more than one column and more than one row and there can be as many tables as the database will let you.

    Each table has a name(eg: table_employee) and it is identified by its name. Therefore the table name must be unique within the database i.e. no two tables can have the same name.

    Each column also has a type(similar to pascal types: integer, strings, decimals etc)and a name which must be unique within that table. So you can have a table called table_employee with columns: name, age, id and then another table called table_item with columns: name, weight, id. But you cannot have a table with the following columns: name, name, weight, id because there are two columns called “name”.

    Each table can also have many rows. Similar to tables and columns which have a unique name, rows may also have a unique identifier that separates them from the rest of the rows in a table. This identifier is called a primary key. A primary key is just a column in a table and it can be either a number or a string or a combination of both and it must not be empty. A primary key can also consist of multiple columns within the same table.

    Why do we need a primary key?
    We need a primary key because we can have two rows with the same data and the primary key helps distinguish between them. e.g. two people named Fred smith who are the same age:

    Attachment 17726

    Lets say that a Fred Smith has left the office and we have to remove his details from the database. Now, which one do we remove, the first or the second? Primary keys help solve this problem because each and every row in table is different.

    In the following employee table, the column “ID” is the primary key and it helps us in determining which Fred Smith is leaving. We can just ask his ID(assuming he says the truth ) and delete the row that corresponds to that ID.

    Attachment 17728

    If you are having difficulty imagining tables and primary keys, think of it this way:
    We have a type called TEmployee
    Simba Code:
    type
      TEmployee= record
        ID: Integer;
        Name: String;
        Age: Integer;

    The table employee can be thought of as an array of TEmployee. And each row represents an item in that array. Emp_Array[0] = Employee(319, Fred Smith, 24); But instead of using square brackets to get the item, we use the primary key to get the row in a table. And every time you add a row the table is automatically made bigger to accommodate the new row.

    You might have noticed that I have a “Salarytype” column in the employee table. You might be asking, what the hell does TS1 or PG2 or AW1 mean. That's because you didn't see the Salary table which is related to the employee table.

    Attachment 17729

    ID is also the primary key for Salary table. (Remember, we can have same column names as long as they are both in different tables).

    Attachment 17730

    How are they related you ask?
    Well the column SalaryType in the Employee table is a foreign key. Unlike primary keys, it is alright to have two foreign keys that are the same. However, the foreign key must match the primary key in the other table OR it can be empty(maybe we still haven't decided how to pay the new person?). e.g. It is ok to have two rows with salary type TS1, or a row with an empty salarytype but we cannot have a row with salarytype of TS3, because it does not exist in the salary table.

    A foreign key in a table points to a primary key in another table. Which means that you look up TS1 in the Salary table and you find out that both the Fred Smith's are paid $240(30 per hour x 8 hours) every day and the money is deposited in their bank at the end of the day, while Jane Doe is also paid $30 per hour but the money is deposited in her bank at the end of the week.

  3. #3
    Join Date
    Nov 2010
    Location
    Australia
    Posts
    1,472
    Mentioned
    0 Post(s)
    Quoted
    8 Post(s)

    Default Constraints

    Cardinality constraint
    Participation constraint
    Primary key constraint
    Foreign key constraint
    Other constraints
    Null constraint
    Unique constraint
    Check constraint


    There are two main types of constraints, column and table. Column constraints only apply to a single column in a table. While table column apply to more than one column in a table. They do not apply for the table!
    The scope of the constraint depends on where the constraint is used. If a column constraint is needed, it must follow the column name, or else the constraint must be placed at the end of the table.

    Cardinality constraints tell us how many foreign keys we can have for a particular primary key in a table. And it is always one to one (1:1) or one to many (1:N).

    1:1 – It means that in a given table, we can only have one unique foreign key. e.g. Only one person can travel with one ticket. Here the person is the foreign key while the ticket number is the primary key.

    1:N – There can be either none, one, or many foreign keys. In the above employee table, the cardinality constraint is 1:N because we have two foreign keys that are the same.

    Participation constraints tell us whether the foreign key is compulsory or not. e.g. Once we buy a ticket to a movie, we may watch it or not. It is not compulsory to go, just because we have a ticket.

    e.g. People must drive a car on the left hand side of the road(in most parts of the world) because its a legal obligation. There is no choice! We cannot choose which side to drive on.


    Primary key constraints: When we declare a column to be a primary key, the database software automatically checks the new rows to see that they are all different. If the new row contains a duplicate primary key, then we get an duplicate primary key error.

    Foreign key constraints: When we declare a column to be a foreign key, we must also specify the table in which the primary key exists. When a new row is inserted or an existing row is updated, the database software automatically checks to make sure that the foreign key exists in the primary key table and it also checks if the foreign key can be empty or not.


    Other constraints

    Null constraints: It applies to a single column and it determines whether we can input an empty value or not. In the employee table, the salary type column allows empty(or null) values. If we wanted to force the user to input a value, then we would declare the column as not null. By default, when you create a column it accepts null value unless otherwise specified using "Not null" modifier.

    Unique constraints: Applies to a single column and similar to primary key, it makes sure that all the values in a column are different. But unlike primary key, this column is not used to differentiate between two rows. By default, a column will accept duplicate values, unless we specify other wise with the "Unique" modifier.

    Check constraints: Also applies to a single colum These are like if statements, and they check to make sure the new value is within a given limit. e.g. We put a check constraint for the amount column in Salary table saying that amount must be less than or equal to 50. If we try to enter a number greater than 50, say 55, we get an error saying that the value must be less than or equal to 50.

  4. #4
    Join Date
    Nov 2010
    Location
    Australia
    Posts
    1,472
    Mentioned
    0 Post(s)
    Quoted
    8 Post(s)

    Default

    Now that you know the theoretical part about databases, its time to learn SQL syntax. SQL – Structured Query Language is used by all database management software(oracle, mysql, ms access, libre office base, sqlite, postgresql etc). Its like a programming language but only for databases. Regardless of the database software you use, they will all use the basic SQL statements. For the purpose of this tutorial, we shall use sqlite3 database which can be downloaded from here. It is a command line utility and will print results on screen. If you want a gui, you can download sqlite browser from here.

    SQL statements are generally considered to be in two groups:
    DDL – Data definition language: These statements affect the structure of the database.
    e.g. Create table, alter table, drop table

    DML – Data Manipulation language: These statements modify the data.
    e.g. Insert, update, delete

    Every single sql statement is terminated using semi-colon, just like pascal, and indentation of code is not compulsory, just like pascal. In this post, we will deal with DDL statements.

    The syntax to create a table is as follows:
    Code:
    Create Table <table-name> (
    <Column-name1> <data-type> [(max length)]  [constraint],
    <Column-name2> <data-type> [(max length)] [constraint],
    --…add as many columns as you need
    [Constraint <constraint-name> constraint type,]
    [Foreign Key (<column-name>) References (<table-name>)] );
    data-type depends on the database that you are using. For sqlite, the following datatypes are valid:
    NULL. The value is a NULL value.
    INTEGER. The value is a signed integer, stored in 1, 2, 3, 4, 6, or 8 bytes depending on the magnitude of the value.
    REAL. The value is a floating point value, stored as an 8-byte IEEE floating point number.
    TEXT. The value is a text string, stored using the database encoding (UTF-8, UTF-16BE or UTF-16LE).
    BLOB. The value is a blob of data, stored exactly as it was input.
    If you have previous knowledge of sql, you will know that in other database software, there are types such as boolean, datetime, varchar2 etc. You can also use these types as column types, but sqlite will convert them to the nearest storage type. e.g. datetime will be converted into TEXT type and booleans will be converted into 0's and 1's.

    In order to create the salary table above, we would do this:
    Code:
    Create table Salary(
      ID Text Primary Key,
      Frequency Text ,
      Amount Integer  Constraint colamt_tabSalary Check(Amount >0));
    And to create the employee table we would do this:
    Code:
    Create table Employee(
      ID Integer  Primary Key,
      Name Text not null,
      Age Integer ,
      Salarytype Text,
      Foreign Key (Salarytype) References Salary);
    The order of creating the tables is important because the table Employee references table salary. If we created table Employee first, we would get an error because the table Salary has not been created yet.

    You might have noticed that there were two places that constraints could be added. One at the end of declaring a column, and another at the end of the table. If you want the constraint to apply to only one column, then you can use it after declaring the column. But if you want the constraint to apply to more than one column, then you put it at the end of the table after declaring all the columns.

    What if you made a mistake in creating the table and you want to delete the table, you would use drop table command.

    Syntax:
    Code:
    Drop table <table-name>;
    It will delete the entire table and the data it contains! So be careful if you don't want that. Lets say that you don't want to delete the entire table just because you forgot to add a column. You can do so using the alter table command, which is as follows:

    Code:
    ALTER TABLE <tablename>
    [ ADD <column name> <type> [<length>] [null/not null] ]
    [ MODIFY <column name> <type> [<length>] [null/not null] ]
    [ DROP <column name> ]
    [ ADD <constraint> ]
    [ DROP <constraint> ];
    So if we wanted to add a column called phone_number to our employee table, we would just need to do:
    Code:
    Alter table employee 
    add (ph_num Integer) not null;

  5. #5
    Join Date
    Nov 2010
    Location
    Australia
    Posts
    1,472
    Mentioned
    0 Post(s)
    Quoted
    8 Post(s)

    Default

    DML statements
    Now that you know how to create the structures in a database, its time to populate them with data! We do this using the Insert statement, and the syntax is like this:
    Code:
    INSERT INTO <table-name> [(<column-name1> , <column-name2> , ... )]
    VALUES ( <value1> , <value2> , ... );
    The columns and the values must match the order. Meaning, value1 will be stored in column1, and value2 will be stored in column2. The column names are optional, and need not be written. But if we don't write the column name, we won't know what the value specifies just by looking at it.

    First, lets add rows inside the salary table, like so:
    Code:
    Insert into Salary(ID, Frequency, Amount) Values('TS1', 'Daily', 30);
    Insert into Salary(ID, Frequency, Amount) Values('PG2', 'Weekly', 30);
    Insert into Salary(ID, Frequency, Amount) Values('AW1', 'Fortnightly', 41);
    So in order to add the 5 people to our employee table, we have to write the insert statement 5 times. And No there is no other way

    Code:
    Insert Into Employee(ID, name, Age, SalaryType) Values(319, 'Fred Smith', 24, 'TS1');
    Insert Into Employee(ID, name, Age, SalaryType) Values(476, 'Fred Smith', 24, 'TS1');
    Insert Into Employee(ID, name, Age, SalaryType) Values(503, 'Jane Doe', 34, 'PG2');
    Insert Into Employee(ID, name, Age, SalaryType) Values(543, 'John Pitt', 42, 'AW1');
    Insert Into Employee(ID, name, Age, SalaryType) Values(677, 'Carol Song', 50, null);

    Select Statement

    The syntax for select statement is actually quite big, so I have only used the most frequently used ones
    Code:
    Select <item> 
    From <table-name>
    [inner/left outer/right outer join <table-name2> on <condition>]
    [Where <condition>]
    [Order by <column> [asc/desc]];
    So if we wanted to select all the employees from employee table, we could do:
    Code:
    Select * 
    From Employee;
    The * returns every column from the table. But if we only wanted the names and ID's of the employees, then we should do:
    Code:
    Select ID, name
    from employee
    order by ID; --default is ascending order
    Lets say that we also want to get the full employee information, including the salary they get paid. How do we do that? Well we use the inner join clause:
    Code:
    Select E.ID, E.name, S.Amount   --this means get the column ID from table E
    From Employee E  --we can now refer to table Employee as E, so we have to  type less :D
    Inner join Salary S On E.SalaryType = S.ID 
    --what this does, is it joins the two tables based on the primary key-foreign key relationship and only returns rows whose foreign key has a corresponding value in the primary key table. 
    Order by 1; --order by the first column :D
    If you actually execute that statement, you will realise that it only returns 4 rows. So what happened to the other row? Why didn't we get that too? Its because of the PK-FK relationship. Employee Carol Song had a null salary type, and therefore it did not have any corresponding value in the salary table. So how do we get Carol's information as well? Well for that, we need to use outer joins:

    Code:
    Select E.ID, E.name, S.Amount 
    From Employee E  
    Left outer join Salary S On E.SalaryType = S.ID 
    --what this does, is it joins the two tables based on the primary key-foreign key relationship and it will also return every single row from the left table(Employee) even if it cannot find the primary key.
    Order by 1;
    The following statement is also valid and will have the same effect:
    Code:
    Select E.ID, E.name, S.Amount 
    From Salary S 
    Right outer join Employee E On E.SalaryType = S.ID 
    --in this case, the right outer join works on the right side of the table, which is Employee table.
    Order by 1;
    Update Statement
    What happens if you want to change the salary type of Carol song? You could either delete the row, and add it again like so:
    Code:
    Delete from Employee
    where name = 'Carol Song';
    
    Insert Into Employee(ID, name, Age, SalaryType) Values(677, 'Carol Song', 50, TS1);
    The general syntax for delete statement is this:
    Code:
    DELETE FROM <tablename>
    [WHERE <condition>];
    It will delete all the rows from the table where the condition evaluates to true. If you don't put a where clause, it will delete all the rows from the table.

    But what if we don't want to delete the row? Well, we can use the Update statement, and it works like this:
    Code:
    UPDATE <tablename>
    SET <columnname1>=<value1>, 
    [<columnname2>=<value2>,]
    --add as many columns as you want
    [WHERE <condition>];
    What it does is that it updates the table called 'tablename' and sets the value of a single column to the corresponding value. If we omit the where clause, it will change the value of all the rows for that column. So if we wanted to update Carol Song's salary instead of deleting, we would do:
    Code:
    Update Employee
    set Salarytype = 'TS1'
    where name = 'Carol Song';

  6. #6
    Join Date
    Nov 2010
    Location
    Australia
    Posts
    1,472
    Mentioned
    0 Post(s)
    Quoted
    8 Post(s)

    Default

    Queries
    What are queries? Well simply put, they are select, update and delete statements. But most of the time, we only use select statements and that too only to produce a detailed report. e.g. If we wanted to find out the number of employees we currently have, we could do:
    Code:
    Select Count(*) From Employees;
    The count function takes a column name and returns every row which satisfies the criteria. So if we want to know how many employees do not yet have a salarytype, we could do:
    Code:
    Select count(*) - count(salarytype) From employees;
    What it does is it gets all the rows, then removes the rows which have a salary type which leaves us with the rows that do not have a salary type.

    What if we wanted to find the details of the youngest and oldest person? Well we use the min and max functions, like so:
    Code:
    Select *, 'young' from Employee where age = (select min(age) from employee)
    union
    Select *, 'old' from Employee where age = (select max(age) from employee);
    The select min/max(age) from employee returns a single value, not a row, but a number. And the union function combines the results of the two select queries and returns them together. When you see the result you can see that 'young' and 'old' are printed at the end of the rows. When we wrote 'young' in the query, we actually created an extra column which has the same value for every row in the result. But if you look at the employee table, you will see that the table is not modified, only the result is modified.

    What if you wanted to know the average age of your employees? Well you use the avg function:
    Code:
    select avg(age) from employee;
    That will print out a single number.

  7. #7
    Join Date
    Dec 2007
    Posts
    2,112
    Mentioned
    71 Post(s)
    Quoted
    580 Post(s)

    Default

    looking forward to reading this, but why didnt you do part one first and then make the reserve posts

  8. #8
    Join Date
    Dec 2011
    Location
    Nj
    Posts
    2,341
    Mentioned
    1 Post(s)
    Quoted
    18 Post(s)

    Default

    Definitely gonna need this! Props on the thread BTW!

    For the basics of the basics of pascal, try my TuT. ||Photoshop Editing ||MapleResourceDung Script || Book a flight! BuySellTrip

  9. #9
    Join Date
    Nov 2010
    Location
    Australia
    Posts
    1,472
    Mentioned
    0 Post(s)
    Quoted
    8 Post(s)

    Default

    Quote Originally Posted by Kasi View Post
    looking forward to reading this, but why didnt you do part one first and then make the reserve posts
    already made part 1 and 2, i was just making them look good. Lots more to come within the next couple of days.

  10. #10
    Join Date
    Mar 2007
    Posts
    5,125
    Mentioned
    275 Post(s)
    Quoted
    901 Post(s)

    Default

    Looks good so far, can't wait for it to be finished

    Forum account issues? Please send me a PM

  11. #11
    Join Date
    Nov 2010
    Location
    Australia
    Posts
    1,472
    Mentioned
    0 Post(s)
    Quoted
    8 Post(s)

    Default

    Updated the rest of the posts, any feedback is appreciated

  12. #12
    Join Date
    Dec 2007
    Posts
    2,112
    Mentioned
    71 Post(s)
    Quoted
    580 Post(s)

    Default

    Repped , you sir have gained a spot on my bookmarks.

  13. #13
    Join Date
    Nov 2010
    Location
    Australia
    Posts
    1,472
    Mentioned
    0 Post(s)
    Quoted
    8 Post(s)

    Default

    Quote Originally Posted by Kasi View Post
    Repped , you sir have gained a spot on my bookmarks.
    Thanks! And make sure to keep some free space in your bookmarks as i'm going to make an advanced database tutorial soon

  14. #14
    Join Date
    Dec 2007
    Posts
    2,112
    Mentioned
    71 Post(s)
    Quoted
    580 Post(s)

    Default

    Quote Originally Posted by KingKong View Post
    Thanks! And make sure to keep some free space in your bookmarks as i'm going to make an advanced database tutorial soon
    :P i have a folder for tutorials

  15. #15
    Join Date
    Aug 2007
    Location
    Where do you live?
    Posts
    934
    Mentioned
    0 Post(s)
    Quoted
    1 Post(s)

    Default

    Quote Originally Posted by KingKong View Post
    Thanks! And make sure to keep some free space in your bookmarks as i'm going to make an advanced database tutorial soon
    Looking forward to the advanced one, learned a little about databases in college, forgot most of it but it's all coming back as I read more of this tut

    Good work, very well written

Thread Information

Users Browsing this Thread

There are currently 1 users browsing this thread. (0 members and 1 guests)

Posting Permissions

  • You may not post new threads
  • You may not post replies
  • You may not post attachments
  • You may not edit your posts
  •