Wednesday, July 4, 2012

Databases

The first and foremost question:

What is a database?

        A database is a storage entity in short. It is a place where a large amount of data is organized and stored in an orderly manner. It is a storage area basically and in order to access or update this storage area, we use something called a 'Database Management System' or in short DBMS. Imagine a database as a black box that stores and DBMS as a means to talk to the black box.


What is DBMS then?

       DBMS is a software that stores the data into a database and then retrieves it. There are many types of DBMS like relational DBMS, transactional DBMS, etc. But the main idea of a DBMS remains the same i.e it controls the in/out of the databases. There are many DBMS vendors to choose from like oracle, mysql, sql server, etc, each with their own versions of reliability, security and performances. 

What is SQL then?

       A structured query language is a mechanism of talking to a DBMS. Almost every DBMS vendor uses the same SQL syntax. Which means, once you learn SQL, you will be able to communicate with almost every DBMS in use. Indirectly you will be able to talk to any database. Learning SQL is crucial for managing a database.
      SQL comprises of only a few commands. The key to master SQL is to understand each of the SQL commands and use them logically. It is that simple.
There are four kinds of commands:

  1. Data Definition language commands(DDL)
  2. Data manipulation language commands(DML)
  3. Data control language commands(DCL)
  4. Transaction control language commands(TCL)
        DDL and DML are the most frequently used commands in SQL. You are less likely to use DCL or TCL. Moreover, DCL and TCL commands do not get involved in logic usually. Even DDL has no logic involved. Here we shall discuss about DML in detail and others in brief.

DDL

         These commands structure the data. They involve commands that create, alter or delete the objects in a database. Database objects may be anything , a table, a view, or a database.
CREATE: creates objects.
ALTER: changes the structure of objects like changing the columns, their type, etc
RENAME: renames the database objects
DROP: drops or deletes objects
TRUNCATE: empties the content of objects

DCL

      These commands deal with the access grants and revokes.
GRANT: grants access on an object to a user(s)
REVOKE: revokes access on an object from a user(s).

TCL

     TCL commands manage the actions of DML commands like saving the changes to a database, reverting the changes, etc.
COMMIT: saving the changes to a database
ROLLBACK: undoing or reverting the changes made to the database
SAVEPOINT: points to a state in a transaction where the ROLLBACK can be used to revert to
SET TRANSACTION: this manages the transaction options like isolation options

DML

     These commands are the most frequently used ones. They manage the actual data in the database.
SELECT: this is used to retrieve the data from the database object
INSERT: used to add one or more rows to the database object
UPDATE: used to update or change the already added rows
DELETE: used to delete the rows

The syntax of DML is as follows:
<DML command> <column name[,column names list]> 
FROM <table object>[<table object> JOIN <table object> ON <condition>]
[WHERE<condition>]
[ORDER BY column name<sorting order>[,column name<sorting order>]]
[GROUP BY column name<grouping order>[[,column name<grouping order>]
HAVING <condition>]

As an example,
SELECT a, b, c FROM abcWHERE a>10 and b<20ORDER BY c desc, b ascGROUP BY c HAVING c=3


Phew!  That was hard, to gather up that syntax. Oh, by the way, SQL is case insensitive. I only used caps to identify the syntax from the rest. Using DML is easy, you just have to learn how filters are used along with DML commands. Anything that manipulates the normal behavior of DML commands are filters.

Let us start with JOIN.
I have already made two tables named t1 and t2.

mysql> select * from t1;
+------+------+
| col1 | col2 |
+------+------+
|    1 |  100 |
|    1 |  101 |
|    1 |  102 |
|    2 |  103 |
|    2 |  104 |
|    3 |  105 |
|    3 |  106 |
|    4 |  107 |
|    4 |  108 |
|    5 |  109 |
|    5 |  110 |
|    6 |  111 |
|    6 |  112 |
+------+------+
13 rows in set (0.00 sec) 

mysql> select * from t2;
+------+------+
| col2 | col3 |
+------+------+
|  100 | a    |
|  101 | b    |
|  102 | c    |
|  103 | d    |
|  104 | e    |
|  105 | f    |
|  106 | g    |
|  107 | h    |
|  108 | i    |
|  109 | j    |
|  110 | k    |
|  111 | l    |
|  112 | m    |
+------+------+
13 rows in set (0.00 sec)
Now lets try JOIN.
        mysql> select * from t1 join t2 on t1.col2=t2.col2  

+------+------+------+------+
| col1 | col2 | col2 | col3 |
+------+------+------+------+
|    1 |  100 |  100 | a    |
|    1 |  101 |  101 | b    |
|    1 |  102 |  102 | c    |
|    2 |  103 |  103 | d    |
|    2 |  104 |  104 | e    |
|    3 |  105 |  105 | f    |
|    3 |  106 |  106 | g    |
|    4 |  107 |  107 | h    |
|    4 |  108 |  108 | i    |
|    5 |  109 |  109 | j    |
|    5 |  110 |  110 | k    |
|    6 |  111 |  111 | l    |
|    6 |  112 |  112 | m    |
+------+------+------+------+
13 rows in set (0.02 sec)
How about we delete the col2 values of rows that contain 'b' and 'c'  in col3 of table t2?

mysql> update t2 set col2=null where col3 in ('b','c');
Query OK, 2 rows affected (0.05 sec)Rows matched: 2  Changed: 2  Warnings: 0

mysql>  select * from t1 join t2 on t1.col2=t2.col2 ;
+------+------+------+------+
| col1 | col2 | col2 | col3 |
+------+------+------+------+
|    1 |  100 |  100 | a    |
|    1 |  101 |  101 | NULL |
|    1 |  102 |  102 | NULL |
|    2 |  103 |  103 | d    |
|    2 |  104 |  104 | e    |
|    3 |  105 |  105 | f    |
|    3 |  106 |  106 | g    |
|    4 |  107 |  107 | h    |
|    4 |  108 |  108 | i    |
|    5 |  109 |  109 | j    |
|    5 |  110 |  110 | k    |
|    6 |  111 |  111 | l    |
|    6 |  112 |  112 | m    |
+------+------+------+------+
13 rows in set (0.00 sec)

Notice that the rows containing 'b' and 'c' are missing even though the rows exist in table t2. Let me show you that.

mysql> select * from t2;
+------+------+
| col2 | col3 |
+------+------+
|  100 | a    |
|  101 | NULL |
|  102 | NULL |
|  103 | d    |
|  104 | e    |
|  105 | f    |
|  106 | g    |
|  107 | h    |
|  108 | i    |
|  109 | j    |
|  110 | k    |
|  111 | l    |
|  112 | m    |
+------+------+

The JOIN we just used is INNER JOIN by default. There are other joins too. However, lets verify whether if the type of join is by default INNER JOIN if it the type is not specified.

mysql>  select * from t1 inner join t2 on t1.col2=t2.col2 ;
+------+------+------+------+
| col1 | col2 | col2 | col3 |
+------+------+------+------+
|    1 |  100 |  100 | a    |
|    2 |  103 |  103 | d    |
|    2 |  104 |  104 | e    |
|    3 |  105 |  105 | f    |
|    3 |  106 |  106 | g    |
|    4 |  107 |  107 | h    |
|    4 |  108 |  108 | i    |
|    5 |  109 |  109 | j    |
|    5 |  110 |  110 | k    |
|    6 |  111 |  111 | l    |
|    6 |  112 |  112 | m    |
+------+------+------+------+
11 rows in set (0.00 sec)
Yep, it is the same thing.
What about the other types of joins?
LEFT JOIN. It considers the values of the columns of the left table and compares it with the values in columns of the right side table of the JOIN. There by making a list of entries of left table and attaching the right side table to it.

mysql>  select * from t1 left join t2 on t1.col2=t2.col2 ;
+------+------+------+------+
| col1 | col2 | col2 | col3 |
+------+------+------+------+
|    1 |  100 |  100 | a    |
|    1 |  101 | NULL | NULL |
|    1 |  102 | NULL | NULL |
|    2 |  103 |  103 | d    |
|    2 |  104 |  104 | e    |
|    3 |  105 |  105 | f    |
|    3 |  106 |  106 | g    |
|    4 |  107 |  107 | h    |
|    4 |  108 |  108 | i    |
|    5 |  109 |  109 | j    |
|    5 |  110 |  110 | k    |
|    6 |  111 |  111 | l    |
|    6 |  112 |  112 | m    |
+------+------+------+------+
13 rows in set (0.00 sec)
 RIGHT JOIN. same story, from the right side.
mysql>  select * from t1 right join t2 on t1.col2=t2.col2 ;
+------+------+------+------+
| col1 | col2 | col2 | col3 |
+------+------+------+------+
|    1 |  100 |  100 | a    |
|    2 |  103 |  103 | d    |
|    2 |  104 |  104 | e    |
|    3 |  105 |  105 | f    |
|    3 |  106 |  106 | g    |
|    4 |  107 |  107 | h    |
|    4 |  108 |  108 | i    |
|    5 |  109 |  109 | j    |
|    5 |  110 |  110 | k    |
|    6 |  111 |  111 | l    |
|    6 |  112 |  112 | m    |
| NULL | NULL | NULL | b    |
| NULL | NULL | NULL | c    |
+------+------+------+------+
13 rows in set (0.00 sec)

Let us make a few modifications to learn other JOINS.

mysql> update t1 set col2=null where col2 in (106,107);
Query OK, 2 rows affected (0.03 sec)
Rows matched: 2  Changed: 2  Warnings: 0

mysql> update t1 set col2=108 where col1 in (4,5);
Query OK, 3 rows affected (0.06 sec)
Rows matched: 4  Changed: 3  Warnings: 0

mysql> select * from t1;
+------+------+
| col1 | col2 |
+------+------+
|    1 |  100 |
|    1 |  101 |
|    1 |  102 |
|    2 |  103 |
|    2 |  104 |
|    3 |  105 |
|    3 | NULL |
|    4 |  108 |
|    4 |  108 |
|    5 |  108 |
|    5 |  108 |
|    6 |  111 |
|    6 |  112 |
+------+------+
13 rows in set (0.00 sec)
CROSS JOIN. This is both the LEFT JOIN and RIGHT JOIN combined. Compares the values of each table to the other table's corresponding values. However, if the value is null, it simply ignores them.
mysql>  select * from t1 cross join t2 on t1.col2=t2.col2 ;
+------+------+------+------+
| col1 | col2 | col2 | col3 |
+------+------+------+------+
|    1 |  100 |  100 | a    |
|    2 |  103 |  103 | d    |
|    2 |  104 |  104 | e    |
|    3 |  105 |  105 | f    |
|    4 |  108 |  108 | i    |
|    4 |  108 |  108 | i    |
|    5 |  108 |  108 | i    |
|    5 |  108 |  108 | i    |
|    6 |  111 |  111 | l    |
|    6 |  112 |  112 | m    |
+------+------+------+------+
10 rows in set (0.00 sec)
SELF JOIN. sometimes we need to join a table to itself in order to get the result. Here I have included a unusual situation to demonstrate how SELF JOIN works.
mysql>  select * from t1 self join t1 on t1.col2=t1.col2 ;
+------+------+------+------+
| col1 | col2 | col1 | col2 |
+------+------+------+------+
|    1 |  100 |    1 |  100 |
|    1 |  101 |    1 |  100 |
|    1 |  102 |    1 |  100 |
|    2 |  103 |    1 |  100 |
|    2 |  104 |    1 |  100 |
|    3 |  105 |    1 |  100 |
|    3 | NULL |    1 |  100 |
|    4 |  108 |    1 |  100 |
|    4 |  108 |    1 |  100 |
|    5 |  108 |    1 |  100 |
|    5 |  108 |    1 |  100 |
|    6 |  111 |    1 |  100 |
|    6 |  112 |    1 |  100 |
|    1 |  100 |    1 |  101 |
|    1 |  101 |    1 |  101 |
|    1 |  102 |    1 |  101 |
|    2 |  103 |    1 |  101 |
|    2 |  104 |    1 |  101 |
|    3 |  105 |    1 |  101 |
|    3 | NULL |    1 |  101 |
|    4 |  108 |    1 |  101 |
|    4 |  108 |    1 |  101 |
|    5 |  108 |    1 |  101 |
|    5 |  108 |    1 |  101 |
|    6 |  111 |    1 |  101 |
|    6 |  112 |    1 |  101 |
|    1 |  100 |    1 |  102 |
|    1 |  101 |    1 |  102 |
|    1 |  102 |    1 |  102 |
|    2 |  103 |    1 |  102 |
|    2 |  104 |    1 |  102 |
|    3 |  105 |    1 |  102 |
|    3 | NULL |    1 |  102 |
|    4 |  108 |    1 |  102 |
|    4 |  108 |    1 |  102 |
|    5 |  108 |    1 |  102 |
|    5 |  108 |    1 |  102 |
|    6 |  111 |    1 |  102 |
|    6 |  112 |    1 |  102 |
|    1 |  100 |    2 |  103 |
|    1 |  101 |    2 |  103 |
|    1 |  102 |    2 |  103 |
|    2 |  103 |    2 |  103 |
|    2 |  104 |    2 |  103 |
|    3 |  105 |    2 |  103 |
|    3 | NULL |    2 |  103 |
|    4 |  108 |    2 |  103 |
|    4 |  108 |    2 |  103 |
|    5 |  108 |    2 |  103 |
|    5 |  108 |    2 |  103 |
|    6 |  111 |    2 |  103 |
|    6 |  112 |    2 |  103 |
|    1 |  100 |    2 |  104 |
|    1 |  101 |    2 |  104 |
|    1 |  102 |    2 |  104 |
|    2 |  103 |    2 |  104 |
|    2 |  104 |    2 |  104 |
|    3 |  105 |    2 |  104 |
|    3 | NULL |    2 |  104 |
|    4 |  108 |    2 |  104 |...... and to infinity.
This demonstrates us that the table used in the commands are just treated as objects and never is the real table passed. In the above example, 2 instances of the same table are created in order to complete SELF JOIN. similarly, considering a JOINed object, we can nest JOINs. However, the objects need to be referenced. Like,

mysql>  select * from t1 join t2 on t1.col2=t2.col2 join t2 a on a.col2=t2.col2;
+------+------+------+------+------+------+
| col1 | col2 | col2 | col3 | col2 | col3 |
+------+------+------+------+------+------+
|    1 |  100 |  100 | a    |  100 | a    |
|    2 |  103 |  103 | d    |  103 | d    |
|    2 |  104 |  104 | e    |  104 | e    |
|    3 |  105 |  105 | f    |  105 | f    |
|    4 |  108 |  108 | i    |  108 | i    |
|    4 |  108 |  108 | i    |  108 | i    |
|    5 |  108 |  108 | i    |  108 | i    |
|    5 |  108 |  108 | i    |  108 | i    |
|    6 |  111 |  111 | l    |  111 | l    |
|    6 |  112 |  112 | m    |  112 | m    |
+------+------+------+------+------+------+
10 rows in set (0.02 sec)

Consider the above query where t1 JOIN t2, i.e table t1 is already joined with t2. But, t2 is again joined with the total join. However, in order to identify the latest t2 instance from the older t2 instance that has already been joined with t1, we make a reference to it. Here we have made 'a' as the reference.

t1 JOIN t2    on t1.col2=t2.col2        JOIN t2 a    on a.col2=t2.col2

Anyway, I think its too much already. I'm done.
References:
Database used: MySql

Feel free to ask me anything on Facebook