Tuesday, June 10, 2014

Relational Database concepts and Normalization

In order to understand how the databases function ,it is very essential to understand the concept of relational databases .Lets dive into the field of relational databases for a while .

Evolution of Relational databases :

 In the early 1960 s the data was stored in the form of flat files .Those were physical in existence .But this  type of storage consumed large volumes of data and hence was ddifficult to store or retrieve data at a later stage .Since the data volumes continued to increase alternative solution was required .

Hierarchical Database Model

This is the next stage in which data was stored in the form of an inverted tree structure .Here the entities follow child-parent relationship.each child should have an associated Parent node and a parent node can have multiple child .Here the relationship is one to many since one parent is having multiple child .But the disadvantage is that inorder to locate  the child node it should first locate the root node and thn trverse down to the child which consumes resource and time .

Example :

The school here is the parent node whichhas multiple departments such as science ,arts,commerce ets
an each department have multiple sections like a,b and c and each section has students .inorder to locate a student ,we should know the organization,from there migrate to the deartment and thn look for section and then spot the student .

Network Database Model

 This model is a further refinement of the hierarchical database model .This model allows many-to-many relationship along with the one to many relationship.One good example is the relationship between employee and task.One employee can have multiple tasks and one task can be assigned to multiple employees also .Here comes the need for assignment table which specifies which task is assigned to which employee (there should be composite key for employee id and task id and their combination).



Relational Database Model :

One big advantage of relational database model is that we do not have to navigate from the root to the child node to locate the data .it allows any number of parent child relationship provided there is a sensible link between the tables . Any tables an be linked together regardless of their hierarchial positions in this model.One disadvantage of this model is that it is not very efficient in retreiving a single object .

Object Database Model :

The next evolution is the object database model which stores the data in the form of object and methods or definition to retrieve the objects .One advantage of this model is that it retrieves or fetches one specific data quickly since it is not interlinked with any other node but getting a group of data would be more complex in this model.

Object Relational Model :

This adds the object oriented features in the relational data model.


Designing a Database :

In order to design the database we should first understand the end user needs .According to the purpose of the database there are three category for the databases

1. Transnational Databases
2.Decision support system
3.Hybrid systems

Transnational Databases :

The main purpose of transaction Databases should be faster and effective  fetching or storing of data. Transnational Databases mainly involves adding ,removing and changing the data/ meta data for the end users .Hence it should be quiet fast and much of relational modelling techniques are not needed here .

Eg :Client server machines and OLTP .

Decision Support Databases :

This refers to the data warehousing which stores huge amount of data that is being processed . Once the data gets older it will be moved from the database to the warehousing unit .A proper data modelling is required for warehouse since it stores huge amount of data .Small chunks of data units in the warehouse can be called as DATA MART .Reporting Database also is another type of warehousing database but it does not have the purged or old data and hence it ll be smaller in size

Hybrid Database :

It is a combination of both the oltp and warehousing .For smaller organizations which have smaller clients a single database can be used for performing transactions and also storing the data.Hence This will be more cost effective due to fewer machines, fewer people  and fewer licenses .

Things to remember while designing a database

I shortly mention it in the following bullet points .
  •  The databse should be well structured and easy to read an understand 
  •  To maintain data integrity (Data is not lost but only hidden).
  •  To ensure that the DB supports both planned and adhoc queries .
  •  Do not normalize a table so much since it is mentioned in the rules .
  •  Future growth of data should be an important factor to consider 
  •  Each table in the DB should refer to a single structure
  •  Avoid changing the underlying design once the application is up and running .It is very expensive and time consuming to change the underlying design after a point of time 
  •  The application should have rapid response time on smaller transactions and high concurrency level.
  •  Queries should be relatively simple and should no cause any error due to lack of designing constraints or due to poor table design 
  •  The application should be simpler and there should be less dependency between database  model and application . 
Methods of Database Design :

There are several database designs ,The best method is described below :

Requirement Analysis :

The first and foremost thing to do while designing the database is to understand the requirement of the end user .Meet up with the end user and understand what they want from the designer and collect the details and thoroughly go through each and every thing  .

Conceptual Design :

This step involves creating the ER diagram ,designing the tables ,constraints,relationship .This step also includes normalization ie breaking up of tables into smaller ones for more readability and also for increasing the space.

Logical Design :

This stage involves the creation of DDL commands for the underlying tables .

Physical Design :

Involves the compelte designing and restructuring of the tables

Tuning :

This phase involves performance tuning techniques like building prper indexes,normalizing or de normalizing the tables ,addign security features an so on