Tuesday, November 3, 2015

AWK command

Introduction :

awk is a very powerful Unix command that makes it easy to manipulate or read through a file. AWK takes input from the console or from a file specified with the command.

syntax :

The syntax to execute AWK on the command line is as follows:

awk ' BEGIN {} END {} ' <FILENAME>

The BEGIN and END blocks are optional here. The BEGIN block runs once before awk loops through the file, and the END block runs once after awk has finished processing the file.

We can simply print the first column of a file using the below command:

awk ' { print $1 } ' sample.txt 

The file sample.txt is read and the first column is printed in this case. By default awk assumes the file is tab or space delimited.

Let us assume that we have a sample file sample.txt with the following records:

name address phone salary
abc xcheufhe 12121212 10000
xyz fmmrkfkkr 2323254 1000000
cns dfffggggggg 123454545 3999
sdsds dkdkwdjwej 16767676 5000


Now let's run some basic awk commands on it and compare the outputs.

1.awk ' { print $1 } ' sample.txt

The output will be as follows

name
abc
xyz
cns
sdsds

2.awk ' BEGIN { print "start" } { print $1 } END { print "done" } ' sample.txt

Output will be

start 
name
abc
xyz
cns
sdsds

done 

This illustrates the difference between using BEGIN and END in the awk command.

3.awk ' { print $1 "\t" $2 } ' sample.txt

Output will be

name    address   
abc    xcheufhe   
xyz    fmmrkfkkr   
cns    dfffggggggg   
sdsds    dkdkwdjwej

The \t here separates the two fields, name and address, with a tab. If it is not given, the result will be nameaddress (without any space).


awk ' { $2="" ; print $0 } ' sample.txt

Output will be

name  phone salary 
abc  12121212 10000
xyz  2323254 1000000
cns  123454545 3999
sdsds  16767676 5000


We can see that all the columns except the 2nd column, i.e. address, are printed here. $0 prints the whole record, and since we have set $2 to an empty string that column does not appear in the output.

4.awk ' { $2=$3="" ; print $0 } ' sample.txt

Output will be 

name   salary
abc   10000
xyz   1000000
cns   3999
sdsds   5000
 

Here columns 2 and 3, i.e. address and phone number, are excluded from the output.

5.Suppose we have a huge file and we need to print a range of columns, say columns 2 to 6, from that file. We can use the command below:

awk -v a=2 -v b=6 ' { for (i=a; i<=b; i++) print $i } ' sample.txt

The -v option assigns a variable, and such variables are available even inside the BEGIN block. Here we assign two variables a and b with the minimum and maximum of the range we need, and then use a for loop to iterate over and print all the columns from 2 to 6.
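If the selected columns should stay on one line per record, a printf variant of the same loop works (a sketch, assuming the same sample.txt):

awk -v a=2 -v b=6 ' { for (i=a; i<=b; i++) printf "%s ", $i; print "" } ' sample.txt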


Built in variables available with the awk command :

There are several built-in variables that come in handy with an awk command. Let's go through the most popular ones one by one.

1.FS or input field separator:

By default the awk command assumes that the file is space or tab delimited.
If we have a file that is, say, comma delimited, we have to mention that explicitly in the awk command:

awk -F "," ' { print $1 } ' sample.txt
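Equivalently, FS can be set inside a BEGIN block so the separator is in effect before the first record is read:

awk ' BEGIN { FS="," } { print $1 } ' sample.txt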


2.OFS or output field separator:

awk -F "," ' BEGIN { OFS="=";} { print $1,$2,$3; } ' sample.txt

This command reads the file as comma separated values and prints columns 1, 2 and 3 separated by =. Note that OFS is assigned inside a BEGIN block so that it takes effect before the first record is printed; it can also be set on the command line with -v OFS="=".

3.RS or record separator:

awk -F "," ' BEGIN { RS="\n"; OFS=":"; } { print $1,$2,$3} ' sample.txt

This command assumes the record separator is a newline (which is the default), sets OFS to :, and reads the file as comma delimited. So it reads the file assuming each record ends with a newline and outputs the fields separated by :. RS is also typically set within a BEGIN block.
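To illustrate a non-default record separator, here is a small sketch (on inline input rather than sample.txt) that treats ; as the record separator:

echo "a,1;b,2;c,3" | awk ' BEGIN { RS=";"; FS="," } { print $1 } '

This prints a, b and c on separate lines, since each ;-terminated chunk is treated as its own record.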

4.NR or the number of the current record (which, in the END block, equals the total number of records in the file):

awk ' BEGIN { print "stats" } { print "processing record-",NR } END { print NR,"number of records processed " } ' sample.txt

If there are 10 records in the file the output will be something like this 

stats
processing record- 1
processing record- 2
processing record- 3
processing record- 4
processing record- 5
processing record- 6
processing record- 7
processing record- 8
processing record- 9
processing record- 10
10 number of records processed


5.NF or number of fields in a record:

This gives the number of fields in the file for each record:

awk -F "," ' { print NR , "=" ,NF } ' sample.txt

This reads the file as comma delimited and prints each record number followed by an = sign and its field count.

The output will be something like:

1 = 5
2 = 5
3 = 5
4 = 5
5 = 0
6 = 0
7 = 0
8 = 0
9 = 0
10 = 0


This means that rows 1 to 4 have 5 fields each and the rest of the rows are empty.
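NF is also handy for addressing fields relative to the end of a record; for example, $NF is the last field of each line:

awk ' { print $NF } ' sample.txt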

6.FILENAME:

This variable holds the name of the current input file. Printing it for every record outputs the filename as many times as NR.

awk ' { print FILENAME } ' sample.txt

will print sample.txt 10 times since the file has 10 records 

awk -F "," ' BEGIN { OFS=":";} {print $0,FILENAME} ' sample.txt

This will print something like this 

name,place,address,phonenumber,salary:sample.txt
aparna,cochin,trinityworld,9037289898,1000:sample.txt
anjali,palakkad,infopark,9090909090,100000:sample.txt
anusha,banglore,electroncity,903456565,40000:sample.txt


Some simple AWK commands 

1.Return the number of lines in a file:
awk ' END { print NR } ' <filename>

2.Print the odd lines in a file:

awk ' { if (NR % 2 != 0) print $0 } ' <filename>

3.Print the even lines in a file:

awk ' { if (NR % 2 == 0) print $0 } ' <filename>

4.Print the length of the longest line in the file 

awk ' { if (length($0) > max) max= length($0) } END { print max} ' <filename>

5.Print the longest line in the file

awk ' { if (length($0) > max) { max = length($0); line = $0 } } END { print line } ' <filename>

Exit status :

If an AWK command runs successfully the exit status will be 0; otherwise it will be non-zero.
We can also supply an exit code manually with the exit statement, in which case the awk command exits with that code.
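For example, a sketch using the sample.txt above: exit with code 2 when a salary over 100000 is found, then check the status with $? (NR > 1 skips the header line):

awk ' NR > 1 && $4 > 100000 { print "large salary on line", NR; exit 2 } ' sample.txt
echo $?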

Thursday, February 12, 2015

Triggers

Triggers:

Triggers are PL/SQL blocks of code that are executed automatically upon an event. They are mainly used for auditing, for preventing users from performing certain activities, for security purposes, and so on.

There are several types of triggers in Oracle. Let's see each one of them.

DDL Trigger :

We can write a trigger that executes before or after a DDL statement (DROP, CREATE, ALTER, etc.). Such triggers are called DDL triggers.

DML Trigger :

A trigger that executes when a DML operation like insert/update/delete happens is called a DML trigger.

Event Trigger :

When a trigger is fired upon a system event like logon/logoff of a session or startup/shutdown of the database, it is called an event trigger.

Instead of Triggers :

We can cause a trigger to fire instead of performing the triggering operation itself; such triggers are called INSTEAD OF triggers, and they are typically defined on views.

Compound Triggers :

It is a new concept released in Oracle 11g. A compound trigger allows a single trigger to define actions for multiple timing points (before statement, before each row, after each row, after statement).

Components of a trigger :

A trigger has the below components:

1.Trigger Name 
2.Triggering event (update/insert etc)
3.Triggering time (Before/After)
4.Triggering level (statement/row level).

Syntax of a Trigger :

A Trigger can be written as :

create or replace trigger trigger_name
(before/after/instead of) (update/insert/delete) on table_name 
begin 
    <code>

end;

Let's see each type of trigger now.

DDL Trigger :

A DDL trigger is fired before or after a DDL statement on the schema or database it is associated with.

eg :

create or replace trigger abc_test
before drop on schema
begin
  if ora_dict_obj_name = 'EMPLOYEES' then
    raise_application_error(-20000, 'the table employees cannot be dropped');
  end if;
end;

In the above example the trigger fires whenever a user tries to drop an object in the schema; when the object is the employees table it raises an exception saying that the table cannot be dropped (note that raise_application_error requires an error number in the -20000 to -20999 range).
Along with this we can also insert a record into a table holding the user and the timestamp, so that we can find out who tried to drop the employees table and when. This is really useful for security purposes. Since the trigger is fired before a DDL statement, it is a DDL trigger.
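A minimal sketch of that auditing idea, assuming a hypothetical ddl_audit_log(username, object_name, attempt_time) table (ora_login_user and ora_dict_obj_name are Oracle's DDL event attribute functions):

create or replace trigger ddl_audit
before drop on schema
begin
  -- record who tried to drop which object and when
  insert into ddl_audit_log values (ora_login_user, ora_dict_obj_name, sysdate);
end;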

DML Trigger:


create or replace trigger emp_update 

after update on employees 

begin 
   dbms_output.put_line('The employee table is updated at '|| sysdate);   
end;

Above is an example of a DML trigger. Whenever a user updates the employees table, the trigger fires once the update completes, since we created it as AFTER UPDATE ON employees. If we create it as BEFORE UPDATE ON employees, it fires before the table is updated. The output prints the time at which the employees table was updated. This type of trigger is also called an after trigger since it fires after the update on the table.

Event Trigger :

Suppose you are working as a database administrator on an important system, and whenever someone logs in you should be able to track which user logged in and at what time. In this type of scenario we can go for an event trigger.
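A minimal sketch of such a logon tracking trigger, assuming a hypothetical logon_audit_log(username, logon_time) table:

create or replace trigger trg_logon_audit
after logon on database
begin
  -- record every session logon with the user and timestamp
  insert into logon_audit_log values (user, sysdate);
end;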

Row level triggers:

Row level triggers execute once for each row affected by a DML statement. For example, if there is an AFTER UPDATE trigger on employees and an update affects 10 rows, the trigger fires 10 times. It is indicated by the FOR EACH ROW clause in the trigger. Eg:

create or replace trigger trig_name
after update on employees
for each row
when (new.salary < old.salary)
begin
  insert into audit_table values (:new.salary, :old.salary, :new.emp_id);
end;

In the above trigger the FOR EACH ROW clause indicates that it is a row level trigger, and the WHEN condition restricts the trigger execution. In this case the trigger executes only when the updated salary is less than the old salary (note that inside the WHEN clause, new and old are written without the colon prefix).
The :new and :old references are called correlation identifiers and they can be used only with a row level trigger.

Statement level trigger :

A statement level trigger fires only once per statement, even though the statement may affect multiple rows. By default a trigger is a statement level trigger.
We cannot use qualifiers like :old and :new in statement level triggers, but they are useful when we need a single operation to be done after a DML command, as in the sketch below.
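A minimal sketch of a statement level trigger, assuming a hypothetical stmt_audit_log(username, action, action_time) table; note there is no FOR EACH ROW clause, so it fires once per statement:

create or replace trigger emp_stmt_audit
after update on employees
begin
  -- fires once per UPDATE statement, regardless of how many rows changed
  insert into stmt_audit_log values (user, 'EMPLOYEES updated', sysdate);
end;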

Difference Between Row level and Statement Level triggers :

Row level triggers fire once for every row affected, whereas statement level triggers fire only once per operation.

Row level triggers can have the :new and :old qualifiers, whereas statement level triggers cannot.

Commit in Triggers:

We cannot issue a commit/rollback inside a trigger body. This is because triggers are part of a larger transaction, and a commit/rollback inside the trigger could break that transaction's atomicity.
For example, suppose we issue a commit inside a BEFORE UPDATE trigger which inserts a record into the audit table. Once we issue the update command, the trigger runs and inserts the record into the audit table. But what if the original update statement fails?
If there are situations where we must commit in a trigger, we should use the PRAGMA AUTONOMOUS_TRANSACTION directive. It means that the trigger executes as an autonomous transaction and can commit/rollback independently. But this is generally not recommended.
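A minimal sketch of such an autonomous trigger, assuming a hypothetical upd_audit_log(username, change_time) table:

create or replace trigger emp_upd_audit
after update on employees
declare
  pragma autonomous_transaction;
begin
  insert into upd_audit_log values (user, sysdate);
  commit; -- allowed only because this trigger runs as an autonomous transaction
end;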

Drawbacks of Triggers :

Since triggers execute automatically, creating unnecessary triggers will cause huge costs.

We should never write a trigger that performs an operation which cannot be rolled back.
For example, say I have an AFTER INSERT trigger which sends a mail to the admin that a particular record was inserted. We issue an insert and the mail is sent to the admin. What if we roll back the insert? The record no longer remains in the table, but a mail has already been sent saying it was inserted. So calls that cannot be rolled back, such as the UTL packages, should not be written in a trigger body.

Compound Triggers: Oracle 11g :

Compound triggers are a new concept introduced in Oracle 11g.
As mentioned earlier, they allow a single trigger to define actions at multiple timing points.

Wednesday, January 7, 2015

Indexes In Oracle - Part 2

Function based Index :

Function-based indexes are most beneficial when the WHERE clause of the SQL statement contains a function.

Eg :

select * from employees
where upper(employee_name)='ABC';

If the above statement is used frequently, an index on employee_name alone will not be used, because the function wraps the column. Hence we must create an index on upper(employee_name). This speeds up query execution since the index is built on the result of the function.

Create index v_idx on employees(upper(employee_name));

To enable this index we must set two session parameters: query_rewrite_enabled and query_rewrite_integrity.

Query_rewrite_enabled :

This session parameter has three values: false, true and force. If it is set to false, the optimizer will not use function-based indexes. If set to force, it ensures the query is rewritten to use the index wherever possible.
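For example, the two parameters can be set at the session level like this (a sketch; the values shown are typical choices):

alter session set query_rewrite_enabled = true;
alter session set query_rewrite_integrity = trusted;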

Bitmap Index :

A bitmap index is mainly used in data warehousing environments where DML activity is low. Bitmap indexes are very useful for low cardinality columns, i.e. when the cardinality is less than 0.1%.
For example, the Gender or Marital Status column has very few distinct values, so bitmap indexes are very useful there. A bitmap index stores a bitmap whose set bits map to the rowids containing the key value, so scanning the index and retrieving the data is fast.

Disadvantage :

The disadvantage of a bitmap index is that if the table is manipulated often with inserts or updates, the index maintenance causes an overhead.
Also, deadlocks may arise if multiple sessions try to insert records into the table at the same time.

B-tree index:

The B-tree index is organized in the form of a tree, hence the name. This index is very useful when we have a wide range of distinct values. It consists of a root node, branch nodes and leaf nodes. When a query is issued, Oracle starts at the root node, decides which leaf nodes hold the data's location, and then traverses the leaf node to locate the data.

Monday, December 22, 2014

Index In Oracle

Indexes are used for faster retrieval of data from a table. An index in Oracle functions just like the index in a book, where we locate a topic easily by scanning the index. Oracle scans the index, locates the data's location and retrieves the data quickly.

When does a table require an index :

1.When we want to retrieve less than 15% of the rows from a relatively large table, we should create an index on the table.
2.When a join operation is done to retrieve data from multiple tables, creating an index on the joining column will retrieve the data quicker.
3.Smaller tables do not require indexes. If a query on a small table runs long, either the table has grown or there is some other problem that needs to be addressed.

When should we create an Index :

The ideal way is to create the table, populate the data and then create the appropriate indexes for the table. If we create the table with an index up front, each insert into the table needs an entry in the index, which can take a considerable amount of time.

What columns should be chosen while Indexing :

1.Values are relatively unique in the column.
2.If the column has a wide range of values it is suitable for a normal (B-tree) index.
3.If the column has a small range of values it is suitable for a bitmap index.
4.When we use a mathematical function, e.g. multiplication, on a column with many null values, only the not-null values take part in the operation, so creating an index on such a column is helpful.
5.Create indexes on columns that appear frequently in the WHERE clause of queries. But if the indexed column is used inside a function in the WHERE clause, do not create a plain index (a function-based index is more useful in this case).
6.Always check the execution plan to ensure that the index is used by the query.
7.Do not create an index on a column that is frequently updated, inserted or deleted. This adds overhead since we need to do all the operations on the index as well.
8.Always choose an index with high selectivity.

Selectivity for Index :

If a table has 10,000 records and the index on the table has 8,000 distinct values, then the selectivity of the index is 8000/10000 = 0.8. The ideal selectivity is 1, which can only be achieved by a unique index on a not-null column.
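The selectivity of a candidate column can be estimated with a simple query; this is a sketch reusing the employees/employee_name example from earlier:

select count(distinct employee_name) / count(*) as selectivity
from employees;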

Composite Index :

A composite index is an index created on a combination of two or more columns.

When we create a composite index, we must make sure that the most frequently used columns are mentioned first.

eg :

create index t_idx 
on t(col1,col2,col3);

In the above index, col1, col1|col2 and col1|col2|col3 are all leading portions of the index, whereas col2, col3 and col2|col3 are non-leading portions.

So only queries whose predicates cover a leading portion of the index will use the index. If we query the table with a WHERE condition on a non-leading portion (col2, col3, col2|col3), the index will generally not be used.
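For example, with t_idx on t(col1, col2, col3), and ignoring special cases like index skip scans:

select * from t where col1 = 1 and col2 = 2;  -- leading portion, can use t_idx
select * from t where col2 = 2 and col3 = 3;  -- non-leading portion, generally ignored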

Choosing a key for composite index :

1.If all the columns in the WHERE clause are used for creating the index, ordering the columns by selectivity (higher to lower) will increase the performance.
2.If all the columns in the WHERE clause are used for creating the index and the table is ordered on a particular key, make sure to include that column as the first column of the index.
3.If only some keys are used in the query, make sure to create the index such that the frequently used columns form the leading portion of the index.

Limiting the number of indexes :

A table can have as many indexes as needed. But creating more indexes causes overhead, especially while inserting or deleting data, since the index entries also need to be deleted or inserted. Likewise, while doing an update, the corresponding index entry needs to be updated as well. Hence we must limit the number of indexes.

A table that is read only can have many indexes, whereas on a table with heavy DML running it is essential to reduce the number of indexes.

Dropping the Index :

An index should be dropped under the following situation:

1.If the performance is not improved. It may be because the table is very small or because the size of the index is very small.

2.If the queries we run do not access the index. If the queries never use the index, there was no point in creating the index in the first place, and hence it should be dropped.

Tablespace for Index:

An index can be created in the same tablespace as its table or in a different one.

If the index is created in the same tablespace, backup will be easier. But with a different tablespace, performance will improve since it reduces the disk contention.

But with the table and index in different tablespaces, a query accessing the table may not work if either the index's or the table's tablespace is not online.

To manually prevent using indexes :

If we want to prevent the CBO from using an index for a query, we can use the NO_INDEX hint, or a FULL hint, which forces a full table scan instead of the index.
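For example, reusing the earlier employees example (the alias e and the index name v_idx are just placeholders):

select /*+ NO_INDEX(e v_idx) */ * from employees e where employee_name = 'ABC';
select /*+ FULL(e) */ * from employees e where employee_name = 'ABC';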

Thursday, August 28, 2014

Learning Perl

Perl shortcuts and commands 

perl -v  --> gives the version of perl installed on the system

perldoc -f <function name> --> gives the usage of a function in perl.
eg :

perldoc -f print

perldoc -f substr

perldoc -q "anything that you want to search" --> brings up matching entries from the FAQs.

perldoc perldoc --> describes what is available in perldoc itself.

Print command :

This is the function that prints a statement to the console:

perl -e 'print "hello world";'

Shebang lines :

A Perl program often begins with a shebang line (i.e. #!). It takes one of the following forms:

#! /usr/bin/perl
#! /usr/local/bin/perl
#! /usr/bin/perl -w
#! /usr/bin/env perl

The first two forms point directly to the perl executable that should run the program.
The third form has a -w flag that enables global warnings. The fourth uses env to locate perl on the PATH.

Inside the script, similar safety checks can be enabled with pragmas:

use strict;
use warnings;
use diagnostics;
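Putting the shebang line and the pragmas together, a minimal script looks like this:

#!/usr/bin/perl
use strict;
use warnings;

print "hello world\n";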

Tuesday, July 1, 2014

Netezza Basics


Important things to note in the Netezza Architecture :

Host :

The host is a high performance Linux server set up in active-passive mode. The host is responsible for compiling SQL queries into executable code segments called snippets, creating an optimized query plan, and distributing the snippets to the nodes for execution.

Snippet Blades or S blades :

S-blades, or snippet blades, are independent servers containing multi-core CPUs, a multi-engine FPGA and their own memory, all designed to work concurrently to deliver peak performance. Each S-blade is connected to a set of 8 disks.

Field Programmable Gate Array:

FPGA is the key component in the Netezza architecture. It has the following engines embedded in it:

Compress engine - This decompresses the data at wire speed, transforming each block on disk into 4 to 8 blocks in memory.

Project and Restrict engine - This further increases performance by filtering out rows and columns based on the SELECT list and the WHERE clause.

Visibility engine - Filters out the rows that should not be seen by the query, i.e. data that is not committed.

Disk enclosures :

The disks are high density, high performance ones. Each table's data is uniformly distributed across the disks. A high speed network connects the disks with the S-blades so that the data gets streamed at the maximum rate possible.

Optimizer:

The host compiles the query and generates an execution plan. The optimizer's intelligence is a key factor in query performance. The optimizer makes use of all the nodes in order to get up-to-date statistics on every database object referenced in a query. Another example of optimizer efficiency is in calculating the join order: if, for example, a small table is joined against all the fact tables, the optimizer can broadcast the small table to all the S-blades while keeping the large fact tables distributed across the snippets. This approach minimizes data movement while taking advantage of parallel processing. The optimizer minimizes I/O and data movement, the two factors slowing performance in a warehouse system. The other functions of the optimizer include:

Determining the correct join order
Rewriting expressions
Removing redundancy in the SQL operations

Compiler :

The compiler converts the query plan into executable segments known as snippets, which are executed in parallel. A notable feature of the compiler is the object cache: a large cache of previously compiled snippet code, with parameter variations, that eliminates compilation for many snippets.

When a SQL query is executed, the following events take place:
  • The optimizer generates an execution plan and the compiler creates scheduled tasks called snippets.
  • The data is moved from the disks to the corresponding S-blades in compressed form through a high speed network.
  • The snippet processor reads the table data into memory utilizing a technique called a zone map, which reduces disk scans by storing the minimum and maximum values and hence avoids fetching data that is out of range. Zone maps, unlike indexes, are created and updated automatically.
  • The compressed data is cached in memory using a smart algorithm which makes the most frequently accessed data available instantly rather than fetching it from disk.
  • The data then moves to the FPGA (field programmable gate array), which is responsible for uncompressing the data, extracting it, applying the filter conditions and passing the result to the CPU.
  • The CPU performs other operations like joins, and the results of each snippet are sent to the host, which does the final computation and passes the results down to the end user.
Here, most of the processing happens at the disk level and less in the CPU.

Each disk holds data, mirrored data and free space. So even if one disk fails, the data is obtained from its mirror on another disk.

When a table is created, pieces of the table are distributed across all the disks; hence the data is fetched at a faster rate, and parallel processing is achieved in that way.

DMLs in Netezza :

In Netezza, DML statements such as insert, update and delete are autocommitted: the change is committed as soon as you execute the query. However, once you delete or update a row the data is not completely lost.

Rollback of a delete :

If we have accidentally deleted a record and need to roll it back, we can set an option: set show_deleted_records=true

Now if we select from the table where deletexid != 0, we get the deleted rows along with their transaction ids, and we can then re-insert the required record into the table:

insert into t select * from t where deletexid=123;

Rollback of an Update :

If you update a table, the old record is marked deleted and a new record is inserted, so both a deletexid (on the old version) and an insert transaction id (on the new version) are populated. To recover or roll back the update, delete the record inserted by that transaction and re-insert the record carrying the update statement's deletexid.

In both cases, if a GROOM TABLE has been issued, the deleted versions are reclaimed and a rollback may no longer be possible.

Also, a TRUNCATE TABLE removes the records permanently and hence cannot be rolled back.


Tuning of NZSQL:

1.Always distribute the tables on a proper key. Integer columns with high cardinality are a good choice for distribution (see the sketch after this list).
2.When joining two tables, always use the common column as the distribution key in both tables. For example, if dept_id is the primary key in the dept table and a foreign key in the employees table, use dept_id as the distribution key in both tables. The datatype of the key should also be the same in both tables.
3.Even after using a proper distribution key, some joins might be long running. Check whether any single value has a disproportionate number of records, e.g. -1 or 'Unknown'. If those records are numerous, the join might take longer; try creating random numbers for these in a separate table and associating them with the main table.
4.Always try to broadcast the small table to improve performance. You can set enable_factrel_planner = true and then set factrel_size_threshold high, like 15000000, which says anything under 15 million rows is more of a dimension, so broadcast or redistribute it, while tables over 15M rows are big facts, so try to leave them in place and don't redistribute.
5.Always run GROOM TABLE and GROOM TABLE VERSIONS. Also generate the statistics of the objects created.
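A sketch of points 1, 2 and 4 in SQL (the employees columns here are just placeholders):

create table employees
( emp_id integer, dept_id integer, salary integer )
distribute on (dept_id);

set enable_factrel_planner = true;
set factrel_size_threshold = 15000000;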





Tuesday, June 10, 2014

Relational Database concepts and Normalization

In order to understand how databases function, it is very essential to understand the concept of relational databases. Let's dive into the field of relational databases for a while.

Evolution of Relational databases :

In the early 1960s, data was stored in the form of flat files, which were physical in existence. But this type of storage consumed large volumes of space and made it difficult to store or retrieve data at a later stage. Since data volumes continued to increase, an alternative solution was required.

Hierarchical Database Model

This is the next stage, in which data was stored in the form of an inverted tree structure. Here the entities follow a child-parent relationship: each child has an associated parent node, and a parent node can have multiple children. The relationship is one-to-many, since one parent has multiple children. But the disadvantage is that in order to locate a child node, the database should first locate the root node and then traverse down to the child, which consumes resources and time.

Example :

The school here is the parent node, which has multiple departments such as science, arts, commerce, etc.
Each department has multiple sections like a, b and c, and each section has students. In order to locate a student, we should know the organization, from there navigate to the department, then look for the section, and then spot the student.

Network Database Model

This model is a further refinement of the hierarchical database model. It allows many-to-many relationships along with one-to-many relationships. One good example is the relationship between employees and tasks: one employee can have multiple tasks, and one task can be assigned to multiple employees. Here comes the need for an assignment table which specifies which task is assigned to which employee (with a composite key on the combination of employee id and task id).



Relational Database Model :

One big advantage of the relational database model is that we do not have to navigate from the root to a child node to locate the data. It allows any number of parent-child relationships provided there is a sensible link between the tables. Any tables can be linked together regardless of their hierarchical positions in this model. One disadvantage of this model is that it is not very efficient at retrieving a single object.

Object Database Model :

The next evolution is the object database model, which stores the data in the form of objects plus the methods or definitions to retrieve them. One advantage of this model is that it fetches one specific piece of data quickly, since it is not interlinked with any other node, but getting a group of data is more complex in this model.

Object Relational Model :

This adds object oriented features to the relational data model.


Designing a Database :

In order to design a database, we should first understand the end user's needs. According to the purpose of the database, there are three categories of databases:

1. Transactional databases
2. Decision support systems
3. Hybrid systems

Transactional Databases :

The main purpose of a transactional database is fast and effective fetching and storing of data. Transactional databases mainly involve adding, removing and changing data/metadata for end users. Hence they should be quite fast, and much of the relational modelling technique is not needed here.

Eg :Client server machines and OLTP .

Decision Support Databases :

This refers to data warehousing, which stores huge amounts of data for processing. Once the data gets older, it is moved from the database to the warehousing unit. Proper data modelling is required for a warehouse since it stores huge amounts of data. Small chunks of data in the warehouse are called DATA MARTs. A reporting database is another type of warehousing database, but it does not hold the purged or old data and hence is smaller in size.

Hybrid Database :

It is a combination of both OLTP and warehousing. For smaller organizations which have smaller client bases, a single database can be used for performing transactions and also for storing the data. This is more cost effective due to fewer machines, fewer people and fewer licenses.

Things to remember while designing a database

I briefly mention them in the following bullet points:
  •  The database should be well structured and easy to read and understand.
  •  Maintain data integrity (data is not lost, only hidden).
  •  Ensure that the DB supports both planned and ad hoc queries.
  •  Do not over-normalize a table just because the rules say so.
  •  Future growth of data should be an important factor to consider.
  •  Each table in the DB should refer to a single structure.
  •  Avoid changing the underlying design once the application is up and running. It is very expensive and time consuming to change the underlying design after a point of time.
  •  The application should have rapid response times on smaller transactions and a high concurrency level.
  •  Queries should be relatively simple and should not cause errors due to a lack of design constraints or poor table design.
  •  The application should be simple, with little dependency between the database model and the application.
Methods of Database Design :

There are several database design methods; the best approach is described below:

Requirement Analysis :

The first and foremost thing to do while designing a database is to understand the requirements of the end user. Meet with the end users, understand what they want, collect the details and thoroughly go through each and every thing.

Conceptual Design :

This step involves creating the ER diagram and designing the tables, constraints and relationships. This step also includes normalization, i.e. breaking tables into smaller ones for more readability and to reduce redundant storage.

Logical Design :

This stage involves the creation of DDL commands for the underlying tables .

Physical Design :

Involves the complete designing and restructuring of the tables.

Tuning :

This phase involves performance tuning techniques like building proper indexes, normalizing or de-normalizing the tables, adding security features and so on.