Tuesday, November 3, 2015

AWK command

Introduction :

awk is a very powerful Unix command that makes it easy to read through and manipulate a file. AWK takes its input from the console (standard input) or from a file specified on the command line.

Syntax :

The syntax to execute AWK on the command line is as follows :

awk ' BEGIN {} END {} ' <FILENAME>

The BEGIN and END blocks are optional here. The BEGIN block runs once before awk loops through the file, and the END block runs once after awk has finished processing the input.

We can simply print the first column of a file using the command below :

awk ' { print $1 } ' sample.txt 

The file sample.txt will be read and the first column printed in this case. By default awk assumes the fields are delimited by spaces or tabs.

Let us assume that we have a sample file sample.txt with the following records :

name address phone salary
abc xcheufhe 12121212 10000
xyz fmmrkfkkr 2323254 1000000
cns dfffggggggg 123454545 3999
sdsds dkdkwdjwej 16767676 5000


Now let's run some basic awk commands on it and compare the outputs.

1. awk ' { print $1 } ' sample.txt

The output will be as follows

name
abc
xyz
cns
sdsds

2. awk ' BEGIN { print "start" } { print $1 } END { print "done" } ' sample.txt

Output will be

start
name
abc
xyz
cns
sdsds
done

This illustrates the effect of the BEGIN and END blocks in an awk command.

3. awk ' { print $1 "\t" $2 } ' sample.txt

Output will be

name	address
abc	xcheufhe
xyz	fmmrkfkkr
cns	dfffggggggg
sdsds	dkdkwdjwej

The \t here separates the two fields, name and address, with a tab. If it is not given, the result will be nameaddress (without a space), because print concatenates adjacent expressions.


4. awk ' { $2="" ; print $0 } ' sample.txt

Output will be

name  phone salary 
abc  12121212 10000
xyz  2323254 1000000
cns  123454545 3999
sdsds  16767676 5000


We can see that all the columns except the 2nd column, i.e. address, are printed here. $0 refers to the whole record, and since we have set $2 to an empty string that column no longer appears in the output.

5. awk ' { $2=$3="" ; print $0 } ' sample.txt

Output will be 

name   salary
abc   10000
xyz   1000000
cns   3999
sdsds   5000
 

Here columns 2 and 3, i.e. address and phone number, are excluded from the output.

6. Suppose we have a huge file and we need to print a range of columns, say column 2 to 6. We can use the command below :

awk -v a=2 -v b=6 ' { for (i=a; i<=b; i++) printf "%s ", $i ; print "" } ' sample.txt

The -v option assigns a variable before the program runs, and the variable can be used anywhere in the program, including the BEGIN block. Here we assign the two variables a and b to the minimum and maximum of the range we need, and then use a for loop to iterate over and print all the columns from 2 to 6 on one line.


Built in variables available with the awk command :

There are several built-in variables that come in handy with an awk command. Let's go through the most popular ones one by one.

1. FS or input field separator :

By default the awk command assumes that the file is space or tab delimited. If we have a file that is, say, comma delimited, we have to mention that explicitly in the awk command :

awk -F "," ' { print $1 } ' sample.txt


2. OFS or output field separator :

awk -F "," ' BEGIN { OFS="=";} { print $1,$2,$3; } ' sample.txt

This command reads the file as comma-separated values and prints columns 1, 2, and 3 separated by =. Note that OFS takes effect only when it is set before the records are processed, typically inside a BEGIN block, and only when the fields are separated by commas in the print statement.
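As a side note, assuming an awk that supports the standard -v option (a safe assumption with any POSIX awk), the same output can be produced without a BEGIN block :

awk -F "," -v OFS="=" ' { print $1, $2, $3 } ' sample.txt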

3. RS or record separator :

awk -F "," ' BEGIN { RS="\n"; OFS=":"; } { print $1,$2,$3} ' sample.txt

This command assumes the record separator is a newline (which is the default), the OFS is :, and the file itself is comma delimited. So it reads the file assuming each record ends with a newline and prints the first three columns separated by :. RS, like OFS, is normally set inside a BEGIN block.
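To see RS doing real work, here is a small sketch in paragraph mode, assuming a hypothetical file addresses.txt where each record spans several lines and records are separated by blank lines (RS="" is the standard way to ask for this) :

awk ' BEGIN { RS="" ; FS="\n" } { print "record", NR, "starts with", $1 } ' addresses.txt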

4. NR or the number of records read so far (the total number of records when used in the END block) :

awk ' BEGIN { print "stats" } { print "processing record-",NR } END { print NR,"number of records processed " } ' sample.txt

If there are 10 records in the file, the output will be as follows :

stats
processing record- 1
processing record- 2
processing record- 3
processing record- 4
processing record- 5
processing record- 6
processing record- 7
processing record- 8
processing record- 9
processing record- 10
10 number of records processed


5. NF or number of fields in a record :

This command gives the number of fields in each record of the file :

awk -F "," ' { print NR , "=" ,NF } ' sample.txt

This reads the file as comma delimited and prints each record number followed by its number of fields.

The output will be something like :

1 = 5
2 = 5
3 = 5
4 = 5
5 = 0
6 = 0
7 = 0
8 = 0
9 = 0
10 = 0


This means that rows 1 to 4 have 5 fields each and the rest of the rows are empty (an empty line has 0 fields).

6. FILENAME :

The FILENAME variable holds the name of the current input file. The command below prints it once for every record, i.e. NR times :

awk ' { print FILENAME } ' sample.txt

will print sample.txt 10 times, since the file has 10 records.

awk -F "," ' BEGIN { OFS=":";} {print $0,FILENAME} ' sample.txt

This will print something like this 

name,place,address,phonenumber,salary:sample.txt
aparna,cochin,trinityworld,9037289898,1000:sample.txt
anjali,palakkad,infopark,9090909090,100000:sample.txt
anusha,banglore,electroncity,903456565,40000:sample.txt


Some simple AWK commands :

1. Return the number of lines in a file :

awk ' END { print NR } ' <filename>

2. Print the odd lines in a file :

awk ' { if (NR % 2 != 0) print $0 } ' <filename>

3. Print the even lines in a file :

awk ' { if (NR % 2 == 0) print $0 } ' <filename>

4. Print the length of the longest line in the file :

awk ' { if (length($0) > max) max = length($0) } END { print max } ' <filename>

5. Print the longest line in the file :

awk ' { if (length($0) > max) { max = length($0) ; line = $0 } } END { print line } ' <filename>

Exit status :

If an AWK command runs successfully the exit status will be 0; if it fails, the exit status will be non-zero.
We can also give an exit code manually with the exit statement, in which case the awk command will exit with that code.
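For example, here is a minimal sketch that exits with code 2 as soon as a line matching the pattern error is found (the pattern and the code 2 are just illustrations) :

awk ' /error/ { exit 2 } ' <filename>
echo $?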

Thursday, February 12, 2015

Triggers

Triggers:

Triggers are PL/SQL blocks of code that are executed automatically when a specified event occurs. They are mainly used for auditing, for preventing users from performing certain activities, for security purposes, and so on.

There are several types of triggers in Oracle. Let's see each one of them.

DDL Trigger :

We can write a trigger such that it is executed before/after a DDL statement (drop, create, alter, etc.). Such triggers are called DDL triggers.

DML Trigger :

A trigger that is executed when a DML operation like an insert/update/delete happens is called a DML trigger.

Event Trigger :

When a trigger is fired upon a system event like a logon/logoff of the database, it is called an event trigger.

Instead of Triggers :

We can cause a trigger to be fired instead of performing the triggering operation itself, typically on a view; such triggers are called INSTEAD OF triggers.

Compound Triggers :

This is a new concept released in Oracle 11g. A compound trigger allows a single trigger to define actions for more than one timing point (before statement, before each row, after each row, after statement).

Components of a trigger :

A trigger has the following components :

1. Trigger name
2. Triggering event (update/insert etc.)
3. Triggering time (before/after)
4. Triggering level (statement/row level)

Syntax of a Trigger :

A Trigger can be written as :

create or replace trigger trigger_name
(before/after/instead of) (update/insert/delete) on table_name
begin
    <code>
end;

Let's see each type of trigger now.

DDL Trigger :

A DDL trigger is fired when a DDL statement is executed in the schema or database on which the trigger is defined.

eg :

create or replace trigger abc_test
before drop on schema
begin
  -- fire only when someone tries to drop the employees table
  if ora_dict_obj_name = 'EMPLOYEES' then
    raise_application_error(-20000, 'the table employees cannot be dropped');
  end if;
end;

In the above example the trigger is executed whenever any user issues a drop in the schema, and it raises an exception when the object being dropped is the employees table. Note that a DDL trigger is defined on the schema or database rather than on an individual table, and raise_application_error expects an error number in the range -20000 to -20999. Along with this we can also insert a record into a table that stores the user and the timestamp, so we can find out whether anyone has tried to drop the employees table and at what time. This is really useful for security purposes. Since the trigger is called before a DDL statement, it is called a DDL trigger.
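As a sketch of the auditing idea, assuming a hypothetical audit table ddl_audit(username, event_time, operation, object_name), a trigger like the one below records every DDL attempt in the schema using Oracle's event attribute functions :

create or replace trigger audit_ddl
after ddl on schema
begin
  -- record who issued which DDL on which object, and when
  insert into ddl_audit (username, event_time, operation, object_name)
  values (ora_login_user, sysdate, ora_sysevent, ora_dict_obj_name);
end;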

DML Trigger:


create or replace trigger emp_update
after update on employees
begin
  dbms_output.put_line('The employee table is updated at ' || sysdate);
end;

Above is an example of a DML trigger. Since we created it as after update on employees, the trigger is fired once an update on the employees table completes; if we had created it as before update on employees, it would fire before the table is updated. The output reports the time at which the employee table was updated. This type of trigger is also called an after trigger, since it fires after the update on the table.

Event Trigger :

Suppose you are working as a database administrator, you have very important data in a database, and whenever someone logs in you should be able to track which user logged into the system and at what time. In this type of scenario we can go for an event trigger.
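A minimal sketch of such a logon trigger, assuming a hypothetical audit table login_audit(username, login_time) :

create or replace trigger log_logins
after logon on database
begin
  -- track every user who logs in and at what time
  insert into login_audit (username, login_time) values (user, sysdate);
end;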

Row level triggers:

Row level triggers are executed once for each row that is affected by a DML statement. For example, if there is an after update trigger on employees and an update affects 10 rows, then the trigger is fired 10 times. It is indicated by the for each row clause in the trigger. Eg :

create or replace trigger trig_name
after update on employees
for each row
when (new.salary < old.salary)
begin
  insert into audit_table values (:new.salary, :old.salary, :new.emp_id);
end;

In the above trigger the for each row clause indicates that it is a row level trigger, and the when clause restricts the trigger execution. In this case the trigger body runs only when the newly updated salary is less than the old salary. (Inside the when clause, new and old are written without the leading colon.)
The :new and :old qualifiers are called correlation names, and they can be used only with a row level trigger.

Statement level trigger :

A statement level trigger is fired only once per statement, even though the statement may affect multiple rows. By default a trigger is a statement level trigger.
We cannot use the :old and :new qualifiers in statement level triggers, but they are the right choice when we need a single operation to be done after a DML command.

Difference Between Row level and Statement Level triggers :

Row level triggers are fired once for every row affected, whereas statement level triggers are fired only once per operation.

Row level triggers can use the qualifiers :new and :old, whereas statement level triggers cannot.

Commit in Triggers:

We cannot issue a commit/rollback inside a trigger body. This is because triggers are part of a larger transaction, and a commit/rollback inside the trigger could change the state of the main transaction's data.
For example, suppose we issue a commit inside a before update trigger that inserts a record into an audit table. Once we issue the update command, the trigger runs and inserts the record into the audit table. But what if the original update statement fails? The audit record would already be committed.
If there are situations where we must commit inside a trigger, we should use the PRAGMA AUTONOMOUS_TRANSACTION directive. It means the trigger executes as an autonomous transaction and can commit/rollback on its own. But this is generally not recommended.
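A minimal sketch of such an autonomous trigger, reusing the audit_table from the row level trigger example above :

create or replace trigger emp_salary_audit
after update on employees
for each row
declare
  pragma autonomous_transaction;
begin
  insert into audit_table values (:new.salary, :old.salary, :new.emp_id);
  commit; -- allowed here only because the trigger body runs as an autonomous transaction
end;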

Drawbacks of Triggers :

Since triggers are executed automatically, creating unnecessary triggers adds hidden cost to every statement that fires them.

We should never write a trigger that performs an operation that cannot be rolled back.
For example, suppose I have an after insert trigger that sends a mail to the admin saying a particular record was inserted. We issue an insert and the mail is sent to the admin. What if we then roll back the insert? The record no longer remains in the table, but the mail saying it was inserted has already gone out. So calls that cannot be rolled back, such as the UTL packages, should not be written in a trigger body.

Compound Triggers : Oracle 11g :

Compound triggers are a new concept introduced in Oracle 11g.
They allow a single trigger to define actions for multiple timing points, as shown in the skeleton below.
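A minimal skeleton of a compound trigger on the employees table, just to show the shape of the syntax :

create or replace trigger emp_compound
for update on employees
compound trigger

  before statement is
  begin
    dbms_output.put_line('runs once, before the update statement');
  end before statement;

  after each row is
  begin
    dbms_output.put_line('runs once for every updated row');
  end after each row;

end emp_compound;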

Wednesday, January 7, 2015

Indexes In Oracle - Part 2

Function-based Index :

Function-based indexes are most beneficial when the where clause of a SQL statement applies a function to a column.

Eg :

select * from employees
where upper(employee_name) = 'ABC';

If the above statement is run frequently, creating an index on employee_name alone will not help, because the query applies the function upper() to the column and an ordinary index on the raw column will not be used. Hence we must create an index on upper(employee_name). This speeds up the query, since the index is built on the result of the function :

create index v_idx on employees (upper(employee_name));

To enable this index we must set two session parameters, query_rewrite_enabled and query_rewrite_integrity.

Query_rewrite_enabled :

This session parameter has three values: false, true, and force. If it is set to false, the optimizer will not rewrite the query to use the function-based index; true allows the rewrite; and force ensures the query is rewritten to use the index. The companion parameter query_rewrite_integrity controls how strictly the rewrite is validated and is typically set to trusted, as shown below.
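A minimal sketch of enabling both parameters for the current session (trusted is the value commonly used with function-based indexes; recent Oracle versions enable much of this by default) :

alter session set query_rewrite_enabled = true;
alter session set query_rewrite_integrity = trusted;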

Bitmap Index :

Bitmap indexes are mainly used in data warehousing environments where DML activity is low. They are very useful for low cardinality columns, i.e. columns whose number of distinct values is a tiny fraction of the row count (say less than 0.1%).
For example, a Gender or a Marital Status column has very few distinct values, so a bitmap index is very useful there. A bitmap index stores one bitmap per key value, where each bit maps to a rowid; a set bit means that row contains the key value. This makes scanning the index and retrieving the data very efficient.
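A minimal sketch, assuming the employees table has a gender column :

create bitmap index emp_gender_idx on employees (gender);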

Disadvantage :

The disadvantage of a bitmap index is that if the table is manipulated often with inserts or updates, maintaining the index causes significant overhead.
Also, a deadlock condition may arise if multiple sessions try to insert records into the table at the same time, because a single bitmap entry covers many rows.

B-tree Index :

The B-tree index is organized in the form of a tree, hence the name. This index is very useful when a column has a wide range of distinct values. The tree starts with a root node and ends in leaf nodes. When a query is issued, Oracle starts at the root node, decides which branch leads toward the key, and then traverses the leaf node to locate the rowids of the matching data.
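The B-tree is the default index type, so a plain create index builds one. A minimal sketch on the employee_name column :

create index emp_name_idx on employees (employee_name);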