Monday, July 2, 2012

Oracle Architecture

Oracle Database Server :
                                    
Now let us focus on the Oracle server. Though many of us are very good programmers, we often have little idea of what happens inside the database once we execute a query (with hectic delivery dates, who bothers about all that?). Let us wade a little into that world now.

Consider the Oracle server as similar to a human being. We have hands, eyes and ears, and we do all our tasks with these; we also have a memory where we store the happenings. In the course of time we may forget some things while others remain. So we create a blog and write down the happenings of the day so that at a later point in time we can recollect them by reading it. The Oracle server is exactly the same: it has a memory, it has processes (like the tasks that we do), and it also writes all the data into files (just like our blog).


So essentially a database server has the following:


1. Instance --> comprising the background processes + memory

2. Database --> comprising the various physical files

The instance and the database work together in coordination. Normally one instance is associated with one database; this mode is called a single instance server. There are also cases where multiple instances are associated with the same database; this is called parallel server or RAC (Real Application Clusters). In a 64 node RAC there are 64 instances attached to a single database. But one instance cannot be connected to multiple databases.
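As a quick sanity check, the instance name and the database it serves can be queried from two standard dynamic performance views (a minimal sketch; both v$instance and v$database are documented views):

select instance_name, status from v$instance;
select name, open_mode from v$database;

On a single instance server both queries return one row each; on a RAC node, gv$instance lists all the instances of the cluster.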

Connecting to the database :


Assume that our database server is installed in the UK and we need to connect to it to execute some tasks. We need a client machine running a tool such as TOAD or SQL*Plus, and then we need the following information to connect to the database.


1. IP address of the server

2. Database name

3. Port number

4. Transfer protocol

If we have all of these, we need to present them in a format that the client machine can understand in order to connect to the server. If we have access to the tnsnames.ora file we can see the connection string (a weird looking one) with an alias for it. We then specify the username/password along with the alias name, or type the entire string, in order to connect to the server.
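For illustration, a typical tnsnames.ora entry looks roughly like the following; the alias UKDB, host, port and service name here are all placeholders, not values from any real setup:

UKDB =
  (DESCRIPTION =
    (ADDRESS = (PROTOCOL = TCP)(HOST = 10.0.0.5)(PORT = 1521))
    (CONNECT_DATA = (SERVICE_NAME = ukdb))
  )

With this alias in place, connecting from SQL*Plus is simply sqlplus username/password@UKDB.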


Dedicated Server Mode :


Now assume that we have the IP, DB name, port number, transfer protocol, username and password for the DB and we are connecting using TOAD. Once the connection is initiated, a listener verifies the credentials, forks a dedicated server process for the client, and then asks the client to communicate directly with that dedicated server. All the queries run by the client are now executed by the dedicated server on behalf of the client. The dedicated server uses the memory and the processes (together known as the instance) to accomplish the tasks. For every client connection to an instance a dedicated server is created: if there are 100 clients connecting to the server there will be 100 dedicated servers, and each dedicated server is a heavy process. There is a disadvantage here. Even if a client rarely uses the server, its dedicated server remains and can be used only by that client, and adding a new dedicated server for every client is not a feasible choice. Hence in such cases it is recommended to go for the shared server mode.
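To check which mode a given session is using, the SERVER column of v$session can be queried (a small sketch; typical values are DEDICATED, SHARED or NONE):

select username, server from v$session where username is not null;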



Shared Server Mode :

In the shared server mode, when a user tries to connect, the listener accepts the connection and, instead of invoking a dedicated server, hands the client over to a lightweight process called a dispatcher and asks the client to communicate with the dispatcher directly. A dispatcher is shared: one dispatcher can serve many clients at once. The dispatcher puts each incoming task into a data structure in memory called the request queue. Tasks from all the dispatchers are queued in the request queue, and a small pool of shared server processes picks up tasks from the request queue, executes them, and transfers the result sets to another data structure called the response queue, from which the owning dispatcher picks up the output and delivers it to the user. So even if there are thousands of users connecting to the server, only a handful of dispatchers and shared servers are needed. Since dispatchers and shared servers are far fewer and lighter than dedicated servers, this is much more feasible than having 1000 users with 1000 dedicated servers. Here the dispatchers are just mediators; the work is actually executed only by the shared servers.
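Shared server mode is driven by two initialization parameters; a minimal sketch (the numbers below are illustrative, not recommendations):

-- start a pool of five shared server processes
alter system set shared_servers = 5;
-- create three dispatchers listening on TCP
alter system set dispatchers = '(PROTOCOL=TCP)(DISPATCHERS=3)';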


Now coming back to the Oracle server: we discussed that it is made up of an instance, which has the processes and memory, and a database, which contains the files. We will get into the database part before discussing the instance.



Database Files :

The following types of files are available in the Oracle server.

1.Parameter file
2.Control file
3.Data file
4.Redo log file
5.Trace/Alert file
6.Password file
7.Temp file

We can look at them one by one.

Parameter File :

The parameter file consists of several key value pairs and is essential for starting up a database. To start a database with a pfile we issue the command below.

STARTUP PFILE=<pfile location>

The pfile is a plain text file which can be opened, edited and renamed manually. This has two main problems. If there are 10 administrators and each of them creates their own pfile to start up the database, there is no longer one true copy of the pfile. Also, since it is manually editable, if any of the DBAs edits a value wrongly the database will throw an error during startup, and if the parameters were changed some time ago along with many other parameters, reverting back will be a problem.
Hence from 9i onwards Oracle introduced a file called the spfile, or server parameter file. The server parameter file is a binary file which cannot be edited manually. Oracle still permits the use of a pfile, but the recommendation is to use the spfile for starting up the database.
If we want to change any parameter in the spfile we must issue an ALTER SYSTEM command on the database. There is a dynamic performance view called v$parameter which holds all the default, non default and calculated parameters. In the pfile we may not see that many parameters, since it holds only the explicitly set ones. To start up a database we need only the three parameters below.

1. CONTROL_FILES
2. DB_BLOCK_SIZE
3. DB_NAME

But in v$parameter we will come across many more parameters. Now, if we want to view the parameter named SQL_TRACE, we can execute the following (note that parameter names are stored in lowercase in this view):

select * from v$parameter where name = 'sql_trace';

There is also a shortcut to view the parameter:

show parameter sql_trace;

Suppose sql_trace is set to FALSE and we want to change it to TRUE. In that case we have the following options.

1. If we are using a pfile, we can manually edit the pfile and change the value.
2. If we are using an spfile, we issue one of the following commands in the DB:

alter system set sql_trace = TRUE scope = memory;
alter system set sql_trace = TRUE scope = spfile;
alter system set sql_trace = TRUE scope = both;

There are three scopes here. If the scope is memory, the sql_trace value is changed only in the running instance; once the database is restarted the value will be FALSE again. If the scope is spfile, nothing changes in the running instance, but sql_trace is set to TRUE in the spfile, so the change takes effect when the database is restarted. If the scope is both, the running instance and the spfile are both changed to TRUE. Depending on the necessity of the change we decide which scope to give.

We can also create an spfile from a pfile. If the spfile is corrupted, use the strings command on it to extract the key value pairs, copy them into a new pfile, and then create a new spfile from that pfile to start the database. The non default parameters are also recorded in the alert/trace files at startup, from where we can get them back.
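Both conversions are single commands; the file paths below are placeholders:

-- recreate the spfile from a known good pfile
create spfile from pfile = '/u01/app/oracle/dbs/initORCL.ora';

-- dump the running spfile into an editable pfile
create pfile = '/tmp/initORCL.ora' from spfile;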

To understand more about the parameters in the pfile, read the Oracle documentation on parameter files.

Data Files :

Data files are where all the data is actually written. For example, if we create a table and insert records into it, all the inserted records are eventually written into a data file. Assume that we are creating the following table.

create table test (
  employee_id   number,
  employee_name varchar2(1000)
)
tablespace users
storage (initial 50k);

When the above object is created in the database, internally the database allocates a space of 50k, called a segment, for the data in the table test. The segment is a logical structure that consists of one or more extents. The data inside an extent is contiguous, whereas the extents themselves may or may not be contiguous. The extents consist of very small units called blocks, where the data actually resides. All these blocks, extents and segments are contained in a container called a tablespace. These are the logical structures of the data; each tablespace is associated with one or more data files like a.dbf or b.dbf, and once we insert a record into the table it actually gets written into a data file. The tablespace, extents, segments and blocks are only the logical structure in which the data is organised. When we don't specify any storage clause Oracle assigns a default space for the table, and when the tablespace name is not specified the default tablespace of the user is assigned. Now let us consider the example of the above table creation. Once the table is created, the following will be the scenario.

1. A segment of 50k is assigned to the table test. This segment can contain any number of extents; let us assume this table has 2 extents of 25k each, but it could also be 4 or 5 extents with variable sizes. Each of our extents has several blocks. Since we have not yet inserted any records into the table, the blocks are all empty. We created this table in the USERS tablespace, and this tablespace is associated with the data file users.dbf. Now, as we insert records into the table, the blocks in the extents start to fill with data. Let us look at the structure of blocks now.

(Figure: the structure of an Oracle data block, after Figure 2-2 in the Oracle documentation.)



This is how a typical block looks in a database. A block has the following parts.

Header :

The header of the block contains information such as what type of data is in the block (table data, index data and so on), the block's address on disk (since the data in the blocks is ultimately written to disk), and the transaction details if any DML is happening in the block.

Table Directory:

This part is optional, but it contains details of the table involved. In our case this specifies the table test.

Row Directory :

This holds the addresses of the rows inserted into the block.

Data Overhead :

The header, together with the row directory and the table directory, is called the data overhead. It contains the metadata, i.e. the details about the data.

Free Space :

There is some free space in the block to provide room for additional inserts (and for existing rows to grow on update).

Row data :

This is where the actual data resides, row by row.

Now let us come back to the above example. The blocks are all empty in the beginning, and now we start to fill them with data. When one extent is filled, the inserts move on to the next extent. Once all the extents are filled, Oracle automatically allocates a new extent to the segment and continues inserting the records. The data is actually being written to the data file users.dbf; let us suppose the size of that file is 3 GB. Once the entire file is filled with data we can't proceed with our inserts, since Oracle throws an error that there is no space left to write the data into. In that case we need to add an additional data file, say users1.dbf, and associate it with the USERS tablespace. Thus a segment can have as many extents, and a tablespace as many data files, as needed. The same tablespace may also contain many other segments. Suppose I create an index on the table test: since an index is a DB object, a segment will be created for the index in the USERS tablespace. Similarly I can create as many tables as I want in a tablespace. The maximum size of a (bigfile) data file with a 32k block size is 128 TB, and with an 8k block size it is 32 TB.
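As a sketch (the file path and sizes are illustrative), the extents of the test table can be inspected, and a second data file added, like this:

-- list the extents allocated to the segment of table TEST
select extent_id, blocks, bytes from dba_extents where segment_name = 'TEST';

-- add another data file to the USERS tablespace when it fills up
alter tablespace users add datafile '/u01/oradata/users02.dbf' size 1g;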

Moving the table/index data from one tablespace to another :

In a scenario where the size of the tablespace grew exceptionally, we might want to move a table into a different tablespace. In that case we can issue a command such as:


alter table account move
  tablespace users
  storage (initial 10240 next 10240
           minextents 1 maxextents 121
           pctincrease 50 freelists 1);
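One thing to remember: moving a table rewrites its rows and changes their rowids, so every index on the table is marked UNUSABLE and must be rebuilt afterwards. The index name below is hypothetical:

alter index account_pk rebuild;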




Control file :

This is the most important file in the Oracle database system. In the parameter file we can see the location of the control files. Control files are binary files which we cannot read directly, but Oracle can. A control file contains the following information.

1.Database name
2.Location of data files and the redo log files 

The location of the data files and redo log files is essential for the database startup. To start the DB, Oracle first reads the init.ora (or spfile) to find the location of the control files, then reads the control files to obtain the names and locations of the data files and redo log files. So the control files should be multiplexed properly: in a production environment there should ideally be 3 to 4 copies of the control files, all specified in the parameter file. Then even if one disk crashes we can install a new drive, point the parameter at a surviving copy of the control file on another disk, and start up the database. Hence multiplexing the control files is very important for a database.
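Multiplexing is simply a matter of listing several copies in the CONTROL_FILES entry of the parameter file; a sketch with placeholder paths:

control_files = ('/disk1/oradata/control01.ctl',
                 '/disk2/oradata/control02.ctl',
                 '/disk3/oradata/control03.ctl')

The copies currently in use can be listed with: select name from v$controlfile;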

Temp File :

Suppose we issue a select query with an order by clause: the intermediate sort records reside in the temp file. Oracle handles temp files and data files differently. It does not generate redo for the data that resides in the temp file, as it sees no purpose in doing so. So when we create a table or any important database object we should never place it in the temporary tablespace, since there will be no redo for it and the data cannot be recovered in case of a crash.
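Temp files are created with their own syntax; a minimal sketch with a placeholder path and size:

create temporary tablespace temp2 tempfile '/u01/oradata/temp02.dbf' size 500m;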

System Global Area (SGA):

This is the shared memory region in Oracle. It contains the following components.

Buffer pool:

It caches recently read data file blocks, so that they need not be read from disk again.

Shared pool:

It caches recently executed queries (their parsed form and execution plan), so that they need not be parsed again.

Large pool:

It provides memory for large allocations, such as backup (RMAN) buffers and session memory for shared servers.

Java pool :

It caches the Java code.

Result set cache:

This is a new concept introduced in 11g. It caches the result set of a query directly, so a repeated query can be answered without re-executing it.

There are two types:

Client side result cache --> the result set is stored on the client side.
Server side result cache --> the result set is stored on the server side.
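On the server side the cache can be requested per query with the RESULT_CACHE hint; the EMPLOYEES table below is just the standard demo table, used for illustration:

select /*+ RESULT_CACHE */ department_id, count(*)
from employees
group by department_id;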

Oracle 11g new concepts

Adaptive cursor sharing :

Depending on the selectivity of a query, different execution plans can be created for a single SQL statement that uses bind variables.

This concept was introduced in 11g. Prior to that, a technique called bind peeking was used, where a single plan was generated by peeking at the first set of bind values and then reused for all subsequent executions of the query.

When the selectivity differs greatly between bind values, adaptive cursor sharing can improve performance many fold.

eg:

The first execution returns 50 rows.
The second execution returns 10000 rows out of 12000.

So an index scan is appropriate for the first and a full table scan for the second.

With adaptive cursor sharing, two different plans will be generated when the two executions run with their respective bind values.

There are two states:

bind sensitive
bind aware

A cursor is marked bind sensitive when the optimizer suspects that different bind values may call for different plans. The cursor becomes bind aware once Oracle observes that two separate plans really are needed for the different bind values. When a new plan is generated, the old child cursor is marked unusable.
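Whether a cursor has become bind sensitive or bind aware can be observed in v$sql; the two columns below are real 11g additions, while the query text is hypothetical:

select sql_id, child_number, is_bind_sensitive, is_bind_aware
from v$sql
where sql_text like 'select * from employees where department_id%';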

Cursor merging :

When two cursors end up with the same plan, Oracle merges them into a single cursor to reduce memory usage.

Baselines:

This is yet another concept introduced in 11g (SQL plan management). When the optimizer generates a new execution plan for a statement, the plan is compared with the accepted plans for that statement; if it performs better than the old ones it is added to the list, otherwise an already accepted plan is used. This set of accepted plans is called a SQL plan baseline.