Monday, June 19, 2006

 

I/O Performance and Oracle

One of the most common performance problems that I encounter is an insufficient I/O subsystem. This is especially true when working in an Oracle (or SQL Server) environment, because of the Database Server’s sensitivity to high I/O latencies. With Oracle, the time it takes for each I/O operation is much more important than the actual throughput that you can achieve (MB/sec). Because of this, the I/O subsystem should be monitored, tuned and sized sufficiently.

One of the biggest problems that performance tuning consultants have encountered in the last few years is the introduction of larger and larger disk drives. This has led to database systems being created with larger and larger disk drives, but fewer of them. This has led to an epidemic of I/O performance problems. Let me explain further.

Most I/O in Oracle is random in nature. This is due to the fact that you have a system where many people are looking for many diverse pieces of data. In addition, structures such as indexes work by retrieving one block that points to another block and finally to the data block itself. Thus almost all I/O can be considered random.

The basic structure of the disk drive has not changed significantly in the last few years. Since disk drives have gone from 2 GB in size to 300 GB in size, the average seek time has dropped from 9 ms to 4.5 ms. This represents a 150 times increase in size, with a 2 times increase in performance. This has caused a problem where large databases are being configured on fewer and fewer disk drives.

To make things worse, some vendors are now selling Serial-ATA drives that have an average seek time of 8 ms or higher. This represents a 250 times larger drive with essentially no increase in performance. SATA drives should be avoided for Oracle storage if Fibre Channel or SCSI drives could be used.

At the same time, with Oracle on Windows (32-bit) databases are getting larger and larger with memory not increasing significantly. This has caused an upturn in I/O problems occurring with Oracle systems on Windows. The good news is that the solution to this problem is simple, add more disk drives.

How do you determine if you are having I/O problems? There are several methods. The Windows performance monitor (Perfmon) provides a wealth of I/O performance data. To me, the most significant I/O counters are Avg. Disk sec/Read and Avg. Disk sec/Write. These represent the disk latencies and are your first indication that you are having a problem. Disk latencies should be in the range of 5 ms – 15 ms for reads and 0 – 10 ms for writes (if a write cache is used). The performance counters for Windows are very accurate and very useful in determining if you have a performance problem.

In addition to the Windows Perfmon counters, Oracle AWR reports can also be quite useful in determining if you are having an I/O problem. Look for the top 5 wait events. If you see one or more of the Top 5 Timed Events in the System I/O wait class, this is an indication of an I/O problem. In addition, the I/O statistics also report average read ms and average write ms per data file and per tablespace. This information is very useful.

In summary, I/O is critical to the performance of your Oracle Database system. Don’t confuse size with performance. The performance capacity of your I/O subsystem depends on the number of disk drives, not the size of the drives. Performance and sizing are critically linked. An undersized I/O subsystem will perform poorly no mater how much CPU capacity and memory you have. If you are experiencing performance problems, check the I/O subsystem first.

Comments: Post a Comment



<< Home

This page is powered by Blogger. Isn't yours?