Monday, June 26, 2006
Oracle Performance Monitoring on Windows
Performance monitoring is a regular task for most Oracle DBAs. For Oracle on Windows this is no exception. The only difference are the tools that are available to you. With Oracle on Windows you still have all of the tools available to you that are available on other platforms, such as the Automatic Workload Repository, but in addition, you can view Oracle statistics via the Windows Performance Monitor (perfmon).
There are several advantages to monitoring Oracle via perfmon. By using perfmon, you can collect Oracle data as well as OS data that might be important to your overall performance monitoring plan. By consolidating your data collection, you will find it easier to analyze data. In addition, perfmon data can be collected and analyzed via the Microsoft Operations Manager (MOM) product. MOM allows you to collect and analyze long term data as well as to configure alerting functions that can let you know that there is a problem.
By default, Oracle statistics are not collected for use by perfmon. By following the steps outlined here you can quickly and easily configure Oracle for perfmon data collection. By default, you still have many ways of monitoring Oracle. Among these are the following:
- Oracle Automatic Workload Repository. The Automatic Workload Repository (AWR) is performance data that is automatically collected by Oracle for performance and trouble analysis.
- Event Viewer Alerts. The event viewer alerts are events that are automatically logged directly into the Windows alert log.
- Oracle Alert Log. The Oracle alert log is the log where Oracle logs status and error messages.
- Trace Analyzer. Oracle sessions can be traced and that trace data analyzed via the Oracle trace analyzer.
The Oracle Counters for Windows Performance Monitor package is not installed by default. In order to install them when you install Oracle, select the custom install option. You can also install this option later via the Oracle installer. Select custom installation and select the Oracle for Windows Performance option. This will install this package.
Once Oracle Counters for Windows Performance Monitor has been installed, you must perform one more piece of setup. The Oracle performance counters are set up to monitor one Oracle instance. Information about this instance must be configured in the registry. In order to do this, from a command prompt run orafcfg.exe with a username, password and Oracle net service name as follows:
operfcfg –U system –P password –D orcl
This will update the registry. You should now be able to monitor Oracle via perfmon. Some of the things that you can monitor are:
- The Oracle Buffer Cache. Here you can see the cache miss ratio.
- Shared Pool Stats. This collection includes the data dictionary cache, and the library cache.
Log Buffer. Provides information on log space requests.
- Database Data Files. This object provides physical read and write per second counters.
- DBWR stats. Provides information on the DB Writer processes.
- Miscellaneous. Other statistics include dynamic space management, free lists and dynamic sorts.
By taking advantage of Oracle Counters for Windows Performance Monitor you can easily and efficiently monitor Oracle along with monitoring the OS. As I mentioned earlier, perfmon provides valuable performance data that is easily collected and analyzed. Some of the most important and first counters that I look at when performance monitoring a system are:
- Processor: %Processor Time. This gives me a quick look at how busy the system is.
- Physical Disk: Avg. Disk sec/Read, Avg. Disk sec/Write. This provides me with an overview of how well the I/O subsystem is doing.
When first looking at a system I am actually more interested in disk latencies than throughput. The Avg. Disk sec/Read and Avg. Disk sec/Write should be in the range of 5-15 ms (0.005 – 0.015). Anything higher than this indicates a problem. Once I have detected a problem, I can try to determine its cause.
Of all of the platforms I work on; Linux, AIX, HP/UX, and Solaris, I find that Windows provides the best performance monitoring tool. Perfmon is easy to use, quite intuitive and easy to view and save the results in a number of formats. Oracle Counters for Windows Performance Monitor is not an end-all, but in conjunction with AWR stats and tracing will help to provide a complete picture.
Monday, June 19, 2006
I/O Performance and Oracle
One of the biggest problems that performance tuning consultants have encountered in the last few years is the introduction of larger and larger disk drives. This has led to database systems being created with larger and larger disk drives, but fewer of them. This has led to an epidemic of I/O performance problems. Let me explain further.
Most I/O in Oracle is random in nature. This is due to the fact that you have a system where many people are looking for many diverse pieces of data. In addition, structures such as indexes work by retrieving one block that points to another block and finally to the data block itself. Thus almost all I/O can be considered random.
The basic structure of the disk drive has not changed significantly in the last few years. Since disk drives have gone from 2 GB in size to 300 GB in size, the average seek time has dropped from 9 ms to 4.5 ms. This represents a 150 times increase in size, with a 2 times increase in performance. This has caused a problem where large databases are being configured on fewer and fewer disk drives.
To make things worse, some vendors are now selling Serial-ATA drives that have an average seek time of 8 ms or higher. This represents a 250 times larger drive with essentially no increase in performance. SATA drives should be avoided for Oracle storage if Fibre Channel or SCSI drives could be used.
At the same time, with Oracle on Windows (32-bit) databases are getting larger and larger with memory not increasing significantly. This has caused an upturn in I/O problems occurring with Oracle systems on Windows. The good news is that the solution to this problem is simple, add more disk drives.
How do you determine if you are having I/O problems? There are several methods. The Windows performance monitor (Perfmon) provides a wealth of I/O performance data. To me, the most significant I/O counters are Avg. Disk sec/Read and Avg. Disk sec/Write. These represent the disk latencies and are your first indication that you are having a problem. Disk latencies should be in the range of 5 ms – 15 ms for reads and 0 – 10 ms for writes (if a write cache is used). The performance counters for Windows are very accurate and very useful in determining if you have a performance problem.
In addition to the Windows Perfmon counters, Oracle AWR reports can also be quite useful in determining if you are having an I/O problem. Look for the top 5 wait events. If you see one or more of the Top 5 Timed Events in the System I/O wait class, this is an indication of an I/O problem. In addition, the I/O statistics also report average read ms and average write ms per data file and per tablespace. This information is very useful.
In summary, I/O is critical to the performance of your Oracle Database system. Don’t confuse size with performance. The performance capacity of your I/O subsystem depends on the number of disk drives, not the size of the drives. Performance and sizing are critically linked. An undersized I/O subsystem will perform poorly no mater how much CPU capacity and memory you have. If you are experiencing performance problems, check the I/O subsystem first.
Saturday, June 10, 2006
Oracle High Availability on Windows using Fail Safe
Oracle on Windows supports the same Oracle Disaster Recovery (DR) and High Availability (HA) features as other platforms, such as Oracle RAC and Data Guard. In addition to these, Oracle on Windows supports Oracle Fail Safe. These are all excellent solutions with different usages, costs and performance.
Oracle RAC is a high availability solution that allows multiple Oracle instances to access the same database. This provides for both high availability and additional performance, since all nodes are active. Oracle RAC is an excellent product, but somewhat complicated to install and manage and can be expensive.
In addition to Oracle RAC is Oracle Data Guard. Data Guard is really more of a Disaster Recovery (DR) solution than a High Availability solution, but it can be very useful in the event of a system failure. Data Guard is a fantastic product which I have installed many times. Data Guard is one of those products where the base installation can be fairly straight forward, but there are so many options and modes of operation that it can quite quickly become very complicated.
The final solution in this discussion is Oracle Fail Safe. Fail Safe is a Windows only product that makes Oracle Microsoft Cluster Services (MSCS) aware. Failsafe is available for all Windows platforms (32-bit, x64 and Itanium) and is downloadable from the Oracle web site. Fail Safe is a plug in to MSCS to allow the Oracle instance and listener to be managed under the control of MSCS. This allows the Oracle instance to fail over in the event of a failure of one of the nodes.
The MSCS architecture is very straight forward. Two systems share the same disk, of which only one system controls at a time. In the event of a failure (determined by the heartbeat mechanism), the instance currently running the Oracle instance and controlling the storage is replaced by the standby system. This allows the Oracle database to be back up and running in a matter of minutes, thus improving availability.
Oracle Fail Safe has the advantage over RAC in that it is much easier to implement and administer. Most Windows shops already have implemented MSCS and are very familiar with it. In addition, Oracle Fail Safe is a core feature of Oracle9i and Oracle10g for Windows and no additional license is needed.
With Oracle Fail Safe you can fail over the following Oracle components:
- Oracle Database Server.
- Oracle Listener.
- Oracle Intelligent Agent.
The failover is seamless and works very well. In addition, the management tools allow you to configure Fail Safe to automatically fail back a node at a scheduled time, thus allowing you to configure different hardware resources for the primary and standby servers.
I have worked on multiple Oracle Fail Safe installations. In a Windows environment where you want to have assurance of High Availability at a low entry point cost, Oracle Fail Safe might be for you.
Configuring Oracle Fail Safe is easy once the Microsoft Cluster Services have been installed. The following steps are done in order to install Oracle Fail Safe.
- Install Windows on both servers.
- Install MSCS (if applicable) and create the cluster.
- Install Oracle Database Software on each node’s local disks, using the same Oracle Home directory.
- Create the Oracle database on one node using the shared disk for storage.
- Install Oracle Fail Safe on each node in the cluster.
- Configure the cluster using the Fail Safe manager.
You will find that Fail Safe is an easy to use, inexpensive way to provide high availability in a Windows environment.
Wednesday, June 07, 2006
Oracle RAC on Windows
For those of you who are not familiar with Oracle RAC, let me give you a brief overview. Oracle RAC (Real Application Clusters) has been around since Oracle9i. It is a follow-on product to Oracle Parallel Server (but much better). In essence, Oracle RAC is a clustering product that allows Oracle on multiple servers to access the same database.
This is unlike a typical failover cluster in that all nodes are active and accessing the same data. In a failover cluster only one node can access the database at a time. With RAC it is all nodes all the time.
Oracle defines an instance to be the memory and processes necessary to access the Oracle database. In a RAC cluster, there are multiple instances accessing the same database. RAC accomplishes this via a sophisticated locking mechanism that keeps data consistent among the nodes. Because of the way RAC does its locking and cache consistency, the performance of the cluster interconnect is critical. The cluster interconnect must be Gigabit Ethernet as a minimum.
Setting up RAC on Windows is pretty much the same as setting up RAC on other platforms with a few exceptions. Because of the way Windows does its networking, it is important to have the Public (talks to the world) interface listed first in the advanced network properties.
All nodes in the cluster must be able to access all other nodes without having to type in a password. This can be accomplished via several methods. The preferred method is to install and configure Oracle and RAC using a domain account. This account must be available on all nodes in the cluster and it must have local administrator rights. The second method is to use the local Administrator account, but make sure that the password is identical on all nodes in the cluster.
Because all nodes in the cluster access the same database, they must all be able to see the same storage. There are several methods supported under Windows for sharing storage. They include the use of raw devices, Oracle ASM (Automatic Storage Management) and the Oracle Cluster Filesystem (OCFS). Oracle ASM and raw devices are available on all platforms that support RAC, but OCFS is only available on Windows and Linux. I have implemented RAC clusters on Windows using all of these methods, and they all work well, but raw devices can be very difficult to administer. I prefer to either use ASM or OCFS. With OCFS and OCFS-2, Oracle supports the use of a shared Oracle Home directory so that there is only one copy of the Oracle binaries.
Oracle RAC is an important feature to Oracle since it provides a scalable solution that allows smaller systems to provide greater overall performance to the cluster. This opens the door for commodity hardware and operating systems such as Microsoft Windows to provide higher performance systems by combining the power of multiple systems into one cluster. Oracle currently supports up to an 8 node cluster with Oracle 10g R2 on Windows, however, the largest RAC cluster on Windows that I have been involved with has been 6 nodes. The most common clusters that I have seen are 4 or 2 node clusters.
I recommend using the largest system that you can, so that as you need more nodes you can add them in one at a time. It is easier to add a node than to add more resources to an existing node. Don’t forget to allocate plenty of memory for Oracle. The RAC cluster is scalable, but you will eventually run out of bandwidth on the interconnect if you add too many nodes (depending on what you are doing).
In general, Oracle RAC is a great product that provides for both scalability and failover capabilities. Microsoft Windows is a great platform to run Oracle RAC on.
Edward Whalen is the Chief Technologist and Founder of Performance Tuning® Corporation and has been working with Oracle for over 15 years. He is currently working on his 9th book; “SQL Server 2005 Administrator’s Companion”. He holds the distinction of being one of the few people to write books on both Oracle and MS SQL Server and writes for both Oracle Press and Microsoft Press.