Saturday, 21 September 2013

SCHED_NOAGE CPU problems with Oracle log writer on HPUX

One of Oracle's recommendations for running Oracle on HPUX is to use the SCHED_NOAGE CPU scheduling policy:

http://docs.oracle.com/cd/B28359_01/server.111/b32009/appb_hpux.htm

This replaces the standard HPUX CPU scheduler, which changes the priority of processes during their lifetime.  One of the reasons for this recommendation is to avoid "cursor: pin S" problems when CPU peaks are common:

http://srivenukadiyala.wordpress.com/2012/01/30/sched_noage-and-latch-contention/

By changing the CPU scheduling policy, we introduced another problem.  One of the characteristics of SCHED_NOAGE is that every process is started with the same priority.  By default, Oracle recommends using 178, which is the highest priority (the lower the number, the higher the priority) you can give to a process in SCHED_NOAGE.  The lower numbers are reserved for other scheduling policies (like realtime scheduling).

The problem we encountered is that at high loads, the log writer has to wait for other processes to complete, because it competes for CPU at the same priority as every other Oracle process.  Processes that are waiting for a commit must wait for the log writer to receive CPU time.  This can cause heavy delays in the applications waiting for these commits.

To solve this, we changed the priority of the log writer process (lgwr_DBNAME) so it would have priority over all the other processes.  

This reduced the commit time tremendously, and application performance improved.

The ideal way to give more CPU time to the log writer process is to change the default priority of all the Oracle processes to, for example, 180.  After the DB is started, the priority of the log writer is changed to 178.

The only problem is that Oracle support recommended against it, but was unable to supply an alternative.
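
The two-step approach described above could be scripted roughly as follows.  This is only a sketch, not the exact procedure we used: it assumes the instance-wide default (180 in the example) is already set on the Oracle side, for instance through the HPUX_SCHED_NOAGE initialization parameter, and that the log writer is then raised with the HPUX rtsched command.  The function names, the sample database name and the expected layout of the ps -ef output are hypothetical and only serve the illustration.  The code only builds the command; it does not run it.

```python
import shlex

# Values taken from the text above; the valid SCHED_NOAGE range is assumed
# to be 178 (highest priority) to 255 (lowest priority).
DEFAULT_PRIORITY = 180   # example default for all Oracle processes
LGWR_PRIORITY = 178      # highest priority available under SCHED_NOAGE


def find_lgwr_pid(ps_output, db_name):
    """Return the PID of the log writer for db_name, or None if not found.

    ps_output is expected to look like `ps -ef` output, with the PID in the
    second column and the process name (ending in lgwr_<db_name>) last.
    """
    target = "lgwr_" + db_name
    for line in ps_output.splitlines():
        fields = line.split()
        if len(fields) >= 8 and fields[1].isdigit() and fields[-1].endswith(target):
            return int(fields[1])
    return None


def build_rtsched_command(pid, priority=LGWR_PRIORITY):
    """Build the rtsched call that moves an existing process to `priority`."""
    if not 178 <= priority <= 255:
        raise ValueError("SCHED_NOAGE priority must be between 178 and 255")
    return ["rtsched", "-s", "SCHED_NOAGE", "-p", str(priority), "-P", str(pid)]


if __name__ == "__main__":
    # Hypothetical ps -ef excerpt for a database called PROD.
    sample = (
        "     UID   PID  PPID  C    STIME TTY       TIME COMMAND\n"
        "  oracle  4321     1  0 08:15:02 ?         0:12 ora_lgwr_PROD\n"
        "  oracle  4325     1  0 08:15:02 ?         0:03 ora_dbw0_PROD\n"
    )
    pid = find_lgwr_pid(sample, "PROD")
    if pid is not None:
        # Print the command instead of running it, so it can be reviewed first.
        print(" ".join(shlex.quote(part) for part in build_rtsched_command(pid)))
```

On a real system the printed command would have to be run after every database start by a user that is allowed to change scheduling priorities, since the log writer gets the default priority again each time the instance comes up.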

