Reprinted with Permission by Quest Software Aug 2005


Case Study: Using “On Servererror” Triggers to Work Around ORA-4031s

Brian Keating, DatabaseSpecialists 

One of the new features in Oracle 10g is “automated shared memory management”, or “ASMM”. When this feature is enabled, the instance, itself, will dynamically increase and decrease four of the SGA structures, in response to the current workload in the instance. One of main benefits of this feature is that it will (hopefully) minimize the possibility of ORA-4031 errors in the instance – because this feature will presumably increase the memory available to the shared pool automatically, before the onset of an ORA-4031.

Of course, for many sites it is not feasible to upgrade to 10g at the present time; so it is useful to determine if it is possible to implement any sort of “dynamic SGA” mechanism in previous releases. In particular, this mechanism should be able to dynamically increase the size of the shared pool, in response to ORA-4031 errors. It turns out that it is possible to implement such a mechanism in 9i; with a combination of the sga_max_size parameter, and “on servererror” triggers. This document will provide an example of setting up such a mechanism, along with providing some caveats on using this mechanism.

First of all, in 9i, it is possible to dynamically increase many of the SGA structures; however, the total size of all of the SGA structures can never exceed the value of the sga_max_size parameter. Also, the default value of sga_max_size is equal to the sum of all of the SGA components’ initial values (from the spfile or init.ora file). Finally, it is not possible to increase the sga_max_size parameter, itself, dynamically. Therefore, if you wish to be able to increase any of the individual components’ values dynamically, then you will have to explicitly set the sga_max_size parameter – in the spfile or init.ora file – to a value higher than the sum of all of the individual components’ values.

The other item required in this mechanism is an “on servererror” trigger. These types of system triggers fire whenever the instance encounters a particular, specified ORA- error. Therefore, it is possible to create a trigger that will increase the size of the shared pool, whenever the instance encounters an ORA-4031 error. An example of such a trigger is as follows (note that some of the lines wrap around to the next line):

CREATE OR REPLACE TRIGGER check_4031_trig
AFTER SERVERERROR ON DATABASE

DECLARE
  alter_string varchar2(100);
  new_sp_size number;
  cursor c1 is select value from v$parameter where name = 'shared_pool_size';
  cursor c2 is select instance_name, host_name from v$instance;
  g_mail_conn         utl_smtp.connection;
  g_mailserver        varchar2(80) := 'my_mailserver.my_domain';
  g_mailserver_port   NUMBER  := 25;
  g_mail_from         varchar2(1024);
  p_email             varchar2(1024) := 'my_email@my_address';
  v_subject           varchar2(4000);
  e_email_error       exception;
  crlf                VARCHAR2( 2 ):= CHR( 13 ) || CHR( 10 );

BEGIN

  if is_servererror(4031) then

    for cur_sp_size in c1 loop

      if cur_sp_size.value >= 2147483648 then
        null; -- don't increase the shared pool if it's already over 2 Gb

      else
        new_sp_size := cur_sp_size.value + 104857600;
        alter_string := 'alter system set shared_pool_size = '||new_sp_size;
        execute immediate (alter_string);

        for cur_db_host in c2 loop

          g_mail_from := 'oracle@'||cur_db_host.host_name;

          v_subject :=
            'From:'||g_mail_from|| crlf ||
            'To:'||p_email|| crlf ||
            'Subject: Warning: Shared pool increased in'||cur_db_host.instance_name|| crlf
            || crlf ||
            'Attention: an ORA-4031 error has occurred in the '||cur_db_host.instance_name||' instance on '||cur_db_host.host_name||'; and as a result the check_4031_trig trigger increased the shared pool size in that instance from '||cur_sp_size.value||' bytes to '||new_sp_size||' bytes.';

          g_mail_conn := utl_smtp.open_connection(g_mailserver, g_mailserver_port);
          utl_smtp.helo(g_mail_conn, g_mailserver);
          utl_smtp.mail(g_mail_conn, g_mail_from);
          utl_smtp.rcpt(g_mail_conn, p_email);
          utl_smtp.data(g_mail_conn, v_subject);
          utl_smtp.quit(g_mail_conn);

        end loop;

      end if;

    end loop;

  end if;

END;
/
As written above, this trigger increases the size of the shared pool by 100 megabytes, whenever an ORA-4031 error occurs in the instance in question. The trigger then sends an email message stating that fact to the email address specified in the “p_email” variable, through the mail server specified in the “g_mailserver” variable.

I have implemented such a trigger in one of my production environments, and for the most part, the trigger has worked out well. Before this trigger was implemented, ORA-4031 errors would frequently cause a “snowball” effect; i.e., the first ORA-4031 error would cause more and more subsequent ORA-4031 errors to occur, as contention for the shared pool became more and more intense. Needless to say, these ORA-4031 errors were quite disruptive to end users. Since this trigger has been implemented, the “snowball” effect has been eliminated; because the first ORA-4031 error causes the trigger to fire – and that trigger causes 100 additional megabytes of unfragmented memory to become available to the shared pool.

There are three warnings about the use of this mechanism. First – and most importantly – this mechanism is not a replacement for using good coding practices! In particular, one should never assume that this mechanism will eliminate the need to use bind variables appropriately. I completely agree with Tom Kyte, who has written extensively on the paramount importance of bind variables in relation to shared pool operations. Of course, in many cases DBAs do not have any influence over the SQL that will access a given instance (particularly in the case of third party applications); but it needs to be stated that the most effective way to prevent ORA-4031 errors is to use bind variables appropriately, in the SQL in question.

The second item of note is that this trigger might encounter problems when it tries to fire, if it is not “kept” in the shared pool. The reason for this is that if the trigger is not kept, then its code will almost certainly not be present in the shared pool, whenever the trigger tries to fire. As a result, the instance will have to load the trigger's code into the shared pool at the very moment that the trigger tries to fire – and since that trigger only tries to fire when an ORA-4031 error has occurred, it stands to reason that the instance may have problems when it tries to load the trigger's code into the shared pool.

I have seen this very issue occur with this trigger, many times. When I initially implemented the trigger, it was not kept in the shared pool; and as a result, this trigger would frequently encounter errors when it tried to fire. The errors encountered by the trigger were not ORA-4031 errors, though; the errors were ORA-600 [17113] errors. ORA-600 [17113] errors indicate shared memory corruption; so those errors also seem to be a result of shared pool contention. Note that the trigger did not always encounter these errors – sometimes the trigger was able to fire successfully, without any errors.

In any case, I have recently caused that trigger to get kept in the shared pool, every time the instance starts up (with the use of an “on database startup” trigger). Since then, the number of occurrences of ORA-600 errors encountered by the trigger has been dramatically reduced; but it has not been eliminated – the trigger still does occasionally encounter such errors, despite the fact that it is kept in the shared pool. I assume the reason for this is that the trigger, as written above, opens two cursors. So, I have now written a modified version of the trigger, which does not open any cursors – this version of the trigger just sets the shared pool to a hard-coded value (in this case, 1 gigabyte). The source code of the modified trigger is as follows:

CREATE OR REPLACE TRIGGER check_4031_trig
AFTER SERVERERROR ON DATABASE

DECLARE
  g_mail_conn         utl_smtp.connection;
  g_mailserver        varchar2(80) := 'my_mailserver.my_domain';
  g_mailserver_port   NUMBER  := 25;
  g_mail_from         varchar2(1024);
  p_email             varchar2(1024) := 'my_email@my_address';
  v_subject           varchar2(4000);
  e_email_error       exception;
  crlf                VARCHAR2( 2 ):= CHR( 13 ) || CHR( 10 );

BEGIN

  if is_servererror(4031) then

        execute immediate ('alter system set shared_pool_size = 1073741824');

          g_mail_from := 'oracle@my_host';

          v_subject :=
            'From:'||g_mail_from|| crlf ||
            'To:'||p_email|| crlf ||
            'Subject: Warning: Shared pool increased in my_instance'|| crlf
            || crlf ||
            'Attention: an ORA-4031 error has occurred in the my_instance instance on my_host; and as a result the check_4031_trig trigger increased the shared pool size in that instance to 1073741824 bytes.';

          g_mail_conn := utl_smtp.open_connection(g_mailserver,g_mailserver_port);
          utl_smtp.helo(g_mail_conn, g_mailserver);
          utl_smtp.mail(g_mail_conn, g_mail_from);
          utl_smtp.rcpt(g_mail_conn, p_email);
          utl_smtp.data(g_mail_conn, v_subject);
          utl_smtp.quit(g_mail_conn);

  end if;

END;
/

The version of the trigger above has never encountered any ORA-600 errors. Of course, this version of the trigger is only useful the very first time it is executed (after an instance restart) –
because the value of the shared pool size is hard-coded to a specific value. Therefore, subsequent firings of that trigger will set the shared pool size to the same value that it is currently set to – i.e., subsequent firings will have no effect. The trigger could be manually updated to set the shared pool to a higher value, of course; but the whole point of this mechanism is to avoid manual intervention.

The third and final item of note is that an alternative to this mechanism is to increase the shared pool size permanently, in the spfile – and therefore have the larger shared pool available at all times, rather than only having a larger shared pool after the trigger fires. There are two main reasons why increasing the shared pool permanently is a less effective solution than this mechanism. First of all, increasing the size of the shared pool has its own drawbacks; in particular, it causes the instance to have to search through a larger free list, in order to find free shared memory. As a result, increasing the size of the shared pool can potentially cause performance degradation – and in fact I have personally seen such degradation occur, multiple times. The other main reason that no matter how large the shared pool is, the entire shared pool can presumably end up with some degree of fragmentation. As a result, unless the SQL in question uses 100% bind variables, my experience is that an instance will always have the potential to encounter ORA-4031 errors, no matter how large its “permanent” shared pool size is. The advantage of this mechanism is that the additional memory that is added to the shared pool by the trigger is initially unfragmented – which allows the instance to use that additional memory to satisfy current shared pool requests.


About the Author

Brian Keating, OCP, has been an Oracle DBA and Unix system administrator for over ten years. He is currently employed as a senior database administrator at Autodesk, Inc., in San Rafael. You can contact Brian Keating at brianpkeating@yahoo.com.