Reprinted with Permission by Quest Software April  2004


Transaction, Heal Thyself!  Part 3
Darryl Hurley, Pinnacle Publishing

Oracle 9i introduces a long awaited feature that suspends certain transactions instead of failing them straight away. In last month’s article Darryl discussed how to augment resumable statements to automatically resolve storage issues. In this third article of the series he focuses on a key area of resumable statements, namely, responding to timeouts and terminations.

Click here to download the code used in this series of articles.

Where Are We Now?

The first two parts of this series laid the groundwork for automatically fixing space related errors by augmenting Oracle 9i’s resumable statement functionality. This expansion comes complete with a catchy name, THY, which loosely stands for Transaction, Heal Thyself. It’s also the name of the single Oracle schema holding all of the working code described below.

AFTER LOGON Trigger
A database wide trigger that fires whenever an Oracle session is established to enable resumable statements for qualified users.

AFTER SUSPEND Trigger
Another database wide trigger firing whenever an Oracle session is suspended due to an Oracle space related error. This trigger saves error details for later processing by the FIXER package.

FIXER Package
Running at regular intervals via DBMS_JOB this suitably named PL/SQL package engineers fixes for statements collected by the AFTER SUSPEND trigger.

SPACE_QUOTA_EXCEEDED Package
As the name implies, this procedure resolves space quota errors when invoked by the FIXER package.
They all work together in a scenario like this:
  1. User logs on and the AFTER LOGON trigger enables resumable statements
  2. User executes a statement requiring more tablespace than their allotment
  3. The statement enters a suspended (and potentially resumable) state
  4. The AFTER SUSPEND trigger fires and commits details of the suspension into a table
  5. The FIXER package retrieves suspension information from the table
  6. The FIXER package calls the SPACE_QUOTA_EXCEEDED procedure
  7. The SPACE_QUOTA_EXCEEDED procedure fixes the error
  8. The suspended statement magically resumes
Retrying Fixes

The delusional scenario above assumes a single fix attempt is enough to free the suspended session. What if it’s not quite enough? Imagine a situation where a quota increase of 100K would allow a suspended statement to resume but its repair entry adds 5K at a time. Should the statement be aborted or should the fix be retried? Being a firm believer in (and frequent recipient of) second chances, I think I’ll add multiple tries to my solution.

Two things must be established before adding code, the maximum attempts allowed for a fix and a count of actual attempts made. To be flexible, I allow differing maximums for each user/tablespace combination by adding a MAX_ATTEMPTS column to the SPACE_QUOTA_EXCEEDED_INFO table. Seeing as fixes in progress are recorded in the FIXER_INFO table it’s the perfect place for a NUM_ATTEMPTS column.

This example entry dictates that if SCOTT exceeds his USERS tablespace quota there will only be 20 attempts made to alleviate the problem adding 100 bytes at a time.

SQL> SELECT username,
  2     tablespace_name,
  3     max_attempts,
  4     increment_value 
  5     FROM space_quota_exceeded_info; 
  
USERNAME TABLESPACE_NAME MAX_ATTEMPTS INCREMENT_VALUE 
-------- ---------------- ------------ --------------- 
DRH      USERS                      20             100

SQE = SPACE_QUOTA_EXCEEDED Procedure

To reduce confusion and typing I will refer to this procedure as SQE from here on in.

Now on to the code! The SPACE_QUOTED_EXCEEDED (SQE) and FIXER packages work together to manage multiple retries like this:

SQE already queries fix information so I just have to add a call back to FIXER as show in Listing 1.
OPEN curs_get_info(fixer.v_fixer_info_rec.username,
                   fixer.v_fixer_info_rec.object_name);
FETCH curs_get_info INTO v_increment_value,
                         v_timeout_or_abort,
                         v_max_attempts,
                         v_tight_loop;

-- if handling info exists...
IF curs_get_info%FOUND THEN

  -- inform FIXER of maximum
  fixer.set_max_attempts(v_max_attempts);

Listing 1. SQE informing FIXER of maximum attempts value

The corresponding procedure in FIXER simply sets a local variable (i.e.: declared in the FIXER package body only) as shown in Listing 2.

/*---------------------------------------------------*/
PROCEDURE set_max_attempts(p_max_attempts NUMBER) IS
/*---------------------------------------------------*/
BEGIN
  v_max_attempts := p_max_attempts;
END set_max_attempts;

Listing 2. Setting maximum attempts value

The FIXER package also compares the current number of attempts with the maximum allowed to determine how to proceed as shown in Listing 3.

-- exceeded the maximum number of attempts allowed?
IF NVL(v_max_attempts,0) = 
   NVL(v_fixer_info_rec.num_attempts,0) THEN

  -- set attempts, processed date and comment
  UPDATE fixer_info
  SET num_attempts = NVL(num_attempts,0) + 1,
      processed = SYSDATE,
      comments = 'Max attempts exceeded'
  WHERE fixer_id = v_fixer_info_rec.fixer_id;

ELSE  -- have not exceeded max attempts

  UPDATE fixer_info
  SET num_attempts = NVL(num_attempts,0) + 1
  WHERE fixer_id = v_fixer_info_rec.fixer_id;

END IF;
Listing 3. Checking maximum attempts to fix

Attempts Must Be Recorded In The Database

It’s imperative that the count of fix attempts be saved to the database so it can be reestablished when another attempt is made - remember there may be more than one statement seeking resolution.

Tight Loops

So each suspended statement is processed in order of suspension time and receives a single fix attempt. Only after all other suspended statements take a turn does it get another try. But some statements like the company payroll for example merit immediate successive fix attempts before continuing with more routine tasks. I implemented this capability under the moniker of Tight Loops.

Implementing this feature turned out to be as easy as coming up with its name. Here’s what I did:

/*-- v_tight_loop declared in package body --*/
/*---------------------------------------------------*/
PROCEDURE set_tight_loop( p_tight_loop VARCHAR2 ) IS
/*---------------------------------------------------*/
BEGIN
  v_tight_loop := p_tight_loop;
END set_tight_loop;
Listing 4. Advisement of tight loop processing
-- if attempting to fix same entry again (tight
-- loop) dont bother fetching
IF v_tight_loop = 'N' THEN
  FETCH curs_get_info INTO v_fixer_info_rec;
  v_tight_attempts := 0;
END IF;
Listing 5. Don’t fetch if in tight loop

Being a very paranoid programmer (as you will see from the code) I also call SET_TIGHT_LOOP passing ‘N’ (No) within each and every possible conclusion in the FIXER package. And that segues nicely into the next topic, possible fix outcomes.

Possible Fix Outcomes

Now that fixes can automatically be applied, it’s important to evaluate their outcomes and there are several possibilities as described below.

Statement No Longer Suspended

It’s entirely possible that a suspended statement can become un-suspended between the time its details are queried from the FIXER_INFO table and the time it is processed. This is easily detected by calling the STATUS_CHECK function I introduced in the second article in this series. I check if the session is still in a suspended state and if not, I set the status to ‘Expired’ as shown in Listing 6.

-- if the entry is not suspended any more then set the
-- entry as expired
IF NOT status_check(v_fixer_info_rec.sid,'SUSPENDED') 
  THEN
  -- set entry as expired
  UPDATE fixer_info
  SET processed = SYSDATE,
      comments = 'Expired'
  WHERE fixer_id = v_fixer_info_rec.fixer_id;`
  -- reset tight loop to ensure fetch of next record
  set_tight_loop('N');
Listing 6. Handling statement no longer suspended

No Fixing Routine

Recall that in a fit of simplicity I implemented this code in the FIXER package to call a procedure based on the error type.

EXECUTE IMMEDIATE 'BEGIN ' ||
REPLACE(v_fixer_info_rec.error_type,' ','_') || ';' ||
                  'END;';
So SPACE QUOTA EXCEEDED errors trade blanks for underscores and call SPACE_QUOTA_EXCEEDED. But what if that procedure doesn’t exist? Should everything grind to a halt? I think not. But it should at least be logged as shown in Listing 7.
BEGIN
…
-- attempt to call stored procedure to fix problem
EXECUTE IMMEDIATE 'BEGIN ' ||
REPLACE(v_fixer_info_rec.error_type,' ','_') || ';' ||
                  'END;';
…
EXCEPTION  -- problem calling stored procedure
  WHEN OTHERS THEN
  -- if the error stack contains PLS-00201 that means
  -- the procedure we tried to call does not exist so
  -- just update the entry
  IF INSTR(DBMS_UTILITY.FORMAT_ERROR_STACK,
           'PLS-00201') > 0 THEN
    UPDATE fixer_info
    SET processed = SYSDATE,
        comments = 'No Fix Routine Available'
    WHERE fixer_id = v_fixer_info_rec.fixer_id;
Listing 7. Handling lack of fixing procedure

The exception handler simply examines the error stack for PLS-00201 signifying a non-existent object was called and responds accordingly. Listing 8 shows a simple example of the PLS-00201 error.

SQL> EXEC i_dont_exist;
BEGIN i_dont_exist; END;

      *
ERROR at line 1:
ORA-06550: line 1, column 7:
PLS-00201: identifier 'I_DONT_EXIST' must be declared
ORA-06550: line 1, column 7:
PL/SQL: Statement ignored
Listing 8. Demonstration of PLS-00201

No Fix Information Exists

What if the called procedure exists, but there’s no applicable fix information defined. For example, if no entry exists in the SPACE_QUOTA_EXCEEDED_INFO table for the particular user and tablespace. I handled this by having SQE notify FIXER if it did not find a suitable entry (Listing 9).

/*-- v_info_exists declared as boolean in
     fixer package body --*/
/*------------------------------------------*/
PROCEDURE info_exists IS
/*------------------------------------------*/
BEGIN
  v_info_exists := TRUE;
END info_exists;
Listing 9. Declare that fix info exists

The FIXER package checks the flag (Listing 10) straight away after calling the fixing procedure.

-- if no fix info existed...
IF NOT v_info_exists THEN

  UPDATE fixer_info
  SET processed = SYSDATE,
      comments = 'Specific Fix Info Not Available'
  WHERE fixer_id = v_fixer_info_rec.fixer_id;
Listing 10. Checking fix info flag

Problem Fixed

Once the fixing procedure is called and a few second sleep has occurred (see second article) I check if the statement is still suspended and if not then I make the bold assumption that I fixed it as revealed in Listing 11.

-- if entry no longer exists as suspended then assume
-- we fixed it so set it as such...
IF status_check(v_fixer_info_rec.sid,'NORMAL') THEN
  UPDATE fixer_info
  SET processed = SYSDATE,
      comments = 'Fixed'
  WHERE fixer_id = v_fixer_info_rec.fixer_id;
Listing 11. Problem fixed!

Problem Not Fixed

Not fixing the problem is nothing to get upset about unless the maximum attempts allowed has been exceeded as shown in Listing 12.

-- if we have exceeded the maximum number of
-- attempts allowed (tight  or not then set entry
-- as such
IF NVL(v_max_attempts,0) = 
     NVL(v_fixer_info_rec.num_attempts,0) + 1 OR
   NVL(v_max_attempts,0) = 
     NVL(v_tight_attempts,0) + 1 THEN

  -- set attempts, processed date and comment
  UPDATE fixer_info
  SET num_attempts = NVL(num_attempts,0) + 1,
      processed = SYSDATE,
      comments = 'Max attempts exceeded'
  WHERE fixer_id = v_fixer_info_rec.fixer_id;

ELSE  -- have not exceeded max attempts

 -- update attempts, processed date and comment
 UPDATE fixer_info
 SET num_attempts = NVL(num_attempts,0) + 1
 WHERE fixer_id = v_fixer_info_rec.fixer_id;

END IF;  -- have not exceeded maximum attempts
Listing 12. Check maximum attempts exceeded

Session Went Away

Before I discuss this one I have to admit this scenario would not have crossed my mind except that it occurred during testing for this article. This proves that one of the best ways to test application code is by attempting to write an article about it. That’s something I recommend every one try - but I am getting off on a tangent. Back to the code!

This circumstance also exposed a weakness in the STATUS_CHECK function. It only returned one of two values; TRUE if the session existed with the status or FALSE if it did not. But what if the session wasn’t there at all regardless of status? Thus I made it more powerful as shown in Listing 13 to return one of three package constants.

/*-----------------------------------------------------------------*/
FUNCTION status_check ( p_sid    NUMBER,
                        p_status VARCHAR2 ) RETURN NUMBER IS
/*-----------------------------------------------------------------*/

  -- query specific SID and status
  CURSOR curs_check ( cp_sid    NUMBER,
                      cp_status VARCHAR2 ) IS
  SELECT status
    FROM dba_resumable
   WHERE session_id = cp_sid;

  v_status  VARCHAR2(30);
  v_ret_val NUMBER;

BEGIN

  -- try to query entry from DBA_RESUMABLE
  OPEN curs_check(p_sid,p_status);
  FETCH curs_check INTO v_status;

  -- if no entry then session went away
  IF curs_check%NOTFOUND THEN
    v_ret_val := c_no_entry;
  -- if status matches
  ELSIF v_status = p_status THEN
       v_ret_val := c_status_match;
  -- status does not match
  ELSE
    v_ret_val := c_no_status_match;
  END IF;

  CLOSE curs_check;

  RETURN(v_ret_val);

END status_check;
Listing 13. Revamped STATUS_CHECK function

Now when sessions go away FIXER labels them as shown in Listing 14.

-- if session went away...
IF v_status = c_no_entry THEN
  -- update attempts, processed date and comment
  UPDATE fixer_info
  SET processed = SYSDATE,
      comments = 'Session Terminated'
  WHERE fixer_id = v_fixer_info_rec.fixer_id;
Listing 14. Handling sessions that go away

Timeout or Abort

In part II of this series I discussed how applications might handle suspended statements that timeout or abort. Specifically how they could adjust to having specific error numbers and messages replaced with far more generic substitutes provided by resumable statements:

ORA-30032: the suspended (resumable) statement has
           timed out
ORA-01013: user requested cancel of current operation

Since I discussed handling both messages I guess I have to build an option into THY to throw either one. I’ll start by adding a field called TIMEOUT_OR_ABORT to the SPACE_QUOTA_EXCEEDED_INFO table. Then I’ll add the code from Listing 15 to the FIXER package so it can be called from SQE.

-- fail with timeout or abort
v_timeout_or_abort VARCHAR2(1);
/*----------------------------------*/
PROCEDURE set_to_timeout IS
/*----------------------------------*/
BEGIN
  v_timeout_or_abort := 'T';
END set_to_timeout;

/*----------------------------------*/
PROCEDURE set_to_abort IS
/*----------------------------------*/
BEGIN
  v_timeout_or_abort := 'A';
END set_to_abort;
Listing 15. Maintaining timeout/abort flag

Then when it comes time to terminate a statement I make this choice.

-- abort or timeout...what to do?
IF v_timeout_or_abort = 'A' THEN
  DBMS_RESUMABLE.ABORT(v_fixer_info_rec.sid);
ELSE
   DBMS_RESUMABLE.SET_SESSION_TIMEOUT(
                     v_fixer_info_rec.sid,1);
END IF; 

Coming Up Next Time

The first three articles in this series, of which this one is the third, has focused on a solid foundation with an eye towards making it easy to add extra pieces. The core functionality is in place in the form of triggers and the FIXER package and the first extra piece, SQE, had been added. The next article will tackle the next error type, MAXIMUM EXTENTS REACHED (codenamed MER) that will be tougher than SQE because of the number of possible object types. See you then.
 


Darryl Hurley has been working with Oracle technology for 15 years with a significant focus on Database Administration and PL/SQL Development. His days are spent as the Database Manager for MDSI Mobile Data Solutions Inc and his spare time is spent writing articles and teaching under the moniker of ImpleStrat Solutions . He has written several articles for the Oracle Development Tools User Group and contributed to several Oracle books from O’Reilly Publishers.

Copyright (c) Pinnacle Publishing, a division of Lawrence Ragan Communications, Inc. All rights reserved. This article was originally published in the August, 2003 issue of Oracle Professional.