Reprinted with Permission by Quest Software Oct. 2007


Automated Refactoring Arrives for PL/SQL Developers!
by Steven Feuerstein

Here I sit in the Praha (Prague) airport waiting for a flight back to Chicago. Over the last ten days, I have been on a whirlwind round of trainings and seminars in Prague (Czech Republic), Zagreb (Croatia), Ljubljana (Slovenia) and Bratislava (Slovakia).  Very nice countries, wonderful people.

And some really sharp technologists, too. I was inspired by their questions and ideas to finally build a tool to automatically refactor PL/SQL programs to improve modularity, reuse, and readability. As far as I know, what you are about to read and the code you are about to get your hands on, offer the first automated refactoring capability in the world of PL/SQL.

What is refactoring? It is a process formalized and coined by Martin Fowler; so let's use a quote from his book of the same name:

"Refactoring is the process of changing a software system in such a way that it does not alter the external behavior of the code yet improves its internal structure.

"It is a disciplined way to clean up code that minimizes the chances of introducing bugs. In essence when you refactor you are improving the design of the code after it has been written."

You can read more about refactoring on Wikipedia.

So you are now asking yourself: How did this very exciting development (my implementation of automated refactoring) come about? Well, in my Best of PL/SQL course in Zagreb, I presented one of my four "top tips" for PL/SQL:

Write tiny chunks of code.

Which in essence means: instead of writing big blobs of executable sections consisting of hundreds or thousands of lines of code, you should instead break up those sections into lots of small, local or nested modules. This is a classic refactoring to improve readability and maintainability of code.

So I showed the class how I hide the details of my logic behind program calls and then create (initially) empty "stubs" for those programs. This way, I can compile the code, validate the logic at this high level and then zoom down into the lower level details. Here is an example:

CREATE OR REPLACE PACKAGE BODY call_support_pkg

IS

   PROCEDURE distribute_calls (

      department_id_in IN employees.department_id%TYPE

   )

   IS

      l_case_id   cases.ID%TYPE;

 

      CURSOR support_dept_cur (

         department_id_in IN employees.department_id%TYPE

      )

      IS

         SELECT *

           FROM employees

          WHERE department_id = department_id_in;

 

      FUNCTION calls_still_unhandled

         RETURN BOOLEAN

      IS

      BEGIN

         NULL;

      END calls_still_unhandled;

 

      PROCEDURE assign_next_open_call_to (

         employee_id_in IN employees.employee_id%TYPE

       , case_id_out OUT cases.ID%TYPE

      )

      IS

      BEGIN

         NULL;

      END assign_next_open_call_to;

 

      PROCEDURE notify_customer (case_id_in IN cases.ID%TYPE)

      IS

      BEGIN

         NULL;

      END notify_customer;

   BEGIN

      WHILE (calls_still_unhandled ())

      LOOP

         FOR emp_rec IN support_dept_cur (department_id_in)

         LOOP

            IF call_analysis.current_caseload (emp_rec.employee_id) <

               call_analysis.avg_caseload_for_dept (department_id_in)

            THEN

               assign_next_open_call_to (emp_rec.employee_id

                                       , l_case_id);

               notify_customer (l_case_id);

            END IF;

         END LOOP;

      END LOOP;

   END distribute_calls;

END call_support_pkg;

/

Notice that the calls_still_unhandled, assign_next_open_call_to and notify_customer programs are defined within their "parent" program and their bodies are NULL. I have not yet fully implemented this program, but at least at the top level – the main executable section, I have completely expressed the logic of my program.

Well, one of the students – a young fellow who had just started working with Oracle over the last year – suggested that instead of putting NULL; in the stub programs, I should raise an exception or display a message to remind myself that I have not yet implemented the program. Otherwise, I might forget and leave it NULL.

Very good idea. And that got me thinking (watch out world!): why not create a reusable program specifically designed to remind a developer that the program is "to be completed"?

So I quickly threw together this package to facilitate top-down design (that's another way of describing the level-by-level construction of code I briefly reviewed above) in this way:

PACKAGE BODY topdown

IS

   PROCEDURE tbc (program_name_in IN VARCHAR2)

   IS

   BEGIN

         DBMS_OUTPUT.put_line (

            'TO BE COMPLETED: "' || program_name_in || '"');

         DBMS_OUTPUT.put_line ('This program was called as follows:');

         DBMS_OUTPUT.put_line (DBMS_UTILITY.format_call_stack);

         RAISE_APPLICATION_ERROR

              (-20999

             ,    'Program named "'

               || program_name_in

               || '" has not yet been implemented.'

               || ' Enable SERVEROUTPUT to see callstack.'

              );

   END tbc;

END topdown;

/

With this program defined, I can change my stubs to look like this:

PROCEDURE notify_customer (case_id_in IN cases.ID%TYPE)

IS

BEGIN

  topdown.tbc ('notify_customer');

END notify_customer;

And if I try to run my program I see this:

That's kind of hard to ignore, so it ensures that I will complete this program. Like I said: a really nice idea. But that got me to thinking even more about this situation.

See, what I would really like is to not have to actually build each of those stub local modules. Instead, I would like my editor to do it for me automatically. Unfortunately, PL/SQL IDEs do not yet offer automated refactoring. Perhaps, I thought to myself as I was flying from Zagreb to Bratislava in an Aérospatiale prop-plane, I could come up with a way to automated lots of the steps needed for this refactoring.

And that is what I have done.

Using my TopDown package (greatly enhanced beyond the version you saw above), you can spend a small amount of time placing "indicators" in your code, essentially instructions on what and how you want local modules created. You then compile this sort-of-template into the database, call TopDown.Refactor for that program unit, and voila! local modules are created as you requested.

You can then repeat that process for each level down through the complexities of your program, very quickly defining a highly modular architecture that you and others will appreciate for years to come.

TopDown refactoring: a simple example

Let's start with a simple example. I want to build a procedure that uses dynamic SQL to provide "SELECT * FROM table" functionality from within PL/SQL; that is, display the contents of the specified table. Since I don't know how many columns that table may have in advance, I will have to write Method 4 dynamic SQL using DBMS_SQL. it is going to be a long, complicated program – and it could easily get very messy.

To avoid that messiness, I will use the top down approach. Based on previous work with DBMS_SQL, I expect that I will want to implement my program with the following steps:

  1. Load the column information for that table from the data dictionary (ALL_TAB_COLUMNS).
  2. Construct the query fro mall its dynamic elements, and then parse it.
  3. Define each of the columns and then execute the query.
  4. Build a line of output for each fetched row and display it.

So, with the TopDown API available for use, I write the following procedure:

 1  CREATE OR REPLACE PROCEDURE intab (

 2     owner_in IN VARCHAR2, table_in IN VARCHAR2)

 3  IS

 4  --topdown.ish

 5  BEGIN

 6     TopDown.PPH ('load_column_information');

 7     TopDown.PPH ('construct_and_parse_query');

 8     TopDown.PPH ('define_columns_and_execute');

 9     TopDown.PPH ('build_and_display_output');

10     TopDown.PPH ('cleanup');

11  END intab;

Here is an explanation of my use of the TopDown package:

Line(s)

Significance

4

"topdown.ish" says "This is where I want to insert the stub programs." In other words, Insert Stubs Here. You might say: "Why do I have to do that? The stubs have to go at the bottom of the declaration section, so you can figure that out for yourself." Perhaps, but to do that, I would have to parse the PL/SQL code (since you can have nested blocks, simply finding the "last" BEGIN in the code will not be good enough).

So we compromise: I don't have to write lots of complicated parsing logic and you just have to add a little comment.

5 - 10

For each of the steps in my dynamic SQL processing, I want to create a local module to hold the details required to implement that step. So I call Topdown.PPH and pass it the name of the program. "Pph" stands for "Put the Procedure Here". In other words, I want to create a local module with this name, and I want to replace each TopDown.PPH call with the actual call to a program of that name.

I then compile this program into the database. It compiles just fine, since the PPH procedure is, indeed, implemented in the TopDown package; though, interesting it is itself a "dummy" program with a NULL implementation:

PROCEDURE pph (info_in IN VARCHAR2)

IS

BEGIN

   NULL;

END pph;

It doesn't really do anything but give information to the TopDown.Refactor program. So, now it is, in fact, time to do my refactoring.

I run this block of code:

BEGIN

   topdown.refactor (USER, 'INTAB');

END;
/

and then I visit the Schema Browser in Toad, find the INTAB procedure, load the source inside an edit window, reformat my code (it is definitely not pretty when TopDown.Refactor is done with it), and I see this:

CREATE OR REPLACE PROCEDURE intab (

   owner_in IN VARCHAR2, table_in IN VARCHAR2)

IS

   PROCEDURE load_column_information

   IS

   BEGIN

      topdown.tbc ('load_column_information');

   END load_column_information;

 

   PROCEDURE construct_and_parse_query

   IS

   BEGIN

      topdown.tbc ('construct_and_parse_query');

   END construct_and_parse_query;

 

   PROCEDURE define_columns_and_execute

   IS

   BEGIN

      topdown.tbc ('define_columns_and_execute');

   END define_columns_and_execute;

 

   PROCEDURE build_and_display_output

   IS

   BEGIN

      topdown.tbc ('build_and_display_output');

   END build_and_display_output;

 

   PROCEDURE cleanup

   IS

   BEGIN

      topdown.tbc ('cleanup');

   END cleanup;

BEGIN

   load_column_information ();

   construct_and_parse_query ();

   define_columns_and_execute ();

   build_and_display_output ();

   cleanup ();

END intab;

And there you have it: as far as I know, this is the first automated refactoring tool for PL/SQL programs!

What else you can do with TopDown

You have seen one very simple example of applying TopDown.Refactor. You can ask the program to generate much more useful and interesting code than that. You can geneate a function call, rather than a procedure. You can specify the parameter list of the to-be-created local module. You can also tell TopDown what value or expression you want passed to the program after it has been refactored. To see some of these features in action, let's look at another example.

Suppose I need to build a procedure to distribute open or unhandled support calls out to members of the support organization. Here is the basic algorithm:

While there are still unhandled calls in the queue, assign  them to employees who are under-utilized (have fewer calls assigned to them then the average for their department), and notify the customer that the status of their case has changed.

Consequently, the first iteration of my program might look like this:

CREATE OR REPLACE PACKAGE BODY call_support_pkg

IS

   PROCEDURE distribute_calls (

      department_id_in IN employees.department_id%TYPE

   )

   IS

      l_case_id   cases.ID%TYPE;

 

      CURSOR support_dept_cur (

         department_id_in IN employees.department_id%TYPE

      )

      IS

         SELECT *

           FROM employees

          WHERE department_id = department_id_in;

   BEGIN

      WHILE (calls_still_unhandled ())

      LOOP

         FOR emp_rec IN support_dept_cur (department_id_in)

         LOOP

            IF current_caseload (emp_rec.employee_id) <

               avg_caseload_for_dept (department_id_in)

            THEN

               assign_next_open_call_to (

                         emp_rec.employee_id, l_case_id);

               notify_customer (l_case_id);

            END IF;

         END LOOP;

      END LOOP;

   END distribute_calls;

END call_support_pkg;

And then I have to create the local modules, create the parameter lists, etc. Instead of taking that old, slow approach, I will use the brand-new, accelerated TopDown approach (TopDown calls highlighted in bold and green):

CREATE OR REPLACE PACKAGE BODY call_support_pkg

IS

   PROCEDURE distribute_calls (

      department_id_in IN employees.department_id%TYPE

   )

   IS

      l_case_id   cases.ID%TYPE;

 

      CURSOR support_dept_cur (

         department_id_in IN employees.department_id%TYPE

      )

      IS

         SELECT *

           FROM employees

          WHERE department_id = department_id_in;

   --topdown.ish

   BEGIN

      WHILE (topdown.pfh ('calls_still_unhandled|BOOLEAN'))

      LOOP

         FOR emp_rec IN support_dept_cur (department_id_in)

         LOOP

            IF topdown.pfh ('current_caseload|PLS_INTEGER|employee_id_in:employees.employee_id%TYPE:emp_rec.employee_id')  <

               topdown.pfh ('avg_caseload_for_dept|PLS_INTEGER|department_id_in:departments.department_id%TYPE:emp_rec.department_id')

            THEN

               topdown.pph ('assign_next_open_call_to|employee_id_in:employees.employee_id%TYPE:emp_rec.employee_id');

               topdown.pph ('notify_customer|case_id_in:cases.id%type:l_case_id');

            END IF;

         END LOOP;

      END LOOP;

   END distribute_calls;

END call_support_pkg;

Now, I am the first to admit that this code is really hard to read. So if you like (if, specifically, you want or need to specify multiple arguments and that single line just gets unwieldy), you can use this alternate format:

IF

/*PFH

current_caseload

PLS_INTEGER

employee_id_in:employees.employee_id%TYPE:emp_rec.employee_id

*/

/*PFH

avg_caseload_for_dept

PLS_INTEGER

department_id_in:departments.department_id%TYPE:emp_rec.department_id

*/

THEN

   /*PPH

   assign_next_open_call_to

   employee_id_in:employees.employee_id%TYPE:emp_rec.employee_id

   case_id_out:cases.id%type:l_case_id

   */

   topdown.pph ('notify_customer|case_id_in:cases.id%type:l_case_id');

END IF;

In other words, use a multi-line comment block to describe the program.

And after running it through the TopDown.Refactor engine:

BEGIN

   topdown.refactor (USER, 'CALL_SUPPORT_PKG');

END;

/

and reformat the code inside Toad, I have the following program definition:

CREATE OR REPLACE PACKAGE BODY call_support_pkg

IS

   PROCEDURE distribute_calls (

      department_id_in IN employees.department_id%TYPE

   )

   IS

      l_case_id   cases.ID%TYPE;

 

      CURSOR support_dept_cur (

         department_id_in IN employees.department_id%TYPE

      )

      IS

         SELECT *

           FROM employees

          WHERE department_id = department_id_in;

 

      FUNCTION calls_still_unhandled

         RETURN BOOLEAN

      IS

      BEGIN

         topdown.tbc ('calls_still_unhandled');

         RETURN NULL;

      END calls_still_unhandled;

 

      FUNCTION current_caseload (

         employee_id_in IN employees.employee_id%TYPE)

         RETURN PLS_INTEGER

      IS

      BEGIN

         topdown.tbc ('current_caseload');

         RETURN NULL;

      END current_caseload;

 

      FUNCTION avg_caseload_for_dept (

         department_id_in IN departments.department_id%TYPE

      )

         RETURN PLS_INTEGER

      IS

      BEGIN

         topdown.tbc ('avg_caseload_for_dept');

         RETURN NULL;

      END avg_caseload_for_dept;

 

      PROCEDURE assign_next_open_call_to (

         employee_id_in IN employees.employee_id%TYPE

       , case_id_out IN cases.ID%TYPE

      )

      IS

      BEGIN

         topdown.tbc ('assign_next_open_call_to');

      END assign_next_open_call_to;

 

      PROCEDURE notify_customer (case_id_in IN cases.ID%TYPE)

      IS

      BEGIN

         topdown.tbc ('notify_customer');

      END notify_customer;

   BEGIN

      WHILE (calls_still_unhandled ())

      LOOP

         FOR emp_rec IN support_dept_cur (department_id_in)

         LOOP

            IF current_caseload (employee_id_in => emp_rec.employee_id)

                  <

               avg_caseload_for_dept

                  (department_id_in => emp_rec.department_id)

            THEN

               assign_next_open_call_to

                  (employee_id_in      => emp_rec.employee_id

                  , case_id_out        => l_case_id

               );

               notify_customer (case_id_in => l_case_id);

            END IF;

         END LOOP;

      END LOOP;

   END distribute_calls;

END call_support_pkg;

/

Notice all the lovely named notation – I almost never take the time to use named notation – it simply takes too long. Now I get it for free.

A guide to source and demonstration files

Check out the following files (they assume the existence of the typical HR employees and departments tables). By running the demo scripts, you can see the transformations to the programs.

File name

Significance

topdown_basic.pkg

The first version of my TopDown package; it simply includes a call to "tbc" (To Be Completed) and can be used in conjuction with topdown_example.sql (see below).

topdown.pks

TopDown package specification, supporting all the features described in this document.

topdown.pkb

TopDown package body, supporting all the features described in this document.

topdown_install.sql

Single script that installs the package specification and body of TopDown.

 

 

topdown_example.sql

Provides an example of how you can apply top-down design to iteratively implement increasingly detailed levels of logic in your program. Requires only the topdown_basic.pkg to be installed.

topdown_demo1.sql

Demonstrates the use of the full TopDown package to describe the local modules needed. Run this first.

topdown_demo2.sql

Shows how the call_support_pkg looks after the first pass of TopDown.Refactor. And then I drill down one more level, add another set of calls to TopDown, and refactor again. Then I show you that refactored code.

topdown_demo3.sql

Demonstrates use to TopDown for the intab procedure. Clear and simple example of how you can use TopDown to jump-start your development and stay at a very high level of specification of logic.

topdown_demo4.sql

Demonstrates how to use the alternate format (comment block) to specify the creation of the local module.

Give it a try!

Of course, it would be much better if this sort of automated refactoring were built right into PL/SQL IDEs. I am confident, in fact, that in the relatively near future, Toad will offer this and much more.

In the meantime, however, I hope that you will give this utility a try. It is a bit awkward, but once you get the hang of it, it could save you a bunch of time – and leave you with much higher quality code!

You can download the code from here.

If you have any problems, suggestions or questions, don't hesitate to get in touch.


Steven Feuerstein (steven@stevenfeuerstein.com) is considered one of the world's leading experts on the Oracle PL/SQL language, having written ten books on the subject, including Oracle PL/SQL Programming and Oracle PL/SQL Best Practices (O'Reilly Media). Steven serves as a senior technology advisor for Quest Software and is currently building a unit testing tool for PL/SQL programs (www.unit-test.com).