|
|
Here I sit in the Praha (Prague) airport waiting for a flight back to Chicago. Over the last ten days, I have been on a whirlwind round of trainings and seminars in Prague (Czech Republic), Zagreb (Croatia), Ljubljana (Slovenia) and Bratislava (Slovakia). Very nice countries, wonderful people.
And some really sharp technologists, too. I was inspired by their questions and ideas to finally build a tool to automatically refactor PL/SQL programs to improve modularity, reuse, and readability. As far as I know, what you are about to read and the code you are about to get your hands on, offer the first automated refactoring capability in the world of PL/SQL.
What is refactoring? It is a process formalized and coined by Martin Fowler; so let's use a quote from his book of the same name:
"Refactoring is the process of changing a software system in such a way that it does not alter the external behavior of the code yet improves its internal structure.
"It is a disciplined way to clean up code that minimizes the chances of introducing bugs. In essence when you refactor you are improving the design of the code after it has been written."
You can read more about refactoring on Wikipedia.
So you are now asking yourself: How did this very exciting development (my implementation of automated refactoring) come about? Well, in my Best of PL/SQL course in Zagreb, I presented one of my four "top tips" for PL/SQL:
Write tiny chunks of code.
Which in essence means: instead of writing big blobs of executable sections consisting of hundreds or thousands of lines of code, you should instead break up those sections into lots of small, local or nested modules. This is a classic refactoring to improve readability and maintainability of code.
So I showed the class how I hide the details of my logic behind program calls and then create (initially) empty "stubs" for those programs. This way, I can compile the code, validate the logic at this high level and then zoom down into the lower level details. Here is an example:
CREATE OR REPLACE PACKAGE BODY call_support_pkg
IS
PROCEDURE distribute_calls (
department_id_in IN employees.department_id%TYPE
)
IS
l_case_id cases.ID%TYPE;
CURSOR support_dept_cur (
department_id_in IN employees.department_id%TYPE
)
IS
SELECT *
FROM employees
WHERE department_id = department_id_in;
FUNCTION calls_still_unhandled
RETURN BOOLEAN
IS
BEGIN
NULL;
END calls_still_unhandled;
PROCEDURE assign_next_open_call_to (
employee_id_in IN employees.employee_id%TYPE
, case_id_out OUT cases.ID%TYPE
)
IS
BEGIN
NULL;
END assign_next_open_call_to;
PROCEDURE notify_customer (case_id_in IN cases.ID%TYPE)
IS
BEGIN
NULL;
END notify_customer;
BEGIN
WHILE (calls_still_unhandled ())
LOOP
FOR emp_rec IN support_dept_cur (department_id_in)
LOOP
IF call_analysis.current_caseload (emp_rec.employee_id) <
call_analysis.avg_caseload_for_dept (department_id_in)
THEN
assign_next_open_call_to (emp_rec.employee_id
, l_case_id);
notify_customer (l_case_id);
END IF;
END LOOP;
END LOOP;
END distribute_calls;
END call_support_pkg;
/
Notice that the calls_still_unhandled, assign_next_open_call_to and notify_customer programs are defined within their "parent" program and their bodies are NULL. I have not yet fully implemented this program, but at least at the top level – the main executable section, I have completely expressed the logic of my program.
Well, one of the students – a young fellow who had just started working with Oracle over the last year – suggested that instead of putting NULL; in the stub programs, I should raise an exception or display a message to remind myself that I have not yet implemented the program. Otherwise, I might forget and leave it NULL.
Very good idea. And that got me thinking (watch out world!): why not create a reusable program specifically designed to remind a developer that the program is "to be completed"?
So I quickly threw together this package to facilitate top-down design (that's another way of describing the level-by-level construction of code I briefly reviewed above) in this way:
PACKAGE BODY topdown
IS
PROCEDURE tbc (program_name_in IN VARCHAR2)
IS
BEGIN
DBMS_OUTPUT.put_line (
'TO BE COMPLETED: "' || program_name_in || '"');
DBMS_OUTPUT.put_line ('This program was called as follows:');
DBMS_OUTPUT.put_line (DBMS_UTILITY.format_call_stack);
RAISE_APPLICATION_ERROR
(-20999
, 'Program named "'
|| program_name_in
|| '" has not yet been implemented.'
|| ' Enable SERVEROUTPUT to see callstack.'
);
END tbc;
END topdown;
/
With this program defined, I can change my stubs to look like this:
PROCEDURE notify_customer (case_id_in IN cases.ID%TYPE)
IS
BEGIN
topdown.tbc ('notify_customer');
END notify_customer;
And if I try to run my program I see this:
That's kind of hard to ignore, so it ensures that I will complete this program. Like I said: a really nice idea. But that got me to thinking even more about this situation.
See, what I would really like is to not have to actually build each of those stub local modules. Instead, I would like my editor to do it for me automatically. Unfortunately, PL/SQL IDEs do not yet offer automated refactoring. Perhaps, I thought to myself as I was flying from Zagreb to Bratislava in an Aérospatiale prop-plane, I could come up with a way to automated lots of the steps needed for this refactoring.
And that is what I have done.
Using my TopDown package (greatly enhanced beyond the version you saw above), you can spend a small amount of time placing "indicators" in your code, essentially instructions on what and how you want local modules created. You then compile this sort-of-template into the database, call TopDown.Refactor for that program unit, and voila! local modules are created as you requested.
You can then repeat that process for each level down through the complexities of your program, very quickly defining a highly modular architecture that you and others will appreciate for years to come.
Let's start with a simple example. I want to build a procedure that uses dynamic SQL to provide "SELECT * FROM table" functionality from within PL/SQL; that is, display the contents of the specified table. Since I don't know how many columns that table may have in advance, I will have to write Method 4 dynamic SQL using DBMS_SQL. it is going to be a long, complicated program – and it could easily get very messy.
To avoid that messiness, I will use the top down approach. Based on previous work with DBMS_SQL, I expect that I will want to implement my program with the following steps:
So, with the TopDown API available for use, I write the following procedure:
1 CREATE OR REPLACE PROCEDURE intab (
2 owner_in IN VARCHAR2, table_in IN VARCHAR2)
3 IS
4 --topdown.ish
5 BEGIN
6 TopDown.PPH ('load_column_information');
7 TopDown.PPH ('construct_and_parse_query');
8 TopDown.PPH ('define_columns_and_execute');
9 TopDown.PPH ('build_and_display_output');
10 TopDown.PPH ('cleanup');
11 END intab;
Here is an explanation of my use of the TopDown package:
|
Line(s) |
Significance |
|
4 |
"topdown.ish" says "This is where I want to insert the stub programs." In other words, Insert Stubs Here. You might say: "Why do I have to do that? The stubs have to go at the bottom of the declaration section, so you can figure that out for yourself." Perhaps, but to do that, I would have to parse the PL/SQL code (since you can have nested blocks, simply finding the "last" BEGIN in the code will not be good enough). So we compromise: I don't have to write lots of complicated parsing logic and you just have to add a little comment. |
|
5 - 10 |
For each of the steps in my dynamic SQL processing, I want to create a local module to hold the details required to implement that step. So I call Topdown.PPH and pass it the name of the program. "Pph" stands for "Put the Procedure Here". In other words, I want to create a local module with this name, and I want to replace each TopDown.PPH call with the actual call to a program of that name. |
I then compile this program into the database. It compiles just fine, since the PPH procedure is, indeed, implemented in the TopDown package; though, interesting it is itself a "dummy" program with a NULL implementation:
PROCEDURE pph (info_in IN VARCHAR2)
IS
BEGIN
NULL;
END pph;
It doesn't really do anything but give information to the TopDown.Refactor program. So, now it is, in fact, time to do my refactoring.
I run this block of code:
BEGIN
topdown.refactor (USER, 'INTAB');
END;
/
and then I visit the Schema Browser in Toad, find the INTAB procedure, load the source inside an edit window, reformat my code (it is definitely not pretty when TopDown.Refactor is done with it), and I see this:
CREATE OR REPLACE PROCEDURE intab (
owner_in IN VARCHAR2, table_in IN VARCHAR2)
IS
PROCEDURE load_column_information
IS
BEGIN
topdown.tbc ('load_column_information');
END load_column_information;
PROCEDURE construct_and_parse_query
IS
BEGIN
topdown.tbc ('construct_and_parse_query');
END construct_and_parse_query;
PROCEDURE define_columns_and_execute
IS
BEGIN
topdown.tbc ('define_columns_and_execute');
END define_columns_and_execute;
PROCEDURE build_and_display_output
IS
BEGIN
topdown.tbc ('build_and_display_output');
END build_and_display_output;
PROCEDURE cleanup
IS
BEGIN
topdown.tbc ('cleanup');
END cleanup;
BEGIN
load_column_information ();
construct_and_parse_query ();
define_columns_and_execute ();
build_and_display_output ();
cleanup ();
END intab;
And there you have it: as far as I know, this is the first automated refactoring tool for PL/SQL programs!
You have seen one very simple example of applying TopDown.Refactor. You can ask the program to generate much more useful and interesting code than that. You can geneate a function call, rather than a procedure. You can specify the parameter list of the to-be-created local module. You can also tell TopDown what value or expression you want passed to the program after it has been refactored. To see some of these features in action, let's look at another example.
Suppose I need to build a procedure to distribute open or unhandled support calls out to members of the support organization. Here is the basic algorithm:
While there are still unhandled calls in the queue, assign them to employees who are under-utilized (have fewer calls assigned to them then the average for their department), and notify the customer that the status of their case has changed.
Consequently, the first iteration of my program might look like this:
CREATE OR REPLACE PACKAGE BODY call_support_pkg
IS
PROCEDURE distribute_calls (
department_id_in IN employees.department_id%TYPE
)
IS
l_case_id cases.ID%TYPE;
CURSOR support_dept_cur (
department_id_in IN employees.department_id%TYPE
)
IS
SELECT *
FROM employees
WHERE department_id = department_id_in;
BEGIN
WHILE (calls_still_unhandled ())
LOOP
FOR emp_rec IN support_dept_cur (department_id_in)
LOOP
IF current_caseload (emp_rec.employee_id) <
avg_caseload_for_dept (department_id_in)
THEN
assign_next_open_call_to (
emp_rec.employee_id, l_case_id);
notify_customer (l_case_id);
END IF;
END LOOP;
END LOOP;
END distribute_calls;
END call_support_pkg;
And then I have to create the local modules, create the parameter lists, etc. Instead of taking that old, slow approach, I will use the brand-new, accelerated TopDown approach (TopDown calls highlighted in bold and green):
CREATE OR REPLACE PACKAGE BODY call_support_pkg
IS
PROCEDURE distribute_calls (
department_id_in IN employees.department_id%TYPE
)
IS
l_case_id cases.ID%TYPE;
CURSOR support_dept_cur (
department_id_in IN employees.department_id%TYPE
)
IS
SELECT *
FROM employees
WHERE department_id = department_id_in;
--topdown.ish
BEGIN
WHILE (topdown.pfh ('calls_still_unhandled|BOOLEAN'))
LOOP
FOR emp_rec IN support_dept_cur (department_id_in)
LOOP
IF topdown.pfh ('current_caseload|PLS_INTEGER|employee_id_in:employees.employee_id%TYPE:emp_rec.employee_id') <
topdown.pfh ('avg_caseload_for_dept|PLS_INTEGER|department_id_in:departments.department_id%TYPE:emp_rec.department_id')
THEN
topdown.pph ('assign_next_open_call_to|employee_id_in:employees.employee_id%TYPE:emp_rec.employee_id');
topdown.pph ('notify_customer|case_id_in:cases.id%type:l_case_id');
END IF;
END LOOP;
END LOOP;
END distribute_calls;
END call_support_pkg;
Now, I am the first to admit that this code is really hard to read. So if you like (if, specifically, you want or need to specify multiple arguments and that single line just gets unwieldy), you can use this alternate format:
IF
/*PFH
current_caseload
PLS_INTEGER
employee_id_in:employees.employee_id%TYPE:emp_rec.employee_id
*/
<
/*PFH
avg_caseload_for_dept
PLS_INTEGER
department_id_in:departments.department_id%TYPE:emp_rec.department_id
*/
THEN
/*PPH
assign_next_open_call_to
employee_id_in:employees.employee_id%TYPE:emp_rec.employee_id
case_id_out:cases.id%type:l_case_id
*/
topdown.pph ('notify_customer|case_id_in:cases.id%type:l_case_id');
END IF;
In other words, use a multi-line comment block to describe the program.
And after running it through the TopDown.Refactor engine:
BEGIN
topdown.refactor (USER, 'CALL_SUPPORT_PKG');
END;
/
and reformat the code inside Toad, I have the following program definition:
CREATE OR REPLACE PACKAGE BODY call_support_pkg
IS
PROCEDURE distribute_calls (
department_id_in IN employees.department_id%TYPE
)
IS
l_case_id cases.ID%TYPE;
CURSOR support_dept_cur (
department_id_in IN employees.department_id%TYPE
)
IS
SELECT *
FROM employees
WHERE department_id = department_id_in;
FUNCTION calls_still_unhandled
RETURN BOOLEAN
IS
BEGIN
topdown.tbc ('calls_still_unhandled');
RETURN NULL;
END calls_still_unhandled;
FUNCTION current_caseload (
employee_id_in IN employees.employee_id%TYPE)
RETURN PLS_INTEGER
IS
BEGIN
topdown.tbc ('current_caseload');
RETURN NULL;
END current_caseload;
FUNCTION avg_caseload_for_dept (
department_id_in IN departments.department_id%TYPE
)
RETURN PLS_INTEGER
IS
BEGIN
topdown.tbc ('avg_caseload_for_dept');
RETURN NULL;
END avg_caseload_for_dept;
PROCEDURE assign_next_open_call_to (
employee_id_in IN employees.employee_id%TYPE
, case_id_out IN cases.ID%TYPE
)
IS
BEGIN
topdown.tbc ('assign_next_open_call_to');
END assign_next_open_call_to;
PROCEDURE notify_customer (case_id_in IN cases.ID%TYPE)
IS
BEGIN
topdown.tbc ('notify_customer');
END notify_customer;
BEGIN
WHILE (calls_still_unhandled ())
LOOP
FOR emp_rec IN support_dept_cur (department_id_in)
LOOP
IF current_caseload (employee_id_in => emp_rec.employee_id)
<
avg_caseload_for_dept
(department_id_in => emp_rec.department_id)
THEN
assign_next_open_call_to
(employee_id_in => emp_rec.employee_id
, case_id_out => l_case_id
);
notify_customer (case_id_in => l_case_id);
END IF;
END LOOP;
END LOOP;
END distribute_calls;
END call_support_pkg;
/
Notice all the lovely named notation – I almost never take the time to use named notation – it simply takes too long. Now I get it for free.
Check out the following files (they assume the existence of the typical HR employees and departments tables). By running the demo scripts, you can see the transformations to the programs.
|
File name |
Significance |
|
topdown_basic.pkg |
The first version of my TopDown package; it simply includes a call to "tbc" (To Be Completed) and can be used in conjuction with topdown_example.sql (see below). |
|
topdown.pks |
TopDown package specification, supporting all the features described in this document. |
|
topdown.pkb |
TopDown package body, supporting all the features described in this document. |
|
topdown_install.sql |
Single script that installs the package specification and body of TopDown. |
|
|
|
|
topdown_example.sql |
Provides an example of how you can apply top-down design to iteratively implement increasingly detailed levels of logic in your program. Requires only the topdown_basic.pkg to be installed. |
|
topdown_demo1.sql |
Demonstrates the use of the full TopDown package to describe the local modules needed. Run this first. |
|
topdown_demo2.sql |
Shows how the call_support_pkg looks after the first pass of TopDown.Refactor. And then I drill down one more level, add another set of calls to TopDown, and refactor again. Then I show you that refactored code. |
|
topdown_demo3.sql |
Demonstrates use to TopDown for the intab procedure. Clear and simple example of how you can use TopDown to jump-start your development and stay at a very high level of specification of logic. |
|
topdown_demo4.sql |
Demonstrates how to use the alternate format (comment block) to specify the creation of the local module. |
Of course, it would be much better if this sort of automated refactoring were built right into PL/SQL IDEs. I am confident, in fact, that in the relatively near future, Toad will offer this and much more.
In the meantime, however, I hope that you will give this utility a try. It is a bit awkward, but once you get the hang of it, it could save you a bunch of time – and leave you with much higher quality code!
You can download the code from here.
If you have any problems, suggestions or questions, don't hesitate to get in touch.
Steven Feuerstein (steven@stevenfeuerstein.com) is considered one of the world's leading experts on the Oracle PL/SQL language, having written ten books on the subject, including Oracle PL/SQL Programming and Oracle PL/SQL Best Practices (O'Reilly Media). Steven serves as a senior technology advisor for Quest Software and is currently building a unit testing tool for PL/SQL programs (www.unit-test.com).