Designing Start & Stop Dates for Performance & Simplicity - By Joe Geller

© 2000 RevealNet, Inc.

Many kinds of data are in effect for a time period. When modified, a new row is stored with the new information. For good performance and ease of query, it is important to design the table with the correct choice of effective date, or a start and stop date. Index design is also critical to performance.

The first design that comes to mind is to use an effective date - a row of data is in effect from the effective date until the next date of another row in the table.

The alternative is to use a start and stop date in each row. A row is in effect between these dates.
Neither choice is always best. It depends on many factors including the number of rows (for a given key), how frequently the data changes, is the latest row the one that is usually wanted, and are there time periods when no data is in effect?
More often than not, having both a start & stop date can result in a simpler query, and more efficient access, at the cost of slightly more code for the table maintanence functions. Adding a new row requires updating the previous row as well as inserting the new one. With only one date, only the new row must be inserted. However, in both cases, previous rows must be checked to ensure there is no overlap of dates, or other incorrect action.
For retrieval, what is most often needed is either the current row or a row in effect on a specific date, so I will concentrate on these situations.

Effective Date Retrieval

To get the most recent effective date in a single statement (i.e., end-user query) you need a subselect to ensure it is the highest date:

SELECT data FROM Table T1 WHERE key=value AND EFFDT =
(SELECT MAX(EFFDT) FROM Table T2 WHERE T1.key=T2.key)

In a program, the simpler and more efficient method is to have the index on the key plus EFFDT DESC, and to use a cursor without the subselect, using just one FETCH to obtain the most recent row. This way the DBMS has to only look at one row to return your desired result.

Start and Stop Date Retrieval

To get the row in effect on a specific date you just need

SELECT data FROM Table T1 WHERE key=value AND
START <= date_wanted and STOP >= date_wanted

This simpler statement can be used by an end-user or in a program.

Index design for Start and Stop Date

Index design must be thought out carefully to ensure good performance. If there are only a few rows for each key, then this is not a significant issue, but if there are many rows per key, then it is very significant (this applies to many other situations in which part of the key is a range other than dates).
Assuming that the most recent row is needed more often, the DBA naturally decides to make the index descending on the start date. However, this can lead to poor performance. A much better choice is to have the index descending on the stop date. Logically the two are equivalent (if we do not allow overlapping date ranges). But, let's think about how the DBMS will access the data.

If the index is descending on STOPDATE, the predicate STOPDATE > date_wanted will enable the dbms to go directly to the most recent row meeting this criteria. If there is a row in effect on that date, the start date will also meet the predicate STARTDATE < date_wanted. The next row in the index will fail the predicate & we are done. Quick and easy.
If the index is descending on STARTDATE, the first predicate STARTDATE < date_wanted will be met by every row for the given key (i.e., all rows started before today's date). The DBMS must retrieve and examine every row for the key to see if the stop date is > date_wanted. Only the first one (the most recent) will meet it, but the DBMS doesn't know this. It must examine them all.

So, if you usually want the most recent row, use STOPDATE DESC in the index. If you usually want the earliest row, use STARTDATE ASC in the index.