Reprinted with Permission by Quest Software Aug.  2002

 

Freelist Internals - An Overview
Topic Extracted from Knowledge Xpert for Oracle Administration


A freelist is a list of free blocks associated with a segment, which are eligible for accepting data when a new insert request comes. This normally speeds up the insert process since Oracle does not need to look at the entire block to put that row inside a table. The freelist structure is managed by a chain structure called a linked list. A singly linked list is the data structure used in managing freelists.

In a singly linked list the current element will hold the address of the next element. The last element will hold the null pointer as the next element's address. The header will hold the address of the starting point, which is nothing but the first element. Consider the following example..There are five blocks in the freelist namely A, B, C, D, E.

The structure of the linked list will be like this.

A --> B --> C --> D --> E --|.

The data blocks will hold the following information,

Block    Next
 A           B
 B           C
 C           D
 D           E
 E           NULL

Freelist management is done as Last In First Out (LIFO) method. That means that the last block linked to the freelist will be the first block unlinked.

Each datablock will contain the information about its association with freelist chain in the header part of the block. There will be a flag in ALL data blocks, which indicates the status about that block to the freelist chain. The flag 0 indicates that the block is in the freelists and '-' indicates that the block is not in the freelist chain. (i.e. the block is above PCTUSED). Fnx indicates the next DBA* of the freelists.

Throughout this section the term DBA is used referring to the data block address.

The following is an example of a block dump:

Block header dump: dba: 0x24000923
Object id on Block? Y
seg/obj: 0xc0b csc: 0x2f0.4045f9ea itc: 1 flg: - typ: 1 - 
DATA
    fsl: 0 fnx: 0x0 
==> Seg/obj Object ID in dictionary
==> csc SCN of last block cleanout 
==> itc Number of ITL slots
==> flg O = On freelist , - = Not on freelist
==> typ 1 = DATA 2 = INDEX
==> fsl ITL TX freelist slot
==> fnx DBA of NEXT block on freelist

Freelist Link and Unlink Operation:

There are three types of freelists in a segment, Master Freelists (MFL), Process Freelists (PrFL), and Transaction Freelists. Master freelist (sometimes called a freelist pool) is created automatically when you create any new segment. It is common to all processes accessing that segment. 

All segments have at least one master freelists. Usually the first block in that segment holds the definition of that freelist. The structure of a freelist descriptor takes approximately 20 bytes in the segment header. This imposes the maximum limit of freelists per segment. The following listing gives the maximum number of freelists per segment depending on the block size. This freelist definition slightly differs in cases of rollback segments where only five blocks are linked in the free extent pool*.

Block Size    Max # Freelists 
  -----------          ----------------- 
     2K                     24 
     4K                     50 
     8K                   101 
   16K                   204 

Freelists in the rollback segments are slightly different from the freelists in the data/index segments. There is a maximum of 5 blocks linked in the rollback segment header, which has more than 400 bytes of free space. The free extent pool is visible in the rollback segment header dumps.

FREE BLOCK POOL:
        uba: 0x00c0040a.0071.05 ext: 0x6 spc: 0x578 
    uba: 0x00000000.0070.05 ext: 0x5 spc: 0x654 
    uba: 0x00000000.0052.09 ext: 0x1 spc: 0x530 
    uba: 0x00000000.0000.00 ext: 0x0 spc: 0x0 
    uba: 0x00000000.0000.00 ext: 0x0 spc: 0x0

Consider that there are 6 blocks linked into a freelist chain. There is a segment FREE_LIST that has 120 blocks starting from 1 to 120. Assume the high watermark is at 80.

10         24          45         46          65        80

             24	       45	       46	       65	        80 	      **

For ease of understanding, absolute numbers are being used as the data block addresses. The real address of the data block (DBA) is completely different from the data block numbers 

Now there is a new INSERT request, which needs 400 bytes of free space. That space is available in block 24. Now the data is put into block 24 and the free space in that block becomes lesser than the PCTUSED defined for that table. So the block 24 is unlinked from the freelist chain. 

The sole purpose of PCTFREE and PCTUSED parameters are to control the datablock movement from/to the freelist chain. Now the freelist chain will look like this:

10          45        46         65         80

       45	           46	       65	        80	           **

Subsequently the same transaction DELETEs some data in that segment. This results in the space in blocks 54 and 67 falling below PCTUSED. Now these blocks are linked in the freelist chain. The chain will look like this:

67          54        10         45         46         65        80 

          54	       10	         45	      46	         65	     80	       **

Consider a case where the freelist chain has grown very big. There may be thousands of blocks linked into a freelist when the table is huge. Now the freelist chain will look similar to this: 

1024     205        84        10        25         <>        80

      205	         84	      10	       25	          <	     80	       **

In such cases, while looking for an eligible block during inserts Oracle will only look for the first five blocks in the freelist chain. If it does not find a suitable block in those five blocks, it simply ignores the rest of the chain and formats a new set of five blocks. Newly formatted blocks are added to the head of the freelists and the high water mark (HWM) is bumped by five blocks. Based on this the freelist chain will look like this:

1030      1029    1028     1027     1026      <>          80

     1029	      1028	     1027	     1026	       <	        80	              **

Note that the new blocks are always linked at the head of the freelist chain. Therefore the last block linked in the freelist chain will be the first block unlinked during the inserts. While walking in the freelist chain looking for an eligible block, oracle may either skip those blocks or unlink the blocks from the freelists. This is the only situation where a block with space usage is below PCTUSED and not in the freelist.

Process Freelists (simply PFL) are created to avoid the contention in master freelists. The number of PFLs is defined through the FREELISTS parameter in the CREATE TABLE command. The number of PFLs should be closer to the number of concurrent inserts in that segment. This improves the INSERT performance at the cost of additional space.

The optimal value for processing freelists in a segment is often recommended as 2 * # of CPUs. Having more Process freelists will 'artificially increase' the high water mark and put an unnecessary load on the IO subsystem. During the sequential reads oracle has to scan more numbers of blocks than required. The number of blocks allocated in a master freelist, while bumping the high watermark, is always a function of the number of Process freelists.

The maximum number of process freelists per segment is block size dependent. (For example it is 101 for 8k data block). The freelist allocation for a process is done by a simple hashing algorithm PFL#=mod (PID, n)+1. This algorithm slightly differs in the OPS environment. Assume in one segment there are five process freelists (PrFL). A process with a Process ID 16 is asking for a block for INSERT. In this case PFL2 is assigned to that process.

Transaction freelists are created dynamically during the lifecycle of a transaction. The search for a new block starts with its own transaction freelists. Failing this, it goes to its own process freelist. If it does not find an eligible block, it will ask from the Master Freelist.

Starting with the 8i release 2 (8.1.6,) the number of process freelists can be controlled dynamically without recreating the object by using the ALTER TABLE table_name STORAGE (FREELISTS xx) command. You may need an exclusive lock on that segment, which is visible as TX.

The overhead of having more process freelists is much more than the potential benefits. The higher number of freelists artificially increases the high watermark of the segment. There should be a careful balance between the number of concurrent inserts and the number of process freelists.

Oracle will not keep on searching for an eligible block in the freelist chain, when a new INSERT request comes. It searches only the first five blocks in the freelist (which are linked to the freelist as last five blocks) and this number is tunable through the '_walk_insert_threshold' parameter. If it does not find an eligible block in that '_walk_insert_threshold' number of blocks, it allocates five NEW blocks and bumps the high water mark to the new value. If more process freelists are configured for that segment, the high watermark is incremented to that many numbers of 5 blocks.

The '_bump_high_watermark_count' defines the number of blocks to be bumped during the new block allocation to the master freelist. The new blocks are allocated to the head of the master freelist. This comes to the transaction freelists via the process free list.

Ineligible blocks happen due to another parameter called '_release_insert_threshold' that also defaults to five blocks. This parameter controls the maximum number of blocks to be unlinked from a freelist during an INSERT.

In earlier versions (8.1.5 and below) you have to recreate the segment to alter the 'freelists'. Starting from Oracle 8.1.6 the creation of segment with the "freelist" option is dynamic and can be altered without recreating the segment using the Alter command. 

Oracle9i enhancements

Starting with Oracle9i, the freelist management inside a segment is done by bitmap blocks (BMB). These reside in the segment header and block header. Each block has a flag in the header part about the space usage inside that block.

In this case the old storage parameters such as PCTFREE, PCTUSED and FREELIST parameters are ignored while creating a segment. But the CREATE TABLE command will not err out. This is provided for downward compatibility. The storage settings will not be considered in Oracle9i.