Reorg is a close friend of a DBA. But who and what is a reorg ? Why do we need him ?
Reorg is an utility in DB2 that reorganizes the data is the clustering sequences and can free up empty space. Reorg has evolved a lot in recent versions of DB2. It used to be during reorgs the table is offline, now the reorgs allow read/write access to the table and can be paused, resumed or even stop in the middle and the table is not effected. Why do we need to reorg tables ? Well think about a well sorted deck of cards, isn’t it easy to find a particular card while its still sorted. Now once playing a few rounds of poker, check the deck. Are they still sorted, I don’t think so ! Now try to find a particular card , i tell you its going to take more time to go through them to find it. In other words the performance to find the card has decreased. So inorder to search through the deck quicker, you need to sort the cards again…right ? Thats exactly what reorg does to DB2 tables and indexes. A reorg sorts or organizes the table in the sequence of the index, clean up page splits, clean up empty pages and so on.
How do you find what all tables and indexes actually need to be reorg’d ?
This is where the reorgchk utility comes into play. DB2 provides another need little utility that checks various aspects of table and index statistics to figure out what needs to be reorg’d . Ensure you have good runstats before you run reorgchk to get the current state of the tables and indexes.
There are two type of reorg available in DB2 LUW :
1. Offline table reorganization uses a shadow copy approach, building a full copy of the table that is being reorganized. The shadow copy image is built within the table space that the table being reorganized resides in. Optionally, the copy can be built in a system temporary table space if the USE option is specified when issuing the REORG TABLE command. During offline reorg application can read the table while the copy is being build, but reorg takes exclusive control of the table after that. This type of reorg is not practical for 24 x 7 tables.
2. Online table reorganization allows the user to reorganize a table while permitting read / write access to the table. While inplace REORG provides uninterrupted user access to the data, the performance of inplace REORG is slower than offline REORG. During an inplace table reorganization, the entire table is not reorganized at once. Instead, portions of the table are reorganized incrementally. Data is not copied out to a temporary table space: rows are moved within the existing table object to re-establish clustering, reclaim free spaces, and eliminate overflow rows.
One of the issues I have faced while doing online or inplace reorg is , they are asynchronous . So, if you have 50 tables in your script to reorg, all of them can potentially run at the same time. This causes few problems :
a. Each reorg uses different agents, so it can eat up available free agents
b. The i/o can be go high due to multiple of them running at the same time
c. Inplace reorgs log each data movement, so multiple reorgs running at the same time can cause the active logs to spike.
Unfortunately, IBM does not provide any easy way to handle the issue. While pondering my thoughts on that, I wrote a simple script to run inplace reorg one at time . The script basically submits a reorg statements and checks for the presence of db2reorg thread, as long as it finds one it waits and it runs the next reorg statment when the previous one ends.
The script is attached before. Use it after making necessary changes to your convenience and test it before you implement it
Inplace reorg :reorg_inplace.txt