.3 CSCI315 Database Design and Implementation Experiment 4.3 How to defragment persistent storage at an extent level ? Experimented and described by Dr. Janusz R. Getta School of Computer Science and Software Engineering, University of Wollongong, Australia, Bldg. 3, room 210, phone +61 02 42214339, fax +61 02 42214170, e-mail: [email protected], Web: http://www.uow.edu.au/∼jrg, Msn: jgetta, Skype: jgetta007 Table of contents Step 0 How to begin and what you need to know before you start ? Step 1 How to find the values of PCTRFREE and PCTUSED parameters ? Step 2 How to analyze the storage parameters of a relational table ? Step 3 How to find the values of row identifiers ? Step 4 How to analyze the distribution of rows within the data blocks ? Step 5 How to eliminate the empty blocks from the extents ? Step 6 How to release the empty extents ? Step 7 How to clean up after the experiment ? References Actions Step 0 How to begin and what you need to know before you start ? A printable copy of this experiment in pdf format is available here .We speak about fragmentation of persistent storage at an extent level in a situation when an extent consists of many data blocks and some of these blocks are empty. Then, it is possible to move the rows 0 These are the specifications of the homeworks in a subject Database Design and Implementation (CSCI315) delivered in January 2009 at Singapore Institute of Management by Dr. Janusz R. Getta 4-1 Experiment 4.3: How to defragment persistent storage at an extent level ? 4-2 from the other not fully occupied extents into a lower number of extents. It reduces the total number of read block operations performed by the system during a full scan of relational table. Defragmentation at an extent level means that empty blocks within the extents are removed and the empty extents allocated at the end of segment are released. Defragmentation at an extent level is performed when a segment goes through intensive deletion operations or when a parameter PCTFREE has been set to a wrong value. You can learn more about PCTFREE and PCTUSED parameters at the end of this action. Turn the system on (if it is off) and logon as PC user. Download and uncompress SQL scripts used in Homework 4. Use cd command to navigate to a folder where the downloaded and uncompressed SQL scripts are located. Start SQL*Plus client in a way described in either Experiment 1.1 for XP operating system or in Experiment 1.2 for Linux operating system. Connect as a user STUDENT . A password is: student .While connected as a user STUDENT , execute a script file creatbs40k.sql to create a locally managed tablespace TBS40K with a uniform size of each extent equal to 40 Kbytes.Remain connected as a user STUDENT . Next, execute a script grantquota.sql to grant 5 Mbytes quota on a tablespace TBS40K to a user STUDENT .Execute a script makedef.sql to make a tablespace TBS40K a default tablespace of a user STUDENT .Execute a script dbcreate4.sql to create a sample database. The space management parameters PCTFREE and PCTUSED control the use of free space within a data block by insert, update, and delete statements. Both parameters should be determined when creating (or altering) the database objects. PCTFREE parameter sets the minimum percentage of a data block to be reserved as free space for possible updates to rows that already exist in that block. For example, assume that CREATE TABLE statement sets a value of PCTFREE to 20. It means that 20will be kept free and it will be available for the future updates on the rows within each block. New rows can be inserted into a block before it is 80PCTUSED parameter sets the minimum percentage of a block that can be used for row data plus overhead before new rows can be added to the block. After a data block is filled to the limit determined by PCTFREE , the system considers the block as unavailable for the insertions of new rows until the percentage of occupancy in that block falls below a value of parameter PCTUSED . Until this value is achieved, the system uses free space in the data block only for the updates to the rows already contained in the block. For example, assume that CREATE TABLE statement sets PCTUSED to 40. Then, each data block used for this table’s data segment is considered unavailable for the insertion of any new rows until the amount of used space in the block falls to 39(assuming that the block’s used space has previously reached a value of PCTFREE ). Freelist is a list of data blocks allocated for a segment’s extents such that each block has an amount of free space greater than a value of parameter PCTFREE for this segment. When the system processes INSERT statement then a free list is scanned to find the first block available for the insertion. Processing of DELETE and UPDATE statements that decrease amount of storage used in a block below a value of PCTUSED parameter cause the block to be appended to a free list. Then, subsequent insertions into the block reduce amount of free space below PCTFREE the block is excluded from a freelist. A freelist forms a chain of free blocks in each segment. There are different types of freelists: master freelist, process freelist and transaction freelist. Experiment 4.3: How to defragment persistent storage at an extent level ? 4-3 Step 1 How to find the values of PCTFREE and PCTUSED parameters ? While connected as a user STUDENT , execute a script file listpcts.sql to list the values of PCTFREE and PCTUSED parameters for the relational tables POSITION, SNEEDED, APPLICANT owned by a user CSCI315 .For example, to get the values of PCTFREE and PCTUSED parameters for a relational table APPLICANT owned by a user CSCI315 the script accesses a data dictionary view DBA_TABLES in the following way: SELECT TABLE_NAME, OWNER, TABLESPACE_NAME, PCT_FREE, PCT_USED FROM SYS.DBA_TABLES WHERE TABLE_NAME = ’APPLICANT’ AND OWNER = ’CSCI315’; All relational tables owned by a user CSCI315 and included in a sample database of TPC-W benchmark have the default values of PCTFREE and PCTUSED parameters equal to PCTFREE=10 and a value of PCTUSED is NULL because a tablespace CSCI315 has been created with a default value of parameter SEGMENT SPACE MANAGEMENT i.e. a value used wasAUTO . Step 2 How to analyze the storage parameters of a relational table ? A data dictionary view DBA_TABLES contains a lot of statistical data related to the storage parameters of the relational tables. For example, it contains information about a number of empty (never used) blocks, the total number of rows stored in a relational table, average amount of free space per data block, the total number of chained rows, i.e. the rows that span over more than one block, average row length, average amount of free space of all blocks listed on the freelists, the total number of blocks on freelist, etc. While connected as a user STUDENT , execute a script file listanalysis.sql to analyze the storage structures of a relational table APPLICANT owned by a user STUDENT .The script executes ANALYZE TABLE statement with COMPUTE STATISTICS clause in order to perform the full analysis of a relational table. The results of analysis are stored in a data dictionary view DBA_TABLES and retrieved by the script from there. Because of the performance reasons the system does not update all statistics after each modification of a table. For example, to analyse a relational table ITEM owned by a user CSCI315 the script executes the following statements: ANALYZE TABLE CSCI315.ITEM COMPUTE STATISTICS; Experiment 4.3: How to defragment persistent storage at an extent level ? 4-4 SELECT TABLE_NAME, OWNER, TABLESPACE_NAME, NUM_ROWS, AVG_SPACE, CHAIN_CNT FROM SYS.DBA_TABLES WHERE TABLE_NAME = ’ITEM’ AND OWNER = ’CSCI315’; SELECT TABLE_NAME, AVG_ROW_LEN, NUM_FREELIST_BLOCKS, AVG_SPACE_FREELIST_BLOCKS FROM SYS.DBA_TABLES WHERE TABLE_NAME = ’ITEM’ AND OWNER = ’CSCI315’; Remain connected as a user STUDENT and execute a script listanalysis.sql to perform the analysis of the relational tables ITEM, CUSTOMER, COUNTRY owned by a user CSCI315 .Analysis of a relational table ITEM provides the following data. The table consists of 10,000 rows. Average amount of free space per block is equal to 1,112 bytes. It means that an average block is around 86There is no rows that span over more than one data block, i.e. so called ”chained rows”. Average row length is 502 bytes, which means that each block contains on average 14 rows. The total number of allocated data blocks and still available for insertions, i.e. the blocks where occupancy level after an insertion does not exceed 100-PCTFREE = 90equal to 5. An average free space in each block available for insertion is equal to 7,346 bytes. Step 3 How to find the values of row identifiers ? A row identifier uniquely determines a physical location of a row in a persistent storage. A row identifier consists of the following values: (i) file number, (ii) block number within a file, and (iii) row number within a block. Access to a relational table through a row identifier is the fastest possible way how a row can be read from a relational table. In implementation of data application this kind of access is not recommended as the system does not guarantee persistence of the row identifiers, i.e. reorganisation of a relational table changes the values of row identifiers. While connected as a user STUDENT , execute a script file listrowids.sql to lists the identifiers of rows in a relational table APPLICANT owned by a user STUDENT .The script uses the functions ROWID_RELATIVE_FNO, ROWID_BLOCK_NUMBER, and ROWID_ROW_NUMBER from a PL/SQL package DBMS_ROWID to extract the values of file number, Experiment 4.3: How to defragment persistent storage at an extent level ? 4-5 relative block number in a file, and relative row number in a block. For example, the identifiers of rows located in a relational table COUNTRY can be listed in the following way: SELECT DBMS_ROWID.ROWID_RELATIVE_FNO(ROWID) FNO, DBMS_ROWID.ROWID_BLOCK_NUMBER(ROWID) BNO, DBMS_ROWID.ROWID_ROW_NUMBER(ROWID) RNO FROM CSCI315.COUNTRY ORDER BY BNO; Execute a script listrowids.sql to list the identifiers of rows in a relational table COUNTRY owned by a user CSCI315 .The script reports the locations of 92 rows included in a relational table COUNTRY All rows are located in one data block. File number is equivalent to file identifier ( File id. ) reported by a script listfiles.sql . When listing the names of files enter CSCI315 as name of tablespace. Execution of the script provides a full name of a file that contains the rows included in a relational table COUNTRY . Step 4 How to analyze the distribution of rows within the data blocks ? While connected as a user STUDENT , execute a script creatab2k.sql that contains the following statement. CREATE TABLE TAB2K( ATTRIBUTE1 CHAR(1024), ATTRIBUTE2 CHAR(1024)) PCTFREE 30 PCTUSED 70; The script creates a relational table TAB2K with the row length equal to 2 Kbytes and located in a locally managed tablespace TBS40K with the uniform extent size equal to 16 Kbytes. At the moment the tablespace is a default tablespace of a userSTUDENT .Each row in a relational table TAB2K consist of two fixed size strings each 1 Kbytes long. Size of each extent is equal to 40 Kbytes. A value of PCTFREE parameter is equal to 30, which means that 30must be left free for updates. A value of PCTUSED parameter is equal to 70, which means a block will be available for insertions when its occupancy level will go below 70Execute a script listexts.sql . The script should find the allocation of extents in a relational table TAB2K .The results reported by the script confirms our expectations that a relational table TAB2K consist of one extent 40 Kbytes long (5 data blocks, each 8 Kbytes). Note, that the first three block in the first extent are always used by the system for the organizational purposes. Next, execute a script listanalysis.sql to analyze and to list the storage properties of a relational table TAB2K . The results of analysis are trivial. At the moment a relational table TAB2K is empty. Execute a script ins7rows.sql that inserts 7 rows like: Experiment 4.3: How to defragment persistent storage at an extent level ? 4-6 INSERT INTO TAB2K VALUES(’01’,’01’); Then, execute a statement: SELECT COUNT(*) FROM TAB2K; to verify the total number of rows in a relational table TAB2K .Insertions of the rows should trigger allocation of the new extents. Note that, a value of an attribute ATTRIBUTE1 in a table TAB2K uniquely identify each row. It allows for the selective deletions of the rows later on. Execute a script listexts.sql to find the total number of extents allocated for a relational table TAB2K after the insertions. To accommodate 7 rows the system allocated 2 extents: 40 Kbytes and 40 Kbytes. A data block, which is 8 Kbytes long may theoretically accomodate 3 rows. It is impossible to store 4 rows because to store a row in in a block the system needs 2 Kbytes for data included in a row and 2 bytes for an entry in a row directory within a block. Note, that a relational table TAB2K has been created with a value of parameter PCTFREE equal to 30. It means that 30Therefore, only 8 Kbytes - (8 Kbytes*0.3) = 5.6 Kbytes is available for the insertions. A value of PCTFREE parameter equal to 30 limits the total number of rows per block to 2. Remain connected as a user STUDENT . To verify the row distribution analysis described above, execute a script listrowids.sql for a relational table TAB2K . The results listed by the script confirm our expectations - each data block (except the first one) contains 2 rows. Execute a script listmap.sql to list a map of free an allocated extents in a tablespace TBS40K .The tablespace consists of a number of extents. Some of the extents are allocated by the relational tables of a sample database created at the very beginning of this experiment. Two extents, each one 40 Kbytes long, are allocated by a relational table TAB2K The last extent is free and it contains all free space in the tablespace. Next, Execute a script listanalysis.sql to analyze and list the storage properties of a relational table TAB2K after the insertion of 7 rows. The results of analysis are as follows. Average free space per block is equal to 6,013 bytes. This is because the total amount of disk storage allocated in 2 extents is is 7 blocks * 8 Kbytes = 56 Kbytes. If we forget about the space consumed by a row dictionary and variable headers in the blocks then implementation of a row takes 2 Kbytes. It means that data occupy 7 rows * 2 Kbytes = 14 Kbytes. Hence, 56 Kbytes - 14 Kbytes = 42 Kbytes are free. A free area is distributed over 7 blocks (as we already mentioned the first extent contributes only with two blocks), hence an average free space per block is equal to 42 Kbytes / 7 blocks = 6 Kbytes, which is almost exactly the same result as obtained from ANALYZE TABLE statement. Execution of a script listanalysis.sql reports that there are no data blocks on free list. Remain connected as a user STUDENT . In the next step, execute a script del4rows.sql to delete the first 4 rows from a relational table TAB2K and to commit the deletions. The script executes the statements: DELETE FROM TAB2K Experiment 4.3: How to defragment persistent storage at an extent level ? 4-7 WHERE ATTRIBUTE1 <= ’04’; COMMIT; Again, execute the scripts: listexts.sql , listanalysis.sql , listrowids.sql .The results of the analysis performed above show that even after the deletions a relational table TAB2K still consists of 3 extents. None of the extents has been deallocated because the table has not been dropped or truncated. Comparison if the results returned listrowids.sql with the results produced earlier shows that the second block in the first extent and the first block in the second extent are empty. These blocks has been attached to a freelist of blocks for this segment. This is why the total number of blocks on a freelist is now equal to 4. The blocks attached to a freelist are available for the insertions. It is also an evidence that deletion of the rows from a relational table does not return the earlier allocated extents to a pool free space available in a tablespace. Deletion of the rows from a relational table creates the ”holes” inside the extents. This effect is called as fragmentation of disk space at an extent level. Unfortunately, the system does not remove these holes automatically. It may happen that even more than 50is not used and still a full scan of the table access all data blocks. Therefore, extent-level fragmentation has a very negative impact on performance of full table scans. Step 5 How to eliminate the empty blocks from the extents ? To eliminate fragmentation of persistent storage at an extent level we can use the same techniques as for defragmentation at a segment level, i.e. either create a new table and copy the contents of old table to a new table, drop the table, and rename new into old, or export and import a table. Elimination of fragmentation at an extent level allows more rows to be packed within each block and as the result it reduces a number of read block operations. While connected as a user STUDENT , execute a script creanew2k.sql that creates a relational table NEW2K in the following way. CREATE TABLE NEW2K( ATTRIBUTE1 CHAR(1024), ATTRIBUTE2 CHAR(1024)) PCTFREE 30 PCTUSED 70; A relational table NEW2K has the same schema as the original table TAB2K . Execute a script file copytab2k.sql that copies the contents of a relational table TAB2K into a relational table NEW2K . The script file contains the following statement. INSERT INTO NEW2K ( SELECT * FROM TAB2K ); Experiment 4.3: How to defragment persistent storage at an extent level ? 4-8 Execute the scripts: listexts.sql , listanalysis.sql , listrowids.sql . to list information about the internal structures of a relational table NEW2K .A copy of the contents of a relational table TAB2K into a relational table NEW2K allocated two extents instead of three extents allocated by the original table. Three extents are enough to accommodate three rows in a relational table NEW2K . The results of analysis also show that average amount of free space per block has decreased, which means more economic usage of disk storage. Step 6 How to release the empty extents ? While connected as a user STUDENT , execute a script deltab2k.sql to delete all rows from a relational table TAB2K and to commit the deletions. The script contains the following statements. DELETE FROM TAB2K; COMMIT; Next, execute the scripts listexts.sql listrowids.sql to find the total number of extents allocated and to find the identifiers of rows in a relational table TAB2K Despite, that the table is empty it still consists of three extents. To release the unused extents we have to use TRUNCATE TABLE statement. Remain connected as a user STUDENT and truncate a relational table TAB2K with TRUNCATE statement with DROP STORAGE option, see below. TRUNCATE TABLE TAB2K DROP STORAGE; Next, execute a script listexts.sql to check the total number of allocated extents. At this point an empty relational table TAB2K allocates only one extent, which consists of two data blocks. It is the smallest possible allocation for an empty relational table. Step 7 How to clean up after the experiment ? While connected as a user STUDENT , execute a script makedef.sql to make a tablespace USERS a default tablespace of a user STUDENT .At the end of the experiment we remove the database objects created so far. To drop the tablespaces created in this experiment connect as a user STUDENT and execute a script file clean4-3.sql that contains the following statement. DROP TABLESPACE TBS40K INCLUDING CONTENTS AND DATAFILES; References Experiment 4.3: How to defragment persistent storage at an extent level ? SQL Reference, CREATE TABLESPACE statement SQL Reference, DROP TABLESPACE statement SQL Reference, ALTER USER statement SQL Reference, CREATE TABLE statement SQL Reference, DROP TABLE statement SQL Reference, ANALYZE statement SQL Reference, INSERT statement SQL Reference, DELETE statement SQL Reference, TRUNCATE statement Reference, DBA TABLES view Reference, DBA DATA FILES view Reference, DBA EXTENTS view Reference, DBA FREE SPACE view PL/SQL Packages and Types Reference, DBMS ROWID package 4-9
© Copyright 2024