了解高水位与truncate

什么是High Water Mark 高水位?

HWM

“The high water mark starts at the first block of a newly created table. Ads data is inserted, the high water mark rises. And the HWM will remain at that level in spite of delete operations.
The HWM matters since Oracle will scan all blocks below the HWM even when they contain no data during a full scan – just to see if they have data. TRUNCATE will reset the HWM, so will other operations described next.”

我们知道Oracle中的数据段(segment)均有一个存放数据的上边界。这条上边界被称作High Water Mark”高水位”或者简写为HWM。

高水位标记了那些已经分配给数据段但还没有真正被使用的数据块。常规情况下高水位以每次5个数据块的速度上涨。全表扫描通常从起始Extent开始到高水位标记结束。注意delete操作是无法降低高水位线的,而truncate操作可以,不管是truncate drop storage还是 truncate use storage 均会把 高水位线”置零”。

这里的置零具体是指:

在truncate drop storage 的情况下:

SQL> create table maclean_test1 tablespace users as select * from dba_objects;
Table created.

SQL> create index objd_test1 on maclean_test1(object_id) tablespace users;
Index created.

SQL> create table maclean_test2 tablespace users as select * from dba_objects;
Table created.

SQL> create index objd_test2 on maclean_test2(object_id) tablespace users;
Index created.

SQL> exec dbms_stats.gather_table_stats('SYS','MACLEAN_TEST1',cascade=>TRUE);
PL/SQL procedure successfully completed.

SQL> exec dbms_stats.gather_table_stats('SYS','MACLEAN_TEST2',cascade=>TRUE);
PL/SQL procedure successfully completed.

SQL> analyze index objd_test1 validate structure;
Index analyzed.

SQL> select HEIGHT,BLOCKS,LF_BLKS,BR_BLKS,BTREE_SPACE from index_stats;

    HEIGHT     BLOCKS    LF_BLKS    BR_BLKS BTREE_SPACE
---------- ---------- ---------- ---------- -----------
         2         40         29          1      239912

SQL> analyze index objd_test2 validate structure;

Index analyzed.

SQL> select HEIGHT,BLOCKS,LF_BLKS,BR_BLKS,BTREE_SPACE from index_stats;

HEIGHT BLOCKS LF_BLKS BR_BLKS BTREE_SPACE
---------- ---------- ---------- ---------- -----------
2 40 29 1 239912

set serveroutput on
DECLARE
TOTAL_BLOCKS number;
TOTAL_BYTES number;
UNUSED_BLOCKS number;
UNUSED_BYTES number;
LAST_USED_EXTENT_FILE_ID number;
LAST_USED_EXTENT_BLOCK_ID number;
LAST_USED_BLOCK number;
BEGIN
dbms_space.unused_space('SYS',
'MACLEAN_TEST1',
'TABLE',
TOTAL_BLOCKS,
TOTAL_BYTES,
UNUSED_BLOCKS,
UNUSED_BYTES,
LAST_USED_EXTENT_FILE_ID,
LAST_USED_EXTENT_BLOCK_ID,
LAST_USED_BLOCK);

dbms_output.put_line('OBJECT_NAME = MACLEAN_TEST1');
dbms_output.put_line('-----------------------------------');
dbms_output.put_line('TOTAL BLOCKS = ' || TOTAL_BLOCKS);
dbms_output.put_line('TOTAL SIZE(KByte) = ' ||
TOTAL_BYTES / 1024);
dbms_output.put_line('UNUSED BLOCKS = ' || UNUSED_BLOCKS);
dbms_output.put_line('UNUSED SIZE(KByte) = ' ||
UNUSED_BYTES / 1024);
END;
/
OBJECT_NAME = MACLEAN_TEST1
-----------------------------------
TOTAL BLOCKS = 256
TOTAL SIZE(KByte) = 2048
UNUSED BLOCKS = 67
UNUSED SIZE(KByte) = 536

PL/SQL procedure successfully completed.

SQL> truncate table maclean_test1 drop storage;
Table truncated.

exec dbms_stats.gather_table_stats('SYS','MACLEAN_TEST1',cascade=>TRUE);
PL/SQL procedure successfully completed.

set serveroutput on
DECLARE
TOTAL_BLOCKS number;
TOTAL_BYTES number;
UNUSED_BLOCKS number;
UNUSED_BYTES number;
LAST_USED_EXTENT_FILE_ID number;
LAST_USED_EXTENT_BLOCK_ID number;
LAST_USED_BLOCK number;
BEGIN
dbms_space.unused_space('SYS',
'MACLEAN_TEST1',
'TABLE',
TOTAL_BLOCKS,
TOTAL_BYTES,
UNUSED_BLOCKS,
UNUSED_BYTES,
LAST_USED_EXTENT_FILE_ID,
LAST_USED_EXTENT_BLOCK_ID,
LAST_USED_BLOCK);

dbms_output.put_line('OBJECT_NAME = MACLEAN_TEST1');
dbms_output.put_line('-----------------------------------');
dbms_output.put_line('TOTAL BLOCKS = ' || TOTAL_BLOCKS);
dbms_output.put_line('TOTAL SIZE(KByte) = ' ||
TOTAL_BYTES / 1024);
dbms_output.put_line('UNUSED BLOCKS = ' || UNUSED_BLOCKS);
dbms_output.put_line('UNUSED SIZE(KByte) = ' ||
UNUSED_BYTES / 1024);
END;
/

OBJECT_NAME = MACLEAN_TEST1
-----------------------------------
TOTAL BLOCKS = 8
TOTAL SIZE(KByte) = 64
UNUSED BLOCKS = 5
UNUSED SIZE(KByte) = 40

 这里一个非deffered segment至少仍会使用3个数据块,  而truncate drop storage后高水位线就在这第三个数据块之上

SQL> analyze index objd_test1 validate structure;

Index analyzed.

SQL> select HEIGHT,BLOCKS,LF_BLKS,BR_BLKS,BTREE_SPACE from index_stats;

    HEIGHT     BLOCKS    LF_BLKS    BR_BLKS BTREE_SPACE
---------- ---------- ---------- ---------- -----------
         1          8          1          0        7996

 对表truncate drop storage后,表上的索引收缩到高度为1,且只有一个叶子块

SQL> truncate table maclean_test2 reuse storage;

Table truncated.

SQL> exec dbms_stats.gather_table_stats('SYS','MACLEAN_TEST2',cascade=>TRUE);
PL/SQL procedure successfully completed.

OBJECT_NAME = MACLEAN_TEST2
-----------------------------------
TOTAL BLOCKS = 256
TOTAL SIZE(KByte) = 2048
UNUSED BLOCKS = 253
UNUSED SIZE(KByte) = 2024

PL/SQL procedure successfully completed.

 truncate reuse storage后高水位同样回归到第三个数据块以上, 而该数据段的空间并没有回收给表空间

SQL> analyze index objd_test2 validate structure;

Index analyzed.

SQL> select HEIGHT,BLOCKS,LF_BLKS,BR_BLKS,BTREE_SPACE,DEL_LF_ROWS from index_stats;

    HEIGHT     BLOCKS    LF_BLKS    BR_BLKS BTREE_SPACE DEL_LF_ROWS
---------- ---------- ---------- ---------- ----------- -----------
         1         40          1          0        7996           0

 与表段类似索引中的数据已被删除,但是extent空间并不被回收

 

总结以上测试结果:

Truncate Drop Storage:

  1. 高水位下降到所能下降的最低位置
  2. 删除表上的所有数据行并释放空间
  3. 删除索引上的所有数据,索引的实际结构收缩到高度为1的最少数据块,并释放原有空间

Truncate Reuse Storage:

  1. 高水位下降到所能下降的最低位置
  2. 删除表上的所有数据行,但是不释放空间
  3. 删除索引上的所有数据,索引的实际结构收缩到高度为1的最少数据块,但不释放原有空间

此外值得一提的是11.2.0.2中出现truncate的新特性,截断表目前有了第三种选项:即drop all storage。

对表执行drop all storage的truncate将会导致与该表相关的所有segment均被drop掉,如下例:

SQL> select * from v$version;

BANNER
--------------------------------------------------------------------------------
Oracle Database 11g Enterprise Edition Release 11.2.0.2.0 - 64bit Production
PL/SQL Release 11.2.0.2.0 - Production
CORE 11.2.0.2.0 Production
TNS for Linux: Version 11.2.0.2.0 - Production
NLSRTL Version 11.2.0.2.0 - Production

select * from global_name;
GLOBAL_NAME
---------------------------------------------
www.askmac.cn

conn maclean/maclean

SQL> create table maclean_test3 tablespace users as select * from dba_objects;
Table created.

SQL> create index objd_test3 on maclean_test3(object_id) tablespace users;
Index created.

SQL> col segment_name for a20

SQL> select segment_name,blocks from dba_segments where segment_name in ('MACLEAN_TEST3','OBJD_TEST3');

SEGMENT_NAME             BLOCKS
-------------------- ----------
MACLEAN_TEST3               256
OBJD_TEST3                   40

SQL> truncate table maclean_test3 drop all storage;
Table truncated.

SQL> select segment_name,blocks from dba_segments where segment_name in ('MACLEAN_TEST3','OBJD_TEST3');

no rows selected

以上可以看到在普通用户模式下对表truncate drop all storage后,该表相关的segment均被事实上的drop了。

注意该drop all storage特性对于SYS用户模式下的对象是无效的,如:

SQL> conn / as sysdba
Connected.

SQL> create table tab1(t1 int);

Table created.

SQL> insert into tab1 values(1);

1 row created.

SQL> commit;

Commit complete.

SQL> select segment_name,blocks from user_segments where segment_name in ('TAB1');

SEGMENT_NAME             BLOCKS
-------------------- ----------
TAB1                          8

SQL> truncate table tab1 drop all storage;

Table truncated.

SQL> select segment_name,blocks from user_segments where segment_name in ('TAB1');

SEGMENT_NAME             BLOCKS
-------------------- ----------
TAB1                          8

Know about Oracle High Water Mark

there’s no HWM for datafiles, it’s just a virtual term to describe the last block containing data in the data file, which is the minimum size allowed for sizing down the data file.

This article intends to provide an SQL script to find tables which are fragmented (i.e Data is much lower then High Water Mark),so that we can target those segments (tables) for recreation.

Software Requirements/Prerequisites

Execution Environment    :SQL, SQL*Plus

Access Privileges              :Requires dba privileges as script is to be run as   the owner SYS or SYSTEM

Prerequisites                     :Do an Analyze  with compute statistics on all tables present in the Users schema

i.e Analyze table <table_name> compute statistics

Usage                                :Sqlplus username/<password>

SQL> @fragment.sql

Advisory                            Will not work on Compressed tables, may return negative numbers.

 

Configuring the Script

1.User needs dba privileges to access dba_tables  .

2.Statistics needs to be collected on all the tables using  compute statistics

option for the input schema before fragment.sql is   run.

Running the Script

Step 1:- Copy this script to a file named fragment.sql.

Step 2:- Connect as user SYS or SYSTEM.

Step 3:- Run Analyze on all the tables present in the schema  for which you want to find the fragmented table.

SQL> Analyze table <table_name> compute statistics ;

Step 4:- Execute the fragment.sql script.Note the script will prompt for Schema name.

SQL> @fragment.sql

 

Caution

This script is provided for educational purposes only and not supported by Oracle Support Services. It has been tested internally, however, and works as documented. We do not guarantee that it will work for you, so be sure to test it in your environment before relying on it. Proofread this script before using it! Due to the differences in the way text editors, e-mail packages and operating systems handle text formatting (spaces, tabs and carriage returns), this script may not be in an executable state when you first receive it. Check over the script to ensure that errors of this type are corrected.

Script

REM This is an example SQL*Plus Script to find tables fragmentated below high water mark

set heading off verify off echo off
Spool fragment.sql

REM The below queries gives information about the size of the table with respect to the High water Mark
REM note that BLOCKS*8192 is BLOCKS times the block size: 8192.  Substitue your DB blocksize.
REM SELECT BLOCKS*8192/1024/1024 FROM  DBA_TABLES WHERE  TABLE_NAME='<TABLE_NAME>'  and    owner='<owner>'   ;
REM The below queries gives the actual size in MB used by the table in terms of data .
REM SELECT NUM_ROWS*AVG_ROW_LEN/1024/1024 FROM  DBA_TABLES WHERE TABLE_NAME='<TABLE_NAME>' and  owner='<owner'
REM
REM You can use the difference of the two sql statements specified above to get the table which
REM has fragementation below high water mark prompt Enter name(s) of schema for which you want to find
REM fragemented object.
PROMPT Please enter the schema name

SELECT TABLE_NAME ,  (BLOCKS *8192 / 1024/1024 ) - (NUM_ROWS*AVG_ROW_LEN/1024/1024)
"Data lower than HWM in MB"   FROM  DBA_TABLES WHERE  UPPER(owner) =UPPER('&OWNER') order by 2 desc;

Spool off

Goal

This article explains, with examples, how to view the high water mark and when the high water mark is reset. The queries given in this article applies when the segment , whose high water mark has to be determined ,  is in one datafile and is not spawned across multiple datafiles .

Solution

The high water mark is the boundary between used and unused space in a segment. As requests for new free blocks that cannot be satisfied by existing free lists are received, the block to which the high water mark points to becomes a used block, and the high water mark is advanced to the next block. In other words, the segment space to the left of the high water mark is used, and the space to the right of it is unused.

The high-water mark is the level at which blocks have never been formatted to receive data.

When a table is created in a tablespace, some initial number of blocks / extents are allocated to the table. Later, as the number of rows inserted increases, extents are allocated accordingly.

To find out how many blocks / extents are allocated to the table, query DBA_SEGMENTS for ‘blocks’ and ‘extents’.

For example:

SQL>create table test1 (num number) tablespace tbsp1;

Table created

SQL>select blocks, extents from dba_segments where segment_name=’TEST1′

BLOCKS EXTENTS
——– ————-
8 1

Now, to view the high water mark, perform an:

SQL> analyze table test1 compute statistics;

Querying dba_tables for ‘Blocks’ and ‘Empty_blocks’ should give the high water mark.

Blocks — > Number blocks that has been formatted to recieve data
Empty_blocks —> Among the allocated blocks, the blocks that were never used

SQL> select blocks,empty_blocks,num_rows from dba_tables where table_name=’TEST1′

BLOCKS EMPTY_BLOCKS NUM_ROWS
————– ————————– ——————–
0 7 0

If you insert some rows, then the output of the above query returns:

BLOCKS EMPTY_BLOCKS NUM_ROWS
————– ————————– ——————–
1 6 8

blocks+Empty_blocks=1+6=7 (but not 8) because 1 block is for segment header.

Insert some more rows into table ‘TEST1’ to increase the number of extents allocated so that
DBA_SEGMENTS will show :

BLOCKS EXTENTS
————– —————
32 4

And dba_tables (after analyze table) shows:

BLOCKS EMPTY_BLOCKS NUM_ROWS
————– ————————– ——————–
28 3 14338

Deleting the records doesn’t lower the high water mark. Therefore, deleting the records doesn’t raise the ‘Empty_blocks’. After deleting the records, if you query dba_segments or dba_tables, there would be no change. Even an ‘Alter table test1 deallocate unused;’ will not bring the high water mark down.

To determine the exact number of blocks that contain data i.e. space used by table below the high water mark, query rowid and get the first and last used block from the rowid.

SQL> select count (distinct dbms_rowid.rowid_block_number(rowid)) “used blocks” from TEST1;

This works fine if only one file is used for the segment. If more files, we need to include the file number in some way, for instance:

SQL> select count (distinct dbms_rowid.rowid_block_number(rowid)||’-‘||dbms_rowid.rowid_relative_fno (rowid)) “used blocks” from TEST1

USED_BLOCKS
———————–
22

From this we can conclude that for table ‘TEST1’, 32 blocks are allocated out of which 28 blocks are formatted to receive data but only 22 blocks contain the actual data.

The high water mark can be reset with a truncate table or if the table is moved to another tablespace.  Additionally, in 10g the following option to shrink a segment was introduced to reset the high water mark. eg. ALTER TABLE <tablename> SHRINK SPACE;

When the table is created with CTAS from another table, the high water mark of the latter table is not reflected in the new table. If the tablespace is moved back to the same tablespace, the high water mark is reset. In this case, query on obj#, dataobj# of obj$. Obj# remains the same but dataobj# changes.

 

PURPOSE
This article describes how to find out how many blocks are really being
used within a table ie. are not empty. Please note that this article does
not cover what to do when chaining is taking place.

SCOPE & APPLICATION
For DBA’s needing to determine how many blocks within a table are
empty blocks.

How many blocks contain data (are not empty)
——————————————–
Each row in the table has pseudocolumn called ROWID.
This pseudo contains information about physical location
of the row in format
block_number.row.file

If the table is stored in a tablespace which has one
datafile, all we have to do is to get DISTINCT
number of block_number from ROWID column of this table.

But if the table is stored in a tablespace with more than one
datafile then you can have the same block_number but in
different datafiles so we have to get DISTINCT number of
block_number+file from ROWID.

The SELECT statements which give us the number of “really used”
blocks is below. They are different for ORACLE 7 and ORACLE 8
because of different structure of ROWID column in these versions.

For ORACLE 7:

SELECT COUNT(DISTINCT SUBSTR(rowid,15,4)||
SUBSTR(rowid,1,8)) “Used”
FROM schema.table;

For ORACLE 8+:

SELECT COUNT (DISTINCT
DBMS_ROWID.ROWID_BLOCK_NUMBER(rowid)||
DBMS_ROWID.ROWID_RELATIVE_FNO(rowid)) “Used”
FROM schema.table;
or

SELECT COUNT (DISTINCT SUBSTR(rowid,1,15)) “Used”
FROM schema.table;

You could ask why the above information could not be determined
by using the ANALYZE TABLE command. The ANALYZE TABLE command only
identifies the number of ‘ever’ used blocks or the high water mark
for the table.

What is the High Water Mark?
—————————-
All Oracle segments have an upper boundary containing the data within
the segment. This upper boundary is called the “high water mark” or HWM.
The high water mark is an indicator that marks blocks that are allocated
to a segment, but are not used yet. This high water mark typically bumps
up at 5 data blocks at a time. It is reset to “zero” (position to the start
of the segment) when a TRUNCATE command is issued. So you can have empty
blocks below the high water mark, but that means that the block has been
used (and is probably empty caused by deletes). Oracle does not move the
HWM, nor does it *shrink* tables, as a result of deletes. This is also
true of Oracle8. Full table scans typically read up to the high water mark.

Data files do not have a high water mark; only segments do have them.

How to determine the high water mark
————————————
To view the high water mark of a particular table::

ANALYZE TABLEESTIMATE/COMPUTE STATISTICS;

This will update the table statistics. After generating the statistics,
to determine the high water mark:

SELECT blocks, empty_blocks, num_rows
FROM user_tables
WHERE table_name =;

BLOCKS represents the number of blocks ‘ever’ used by the segment.
EMPTY_BLOCKS represents only the number of blocks above the ‘HIGH WATER MARK’
.
Deleting records doesn’t lower the high water mark. Therefore, deleting
records doesn’t raise the EMPTY_BLOCKS figure.

Let us take the following example based on table BIG_EMP1 which
has 28672 rows (Oracle 8.0.6):

SQL> connect system/manager
Connected.

SQL> SELECT segment_name,segment_type,blocks
2> FROM dba_segments
3> WHERE segment_name=’BIG_EMP1′;
SEGMENT_NAME SEGMENT_TYPE BLOCKS EXTENTS
—————————– —————– ———- ——-
BIG_EMP1 TABLE 1024 2
1 row selected.

SQL> connect scott/tiger

SQL> ANALYZE TABLE big_emp1 ESTIMATE STATISTICS;
Statement processed.

SQL> SELECT table_name,num_rows,blocks,empty_blocks
2> FROM user_tables
3> WHERE table_name=’BIG_EMP1′;
TABLE_NAME NUM_ROWS BLOCKS EMPTY_BLOCKS
—————————— ———- ———- ————
BIG_EMP1 28672 700 323
1 row selected.

Note: BLOCKS + EMPTY_BLOCKS (700+323=1023) is one block less than
DBA_SEGMENTS.BLOCKS. This is because one block is reserved for the
segment header. DBA_SEGMENTS.BLOCKS holds the total number of blocks
allocated to the table. USER_TABLES.BLOCKS holds the total number of
blocks allocated for data.

SQL> SELECT COUNT (DISTINCT
2> DBMS_ROWID.ROWID_BLOCK_NUMBER(rowid)||
3> DBMS_ROWID.ROWID_RELATIVE_FNO(rowid)) “Used”
4> FROM big_emp1;
Used
———-
700
1 row selected.

SQL> DELETE from big_emp1;
28672 rows processed.

SQL> commit;
Statement processed.

SQL> ANALYZE TABLE big_emp1 ESTIMATE STATISTICS;
Statement processed.

SQL> SELECT table_name,num_rows,blocks,empty_blocks
2> FROM user_tables
3> WHERE table_name=’BIG_EMP1′;
TABLE_NAME NUM_ROWS BLOCKS EMPTY_BLOCKS
—————————— ———- ———- ————
BIG_EMP1 0 700 323
1 row selected.

SQL> SELECT COUNT (DISTINCT
2> DBMS_ROWID.ROWID_BLOCK_NUMBER(rowid)||
3> DBMS_ROWID.ROWID_RELATIVE_FNO(rowid)) “Used”
4> FROM big_emp1;
Used
———-
0
1 row selected.

SQL> TRUNCATE TABLE big_emp1;
Statement processed.

SQL> ANALYZE TABLE big_emp1 ESTIMATE STATISTICS;
Statement processed.

SQL> SELECT table_name,num_rows,blocks,empty_blocks
2> FROM user_tables
3> WHERE table_name=’BIG_EMP1′;
TABLE_NAME NUM_ROWS BLOCKS EMPTY_BLOCKS
—————————— ———- ———- ————
BIG_EMP1 0 0 511
1 row selected.

SQL> connect system/manager
Connected.

SQL> SELECT segment_name,segment_type,blocks
2> FROM dba_segments
3> WHERE segment_name=’BIG_EMP1′;
SEGMENT_NAME SEGMENT_TYPE BLOCKS EXTENTS
—————————– —————– ———- ——-
BIG_EMP1 TABLE 512 1
1 row selected.

Note: TRUNCATE has also deallocated the space from the deleted rows.
To retain the space from the deleted rows allocated to the table use:
TRUNCATE TABLE big_emp1 REUSE STORAGE

沪ICP备14014813号-2

沪公网安备 31010802001379号