实测DB_BLOCK_CHECKSUM=FULL的作用

从10.2.0.3开始 DB_BLOCK_CHECKSUM有三个选项:OFF TYPICAL FULL
在10.2.0.3之前 DB_BLOCK_CHECKSUM有2个选项: TRUE FALSE

在11g中DB_BLOCK_CHECKSUM和 DB_BLOCK_CHECKING参数 被 DB_ULTRA_SAFE参数整合到一起:

DB_ULTRA_SAFE sets the default values for other parameters that control protection levels.
Values:

OFF
When any of DB_BLOCK_CHECKING, DB_BLOCK_CHECKSUM, or DB_LOST_WRITE_PROTECT are explicitly set, no changes are made.

DATA_ONLY
DB_BLOCK_CHECKING will be set to MEDIUM.
DB_LOST_WRITE_PROTECT will be set to TYPICAL.
DB_BLOCK_CHECKSUM will be set to FULL.

 

DATA_AND_INDEX
DB_BLOCK_CHECKING will be set to FULL.
DB_LOST_WRITE_PROTECT will be set to TYPICAL.
DB_BLOCK_CHECKSUM will be set to FULL.

DB_BLOCK_CHECKSUM 参数决定了DBWn进程和直接路径读取进程是否为块计算checksum并将该checksum存放在每个数据块的cache header并写入到磁盘中。当该数据块被读取时,该checksum会受到验证, 前提是DB_BLOCK_CHECKSUM 被设置为TYPICAL 或 FULL,且最近一次该块的写出中存有checksum。

在FULL模式下,Oracle还会当块要发生变化应用前对该块验证checksum,并会在DML update/insert/delete语句引起变化被应用到块后再次计算该checksum。
此外,Oracle会对写入到当前redo日志文件的每一个redo block计算checksum。

若该参数设置为OFF,则 DBWn进程仅为system表空间上的对象计算checksum, 而对于普通表空间不计算。

checksum让Oracle具备检测由底层磁盘、存储子系统、IO子系统引起的坏块。

若设置为FULL, 则DB_BLOCK_CHECKSUM还会捕捉内存讹误并避免将存在逻辑讹误的块被写入到磁盘上。

官方文档介绍设置DB_BLOCK_CHECKSUM为TYPICAL模式可能引起1%-2%的性能损耗,设置为FULL mode可能引起4%-5%的性能损耗。

 

除非有极高的数据安全性要求,否则推荐用户设置DB_BLOCK_CHECKSUM为TYPICAL。
实际测试发现在 Linux +版本11.2.0.3上DB_BLOCK_CHECKSUM=FULL对普通的内存讹误(memory corruption)几乎无效,详见以下测试:

 

SQL> select * from v$version;

BANNER
--------------------------------------------------------------------------------
Oracle Database 11g Enterprise Edition Release 11.2.0.3.0 - 64bit Production
PL/SQL Release 11.2.0.3.0 - Production
CORE    11.2.0.3.0      Production
TNS for Linux: Version 11.2.0.3.0 - Production
NLSRTL Version 11.2.0.3.0 - Production

SQL> show parameter db_block

NAME                                 TYPE        VALUE
------------------------------------ ----------- ------------------------------
db_block_buffers                     integer     0
db_block_checking                    string      FULL
db_block_checksum                    string      FULL
db_block_size                        integer     8192

create table maclean_corruption(t1 int) tablespace system;

insert into maclean_corruption values(1);
insert into maclean_corruption values(1);
commit;

 1*  select t1,dump(t1,16) from maclean_corruption
SQL> /

        T1 DUMP(T1,16)
---------- ------------------------------
         1 Typ=2 Len=2: c1,2
         1 Typ=2 Len=2: c1,2

SQL> select dbms_rowid.rowid_block_number(rowid),dbms_rowid.rowid_relative_fno(rowid) from maclean_corruption;

DBMS_ROWID.ROWID_BLOCK_NUMBER(ROWID) DBMS_ROWID.ROWID_RELATIVE_FNO(ROWID)
------------------------------------ ------------------------------------
                               89985                                    1
                               89985                                    1

ALTER SESSION SET EVENTS 'immediate trace name buffers level 3';

SQL> oradebug setmypid
Statement processed.

SQL>  oradebug tracefile_name
/s01/diag/rdbms/prodb/PRODB/trace/PRODB_ora_28545.trc							   

BH (0xb0f748c8) file#: 1 rdba: 0x00415f81 (1/89985) class: 1 ba: 0xb0154000
  set: 6 pool: 3 bsz: 8192 bsi: 0 sflg: 2 pwc: 0,0
  dbwrid: 0 obj: 76977 objn: 76977 tsn: 0 afn: 1 hint: f
  hash: [0xce69cd58,0xce69cd58] lru: [0xb0f74ae0,0xb0f74880]
  obj-flags: object_ckpt_list
  ckptq: [0xb0f6c168,0xb0f76a08] fileq: [0xb0f6c178,0xb0f76a18] objq: [0xc70de768,0xc70de768] objaq: [0xc70de748,0xc70de748]
  st: XCURRENT md: NULL fpin: 'kcbwh6: kcbnew' tch: 4
  flags: buffer_dirty redo_since_read
  LRBA: [0xc.92.0] LSCN: [0x0.2329de] HSCN: [0x0.2329e0] HSUB: [1]
  buffer tsn: 0 rdba: 0x00415f81 (1/89985)
  scn: 0x0000.002329e0 seq: 0x01 flg: 0x06 tail: 0x29e00601
  frmt: 0x02 chkval: 0x0a60 type: 0x06=trans data
Hex dump of block: st=0, typ_found=1
Dump of memory from 0x00000000B0154000 to 0x00000000B0156000
0B0154000 0000A206 00415F81 002329E0 06010000  [....._A..)#.....]
0B0154010 00000A60 00000001 00012CB1 002329DD  [`........,...)#.]
0B0154020 00000000 00030002 00000000 001E0002  [................]
0B0154030 00000348 00C0312E 0032008E 00002002  [H....1....2.. ..]
0B0154040 002329E0 00000000 00000000 00000000  [.)#.............]
0B0154050 00000000 00000000 00000000 00020100  [................]
0B0154060 0016FFFF 1F781F94 00001F78 1F9A0002  [......x.x.......]
0B0154070 00001F94 00000000 00000000 00000000  [................]
0B0154080 00000000 00000000 00000000 00000000  [................]
        Repeat 497 times
0B0155FA0 5F840000 C1020041 41000645 41004C50  [..._A...E..APL.A]
0B0155FB0 00415F83 0432C102 275F4100 00000000  [._A...2..A_'....]
0B0155FC0 00000000 00000000 00000000 00000000  [................]
        Repeat 2 times
0B0155FF0 0201012C 012C02C1 02C10201 29E00601  [,.....,........)]
Block header dump:  0x00415f81
 Object id on Block? Y
 seg/obj: 0x12cb1  csc: 0x00.2329dd  itc: 2  flg: O  typ: 1 - DATA
     fsl: 0  fnx: 0x0 ver: 0x01

 Itl           Xid                  Uba         Flag  Lck        Scn/Fsc
0x01   0x0002.01e.00000348  0x00c0312e.008e.32  --U-    2  fsc 0x0000.002329e0
0x02   0x0000.000.00000000  0x00000000.0000.00  ----    0  fsc 0x0000.00000000

bdba: 0x00415f81
data_block_dump,data header at 0xb015405c
===============
tsiz: 0x1fa0
hsiz: 0x16
pbl: 0xb015405c
     76543210
flag=--------
ntab=1
nrow=2
frre=-1
fsbo=0x16
fseo=0x1f94
avsp=0x1f78
tosp=0x1f78
0xe:pti[0]      nrow=2  offs=0
0x12:pri[0]     offs=0x1f9a
0x14:pri[1]     offs=0x1f94
block_row_dump:
tab 0, row 0, @0x1f9a
tl: 6 fb: --H-FL-- lb: 0x1  cc: 1
col  0: [ 2]  c1 02
tab 0, row 1, @0x1f94
tl: 6 fb: --H-FL-- lb: 0x1  cc: 1
col  0: [ 2]  c1 02
end_of_block_dump

0B0155FF0 0201012C 012C02C1 02C10201 29E00601  [,.....,........)]

0B0155FF4  012C02C1  这里的 02C1即 C102=> 十进制的 "1"
0B0155FF8  02C10201  同样的 02C1即 C102=> 十进制的 "1"

修改 02C1为 04C1 即将 1修改为 3 

 oradebug setmypid
  oradebug poke 0x0B0155FF4 4 0x012C04C1

SQL> select t1,dump(t1,16) from maclean_corruption;

        T1 DUMP(T1,16)
---------- ------------------------------
         1 Typ=2 Len=2: c1,2
         3 Typ=2 Len=2: c1,4

SQL> delete maclean_corruption where t1=3;

1 row deleted.

SQL> commit;

Commit complete.

SQL> alter system flush buffer_cache;

System altered.

SQL> select  * from maclean_corruption;

        T1
----------
         1

SQL> ALTER SESSION SET EVENTS 'immediate trace name buffers level 3';

Session altered.

0B004BFF0 0201023C 002C04C1 02C10201 2B350601  

oradebug poke 0x0B004BFF8 4 0x50C10201;

BEFORE: [0B004BFF8, 0B004BFFC) = FFC10201
AFTER:  [0B004BFF8, 0B004BFFC) = 50C10201
SQL> 
SQL> 
SQL> select  * from maclean_corruption;

        T1
----------
        79

SQL> update maclean_corruption set t1=t1+1 where t1=80;

0 rows updated.

SQL> update maclean_corruption set t1=t1+1 where t1=79;

1 row updated.

SQL> select  * from maclean_corruption;

        T1
----------
        80

SQL> 
SQL>  oradebug peek 0x0B004BFF8 4
[0B004BFF8, 0B004BFFC) = 50C10201

BH (0xafff92d8) file#: 1 rdba: 0x00415f81 (1/89985) class: 1 ba: 0xaff4a000
  set: 6 pool: 3 bsz: 8192 bsi: 0 sflg: 2 pwc: 0,0
  dbwrid: 0 obj: 76977 objn: 76977 tsn: 0 afn: 1 hint: f
  hash: [0xb0f6ab88,0xce69cd58] lru: [0xafff94f0,0xcd1371e8]
  obj-flags: object_ckpt_list
  ckptq: [0xcd1554b8,0xcd1554b8] fileq: [0xcd1554d8,0xcd1554d8] objq: [0xc80e2bc0,0xc80e2bc0] objaq: [0xc80e2ba0,0xc80e2ba0]
  st: XCURRENT md: NULL fpin: 'kdswh11: kdst_fetch' tch: 5
  flags: buffer_dirty redo_since_read
  LRBA: [0xc.1d7.0] LSCN: [0x0.232b9d] HSCN: [0x0.232bb7] HSUB: [1]
  buffer tsn: 0 rdba: 0x00415f81 (1/89985)
  scn: 0x0000.00232bb7 seq: 0x01 flg: 0x06 tail: 0x2bb70601
  frmt: 0x02 chkval: 0x5e6a type: 0x06=trans data
Hex dump of block: st=0, typ_found=1
Dump of memory from 0x00000000AFF4A000 to 0x00000000AFF4C000
0AFF4A000 0000A206 00415F81 00232BB7 06010000  [....._A..+#.....]
0AFF4A010 00005E6A 00000001 00012CB1 00232B33  [j^.......,..3+#.]
0AFF4A020 00000000 00030002 00000000 000D0003  [................]
0AFF4A030 00000346 00C03094 0021008F 00002001  [F....0....!.. ..]
0AFF4A040 00232BB7 0004000A 00000290 00C00AF0  [.+#.............]
0AFF4A050 0028007B 00072001 00232B35 00020100  [{.(.. ..5+#.....]
0AFF4A060 0016FFFF 1F781F94 00001F81 1F9A0002  [......x.........]
0AFF4A070 00001F94 00000000 00000000 00000000  [................]
0AFF4A080 00000000 00000000 00000000 00000000  [................]
        Repeat 497 times
0AFF4BFA0 5F840000 C1020041 41000645 41004C50  [..._A...E..APL.A]
0AFF4BFB0 00415F83 0432C102 275F4100 00000000  [._A...2..A_'....]
0AFF4BFC0 00000000 00000000 00000000 00000000  [................]
        Repeat 2 times
0AFF4BFF0 0201023C 012C04C1 51C10201 2BB70601  [

Block header dump:  0x00415f81
 Object id on Block? Y
 seg/obj: 0x12cb1  csc: 0x00.232b33  itc: 2  flg: O  typ: 1 - DATA
     fsl: 0  fnx: 0x0 ver: 0x01

 Itl           Xid                  Uba         Flag  Lck        Scn/Fsc
0x01   0x0003.00d.00000346  0x00c03094.008f.21  --U-    1  fsc 0x0000.00232bb7
0x02   0x000a.004.00000290  0x00c00af0.007b.28  --U-    1  fsc 0x0007.00232b35
bdba: 0x00415f81
data_block_dump,data header at 0xaff4a05c
===============
tsiz: 0x1fa0
hsiz: 0x16
pbl: 0xaff4a05c
     76543210
flag=--------
ntab=1
nrow=2
frre=-1
fsbo=0x16
fseo=0x1f94
avsp=0x1f78
tosp=0x1f81
0xe:pti[0]      nrow=2  offs=0
0x12:pri[0]     offs=0x1f9a
0x14:pri[1]     offs=0x1f94
block_row_dump:
tab 0, row 0, @0x1f9a
tl: 6 fb: --H-FL-- lb: 0x1  cc: 1
col  0: [ 2]  c1 51
tab 0, row 1, @0x1f94
tl: 2 fb: --HDFL-- lb: 0x2
end_of_block_dump

0AFF4BFF0 0201023C 012C04C1 51C10201 2BB70601  

 oradebug poke 0x0AFF4BFF8 4 0x59C10201; ==> 修改 51=》59 记修改T1=80 为88

select * from maclean_corruption;

        T1
----------
        88

SQL> delete maclean_corruption where t1=88;

1 row deleted.

SQL> commit;

Commit complete.

SQL> alter system flush buffer_cache;

System altered.

SQL> /

System altered.

SQL> alter system archive log current;

System altered.

SQL> /

System altered.

SQL> select * from maclean_corruption;

no rows selected

SQL> 

SQL> analyze table maclean_corruption validate structure;

Table analyzed.

 

 

以上通过oradebug poke命令反复修改数据库块内的T1字段,模拟数据块在内存中出现内存讹误(block corruption in memory),发现DB_BLOCK_CHECKSUM=FULL对该种简单的内存讹误却无法实际检测到(这不代表DB_BLOCK_CHECKSUM=TYPICAL无法检查到检测由底层磁盘、存储子系统、IO子系统引起的坏块,DB_BLOCK_CHECKSUM=TYPICAL仍是检测物理坏块的最佳默认推荐配置),通过update/delete修改对应手工讹误的内存块,均可以正常工作并生成redo且flush脏块(dirty buffer)到磁盘上,则会将该内存讹误的情况持久化,且即便在db_block_checking=FULL的情况下也无法检测到一丁点逻辑讹误,这说明不管是db_block_checksum还是db_block_checking都对内存中数据块的细微讹误无法有效检测,虽然这不代表checksum+checking无法检测到更破坏块结构的内存讹误,但多少还是有些悲哀的。

 

Comments

  1. Kevin says

    求论坛邀请码,谢谢。发邮箱即可。。

  2. Kevin says

    邮箱:vonwind@gmail.com

  3. 路一直在 says

    这是否只能说明db_block_checksum做校验时并没有包括数据块中存储数据的部分。

  4. 路一直在 says

    或者在oradebug修改数据块时checksum也被修改了,是否能dump出checksum前后的变化?

Kevin进行回复 取消回复

*

沪ICP备14014813号-2

沪公网安备 31010802001379号