Oracle内部错误:ORA-00600[13013][5001]故障诊断一例

周五被叫到客户现场解决一套10.2.0.4 RAC数据库上的ORA-00600[13013]内部错误问题,这个问题同事已经在上午通过远程拨号了解过情况,
初步判断是索引存在讹误corruption引起的600。

前期诊断

同事在我抵达现场之前已经做了初步的诊断,该10.2.0.4上的RAC系统主用节点的告警日志中多次出现ORA-00600:[13013], [5001]、ORA-00600:[qertbFetchByRowID]及ORA-00600: [25027] 等内部错误,具体的日志如下:

Fri Sep 16 01:16:54 2011
Errors in file /oravl01/oracle/admin/CRMDB2/udump/crmdb22_ora_2138788.trc:
ORA-00600: internal error code, arguments: [13013], [5001], [66209], [50730514], [23], [50730514], [3], []
Fri Sep 16 01:16:55 2011
Trace dumping is performing id=[cdmp_20110916011655]
Fri Sep 16 01:17:06 2011
Errors in file /oravl01/oracle/admin/CRMDB2/udump/crmdb22_ora_2138788.trc:
ORA-00600: internal error code, arguments: [13013], [5001], [66209], [50730514], [23], [50730514], [3], []

以下为13013 trace

ORA-00600: internal error code, arguments: [13013], [5001], [66209], [50730514], [23], [50730514], [3], []
Current SQL statement for this session:
UPDATE CUST_SUBSCRB_PERSONAL A SET a.cust_mobile='A04204441' WHERE a.subscrbid=71524739
----- Call Stack Trace -----
calling              call     entry                argument values in hex
location             type     point                (? means dubious value)
-------------------- -------- -------------------- ----------------------------
ksedst+001c          bl       ksedst1              088424844 ? 041124844 ?
ksedmp+0290          bl       ksedst               104A54870 ?
ksfdmp+0018          bl       03F30204
kgeriv+0108          bl       _ptrgl
kgesiv+0080          bl       kgeriv               1000D8FE4 ? 0785F7165 ?
10501ED28 ? FFFFFFFFFFF6748 ?
06C626E72 ?
ksesic6+0060         bl       kgesiv               110072D08 ? 7000000AC52B8F8 ?
7000000AC52B3A8 ? 07FFFFFFF ?
000000000 ?
updThreePhaseExe+0c  bl       01F9D7A8
4c
updexe+041c          bl       updThreePhaseExe     7000000AC52B8F8 ? 000000000 ?
110540128 ? FFFFFFFFFFF7C80 ?
opiexe+27d8          bl       updexe               7000000AC52B8F8 ?
FFFFFFFFFFF82B8 ?
kpoal8+0edc          bl       opiexe               FFFFFFFFFFFB434 ?
FFFFFFFFFFFB198 ?
FFFFFFFFFFF9608 ?
opiodr+0ae0          bl       _ptrgl
ttcpip+1020          bl       _ptrgl
opitsk+1124          bl       01F9F2A0
opiino+0990          bl       opitsk               000000000 ? 000000000 ?

============
Plan Table
============
------------------------------------------------------+-----------------------------------+
| Id  | Operation           | Name                    | Rows  | Bytes | Cost  | Time      |
------------------------------------------------------+-----------------------------------+
| 0   | UPDATE STATEMENT    |                         |       |       |     1 |           |
| 1   |  UPDATE             | CUST_SUBSCRB_PERSONAL   |       |       |       |           |
| 2   |   INDEX UNIQUE SCAN | CUST_SUBSCRB_PERSONAL_PK|     1 |    33 |     1 |  00:00:01 |
------------------------------------------------------+-----------------------------------+

===============================================================================================================

同时还伴随有ORA-00600:[qertbFetchByRowID]出现

Fri Sep 16 01:08:57 2011
Errors in file /oravl01/oracle/admin/CRMDB2/udump/crmdb22_ora_2138788.trc:
ORA-00600: internal error code, arguments: [qertbFetchByRowID], [], [], [], [], [], [], []

ORA-00600: internal error code, arguments: [qertbFetchByRowID], [], [], [], [], [], [], []
Current SQL statement for this session:
SELECT * from CUST_SUBSCRB_PERSONAL A WHERE a.subscrbid=307557025
----- Call Stack Trace -----
calling              call     entry                argument values in hex
location             type     point                (? means dubious value)
-------------------- -------- -------------------- ----------------------------
ksedst+001c          bl       ksedst1              578318D500000003 ?
FFFFFFFFFFF80D0 ?
ksedmp+0290          bl       ksedst               104A54870 ?
ksfdmp+0018          bl       03F30204
kgerinv+00dc         bl       _ptrgl
kgeasnmierr+004c     bl       kgerinv              11048AE68 ? 000000000 ?
000000001 ? 104BDEC18 ?
110489398 ?
qertbFetchByRowID+0  bl       03F2EF38
d34
opifch2+141c         bl       03F2EB1C
opifch+003c          bl       opifch2              1100DD338 ? 000000000 ?
FFFFFFFFFFF9980 ?

============
Plan Table
============
---------------------------------------------------------------+-----------------------------------+
| Id  | Operation                    | Name                    | Rows  | Bytes | Cost  | Time      |
---------------------------------------------------------------+-----------------------------------+
| 0   | SELECT STATEMENT             |                         |       |       |     1 |           |
| 1   |  TABLE ACCESS BY INDEX ROWID | CUST_SUBSCRB_PERSONAL   |     1 |   287 |     1 |  00:00:01 |
| 2   |   INDEX UNIQUE SCAN          | CUST_SUBSCRB_PERSONAL_PK|     1 |       |     1 |  00:00:01 |
---------------------------------------------------------------+-----------------------------------+

===============================================================================================================

此外还会出现ORA-00600: [25027], [6], [1443670797], [], [], [], [], []

Fri Sep 16 15:06:00 2011
Errors in file /oravl01/oracle/admin/CRMDB2/udump/crmdb22_ora_1872436.trc:
ORA-00600: internal error code, arguments: [25027], [6], [1443670797], [], [], [], [], []

ksedmp: internal or fatal error
ORA-00600: internal error code, arguments: [25027], [6], [4278267905], [], [], [], [], []
Current SQL statement for this session:
select q.subscrbid,q.serv_lvl,q.cust_lvl
from cust_subscrb_personal q
where q.serv_lvl is null
or q.cust_lvl is null
----- Call Stack Trace -----
calling              call     entry                argument values in hex
location             type     point                (? means dubious value)
-------------------- -------- -------------------- ----------------------------
ksedst+001c          bl       ksedst1              088424844 ? 041124844 ?
ksedmp+0290          bl       ksedst               104A54870 ?
ksfdmp+0018          bl       03F30204
kgeriv+0108          bl       _ptrgl
kgesiv+0080          bl       kgeriv               000000000 ? 000000000 ?
000000000 ? 105673724 ?
00000000C ?
ksesic2+0060         bl       kgesiv               000000245 ? 0FFFFFFFF ?
700000010013BD0 ?
700000010018078 ? 110000998 ?
krtd2abh+040c        bl       ksesic2              61C3000061C3 ? 000000000 ?
000000006 ? 000000000 ?
0FF013001 ? 000000FA0 ?
105673FB8 ? 105673FA8 ?
kcbgtcr+24a0         bl       krtd2abh             700000471D95738 ?
7000001A9F0F050 ?
FFFFFFFFFFF78E0 ?
700000471D93500 ? 000000003 ?
ktrget+04ac          bl       kcbgtcr              1105A75A8 ? 000002000 ?
000000000 ? 000001940 ?
kdsgrp+00bc          bl       ktrget               000000064 ? 110000998 ?
700000010008000 ?
kdsgnp+0444          bl       kdsgrp               000000000 ? 000000000 ?
FFFFFFFFFFF80D0 ?
kafger+08fc          bl       kdsgnp               000000000 ? 000000000 ?
111119E38 ?
kdstf1100101km+0f58  bl       kafger               1105A74D8 ? 111119E38 ?
FFFFFFFFFFF8190 ? 000000000 ?
70000046DCA8488 ? 000000000 ?
0FFFF9608 ? 110471948 ?
kdsttgr+1a04         bl       kdstf1100101km       111119E38 ? 0000102A1 ?
70000042DF6DC50 ?
FFFFFFFFFFF8418 ? 0FFFFFFFF ?
000001FE8 ? 000000000 ?
000000000 ?
qertbFetch+09b8      bl       kdsttgr              111119E38 ? 000000418 ?
000000000 ? 000000000 ?
000000000 ? 000000000 ?
000000418 ? BE4610195B2C ?
opifch2+141c         bl       03F2EB1C
opifch+003c          bl       opifch2              1100DD338 ? 000000000 ?
FFFFFFFFFFF9980 ?

============
Plan Table
============
--------------------------------------------------+-----------------------------------+
| Id  | Operation          | Name                 | Rows  | Bytes | Cost  | Time      |
--------------------------------------------------+-----------------------------------+
| 0   | SELECT STATEMENT   |                      |       |       |  4898 |           |
| 1   |  TABLE ACCESS FULL | CUST_SUBSCRB_PERSONAL|    15 |   165 |  4898 |  00:00:59 |
--------------------------------------------------+-----------------------------------+

我们先从ORA-00600:[13013], [5001]内部错误入手,Mos Note <How to resolve ORA-00600 [13013], [5001] [ID 816784.1]> 比较翔实地介绍了该600错误,其argument的具体含义如下:

Arg [a] Passcount
Arg [b] Data Object number
Arg [c] Tablespace Relative DBA of block containing the row to be updated
Arg [d] Row Slot number
Arg [e] Relative DBA of block being updated (should be same as [c])
Arg [f] Code

我们这里的[13013], [5001], [66209], [50730514], [23], [50730514], [3], [] 就是

data_object_id = 66209 = 0x102A1
RDBA = 50730514 即  3061612 fild id=12  block id =398866
Row Slot number = 23
Code = 3

翻译过来就是对data_object_id为66209 的对象在12号数据文件上的398866块的第23行做代码为3的检测时发现了异常,通过data_object_id可以定位到该对象就是正在执行的SQL语句中的CUST_SUBSCRB_PERSONAL表。

ORA-00600:[13013], [5001]可能由表上的数据行或索引中的记录逻辑讹误(logical Corruption)所造成,这里要强调一下是逻辑讹误,而非物理损坏。逻辑讹误一般是由于Oracle Bug或者memory Corruption 所引起的,该ORA-00600:[13013], [5001]已知可能由5085288和4549673等多个Bug 引发,在版本10.2.0.4 上这些bug 触发概率较高。

伴随ORA-00600:[13013], [5001]发生的ORA-600 [qertbFetchByRowID]常由索引损坏引起,介绍了该问题:

ORA-600 [qertbFetchByRowID] Select Queries on 10.2.0.4 Database [ID 755592.1]

Applies to:
Oracle Server - Enterprise Edition - Version: 10.2.0.4 and later   [Release: 10.2 and later ]

Symptoms

The following error was received during SELECT operation:

ORA-00600: internal error code, arguments: [qertbFetchByRowID], [], [], [], [], [], [], []

The call stack contains:

qertbFetchByRowID <- qergiFetch <- rwsfcd <- qerhjFetch
< - qerjotFetch <- rwsfcd <- qertqoFetch <- qerpxSlaveFetch <- qerpxFetch

Cause
Index corruption.

Solution

1.
Analyze the affected object to look for logical corruption.

SQL> analyze table TABLENAME validate structure cascade;

This will place an exclusive lock on the table.

If an error is reported try dropping and recreating the index(es) first.

2.
If this problem is occurring on a version less than 10.2.0.4, ensure the patch for
Bug 4883635 has been applied if Materialized Views are used.

BUG:5866783 - ORA-00600: [QERTBFETCHBYROWID] ON SELECT FROM ONE TABLE

而ORA-00600: [25027]也是类似的问题:

ORA-600 [25027] [ID 284433.1]

PURPOSE:
  This article represents a partially published OERI note.

  It has been published because the ORA-600 error has been
  reported in at least one confirmed bug.

  Therefore, the SUGGESTIONS section of this article may help
  in terms of identifying the cause of the error.

  This specific ORA-600 error may be considered for full publication
  at a later date. If/when fully published, additional information
  will be available here on the nature of this error.

ERROR:

  Format: ORA-600 [25027] [a] [b]

VERSIONS:
  versions 9.2 and above

ARGUMENTS:
  Arg [a]  Tablespace Number (TSN)
  Arg [b]  Decimal Relative Data Block Address (RDBA)

SUGGESTIONS:

 1. If the Arg [b] (the RDBA) is 0 (zero), then this could be due to fake indexes.

  The following query will list fake indexes:

     select do.owner,do.object_name, do.object_type,sysind.flags
     from dba_objects do, sys.ind$ sysind
     where do.object_id = sysind.obj#
     and bitand(sysind.flags,4096)=4096;

If the above query returns any rows, check the objects involved and consider
dropping them as they can cause this error. 

2. Run analyze table validate structure on the table referenced in the Current SQL statement in
    the related trace file.

  If the Known Issues section below does not help in terms of identifying
  a solution, please submit the trace files and alert.log to Oracle
  Support Services for further analysis.

但是请注意在该故障示例中ORA-00600: [25027]给出的Arg [b] Decimal Relative Data Block Address (RDBA)是一个完全不相干的RDBA,具体原因尚不明确。

小插曲

这当中有一个小的插曲,原来客户这里的应用人员需要尽快在该CUST_SUBSCRB_PERSONAL表上执行一段查询语句,但是该语句一旦执行就会遇到ORA-600 [qertbFetchByRowID]错误,因为是周五小周末,所以如果今天无法运行的话,就要拖到下个礼拜了。

所以被要求优先解决该语句执行的问题,首先看了一下该语句的执行计划:

SQL> explain plan for select count(*) from SHUCRM2C.cust_subscrb_personal a,  SHUCRM2C.wzy_bs20110916 b
2  where a.svcnum=b.svcnum and a.countyid='A00' and a.serv_lvl=0;

Explained.

SQL> select * from table(dbms_xplan.display());

PLAN_TABLE_OUTPUT
------------------------------------------------
Plan hash value: 3616548176

-----------------------------------------------------------------------------------------------------------
| Id  | Operation                     | Name                      | Rows  | Bytes | Cost (%CPU)| Time     |
-----------------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT              |                           |     1 |    31 |   231   (1)| 00:00:03 |
|   1 |  SORT AGGREGATE               |                           |     1 |    31 |            |          |
|*  2 |   HASH JOIN                   |                           |   425 | 13175 |   231   (1)| 00:00:03 |
|*  3 |    TABLE ACCESS BY INDEX ROWID| CUST_SUBSCRB_PERSONAL     |   425 |  8075 |   208   (0)| 00:00:03 |
|*  4 |     INDEX RANGE SCAN          | CUST_SUBSCRB_PERSONAL_2IX | 42462 |       |    20   (0)| 00:00:01 |
|   5 |    TABLE ACCESS FULL          | WZY_BS20110916            | 39792 |   466K|    22   (0)| 00:00:01 |
-----------------------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

2 - access("A"."SVCNUM"="B"."SVCNUM")
3 - filter(TO_NUMBER("A"."SERV_LVL")=0)
4 - access("A"."COUNTYID"='A00')

Note
-----
- dynamic sampling used for this statement

23 rows selected.

SQL> select count(*) from SHUCRM2C.cust_subscrb_personal a,  SHUCRM2C.wzy_bs20110916 b
  2  where a.svcnum=b.svcnum and a.countyid='A00' and a.serv_lvl=0;
select count(*) from SHUCRM2C.cust_subscrb_personal a,  SHUCRM2C.wzy_bs20110916 b
                              *
ERROR at line 1:
ORA-00600: internal error code, arguments: [qertbFetchByRowID], [], [], [], [], [], [], []

以上执行计划中对表上的索引CUST_SUBSCRB_PERSONAL_2IX做了range scan后通过获得的rowid到表上去fetch记录(qertbFetchByRowID),在实际fetch by rowid 的时候引发了ORA-600错误。

因为需求较为紧急,所以我还是考虑能否使执行计划绕过该索引,投石问路给语句加上了RULE提示,再次执行:

SQL> select /*+ rule */
2  count(*) from SHUCRM2C.cust_subscrb_personal a,  SHUCRM2C.wzy_bs20110916 b
3  where a.svcnum=b.svcnum and a.countyid='A00' and a.serv_lvl=0;

COUNT(*)
----------
11559

Execution Plan
----------------------------------------------------------
Plan hash value: 2851452146

-------------------------------------------------------------------
| Id  | Operation                     | Name                      |
-------------------------------------------------------------------
|   0 | SELECT STATEMENT              |                           |
|   1 |  SORT AGGREGATE               |                           |
|   2 |   NESTED LOOPS                |                           |
|   3 |    TABLE ACCESS FULL          | WZY_BS20110916            |
|*  4 |    TABLE ACCESS BY INDEX ROWID| CUST_SUBSCRB_PERSONAL     |
|   5 |     AND-EQUAL                 |                           |
|*  6 |      INDEX RANGE SCAN         | CUST_SUBSCRB_SVCNUM_2006  |
|*  7 |      INDEX RANGE SCAN         | CUST_SUBSCRB_PERSONAL_2IX |
-------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

4 - filter(TO_NUMBER("A"."SERV_LVL")=0)
6 - access("A"."SVCNUM"="B"."SVCNUM")
7 - access("A"."COUNTYID"='A00')

Note
-----
- rule based optimizer used (consider using cbo)

想不到居然执行成功了,但是执行计划当中仍有CUST_SUBSCRB_PERSONAL_2IX这个索引,这让我潜意识中认识到很可能是表而非索引存在逻辑讹误。

不管怎么说至少解了燃眉之急,先把这个方法告诉应用人员,然后回过头来继续诊断。

初步验证索引

为了确定到底是表还是索引存在逻辑讹误,一般需要使用analyze table validate structure cascade命令以验证表和索引的结构,但是该命令会要求以共享方式锁住表(TM mode=4),对于更新频繁的生产系统中的关键应用表,这是不可接受的。同事在接手这个故障时已经考虑过该问题了,所以他推荐用查询的方式来检验到底是那些索引存在corruption,具体方法如下:

通过ROWID_CREATE 构造出ROWID

SQL> select dbms_rowid.ROWID_CREATE(1,66209,12,398866,23) from dual;

DBMS_ROWID.ROWID_C
------------------
AAAQKhAAMAABhYSAAX

SQL> select * from SHUCRM2O.CUST_SUBSCRB_PERSONAL a where rowid=chartorowid('AAAQKhAAMAABhYSAAX');

no rows selected

SQL> SELECT INDEX_NAME,COLUMN_NAME FROM DBA_IND_COLUMNS WHERE TABLE_NAME='CUST_SUBSCRB_PERSONAL' ORDER BY 1;

INDEX_NAME                     COLUMN_NAME
------------------------------ ----------------------------------------
CUST_SUBSCRB_CARD_NUM          VIP_CARD_NUM
CUST_SUBSCRB_MANAGERID_2006    MANAGERID
CUST_SUBSCRB_PERSONAL_2IX      COUNTYID
CUST_SUBSCRB_PERSONAL_3IX      CUST_LVL
CUST_SUBSCRB_PERSONAL_PK       SUBSCRBID
CUST_SUBSCRB_SERV_COUNTY       SERV_COUNTYID
CUST_SUBSCRB_SVCNUM_2006       SVCNUM
IDX_CUST_SUBSCRB_PERSONAL_01   SERV_LVL

之后强制使用index提示使用不同的索引

SQL> select /*+ INDEX(a CUST_SUBSCRB_CARD_NUM ) */ * from SHUCRM2O.CUST_SUBSCRB_PERSONAL a
where rowid=chartorowid('AAAQKhAAMAABhYSAAX');

no rows selected

SQL> select /*+ INDEX(a CUST_SUBSCRB_MANAGERID_2006) */ * from SHUCRM2O.CUST_SUBSCRB_PERSONAL a
where rowid=chartorowid('AAAQKhAAMAABhYSAAX');

no rows selected

SQL> select /*+ INDEX(a CUST_SUBSCRB_PERSONAL_2IX) */ * from SHUCRM2O.CUST_SUBSCRB_PERSONAL a
where rowid=chartorowid('AAAQKhAAMAABhYSAAX');
select /*+ INDEX(a CUST_SUBSCRB_PERSONAL_2IX) */ * from SHUCRM2O.CUST_SUBSCRB_PERSONAL a
where rowid=chartorowid('AAAQKhAAMAABhYSAAX')
*
ERROR at line 1:
ORA-00600: internal error code, arguments: [qertbFetchByRowID], [], [], [], [], [], [], []

SQL> select /*+ INDEX(a CUST_SUBSCRB_PERSONAL_3IX) */ * from SHUCRM2O.CUST_SUBSCRB_PERSONAL a
where rowid=chartorowid('AAAQKhAAMAABhYSAAX');

no rows selected

SQL> select /*+ INDEX(a CUST_SUBSCRB_PERSONAL_PK ) */ * from SHUCRM2O.CUST_SUBSCRB_PERSONAL a
where rowid=chartorowid('AAAQKhAAMAABhYSAAX');
select /*+ INDEX(a CUST_SUBSCRB_PERSONAL_PK ) */ * from SHUCRM2O.CUST_SUBSCRB_PERSONAL a
where rowid=chartorowid('AAAQKhAAMAABhYSAAX')
*
ERROR at line 1:
ORA-00600: internal error code, arguments: [qertbFetchByRowID], [], [], [], [], [], [], []

SQL> select /*+ INDEX(a CUST_SUBSCRB_SERV_COUNTY) */ * from SHUCRM2O.CUST_SUBSCRB_PERSONAL a
where rowid=chartorowid('AAAQKhAAMAABhYSAAX');

no rows selected

SQL> select /*+ INDEX(a CUST_SUBSCRB_SVCNUM_2006) */ * from SHUCRM2O.CUST_SUBSCRB_PERSONAL a
where rowid=chartorowid('AAAQKhAAMAABhYSAAX');
select /*+ INDEX(a CUST_SUBSCRB_SVCNUM_2006) */ * from SHUCRM2O.CUST_SUBSCRB_PERSONAL a
where rowid=chartorowid('AAAQKhAAMAABhYSAAX')
*
ERROR at line 1:
ORA-00600: internal error code, arguments: [qertbFetchByRowID], [], [], [], [], [], [], []

SQL> select /*+ INDEX(a IDX_CUST_SUBSCRB_PERSONAL_01) */ * from SHUCRM2O.CUST_SUBSCRB_PERSONAL a
where rowid=chartorowid('AAAQKhAAMAABhYSAAX');

select /*+ INDEX(a IDX_CUST_SUBSCRB_PERSONAL_01) */ * from SHUCRM2O.CUST_SUBSCRB_PERSONAL a
where rowid=chartorowid('AAAQKhAAMAABhYSAAX')
*
ERROR at line 1:
ORA-00600: internal error code, arguments: [qertbFetchByRowID], [], [], [], [], [], [], []

可以看到使用以上方法在其中4个索引上遇到了ORA-00600:[qertbFetchByRowID]。 但是使用该方法还是无法彻底搞清楚逻辑到底存在于table还是index上?

 

必要的工具

 

这里我们要介绍一下validate structure还存在online在线使用的选项,在online模式下整个验证过程不会在表上加任何TM锁,这一点具体可以使用10704 事件来证明:

 

10704, 00000, "Print out information about what enqueues are being obtained"
// *Cause:  When enabled, prints out arguments to calls to ksqcmi and
//          ksqlrl and the return values.
// *Action: Level indicates details:
//   Level: 1-4: print out basic info for ksqlrl, ksqcmi
//          5-9: also print out stuff in callbacks:  ksqlac, ksqlop
//          10+: also print out time for each line

SQL> create table validate_me (t1 int);

Table created.

SQL> create index ind_validate_me on validate_me(t1);

Index created.

SQL> insert into validate_me select rownum  from dba_tables where rownum<201;

200 rows created.

SQL> commit;

Commit complete.

SQL> oradebug setmypid;
Statement processed.

SQL> oradebug event 10704 trace name context forever,level 10;
Statement processed.

SQL> analyze table validate_me validate structure cascade online;

Table analyzed.

SQL> oradebug tracefile_name;
c:\app\diag\rdbms\g11r2\g11r2\trace\g11r2_ora_3020.trc

g11r2_ora_3020.trc=========================================================================

*** 2011-09-18 20:55:25.373
Oradebug command 'event 10704 trace name context forever,level 10' console output: <none>

*** 2011-09-18 20:55:49.765
ksqgtl *** TX-00060005-000006a9 mode=6 flags=0x401 timeout=0 ***
ksqgtl: xcb=0x31434194, ktcdix=2147483647, topxcb=0x31434194
ktcipt(topxcb)=0x0

*** 2011-09-18 20:55:49.766
ksucti: init txn DID from session DID
ksqgtl:
ksqlkdid: 0001-001B-00000006

*** 2011-09-18 20:55:49.766
*** ksudidTrace: ksqgtl
ktcmydid(): 0001-001B-00000006
ksusesdi:   0000-0000-00000000
ksusetxn:   0001-001B-00000006
ksqgtl: RETURNS 0

*** 2011-09-18 20:55:49.766
ksqrcl: TX,60005,6a9
ksqrcl: returns 0

但是validate strucutre online也有它的缺点,那就是在线模式下结构验证命令将不填充索引的状态信息到index_stats视图,如:

SQL> analyze index  ind_validate_me validate structure  ;

Index analyzed.

SQL> select count(*) from index_stats;

COUNT(*)
----------
1

SQL> conn / as sysdba
Connected.
SQL> analyze index  ind_validate_me validate structure  online;

Index analyzed.

SQL> select count(*) from index_stats;

COUNT(*)
----------
0

但是因为我们这里只要用到validate structure的结构验证功能,而对索引的详细状态没有兴趣,所以我们可以充分利用该online模式。

 

具体验证

 

使用validate structure online具体验证该问题表和表上的索引:

先仅对表进行验证,以区分到底是表还是索引存在逻辑讹误 

SQL> analyze table SHUCRM2O.CUST_SUBSCRB_PERSONAL  validate structure online;
analyze table SHUCRM2O.CUST_SUBSCRB_PERSONAL  validate structure online
*
ERROR at line 1:
ORA-01498: block check failure - see trace file

若验证发现问题会出现ORA-01498错误,并产生trace 文件

kdrchk:  row is marked as both a Key and Clustered
prow=0x7000001f241c45c flag=0xff
Block Checking: DBA = 50730514, Block Type = KTB-managed data block
data header at 0x7000001f241c07c
kdbchk: bad row tab 0, slot 23
Block header dump:  0x03061612
Object id on Block? Y
seg/obj: 0x102a1  csc: 0xb43.ecde68ca  itc: 3  flg: E  typ: 1 - DATA
brn: 0  bdba: 0x3061609 ver: 0x01 opc: 0
inc: 0  exflg: 0

............

可以看到问题发生在  23 号槽位上

tab 0, row 23, @0x3e0
tl: 4 fb: KCHDFLPN lb: 0x1  cc: 0 cki: 1
tab 0, row 24, @0x3de
tl: 2 fb: --HD---N lb: 0x30
tab 0, row 25, @0x3dc
tl: 2 fb: --HD---- lb: 0x30
tab 0, row 26, @0x3d8
tl: 4 fb: KCHDFLPN lb: 0xff  cc: 0 cki: 255

这里居然第23个row piece的 flag 是KCHDFLPN 即实际fb = 0xff,也就是该row piece同时被标记为key和clustered(row is marked as both a Key and Clustered),因此不管当服务进程尝试update该问题行记录或者通过ROWID访问该row时都出现了ORA-00600错误,虽然其错误代码不同,但都是由于该数据块中第23行记录的flag存在讹误引起的。

关于该ORA-00600:[13013], [5001]问题的成因和解决方法,更多内容可以参考<手工模拟Oracle数据块逻辑讹误引发ORA-00600:[13013], [5001]一例>一文。

 

待修订!

手工模拟Oracle数据块逻辑讹误引发,ORA-00600:[13013] [5001]一例

上周在客户那里遇到了一例由Oracle Bug引发的表数据块逻辑讹误触发ORA-00600:[13013], [5001]的问题,这里为了更好地说明该问题,于是萌发了手工模拟该数据块逻辑讹误的想法。

基础知识

Oracle中表的数据块由块头、事务槽、行字典、行数据等多种结构组成。 行数据(rowdata)实际是由许多row piece 行片组成的,每一条row piece的头部都有flag、locks、cols(cc)三个标志位。

其中flag标记了该row piece的类型,该flag位占用一个字节,其不同的bit位代表不同的含义,见下表:

ROW_CLUSTER_KEY = 0x80;              KDRHFK
ROW_CTABLE_NUMBER = 0x40;            KDRHFC
ROW_HEAD_PIECE = 0x20;               KDRHFH
ROW_DELETED_ROW = 0x10;              KDRHFD
ROW_FIRST_PIECE = 0x08;              KDRHFF
ROW_LAST_PIECE = 0x04;               KDRHFL
ROW_FROM_PREVIOUS = 0x02;            KDRHFP
ROW_CONTINUE_NEXT = 0x01;            KDRHFN

一般来说最普通的一条row piece是普通堆表(heap table)的未被删除的且无行迁移/链接的,其flag位应为

普通row的flag一般为

Single Row =
ROW_HEAD_PIECE + ROW_FIRST_PIECE + ROW_LAST_PIECE= 0x20 + 0x08 + 0x04= 0x2c

===================================================================================

cluster key的flag一般为

Cluster Key =
ROW_CLUSTER_KEY + ROW_HEAD_PIECE + ROW_FIRST_PIECE + ROW_LAST_PIECE=
KDRHFL, KDRHFF, KDRHFH, KDRHFK =0x80 + 0x2c =  0xac

BBED> x /rn
rowdata[68]                                 @8166
-----------
flag@8166: 0xac (KDRHFL, KDRHFF, KDRHFH, KDRHFK)
lock@8167: 0x00
cols@8168:    1
kref@8169:    1
mref@8171:    1
hrid@8173:0x01800014.0
nrid@8179:0x01800014.0

col    0[2] @8185: 10 

===================================================================================

Cluster Row =
ROW_CTABLE_NUMBER + ROW_HEAD_PIECE + ROW_FIRST_PIECE + ROW_LAST_PIECE =
(KDRHFL, KDRHFF, KDRHFH, KDRHFC) = 0x6c 

BBED> x /rncc
rowdata[0]                                  @8098
----------
flag@8098: 0x6c (KDRHFL, KDRHFF, KDRHFH, KDRHFC)
lock@8099: 0x00
cols@8100:   10

col    0[2] @8102: 200
col    1[8] @8105: Jennifer
col    2[6] @8114: Whalen
col    3[7] @8121: JWHALEN
col   4[12] @8129: 515.123.4444
col    5[7] @8142: w....
col    6[7] @8150: AD_ASST
col    7[2] @8158: 

                    col    8[0] @8161: *NULL*
col    9[3] @8162: .

出现ORA-00600:[13013], [5001]且Arg [f] Code =3 代表这一row piece的flag >0xc0,
也就是该行片同时被标记为key和clustered(row is marked as both a Key and Clustered), 其检验代码为check code 6251。

当flag >= 0xc0 时 会出现kdrchk: row is marked as both a Key and Clustered Block 12 failed with check code 6251

当 0xac >flag >= 0xa0 时 会 kdrchk: row is Key and Not only piece of key Block 12 failed with check code 6255

当 flag = 0x43 是 会出现 kdrchk: C and neither of H or F Block 12 failed with check code 6263

当 flag = 0x83 时 会出现 kdrchk: row is marked both as a Key and being continued Block 12 failed with check code 6254

 

当Oracle进程访问数据块时首先会校验block的sum值并与block中的CHECKSUM值进行对比,若一致则说明该block没有物理讹误。但是光这一项检查是不够的,不足以保证block无误。所以Oracle引入了一些列的逻辑检验,每一种逻辑检验对应一个检测代码(check code),这些检测包括row piece的flag、cols(cc)状态是否正确等。

实际负责这类逻辑检验的函数包括:kdbchk、kddummy_blkchk、kco_blkchk、kdBlkCheckError、kdrchk等等。

这里当服务进程访问到问题数据块,检测代码发现其flag为0xff(KCHDFLPN),该flag从逻辑上讲是冲突的,所以检测代码认为该row piece存在异常,进而会引发update的ORA-00600:[13013], [5001]或查询的ORA-600 [qertbFetchByRowID]内部错误。

这里需要说明一下的是,很多人认为dbv工具时无法检测出逻辑讹误的,实际上dbv、rman、validate structure和bbed-verify均可以检测出一定程度的逻辑讹误,但是最可靠的还是db_block_checksum=true情况下的validate structure [online]验证命令。从另一个角度来说,普通的dbv只能做单一的检测,而无法做到交叉地检验,从而了解表和索引上的不一致问题,但是validate structure online却可以做到。


正式模拟

以上我们了解了ORA-00600:[13013], [5001]内部错误是如何被引发的,那么下面手工模拟该错误也就不困难了,当然这里需要用到bbed工具。

以下我们会创建实验用的tablespace,table,index:

SQL> select * from v$version;

BANNER
----------------------------------------------------------------
Oracle Database 10g Enterprise Edition Release 10.2.0.4.0 - 64bi
PL/SQL Release 10.2.0.4.0 - Production
CORE 10.2.0.4.0 Production
TNS for Linux: Version 10.2.0.4.0 - Production
NLSRTL Version 10.2.0.4.0 - Production

SQL> select * from global_name;

GLOBAL_NAME
--------------------------------------------------------------------------------
www.askmac.cn

/* 创建实验用的表空间  */

SQL> create tablespace maclean datafile '/home/oracle/maclean.dbf' size 20M;

Tablespace created.

SQL> create table tv tablespace maclean as select rownum t1,'find me' t2 from
dba_tables where rownumcreate index ind_tv on tv(t1) tablespace users;

Index created.

SQL> update tv set t2='corrption here' where t1=200;
update tv set t2='corrption here' where t1=200
*
ERROR at line 1:
ORA-12899: value too large for column "SYS"."TV"."T2" (actual: 14, maximum: 7)

SQL> alter table tv modify t2 varchar2(200);

Table altered.

SQL> update tv set t2='corruption here' where t1=200;

1 row updated.

SQL> commit;

Commit complete.

/* 以上创建了示例用表,其中t1=200的记录是之后将会
   手动修改为存在讹误的行             */

SQL> select dump(200,16) from dual;

DUMP(200,16)
-----------------
Typ=2 Len=2: c2,3

/* 通过16进制码可以方便找出该t1=200的记录行 */ 
SQL> alter system checkpoint;

System altered.

SQL> alter tablespace maclean read only;

Tablespace altered.

SQL> select dbms_rowid.rowid_block_number(rowid) bno ,dbms_rowid.rowid_relative_fno(rowid) fno from tv;

BNO FNO
---------- ----------
12 6

[oracle@rh2 ~]$ cp maclean.dbf maclean.dbf.bak

 

接着使用BBED工具找到目标行并实施手工修改:

 

[oracle@rh2 ~]$ bbed filename=maclean.dbf mode=edit
Password:

BBED: Release 2.0.0.0.0 - Limited Production on Sun Sep 18 22:14:59 2011

Copyright (c) 1982, 2007, Oracle. All rights reserved.

BBED> set blocksize 8192
BLOCKSIZE 8192

BBED> set block 13
BLOCK# 13

BBED> map /v

File: maclean.dbf (0)
Block: 13 Dba:0x00000000
------------------------------------------------------------
KTB Data Block (Table/Cluster)

struct kcbh, 20 bytes @0
ub1 type_kcbh @0
ub1 frmt_kcbh @1
ub1 spare1_kcbh @2
ub1 spare2_kcbh @3
ub4 rdba_kcbh @4
ub4 bas_kcbh @8
ub2 wrp_kcbh @12
ub1 seq_kcbh @14
ub1 flg_kcbh @15
ub2 chkval_kcbh @16
ub2 spare3_kcbh @18

struct ktbbh, 96 bytes @20
ub1 ktbbhtyp @20
union ktbbhsid, 4 bytes @24
struct ktbbhcsc, 8 bytes @28
b2 ktbbhict @36
ub1 ktbbhflg @38
ub1 ktbbhfsl @39
ub4 ktbbhfnx @40
struct ktbbhitl[3], 72 bytes @44

struct kdbh, 14 bytes @124
ub1 kdbhflag @124
b1 kdbhntab @125
b2 kdbhnrow @126
sb2 kdbhfrre @128
sb2 kdbhfsbo @130
sb2 kdbhfseo @132
b2 kdbhavsp @134
b2 kdbhtosp @136

struct kdbt[1], 4 bytes @138
b2 kdbtoffs @138
b2 kdbtnrow @140

sb2 kdbr[200] @142

ub1 freespace[4725] @542

ub1 rowdata[2921] @5267

ub4 tailchk @8188

BBED> find /x c203

File: maclean.dbf (0)
Block: 13 Offsets: 5271 to 5782 Dba:0x00000000
------------------------------------------------------------------------
c2030f63 6f727275 7074696f 6e206865 72652c00 0202c203 0766696e 64206d65
2c000203 c2026407 66696e64 206d652c 000203c2 02630766 696e6420 6d652c00
0203c202 62076669 6e64206d 652c0002 03c20261 0766696e 64206d65 2c000203
c2026007 66696e64 206d652c 000203c2 025f0766 696e6420 6d652c00 0203c202
5e076669 6e64206d 652c0002 03c2025d 0766696e 64206d65 2c000203 c2025c07
66696e64 206d652c 000203c2 025b0766 696e6420 6d652c00 0203c202 5a076669
6e64206d 652c0002 03c20259 0766696e 64206d65 2c000203 c2025807 66696e64
206d652c 000203c2 02570766 696e6420 6d652c00 0203c202 56076669 6e64206d
652c0002 03c20255 0766696e 64206d65 2c000203 c2025407 66696e64 206d652c
000203c2 02530766 696e6420 6d652c00 0203c202 52076669 6e64206d 652c0002
03c20251 0766696e 64206d65 2c000203 c2025007 66696e64 206d652c 000203c2
024f0766 696e6420 6d652c00 0203c202 4e076669 6e64206d 652c0002 03c2024d
0766696e 64206d65 2c000203 c2024c07 66696e64 206d652c 000203c2 024b0766
696e6420 6d652c00 0203c202 4a076669 6e64206d 652c0002 03c20249 0766696e
64206d65 2c000203 c2024807 66696e64 206d652c 000203c2 02470766 696e6420
6d652c00 0203c202 46076669 6e64206d 652c0002 03c20245 0766696e 64206d65

找到了t1=200的偏移值为5271

则其fb的偏移值为5271 -4 = 5267

BBED> set offset 5267

OFFSET 5267

BBED> d
File: maclean.dbf (0)
Block: 13 Offsets: 5267 to 5778 Dba:0x00000000
------------------------------------------------------------------------
2c020202 c2030f63 6f727275 7074696f 6e206865 72652c00 0202c203 0766696e
64206d65 2c000203 c2026407 66696e64 206d652c 000203c2 02630766 696e6420
6d652c00 0203c202 62076669 6e64206d 652c0002 03c20261 0766696e 64206d65
2c000203 c2026007 66696e64 206d652c 000203c2 025f0766 696e6420 6d652c00
0203c202 5e076669 6e64206d 652c0002 03c2025d 0766696e 64206d65 2c000203
c2025c07 66696e64 206d652c 000203c2 025b0766 696e6420 6d652c00 0203c202
5a076669 6e64206d 652c0002 03c20259 0766696e 64206d65 2c000203 c2025807
66696e64 206d652c 000203c2 02570766 696e6420 6d652c00 0203c202 56076669
6e64206d 652c0002 03c20255 0766696e 64206d65 2c000203 c2025407 66696e64
206d652c 000203c2 02530766 696e6420 6d652c00 0203c202 52076669 6e64206d
652c0002 03c20251 0766696e 64206d65 2c000203 c2025007 66696e64 206d652c
000203c2 024f0766 696e6420 6d652c00 0203c202 4e076669 6e64206d 652c0002
03c2024d 0766696e 64206d65 2c000203 c2024c07 66696e64 206d652c 000203c2
024b0766 696e6420 6d652c00 0203c202 4a076669 6e64206d 652c0002 03c20249
0766696e 64206d65 2c000203 c2024807 66696e64 206d652c 000203c2 02470766
696e6420 6d652c00 0203c202 46076669 6e64206d 652c0002 03c20245 0766696e

/* 找到指定行的地址为5267,其当前flag为正常的0x2c  */

BBED> x /rnc

rowdata[0] @5267
----------
flag@5267: 0x2c (KDRHFL, KDRHFF, KDRHFH)
lock@5268: 0x02
cols@5269: 2

col 0[2] @5270: 200
col 1[15] @5273: corruption here

修改该flag 为 0xff BBED> modify /x 0xff

Warning: contents of previous BIFILE will be lost. Proceed? (Y/N) y
File: maclean.dbf (0)
Block: 13 Offsets: 5267 to 5778 Dba:0x00000000
------------------------------------------------------------------------
ff020202 c2030f63 6f727275 7074696f 6e206865 72652c00 0202c203 0766696e
64206d65 2c000203 c2026407 66696e64 206d652c 000203c2 02630766 696e6420
6d652c00 0203c202 62076669 6e64206d 652c0002 03c20261 0766696e 64206d65
2c000203 c2026007 66696e64 206d652c 000203c2 025f0766 696e6420 6d652c00
0203c202 5e076669 6e64206d 652c0002 03c2025d 0766696e 64206d65 2c000203
c2025c07 66696e64 206d652c 000203c2 025b0766 696e6420 6d652c00 0203c202
5a076669 6e64206d 652c0002 03c20259 0766696e 64206d65 2c000203 c2025807
66696e64 206d652c 000203c2 02570766 696e6420 6d652c00 0203c202 56076669
6e64206d 652c0002 03c20255 0766696e 64206d65 2c000203 c2025407 66696e64
206d652c 000203c2 02530766 696e6420 6d652c00 0203c202 52076669 6e64206d
652c0002 03c20251 0766696e 64206d65 2c000203 c2025007 66696e64 206d652c
000203c2 024f0766 696e6420 6d652c00 0203c202 4e076669 6e64206d 652c0002
03c2024d 0766696e 64206d65 2c000203 c2024c07 66696e64 206d652c 000203c2
024b0766 696e6420 6d652c00 0203c202 4a076669 6e64206d 652c0002 03c20249
0766696e 64206d65 2c000203 c2024807 66696e64 206d652c 000203c2 02470766
696e6420 6d652c00 0203c202 46076669 6e64206d 652c0002 03c20245 0766696e

BBED> x /rnc

rowdata[0] @5267
----------
flag@5267: 0xff (KDRHFN, KDRHFP, KDRHFL, KDRHFF, KDRHFD, KDRHFH, KDRHFC, KDRHFK)
lock@5268: 0x02
cols@5269: 0
ckix@5270: 2

BBED> sum apply

Check value for File 0, Block 13:
current = 0x0000, required = 0x0000

我们使用bbed的verify命令验证数据块会发现问题flag

BBED> verify
DBVERIFY - Verification starting
FILE = maclean.dbf
BLOCK = 12

kdrchk: row is marked as both a Key and Clustered

prow=0x7f5335f05693 flag=0xff
Block Checking: DBA = 25165836, Block Type = KTB-managed data block
data header at 0x7f5335f0427c
kdbchk: bad row tab 0, slot 199
Block 12 failed with check code 6251

DBVERIFY - Verification complete

Total Blocks Examined : 1
Total Blocks Processed (Data) : 1
Total Blocks Failing (Data) : 1
Total Blocks Processed (Index): 0
Total Blocks Failing (Index): 0
Total Blocks Empty : 0
Total Blocks Marked Corrupt : 0
Total Blocks Influx : 0

使用dbv工具是也可以验证这种逻辑讹误的

[oracle@rh2 ~]$ dbv file=maclean.dbf

DBVERIFY: Release 10.2.0.4.0 - Production on Sun Sep 18 22:27:49 2011

Copyright (c) 1982, 2007, Oracle. All rights reserved.

DBVERIFY - Verification starting : FILE = maclean.dbf
kdrchk: row is marked as both a Key and Clustered
prow=0x7f9ef25f7693 flag=0xff
Block Checking: DBA = 25165836, Block Type = KTB-managed data block
data header at 0x7f9ef25f627c
kdbchk: bad row tab 0, slot 199
Page 12 failed with check code 6251

DBVERIFY - Verification complete

Total Pages Examined : 2560
Total Pages Processed (Data) : 1
Total Pages Failing (Data) : 1
Total Pages Processed (Index): 0
Total Pages Failing (Index): 0
Total Pages Processed (Other): 11
Total Pages Processed (Seg) : 0
Total Pages Failing (Seg) : 0
Total Pages Empty : 2548
Total Pages Marked Corrupt : 0
Total Pages Influx : 0
Highest block SCN : 691691 (0.691691)

 

回到sqlplus中访问之前修改的数据行,触发ORA-600[13013] [5001]错误:

 

SQL> alter system flush buffer_cache;

System altered.

SQL> update tv set t2='correct here' where t1=200;
update tv set t2='correct here' where t1=200
*
ERROR at line 1:
ORA-00600: internal error code, arguments: [13013], [5001], [52937],
[25165836], [199], [25165836], [3], []

PLAN_TABLE_OUTPUT
---------------------------------------------------------
Plan hash value: 568795662

----------------------------------------------------------------------------
| Id  | Operation         | Name   | Rows  | Bytes | Cost (%CPU)| Time     |
----------------------------------------------------------------------------
|   0 | UPDATE STATEMENT  |        |     1 |   115 |     2   (0)| 00:00:01 |
|   1 |  UPDATE           | TV     |       |       |            |          |
|*  2 |   INDEX RANGE SCAN| IND_TV |     1 |   115 |     1   (0)| 00:00:01 |
----------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   2 - access("T1"=200)

SQL> select * from tv where t1=200;
select * from tv where t1=200
*
ERROR at line 1:
ORA-00600: internal error code, arguments: [qertbFetchByRowID], [], [], [], [],
[], [], []

PLAN_TABLE_OUTPUT
------------------------------------------------------------------------------------------------------------------------
Plan hash value: 1015724781

--------------------------------------------------------------------------------------
| Id  | Operation                   | Name   | Rows  | Bytes | Cost (%CPU)| Time     |
--------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT            |        |     1 |   115 |     2   (0)| 00:00:01 |
|   1 |  TABLE ACCESS BY INDEX ROWID| TV     |     1 |   115 |     2   (0)| 00:00:01 |
|*  2 |   INDEX RANGE SCAN          | IND_TV |     1 |       |     1   (0)| 00:00:01 |
--------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   2 - access("T1"=200)

可以看到当正好update到问题行记录时如预料出现了ORA-00600:[13013], [5001]错误,而ACCESS BY INDEX ROWID时出现了ORA-00600:[qertbFetchByRowID]。

解决方案

1.在有备份的情况下可以通过blockrecovery在线修复该问题数据块:

RMAN> blockrecover datafile 6 block 12;

Starting blockrecover at 18-SEP-11
using target database control file instead of recovery catalog
allocated channel: ORA_DISK_1
channel ORA_DISK_1: sid=144 devtype=DISK

starting media recovery
media recovery complete, elapsed time: 00:00:01

Finished blockrecover at 18-SEP-11

但是请注意如果该逻辑讹误确实是由Oracle Bug引起的话,那么很有可能blockrecover也无能为力,那么可以借鉴第二种方法。

 

2. 第二种方法针对没有备份可用的数据库或者recover数据块不管用的场景,可以设置10231事件并ctas复制该表,但是这种方法可能会丢失有问题的行记录:

SQL> alter session set events ‘10231 trace name context forever, level 10’

SQL> Create table.TABLE_COPY as select * from TABLE;

 

了解更多关于kdrchk函数的信息:

Add check for continued row piece pointing to itself with
corruption description:

"kdrchk: Row piece pointing to itself"

DB_BLOCK_CHECKING = MEDIUM will check for row pieces where the
next rowid (nrid) points to itself (chained row points to itself).
It produces error ORA-600 [kddummy_blkchk] or ORA-600 [kdBlkCheckError]
with check code [6266] (3rd ORA-600 argument).

DBVERIFY reports the same corruption description if the block is corrupt on disk.

RMAN when run with the CHECK LOGICAL option reports it as
     corruption_type=CORRUPT/LOGICAL in v$database_block_corruption.

"ANALYZE TABLE  VALIDATE STRUCTURE" produces error ORA-1498 and trace file
shows the same corruption description.

With this fix in place DBMS_REPAIR can be used to identify and mark the affected
block as Soft Corrupt producing error ORA-1578 and it can be skipped it for DML's
using DBMS_REPAIR.SKIP_CORRUPT_BLOCKS.

 [CM][SG][event 1][domain Q423][mem 0] Joining shared group
  kdrchk: column length 0 but not null
            prow=0x2a97f4d9d6 flag=0x2c column=57
  Block Checking: DBA = 29635651, Block Type = KTB-managed data block
  data header at 0x2a97f4be7c
  kdbchk: bad row tab 0, slot 2
  data_block_dump,data header at 0x2a97d113d8
  data_block_dump,data header at 0x2a97d113d8

 kdrchk: found invalid symbol reference 48
  reference to delete symbol
  valid symbol range [0,78)
  Block Checking: DBA = 411055291, Block Type = KTB-managed data block
  data header at 0x68a3f4
  kdbchk: bad row tab 0, slot 4
  Page 13499 failed with check code 6265

kdrchk: C and neither of H or F
          prow=0x4282803ae flag=0x41
Block Checking: DBA = 322963095, Block Type = KTB-managed data block
data header at 0x42828007c

kdrchk: column length 0 but not null
          prow=0x10021035e flag=0x2c column=40
Block Checking: DBA = 25189259, Block Type = KTB-managed data block
data header at 0x10020fe7c
kdbchk: bad row tab 0, slot 0
Page 23435 failed with check code 6264
kdrchk: column length 0 but not null
          prow=0x1002122e5 flag=0x2c column=40
Block Checking: DBA = 25189260, Block Type = KTB-managed data block

kdrchk:  row is marked as both a Key and Clustered
prow=0xd2bfa981 flag=0xff
File#67, Block#74754

kdbchk: bad row tab 0, slot 0
kdrchk:  no columns, but has one of P or N
          prow=0x934fbffa flag=0x31

DIAGNOSTIC ANALYSIS:
====================
A look at the block dump in the analyze trace file revealed two very
suspicious looking rows:

tab 0, row 0, @0x1ede
tl: 2 fb: --HD---N lb: 0x0
tab 0, row 1, @0x1edc
tl: 2 fb: --HD---N lb: 0x0

The flag bytes in these rows look incorrect.

待修订!

沪ICP备14014813号-2

沪公网安备 31010802001379号