Global Cache CR Requested But Current Block Received

这篇文章和之前的《MINSCN与Cache Fusion Read Consistent》 是姊妹篇,他们源于同一个问题帖子

我们来重现提问者所看到的这样一个场景:

 

SQL> select * from V$version;

BANNER
--------------------------------------------------------------------------------
Oracle Database 11g Enterprise Edition Release 11.2.0.3.0 - 64bit Production
PL/SQL Release 11.2.0.3.0 - Production
CORE    11.2.0.3.0      Production
TNS for Linux: Version 11.2.0.3.0 - Production
NLSRTL Version 11.2.0.3.0 - Production

SQL> select count(*) from gv$instance;

  COUNT(*)
----------
         2

SQL> select * from global_name;

GLOBAL_NAME
--------------------------------------------------------------------------------
www.askmac.cn

 

 

 

在11gR2 2节点RAC环境中将一个数据块的status修改为XG,假设这个Xcurrent block当前在INSTANCE 2被hold住,这时我们在INSTANCE 1反复查询这个数据块,并观察结果:

 

SQL> select  * from test;

        ID
----------
         1
         2

SQL> select dbms_rowid.rowid_block_number(rowid),dbms_rowid.rowid_relative_fno(rowid) from test;

DBMS_ROWID.ROWID_BLOCK_NUMBER(ROWID) DBMS_ROWID.ROWID_RELATIVE_FNO(ROWID)
------------------------------------ ------------------------------------
                               89233                                    1
                               89233                                    1

SQL> alter system flush buffer_cache;

System altered.

INSTANCE 1 Session A:

SQL>  update test set id=id+1 where id=1;

1 row updated.

INSTANCE 1 Session B:

SQL> select state,cr_scn_bas from x$bh where file#=1 and dbablk=89233 and state!=0;

     STATE CR_SCN_BAS
---------- ----------
         1          0
         3    1755287

SQL> oradebug setmypid;
Statement processed.
SQL> oradebug dump gc_elements 255;
Statement processed.
SQL> oradebug tracefile_name;
/s01/orabase/diag/rdbms/vprod/VPROD1/trace/VPROD1_ora_19111.trc

GLOBAL CACHE ELEMENT DUMP (address: 0xa4ff3080):
  id1: 0x15c91 id2: 0x1 pkey: OBJ#76896 block: (1/89233)
  lock: X rls: 0x0 acq: 0x0 latch: 3
  flags: 0x20 fair: 0 recovery: 0 fpin: 'kdswh11: kdst_fetch'
  bscn: 0x0.146e20 bctx: (nil) write: 0 scan: 0x0
  lcp: (nil) lnk: [NULL] lch: [0xa9f6a6f8,0xa9f6a6f8]
  seq: 32 hist: 58 145:0 118 66 144:0 192 352 197 48 121 113 424 180 58
  LIST OF BUFFERS LINKED TO THIS GLOBAL CACHE ELEMENT:
    flg: 0x02000001 lflg: 0x1 state: XCURRENT tsn: 0 tsh: 2
      addr: 0xa9f6a5c8 obj: 76896 cls: DATA bscn: 0x0.1ac898
BH (0xa9f6a5c8) file#: 1 rdba: 0x00415c91 (1/89233) class: 1 ba: 0xa9e56000
  set: 5 pool: 3 bsz: 8192 bsi: 0 sflg: 3 pwc: 0,15
  dbwrid: 0 obj: 76896 objn: 76896 tsn: 0 afn: 1 hint: f
  hash: [0x91f4e970,0xbae9d5b8] lru: [0x91f58848,0xa9f6a828]
  lru-flags: debug_dump
  obj-flags: object_ckpt_list
  ckptq: [0x9df6d1d8,0xa9f6a740] fileq: [0xa2ece670,0xbdf4ed68] objq: [0xb4964e00,0xb4964e00] objaq: [0xb4964de0,0xb4964de0]
  st: XCURRENT md: NULL fpin: 'kdswh11: kdst_fetch' tch: 2 le: 0xa4ff3080
  flags: buffer_dirty redo_since_read
  LRBA: [0x19.5671.0] LSCN: [0x0.1ac898] HSCN: [0x0.1ac898] HSUB: [1]
  buffer tsn: 0 rdba: 0x00415c91 (1/89233)
  scn: 0x0000.001ac898 seq: 0x01 flg: 0x00 tail: 0xc8980601
  frmt: 0x02 chkval: 0x0000 type: 0x06=trans data

 

 

可以看到此时block: (1/89233)的GLOBAL CACHE ELEMENT DUMP中LOCK状态仍是X 而非XG , 这是因为这个Current Block仅在一个Instance中被modify修改过,没有在全局范围内被更新过。

 

 

紧接着在Instance 2 修改该块:

 

 

Instance 2 Session C:

SQL> update test set id=id+1 where id=2;

1 row updated.

Instance 2 Session D:

SQL>  select state,cr_scn_bas from x$bh where file#=1 and dbablk=89233 and state!=0;

     STATE CR_SCN_BAS
---------- ----------
         1          0
         3    1756658

SQL> oradebug setmypid;
Statement processed.
SQL> oradebug dump gc_elements 255;
Statement processed.
SQL> oradebug tracefile_name;
/s01/orabase/diag/rdbms/vprod/VPROD2/trace/VPROD2_ora_13038.trc

GLOBAL CACHE ELEMENT DUMP (address: 0x89fb25a0):
  id1: 0x15c91 id2: 0x1 pkey: OBJ#76896 block: (1/89233)
  lock: XG rls: 0x0 acq: 0x0 latch: 3
  flags: 0x20 fair: 0 recovery: 0 fpin: 'kduwh01: kdusru'
  bscn: 0x0.1acdf3 bctx: (nil) write: 0 scan: 0x0
  lcp: (nil) lnk: [NULL] lch: [0x96f4cf80,0x96f4cf80]
  seq: 61 hist: 324 21 143:0 19 16 352 329 144:6 14 7 352 197
  LIST OF BUFFERS LINKED TO THIS GLOBAL CACHE ELEMENT:
    flg: 0x0a000001 state: XCURRENT tsn: 0 tsh: 1
      addr: 0x96f4ce50 obj: 76896 cls: DATA bscn: 0x0.1acdf6
BH (0x96f4ce50) file#: 1 rdba: 0x00415c91 (1/89233) class: 1 ba: 0x96bd4000
  set: 5 pool: 3 bsz: 8192 bsi: 0 sflg: 2 pwc: 0,15
  dbwrid: 0 obj: 76896 objn: 76896 tsn: 0 afn: 1 hint: f
  hash: [0x96ee1fe8,0xbae9d5b8] lru: [0x96f4d0b0,0x96f4cdc0]
  obj-flags: object_ckpt_list
  ckptq: [0xbdf519b8,0x96f4d5a8] fileq: [0xbdf519d8,0xbdf519d8] objq: [0xb4a47b90,0xb4a47b90] objaq: [0x96f4d0e8,0xb4a47b70]
  st: XCURRENT md: NULL fpin: 'kduwh01: kdusru' tch: 1 le: 0x89fb25a0
  flags: buffer_dirty redo_since_read remote_transfered
  LRBA: [0x11.9e18.0] LSCN: [0x0.1acdf6] HSCN: [0x0.1acdf6] HSUB: [1]
  buffer tsn: 0 rdba: 0x00415c91 (1/89233)
  scn: 0x0000.001acdf6 seq: 0x01 flg: 0x00 tail: 0xcdf60601
  frmt: 0x02 chkval: 0x0000 type: 0x06=trans data

 GCS CLIENT 0x89fb2618,6 resp[(nil),0x15c91.1] pkey 76896.0
   grant 2 cvt 0 mdrole 0x42 st 0x100 lst 0x20 GRANTQ rl G0
   master 1 owner 2 sid 0 remote[(nil),0] hist 0x94121c601163423c
   history 0x3c.0x4.0xd.0xb.0x1.0xc.0x7.0x9.0x14.0x1.
   cflag 0x0 sender 1 flags 0x0 replay# 0 abast (nil).x0.1 dbmap (nil)
   disk: 0x0000.00000000 write request: 0x0000.00000000
   pi scn: 0x0000.00000000 sq[(nil),(nil)]
   msgseq 0x1 updseq 0x0 reqids[6,0,0] infop (nil) lockseq x2b8
   pkey 76896.0
   hv 93 [stat 0x0, 1->1, wm 32768, RMno 0, reminc 18, dom 0]
   kjga st 0x4, step 0.0.0, cinc 20, rmno 6, flags 0x0
   lb 0, hb 0, myb 15250, drmb 15250, apifrz 0

 

 

在Instance 2中被修改过后block: (1/89233)的 GLOBAL CACHE ELEMENT Lock Convert成lock: XG

除了通过GC_ELEMENTS DUMP来分析XCUR Cache Fusion外,也可以直接查询X$ VIEW,主要是 X$LE X$KJBR X$KJBL, 这三个X$ VIEW的更多信息可以很方便地从我的博客中找到:

 

 

INSTANCE 2 Session D:

SELECT *
  FROM x$le
 WHERE le_addr IN (SELECT le_addr
                     FROM x$bh
                    WHERE obj IN (SELECT data_object_id
                                    FROM dba_objects
                                   WHERE owner = 'SYS'
                                     AND object_name = 'TEST')
                      AND class = 1
                      AND state != 3);

ADDR                   INDX    INST_ID LE_ADDR              LE_ID1     LE_ID2
---------------- ---------- ---------- ---------------- ---------- ----------
    LE_RLS     LE_ACQ   LE_FLAGS    LE_MODE   LE_WRITE   LE_LOCAL LE_RECOVERY
---------- ---------- ---------- ---------- ---------- ---------- -----------
   LE_BLKS    LE_TIME LE_KJBL
---------- ---------- ----------------
00007F94CA14CF60       7003          2 0000000089FB25A0      89233          1
         0          0         32          2          0          1           0
         1          0 0000000089FB2618

 

 

PCM Resource NAME由[ID1][ID2],[BL]等组成, ID1和ID2 通过blockno和 fileno计算获得, 这里我们直接参考以上GC_elements dump中的 id1: 0x15c91 id2: 0x1 pkey: OBJ#76896 block: (1/89233)信息 ,则  kjblname 和 kjbrname 应以”[0x15c91][0x1],[BL]” 开头:

 

 

INSTANCE 2 Session D:

SQL> set linesize 80 pagesize 1400
SQL> SELECT *
  2    FROM x$kjbl l
  3   WHERE l.kjblname LIKE '%[0x15c91][0x1],[BL]%';

ADDR                   INDX    INST_ID KJBLLOCKP        KJBLGRANT KJBLREQUE
---------------- ---------- ---------- ---------------- --------- ---------
  KJBLROLE KJBLRESP         KJBLNAME
---------- ---------------- ------------------------------
KJBLNAME2                       KJBLQUEUE
------------------------------ ----------
KJBLLOCKST                                                       KJBLWRITING
---------------------------------------------------------------- -----------
KJBLREQWRITE  KJBLOWNER KJBLMASTER KJBLBLOCKED KJBLBLOCKER    KJBLSID KJBLRDOMID
------------ ---------- ---------- ----------- ----------- ---------- ----------
  KJBLPKEY
----------
00007F94CA22A288        451          2 0000000089FB2618 KJUSEREX  KJUSERNL
         0 00               [0x15c91][0x1],[BL][ext 0x0,0x
89233,1,BL                              0
GRANTED                                                                    0
           0          1          0           0           0          0          0
     76896

SQL> SELECT r.* FROM x$kjbr r WHERE r.kjbrname LIKE '%[0x15c91][0x1],[BL]%';

no rows selected

Instance 1 session B:

SQL>  SELECT r.* FROM x$kjbr r WHERE r.kjbrname LIKE '%[0x15c91][0x1],[BL]%';

ADDR                   INDX    INST_ID KJBRRESP         KJBRGRANT KJBRNCVL
---------------- ---------- ---------- ---------------- --------- ---------
  KJBRROLE KJBRNAME                       KJBRMASTER KJBRGRANTQ
---------- ------------------------------ ---------- ----------------
KJBRCVTQ         KJBRWRITER          KJBRSID KJBRRDOMID   KJBRPKEY
---------------- ---------------- ---------- ---------- ----------
00007F801ACA68F8       1355          1 00000000B5A62AE0 KJUSEREX  KJUSERNL
         0 [0x15c91][0x1],[BL][ext 0x0,0x          0 00000000B48BB330
00               00                        0          0      76896

 

 

接着我们将在Instance 1上查询block: (1/89233),这应当会引发Instance 2 build cr block 并传输给Instance 1, 为了进一步了解其细节 我们分别对 Instance 1的 Foreground Process 和 Instance 2的LMS进程做详细的RAC  TRACE:

 

Instance 2:

[oracle@vrh2 ~]$ ps -ef|grep ora_lms|grep -v grep
oracle   23364     1  0 Apr29 ?        00:33:15 ora_lms0_VPROD2

SQL> oradebug setospid 23364
Oracle pid: 13, Unix process pid: 23364, image: oracle@vrh2.oracle.com (LMS0)

SQL> oradebug event 10046 trace name context forever,level 8:10708 trace name context forever,level 103: trace[rac.*] disk high;
Statement processed.

SQL> oradebug tracefile_name
/s01/orabase/diag/rdbms/vprod/VPROD2/trace/VPROD2_lms0_23364.trc

Instance 1 session B :

SQL> select state,cr_scn_bas from x$bh where file#=1 and dbablk=89233 and state!=0;

     STATE CR_SCN_BAS
---------- ----------
         3    1756658
         3    1756661
         3    1755287

Instance 1 session A :

SQL> alter session set events '10046 trace name context forever,level 8:10708 trace name context forever,level 103: trace[rac.*] disk high';

Session altered.

SQL> select * from test;

        ID
----------
         2
         2

SQL> select state,cr_scn_bas from x$bh where file#=1 and dbablk=89233 and state!=0;

     STATE CR_SCN_BAS
---------- ----------
         3    1761520

 

从x$BH的信息来看,如我们所预期的Instance 1收到了build好的CR block,进一步分析 TRACE 文件:

 

Instance 1 foreground Process:

PARSING IN CURSOR #140336527348792 len=18 dep=0 uid=0 oct=3 lid=0 tim=1335939136125254 hv=1689401402 ad='b1a4c828' sqlid='c99yw1xkb4f1u'
select * from test
END OF STMT
PARSE #140336527348792:c=2999,e=2860,p=0,cr=0,cu=0,mis=1,r=0,dep=0,og=1,plh=1357081020,tim=1335939136125253
EXEC #140336527348792:c=0,e=40,p=0,cr=0,cu=0,mis=0,r=0,dep=0,og=1,plh=1357081020,tim=1335939136125373
WAIT #140336527348792: nam='SQL*Net message to client' ela= 6 driver id=1650815232 #bytes=1 p3=0 obj#=0 tim=1335939136125420

*** 2012-05-02 02:12:16.125
kclscrs: req=0 block=1/89233
2012-05-02 02:12:16.125574 : kjbcro[0x15c91.1 76896.0][4]

*** 2012-05-02 02:12:16.125
kclscrs: req=0 typ=nowait-abort

*** 2012-05-02 02:12:16.125
kclscrs: bid=1:3:1:0:f:1e:0:0:10:0:0:0:1:2:4:1:20:0:0:0:c3:49:0:0:0:0:0:0:0:0:0:0:0:0:0:0:0:0:0:0:4:3:2:1:2:0:1c:0:4d:26:a3:52:0:0:0:0:c7:c:ca:62:c3:49:0:0:0:0:1:0:14:8e:47:76:1:2:dc:5:a9:fe:17:75:0:0:0:0:0:0:0:0:0:0:0:0:99:ed:0:0:0:0:0:0:10:0:0:0
2012-05-02 02:12:16.125718 : kjbcro[0x15c91.1 76896.0][4]
2012-05-02 02:12:16.125751 : GSIPC:GMBQ: buff 0xba0ee018, queue 0xbb79a7b8, pool 0x60013fa0, freeq 0, nxt 0xbb79a7b8, prv 0xbb79a7b8
2012-05-02 02:12:16.125780 : kjbsentscn[0x0.1ae0f0][to 2]
2012-05-02 02:12:16.125806 : GSIPC:SENDM: send msg 0xba0ee088 dest x20001 seq 177740 type 36 tkts xff0000 mlen x1680198
2012-05-02 02:12:16.125918 : kjbmscr(0x15c91.1)reqid=0x8(req 0xa4ff30f8)(rinst 1)hldr 2(infosz 200)(lseq x2b8)
2012-05-02 02:12:16.126959 : GSIPC:KSXPCB: msg 0xba0ee088 status 30, type 36, dest 2, rcvr 1

*** 2012-05-02 02:12:16.127
kclwcrs: wait=0 tm=1233

*** 2012-05-02 02:12:16.127
kclwcrs: got 1 blocks from ksxprcv
WAIT #140336527348792: nam='gc cr block 2-way' ela= 1233 p1=1 p2=89233 p3=1 obj#=76896 tim=1335939136127199
2012-05-02 02:12:16.127272 : kjbcrcomplete[0x15c91.1 76896.0][0]
2012-05-02 02:12:16.127309 : kjbrcvdscn[0x0.1ae0f0][from 2][idx 2012-05-02 02:12:16.127329 : kjbrcvdscn[no bscn <= rscn 0x0.1ae0f0][from 2]

 

 

前台进程 kjbcro[0x15c91.1 76896.0][4] kjbsentscn[0x0.1ae0f0][to 2] 向Instance 2请求SCN=1ae0f0=1761520的 block: (1/89233),并进入’gc cr block 2-way’ 等待,之后成功收到请求的CR block。

 

Instance 2 LMS TRACE

 

2012-05-02 02:12:15.634057 : GSIPC:RCVD: ksxp msg 0x7f16e1598588 sndr 1 seq 0.177740 type 36 tkts 0
2012-05-02 02:12:15.634094 : GSIPC:RCVD: watq msg 0x7f16e1598588 sndr 1, seq 177740, type 36, tkts 0
2012-05-02 02:12:15.634108 : GSIPC:TKT: collect msg 0x7f16e1598588 from 1 for rcvr -1, tickets 0
2012-05-02 02:12:15.634162 : kjbrcvdscn[0x0.1ae0f0][from 1][idx 2012-05-02 02:12:15.634186 : kjbrcvdscn[no bscn1, wm 32768, RMno 0, reminc 18, dom 0]
   kjga st 0x4, step 0.0.0, cinc 20, rmno 6, flags 0x0
   lb 0, hb 0, myb 15250, drmb 15250, apifrz 0
 GCS CLIENT END
2012-05-02 02:12:15.635211 : kjbdowncvt[0x15c91.1 76896.0][1][options x0]
2012-05-02 02:12:15.635230 : GSIPC:AMBUF: rcv buff 0x7f16e1c56420, pool rcvbuf, rqlen 1103
2012-05-02 02:12:15.635308 : GSIPC:GPBMSG: new bmsg 0x7f16e1c56490 mb 0x7f16e1c56420 msg 0x7f16e1c564b0 mlen 152 dest x101 flushsz -1
2012-05-02 02:12:15.635334 : kjbmslset(0x15c91.1)) seq 0x4 reqid=0x6 (shadow 0xb48bb330.xb)(rsn 2)(mas@1)
2012-05-02 02:12:15.635355 : GSIPC:SPBMSG: send bmsg 0x7f16e1c56490 blen 184 msg 0x7f16e1c564b0 mtype 33 attr|dest x30101 bsz|fsz x1ffff
2012-05-02 02:12:15.635377 : GSIPC:SNDQ: enq msg 0x7f16e1c56490, type 65521 seq 118669, inst 1, receiver 1, queued 1

*** 2012-05-02 02:12:15.635
kclccctx: cleanup copy 0x7f16e1d94798
2012-05-02 02:12:15.635479 : [kjmpmsgi:compl][type 36][msg 0x7f16e1598588][seq 177740.0][qtime 0][ptime 1257]
2012-05-02 02:12:15.635511 : GSIPC:BSEND: flushing sndq 0xb491dd28, id 1, dcx 0xbc516778, inst 1, rcvr 1  qlen 0 1
2012-05-02 02:12:15.635536 : GSIPC:BSEND: no batch1 msg 0x7f16e1c56490 type 65521 len 184 dest (1:1)
2012-05-02 02:12:15.635557 : kjbsentscn[0x0.1ae0f1][to 1]
2012-05-02 02:12:15.635578 : GSIPC:SENDM: send msg 0x7f16e1c56490 dest x10001 seq 118669 type 65521 tkts x10002 mlen xb800e8
WAIT #0: nam='gcs remote message' ela= 180 waittime=1 poll=0 event=0 obj#=0 tim=1335939135635819
2012-05-02 02:12:15.635853 : GSIPC:RCVD: ksxp msg 0x7f16e167e0b0 sndr 1 seq 0.177741 type 32 tkts 0
2012-05-02 02:12:15.635875 : GSIPC:RCVD: watq msg 0x7f16e167e0b0 sndr 1, seq 177741, type 32, tkts 0
2012-05-02 02:12:15.636012 : GSIPC:TKT: collect msg 0x7f16e167e0b0 from 1 for rcvr -1, tickets 0
2012-05-02 02:12:15.636040 : kjbrcvdscn[0x0.1ae0f1][from 1][idx 2012-05-02 02:12:15.636060 : kjbrcvdscn[no bscn <= rscn 0x0.1ae0f1][from 1]
2012-05-02 02:12:15.636082 : GSIPC:TKT: dest (1:1) rtkt not acked 1  unassigned bufs 0  tkts 0  newbufs 0
2012-05-02 02:12:15.636102 : GSIPC:TKT: remove ctx dest (1:1)
2012-05-02 02:12:15.636125 : [kjmxmpm][type 32][seq 0.177741][msg 0x7f16e167e0b0][from 1]
2012-05-02 02:12:15.636146 : kjbmpocr(0xb0.6)seq 0x1,reqid=0x23a,(client 0x9fff7b58,0x1)(from 1)(lseq xdf0)

 

2号实例上LMS进程的情况是这样的 收到gcs remote message GSIPC 关于要求SCN=[0x0.1ae0f0] block=1/89233的请求,进入BAST kjbmpbast(0x15c91.1),因为 block=1/89233已被写入到磁盘 触发fairness算法(在11.2.0.3中默认_fairness_threshold=2),对current block做KCL: F156: fairness downconvert,从Xcurrent DownConvert为 Scurrent:

 

Instance 2:
SQL> select state,cr_scn_bas from x$bh where file#=1 and dbablk=89233 and state!=0;

STATE CR_SCN_BAS
---------- ----------
2 0
3 1756658

 

之后Instance 2 LMS 将cr block加入到 kjbmslset(0x15c91.1)) 传送队列SEND QUEUE GSIPC:SNDQ: enq msg 0x7f16e1c56490。

 

接着我们再次在Instance 1上运行对 block: (1/89233)的查询 看会发生什么:

 

Instance 2:

SQL> select CURRENT_RESULTS,LIGHT_WORKS from v$cr_block_server;

CURRENT_RESULTS LIGHT_WORKS
--------------- -----------
          29273         437

Instance 1 session A:

SQL>
SQL> select * from test;

        ID
----------
         2
         2

SQL> select state,cr_scn_bas from x$bh where file#=1 and dbablk=89233 and state!=0;

     STATE CR_SCN_BAS
---------- ----------
         3    1761942
         3    1761932
         1          0
         3    1761520

Instance 2:

SQL> select CURRENT_RESULTS,LIGHT_WORKS from v$cr_block_server;

CURRENT_RESULTS LIGHT_WORKS
--------------- -----------
          29274         437

select * from test
END OF STMT
PARSE #140336529675592:c=0,e=337,p=0,cr=0,cu=0,mis=0,r=0,dep=0,og=1,plh=1357081020,tim=1335939668940051
EXEC #140336529675592:c=0,e=96,p=0,cr=0,cu=0,mis=0,r=0,dep=0,og=1,plh=1357081020,tim=1335939668940204
WAIT #140336529675592: nam='SQL*Net message to client' ela= 5 driver id=1650815232 #bytes=1 p3=0 obj#=0 tim=1335939668940348

*** 2012-05-02 02:21:08.940
kclscrs: req=0 block=1/89233
2012-05-02 02:21:08.940676 : kjbcro[0x15c91.1 76896.0][5]

*** 2012-05-02 02:21:08.940
kclscrs: req=0 typ=nowait-abort

*** 2012-05-02 02:21:08.940
kclscrs: bid=1:3:1:0:f:21:0:0:10:0:0:0:1:2:4:1:20:0:0:0:c3:49:0:0:0:0:0:0:0:0:0:0:0:0:0:0:0:0:0:0:4:3:2:1:2:0:1f:0:4d:26:a3:52:0:0:0:0:c7:c:ca:62:c3:49:0:0:0:0:1:0:17:8e:47:76:1:2:dc:5:a9:fe:17:75:0:0:0:0:0:0:0:0:0:0:0:0:99:ed:0:0:0:0:0:0:10:0:0:0
2012-05-02 02:21:08.940799 : kjbcro[0x15c91.1 76896.0][5]
2012-05-02 02:21:08.940833 : GSIPC:GMBQ: buff 0xba0ee018, queue 0xbb79a7b8, pool 0x60013fa0, freeq 0, nxt 0xbb79a7b8, prv 0xbb79a7b8
2012-05-02 02:21:08.940859 : kjbsentscn[0x0.1ae28c][to 2]
2012-05-02 02:21:08.940870 : GSIPC:SENDM: send msg 0xba0ee088 dest x20001 seq 177810 type 36 tkts xff0000 mlen x1680198
2012-05-02 02:21:08.940976 : kjbmscr(0x15c91.1)reqid=0xa(req 0xa4ff30f8)(rinst 1)hldr 2(infosz 200)(lseq x2b8)
2012-05-02 02:21:08.941314 : GSIPC:KSXPCB: msg 0xba0ee088 status 30, type 36, dest 2, rcvr 1

*** 2012-05-02 02:21:08.941
kclwcrs: wait=0 tm=707

*** 2012-05-02 02:21:08.941
kclwcrs: got 1 blocks from ksxprcv
2012-05-02 02:21:08.941818 : kjbassume[0x15c91.1][sender 2][mymode x1][myrole x0][srole x0][flgs x0][spiscn 0x0.0][swscn 0x0.0]
2012-05-02 02:21:08.941852 : kjbrcvdscn[0x0.1ae28d][from 2][idx 2012-05-02 02:21:08.941871 : kjbrcvdscn[no bscn

 

前台进程与上一次类似申请的是SCN=[0x0.1ae28c]=1761932 Version的CR block, 但这一次receive的居然是Xcurrent Block且其SCN=1ae28d=1761933,Instance 1收到了Xcurrent Block后自己build出了查询所需要的SCN=1761932的CR BLOCK, 因为实际收到的是Current block,所以等待事件变成了’gc current block 2-way’。

这里可以看到前台进程并没有request current block,而仍是使用kjbcro;追根述源是Instance 2的LMS进程给它传送了Current Block:

 

Instance 2 LMS trace:

2012-05-02 02:21:08.448743 : GSIPC:RCVD: ksxp msg 0x7f16e14a4398 sndr 1 seq 0.177810 type 36 tkts 0
2012-05-02 02:21:08.448778 : GSIPC:RCVD: watq msg 0x7f16e14a4398 sndr 1, seq 177810, type 36, tkts 0
2012-05-02 02:21:08.448798 : GSIPC:TKT: collect msg 0x7f16e14a4398 from 1 for rcvr -1, tickets 0
2012-05-02 02:21:08.448816 : kjbrcvdscn[0x0.1ae28c][from 1][idx 2012-05-02 02:21:08.448834 : kjbrcvdscn[no bscn <= rscn 0x0.1ae28c][from 1]
2012-05-02 02:21:08.448857 : GSIPC:TKT: dest (1:1) rtkt not acked 2  unassigned bufs 0  tkts 0  newbufs 0
2012-05-02 02:21:08.448875 : GSIPC:TKT: remove ctx dest (1:1)
2012-05-02 02:21:08.448970 : [kjmxmpm][type 36][seq 0.177810][msg 0x7f16e14a4398][from 1]
2012-05-02 02:21:08.448993 : kjbmpbast(0x15c91.1) reqid=0x6 (req 0xa4ff30f8)(reqinst 1)(reqid 10)(flags x0)

*** 2012-05-02 02:21:08.449
kclcrrf: req=48054 block=1/89233

*** 2012-05-02 02:21:08.449
kcl_compress_block: compressed: 6 free space: 7680
2012-05-02 02:21:08.449085 : kjbsentscn[0x0.1ae28d][to 1]
2012-05-02 02:21:08.449142 : kjbdeliver[to 1][0xa4ff30f8][10][current 1]
2012-05-02 02:21:08.449164 : kjbmssch(reqlock 0xa4ff30f8,10)(to 1)(bsz 344)
2012-05-02 02:21:08.449183 : GSIPC:AMBUF: rcv buff 0x7f16e18bcec8, pool rcvbuf, rqlen 1102

*** 2012-05-02 02:21:08.449
kclccctx: cleanup copy 0x7f16e1d94838

*** 2012-05-02 02:21:08.449
kcltouched: touch seconds 3271

*** 2012-05-02 02:21:08.449
kclgrantlk: req=48054
2012-05-02 02:21:08.449347 : [kjmpmsgi:compl][type 36][msg 0x7f16e14a4398][seq 177810.0][qtime 0][ptime 1119]
WAIT #0: nam='gcs remote message' ela= 568 waittime=1 poll=0 event=0 obj#=0 tim=1335939668449962
2012-05-02 02:21:08.450001 : GSIPC:RCVD: ksxp msg 0x7f16e1bb22a0 sndr 1 seq 0.177811 type 32 tkts 0
2012-05-02 02:21:08.450024 : GSIPC:RCVD: watq msg 0x7f16e1bb22a0 sndr 1, seq 177811, type 32, tkts 0
2012-05-02 02:21:08.450043 : GSIPC:TKT: collect msg 0x7f16e1bb22a0 from 1 for rcvr -1, tickets 0
2012-05-02 02:21:08.450060 : kjbrcvdscn[0x0.1ae28e][from 1][idx 2012-05-02 02:21:08.450078 : kjbrcvdscn[no bscn <= rscn 0x0.1ae28e][from 1]
2012-05-02 02:21:08.450097 : GSIPC:TKT: dest (1:1) rtkt not acked 3  unassigned bufs 0  tkts 0  newbufs 0
2012-05-02 02:21:08.450116 : GSIPC:TKT: remove ctx dest (1:1)
2012-05-02 02:21:08.450136 : [kjmxmpm][type 32][seq 0.177811][msg 0x7f16e1bb22a0][from 1]
2012-05-02 02:21:08.450155 : kjbmpocr(0xb0.6)seq 0x1,reqid=0x23e,(client 0x9fff7b58,0x1)(from 1)(lseq xdf4)

 

 

为什么Instance 2上的LMS要偷懒,不构造build cr block,而直接传输给Instance 1自己所有的Current Block呢?通过观察Instance 2上的v$cr_block_server视图可以发现LIGHT_WORKS字段在发生current block transfer后并没有增长,说明并没有触发 CR server的 Light Work Rule(Light Work Rule是8i Cr Server中就存在的优化算法,当Remote LMS创建 build CR涉及的工作过多时,resource holder的LMS会选择传输现有block,而不完成CR build If creating the consistent read version block involves too much work (such as reading blocks from disk), then the holder sends the block to the requestor, and the requestor completes the CR fabrication. The holder maintains a fairness counter of CR requests. After the fairness threshold is reached, the holder downgrades it to lock mode.)。

 

到底何种条件下 CR Request 才会收到Current Block呢?

答案是:针对不同种类class的block,CR server有不同的做法。 对于undo block或者 undo header block的CR quest, LMS总是传输Current Block, 这是为了给 远程节点 可能正在进行的 block cleanout、 CR  Version构造提供便利。

 

对于普通的数据块 data blocks, 默认情况下总是 CR quest  & CR received的(排除之前所说的Light Work Rule,LMS”偷懒”), 除非Current Block已经DownConvert降级为S lock,那么LMS进程会宣称直接ship一个current version的block。

 

为了证明这一点 , 我们再次测试 ,这次我们将控制DownConvert的隐藏参数”_fairness_threshold“提高到200,这将造成Xcurrent Block无法降级为Scurrent, 也将导致LMS更少地传输Current Version的Data Block:

 

 

 

SQL> show parameter fair

NAME                                 TYPE        VALUE
------------------------------------ ----------- ------------------------------
_fairness_threshold                  integer     200

Instance 1:

SQL>  update test set id=id+1 where id=4;

1 row updated.

Instance 2:

SQL>  update test set id=id+1 where id=2;

1 row updated.

SQL> select state,cr_scn_bas from x$bh where file#=1 and dbablk=89233 and state!=0;

     STATE CR_SCN_BAS
---------- ----------
         1          0
         3    1838166

在Instance 1上 不断查询,并 通过instance 2的 v$cr_block_server验证

instance 1
SQL> select * from test;

        ID
----------
        10
         3

instance 2:

SQL>  select state,cr_scn_bas from x$bh where file#=1 and dbablk=89233 and state!=0;

     STATE CR_SCN_BAS
---------- ----------
         1          0
         3    1883707
         8          0

SQL> select * from test;

        ID
----------
        10
         3

SQL>  select state,cr_scn_bas from x$bh where file#=1 and dbablk=89233 and state!=0;

     STATE CR_SCN_BAS
---------- ----------
         1          0
         3    1883707
         8          0

...................

SQL> /

     STATE CR_SCN_BAS
---------- ----------
         2          0
         3    1883707
         3    1883695

repeat cr request on Instance 1

SQL> /

     STATE CR_SCN_BAS
---------- ----------
         8          0
         3    1883707
         3    1883695

 

 

实际测试发现_fairness_threshold似乎存在某个上限,虽然设置为200 但是在不多的几次CR serve后就Downgrade了lock, 这造成对data block的 CR Request最终还是Receive了 Current Block。

11gR2新特性:LMHB Lock Manager Heart Beat后台进程

LMHB是11gR2中新引入的后台进程,官方文档的介绍是Global Cache/Enqueue Service Heartbeat Monitor,Monitor the heartbeat of LMON, LMD, and LMSn processes,LMHB monitors LMON, LMD, and LMSn processes to ensure they are running normally without blocking or spinning。  Database and ASM instances, Oracle RAC

该进程负责监控LMON、LMD、LMSn等RAC关键的后台进程,保证这些background process不被阻塞或spin。 LMHB可能是Lock Manager Heartbeat的缩写。

 

我们来看一下该进程的trace跟踪文件以便了解其功能:

按照 100s -> 80s -> 100s -> 80s的间隔监控并输出一次LMSn、LCKn、LMON、LMD等进程的状态及wait chain,由kjfmGCR_HBCheckAll函数控制

 

*** 2012-02-03 00:03:10.066
==============================
LMS0 (ospid: 17247) has not moved for 77 sec (1328245389.1328245312)
kjfmGCR_HBCheckAll: LMS0 (ospid: 17247) has status 2
  : waiting for event 'gcs remote message' for 0 secs with wait_id 15327.
  ===[ Wait Chain ]===
  Wait chain is empty.
kjgcr_Main: KJGCR_ACTION - id 5

*** 2012-02-03 00:04:50.091
==============================
LMS0 (ospid: 17247) has not moved for 88 sec (1328245489.1328245401)
kjfmGCR_HBCheckAll: LMS0 (ospid: 17247) has status 2
  : waiting for event 'gcs remote message' for 0 secs with wait_id 24546.
  ===[ Wait Chain ]===
  Wait chain is empty.
kjgcr_Main: KJGCR_ACTION - id 5

 LCK0 (ospid: 2662) has not moved for 95 sec (1309746735.1309746640)
  kjfmGCR_HBCheckAll: LCK0 (ospid: 2662) has status 6
  ==================================================
  === LCK0 (ospid: 2662) Heartbeat Report
  ==================================================
  LCK0 (ospid: 2662) has no heartbeats for 95 sec. (threshold 70 sec)
   : Not in wait; last wait ended 80 secs ago.
   : last wait_id 2317342 at 'libcache interrupt action by LCK'.
  ..
  .
   Session Wait History:
       elapsed time of 1 min 20 sec since last wait
    0: waited for 'libcache interrupt action by LCK'
  ..

 

大约每3分钟输出一次TOP CPU User,CPU使用率高的session信息:

 

*** 2012-02-03 00:05:30.102
kjgcr_SlaveReqBegin: message queued to slave
kjgcr_Main: KJGCR_ACTION - id 3
CPU is high.  Top oracle users listed below:
     Session           Serial         CPU
      29                 7             0
     156                23             0
       3                 1             0
       4                 1             0
       5                 1             0

*** 2012-02-03 00:08:30.147
kjgcr_SlaveReqBegin: message queued to slave
kjgcr_Main: KJGCR_ACTION - id 3
CPU is high.  Top oracle users listed below:
     Session           Serial         CPU
      29                 7             0
     156                23             0
       3                 1             0
       4                 1             0
       5                 1             0

 

如果发现有session的CPU使用率极高,根据内部算法可能会激活 资源计划(resource management plan) ,甚至于kill 进程:

 

*** 2012-02-03 00:08:35.149
kjgcr_Main: Reset called for action high cpu, identify users, count 0

*** 2012-02-03 00:08:35.149
kjgcr_Main: Reset called for action high cpu, kill users, count 0

*** 2012-02-03 00:08:35.149
kjgcr_Main: Reset called for action high cpu, activate RM plan, count 0

*** 2012-02-03 00:08:35.149
kjgcr_Main: Reset called for action high cpu, set BG into RT, count 0

 

从11.2.0.2 开始LMHB开始使用slave 进程GCRn来完成实际的任务(Global Conflict Resolution Slave Process Performs synchronous tasks on behalf of LMHB GCRn processes are transient slaves that are started and stopped as required by LMHB to perform synchronous or resource intensive tasks.) LMHB会控制GCRn进程的启停,以便使用多个GCRn完成同步和缓解资源紧张的任务(例如kill进程)。

可以看到实际LMHB调用的多为kjgcr或kjfmGCR开头的内部函数,GCR意为Global Conflict Resolution。

 

kjgcr_Main: KJGCR_ACTION – id 5

GCR 进程的trace :

*** 2011-11-28 02:42:44.466
kjgcr_SlaveActionCbk: Callback failed, check trace
Dumping GCR slave work message at 0x96b81fc0
GCR layer information: type = 1, index = 0
Unformatted dump of ksv layer header:

 

LMHB进程的出现是为了提高RAC的可用性,特别是在资源紧张的环境中他会主动地去尝试kill掉最耗费资源的服务进程,以保证LMS等关键的RAC后台进程能正常工作; 因为该进程定期监控LMS、LMON等后台进程的等待事件及session的CPU使用率等信息,所以该LMHB进程的跟踪日志也可能成为诊断RAC故障的之一,这是11.2.0.1以来RAC一个潜在的新特性和增强。

相关隐式参数

_lm_hb_callstack_collect_time hb diagnostic call stack collection time in seconds — 5s
_lm_hb_disable_check_list list of process names to be disabled in heartbeat check — none

 

11.2是第一个引入LMHB进程的版本,所以并不是太成熟,在实际过程中对于资源使用率很高的RAC系统而言LMHB可能会帮一些倒忙,若你确实遇到了相关的问题或者是在11.2 RAC上碰到了一些诡异的现象,那么可以关注一下以下这些MOS Note:

 

ORA-29770 LMHB Terminates Instance as LMON Waited for Control File IO or LIBRARY CACHE or ROW CACHE Event for too Long [ID 1197674.1]
Bug 8888434 – LMHB crashes the instance with LMON waiting on controlfile read [ID 8888434.8]
Bug 11890804 – LMHB crashes instance with ORA-29770 after long “control file sequential read” waits [ID 11890804.8]
Bug 11890804: LMHB TERMINATE INSTANCE WHEN LMON WAIT CHANGE FROM CF READ AFTER 60 SEC
Bug 13467673: CSS MISSCOUNT AND ALL ASM DOWN WITH ORA-29770 BY LMHB
Bug 13390052: KJFMGCR_HBCHECKALL MESSAGES ARE CONTINUOUSLY LOGGED IN LMHB TRACE FILE.
Bug 13322797: LMHB TERMINATES THE INSTANCE DUE TO ERROR 29770
Bug 11827088 – Latch ‘gc element’ contention, LMHB terminates the instance [ID 11827088.8]

Bug 13061883: LMHB IS TERMINATING THE INSTANCE DURING SHUTDOWN IMMEDIATE
Bug 12564133 – ORA-600[1433] in LMHB process during RAC reconfiguration [ID 12564133.8]
Bug 12886605: ESSC: LMHB TERMINATE INSTANCE DUE TO 29770 – LMON WAIT ENQ: AM – DISK OFFLINE
Bug 12757321: LMHB TERMINATING THE INSTANCE DUE TO ERROR 29770
Bug 10296263: LMHB (OSPID: 15872): TERMINATING THE INSTANCE DUE TO ERROR 29770
Bug 11899415: ORA-29771 AND LMHB (OSPID: XXXX) KILLS USER (OSPID: XXX
Bug 10431752: SINGLE NODE RAC: LMHB TERMINATES INSTANCE DUE TO 29770
Bug 11656856: LMHB (OSPID: 27701): TERMINATING THE INSTANCE DUE TO ERROR 29770
Bug 10411143: INSTANCE CRASHES WITH IPC SEND TIMEOUT AND LMHB TERMINATES WITH ORA-29770
Bug 11704041: DATABASE INSTANCE CRASH BY LMHB PROCESS
Bug 10412545: ORA-29770 LMHB TERMINATE INSTANCE DUE TO VARIOUS LONG CSS WAIT
Bug 10147827: INSTANCE TERMINATED BY LMHB WITH ERROR ORA-29770
Bug 10016974: ORA-29770 LMD IS HUNG FOR MORE THAN 70 SECONDS AND LMHB TERMINATE INSTANCE
Bug 9376100: LMHB TERMINATING INSTANCE DUE ERROR 29770

 

 

How many LMS processes for Oracle Rac 9i?

Question:

How many LMS processes for Oracle 9i Rac?

Can you check if you have increased the CPU in the machine during the upgrade.
Also please let me know if you have changed any underscore parameter.

Do you have some formula for calculate LMS processes and Num of CPU ?? Let say 2 CPU for 1 LMS process ??

 

Answer:

This can go dynamically as and when required and the # of startup of lms is also controlled by
_lm_lms,_lm_max_lms _lm_min_lms , later 2 seems to be for dynamic control of this numbers.
These parameters should not be set manually.
Also the no. of lms process should be one less than the number of CPU on the node.

沪ICP备14014813号-2

沪公网安备 31010802001379号