Oracle队列锁:IV,Library Cache Invalidation

IV,Library Cache Invalidation Enqueue Lock

 

相关资源:

在Library cache中当前被缓存的有效或已存数据库对象,例如 表TABLE、视图View、存储过程procedure、package、package body、trigger、index、cluster、synonym;或cursor (SQL or PL/SQL)、pipe、等多种多样的库缓存资源类型

 

相关用户:

所有的可能的Oracle前台或后台进程

 

锁的原因:

LCK后台进程及其他进程视图在集群中的所有实例上使相关的library cache object失效(invalidate)

 

何时使用该队列锁?

当一个有效或现存的数据库对象被加载到library cache库缓存中,LCK RAC后台进程将针对该resource要求一个S共享模式的IV队列锁。直到该对象或者失效或者不再存在或者被age out出library cache,该IV lock才会被释放。 该IV lock存在的目的是在所有实例间使library cache中缓存的对象失效。 若一个进程想要使library cache object失效则首先请求以X mode锁定该对象资源,这将导致所有实例中均使该缓存对象失效以响应BAST并释放他们在该对象上的IV lock,之后发起invalidate进程将释放该X lock。

 

ID1、ID2的组合:

Object Number, Timestamp

 

Lock Value Block:
Not Used.
Init.ora Parameters:
None.

 

Scope:
Global Lock.

 

Deadlock Sensitive:
No.

 

Operation:
Synchronous.

利用44951 event解决LOB SPACE enq HW – contention等待争用

对存有LOB大对象的表的并发插入、更新引起的LOB Segment High Water Mark是常见的LOb并发争用; 特别是在ASSM(Auto Segment Space Management)的表空间上这种LOB的HWM可能比MSSM(Manual Segment Space Management)更为严重,其原因是在MSSM下LOB的HWM bump一次会获取128个chunk, 而在ASSM下默认只获取必要的chunk(default 1) (This is worse in ASSM than MSSM because ASSM only gets the amount of space requested while MSSM gets 128 chunks at a time.)。

 

针对该LOB HWM高水位争用问题,常见的一种解决方法是为LOB segment预分配空间,因为预分配了空间所以出现高水位争用的机会少了,则 enq HW – contention争用出现的概率也随之降低,为LOB  allocate extent的语法如下:

 

ALTER TABLE <lob_table>
MODIFY LOB (<column_name>) (allocate extent (size <extent size>));

 

使用该方法手动为LOB SEGMENT预分配空间时需要注意,存在<NOTE 1229669.1 Bug 8198906 – Segment header corruption if extent allocation operation is interrupted>。

 

另一种缓解该LOB HW争用的有效方式是从 10.2.0.3上Bug 6376915引入的44951 event,该事件在10.2.0.4和11.1.0.7之后也都被引入。在10.2.0.3中使用该44951 event则需要优先apply Bug 6376915的one-off patch。

44951 event的LEVEL参数定义了在ASSM下当LOB segment的HWM上升时一次获取的chunks数目,一般推荐设置为1024,即一次get 1024个chunk,由于单次跃升的HWM更高了,这让进入enq HW – contention等待的机会变少了。 需要注意的是该44951 event仅仅对ASSM表空间上的LOB SEGMENT有效。

设置方法如下:

 

alter system set events ‘44951 trace name context forever, level 1024’;

 

 

以下是在11.2.0.3 中的实际测试,通过大量并发插入到LOB表模拟enq HW,并通过AWR报告中的性能信息比对设置该44951 event先后的区别:

 

 

 

SQL> select * from v$version;

BANNER
--------------------------------------------------------------------------------
Oracle Database 11g Enterprise Edition Release 11.2.0.3.0 - 64bit Production
PL/SQL Release 11.2.0.3.0 - Production
CORE    11.2.0.3.0      Production
TNS for Linux: Version 11.2.0.3.0 - Production
NLSRTL Version 11.2.0.3.0 - Production

SQL> show parameter event

NAME                                 TYPE        VALUE
------------------------------------ ----------- ------------------------------
event                                string
xml_db_events                        string      enable
SQL> 

conn maclean/oracle

 CREATE TABLE "MACLEAN_LOB" ( "T1" VARCHAR2(200) NOT NULL , "T2" CLOB, "T3" CLOB)  tablespace users
    LOB ("T2") 
    STORE AS  ( TABLESPACE "USERS" CHUNK 16K PCTVERSION 50 CACHE ) 
    LOB ("T3") 
    STORE AS  ( TABLESPACE "USERS" CHUNK 16K PCTVERSION 50 CACHE );

SQL> select segment_space_management from dba_tablespaces where tablespace_name='USERS';

SEGMEN
------
AUTO

exec dbms_workload_repository.create_snapshot;

开3个进程并发插入LOB表

begin
for i in 1..10000 loop
 insert into maclean.maclean_lob values ('ABC',rpad('Z',32000,'L'),rpad('Z',32000,'L'));
 end loop;
 commit;
 end;
 /

 exec dbms_workload_repository.create_snapshot;

SQL>   select bytes/1024,segment_name from dba_segments where segment_name in (select segment_name from dba_lobs where table_name='MACLEAN_LOB' and owner='MACLEAN');

BYTES/1024 SEGMENT_NAME
---------- ---------------------------------------------------------------------------------
    490496 SYS_LOB0000076982C00003$$
    482304 SYS_LOB0000076982C00002$$

	SQL> truncate table maclean.maclean_lob;

Table truncated.

SQL> SQL> 
SQL> 
SQL>  select bytes/1024,segment_name from dba_segments where segment_name in (select segment_name from dba_lobs where table_name='MACLEAN_LOB' and owner='MACLEAN');

BYTES/1024 SEGMENT_NAME
---------- ---------------------------------------------------------------------------------
        64 SYS_LOB0000076982C00003$$
        64 SYS_LOB0000076982C00002$$

SQL> alter system flush buffer_cache;

System altered.

SQL> alter system flush shared_pool;

System altered.

alter system set events '44951 trace name context forever, level 1024';

		 exec dbms_workload_repository.create_snapshot;

	开3个进程并发插入LOB表

begin
for i in 1..10000 loop
 insert into maclean.maclean_lob values ('ABC',rpad('Z',32000,'L'),rpad('Z',32000,'L'));
 end loop;
 commit;
 end;
 /	

 		 exec dbms_workload_repository.create_snapshot;

 select bytes/1024,segment_name from dba_segments where segment_name in (select segment_name from dba_lobs where table_name='MACLEAN_LOB' and owner='MACLEAN');

BYTES/1024 SEGMENT_NAME
---------- ---------------------------------------------------------------------------------
    483328 SYS_LOB0000076982C00003$$
    483328 SYS_LOB0000076982C00002$$

 

 

以上可以看到虽然设置了44951 level 1024,但并不会因为单次bump hwm的chunks数增加而导致大量空间的浪费。

对比AWR可以发现设置44961 level 1024后 enq HW – contention消耗的DB TIME明显减少:

 

 

 

 

此外在10.2.0.3之前还有一种方案即设置LOB的PCTVERSION 为0/100,但是该方案会导致LOB占用的SPACE大幅上升,所以不推荐,你有大量的理由至少升级DB到10.2.0.5.9。

 

 

Oracle Internal Event:10201 consistent read undo application诊断事件

之前我介绍了<Oracle Internal Event:10200 Consistent Read诊断事件>一致性逻辑读诊断事件的用法和trace含义,10201 “consistent read undo application”是另一个十分有用的内部诊断事件,该事件可以用于诊断一致性读取时的UNDO应用情况。

 

10201 event可以用于探测为了创建CR(consistent read) block块以满足要求的SCN需要应用多少undo,该10201 event还可以配和10200 event使用。利用该10201 event,我们可以验证一些内部问题,例如何时会发生块上的cleanup。

注意启用10201 event可能导致在短期内产生大量的trace文件,所以不要随意在生产系统中使用。
10201 Internal Event主要会被ktrgcm( CR-rollback (ktrgcm() ) 、 ktrrbkblk  、 ktrcrf 这三个Oracle内核函数触发,这三个Internal Function的主要作用:

 

  • ktrgcm – common CR read code
    CR Requestor-Side Algorithm
    The following statistics are incremented by ktrgcm:
    “cleanouts and rollbacks – consistent read” is incremented if UNDO is applied to BUFFER and CLEANOUT is performed.
    “rollbacks only – consistent read gets” is incremented if UNDO is applied to BUFFER and no CLEANOUT is performed.
    “cleanouts only – consistent read gets” is incremented if no UNDO is applied and CLEANOUT is performed.
    “no work – consistent read gets” is incremented if no UNDO is applied and no CLEANOUT is performed.
    When UNDO is applied to produce a CR BUFFER, other UNDO blocks should be read.
    When CLEANOUT is performed, the TX transaction table must be read.
  • ktrrbkblk retrieves previous row version with ktundo,When all rows checked, calls ktrrbkblk to rollback block (calls ktundo)      常见的stack call : ktrviupk kdiulk kcoubk ktundo kturbk ktrrbkblk ktrvfxs qerixFetch qertbFetchByRowID
  •   ktrcrf (rdbms/kernel/knl/ktr.c kcbchg1 ==> ktrcrf)

 

 

SQL> select * from v$version;

BANNER
--------------------------------------------------------------------------------
Oracle Database 11g Enterprise Edition Release 11.2.0.2.0 - 64bit Production
PL/SQL Release 11.2.0.2.0 - Production
CORE    11.2.0.2.0      Production
TNS for Linux: Version 11.2.0.2.0 - Production
NLSRTL Version 11.2.0.2.0 - Production

SQL> select * from global_name;

GLOBAL_NAME
--------------------------------------------------------------------------------
www.askmac.cn & www.askmac.cn

SQL> create table maclean (t1 int);

Table created.

SQL> insert into maclean values(1);

1 row created.

SQL> commit;

Commit complete.

SQL> select current_scn from v$database;

CURRENT_SCN
-----------
    1213588

SQL> delete maclean;

1 row deleted.

SQL> commit;

Commit complete.

SQL> alter system flush buffer_cache;

System altered.

SQL> alter system flush buffer_cache;

System altered.

SQL>  alter session set events '10201 trace name context forever,level 10';

Session altered.

SQL> select * from maclean as of scn 1213588;

        T1
----------
         1

trace content=====================================

Applying CR undo to block 0 : 408a81 itl entry 02:
          xid:  0x0009.00b.0000017d uba: 0x00c00212.0092.2d
          flg: --U-    lkc:  1     fsc: 0x0007.0012849c
CRS upd rd env [0x2ac2ec7b5660]: (scn: 0x0000.00128494  xid: 0x0000.000.00000000
uba: 0x00000000.0000.00  statement num=0  parent xid: 0x0000.000.00000000  st-scn: 0x
0000.00000000  hi-scn: 0x0000.00000000  ma-scn: 0x0000.00000000  flg: 0x00000800)
undo env [0x7fffe8b74d10]: (
scn: 0x0000.0012849b  xid: 0x0009.00b.0000017d  uba: 0x00c00212.0092.2d  statement num=151548811
parent xid: 0x0000.000.00000000  st-scn: 0x0000.00000000  hi-scn: 0x9
f58.00000000  ma-scn: 0x2ac2.ec7b9c88  flg: 0x00002ac2)
CRS upd (before): 0x69bdfe68  cr-scn: 0x0000.00128494  xid: 0x0000.000.00000000
uba: 0x00000000.0000.00  cl-scn: 0x0000.001287c1  sfl: 0
CRS upd (after) : 0x69bdfe68  cr-scn: 0x0000.00128494  xid: 0x0009.00b.0000017d
uba: 0x00c00212.0092.2d  cl-scn: 0x0000.001287c1  sfl: 0

以上trace中各代码的含义如下:

Applying CR undo to block 0 : 408a81 itl entry 02:
这里的0是 tablespace number, 408a81 是 DBA
而 itl entry 02 是被回滚的事务槽记录

CRS upd rd env [0x2ac2ec7b5660]: (scn: 0x0000.00128494 …..undo env [0x7fffe8b74d10]: (
以上为当前读取的环境信息,包括env_scn等

CRS upd (before): 0x69bdfe68 cr-scn: 0x0000.00128494
xid: 0x0000.000.00000000 uba: 0x00000000.0000.00 cl-scn: 0x0000.001287c1 sfl: 0
CRS upd (before)为回滚完成前的Buffer descriptor

CRS upd (after) : 0x69bdfe68 cr-scn: 0x0000.00128494 CR-SCN为1213588 如查询语句所要求的
xid: 0x0009.00b.0000017d uba: 0x00c00212.0092.2d cl-scn: 0x0000.001287c1 sfl: 0

CRS upd (after) 为回滚完成后的Buffer descriptor

Oracle Internal Research深入研究Oracle内部原理

以下是我个人原创或总结的一些深入研究Oracle Internal内部原理的文章post列表:

Oracle Internal Event:10200 Consistent Read诊断事件

了解Oracle RAC Brain Split Resolution

了解Oracle内核代码层的作用

了解你所不知道的SMON功能(九):维护MON_MODS$字典基表

了解你所不知道的SMON功能(八):Transaction Recover

了解你所不知道的SMON功能(七):清理IND$字典基表

了解你所不知道的SMON功能(六):Instance Recovery

了解你所不知道的SMON功能(五):Recover Dead transaction

了解你所不知道的SMON功能(四):维护col_usage$字典基表

了解你所不知道的SMON功能(三):清理obj$基表

了解你所不知道的SMON功能(二):合并空闲区间

了解你所不知道的SMON功能(一):清理临时段

Know more about commit

Find INTCOL#=1001 in col_usage$?

Buffer Lock Mode and Compatibilities

关于V$OPEN_CURSOR

Advanced Diagnostic using oradebug dumpvar

还原真实的cache recovery

Oracle内部视图:x$targetrba

Oracle内部视图:x$ktfbfe

Oracle内部视图:x$ktfbue

Only ARCH Bgprocess may create archivelog?

Oracle Latch:一段描绘Latch运作的伪代码

Who pulls the background process dbwr’s trigger?

基于行跟踪的ROWDEPENDENCIES ORA_ROWSCN信息

Oracle闩:Cache Buffers chains

Latches and Tuning:Latches

Latches and Tuning:Redo Log Buffer and Latches

Latches and Tuning:Buffer Cache

Latches and Tuning:The Library Cache

Oracle Database Links Master Class

Oracle等待事件:Data file init write

What’s the Point of Oracle Checkpoints?

How does RECORDLENGTH affect your exp speed?

Oracle网络TNS协议介绍(revised)

logfile switch causes incremental checkpoint?

Know more about redo log buffer and latches

How does cpu_count parameter affect instance?

How to trigger ORA-00600,ORA-7445 by manual

Basics of C code within the Oracle kernel

Know more about Oracle Latches

Know more about Buffer Cache and Latch

Know More About Libarary Cache and Latches

Oracle内部视图:X$BH

Know more about DML

Known Oracle Internal Stack Call Meaning

深入了解ASMM

_shared_pool_reserved_pct or shared_pool_reserved_size with ASMM

PL/SQL Virtual Machine Memory Usage

Slide:深入了解Oracle自动内存管理ASMM by Maclean Liu

Know GCS AND GES structure size in shared pool

How does SGA/PGA allocate on AMM?

理解Oracle在AIX平台上的内存使用

Probe how does your PGA consume

ora-4031 and “obj stat memory” component in Shared Pool

直接路径读取对于延迟块清除的影响

How to make BBED(Oracle Block Brower and EDitor Tool) on Unix/Linux/Windows

隐藏参数_high_priority_processes与oradism

Oracle等待事件Enqueue CI:Cross Instance Call Invocation

Oracle常用的几个父栓

ddl操作是否会产生undo?

kgl simulator,共享池simulator以及缓冲simulator相关解释

Oracle Internal Event:10200 Consistent Read诊断事件

10200(consistent read buffer status)内部诊断事件可以用于探测一致性读CR(consistent read)块的访问情况,虽然cr读的统计信息可以从v$sysstat或AWR/statspack中获取,但是10200 event还是我们研究Consistent Read一致性读的有力工具。该事件可以通过在会话session级别设置ALTER SESSION SET EVENTS 或 DBMS_SYSTEM.SET_EV. Set 来开启,一般调用级别为Level 10。

该事件返回的trace跟踪文件,记录了为了实现一致性读的目的,哪些数据块以及这些块的各历史版本在执行过程中被创建(CR block creation)并检验(CR block inspection),以找出Best CR block满足Consistent一致性。

注意10200 Internal Event主要是被ktrgtc和ktrget(call ktrget to get one block ->calling KTR layer to apply RBS to have consistent read Block;)这2个Oracle内核功能函数触发,这2个内部函数Internal Function的主要作用:

ktrget:

  • Initializes a buffer cache CR scan request
  • Calls kcbgtcr for the best resident buffer to start from to build the CR buffer
  • Calls ktrgcm to build the CR buffer by applying undo
  • Returns CR buffer to the requestor

kcbgtcr:

  • If successful, returns the “best” candidate (performed by ktrexf or examination function)
  • Scans the hash bucket for the DBA for buffers that may be used to build a CR buffer
  • If not successful, calls kcbget

 

10200 event trace example:

 

[oracle@rh2 ~]$ sqlplus / as sysdba

SQL*Plus: Release 10.2.0.4.0 - Production on Fri Sep 30 21:23:47 2011

Copyright (c) 1982, 2007, Oracle.  All Rights Reserved.

Connected to:
Oracle Database 10g Enterprise Edition Release 10.2.0.4.0 - 64bit Production
With the Partitioning, OLAP, Data Mining and Real Application Testing option

SQL> select * from global_name;

GLOBAL_NAME
--------------------------------------------------------------------------------
www.askmac.cn & www.askmac.cn

SQL> create table tv(t1 int);

Table created.

SQL> alter session set events '10200 trace name context forever,level 10';

Session altered.

SQL> select * from tv;

        T1
----------
         1

10200 trace

Consistent read started for block 0 : 0040081a
  env: (scn: 0x0000.000cf852 xid: 0x0000.000.00000000 uba: 0x00000000.0000.00  statement num=0

parent xid: xid: 0x0000.000.00000000  scn: 0x0000.00000000 8sch: scn: 0x0000.00000000)
 CR exa ret 2 on: 0x600139d0  scn: 0xffff.ffffffff  xid: 0x0000.000.00000000  uba: 0x00000000.0000.00

scn: 0xffff.ffffffff  sfl: 0

Consistent read finished for block 0 : 40081a

Consistent read started for block 0 : 0040e508
  env: (scn: 0x0000.000cf852  xid: 0x0000.000.00000000  uba: 0x00000000.0000.00  statement num=0
parent xid: xid: 0x0000.000.00000000  scn: 0x0000.00000000 8sch: scn:
 0x0000.00000000)
Consistent read finished for block 0 : 40e508

 

以上trace中各代码的含义如下:

Consistent read started for block 0 : 0040081a
0 -> tablespace number 0040081a -> DBA

 

env: (scn: 0x0000.000cf852 xid: 0x0000.000.00000000 uba: 0x00000000.0000.00
以上为环境/会话信息,这个scn是env_scn ,一般就是数据库的current_scn

 

SFL :0  -> SFL 为 Snapshot Flag

 

CR exa ret 2 -> 此处的ret为reture code返回代码,是ktrgtc/ktrget函数的返回码

 

以下为ktrgtc/ktrget函数部分可能返回代码的含义:

 

#define KCBRSTOP (8|0) /* return this one now */
#define KCBRSAVE (8|1) /* save this one and continue */
#define KCBRSKIP (0|1) /* skip over this one and continue */
#define KCBRQUIT (0|2) /* chuck all, return nothing and stop */
#define KCBRLAST (0|3) /* quit if read from disk else skip */

 

10200诊断事件在11g中得到了加强,通过该event我们可以获得更多有用的trace信息了:

 

SQL> select * from v$version;

BANNER
--------------------------------------------------------------------------------
Oracle Database 11g Enterprise Edition Release 11.2.0.3.0 - 64bit Production
PL/SQL Release 11.2.0.3.0 - Production
CORE    11.2.0.3.0      Production
TNS for Linux: Version 11.2.0.3.0 - Production
NLSRTL Version 11.2.0.3.0 - Production

SQL> select * from global_name;

GLOBAL_NAME
--------------------------------------------------------------------------------
MACLEAN

SQL> alter session set events '10200 trace name context forever, level 10';

Session altered.

SQL> select * from tv;

        T1
----------
         1

SQL> oradebug setmypid;
Statement processed.

SQL> oradebug tracefile_name;
/s01/orabase/diag/rdbms/vprod/VPROD1/trace/VPROD1_ora_28365.trc

trace content

ktrget2(): started for block   objd: 0x000040e1
env [0x2b54cde2a704]: (scn: 0x0000.0026b064  xid: 0x0000.000.00000000  uba: 0x00000000.0000.00
statement num=0  parent xid: 0x0000.000.00000000  st-scn: 0x0000.00000000  hi-scn: 0x0000.00000000
ma-scn: 0x0000.0026b053  flg: 0x00000660)
ktrexf(): returning 9 on:  0xbb132d0  cr-scn: 0xffff.ffffffff  xid: 0x0000.000.00000000
uba: 0x00000000.0000.00  cl-scn: 0xffff.ffffffff  sfl: 0
ktrgcm(): completed for block   objd: 0x000040e1
ktrget3(): completed for  block  objd: 0x000040e1

了解Oracle RAC Brain Split Resolution

了解Oracle内核代码层的作用

Oracle RDBMS Kernel 的代码分成多层次,如<Oracle8i Internal Services>一书所介绍的:

oracle_kernel_layer

其中部分代码层的作用:

The cache layer (KC)

The cache layer manages the database buffer cache. It uses operating
system dependent facilities for data file I/O, provides concurrency control
facilities for local access to the cache buffers, and provides parallel cache
management (PCM) instance locking facilities for Oracle parallel server.
The other main responsibility of the cache layer is the control of redo
generation into the log buffer, and the writing of redo to the log files. The
cache layer also caches control file information.

  • To read blocks.
  • To update blocks in redo only mode.

The transaction layer (KT)

This layer is responsible for the allocation of transactions to rollback
segments, interested transaction list changes within data blocks, changes
to rollback segment blocks for undo generation, transaction control
facilities such as savepoints, and read consistency. The transaction layer is
also responsible for space management, both at the level of segment free
lists and at the level of tablespace extent allocation.

  • To update blocks with undo.
  • To check with the status of the transaction.
  • To manage recursive transactions for nested space operations.

The recursive program interface (RPI)

The recursive program interface is used to populate the dictionary cache
from the data dictionary. Row cache recursive SQL statements are
executed in a separate call context, but are not parsed and optimized in the
compilation layer.

Row Cache Layer

  • To read/update space dictionary information, where acutal updates happens by callbacks when row cache is flushed.

Recursive SQL

  • To read/update dictionary information.

File Layer

  • To create/drop/alter files in tablespace.

 

以上各代码层实现了Oracle Space Management 空间管理的主要功能。

了解你所不知道的SMON功能(九):维护MON_MODS$字典基表

SMON后台进程的作用还包括维护MON_MODS$基表,当初始化参数STATISTICS_LEVEL被设置为TYPICAL或ALL时默认会启用Oracle中表监控的特性,Oracle会默认监控表上的自上一次分析以后(Last analyzed)发生的INSERT,UPDATE,DELETE以及表是否被TRUNCATE截断,并将这些操作数量的近似值记录到数据字典基表MON_MODS$中,我们常用的一个DML视图dba_tab_modifications的数据实际来源于另一个数据字典基表MON_MODS_ALL$,SMON定期会将MON_MODS$中符合要求的数据MERGE到MON_MODS_ALL$中。

Rem DML monitoring

create table mon_mods$
(
  obj#              number,                                 /* object number */
  inserts           number,  /* approx. number of inserts since last analyze */
  updates           number,  /* approx. number of updates since last analyze */
  deletes           number,  /* approx. number of deletes since last analyze */
  timestamp         date,     /* timestamp of last time this row was changed */
  flags             number,                                         /* flags */
                                           /* 0x01 object has been truncated */
  drop_segments     number   /* number of segemnt in part/subpartition table */
)
  storage (initial 200K next 100k maxextents unlimited pctincrease 0)
/
create unique index i_mon_mods$_obj on mon_mods$(obj#)
  storage (maxextents unlimited)
/

Rem DML monitoring, has info aggregated to global level for paritioned objects
create table mon_mods_all$
(
  obj#              number,                                 /* object number */
  inserts           number,  /* approx. number of inserts since last analyze */
  updates           number,  /* approx. number of updates since last analyze */
  deletes           number,  /* approx. number of deletes since last analyze */
  timestamp         date,     /* timestamp of last time this row was changed */
  flags             number,                                         /* flags */
                                           /* 0x01 object has been truncated */
  drop_segments     number   /* number of segemnt in part/subpartition table */
)
  storage (initial 200K next 100k maxextents unlimited pctincrease 0)
/
create unique index i_mon_mods_all$_obj on mon_mods_all$(obj#)
  storage (maxextents unlimited)
/

Rem =========================================================================
Rem End Usage monitoring tables
Rem =========================================================================

VIEW DBA_TAB_MODIFICATIONS

select u.name, o.name, null, null,
       m.inserts, m.updates, m.deletes, m.timestamp,
       decode(bitand(m.flags,1),1,'YES','NO'),
       m.drop_segments
from sys.mon_mods_all$ m, sys.obj$ o, sys.tab$ t, sys.user$ u
where o.obj# = m.obj# and o.obj# = t.obj# and o.owner# = u.user#
union all
select u.name, o.name, o.subname, null,
       m.inserts, m.updates, m.deletes, m.timestamp,
       decode(bitand(m.flags,1),1,'YES','NO'),
       m.drop_segments
from sys.mon_mods_all$ m, sys.obj$ o, sys.user$ u
where o.owner# = u.user# and o.obj# = m.obj# and o.type#=19
union all
select u.name, o.name, o2.subname, o.subname,
       m.inserts, m.updates, m.deletes, m.timestamp,
       decode(bitand(m.flags,1),1,'YES','NO'),
       m.drop_segments
from sys.mon_mods_all$ m, sys.obj$ o, sys.tabsubpart$ tsp, sys.obj$ o2,
     sys.user$ u
where o.obj# = m.obj# and o.owner# = u.user# and
      o.obj# = tsp.obj# and o2.obj# = tsp.pobj#

现象:

SMON后台进程会每15分钟将SGA中的DML统计信息刷新到SYS.MON_MODS$基表中(SMON flush every 15 minutes to SYS.MON_MODS$),
同时会将SYS.MON_MODS$中符合要求的数据MERGE合并到MON_MODS_ALL$中,并清空原MON_MODS$中的数据。
MON_MODS_ALL$作为dba_tab_modifications视图的数据来源,起到辅助统计信息收集的作用,详见拙作<Does GATHER_STATS_JOB gather all objects’ stats every time?>

SMON具体将DML统计数据刷新到SYS.MON_MODS$、合并到MON_MODS_ALL$、并清除数据的操作如下:

SQL> select * from v$version;

BANNER
--------------------------------------------------------------------------------
Oracle Database 11g Enterprise Edition Release 11.2.0.2.0 - 64bit Production
PL/SQL Release 11.2.0.2.0 - Production
CORE    11.2.0.2.0      Production
TNS for Linux: Version 11.2.0.2.0 - Production
NLSRTL Version 11.2.0.2.0 - Production

SQL> select * from global_name;

GLOBAL_NAME
--------------------------------------------------------------------------------
www.askmac.cn

/* 填充mon_mods$字典基表 */

lock table sys.mon_mods$ in exclusive mode nowait

insert into sys.mon_mods$
  (obj#, inserts, updates, deletes, timestamp, flags, drop_segments)
values
  (:1, :2, :3, :4, :5, :6, :7)

update sys.mon_mods$
   set inserts       = inserts + :ins,
       updates       = updates + :upd,
       deletes       = deletes + :del,
       flags        =
       (decode(bitand(flags, :flag), :flag, flags, flags + :flag)),
       drop_segments = drop_segments + :dropseg,
       timestamp     = :time
 where obj# = :objn

lock table sys.mon_mods_all$ in exclusive mode

/* 以下merge命令会将mon_mods$中的记录合并到mon_mods_all$,
   若有匹配的记录,则在原记录的基础上增加inserts、updates、deletes总数,
   否则插入新的记录 
*/

merge /*+ dynamic_sampling(mm 4) dynamic_sampling_est_cdn(mm)                           
dynamic_sampling(m 4) dynamic_sampling_est_cdn(m) */
into sys.mon_mods_all$ mm
using (select m.obj#          obj#,
              m.inserts       inserts,
              m.updates       updates,
              m.deletes       deletes,
              m.flags         flags,
              m.timestamp     timestamp,
              m.drop_segments drop_segments fr om sys.mon_mods$ m,
              tab$            t where m.obj# = t.obj#) v
on (mm.ob j# = v.obj#)
when matched then
  update
     set mm.inserts       = mm.inserts + v.inserts,
         mm.updates       = mm.updates + v.updates,
         mm.deletes       = mm.deletes + v.deletes,
         mm.flags         = mm.flags + v.flags - bitand(mm.flags, v.flags) /* bitor(mm.flags,v.flags) */,
         mm.timestamp     = v.timestamp,
         mm.drop_segments = mm.drop_segments + v.drop_segments
when NOT matched then
  insert
    (obj#, inserts, updates, deletes, timestamp, flags, drop_segments)
  values
    (v.obj#,
     v.inserts,
     v.updates,
     v.deletes,
     sysdate,
     v.flags,
     v.drop_segments) / all merge /*+ dynamic_sampling(mm 4) dynamic_sampling_est_cdn(mm)                           
dynamic_sampling(m 4) dynamic_sampling_est_cdn(m) */
  into sys.mon_mods_all$ mm using
    (select m.obj#          obj#,
            m.inserts       inserts,
            m.updates       updates,
            m.deletes       deletes,
            m.flags         flags,
            m.timestamp     timestamp,
            m.drop_segments drop_segments fr om sys.mon_mods$ m,
            tab$            t where m.obj# = t.obj#) v on
    (mm.ob j# = v.obj#)
when matched then
  update
     set mm.inserts       = mm.inserts + v.inserts,
         mm.updates       = mm.updates + v.updates,
         mm.deletes       = mm.deletes + v.deletes,
         mm.flags         = mm.flags + v.flags - bitand(mm.flags, v.flags) 
         /* bitor(mm.flags,v.flags) */,
         mm.timestamp     = v.timestamp,
         mm.drop_segments = mm.drop_segments + v.drop_segments
when NOT matched then
  insert
    (obj#, inserts, updates, deletes, timestamp, flags, drop_segments)
  values
    (v.obj#,
     v.inserts,
     v.updates,
     v.deletes,
     sysdate,
     v.flags,
     v.drop_segments)

/* 最后删除sys.mon_mods$上的相关记录 */

delete /*+ dynamic_sampling(m 4) dynamic_sampling_est_cdn(m) */
from sys.mon_mods$ m
 where exists (select /*+ unnest */
         *
          from sys.tab$ t
         where t.obj# = m. obj#)

  select obj#
    from sys.mon_mods$
   where obj# not in (select obj# from sys.obj$)

Used to have a FULL TABLE SCAN on obj$ associated with monitoring information 
extracted in conjunction with mon_mods$ executed by SMON periodically.

因为当SMON或用户采用”DBMS_STATS.FLUSH_DATABASE_MONITORING_INFO”存储过程将DML数据刷新到mon_mods$或mon_mods_all$中时会要求持有表上的排它锁,所以在RAC环境中可能出现死锁问题。

另外在早期版本中SMON可能因维护监控表而造成shutdown immediate缓慢或系统性能下降的问题,详见:

<Shutdown immediate hangs if table monitoring enabled on [ID 263217.1]>
<Bug 2806297 – SMON can cause bad system performance if TABLE MONITORING enabled on lots of tables [ID 2806297.8]>

SMON维护MON_MODS$时相关的Stack CALL

kglpnal <- kglpin <- kxsGetRuntimeLock
<- kksfbc <- kkspsc0 <- kksParseCursor <- opiosq0 <- opiall0
<- opikpr <- opiodr <- PGOSF175_rpidrus <- skgmstack <- rpiswu2
<- kprball <- kprbbnd0 <- kprbbnd <- ksxmfmel <- ksxmfm
<- ksxmfchk <- ksxmftim <- ktmmon <- ktmSmonMain <- ksbrdp
<- opirip <- opidrv <- sou2o <- opimai_real <- ssthrdmain
<- main <- libc_start_main <- start

如何禁止SMON维护MON_MODS$

注意在缺省参数环境中创建的表总是启用table monitoring的:

SQL> select * from v$version;

BANNER
--------------------------------------------------------------------------------
Oracle Database 11g Enterprise Edition Release 11.2.0.2.0 - 64bit Production
PL/SQL Release 11.2.0.2.0 - Production
CORE    11.2.0.2.0      Production
TNS for Linux: Version 11.2.0.2.0 - Production
NLSRTL Version 11.2.0.2.0 - Production

SQL> select * from v$version;

BANNER
--------------------------------------------------------------------------------
Oracle Database 11g Enterprise Edition Release 11.2.0.2.0 - 64bit Production
PL/SQL Release 11.2.0.2.0 - Production
CORE    11.2.0.2.0      Production
TNS for Linux: Version 11.2.0.2.0 - Production
NLSRTL Version 11.2.0.2.0 - Production

SQL> create table maclean1 (t1 int);          

Table created.

/* 在10g以后nomonitoring或monitoring选项不再有效  */

SQL> create table maclean2 (t1 int) nomonitoring;

Table created.

SQL>  select table_name,monitoring from dba_tables  where table_name like 'MACLEAN%';


TABLE_NAME                     MON
------------------------------ ---
MACLEAN1                       YES
MACLEAN2                       YES

通常来说我们不需要禁止SMON维护MON_MODS$,除非是在SMON维护过程中遭遇shutdown过慢、性能降低或者异常情况恢复SMON随机terminate实例的问题。

在10g以前可以使用MONITORING和NOMONITORING这2个选项来控制表级别的监控是否被开启,此外我们还可以通过dbms_stats.ALTER_SCHEMA_TAB_MONITORING(‘maclean’,false)存储过程在schema级别的monitoring是否被开启,但是在10g以后这些方法不再有效,MONITORING和NOMONITORING选项被废弃(In 10g the MONITORING and NOMONITORING keywords are deprecated and will be ignored.),其原有功能被STATISTICS_LEVEL参数所覆盖。

Table-monitoring特性现在完全由STATISTICS_LEVEL参数所控制:
当STATISTICS_LEVEL设置为BASIC时,Table-monitoring将被禁用
当STATISTICS_LEVEL设置为TYPICAL或ALL时,Table-monitoring将启用

换而言之我们可以通过设置STATISTICS_LEVEL为BASIC达到禁止SMON后台进程该种功能的作用,具体修改该参数的命令如下:

show parameter statistics_level
alter system set statistics_level = basic;

但是请注意如果你正在使用AMM或ASMM自动内存管理特性的话,那么STATISTICS_LEVEL参数是不能设置为BASIC的,因为Auto-Memory或Auto-Sga特性都依赖于STATISTICS_LEVEL所控制的性能统计信息。若一定要这样做那么首先要diable AMM&ASMM:


 #diable 11g AMM ,have to bounce instance
 #alter system set memory_target =0 scope=spfile;
 #diable 10g ASMM
 alter system set sga_target=0;
 alter system set statistics_level = basic;

Automatic Tuning of Undo Retention

Oracle 10gR2 and higher automatically tunes the undo retention period based on how the undo tablespace is configured.

  1. If the undo tablespace is configured with the AUTOEXTEND option, the database dynamically tunes the undo retention period to be somewhat longer than the longest-running active query on the system. However, this retention period may be insufficient to accommodate Oracle Flashback operations. Oracle Flashback operations resulting in “snapshot too old” errors are the indicator that you must intervene to ensure that sufficient undo data is retained to support these operations. To better accommodate Oracle Flashback features, you can either set the UNDO_RETENTION parameter to a value equal to the longest expected Oracle Flashback operation, or you can change the undo tablespace to fixed size.
  2. If the undo tablespace is fixed size, the database dynamically tunes the undo retention period for the best possible retention for that tablespace size and the current system load. This best possible retention time is typically significantly greater than the duration of the longest-running active query.
  3. To guarantee the success of long-running queries or Oracle Flashback operations, you can enable retention guarantee. If retention guarantee is enabled, the specified minimum undo retention is guaranteed; the database never overwrites unexpired undo data even if it means that transactions fail due to lack of space in the undo tablespace. If retention guarantee is not enabled, the database can overwrite unexpired undo when space is low, thus lowering the undo retention for the system. This option is disabled by default.

了解你所不知道的SMON功能(八):Transaction Recover

SMON的作用还包括启动(startup)时的Transaction Recover:

SMON: enabling cache recovery
Archived Log entry 87 added for thread 1 sequence 58 ID 0xa044e7d dest 1:
[15190] Successfully onlined Undo Tablespace 2.
Undo initialization finished serial:0 start:421305354 end:421305534 diff:180 (1 seconds)
Verifying file header compatibility for 11g tablespace encryption..
Verifying 11g file header compatibility for tablespace encryption completed
SMON: enabling tx recovery

<了解你所不知道的SMON功能(五):Recover Dead transaction>中我们介绍了SMON清理死事务的功能,数据库打开时由SMON所启动的TX recovery与Recover Dead transaction所作的工作是类似的,fast_start_parallel_rollback参数决定了SMON在回滚事务时使用的并行度(详见原帖)。

但是请注意,实际startup时的TX recovery要比普通的Dead transaction recover复杂的多。其大致步骤如下:

1.在SYSTEM回滚段(Undo Segment Number为o)中的Active Transaction将被第一时间优先回滚

2.在其他回滚段中的Active Transaction将被标记为’DEAD’

3.之后SMON将扫描非SYSTEM的回滚段并实施对死事务的回滚,其典型的调用堆栈stack call如下:

 kturec <- kturax <- ktprbeg <- ktmmon <- ktmSmonMain

4.SMON仍将扫描_OFFLINE_ROLLBACK_SEGMENTS所列出的回滚段,但对其上的Active Transaction不做回滚,若发现corrupted则只报错

5.SMON将忽略_CORRUPTED_ROLLBACK_SEGMENTS所列出的回滚段,甚至在启动时不做扫描,所有指向这类回滚段地事务都被认为已经提交了。

具体SMON在对ktuini的函数调用中启动Transaction Recover,该function的经典stack call如下:

adbdrv -> ktuini -> ktuiup -> kturec -> kturrt
or
adbdrv -> ktuini -> ktuiof -> ktunti -> kqrpre -> kqrpre1 -> ktuscr

其中由ktuiof函数判断_OFFLINE_ROLLBACK_SEGMENTS和_CORRUPTED_ROLLBACK_SEGMENTS的值,并将这些重要的回滚段信息转存到fixed array。
注意SYSTEM回滚段是bootstrap的重要对象,所以我们不能指定system rollback segment为offline或者corrupted。

SMON执行Transaction Recover时的大致步骤如下:

调用ktuiof保存_OFFLINE_ROLLBACK_SEGMENTS和_CORRUPTED_ROLLBACK_SEGMENTS所列出的回滚段

调用ktuiup函数,开始恢复回滚段上的死事务

第一优先级地恢复USN=0的SYSTEM回滚段上的事务,由kturec函数控制

对undo$字典基表上的记录循环:

FOR usn in undo$ loop
IF usn==0

恢复SYSTEM回滚段上在第一轮中未完成的事务,同样由kturec控制;

ELSE

将任何活动事务标记为DEAD,由kturec控制;

USN++

end loop

相关诊断事件

与Transaction Recover密切相关的诊断事件有不少,其中最为重要的是event 10013和10015;10015事件对于普通的dead transaction rollback也有效,之所以把该事件列在<Transaction Recover>功能内,是因为我们经常在非正常手段打开数据库时会遇到一些ORA-600[4xxx]的内部错误,可以通过10015事件了解相关的usn,然后以_SYSSMU(USN#)$的形式加入到_CORRUPTED_ROLLBACK_SEGMENTS以绕过内部错误(注意在11g中不能这样做了):

  1. 10013, 00000, “Instance Recovery”
  2. 10015, 00000, “Undo Segment Recovery”
Event 10013:Monitor transaction recovery during startup

SQL> alter system set event='10013 trace name context forever,level 10' scope=spfile;

Event 10015:Dump undo segment headers before and after transaction recovery

SQL> alter system set event='10015 trace name context forever,level 10' scope=spfile;
System altered.

======================10015 sample trace===========================
UNDO SEG (BEFORE RECOVERY): usn = 0  Extent Control Header
  -----------------------------------------------------------------
  Extent Header:: spare1: 0      spare2: 0      #extents: 6      #blocks: 47
                  last map  0x00000000  #maps: 0      offset: 4128
      Highwater::  0x0040000b  ext#: 0      blk#: 1      ext size: 7
  #blocks in seg. hdr's freelists: 0
  #blocks below: 0
  mapblk  0x00000000  offset: 0
                   Unlocked
     Map Header:: next  0x00000000  #extents: 6    obj#: 0      flag: 0x40000000
  Extent Map
  -----------------------------------------------------------------
   0x0040000a  length: 7
   0x00400011  length: 8
   0x00400181  length: 8
   0x00400189  length: 8
   0x00400191  length: 8
   0x00400199  length: 8      

  TRN CTL:: seq: 0x012c chd: 0x0033 ctl: 0x0026 inc: 0x00000000 nfb: 0x0001
            mgc: 0x8002 xts: 0x0068 flg: 0x0001 opt: 2147483646 (0x7ffffffe)
            uba: 0x0040000b.012c.1b scn: 0x0000.021fa595
Version: 0x01
  FREE BLOCK POOL::
    uba: 0x0040000b.012c.1b ext: 0x0  spc: 0x4a0
    uba: 0x00000000.005c.07 ext: 0x2  spc: 0x1adc
    uba: 0x00000000.0034.37 ext: 0x4  spc: 0x550
    uba: 0x00000000.0000.00 ext: 0x0  spc: 0x0
    uba: 0x00000000.0000.00 ext: 0x0  spc: 0x0     

  TRN TBL::

  index  state cflags  wrap#    uel         scn            dba            parent-xid    nub     stmt_num
  ------------------------------------------------------------------------------------------------
   0x00    9    0x00  0x025d  0x002b  0x0000.02215c0b  0x0040000b  0x0000.000.00000000  0x00000001   0x00000000   0x0000
   0x01    9    0x00  0x025d  0x0006  0x0000.0220a58c  0x0040000a  0x0000.000.00000000  0x00000001   0x00000000   0x0000
   0x02    9    0x00  0x025d  0x000e  0x0000.0220a58a  0x0040000a  0x0000.000.00000000  0x00000001   0x00000000   0x0000
   0x03    9    0x00  0x025d  0x000f  0x0000.02215be4  0x0040000b  0x0000.000.00000000  0x00000001   0x00000000   0x0000
   0x04    9    0x00  0x025d  0x0008  0x0000.0220a57a  0x0040000a  0x0000.000.00000000  0x00000001   0x00000000   0x0000
   0x05    9    0x00  0x025d  0x0056  0x0000.0220a583  0x0040000a  0x0000.000.00000000  0x00000001   0x00000000   0x0000
   0x06    9    0x00  0x025d  0x0017  0x0000.0220a58d  0x0040000a  0x0000.000.00000000  0x00000001   0x00000000   0x0000
   0x07    9    0x00  0x025d  0x0050  0x0000.0220a57f  0x0040000a  0x0000.000.00000000  0x00000001   0x00000000   0x0000
   0x08    9    0x00  0x025d  0x0061  0x0000.0220a57c  0x0040000a  0x0000.000.00000000  0x00000001   0x00000000   0x0000
   0x09    9    0x00  0x025d  0x0013  0x0000.02215c01  0x0040000b  0x0000.000.00000000  0x00000001   0x00000000   0x0000
   0x0a    9    0x00  0x025d  0x0022  0x0000.02215bf7  0x0040000b  0x0000.000.00000000  0x00000001   0x00000000   0x0000
   0x0b    9    0x00  0x025d  0x0014  0x0000.02215bdd  0x0040000a  0x0000.000.00000000  0x00000001   0x00000000   0x0000
   0x0c    9    0x00  0x025c  0x003a  0x0000.021ff3fa  0x004001a0  0x0000.000.00000000  0x00000001   0x00000000   0x0000
   0x0d    9    0x00  0x025d  0x0010  0x0000.02215c05  0x0040000b  0x0000.000.00000000  0x00000001   0x00000000   0x0000
   0x0e    9    0x00  0x025d  0x0001  0x0000.0220a58b  0x0040000a  0x0000.000.00000000  0x00000001   0x00000000   0x0000
   0x0f    9    0x00  0x025d  0x001c  0x0000.02215be6  0x0040000b  0x0000.000.00000000  0x00000001   0x00000000   0x0000
   0x10    9    0x00  0x025d  0x002a  0x0000.02215c07  0x0040000b  0x0000.000.00000000  0x00000001   0x00000000   0x0000
   0x11    9    0x00  0x025d  0x0025  0x0000.02215bf2  0x0040000b  0x0000.000.00000000  0x00000001   0x00000000   0x0000
   0x12    9    0x00  0x025d  0x0018  0x0000.02215bee  0x0040000b  0x0000.000.00000000  0x00000001   0x00000000   0x0000
   0x13    9    0x00  0x025d  0x000d  0x0000.02215c03  0x0040000b  0x0000.000.00000000  0x00000001   0x00000000   0x0000
   0x14    9    0x00  0x025d  0x005a  0x0000.02215bdf  0x0040000a  0x0000.000.00000000  0x00000001   0x00000000   0x0000
   0x15    9    0x00  0x025d  0x0058  0x0000.0220a587  0x0040000a  0x0000.000.00000000  0x00000001   0x00000000   0x0000
   0x16    9    0x00  0x025d  0x000a  0x0000.02215bf6  0x0040000b  0x0000.000.00000000  0x00000001   0x00000000   0x0000
   0x17    9    0x00  0x025d  0x000b  0x0000.0220a58e  0x0040000a  0x0000.000.00000000  0x00000001   0x00000000   0x0000
   0x18    9    0x00  0x025d  0x0011  0x0000.02215bf0  0x0040000b  0x0000.000.00000000  0x00000001   0x00000000   0x0000
   0x19    9    0x00  0x025c  0x0044  0x0000.021ff410  0x004001a0  0x0000.000.00000000  0x00000001   0x00000000   0x0000
   0x1a    9    0x00  0x025d  0x005c  0x0000.02215bea  0x0040000b  0x0000.000.00000000  0x00000001   0x00000000   0x0000
   0x1b    9    0x00  0x025d  0x001d  0x0000.02215bfd  0x0040000b  0x0000.000.00000000  0x00000001   0x00000000   0x0000
   0x1c    9    0x00  0x025d  0x001a  0x0000.02215be8  0x0040000b  0x0000.000.00000000  0x00000001   0x00000000   0x0000
   0x1d    9    0x00  0x025d  0x0009  0x0000.02215bff  0x0040000b  0x0000.000.00000000  0x00000001   0x00000000   0x0000
   0x1e    9    0x00  0x025d  0x005f  0x0000.02215bfa  0x0040000b  0x0000.000.00000000  0x00000001   0x00000000   0x0000
   0x1f    9    0x00  0x025c  0x0032  0x0000.021fa59b  0x0040019f  0x0000.000.00000000  0x00000001   0x00000000   0x0000
   0x20    9    0x00  0x025c  0x0038  0x0000.021fa599  0x0040019f  0x0000.000.00000000  0x00000001   0x00000000   0x0000

可用以下命令分析smon的10015 trace,并列出相关回滚段名

[oracle@rh2 bdump]$ cat g10r2_smon_18738.trc|grep "usn ="|grep -v "usn = 0" |awk '{print "_SYSSMU"$7"$"}'|sort -u
_SYSSMU1$
_SYSSMU10$
_SYSSMU2$
_SYSSMU3$
_SYSSMU4$
_SYSSMU5$
_SYSSMU6$
_SYSSMU7$
_SYSSMU8$
_SYSSMU9$

沪ICP备14014813号-2

沪公网安备 31010802001379号