MINSCN与Cache Fusion Read Consistent

问题来源于 网友在Ask Maclean Home 上关于  RAC 中 Past Image PI块的提问, 网友提出了一个演示,我们在11.2.0.3 2 Node RAC的环境中重现这个实验:

 

SQL> select * from v$version;

BANNER
--------------------------------------------------------------------------------
Oracle Database 11g Enterprise Edition Release 11.2.0.3.0 - 64bit Production
PL/SQL Release 11.2.0.3.0 - Production
CORE    11.2.0.3.0      Production
TNS for Linux: Version 11.2.0.3.0 - Production
NLSRTL Version 11.2.0.3.0 - Production

SQL> select * from global_name;

GLOBAL_NAME
--------------------------------------------------------------------------------
www.askmac.cn

SQL> drop table test purge;

Table dropped.

SQL> alter system flush buffer_cache;

System altered.

SQL> create table test(id number);

insert into test values(1);

insert into test values(2);

commit;

/* 我们利用 rowid定位TEST表仅有的2行数据的数据块位置 */

select dbms_rowid.rowid_block_number(rowid),dbms_rowid.rowid_relative_fno(rowid) from test;

DBMS_ROWID.ROWID_BLOCK_NUMBER(ROWID) DBMS_ROWID.ROWID_RELATIVE_FNO(ROWID)
------------------------------------ ------------------------------------
                               89233                                    1
                               89233                                    1 

SQL> alter system flush buffer_cache;

System altered.

Instance 1  Session A 执行UPDATE操作:

SQL> update test set id=id+1 where id=1;

1 row updated.

Instance 1  Session B 查询x$BH buffer header视图 了解 相关Buffer的状态 

SQL> select state,cr_scn_bas from x$bh where file#=1 and dbablk=89233 and state!=0;

     STATE CR_SCN_BAS
---------- ----------
         1          0
         3    1227595

 

X$BH 视图的 STATE字段代表Buffer的状态, 以下是状态列表:

 

STATE       NUMBER
           KCBBHFREE         0       buffer free
           KCBBHEXLCUR       1       buffer current (and if DFS locked X)
           KCBBHSHRCUR       2       buffer current (and if DFS locked S)
           KCBBHCR           3       buffer consistant read
           KCBBHREADING      4       Being read
           KCBBHMRECOVERY    5       media recovery (current & special)
           KCBBHIRECOVERY    6       Instance recovery (somewhat special)

 

 

这个演示中我们需要用到的是 : state =1 Xcurrent 、 state=2 Scurrent 、 state=3 CR

 

接着在 Instance 2  更新 同一个数据块内的另一条记录 ,这回引发 gc current block 2 way  并将Current Block 传输到 Instance 2, 同时 Instance 1 的原”Current Block” Convert 成 Past Image:

 

 

Instance 2 Session C 

SQL> update test set id=id+1 where id=2;

1 row updated.

Instance 2 Session D

SQL> select state,cr_scn_bas from x$bh where file#=1 and dbablk=89233 and state!=0;

     STATE CR_SCN_BAS
---------- ----------
         1          0
         3    1227641
         3    1227638

 

STATE =1 的Xcurrent block已传输到 Instance 2 , 再来看 Instance 1 此时的 GC状态:

 

Instance 1 Session B 

SQL> select state,cr_scn_bas from x$bh where file#=1 and dbablk=89233 and state!=0;

     STATE CR_SCN_BAS
---------- ----------
         3    1227641
         3    1227638
         8          0
         3    1227595

 

 

问题出现在这里, 当网友再次在Instance 1上的session A中执行对TEST表的SELECT查询后 ,发现原来的 3个 State=3的CR 块 数量减少到了1个:

 

Instance 1 session A 即最初执行UPDATE的 session

SQL> alter session set events '10046 trace name context forever,level 8:10708 trace name context forever,level 103: trace[rac.*] disk high';

Session altered.

SQL> select * from test;

        ID
----------
         2
         2

select state,cr_scn_bas from x$bh where file#=1 and dbablk=89233 and state!=0;

      STATE CR_SCN_BAS
---------- ----------
         3    1227716
         3    1227713
         8          0

 

 

网友在原帖中是通过v$BH 视图观察CR块的数量,发现在执行SELECT查询后 CR块数量反而减少了,故而产生了疑问。 我们在以上演示中直接 观察X$BH视图可以发现 , 原本的三个CR块的SCN Version分别为: 1227641、1227638、1227595, 在SELECT查询完成后被 2个不同SCN version的CR块 1227716和 1227713 所替换, Oracle为什么要这样做呢?

 

所幸我们在实际执行SELECT查询前设置了event 10708和 rac.*的诊断TRACE,我们先来看看TRACE内容:

 

PARSING IN CURSOR #140444679938584 len=337 dep=1 uid=0 oct=3 lid=0 tim=1335698913632292 hv=3345277572 ad='bc0e68c8' sqlid='baj7tjm3q9sn4'
SELECT /* OPT_DYN_SAMP */ /*+ ALL_ROWS IGNORE_WHERE_CLAUSE NO_PARALLEL(SAMPLESUB) opt_param('parallel_execution_enabled', 'false') NO_PARALLEL_INDEX(SAMPLESUB) NO_SQL_TUNE */ NVL(SUM(C1),0), NVL(SUM(C2),0) FROM (SELECT /*+ NO_PARALLEL("TEST") FULL("TEST") NO_PARALLEL_INDEX("TEST") */ 1 AS C1, 1 AS C2 FROM "SYS"."TEST" "TEST") SAMPLESUB
END OF STMT
PARSE #140444679938584:c=1000,e=27630,p=0,cr=0,cu=0,mis=1,r=0,dep=1,og=1,plh=1950795681,tim=1335698913632252
EXEC #140444679938584:c=0,e=44,p=0,cr=0,cu=0,mis=0,r=0,dep=1,og=1,plh=1950795681,tim=1335698913632390

*** 2012-04-29 07:28:33.632
kclscrs: req=0 block=1/89233

*** 2012-04-29 07:28:33.632
kclscrs: bid=1:3:1:0:7:80:1:0:4:0:0:0:1:2:4:1:26:0:0:0:70:1a:0:0:0:0:0:0:0:0:0:0:0:0:0:0:0:0:0:0:4:3:2:1:2:0:3f:0:1c:86:2d:4:0:0:0:0:a2:3c:7c:b:70:1a:0:0:0:0:1:0:7a:f8:76:1d:1:2:dc:5:a9:fe:17:75:0:0:0:0:0:0:0:0:0:0:0:0:63:e5:0:0:0:0:0:0:10:0:0:0
2012-04-29 07:28:33.632578 : kjbcrc[0x15c91.1 76896.0][9]
2012-04-29 07:28:33.632616 : GSIPC:GMBQ: buff 0xba1e8f90, queue 0xbb79f278, pool 0x60013fa0, freeq 1, nxt 0xbb79f278, prv 0xbb79f278
2012-04-29 07:28:33.632634 : kjbmscrc(0x15c91.1)seq 0x2 reqid=0x1c(shadow 0xb4bb4458,reqid x1c)mas@2(infosz 200)(direct 1)
2012-04-29 07:28:33.632654 : kjbsentscn[0x0.12bbc1][to 2]
2012-04-29 07:28:33.632669 : GSIPC:SENDM: send msg 0xba1e9000 dest x20001 seq 24026 type 32 tkts xff0000 mlen x17001a0
2012-04-29 07:28:33.633385 : GSIPC:KSXPCB: msg 0xba1e9000 status 30, type 32, dest 2, rcvr 1

*** 2012-04-29 07:28:33.633
kclwcrs: wait=0 tm=689

*** 2012-04-29 07:28:33.633
kclwcrs: got 1 blocks from ksxprcv
WAIT #140444679938584: nam='gc cr block 2-way' ela= 689 p1=1 p2=89233 p3=1 obj#=76896 tim=1335698913633418
2012-04-29 07:28:33.633490 : kjbcrcomplete[0x15c91.1 76896.0][0]
2012-04-29 07:28:33.633510 : kjbrcvdscn[0x0.12bbc1][from 2][idx 2012-04-29 07:28:33.633527 : kjbrcvdscn[no bscn <= rscn 0x0.12bbc1][from 2]

*** 2012-04-29 07:28:33.633
kclwcrs: req=0 typ=cr(2) wtyp=2hop tm=689

 

 

通过TRACE不难发现 因为之前没有收集过TEST表的统计信息, 所以这里出发了Dynamic Sampling的动态采样,这本身会引发对TEST表的 CR读请求,实际产生了一次’gc cr block 2-way’ 等待:

2012-04-29 07:28:33.632654 : kjbsentscn[0x0.12bbc1][to 2]

12bbc1= 1227713  与上述X$BH中的一个CR块对应,kjbsentscn[0x0.12bbc1][to 2] 可以理解为 向 Instance 2 发送了SCN=12bbc1=1227713   DBA=0x15c91.1 76896.0 的  CR Request(obj#=76896)

之后kjbrcvdscn函数确认了 [no bscn <= rscn 0x0.12bbc1][from 2] ,即没有 比已receive的 SCN Version =12bbc1 更好的Best Version

 

CR Server Arch

CR Server in Oracle9i Release 2 There are three instances involved: the requestor instance, the lock master instance, and the current block owner instance. The lock is granted if one of the following is true: Resource held mode is NULL. Resource held mode is S and there is no holder of an S lock in the master node. Otherwise, the master forwards the CR request to the holder node. If the lock is global, then you choose a node to forward the CR request to as follows: If there is a past image (PI) at the lock master instance, and the PI SCN is greater than snap-scn, then the master node is this node. Otherwise, you choose a PI with the smallest SCN and PI-SCN greater than snap-SCN. The owner node of this PI is the node you forward the CR request to. The PI with smallest SCN is the most interesting one, because you have less UNDO to be applied. If there is no PI at all, you choose the node that the current buffer belongs to.

 

 

动态采样完成后才真正执行了用户发出的SELECT语句:

 

 

PARSING IN CURSOR #140444682869592 len=18 dep=0 uid=0 oct=3 lid=0 tim=1335698913635874 hv=1689401402 ad='b1a188f0' sqlid='c99yw1xkb4f1u'
select * from test
END OF STMT
PARSE #140444682869592:c=4999,e=34017,p=0,cr=7,cu=0,mis=1,r=0,dep=0,og=1,plh=1357081020,tim=1335698913635870
EXEC #140444682869592:c=0,e=23,p=0,cr=0,cu=0,mis=0,r=0,dep=0,og=1,plh=1357081020,tim=1335698913635939
WAIT #140444682869592: nam='SQL*Net message to client' ela= 7 driver id=1650815232 #bytes=1 p3=0 obj#=0 tim=1335698913636071

*** 2012-04-29 07:28:33.636
kclscrs: req=0 block=1/89233

*** 2012-04-29 07:28:33.636
kclscrs: bid=1:3:1:0:7:83:1:0:4:0:0:0:1:2:4:1:26:0:0:0:70:1a:0:0:0:0:0:0:0:0:0:0:0:0:0:0:0:0:0:0:4:3:2:1:2:0:2:0:1c:86:2d:4:0:0:0:0:a2:3c:7c:b:70:1a:0:0:0:0:1:0:7d:f8:76:1d:1:2:dc:5:a9:fe:17:75:0:0:0:0:0:0:0:0:0:0:0:0:63:e5:0:0:0:0:0:0:10:0:0:0
2012-04-29 07:28:33.636209 : kjbcrc[0x15c91.1 76896.0][9]
2012-04-29 07:28:33.636228 : GSIPC:GMBQ: buff 0xba0e5d50, queue 0xbb79f278, pool 0x60013fa0, freeq 1, nxt 0xbb79f278, prv 0xbb79f278
2012-04-29 07:28:33.636244 : kjbmscrc(0x15c91.1)seq 0x3 reqid=0x1d(shadow 0xb4bb4458,reqid x1d)mas@2(infosz 200)(direct 1)
2012-04-29 07:28:33.636252 : kjbsentscn[0x0.12bbc4][to 2]
2012-04-29 07:28:33.636358 : GSIPC:SENDM: send msg 0xba0e5dc0 dest x20001 seq 24029 type 32 tkts xff0000 mlen x17001a0
2012-04-29 07:28:33.636861 : GSIPC:KSXPCB: msg 0xba0e5dc0 status 30, type 32, dest 2, rcvr 1

*** 2012-04-29 07:28:33.637
kclwcrs: wait=0 tm=865

*** 2012-04-29 07:28:33.637
kclwcrs: got 1 blocks from ksxprcv
WAIT #140444682869592: nam='gc cr block 2-way' ela= 865 p1=1 p2=89233 p3=1 obj#=76896 tim=1335698913637294
2012-04-29 07:28:33.637356 : kjbcrcomplete[0x15c91.1 76896.0][0]
2012-04-29 07:28:33.637374 : kjbrcvdscn[0x0.12bbc4][from 2][idx 2012-04-29 07:28:33.637389 : kjbrcvdscn[no bscn <= rscn 0x0.12bbc4][from 2]

*** 2012-04-29 07:28:33.637
kclwcrs: req=0 typ=cr(2) wtyp=2hop tm=865

 

类似的, “SELECT * FROM TEST”也引发了一次’gc cr block 2-way’等待:
2012-04-29 07:28:33.637374 : kjbrcvdscn[0x0.12bbc4][from 2][idx 2012-04-29 07:28:33.637389 : kjbrcvdscn[no bscn

最后Foreground Process从 Remote LMS哪里got的是 SCN=1227716 Version的CR, 同样与 之前我们从X$BH 视图查到的scn对应。

 

这样就可以解释为什么Instance 1上出现了2个SCN更大的CR块, 但仍无法解释原来存在于Instance 1 Buffer Cache中的 三个SCN Version 较小的CR 块消失的原因。

 

我们来看下面的演示:

 

 

 

SQL> alter system set "_enable_minscn_cr"=false scope=spfile;

System altered.

SQL> alter system set "_db_block_max_cr_dba"=20 scope=spfile;

System altered.

SQL> startup force;
ORA-32004: obsolete or deprecated parameter(s) specified for RDBMS instance
ORACLE instance started.

Total System Global Area 1570009088 bytes
Fixed Size                  2228704 bytes
Variable Size             989859360 bytes
Database Buffers          570425344 bytes
Redo Buffers                7495680 bytes
Database mounted.
Database opened.

 

设置以上 “_enable_minscn_cr”=false 和 “_db_block_max_cr_dba”=20 并重启RAC所有实例, 重现上述演示:

 

 

在Instance 2 Session C 中update更新一次数据块 就对应地在Instance 1 中查询一次 ,以反复在Instance 1中Request CR 

SQL> update test set id=id+1 where id=2;              -- Instance 2

1 row updated.

SQL> select * From test;                         -- Instance 1

        ID
----------
         1
         2

下面为 Instance 1的 X$BH记录

select state,cr_scn_bas from x$bh where file#=1 and dbablk=89233 and state!=0;

    STATE CR_SCN_BAS
---------- ----------
         3    1273080
         3    1273071
         3    1273041
         3    1273039
         8          0

SQL>  update test set id=id+1 where id=3;

1 row updated.

SQL> select * From test;

        ID
----------
         1
         2

SQL> select state,cr_scn_bas from x$bh where file#=1 and dbablk=89233 and state!=0;

     STATE CR_SCN_BAS
---------- ----------
         3    1273091
         3    1273080
         3    1273071
         3    1273041
         3    1273039
         8          0

...................

SQL> select state,cr_scn_bas from x$bh where file#=1 and dbablk=89233 and state!=0;

     STATE CR_SCN_BAS
---------- ----------
         3    1273793
         3    1273782
         3    1273780
         3    1273769
         3    1273734
         3    1273715
         3    1273691
         3    1273679
         3    1273670
         3    1273643
         3    1273635
         3    1273623
         3    1273106
         3    1273091
         3    1273080
         3    1273071
         3    1273041
         3    1273039
         3    1273033

19 rows selected.

SQL> select state,cr_scn_bas from x$bh where file#=1 and dbablk=89233 and state!=0;

     STATE CR_SCN_BAS
---------- ----------
         3    1274993

 

 

如上述演示 在设置了 “_enable_minscn_cr”(enable/disable minscn optimization for CR)=false 和 “_db_block_max_cr_dba”=20 (Maximum Allowed Number of CR buffers per dba) 2个 参数后 最多的时候 Instance 1 中缓存了同一个数据块的 多达 19个版本的CR块。

 

“_enable_minscn_cr”是11g以后出现的新隐藏参数,它控制Oracle是否计算CR块的最小SCN,当Foreground Process Receive接收到同一个数据块的更新(SCN更大)的SCN Version CR Block时可能会清除CBC上的 SCN较小的、旧的CR块 , 这样做的目的是减少Buffer Cache中同一个数据块 不同版本SCN Version的CR块的数量,注意不管是 语句级别或者事务级别 其所要求的Snap_Scn 快照 SCN总是 语句或事务开始时的Current SCN, 保留一些旧的CR块虽然可能对一些持续时间长的查询或者游标有益, 但是实例Buffer Cache中 同一个数据块的多版本 CR块的总数量是有限的, 这个总数受到 “_db_block_max_cr_dba” 隐藏参数的控制, 如我们上述演示中设置为20 ,则最多可以在Buffer Cache中缓存多大19个版本的CR块; 注意该”_db_block_max_cr_dba” 参数的默认值为6 , 即一个实例Buffer cache中同一个数据块的CR 版本同时不多于6个。

 

引入”_enable_minscn_cr” 优化CR的最小MINSCN 是有其理由的, 即便那些版本较旧的CR块被新陈代谢了, 但只要需要 Foreground Process还是可以通过CR Request , 要求 Holder Instance LMS 去build一个 BEST CR 来用, 这不消我们去担心。

_external_scn_rejection_threshold_hour

1?Warning - High Database SCN: Current SCN value is 0x0b7b.0008e40b, threshold SCN value is 0x0b75.055dc000
If you have not previously reported this warning on this database, 
please notify Oracle Support so that additional diagnosis can be performed.

2?Warning: The SCN headroom for this database is only NN days!

3?Warning: The SCN headroom for this database is only N hours!

4?WARNING: This patch can not take full effect until this RAC database has been completely shutdown and restarted again.
Oracle recommends that it is done at the earliest convenience.

5?Rejected the attempt to advance SCN over limit by 9374 hours worth to 0x0c00.00000f66, by distributed transaction remote logon,
remote DB: REMDB.XX.ORACLE.COM.
Client info : DB logon user ME, machine yy, program sqlplus@yy (TNS V1-V3), and OS user uuu

6?Rejected the attempt to advance SCN over limit by 9375 hours worth to 0x0c00.000003c6, by distributed transaction logon,
remote DB: REMDB.XX.ORACLE.COM.
Client info : DB logon user TC, machine xx, program oracle@xx (TNS V1-V3), and OS user xxx 

7?Rejected the attempt to advance SCN over limit by 9374 hours worth to 0x0c00.00000f66, by XXXXX
Client info : DB logon user TC, machine mmm, program sqlplus@mmm (TNS V1-V3), and OS user uuu

Where XXXXX is a string such as:
 ? PL/SQL RPC (remote)
 ? sql exec with curSCN
 ? sql exec with outSCN

select version,
       date_time,
       dbms_flashback.get_system_change_number current_scn,
       indicator
  from (select version,
               to_char(SYSDATE, 'YYYY/MM/DD HH24:MI:SS') DATE_TIME,
               ((((((to_number(to_char(sysdate, 'YYYY')) - 1988) * 12 * 31 * 24 * 60 * 60) +
               ((to_number(to_char(sysdate, 'MM')) - 1) * 31 * 24 * 60 * 60) +
               (((to_number(to_char(sysdate, 'DD')) - 1)) * 24 * 60 * 60) +
               (to_number(to_char(sysdate, 'HH24')) * 60 * 60) +
               (to_number(to_char(sysdate, 'MI')) * 60) +
               (to_number(to_char(sysdate, 'SS')))) * (16 * 1024)) -
               dbms_flashback.get_system_change_number) /
               (16 * 1024 * 60 * 60 * 24)) indicator
          from v$instance)

VERSION           DATE_TIME           CURRENT_SCN  INDICATOR
----------------- ------------------- ----------- ----------
11.2.0.3.0        2012/05/17 12:38:49     2775567   9068.525

------------------------------------------------------------
ScnHealthCheck
------------------------------------------------------------
Current Date: 2012/01/17 01:01:09
Current SCN:  384089
Version:      11.1.0.7.0
------------------------------------------------------------
Result: A - SCN Headroom is good
Apply the latest recommended patches
based on your maintenance schedule
AND set _external_scn_rejection_threshold_hours=24 after apply.
For further information review MOS document id 1393363.1
------------------------------------------------------------

In addition to the above result the script output may advise to set the hidden parameter
"_external_scn_rejection_threshold_hours" on some Oracle versions.
The following text gives more information about setting this parameter:

    Set _external_scn_rejection_threshold_hours=24 after apply
    The hidden parameter "_external_scn_rejection_threshold_hours" is introduced in J
    anuary 2012 Critical Patch Update (CPU) and Patch Set Update (PSU) releases (and related bundles).
    Oracle recommends setting this parameter to the value 24 in 10g and 11.1 releases - 
    it does not need to be set in 11.2 releases.
    The parameter is static and so must be set in the init.ora or spfile used to start the instance.
    eg:

    In init.ora:
      # Set threshold on dd/mon/yyyy - See MOS Document 1393363.1
      _external_scn_rejection_threshold_hours = 24 

    In the spfile:
      alter system set "_external_scn_rejection_threshold_hours" = 24
       comment='Set threshold on dd/mon/yyyy - See MOS Document 1393363.1'
       scope=spfile ;

    Why do I need to set "_external_scn_rejection_threshold_hours"=24 ?
    Oracle has many hidden parameters which all have default or derived values, and those parameters are
    not generally intended to be set by customers. Oracle determined that the most suitable value for this
    new hidden parameter is "24" and that this value should be used across all releases.
    10g and 11.1 January 2012 CPU / PSU releases have a different default value compiled in and so this
    setting has to be made explicitly on those releases to ensure that the required value of 24 is used.
    Customers are not expected to tune this value themselves.

[oracle@vrh1 ~]$ oerr ora 19706
19706, 00000, "invalid SCN"
// *Cause:  The input SCN is either not a positive integer or too large.
// *Action: Check the input SCN and make sure it is a valid SCN.

The system change number (SCN) is a logical, internal timestamp used by the Oracle Database.
SCNs order events that occur within the database, which is necessary to satisfy the ACID properties of a transaction.

The database uses SCNs to query and track changes. For example, if a transaction updates a row,
then the database records the SCN at which this update occurred. Other modifications in this transaction typically
have the same SCN. When a transaction commits, the database records an SCN for this commit.
Multiple transactions that commit at the same time may share the same SCN.

SCNs occur in a monotonically increasing sequence, and there is a very large upper
limit to how many SCNs an Oracle Database can use - that limit is currently 281 trillion,
or specifically 281,474,976,710,656 (is 2^48) SCN values.

Given that there is an upper limit, it is important that any given Oracle Database does not
run out of available SCNs. The Oracle Database uses a time based rationing system to ensure that this does not happen.

At any point in time, the Oracle Database calculates a "not to exceed" limit for the number
of SCNs a database can have used, based on the number of seconds elapsed since 1988, multiplied by 16,384. 

This is known as the database's current maximum SCN limit.
Doing this ensures that Oracle Databases will ration SCNs over time,
allowing over 500 years of data processing for any Oracle Database.

The difference between the current SCN the database is using, and the "not to exceed" upper limit,
is known as the SCN headroom. For almost all Oracle Databases, this headroom is constantly increasing every second.

However, Oracle has determined that some software bugs could cause the database to attempt to exceed
the current maximum SCN value (or get closer to the limit than was warranted).

Generally if the database does try to exceed the current maximum SCN value, the transaction that
caused this event would be cancelled by the database, and the application would see an error. 

The next second the limit increases, so typically the application then continues with a slight hiccough in processing.
However, in some very rare cases, the database does need to shut down to preserve its integrity.
In no cases is data lost or corrupted.

Similar to how clocks are kept synchronized in a computer network, when two databases communicate
with each other over a database link, they synchronize their SCNs by picking the largest SCN in use by the two.
So in some cases, databases experienced rapidly decreasing SCN headroom not because of a bug in that specific database,
but because the bug was active in one or more of the databases that database was connected to. Since the database always
rejects SCNs that exceed the current maximum SCN, the provision of being able to run Oracle Databases for more than 500 years
was not affected in any of the cases.

Unix/Linux下删除Oracle控制文件Controlfile为什么实例Instance没有立即奔溃?

绝大多数教科书或网络教材都告诉初学者, Oracle中丢失Controlfile控制文件会造成不可挽回的损失, Oracle Instance实例会立即崩溃crash。

但是实际很多同学在实际操作时发现,在Unix/Linux上删除所有的Controlfile控制文件 后 实例并不会在第一时间crash, 相反居然还可以顺利完成FULL checkpoint 。

 

这是为什么呢?

 

注意这个问题仅适用于Unix/Linux, 在Windows平台上不允许删除 已经被其他进程打开的文件。

 

究其根本原因是在Linux/Unix上Read 或 Write一个文件,进程都会打开一个与此文件相关联的 打开文件描述符 Open File Descriptors (a file descriptor (FD) is an abstract indicator for accessing a file. The term is generally used in POSIX operating systems.In POSIX, a file descriptor is an integer, specifically of the C type int.)

在C语言里 这样的一个File Descriptor以整形的形式出现, 举例来说:

 

进程A 在运行过程中 用 open 函数打开了一个文件/etc/hosts:

 

int  open("/etc/hosts",O_RDONLY);      

此时若open成功则会返回一个 文件描述符, 例如是该进程打开的第一个文件 那么 就返回1
若出错则返回 -1

之后进程A 就可以利用如 open ,write 这样的读写函数 要求对 文件描述符 1相关的实际文件进行读写

当进程不再需要读写文件或 进程退出时 一般使用close函数关闭open File Descriptor

close(1);   -- 例如上面获得的 描述符是 1  ,这里代入1就可以关闭本进程对/etc/hosts的FD

 

 

而在Linux/Unix上当一个文件被删除时,它只是简单地被”unlink”即删除一个硬链接hard link; 实际包含  文件数据的inode 在所有与该文件关联的 打开文件描述符(Open File Descriptors)被关闭之前 都不会被实际删除, 已经获得 打开文件描述符的进程 只要不用close函数关闭掉这些描述符, 都可以继续正常地对文件进行读写。 当所有该文件 管理的 打开文件符都被 关闭时 文件才真正意义上被删除。

 

 “When a file is deleted in linux, it is simply ‘unlinked’. The inode, which contains the file’s data, is not deleted until all processes have finished with it. This is why processes can carry on writing to deleted files.

In Unix, when you remove a file, the file isn’t actually deleted until all open file descriptors to it are closed.”

 

我们来实际还原一下 这个问题现场:

 

SQL> select * from v$version;

BANNER
----------------------------------------------------------------
Oracle Database 10g Enterprise Edition Release 10.2.0.1.0 - 64bi
PL/SQL Release 10.2.0.1.0 - Production
CORE    10.2.0.1.0      Production
TNS for Linux: Version 10.2.0.1.0 - Production
NLSRTL Version 10.2.0.1.0 - Production

SQL> select * from global_name;

GLOBAL_NAME
--------------------------------------------------------------------------------
www.askmac.cn 

SQL> show parameter control_files

NAME                                 TYPE        VALUE
------------------------------------ ----------- ------------------------------
control_files                        string      /s01/oradata/G10R21/controlfil
                                                 e/o1_mf_7ch80zvk_.ctl

SQL> select open_mode from v$database;

OPEN_MODE
----------
READ WRITE

 

以上环境版本为10.2.0.1,只使用一个控制文件, 数据库已打开到READ WRITE MODE。

 

我们找到 Oracle中持续对Controlfile 每3 s做一次 heartbeat心跳的CKPT checkpoint进程的 SPID ,并dump其进程信息:

 

[oracle@vrh8 ~]$ ps -ef|grep ckpt|grep -v grep
oracle    3617     1  0 21:35 ?        00:00:00 ora_ckpt_G10R21

SQL> oradebug setospid 3617;
Oracle pid: 7, Unix process pid: 3617, image: oracle@vrh8.oracle.com (CKPT)

SQL> oradebug dump errorstack 4;
Statement processed.

SQL> oradebug tracefile_name;
/s01/admin/G10R21/bdump/g10r21_ckpt_3617.trc

SQL> Disconnected from Oracle Database 10g Enterprise Edition Release 10.2.0.1.0 - 64bit Production
With the Partitioning, OLAP and Data Mining options

 

检查ckpt进程的errorstack dump可以发现一些有趣的内容

 

[oracle@vrh8 ~]$ vi /s01/admin/G10R21/bdump/g10r21_ckpt_3617.trc

Process global information:
     process: 0x84e58dd0, call: 0x84f96b18, xact: (nil), curses: 0x84f785b0, usrses: 0x84f785b0
  ----------------------------------------
  SO: 0x84e58dd0, type: 2, owner: (nil), flag: INIT/-/-/0x00
  (process) Oracle pid=7, calls cur/top: 0x84f96b18/0x84f96b18, flag: (6) SYSTEM
            int error: 0, call error: 0, sess error: 0, txn error 0
  (post info) last post received: 0 0 24
              last post received-location: ksasnd
              last process to post me: 84e5cd10 5 0
              last post sent: 0 0 25
              last post sent-location: ksasnr
              last process posted by me: 84e5cd10 5 0
    (latch info) wait_event=0 bits=0
    Process Group: DEFAULT, pseudo proc: 0x84ea05f8
    O/S info: user: oracle, term: UNKNOWN, ospid: 3617
    OSD pid info: Unix process pid: 3617, image: oracle@vrh8.oracle.com (CKPT)

    (FOB) flags=2 fib=0x83995e10 incno=0 pending i/o cnt=0
     fname=/s01/oradata/G10R21/datafile/o1_mf_system_7ch8004t_.dbf
     fno=1 lblksz=4 fsiz=86309888
    ----------------------------------------
    SO: 0x83995a70, type: 6, owner: 0x84e58dd0, flag: INIT/-/-/0x00
    (FIB) flags=0 reference cnt=1 incno=0 seqno=1
     fname=/s01/oradata/G10R21/controlfile/o1_mf_7ch80zvk_.ctl
     fno=0 lblksz=4 fsiz=86309888
      (FOB) flags=393226 fib=0x83995a70 incno=0 pending i/o cnt=0
 fname=/s01/oradata/G10R21/controlfile/o1_mf_7ch80zvk_.ctl
       fno=0 lblksz=6 fsiz=86309888

 

 

可以发现这里存在一个FOB 对象其 fname=/s01/oradata/G10R21/controlfile/o1_mf_7ch80zvk_.ctl ==>指向 控制文件

 

在Linux下我们可以通过 proc文件系统了解CKPT进程当前所有 已打开的文件描述符(FD):

 

[oracle@vrh8 ~]$ cd /proc/3617/fd

[oracle@vrh8 fd]$ ls -l
total 0
lr-x------ 1 oracle oinstall 64 Mar  3 21:44 0 -> /dev/null
lr-x------ 1 oracle oinstall 64 Mar  3 21:44 1 -> /dev/null
l-wx------ 1 oracle oinstall 64 Mar  3 21:44 10 -> /s01/admin/G10R21/bdump/alert_G10R21.log
lr-x------ 1 oracle oinstall 64 Mar  3 21:44 11 -> /dev/zero
lr-x------ 1 oracle oinstall 64 Mar  3 21:44 12 -> /dev/zero
lr-x------ 1 oracle oinstall 64 Mar  3 21:44 13 -> /s01/oracle/product/10.2.0/db_1/rdbms/mesg/oraus.msb
lrwx------ 1 oracle oinstall 64 Mar  3 21:44 14 -> /s01/oracle/product/10.2.0/db_1/dbs/hc_G10R21.dat
lrwx------ 1 oracle oinstall 64 Mar  3 21:44 15 -> /s01/oracle/product/10.2.0/db_1/dbs/lkG10R21
lrwx------ 1 oracle oinstall 64 Mar 3 21:44 16 -> /s01/oradata/G10R21/controlfile/o1_mf_7ch80zvk_.ctl
lr-x------ 1 oracle oinstall 64 Mar  3 21:44 17 -> /s01/oracle/product/10.2.0/db_1/rdbms/mesg/oraus.msb
lrwx------ 1 oracle oinstall 64 Mar  3 21:44 18 -> /s01/oradata/G10R21/datafile/o1_mf_system_7ch8004t_.dbf
l-wx------ 1 oracle oinstall 64 Mar  3 21:44 2 -> /s01/admin/G10R21/bdump/g10r21_ckpt_3617.trc
lr-x------ 1 oracle oinstall 64 Mar  3 21:44 3 -> /dev/null
lr-x------ 1 oracle oinstall 64 Mar  3 21:44 4 -> /dev/null
l-wx------ 1 oracle oinstall 64 Mar  3 21:44 5 -> /s01/admin/G10R21/udump/g10r21_ora_3605.trc
lrwx------ 1 oracle oinstall 64 Mar  3 21:44 6 -> /s01/oracle/product/10.2.0/db_1/dbs/lkinstG10R21 (deleted)
lrwx------ 1 oracle oinstall 64 Mar  3 21:44 7 -> /s01/admin/G10R21/adump/ora_3605.aud
l-wx------ 1 oracle oinstall 64 Mar  3 21:44 8 -> /s01/admin/G10R21/bdump/alert_G10R21.log
lrwx------ 1 oracle oinstall 64 Mar  3 21:44 9 -> /s01/oracle/product/10.2.0/db_1/dbs/hc_G10R21.dat

 

可以看到 o1_mf_7ch80zvk_.ctl 这个控制文件的 FD number是 16

 

我们开始利用strace 追踪 CKPT的 SYSTEM CALL , 并尝试删除唯一的一个控制文件

 

[root@vrh8 yum.repos.d]# strace -o ckpt.log -p 3617
Process 3617 attached - interrupt to quit

[oracle@vrh8 fd]$ rm /s01/oradata/G10R21/controlfile/o1_mf_7ch80zvk_.ctl

oracle@vrh8 fd]$ sqlplus / as sysdba

SQL*Plus: Release 10.2.0.1.0 - Production on Sat Mar 3 21:50:29 2012

Copyright (c) 1982, 2005, Oracle.  All rights reserved.

Connected to:
Oracle Database 10g Enterprise Edition Release 10.2.0.1.0 - 64bit Production
With the Partitioning, OLAP and Data Mining options

SQL> alter system checkpoint;

System altered.

SQL> /

System altered.

 

以上可以看到”ALTER SYSTEM CHECKPOINT” 正常完成 ,说明此时做FULL CHECKPOINT完全检查点也没问题。

 

此时用 lsof 命令可以 观察到有那几个进程已打开了o1_mf_7ch80zvk_.ctl这个控制文件这个控制文件的 文件描述符:

 

[root@vrh8 ~]# lsof|grep o1_mf_7ch80zvk_.ctl

oracle    3613    oracle   16u      REG              253,0    12402688    6547239 /s01/oradata/G10R21/controlfile/o1_mf_7ch80zvk_.ctl (deleted)
oracle    3615    oracle   16u      REG              253,0    12402688    6547239 /s01/oradata/G10R21/controlfile/o1_mf_7ch80zvk_.ctl (deleted)
oracle    3617    oracle   16uW     REG              253,0    12402688    6547239 /s01/oradata/G10R21/controlfile/o1_mf_7ch80zvk_.ctl (deleted)
oracle    3625    oracle   19u      REG              253,0    12402688    6547239 /s01/oradata/G10R21/controlfile/o1_mf_7ch80zvk_.ctl (deleted)
oracle    3644    oracle   19u      REG              253,0    12402688    6547239 /s01/oradata/G10R21/controlfile/o1_mf_7ch80zvk_.ctl (deleted)
oracle    3646    oracle   19u      REG              253,0    12402688    6547239 /s01/oradata/G10R21/controlfile/o1_mf_7ch80zvk_.ctl (deleted)

[oracle@vrh8 ~]$ cd /proc/3617/fd

[oracle@vrh8 fd]$ ls -l 16
lrwx------ 1 oracle oinstall 64 Mar  3 21:44 16 -> /s01/oradata/G10R21/controlfile/o1_mf_7ch80zvk_.ctl (deleted)

可以看到(deleted)的标记,说明该文件已经被unlinked,但是其数据仍完好

 

此时我们甚至还可以通过DD 命令将 已被”unlinked”的 控制文件的内容复制出来, 这也是很多 意外删除场景中 实例未crash或重启时一种常用的恢复手段, 很多朋友可能觉得这个方法不可思议, 其实只要了解了原理就可以明白它的局限性。

 

[oracle@vrh8 ~]$ cd /proc/3617/fd

[oracle@vrh8 fd]$ ls -l 16
lrwx------ 1 oracle oinstall 64 Mar  3 21:44 16 -> /s01/oradata/G10R21/controlfile/o1_mf_7ch80zvk_.ctl (deleted)

[oracle@vrh8 fd]$ dd if=16  of=/tmp/controlfile
24224+0 records in
24224+0 records out

12402688 bytes (12 MB) copied, 0.0592615 seconds, 209 MB/s

[oracle@vrh8 fd]$ file /tmp/controlfile
/tmp/controlfile: data

 

我们通过之前一直在做的strace log 可以看到 CKPT仍正常地针对控制文件Controlfiles做 3s一次的heartbeat:

 

gettimeofday({1330830450, 587535}, NULL) = 0
times(NULL)                             = 429642464
pwrite(16, "2
gettimeofday({1330830450, 587535}, NULL) = 0
times(NULL)                             = 429642464
pwrite(16, "\25\302\0\0\3\0\0\0\0\0\0\0\0\0\1\4\211\273\0\0\2\0\0\0\0\0\0\0\f\0\0\0"..., 16384, 49152) = 16384
times(NULL)                             = 429642464
gettimeofday({1330830450, 588689}, NULL) = 0
gettimeofday({1330830450, 588905}, NULL) = 0
gettimeofday({1330830450, 587535}, NULL) = 0
times(NULL)                             = 429642464
pwrite(16, "\25\302\0\0\3\0\0\0\0\0\0\0\0\0\1\4\211\273\0\0\2\0\0\0\0\0\0\0\f\0\0\0"..., 16384, 49152) = 16384
times(NULL)                             = 429642464
gettimeofday({1330830450, 588689}, NULL) = 0
gettimeofday({1330830450, 588905}, NULL) = 0
gettimeofday({1330830450, 587535}, NULL) = 0
times(NULL)                             = 429642464
pwrite(16, "\25\302\0\0\3\0\0\0\0\0\0\0\0\0\1\4\211\273\0\0\2\0\0\0\0\0\0\0\f\0\0\0"..., 16384, 49152) = 16384
times(NULL)                             = 429642464
gettimeofday({1330830450, 588689}, NULL) = 0
gettimeofday({1330830450, 588905}, NULL) = 0
gettimeofday({1330830450, 587535}, NULL) = 0
times(NULL)                             = 429642464
pwrite(16, "\25\302\0\0\3\0\0\0\0\0\0\0\0\0\1\4\211\273\0\0\2\0\0\0\0\0\0\0\f\0\0\0"..., 16384, 49152) = 16384
times(NULL)                             = 429642464
gettimeofday({1330830450, 588689}, NULL) = 0
gettimeofday({1330830450, 588905}, NULL) = 0
gettimeofday({1330830450, 587535}, NULL) = 0
times(NULL)                             = 429642464
pwrite(16, "\25\302\0\0\3\0\0\0\0\0\0\0\0\0\1\4\211\273\0\0\2\0\0\0\0\0\0\0\f\0\0\0"..., 16384, 49152) = 16384
times(NULL)                             = 429642464
gettimeofday({1330830450, 588689}, NULL) = 0
gettimeofday({1330830450, 588905}, NULL) = 0
gettimeofday({1330830450, 587535}, NULL) = 0
times(NULL)                             = 429642464
pwrite(16, "\25\302\0\0\3\0\0\0\0\0\0\0\0\0\1\4\211\273\0\0\2\0\0\0\0\0\0\0\f\0\0\0"..., 16384, 49152) = 16384
times(NULL)                             = 429642464
gettimeofday({1330830450, 588689}, NULL) = 0
gettimeofday({1330830450, 588905}, NULL) = 0
gettimeofday({1330830450, 587535}, NULL) = 0
times(NULL)                             = 429642464
pwrite(16, "\25\302\0\0\3\0\0\0\0\0\0\0\0\0\1\4\211\273\0\0\2\0\0\0\0\0\0\0\f\0\0\0"..., 16384, 49152) = 16384
times(NULL)                             = 429642464
gettimeofday({1330830450, 588689}, NULL) = 0
gettimeofday({1330830450, 588905}, NULL) = 0
gettimeofday({1330830450, 587535}, NULL) = 0
times(NULL)                             = 429642464
pwrite(16, "\25\302\0\0\3\0\0\0\0\0\0\0\0\0\1\4\211\273\0\0\2\0\0\0\0\0\0\0\f\0\0\0"..., 16384, 49152) = 16384
times(NULL)                             = 429642464
gettimeofday({1330830450, 588689}, NULL) = 0
gettimeofday({1330830450, 588905}, NULL) = 0
gettimeofday({1330830450, 587535}, NULL) = 0
times(NULL)                             = 429642464
pwrite(16, "\25\302\0\0\3\0\0\0\0\0\0\0\0\0\1\4\211\273\0\0\2\0\0\0\0\0\0\0\f\0\0\0"..., 16384, 49152) = 16384
times(NULL)                             = 429642464
gettimeofday({1330830450, 588689}, NULL) = 0
gettimeofday({1330830450, 588905}, NULL) = 0
gettimeofday({1330830450, 587535}, NULL) = 0
times(NULL)                             = 429642464
pwrite(16, "\25\302\0\0\3\0\0\0\0\0\0\0\0\0\1\4\211\273\0\0\2\0\0\0\0\0\0\0\f\0\0\0"..., 16384, 49152) = 16384
times(NULL)                             = 429642464
gettimeofday({1330830450, 588689}, NULL) = 0
gettimeofday({1330830450, 588905}, NULL) = 0
gettimeofday({1330830450, 587535}, NULL) = 0
times(NULL)                             = 429642464
pwrite(16, "\25\302\0\0\3\0\0\0\0\0\0\0\0\0\1\4\211\273\0\0\2\0\0\0\0\0\0\0\f\0\0\0"..., 16384, 49152) = 16384
times(NULL)                             = 429642464
gettimeofday({1330830450, 588689}, NULL) = 0
gettimeofday({1330830450, 588905}, NULL) = 0
13
gettimeofday({1330830450, 587535}, NULL) = 0
times(NULL)                             = 429642464
pwrite(16, "\25\302\0\0\3\0\0\0\0\0\0\0\0\0\1\4\211\273\0\0\2\0\0\0\0\0\0\0\f\0\0\0"..., 16384, 49152) = 16384
times(NULL)                             = 429642464
gettimeofday({1330830450, 588689}, NULL) = 0
gettimeofday({1330830450, 588905}, NULL) = 0
gettimeofday({1330830450, 587535}, NULL) = 0
times(NULL)                             = 429642464
pwrite(16, "\25\302\0\0\3\0\0\0\0\0\0\0\0\0\1\4\211\273\0\0\2\0\0\0\0\0\0\0\f\0\0\0"..., 16384, 49152) = 16384
times(NULL)                             = 429642464
gettimeofday({1330830450, 588689}, NULL) = 0
gettimeofday({1330830450, 588905}, NULL) = 0
gettimeofday({1330830450, 587535}, NULL) = 0
times(NULL)                             = 429642464
pwrite(16, "\25\302\0\0\3\0\0\0\0\0\0\0\0\0\1\4\211\273\0\0\2\0\0\0\0\0\0\0\f\0\0\0"..., 16384, 49152) = 16384
times(NULL)                             = 429642464
gettimeofday({1330830450, 588689}, NULL) = 0
gettimeofday({1330830450, 588905}, NULL) = 0
gettimeofday({1330830450, 587535}, NULL) = 0
times(NULL)                             = 429642464
pwrite(16, "\25\302\0\0\3\0\0\0\0\0\0\0\0\0\1\4\211\273\0\0\2\0\0\0\0\0\0\0\f\0\0\0"..., 16384, 49152) = 16384
times(NULL)                             = 429642464
gettimeofday({1330830450, 588689}, NULL) = 0
gettimeofday({1330830450, 588905}, NULL) = 0
gettimeofday({1330830450, 587535}, NULL) = 0
times(NULL)                             = 429642464
pwrite(16, "\25\302\0\0\3\0\0\0\0\0\0\0\0\0\1\4\211\273\0\0\2\0\0\0\0\0\0\0\f\0\0\0"..., 16384, 49152) = 16384
times(NULL)                             = 429642464
gettimeofday({1330830450, 588689}, NULL) = 0
gettimeofday({1330830450, 588905}, NULL) = 0
gettimeofday({1330830450, 587535}, NULL) = 0
times(NULL)                             = 429642464
pwrite(16, "\25\302\0\0\3\0\0\0\0\0\0\0\0\0\1\4\211\273\0\0\2\0\0\0\0\0\0\0\f\0\0\0"..., 16384, 49152) = 16384
times(NULL)                             = 429642464
gettimeofday({1330830450, 588689}, NULL) = 0
gettimeofday({1330830450, 588905}, NULL) = 0
gettimeofday({1330830450, 587535}, NULL) = 0
times(NULL)                             = 429642464
pwrite(16, "\25\302\0\0\3\0\0\0\0\0\0\0\0\0\1\4\211\273\0\0\2\0\0\0\0\0\0\0\f\0\0\0"..., 16384, 49152) = 16384
times(NULL)                             = 429642464
gettimeofday({1330830450, 588689}, NULL) = 0
gettimeofday({1330830450, 588905}, NULL) = 0
gettimeofday({1330830450, 587535}, NULL) = 0
times(NULL)                             = 429642464
pwrite(16, "\25\302\0\0\3\0\0\0\0\0\0\0\0\0\1\4\211\273\0\0\2\0\0\0\0\0\0\0\f\0\0\0"..., 16384, 49152) = 16384
times(NULL)                             = 429642464
gettimeofday({1330830450, 588689}, NULL) = 0
gettimeofday({1330830450, 588905}, NULL) = 0
gettimeofday({1330830450, 587535}, NULL) = 0
times(NULL)                             = 429642464
pwrite(16, "\25\302\0\0\3\0\0\0\0\0\0\0\0\0\1\4\211\273\0\0\2\0\0\0\0\0\0\0\f\0\0\0"..., 16384, 49152) = 16384
times(NULL)                             = 429642464
gettimeofday({1330830450, 588689}, NULL) = 0
gettimeofday({1330830450, 588905}, NULL) = 0
\f
gettimeofday({1330830450, 587535}, NULL) = 0
times(NULL)                             = 429642464
pwrite(16, "\25\302\0\0\3\0\0\0\0\0\0\0\0\0\1\4\211\273\0\0\2\0\0\0\0\0\0\0\f\0\0\0"..., 16384, 49152) = 16384
times(NULL)                             = 429642464
gettimeofday({1330830450, 588689}, NULL) = 0
gettimeofday({1330830450, 588905}, NULL) = 0
gettimeofday({1330830450, 587535}, NULL) = 0
times(NULL)                             = 429642464
pwrite(16, "\25\302\0\0\3\0\0\0\0\0\0\0\0\0\1\4\211\273\0\0\2\0\0\0\0\0\0\0\f\0\0\0"..., 16384, 49152) = 16384
times(NULL)                             = 429642464
gettimeofday({1330830450, 588689}, NULL) = 0
gettimeofday({1330830450, 588905}, NULL) = 0
gettimeofday({1330830450, 587535}, NULL) = 0
times(NULL)                             = 429642464
pwrite(16, "\25\302\0\0\3\0\0\0\0\0\0\0\0\0\1\4\211\273\0\0\2\0\0\0\0\0\0\0\f\0\0\0"..., 16384, 49152) = 16384
times(NULL)                             = 429642464
gettimeofday({1330830450, 588689}, NULL) = 0
gettimeofday({1330830450, 588905}, NULL) = 0
"..., 16384, 49152) = 16384 times(NULL) = 429642464 gettimeofday({1330830450, 588689}, NULL) = 0 gettimeofday({1330830450, 588905}, NULL) = 0

 

 

这里肯定有同学要问了: ” 那么到底何时Oracle的后台进程如CKPT才会释放OPEN FD 打开文件描述符 呢?”

 

答案是:” 视乎后台进程的实际需求,仅在需要时Close Open Descriptor; 实际上close Open Descriptor的机会并不多,甚至于在我们的测试里半个小时内都没有出现close , CKPT运行良好”。

 

实际上我们可以手动促成 后台进程释放指定的一个FD, 注意不要在产品数据库环境中做以下操作, 这仅仅是为了研究:

之前我们已经知道了在CKPT进程中 o1_mf_7ch80zvk_.ctl 控制文件的FD number是 16 , 而CKPT的SPID操作系统进程号是3617;

 

SQL> oradebug setospid 3617;
Oracle pid: 7, Unix process pid: 3617, image: oracle@vrh8.oracle.com (CKPT)

SQL> oradebug call close 16;
Function returned 0

 

以上使用oradebug call在CKPT进程中强制调用close(16),将关闭16 号文件描述符; 关闭后CKPT无法再使用原有的 16 FD,也没有办法再open已经被unlinked的 控制文件, 可以在 strace 日志和 ckpt trace 及 alert告警日志中看到 一连串反应:

 

ckpt strace :

close(16)                               = 0
gettimeofday({1330830800, 213316}, NULL) = 0
times(NULL)                             = 429677421
pwrite(16, "2
ckpt strace :

close(16)                               = 0
gettimeofday({1330830800, 213316}, NULL) = 0
times(NULL)                             = 429677421
pwrite(16, "\25\302\0\0\3\0\0\0\0\0\0\0\0\0\1\4L\273\0\0\2\0\0\0\0\0\0\0\v\0\0\0"..., 16384, 49152) = -1 EBADF (Bad file descriptor)
gettimeofday({1330830800, 213988}, NULL) = 0
==========================================================================================

之后ckpt 会终止实例 , kill掉其他后台进程

times(NULL)                             = 429677424
open("/proc/3670/stat", O_RDONLY)       = 16
read(16, "3670 (oracle) S 1 3670 3670 0 -1"..., 999) = 239
close(16)                               = 0
kill(3670, SIGKILL)                     = 0
kill(3670, SIGCONT)                     = -1 ESRCH (No such process)
open("/proc/3673/stat", O_RDONLY)       = 16
read(16, "3673 (oracle) S 1 3673 3673 0 -1"..., 999) = 239
close(16)                               = 0
kill(3673, SIGKILL)                     = 0
kill(3673, SIGCONT)                     = -1 ESRCH (No such process)
open("/proc/3650/stat", O_RDONLY)       = 16
read(16, "3650 (oracle) S 1 3650 3650 0 -1"..., 999) = 239
..................................................................
open("/s01/admin/G10R21/bdump/alert_G10R21.log", O_WRONLY|O_CREAT|O_APPEND, 0664) = 8
writev(8, [{"Instance terminated by CKPT, pid"..., 39}, {"\n", 1}], 2) = 40
semctl(262145, 0, IPC_RMID, 0)          = 0
close(17)                               = 0
close(13)                               = 0
shmctl(2129924, IPC_RMID, 0)            = 0
shmdt(0x60000000)                       = 0
open("/proc/3617/stat", O_RDONLY)       = 13
read(13, "3617 (oracle) R 1 3617 3617 0 -1"..., 999) = 224
close(13)                               = 0
kill(3617, SIGKILL 

ckpt trace:

*** 2012-03-03 22:13:17.711
Received ORADEBUG command 'call close 16' from process Unix process pid: 3914, image:
ORA-00206: error in writing (block 3, # blocks 1) of control file
ORA-00202: control file: '/s01/oradata/G10R21/controlfile/o1_mf_7ch80zvk_.ctl'
ORA-27072: File I/O error
Linux-x86_64 Error: 9: Bad file descriptor
Additional information: 4
Additional information: 3
Additional information: -1
error 221 detected in background process
ORA-00221: error on write to control file
ORA-00206: error in writing (block 3, # blocks 1) of control file
ORA-00202: control file: '/s01/oradata/G10R21/controlfile/o1_mf_7ch80zvk_.ctl'
ORA-27072: File I/O error
Linux-x86_64 Error: 9: Bad file descriptor
Additional information: 4
Additional information: 3
Additional information: -1

alert log:

Errors in file /s01/admin/G10R21/bdump/g10r21_ckpt_3617.trc:
ORA-00206: error in writing (block 3, # blocks 1) of control file
ORA-00202: control file: '/s01/oradata/G10R21/controlfile/o1_mf_7ch80zvk_.ctl'
ORA-27072: File I/O error
Linux-x86_64 Error: 9: Bad file descriptor
Additional information: 4
Additional information: 3
Additional information: -1
Sat Mar  3 22:13:20 2012
Errors in file /s01/admin/G10R21/bdump/g10r21_ckpt_3617.trc:
ORA-00221: error on write to control file
ORA-00206: error in writing (block 3, # blocks 1) of control file
ORA-00202: control file: '/s01/oradata/G10R21/controlfile/o1_mf_7ch80zvk_.ctl'
ORA-27072: File I/O error
Linux-x86_64 Error: 9: Bad file descriptor
Additional information: 4
Additional information: 3
Additional information: -1
Sat Mar  3 22:13:20 2012
CKPT: terminating instance due to error 221
Sat Mar  3 22:13:20 2012
Errors in file /s01/admin/G10R21/bdump/g10r21_psp0_3609.trc:
ORA-00221: error on write to control file
Sat Mar  3 22:13:20 2012
Errors in file /s01/admin/G10R21/bdump/g10r21_mman_3611.trc:
ORA-00221: error on write to control file
Instance terminated by CKPT, pid = 3617
ckpt strace :

close(16)                               = 0
gettimeofday({1330830800, 213316}, NULL) = 0
times(NULL)                             = 429677421
pwrite(16, "\25\302\0\0\3\0\0\0\0\0\0\0\0\0\1\4L\273\0\0\2\0\0\0\0\0\0\0\v\0\0\0"..., 16384, 49152) = -1 EBADF (Bad file descriptor)
gettimeofday({1330830800, 213988}, NULL) = 0
==========================================================================================

之后ckpt 会终止实例 , kill掉其他后台进程

times(NULL)                             = 429677424
open("/proc/3670/stat", O_RDONLY)       = 16
read(16, "3670 (oracle) S 1 3670 3670 0 -1"..., 999) = 239
close(16)                               = 0
kill(3670, SIGKILL)                     = 0
kill(3670, SIGCONT)                     = -1 ESRCH (No such process)
open("/proc/3673/stat", O_RDONLY)       = 16
read(16, "3673 (oracle) S 1 3673 3673 0 -1"..., 999) = 239
close(16)                               = 0
kill(3673, SIGKILL)                     = 0
kill(3673, SIGCONT)                     = -1 ESRCH (No such process)
open("/proc/3650/stat", O_RDONLY)       = 16
read(16, "3650 (oracle) S 1 3650 3650 0 -1"..., 999) = 239
..................................................................
open("/s01/admin/G10R21/bdump/alert_G10R21.log", O_WRONLY|O_CREAT|O_APPEND, 0664) = 8
writev(8, [{"Instance terminated by CKPT, pid"..., 39}, {"\n", 1}], 2) = 40
semctl(262145, 0, IPC_RMID, 0)          = 0
close(17)                               = 0
close(13)                               = 0
shmctl(2129924, IPC_RMID, 0)            = 0
shmdt(0x60000000)                       = 0
open("/proc/3617/stat", O_RDONLY)       = 13
read(13, "3617 (oracle) R 1 3617 3617 0 -1"..., 999) = 224
close(13)                               = 0
kill(3617, SIGKILL 

ckpt trace:

*** 2012-03-03 22:13:17.711
Received ORADEBUG command 'call close 16' from process Unix process pid: 3914, image:
ORA-00206: error in writing (block 3, # blocks 1) of control file
ORA-00202: control file: '/s01/oradata/G10R21/controlfile/o1_mf_7ch80zvk_.ctl'
ORA-27072: File I/O error
Linux-x86_64 Error: 9: Bad file descriptor
Additional information: 4
Additional information: 3
Additional information: -1
error 221 detected in background process
ORA-00221: error on write to control file
ORA-00206: error in writing (block 3, # blocks 1) of control file
ORA-00202: control file: '/s01/oradata/G10R21/controlfile/o1_mf_7ch80zvk_.ctl'
ORA-27072: File I/O error
Linux-x86_64 Error: 9: Bad file descriptor
Additional information: 4
Additional information: 3
Additional information: -1

alert log:

Errors in file /s01/admin/G10R21/bdump/g10r21_ckpt_3617.trc:
ORA-00206: error in writing (block 3, # blocks 1) of control file
ORA-00202: control file: '/s01/oradata/G10R21/controlfile/o1_mf_7ch80zvk_.ctl'
ORA-27072: File I/O error
Linux-x86_64 Error: 9: Bad file descriptor
Additional information: 4
Additional information: 3
Additional information: -1
Sat Mar  3 22:13:20 2012
Errors in file /s01/admin/G10R21/bdump/g10r21_ckpt_3617.trc:
ORA-00221: error on write to control file
ORA-00206: error in writing (block 3, # blocks 1) of control file
ORA-00202: control file: '/s01/oradata/G10R21/controlfile/o1_mf_7ch80zvk_.ctl'
ORA-27072: File I/O error
Linux-x86_64 Error: 9: Bad file descriptor
Additional information: 4
Additional information: 3
Additional information: -1
Sat Mar  3 22:13:20 2012
CKPT: terminating instance due to error 221
Sat Mar  3 22:13:20 2012
Errors in file /s01/admin/G10R21/bdump/g10r21_psp0_3609.trc:
ORA-00221: error on write to control file
Sat Mar  3 22:13:20 2012
Errors in file /s01/admin/G10R21/bdump/g10r21_mman_3611.trc:
ORA-00221: error on write to control file
Instance terminated by CKPT, pid = 3617
ckpt strace :

close(16)                               = 0
gettimeofday({1330830800, 213316}, NULL) = 0
times(NULL)                             = 429677421
pwrite(16, "\25\302\0\0\3\0\0\0\0\0\0\0\0\0\1\4L\273\0\0\2\0\0\0\0\0\0\0\v\0\0\0"..., 16384, 49152) = -1 EBADF (Bad file descriptor)
gettimeofday({1330830800, 213988}, NULL) = 0
==========================================================================================

之后ckpt 会终止实例 , kill掉其他后台进程

times(NULL)                             = 429677424
open("/proc/3670/stat", O_RDONLY)       = 16
read(16, "3670 (oracle) S 1 3670 3670 0 -1"..., 999) = 239
close(16)                               = 0
kill(3670, SIGKILL)                     = 0
kill(3670, SIGCONT)                     = -1 ESRCH (No such process)
open("/proc/3673/stat", O_RDONLY)       = 16
read(16, "3673 (oracle) S 1 3673 3673 0 -1"..., 999) = 239
close(16)                               = 0
kill(3673, SIGKILL)                     = 0
kill(3673, SIGCONT)                     = -1 ESRCH (No such process)
open("/proc/3650/stat", O_RDONLY)       = 16
read(16, "3650 (oracle) S 1 3650 3650 0 -1"..., 999) = 239
..................................................................
open("/s01/admin/G10R21/bdump/alert_G10R21.log", O_WRONLY|O_CREAT|O_APPEND, 0664) = 8
writev(8, [{"Instance terminated by CKPT, pid"..., 39}, {"\n", 1}], 2) = 40
semctl(262145, 0, IPC_RMID, 0)          = 0
close(17)                               = 0
close(13)                               = 0
shmctl(2129924, IPC_RMID, 0)            = 0
shmdt(0x60000000)                       = 0
open("/proc/3617/stat", O_RDONLY)       = 13
read(13, "3617 (oracle) R 1 3617 3617 0 -1"..., 999) = 224
close(13)                               = 0
kill(3617, SIGKILL 

ckpt trace:

*** 2012-03-03 22:13:17.711
Received ORADEBUG command 'call close 16' from process Unix process pid: 3914, image:
ORA-00206: error in writing (block 3, # blocks 1) of control file
ORA-00202: control file: '/s01/oradata/G10R21/controlfile/o1_mf_7ch80zvk_.ctl'
ORA-27072: File I/O error
Linux-x86_64 Error: 9: Bad file descriptor
Additional information: 4
Additional information: 3
Additional information: -1
error 221 detected in background process
ORA-00221: error on write to control file
ORA-00206: error in writing (block 3, # blocks 1) of control file
ORA-00202: control file: '/s01/oradata/G10R21/controlfile/o1_mf_7ch80zvk_.ctl'
ORA-27072: File I/O error
Linux-x86_64 Error: 9: Bad file descriptor
Additional information: 4
Additional information: 3
Additional information: -1

alert log:

Errors in file /s01/admin/G10R21/bdump/g10r21_ckpt_3617.trc:
ORA-00206: error in writing (block 3, # blocks 1) of control file
ORA-00202: control file: '/s01/oradata/G10R21/controlfile/o1_mf_7ch80zvk_.ctl'
ORA-27072: File I/O error
Linux-x86_64 Error: 9: Bad file descriptor
Additional information: 4
Additional information: 3
Additional information: -1
Sat Mar  3 22:13:20 2012
Errors in file /s01/admin/G10R21/bdump/g10r21_ckpt_3617.trc:
ORA-00221: error on write to control file
ORA-00206: error in writing (block 3, # blocks 1) of control file
ORA-00202: control file: '/s01/oradata/G10R21/controlfile/o1_mf_7ch80zvk_.ctl'
ORA-27072: File I/O error
Linux-x86_64 Error: 9: Bad file descriptor
Additional information: 4
Additional information: 3
Additional information: -1
Sat Mar  3 22:13:20 2012
CKPT: terminating instance due to error 221
Sat Mar  3 22:13:20 2012
Errors in file /s01/admin/G10R21/bdump/g10r21_psp0_3609.trc:
ORA-00221: error on write to control file
Sat Mar  3 22:13:20 2012
Errors in file /s01/admin/G10R21/bdump/g10r21_mman_3611.trc:
ORA-00221: error on write to control file
Instance terminated by CKPT, pid = 3617
ckpt strace :

close(16)                               = 0
gettimeofday({1330830800, 213316}, NULL) = 0
times(NULL)                             = 429677421
pwrite(16, "\25\302\0\0\3\0\0\0\0\0\0\0\0\0\1\4L\273\0\0\2\0\0\0\0\0\0\0\v\0\0\0"..., 16384, 49152) = -1 EBADF (Bad file descriptor)
gettimeofday({1330830800, 213988}, NULL) = 0
==========================================================================================

之后ckpt 会终止实例 , kill掉其他后台进程

times(NULL)                             = 429677424
open("/proc/3670/stat", O_RDONLY)       = 16
read(16, "3670 (oracle) S 1 3670 3670 0 -1"..., 999) = 239
close(16)                               = 0
kill(3670, SIGKILL)                     = 0
kill(3670, SIGCONT)                     = -1 ESRCH (No such process)
open("/proc/3673/stat", O_RDONLY)       = 16
read(16, "3673 (oracle) S 1 3673 3673 0 -1"..., 999) = 239
close(16)                               = 0
kill(3673, SIGKILL)                     = 0
kill(3673, SIGCONT)                     = -1 ESRCH (No such process)
open("/proc/3650/stat", O_RDONLY)       = 16
read(16, "3650 (oracle) S 1 3650 3650 0 -1"..., 999) = 239
..................................................................
open("/s01/admin/G10R21/bdump/alert_G10R21.log", O_WRONLY|O_CREAT|O_APPEND, 0664) = 8
writev(8, [{"Instance terminated by CKPT, pid"..., 39}, {"\n", 1}], 2) = 40
semctl(262145, 0, IPC_RMID, 0)          = 0
close(17)                               = 0
close(13)                               = 0
shmctl(2129924, IPC_RMID, 0)            = 0
shmdt(0x60000000)                       = 0
open("/proc/3617/stat", O_RDONLY)       = 13
read(13, "3617 (oracle) R 1 3617 3617 0 -1"..., 999) = 224
close(13)                               = 0
kill(3617, SIGKILL 

ckpt trace:

*** 2012-03-03 22:13:17.711
Received ORADEBUG command 'call close 16' from process Unix process pid: 3914, image:
ORA-00206: error in writing (block 3, # blocks 1) of control file
ORA-00202: control file: '/s01/oradata/G10R21/controlfile/o1_mf_7ch80zvk_.ctl'
ORA-27072: File I/O error
Linux-x86_64 Error: 9: Bad file descriptor
Additional information: 4
Additional information: 3
Additional information: -1
error 221 detected in background process
ORA-00221: error on write to control file
ORA-00206: error in writing (block 3, # blocks 1) of control file
ORA-00202: control file: '/s01/oradata/G10R21/controlfile/o1_mf_7ch80zvk_.ctl'
ORA-27072: File I/O error
Linux-x86_64 Error: 9: Bad file descriptor
Additional information: 4
Additional information: 3
Additional information: -1

alert log:

Errors in file /s01/admin/G10R21/bdump/g10r21_ckpt_3617.trc:
ORA-00206: error in writing (block 3, # blocks 1) of control file
ORA-00202: control file: '/s01/oradata/G10R21/controlfile/o1_mf_7ch80zvk_.ctl'
ORA-27072: File I/O error
Linux-x86_64 Error: 9: Bad file descriptor
Additional information: 4
Additional information: 3
Additional information: -1
Sat Mar  3 22:13:20 2012
Errors in file /s01/admin/G10R21/bdump/g10r21_ckpt_3617.trc:
ORA-00221: error on write to control file
ORA-00206: error in writing (block 3, # blocks 1) of control file
ORA-00202: control file: '/s01/oradata/G10R21/controlfile/o1_mf_7ch80zvk_.ctl'
ORA-27072: File I/O error
Linux-x86_64 Error: 9: Bad file descriptor
Additional information: 4
Additional information: 3
Additional information: -1
Sat Mar  3 22:13:20 2012
CKPT: terminating instance due to error 221
Sat Mar  3 22:13:20 2012
Errors in file /s01/admin/G10R21/bdump/g10r21_psp0_3609.trc:
ORA-00221: error on write to control file
Sat Mar  3 22:13:20 2012
Errors in file /s01/admin/G10R21/bdump/g10r21_mman_3611.trc:
ORA-00221: error on write to control file
Instance terminated by CKPT, pid = 3617
ckpt strace :

close(16)                               = 0
gettimeofday({1330830800, 213316}, NULL) = 0
times(NULL)                             = 429677421
pwrite(16, "\25\302\0\0\3\0\0\0\0\0\0\0\0\0\1\4L\273\0\0\2\0\0\0\0\0\0\0\v\0\0\0"..., 16384, 49152) = -1 EBADF (Bad file descriptor)
gettimeofday({1330830800, 213988}, NULL) = 0
==========================================================================================

之后ckpt 会终止实例 , kill掉其他后台进程

times(NULL)                             = 429677424
open("/proc/3670/stat", O_RDONLY)       = 16
read(16, "3670 (oracle) S 1 3670 3670 0 -1"..., 999) = 239
close(16)                               = 0
kill(3670, SIGKILL)                     = 0
kill(3670, SIGCONT)                     = -1 ESRCH (No such process)
open("/proc/3673/stat", O_RDONLY)       = 16
read(16, "3673 (oracle) S 1 3673 3673 0 -1"..., 999) = 239
close(16)                               = 0
kill(3673, SIGKILL)                     = 0
kill(3673, SIGCONT)                     = -1 ESRCH (No such process)
open("/proc/3650/stat", O_RDONLY)       = 16
read(16, "3650 (oracle) S 1 3650 3650 0 -1"..., 999) = 239
..................................................................
open("/s01/admin/G10R21/bdump/alert_G10R21.log", O_WRONLY|O_CREAT|O_APPEND, 0664) = 8
writev(8, [{"Instance terminated by CKPT, pid"..., 39}, {"\n", 1}], 2) = 40
semctl(262145, 0, IPC_RMID, 0)          = 0
close(17)                               = 0
close(13)                               = 0
shmctl(2129924, IPC_RMID, 0)            = 0
shmdt(0x60000000)                       = 0
open("/proc/3617/stat", O_RDONLY)       = 13
read(13, "3617 (oracle) R 1 3617 3617 0 -1"..., 999) = 224
close(13)                               = 0
kill(3617, SIGKILL 

ckpt trace:

*** 2012-03-03 22:13:17.711
Received ORADEBUG command 'call close 16' from process Unix process pid: 3914, image:
ORA-00206: error in writing (block 3, # blocks 1) of control file
ORA-00202: control file: '/s01/oradata/G10R21/controlfile/o1_mf_7ch80zvk_.ctl'
ORA-27072: File I/O error
Linux-x86_64 Error: 9: Bad file descriptor
Additional information: 4
Additional information: 3
Additional information: -1
error 221 detected in background process
ORA-00221: error on write to control file
ORA-00206: error in writing (block 3, # blocks 1) of control file
ORA-00202: control file: '/s01/oradata/G10R21/controlfile/o1_mf_7ch80zvk_.ctl'
ORA-27072: File I/O error
Linux-x86_64 Error: 9: Bad file descriptor
Additional information: 4
Additional information: 3
Additional information: -1

alert log:

Errors in file /s01/admin/G10R21/bdump/g10r21_ckpt_3617.trc:
ORA-00206: error in writing (block 3, # blocks 1) of control file
ORA-00202: control file: '/s01/oradata/G10R21/controlfile/o1_mf_7ch80zvk_.ctl'
ORA-27072: File I/O error
Linux-x86_64 Error: 9: Bad file descriptor
Additional information: 4
Additional information: 3
Additional information: -1
Sat Mar  3 22:13:20 2012
Errors in file /s01/admin/G10R21/bdump/g10r21_ckpt_3617.trc:
ORA-00221: error on write to control file
ORA-00206: error in writing (block 3, # blocks 1) of control file
ORA-00202: control file: '/s01/oradata/G10R21/controlfile/o1_mf_7ch80zvk_.ctl'
ORA-27072: File I/O error
Linux-x86_64 Error: 9: Bad file descriptor
Additional information: 4
Additional information: 3
Additional information: -1
Sat Mar  3 22:13:20 2012
CKPT: terminating instance due to error 221
Sat Mar  3 22:13:20 2012
Errors in file /s01/admin/G10R21/bdump/g10r21_psp0_3609.trc:
ORA-00221: error on write to control file
Sat Mar  3 22:13:20 2012
Errors in file /s01/admin/G10R21/bdump/g10r21_mman_3611.trc:
ORA-00221: error on write to control file
Instance terminated by CKPT, pid = 3617
ckpt strace :

close(16)                               = 0
gettimeofday({1330830800, 213316}, NULL) = 0
times(NULL)                             = 429677421
pwrite(16, "\25\302\0\0\3\0\0\0\0\0\0\0\0\0\1\4L\273\0\0\2\0\0\0\0\0\0\0\v\0\0\0"..., 16384, 49152) = -1 EBADF (Bad file descriptor)
gettimeofday({1330830800, 213988}, NULL) = 0
==========================================================================================

之后ckpt 会终止实例 , kill掉其他后台进程

times(NULL)                             = 429677424
open("/proc/3670/stat", O_RDONLY)       = 16
read(16, "3670 (oracle) S 1 3670 3670 0 -1"..., 999) = 239
close(16)                               = 0
kill(3670, SIGKILL)                     = 0
kill(3670, SIGCONT)                     = -1 ESRCH (No such process)
open("/proc/3673/stat", O_RDONLY)       = 16
read(16, "3673 (oracle) S 1 3673 3673 0 -1"..., 999) = 239
close(16)                               = 0
kill(3673, SIGKILL)                     = 0
kill(3673, SIGCONT)                     = -1 ESRCH (No such process)
open("/proc/3650/stat", O_RDONLY)       = 16
read(16, "3650 (oracle) S 1 3650 3650 0 -1"..., 999) = 239
..................................................................
open("/s01/admin/G10R21/bdump/alert_G10R21.log", O_WRONLY|O_CREAT|O_APPEND, 0664) = 8
writev(8, [{"Instance terminated by CKPT, pid"..., 39}, {"\n", 1}], 2) = 40
semctl(262145, 0, IPC_RMID, 0)          = 0
close(17)                               = 0
close(13)                               = 0
shmctl(2129924, IPC_RMID, 0)            = 0
shmdt(0x60000000)                       = 0
open("/proc/3617/stat", O_RDONLY)       = 13
read(13, "3617 (oracle) R 1 3617 3617 0 -1"..., 999) = 224
close(13)                               = 0
kill(3617, SIGKILL 

ckpt trace:

*** 2012-03-03 22:13:17.711
Received ORADEBUG command 'call close 16' from process Unix process pid: 3914, image:
ORA-00206: error in writing (block 3, # blocks 1) of control file
ORA-00202: control file: '/s01/oradata/G10R21/controlfile/o1_mf_7ch80zvk_.ctl'
ORA-27072: File I/O error
Linux-x86_64 Error: 9: Bad file descriptor
Additional information: 4
Additional information: 3
Additional information: -1
error 221 detected in background process
ORA-00221: error on write to control file
ORA-00206: error in writing (block 3, # blocks 1) of control file
ORA-00202: control file: '/s01/oradata/G10R21/controlfile/o1_mf_7ch80zvk_.ctl'
ORA-27072: File I/O error
Linux-x86_64 Error: 9: Bad file descriptor
Additional information: 4
Additional information: 3
Additional information: -1

alert log:

Errors in file /s01/admin/G10R21/bdump/g10r21_ckpt_3617.trc:
ORA-00206: error in writing (block 3, # blocks 1) of control file
ORA-00202: control file: '/s01/oradata/G10R21/controlfile/o1_mf_7ch80zvk_.ctl'
ORA-27072: File I/O error
Linux-x86_64 Error: 9: Bad file descriptor
Additional information: 4
Additional information: 3
Additional information: -1
Sat Mar  3 22:13:20 2012
Errors in file /s01/admin/G10R21/bdump/g10r21_ckpt_3617.trc:
ORA-00221: error on write to control file
ORA-00206: error in writing (block 3, # blocks 1) of control file
ORA-00202: control file: '/s01/oradata/G10R21/controlfile/o1_mf_7ch80zvk_.ctl'
ORA-27072: File I/O error
Linux-x86_64 Error: 9: Bad file descriptor
Additional information: 4
Additional information: 3
Additional information: -1
Sat Mar  3 22:13:20 2012
CKPT: terminating instance due to error 221
Sat Mar  3 22:13:20 2012
Errors in file /s01/admin/G10R21/bdump/g10r21_psp0_3609.trc:
ORA-00221: error on write to control file
Sat Mar  3 22:13:20 2012
Errors in file /s01/admin/G10R21/bdump/g10r21_mman_3611.trc:
ORA-00221: error on write to control file
Instance terminated by CKPT, pid = 3617
ckpt strace :

close(16)                               = 0
gettimeofday({1330830800, 213316}, NULL) = 0
times(NULL)                             = 429677421
pwrite(16, "\25\302\0\0\3\0\0\0\0\0\0\0\0\0\1\4L\273\0\0\2\0\0\0\0\0\0\0\v\0\0\0"..., 16384, 49152) = -1 EBADF (Bad file descriptor)
gettimeofday({1330830800, 213988}, NULL) = 0
==========================================================================================

之后ckpt 会终止实例 , kill掉其他后台进程

times(NULL)                             = 429677424
open("/proc/3670/stat", O_RDONLY)       = 16
read(16, "3670 (oracle) S 1 3670 3670 0 -1"..., 999) = 239
close(16)                               = 0
kill(3670, SIGKILL)                     = 0
kill(3670, SIGCONT)                     = -1 ESRCH (No such process)
open("/proc/3673/stat", O_RDONLY)       = 16
read(16, "3673 (oracle) S 1 3673 3673 0 -1"..., 999) = 239
close(16)                               = 0
kill(3673, SIGKILL)                     = 0
kill(3673, SIGCONT)                     = -1 ESRCH (No such process)
open("/proc/3650/stat", O_RDONLY)       = 16
read(16, "3650 (oracle) S 1 3650 3650 0 -1"..., 999) = 239
..................................................................
open("/s01/admin/G10R21/bdump/alert_G10R21.log", O_WRONLY|O_CREAT|O_APPEND, 0664) = 8
writev(8, [{"Instance terminated by CKPT, pid"..., 39}, {"\n", 1}], 2) = 40
semctl(262145, 0, IPC_RMID, 0)          = 0
close(17)                               = 0
close(13)                               = 0
shmctl(2129924, IPC_RMID, 0)            = 0
shmdt(0x60000000)                       = 0
open("/proc/3617/stat", O_RDONLY)       = 13
read(13, "3617 (oracle) R 1 3617 3617 0 -1"..., 999) = 224
close(13)                               = 0
kill(3617, SIGKILL 

ckpt trace:

*** 2012-03-03 22:13:17.711
Received ORADEBUG command 'call close 16' from process Unix process pid: 3914, image:
ORA-00206: error in writing (block 3, # blocks 1) of control file
ORA-00202: control file: '/s01/oradata/G10R21/controlfile/o1_mf_7ch80zvk_.ctl'
ORA-27072: File I/O error
Linux-x86_64 Error: 9: Bad file descriptor
Additional information: 4
Additional information: 3
Additional information: -1
error 221 detected in background process
ORA-00221: error on write to control file
ORA-00206: error in writing (block 3, # blocks 1) of control file
ORA-00202: control file: '/s01/oradata/G10R21/controlfile/o1_mf_7ch80zvk_.ctl'
ORA-27072: File I/O error
Linux-x86_64 Error: 9: Bad file descriptor
Additional information: 4
Additional information: 3
Additional information: -1

alert log:

Errors in file /s01/admin/G10R21/bdump/g10r21_ckpt_3617.trc:
ORA-00206: error in writing (block 3, # blocks 1) of control file
ORA-00202: control file: '/s01/oradata/G10R21/controlfile/o1_mf_7ch80zvk_.ctl'
ORA-27072: File I/O error
Linux-x86_64 Error: 9: Bad file descriptor
Additional information: 4
Additional information: 3
Additional information: -1
Sat Mar  3 22:13:20 2012
Errors in file /s01/admin/G10R21/bdump/g10r21_ckpt_3617.trc:
ORA-00221: error on write to control file
ORA-00206: error in writing (block 3, # blocks 1) of control file
ORA-00202: control file: '/s01/oradata/G10R21/controlfile/o1_mf_7ch80zvk_.ctl'
ORA-27072: File I/O error
Linux-x86_64 Error: 9: Bad file descriptor
Additional information: 4
Additional information: 3
Additional information: -1
Sat Mar  3 22:13:20 2012
CKPT: terminating instance due to error 221
Sat Mar  3 22:13:20 2012
Errors in file /s01/admin/G10R21/bdump/g10r21_psp0_3609.trc:
ORA-00221: error on write to control file
Sat Mar  3 22:13:20 2012
Errors in file /s01/admin/G10R21/bdump/g10r21_mman_3611.trc:
ORA-00221: error on write to control file
Instance terminated by CKPT, pid = 3617
ckpt strace :

close(16)                               = 0
gettimeofday({1330830800, 213316}, NULL) = 0
times(NULL)                             = 429677421
pwrite(16, "\25\302\0\0\3\0\0\0\0\0\0\0\0\0\1\4L\273\0\0\2\0\0\0\0\0\0\0\v\0\0\0"..., 16384, 49152) = -1 EBADF (Bad file descriptor)
gettimeofday({1330830800, 213988}, NULL) = 0
==========================================================================================

之后ckpt 会终止实例 , kill掉其他后台进程

times(NULL)                             = 429677424
open("/proc/3670/stat", O_RDONLY)       = 16
read(16, "3670 (oracle) S 1 3670 3670 0 -1"..., 999) = 239
close(16)                               = 0
kill(3670, SIGKILL)                     = 0
kill(3670, SIGCONT)                     = -1 ESRCH (No such process)
open("/proc/3673/stat", O_RDONLY)       = 16
read(16, "3673 (oracle) S 1 3673 3673 0 -1"..., 999) = 239
close(16)                               = 0
kill(3673, SIGKILL)                     = 0
kill(3673, SIGCONT)                     = -1 ESRCH (No such process)
open("/proc/3650/stat", O_RDONLY)       = 16
read(16, "3650 (oracle) S 1 3650 3650 0 -1"..., 999) = 239
..................................................................
open("/s01/admin/G10R21/bdump/alert_G10R21.log", O_WRONLY|O_CREAT|O_APPEND, 0664) = 8
writev(8, [{"Instance terminated by CKPT, pid"..., 39}, {"\n", 1}], 2) = 40
semctl(262145, 0, IPC_RMID, 0)          = 0
close(17)                               = 0
close(13)                               = 0
shmctl(2129924, IPC_RMID, 0)            = 0
shmdt(0x60000000)                       = 0
open("/proc/3617/stat", O_RDONLY)       = 13
read(13, "3617 (oracle) R 1 3617 3617 0 -1"..., 999) = 224
close(13)                               = 0
kill(3617, SIGKILL 

ckpt trace:

*** 2012-03-03 22:13:17.711
Received ORADEBUG command 'call close 16' from process Unix process pid: 3914, image:
ORA-00206: error in writing (block 3, # blocks 1) of control file
ORA-00202: control file: '/s01/oradata/G10R21/controlfile/o1_mf_7ch80zvk_.ctl'
ORA-27072: File I/O error
Linux-x86_64 Error: 9: Bad file descriptor
Additional information: 4
Additional information: 3
Additional information: -1
error 221 detected in background process
ORA-00221: error on write to control file
ORA-00206: error in writing (block 3, # blocks 1) of control file
ORA-00202: control file: '/s01/oradata/G10R21/controlfile/o1_mf_7ch80zvk_.ctl'
ORA-27072: File I/O error
Linux-x86_64 Error: 9: Bad file descriptor
Additional information: 4
Additional information: 3
Additional information: -1

alert log:

Errors in file /s01/admin/G10R21/bdump/g10r21_ckpt_3617.trc:
ORA-00206: error in writing (block 3, # blocks 1) of control file
ORA-00202: control file: '/s01/oradata/G10R21/controlfile/o1_mf_7ch80zvk_.ctl'
ORA-27072: File I/O error
Linux-x86_64 Error: 9: Bad file descriptor
Additional information: 4
Additional information: 3
Additional information: -1
Sat Mar  3 22:13:20 2012
Errors in file /s01/admin/G10R21/bdump/g10r21_ckpt_3617.trc:
ORA-00221: error on write to control file
ORA-00206: error in writing (block 3, # blocks 1) of control file
ORA-00202: control file: '/s01/oradata/G10R21/controlfile/o1_mf_7ch80zvk_.ctl'
ORA-27072: File I/O error
Linux-x86_64 Error: 9: Bad file descriptor
Additional information: 4
Additional information: 3
Additional information: -1
Sat Mar  3 22:13:20 2012
CKPT: terminating instance due to error 221
Sat Mar  3 22:13:20 2012
Errors in file /s01/admin/G10R21/bdump/g10r21_psp0_3609.trc:
ORA-00221: error on write to control file
Sat Mar  3 22:13:20 2012
Errors in file /s01/admin/G10R21/bdump/g10r21_mman_3611.trc:
ORA-00221: error on write to control file
Instance terminated by CKPT, pid = 3617
ckpt strace :

close(16)                               = 0
gettimeofday({1330830800, 213316}, NULL) = 0
times(NULL)                             = 429677421
pwrite(16, "\25\302\0\0\3\0\0\0\0\0\0\0\0\0\1\4L\273\0\0\2\0\0\0\0\0\0\0\v\0\0\0"..., 16384, 49152) = -1 EBADF (Bad file descriptor)
gettimeofday({1330830800, 213988}, NULL) = 0
==========================================================================================

之后ckpt 会终止实例 , kill掉其他后台进程

times(NULL)                             = 429677424
open("/proc/3670/stat", O_RDONLY)       = 16
read(16, "3670 (oracle) S 1 3670 3670 0 -1"..., 999) = 239
close(16)                               = 0
kill(3670, SIGKILL)                     = 0
kill(3670, SIGCONT)                     = -1 ESRCH (No such process)
open("/proc/3673/stat", O_RDONLY)       = 16
read(16, "3673 (oracle) S 1 3673 3673 0 -1"..., 999) = 239
close(16)                               = 0
kill(3673, SIGKILL)                     = 0
kill(3673, SIGCONT)                     = -1 ESRCH (No such process)
open("/proc/3650/stat", O_RDONLY)       = 16
read(16, "3650 (oracle) S 1 3650 3650 0 -1"..., 999) = 239
..................................................................
open("/s01/admin/G10R21/bdump/alert_G10R21.log", O_WRONLY|O_CREAT|O_APPEND, 0664) = 8
writev(8, [{"Instance terminated by CKPT, pid"..., 39}, {"\n", 1}], 2) = 40
semctl(262145, 0, IPC_RMID, 0)          = 0
close(17)                               = 0
close(13)                               = 0
shmctl(2129924, IPC_RMID, 0)            = 0
shmdt(0x60000000)                       = 0
open("/proc/3617/stat", O_RDONLY)       = 13
read(13, "3617 (oracle) R 1 3617 3617 0 -1"..., 999) = 224
close(13)                               = 0
kill(3617, SIGKILL 

ckpt trace:

*** 2012-03-03 22:13:17.711
Received ORADEBUG command 'call close 16' from process Unix process pid: 3914, image:
ORA-00206: error in writing (block 3, # blocks 1) of control file
ORA-00202: control file: '/s01/oradata/G10R21/controlfile/o1_mf_7ch80zvk_.ctl'
ORA-27072: File I/O error
Linux-x86_64 Error: 9: Bad file descriptor
Additional information: 4
Additional information: 3
Additional information: -1
error 221 detected in background process
ORA-00221: error on write to control file
ORA-00206: error in writing (block 3, # blocks 1) of control file
ORA-00202: control file: '/s01/oradata/G10R21/controlfile/o1_mf_7ch80zvk_.ctl'
ORA-27072: File I/O error
Linux-x86_64 Error: 9: Bad file descriptor
Additional information: 4
Additional information: 3
Additional information: -1

alert log:

Errors in file /s01/admin/G10R21/bdump/g10r21_ckpt_3617.trc:
ORA-00206: error in writing (block 3, # blocks 1) of control file
ORA-00202: control file: '/s01/oradata/G10R21/controlfile/o1_mf_7ch80zvk_.ctl'
ORA-27072: File I/O error
Linux-x86_64 Error: 9: Bad file descriptor
Additional information: 4
Additional information: 3
Additional information: -1
Sat Mar  3 22:13:20 2012
Errors in file /s01/admin/G10R21/bdump/g10r21_ckpt_3617.trc:
ORA-00221: error on write to control file
ORA-00206: error in writing (block 3, # blocks 1) of control file
ORA-00202: control file: '/s01/oradata/G10R21/controlfile/o1_mf_7ch80zvk_.ctl'
ORA-27072: File I/O error
Linux-x86_64 Error: 9: Bad file descriptor
Additional information: 4
Additional information: 3
Additional information: -1
Sat Mar  3 22:13:20 2012
CKPT: terminating instance due to error 221
Sat Mar  3 22:13:20 2012
Errors in file /s01/admin/G10R21/bdump/g10r21_psp0_3609.trc:
ORA-00221: error on write to control file
Sat Mar  3 22:13:20 2012
Errors in file /s01/admin/G10R21/bdump/g10r21_mman_3611.trc:
ORA-00221: error on write to control file
Instance terminated by CKPT, pid = 3617
ckpt strace :

close(16)                               = 0
gettimeofday({1330830800, 213316}, NULL) = 0
times(NULL)                             = 429677421
pwrite(16, "\25\302\0\0\3\0\0\0\0\0\0\0\0\0\1\4L\273\0\0\2\0\0\0\0\0\0\0\v\0\0\0"..., 16384, 49152) = -1 EBADF (Bad file descriptor)
gettimeofday({1330830800, 213988}, NULL) = 0
==========================================================================================

之后ckpt 会终止实例 , kill掉其他后台进程

times(NULL)                             = 429677424
open("/proc/3670/stat", O_RDONLY)       = 16
read(16, "3670 (oracle) S 1 3670 3670 0 -1"..., 999) = 239
close(16)                               = 0
kill(3670, SIGKILL)                     = 0
kill(3670, SIGCONT)                     = -1 ESRCH (No such process)
open("/proc/3673/stat", O_RDONLY)       = 16
read(16, "3673 (oracle) S 1 3673 3673 0 -1"..., 999) = 239
close(16)                               = 0
kill(3673, SIGKILL)                     = 0
kill(3673, SIGCONT)                     = -1 ESRCH (No such process)
open("/proc/3650/stat", O_RDONLY)       = 16
read(16, "3650 (oracle) S 1 3650 3650 0 -1"..., 999) = 239
..................................................................
open("/s01/admin/G10R21/bdump/alert_G10R21.log", O_WRONLY|O_CREAT|O_APPEND, 0664) = 8
writev(8, [{"Instance terminated by CKPT, pid"..., 39}, {"\n", 1}], 2) = 40
semctl(262145, 0, IPC_RMID, 0)          = 0
close(17)                               = 0
close(13)                               = 0
shmctl(2129924, IPC_RMID, 0)            = 0
shmdt(0x60000000)                       = 0
open("/proc/3617/stat", O_RDONLY)       = 13
read(13, "3617 (oracle) R 1 3617 3617 0 -1"..., 999) = 224
close(13)                               = 0
kill(3617, SIGKILL 

ckpt trace:

*** 2012-03-03 22:13:17.711
Received ORADEBUG command 'call close 16' from process Unix process pid: 3914, image:
ORA-00206: error in writing (block 3, # blocks 1) of control file
ORA-00202: control file: '/s01/oradata/G10R21/controlfile/o1_mf_7ch80zvk_.ctl'
ORA-27072: File I/O error
Linux-x86_64 Error: 9: Bad file descriptor
Additional information: 4
Additional information: 3
Additional information: -1
error 221 detected in background process
ORA-00221: error on write to control file
ORA-00206: error in writing (block 3, # blocks 1) of control file
ORA-00202: control file: '/s01/oradata/G10R21/controlfile/o1_mf_7ch80zvk_.ctl'
ORA-27072: File I/O error
Linux-x86_64 Error: 9: Bad file descriptor
Additional information: 4
Additional information: 3
Additional information: -1

alert log:

Errors in file /s01/admin/G10R21/bdump/g10r21_ckpt_3617.trc:
ORA-00206: error in writing (block 3, # blocks 1) of control file
ORA-00202: control file: '/s01/oradata/G10R21/controlfile/o1_mf_7ch80zvk_.ctl'
ORA-27072: File I/O error
Linux-x86_64 Error: 9: Bad file descriptor
Additional information: 4
Additional information: 3
Additional information: -1
Sat Mar  3 22:13:20 2012
Errors in file /s01/admin/G10R21/bdump/g10r21_ckpt_3617.trc:
ORA-00221: error on write to control file
ORA-00206: error in writing (block 3, # blocks 1) of control file
ORA-00202: control file: '/s01/oradata/G10R21/controlfile/o1_mf_7ch80zvk_.ctl'
ORA-27072: File I/O error
Linux-x86_64 Error: 9: Bad file descriptor
Additional information: 4
Additional information: 3
Additional information: -1
Sat Mar  3 22:13:20 2012
CKPT: terminating instance due to error 221
Sat Mar  3 22:13:20 2012
Errors in file /s01/admin/G10R21/bdump/g10r21_psp0_3609.trc:
ORA-00221: error on write to control file
Sat Mar  3 22:13:20 2012
Errors in file /s01/admin/G10R21/bdump/g10r21_mman_3611.trc:
ORA-00221: error on write to control file
Instance terminated by CKPT, pid = 3617
ckpt strace :

close(16)                               = 0
gettimeofday({1330830800, 213316}, NULL) = 0
times(NULL)                             = 429677421
pwrite(16, "\25\302\0\0\3\0\0\0\0\0\0\0\0\0\1\4L\273\0\0\2\0\0\0\0\0\0\0\v\0\0\0"..., 16384, 49152) = -1 EBADF (Bad file descriptor)
gettimeofday({1330830800, 213988}, NULL) = 0
==========================================================================================

之后ckpt 会终止实例 , kill掉其他后台进程

times(NULL)                             = 429677424
open("/proc/3670/stat", O_RDONLY)       = 16
read(16, "3670 (oracle) S 1 3670 3670 0 -1"..., 999) = 239
close(16)                               = 0
kill(3670, SIGKILL)                     = 0
kill(3670, SIGCONT)                     = -1 ESRCH (No such process)
open("/proc/3673/stat", O_RDONLY)       = 16
read(16, "3673 (oracle) S 1 3673 3673 0 -1"..., 999) = 239
close(16)                               = 0
kill(3673, SIGKILL)                     = 0
kill(3673, SIGCONT)                     = -1 ESRCH (No such process)
open("/proc/3650/stat", O_RDONLY)       = 16
read(16, "3650 (oracle) S 1 3650 3650 0 -1"..., 999) = 239
..................................................................
open("/s01/admin/G10R21/bdump/alert_G10R21.log", O_WRONLY|O_CREAT|O_APPEND, 0664) = 8
writev(8, [{"Instance terminated by CKPT, pid"..., 39}, {"\n", 1}], 2) = 40
semctl(262145, 0, IPC_RMID, 0)          = 0
close(17)                               = 0
close(13)                               = 0
shmctl(2129924, IPC_RMID, 0)            = 0
shmdt(0x60000000)                       = 0
open("/proc/3617/stat", O_RDONLY)       = 13
read(13, "3617 (oracle) R 1 3617 3617 0 -1"..., 999) = 224
close(13)                               = 0
kill(3617, SIGKILL 

ckpt trace:

*** 2012-03-03 22:13:17.711
Received ORADEBUG command 'call close 16' from process Unix process pid: 3914, image:
ORA-00206: error in writing (block 3, # blocks 1) of control file
ORA-00202: control file: '/s01/oradata/G10R21/controlfile/o1_mf_7ch80zvk_.ctl'
ORA-27072: File I/O error
Linux-x86_64 Error: 9: Bad file descriptor
Additional information: 4
Additional information: 3
Additional information: -1
error 221 detected in background process
ORA-00221: error on write to control file
ORA-00206: error in writing (block 3, # blocks 1) of control file
ORA-00202: control file: '/s01/oradata/G10R21/controlfile/o1_mf_7ch80zvk_.ctl'
ORA-27072: File I/O error
Linux-x86_64 Error: 9: Bad file descriptor
Additional information: 4
Additional information: 3
Additional information: -1

alert log:

Errors in file /s01/admin/G10R21/bdump/g10r21_ckpt_3617.trc:
ORA-00206: error in writing (block 3, # blocks 1) of control file
ORA-00202: control file: '/s01/oradata/G10R21/controlfile/o1_mf_7ch80zvk_.ctl'
ORA-27072: File I/O error
Linux-x86_64 Error: 9: Bad file descriptor
Additional information: 4
Additional information: 3
Additional information: -1
Sat Mar  3 22:13:20 2012
Errors in file /s01/admin/G10R21/bdump/g10r21_ckpt_3617.trc:
ORA-00221: error on write to control file
ORA-00206: error in writing (block 3, # blocks 1) of control file
ORA-00202: control file: '/s01/oradata/G10R21/controlfile/o1_mf_7ch80zvk_.ctl'
ORA-27072: File I/O error
Linux-x86_64 Error: 9: Bad file descriptor
Additional information: 4
Additional information: 3
Additional information: -1
Sat Mar  3 22:13:20 2012
CKPT: terminating instance due to error 221
Sat Mar  3 22:13:20 2012
Errors in file /s01/admin/G10R21/bdump/g10r21_psp0_3609.trc:
ORA-00221: error on write to control file
Sat Mar  3 22:13:20 2012
Errors in file /s01/admin/G10R21/bdump/g10r21_mman_3611.trc:
ORA-00221: error on write to control file
Instance terminated by CKPT, pid = 3617
L3
ckpt strace :

close(16)                               = 0
gettimeofday({1330830800, 213316}, NULL) = 0
times(NULL)                             = 429677421
pwrite(16, "\25\302\0\0\3\0\0\0\0\0\0\0\0\0\1\4L\273\0\0\2\0\0\0\0\0\0\0\v\0\0\0"..., 16384, 49152) = -1 EBADF (Bad file descriptor)
gettimeofday({1330830800, 213988}, NULL) = 0
==========================================================================================

之后ckpt 会终止实例 , kill掉其他后台进程

times(NULL)                             = 429677424
open("/proc/3670/stat", O_RDONLY)       = 16
read(16, "3670 (oracle) S 1 3670 3670 0 -1"..., 999) = 239
close(16)                               = 0
kill(3670, SIGKILL)                     = 0
kill(3670, SIGCONT)                     = -1 ESRCH (No such process)
open("/proc/3673/stat", O_RDONLY)       = 16
read(16, "3673 (oracle) S 1 3673 3673 0 -1"..., 999) = 239
close(16)                               = 0
kill(3673, SIGKILL)                     = 0
kill(3673, SIGCONT)                     = -1 ESRCH (No such process)
open("/proc/3650/stat", O_RDONLY)       = 16
read(16, "3650 (oracle) S 1 3650 3650 0 -1"..., 999) = 239
..................................................................
open("/s01/admin/G10R21/bdump/alert_G10R21.log", O_WRONLY|O_CREAT|O_APPEND, 0664) = 8
writev(8, [{"Instance terminated by CKPT, pid"..., 39}, {"\n", 1}], 2) = 40
semctl(262145, 0, IPC_RMID, 0)          = 0
close(17)                               = 0
close(13)                               = 0
shmctl(2129924, IPC_RMID, 0)            = 0
shmdt(0x60000000)                       = 0
open("/proc/3617/stat", O_RDONLY)       = 13
read(13, "3617 (oracle) R 1 3617 3617 0 -1"..., 999) = 224
close(13)                               = 0
kill(3617, SIGKILL 

ckpt trace:

*** 2012-03-03 22:13:17.711
Received ORADEBUG command 'call close 16' from process Unix process pid: 3914, image:
ORA-00206: error in writing (block 3, # blocks 1) of control file
ORA-00202: control file: '/s01/oradata/G10R21/controlfile/o1_mf_7ch80zvk_.ctl'
ORA-27072: File I/O error
Linux-x86_64 Error: 9: Bad file descriptor
Additional information: 4
Additional information: 3
Additional information: -1
error 221 detected in background process
ORA-00221: error on write to control file
ORA-00206: error in writing (block 3, # blocks 1) of control file
ORA-00202: control file: '/s01/oradata/G10R21/controlfile/o1_mf_7ch80zvk_.ctl'
ORA-27072: File I/O error
Linux-x86_64 Error: 9: Bad file descriptor
Additional information: 4
Additional information: 3
Additional information: -1

alert log:

Errors in file /s01/admin/G10R21/bdump/g10r21_ckpt_3617.trc:
ORA-00206: error in writing (block 3, # blocks 1) of control file
ORA-00202: control file: '/s01/oradata/G10R21/controlfile/o1_mf_7ch80zvk_.ctl'
ORA-27072: File I/O error
Linux-x86_64 Error: 9: Bad file descriptor
Additional information: 4
Additional information: 3
Additional information: -1
Sat Mar  3 22:13:20 2012
Errors in file /s01/admin/G10R21/bdump/g10r21_ckpt_3617.trc:
ORA-00221: error on write to control file
ORA-00206: error in writing (block 3, # blocks 1) of control file
ORA-00202: control file: '/s01/oradata/G10R21/controlfile/o1_mf_7ch80zvk_.ctl'
ORA-27072: File I/O error
Linux-x86_64 Error: 9: Bad file descriptor
Additional information: 4
Additional information: 3
Additional information: -1
Sat Mar  3 22:13:20 2012
CKPT: terminating instance due to error 221
Sat Mar  3 22:13:20 2012
Errors in file /s01/admin/G10R21/bdump/g10r21_psp0_3609.trc:
ORA-00221: error on write to control file
Sat Mar  3 22:13:20 2012
Errors in file /s01/admin/G10R21/bdump/g10r21_mman_3611.trc:
ORA-00221: error on write to control file
Instance terminated by CKPT, pid = 3617
ckpt strace :

close(16)                               = 0
gettimeofday({1330830800, 213316}, NULL) = 0
times(NULL)                             = 429677421
pwrite(16, "\25\302\0\0\3\0\0\0\0\0\0\0\0\0\1\4L\273\0\0\2\0\0\0\0\0\0\0\v\0\0\0"..., 16384, 49152) = -1 EBADF (Bad file descriptor)
gettimeofday({1330830800, 213988}, NULL) = 0
==========================================================================================

之后ckpt 会终止实例 , kill掉其他后台进程

times(NULL)                             = 429677424
open("/proc/3670/stat", O_RDONLY)       = 16
read(16, "3670 (oracle) S 1 3670 3670 0 -1"..., 999) = 239
close(16)                               = 0
kill(3670, SIGKILL)                     = 0
kill(3670, SIGCONT)                     = -1 ESRCH (No such process)
open("/proc/3673/stat", O_RDONLY)       = 16
read(16, "3673 (oracle) S 1 3673 3673 0 -1"..., 999) = 239
close(16)                               = 0
kill(3673, SIGKILL)                     = 0
kill(3673, SIGCONT)                     = -1 ESRCH (No such process)
open("/proc/3650/stat", O_RDONLY)       = 16
read(16, "3650 (oracle) S 1 3650 3650 0 -1"..., 999) = 239
..................................................................
open("/s01/admin/G10R21/bdump/alert_G10R21.log", O_WRONLY|O_CREAT|O_APPEND, 0664) = 8
writev(8, [{"Instance terminated by CKPT, pid"..., 39}, {"\n", 1}], 2) = 40
semctl(262145, 0, IPC_RMID, 0)          = 0
close(17)                               = 0
close(13)                               = 0
shmctl(2129924, IPC_RMID, 0)            = 0
shmdt(0x60000000)                       = 0
open("/proc/3617/stat", O_RDONLY)       = 13
read(13, "3617 (oracle) R 1 3617 3617 0 -1"..., 999) = 224
close(13)                               = 0
kill(3617, SIGKILL 

ckpt trace:

*** 2012-03-03 22:13:17.711
Received ORADEBUG command 'call close 16' from process Unix process pid: 3914, image:
ORA-00206: error in writing (block 3, # blocks 1) of control file
ORA-00202: control file: '/s01/oradata/G10R21/controlfile/o1_mf_7ch80zvk_.ctl'
ORA-27072: File I/O error
Linux-x86_64 Error: 9: Bad file descriptor
Additional information: 4
Additional information: 3
Additional information: -1
error 221 detected in background process
ORA-00221: error on write to control file
ORA-00206: error in writing (block 3, # blocks 1) of control file
ORA-00202: control file: '/s01/oradata/G10R21/controlfile/o1_mf_7ch80zvk_.ctl'
ORA-27072: File I/O error
Linux-x86_64 Error: 9: Bad file descriptor
Additional information: 4
Additional information: 3
Additional information: -1

alert log:

Errors in file /s01/admin/G10R21/bdump/g10r21_ckpt_3617.trc:
ORA-00206: error in writing (block 3, # blocks 1) of control file
ORA-00202: control file: '/s01/oradata/G10R21/controlfile/o1_mf_7ch80zvk_.ctl'
ORA-27072: File I/O error
Linux-x86_64 Error: 9: Bad file descriptor
Additional information: 4
Additional information: 3
Additional information: -1
Sat Mar  3 22:13:20 2012
Errors in file /s01/admin/G10R21/bdump/g10r21_ckpt_3617.trc:
ORA-00221: error on write to control file
ORA-00206: error in writing (block 3, # blocks 1) of control file
ORA-00202: control file: '/s01/oradata/G10R21/controlfile/o1_mf_7ch80zvk_.ctl'
ORA-27072: File I/O error
Linux-x86_64 Error: 9: Bad file descriptor
Additional information: 4
Additional information: 3
Additional information: -1
Sat Mar  3 22:13:20 2012
CKPT: terminating instance due to error 221
Sat Mar  3 22:13:20 2012
Errors in file /s01/admin/G10R21/bdump/g10r21_psp0_3609.trc:
ORA-00221: error on write to control file
Sat Mar  3 22:13:20 2012
Errors in file /s01/admin/G10R21/bdump/g10r21_mman_3611.trc:
ORA-00221: error on write to control file
Instance terminated by CKPT, pid = 3617
ckpt strace :

close(16)                               = 0
gettimeofday({1330830800, 213316}, NULL) = 0
times(NULL)                             = 429677421
pwrite(16, "\25\302\0\0\3\0\0\0\0\0\0\0\0\0\1\4L\273\0\0\2\0\0\0\0\0\0\0\v\0\0\0"..., 16384, 49152) = -1 EBADF (Bad file descriptor)
gettimeofday({1330830800, 213988}, NULL) = 0
==========================================================================================

之后ckpt 会终止实例 , kill掉其他后台进程

times(NULL)                             = 429677424
open("/proc/3670/stat", O_RDONLY)       = 16
read(16, "3670 (oracle) S 1 3670 3670 0 -1"..., 999) = 239
close(16)                               = 0
kill(3670, SIGKILL)                     = 0
kill(3670, SIGCONT)                     = -1 ESRCH (No such process)
open("/proc/3673/stat", O_RDONLY)       = 16
read(16, "3673 (oracle) S 1 3673 3673 0 -1"..., 999) = 239
close(16)                               = 0
kill(3673, SIGKILL)                     = 0
kill(3673, SIGCONT)                     = -1 ESRCH (No such process)
open("/proc/3650/stat", O_RDONLY)       = 16
read(16, "3650 (oracle) S 1 3650 3650 0 -1"..., 999) = 239
..................................................................
open("/s01/admin/G10R21/bdump/alert_G10R21.log", O_WRONLY|O_CREAT|O_APPEND, 0664) = 8
writev(8, [{"Instance terminated by CKPT, pid"..., 39}, {"\n", 1}], 2) = 40
semctl(262145, 0, IPC_RMID, 0)          = 0
close(17)                               = 0
close(13)                               = 0
shmctl(2129924, IPC_RMID, 0)            = 0
shmdt(0x60000000)                       = 0
open("/proc/3617/stat", O_RDONLY)       = 13
read(13, "3617 (oracle) R 1 3617 3617 0 -1"..., 999) = 224
close(13)                               = 0
kill(3617, SIGKILL 

ckpt trace:

*** 2012-03-03 22:13:17.711
Received ORADEBUG command 'call close 16' from process Unix process pid: 3914, image:
ORA-00206: error in writing (block 3, # blocks 1) of control file
ORA-00202: control file: '/s01/oradata/G10R21/controlfile/o1_mf_7ch80zvk_.ctl'
ORA-27072: File I/O error
Linux-x86_64 Error: 9: Bad file descriptor
Additional information: 4
Additional information: 3
Additional information: -1
error 221 detected in background process
ORA-00221: error on write to control file
ORA-00206: error in writing (block 3, # blocks 1) of control file
ORA-00202: control file: '/s01/oradata/G10R21/controlfile/o1_mf_7ch80zvk_.ctl'
ORA-27072: File I/O error
Linux-x86_64 Error: 9: Bad file descriptor
Additional information: 4
Additional information: 3
Additional information: -1

alert log:

Errors in file /s01/admin/G10R21/bdump/g10r21_ckpt_3617.trc:
ORA-00206: error in writing (block 3, # blocks 1) of control file
ORA-00202: control file: '/s01/oradata/G10R21/controlfile/o1_mf_7ch80zvk_.ctl'
ORA-27072: File I/O error
Linux-x86_64 Error: 9: Bad file descriptor
Additional information: 4
Additional information: 3
Additional information: -1
Sat Mar  3 22:13:20 2012
Errors in file /s01/admin/G10R21/bdump/g10r21_ckpt_3617.trc:
ORA-00221: error on write to control file
ORA-00206: error in writing (block 3, # blocks 1) of control file
ORA-00202: control file: '/s01/oradata/G10R21/controlfile/o1_mf_7ch80zvk_.ctl'
ORA-27072: File I/O error
Linux-x86_64 Error: 9: Bad file descriptor
Additional information: 4
Additional information: 3
Additional information: -1
Sat Mar  3 22:13:20 2012
CKPT: terminating instance due to error 221
Sat Mar  3 22:13:20 2012
Errors in file /s01/admin/G10R21/bdump/g10r21_psp0_3609.trc:
ORA-00221: error on write to control file
Sat Mar  3 22:13:20 2012
Errors in file /s01/admin/G10R21/bdump/g10r21_mman_3611.trc:
ORA-00221: error on write to control file
Instance terminated by CKPT, pid = 3617
ckpt strace :

close(16)                               = 0
gettimeofday({1330830800, 213316}, NULL) = 0
times(NULL)                             = 429677421
pwrite(16, "\25\302\0\0\3\0\0\0\0\0\0\0\0\0\1\4L\273\0\0\2\0\0\0\0\0\0\0\v\0\0\0"..., 16384, 49152) = -1 EBADF (Bad file descriptor)
gettimeofday({1330830800, 213988}, NULL) = 0
==========================================================================================

之后ckpt 会终止实例 , kill掉其他后台进程

times(NULL)                             = 429677424
open("/proc/3670/stat", O_RDONLY)       = 16
read(16, "3670 (oracle) S 1 3670 3670 0 -1"..., 999) = 239
close(16)                               = 0
kill(3670, SIGKILL)                     = 0
kill(3670, SIGCONT)                     = -1 ESRCH (No such process)
open("/proc/3673/stat", O_RDONLY)       = 16
read(16, "3673 (oracle) S 1 3673 3673 0 -1"..., 999) = 239
close(16)                               = 0
kill(3673, SIGKILL)                     = 0
kill(3673, SIGCONT)                     = -1 ESRCH (No such process)
open("/proc/3650/stat", O_RDONLY)       = 16
read(16, "3650 (oracle) S 1 3650 3650 0 -1"..., 999) = 239
..................................................................
open("/s01/admin/G10R21/bdump/alert_G10R21.log", O_WRONLY|O_CREAT|O_APPEND, 0664) = 8
writev(8, [{"Instance terminated by CKPT, pid"..., 39}, {"\n", 1}], 2) = 40
semctl(262145, 0, IPC_RMID, 0)          = 0
close(17)                               = 0
close(13)                               = 0
shmctl(2129924, IPC_RMID, 0)            = 0
shmdt(0x60000000)                       = 0
open("/proc/3617/stat", O_RDONLY)       = 13
read(13, "3617 (oracle) R 1 3617 3617 0 -1"..., 999) = 224
close(13)                               = 0
kill(3617, SIGKILL 

ckpt trace:

*** 2012-03-03 22:13:17.711
Received ORADEBUG command 'call close 16' from process Unix process pid: 3914, image:
ORA-00206: error in writing (block 3, # blocks 1) of control file
ORA-00202: control file: '/s01/oradata/G10R21/controlfile/o1_mf_7ch80zvk_.ctl'
ORA-27072: File I/O error
Linux-x86_64 Error: 9: Bad file descriptor
Additional information: 4
Additional information: 3
Additional information: -1
error 221 detected in background process
ORA-00221: error on write to control file
ORA-00206: error in writing (block 3, # blocks 1) of control file
ORA-00202: control file: '/s01/oradata/G10R21/controlfile/o1_mf_7ch80zvk_.ctl'
ORA-27072: File I/O error
Linux-x86_64 Error: 9: Bad file descriptor
Additional information: 4
Additional information: 3
Additional information: -1

alert log:

Errors in file /s01/admin/G10R21/bdump/g10r21_ckpt_3617.trc:
ORA-00206: error in writing (block 3, # blocks 1) of control file
ORA-00202: control file: '/s01/oradata/G10R21/controlfile/o1_mf_7ch80zvk_.ctl'
ORA-27072: File I/O error
Linux-x86_64 Error: 9: Bad file descriptor
Additional information: 4
Additional information: 3
Additional information: -1
Sat Mar  3 22:13:20 2012
Errors in file /s01/admin/G10R21/bdump/g10r21_ckpt_3617.trc:
ORA-00221: error on write to control file
ORA-00206: error in writing (block 3, # blocks 1) of control file
ORA-00202: control file: '/s01/oradata/G10R21/controlfile/o1_mf_7ch80zvk_.ctl'
ORA-27072: File I/O error
Linux-x86_64 Error: 9: Bad file descriptor
Additional information: 4
Additional information: 3
Additional information: -1
Sat Mar  3 22:13:20 2012
CKPT: terminating instance due to error 221
Sat Mar  3 22:13:20 2012
Errors in file /s01/admin/G10R21/bdump/g10r21_psp0_3609.trc:
ORA-00221: error on write to control file
Sat Mar  3 22:13:20 2012
Errors in file /s01/admin/G10R21/bdump/g10r21_mman_3611.trc:
ORA-00221: error on write to control file
Instance terminated by CKPT, pid = 3617
ckpt strace :

close(16)                               = 0
gettimeofday({1330830800, 213316}, NULL) = 0
times(NULL)                             = 429677421
pwrite(16, "\25\302\0\0\3\0\0\0\0\0\0\0\0\0\1\4L\273\0\0\2\0\0\0\0\0\0\0\v\0\0\0"..., 16384, 49152) = -1 EBADF (Bad file descriptor)
gettimeofday({1330830800, 213988}, NULL) = 0
==========================================================================================

之后ckpt 会终止实例 , kill掉其他后台进程

times(NULL)                             = 429677424
open("/proc/3670/stat", O_RDONLY)       = 16
read(16, "3670 (oracle) S 1 3670 3670 0 -1"..., 999) = 239
close(16)                               = 0
kill(3670, SIGKILL)                     = 0
kill(3670, SIGCONT)                     = -1 ESRCH (No such process)
open("/proc/3673/stat", O_RDONLY)       = 16
read(16, "3673 (oracle) S 1 3673 3673 0 -1"..., 999) = 239
close(16)                               = 0
kill(3673, SIGKILL)                     = 0
kill(3673, SIGCONT)                     = -1 ESRCH (No such process)
open("/proc/3650/stat", O_RDONLY)       = 16
read(16, "3650 (oracle) S 1 3650 3650 0 -1"..., 999) = 239
..................................................................
open("/s01/admin/G10R21/bdump/alert_G10R21.log", O_WRONLY|O_CREAT|O_APPEND, 0664) = 8
writev(8, [{"Instance terminated by CKPT, pid"..., 39}, {"\n", 1}], 2) = 40
semctl(262145, 0, IPC_RMID, 0)          = 0
close(17)                               = 0
close(13)                               = 0
shmctl(2129924, IPC_RMID, 0)            = 0
shmdt(0x60000000)                       = 0
open("/proc/3617/stat", O_RDONLY)       = 13
read(13, "3617 (oracle) R 1 3617 3617 0 -1"..., 999) = 224
close(13)                               = 0
kill(3617, SIGKILL 

ckpt trace:

*** 2012-03-03 22:13:17.711
Received ORADEBUG command 'call close 16' from process Unix process pid: 3914, image:
ORA-00206: error in writing (block 3, # blocks 1) of control file
ORA-00202: control file: '/s01/oradata/G10R21/controlfile/o1_mf_7ch80zvk_.ctl'
ORA-27072: File I/O error
Linux-x86_64 Error: 9: Bad file descriptor
Additional information: 4
Additional information: 3
Additional information: -1
error 221 detected in background process
ORA-00221: error on write to control file
ORA-00206: error in writing (block 3, # blocks 1) of control file
ORA-00202: control file: '/s01/oradata/G10R21/controlfile/o1_mf_7ch80zvk_.ctl'
ORA-27072: File I/O error
Linux-x86_64 Error: 9: Bad file descriptor
Additional information: 4
Additional information: 3
Additional information: -1

alert log:

Errors in file /s01/admin/G10R21/bdump/g10r21_ckpt_3617.trc:
ORA-00206: error in writing (block 3, # blocks 1) of control file
ORA-00202: control file: '/s01/oradata/G10R21/controlfile/o1_mf_7ch80zvk_.ctl'
ORA-27072: File I/O error
Linux-x86_64 Error: 9: Bad file descriptor
Additional information: 4
Additional information: 3
Additional information: -1
Sat Mar  3 22:13:20 2012
Errors in file /s01/admin/G10R21/bdump/g10r21_ckpt_3617.trc:
ORA-00221: error on write to control file
ORA-00206: error in writing (block 3, # blocks 1) of control file
ORA-00202: control file: '/s01/oradata/G10R21/controlfile/o1_mf_7ch80zvk_.ctl'
ORA-27072: File I/O error
Linux-x86_64 Error: 9: Bad file descriptor
Additional information: 4
Additional information: 3
Additional information: -1
Sat Mar  3 22:13:20 2012
CKPT: terminating instance due to error 221
Sat Mar  3 22:13:20 2012
Errors in file /s01/admin/G10R21/bdump/g10r21_psp0_3609.trc:
ORA-00221: error on write to control file
Sat Mar  3 22:13:20 2012
Errors in file /s01/admin/G10R21/bdump/g10r21_mman_3611.trc:
ORA-00221: error on write to control file
Instance terminated by CKPT, pid = 3617
ckpt strace :

close(16)                               = 0
gettimeofday({1330830800, 213316}, NULL) = 0
times(NULL)                             = 429677421
pwrite(16, "\25\302\0\0\3\0\0\0\0\0\0\0\0\0\1\4L\273\0\0\2\0\0\0\0\0\0\0\v\0\0\0"..., 16384, 49152) = -1 EBADF (Bad file descriptor)
gettimeofday({1330830800, 213988}, NULL) = 0
==========================================================================================

之后ckpt 会终止实例 , kill掉其他后台进程

times(NULL)                             = 429677424
open("/proc/3670/stat", O_RDONLY)       = 16
read(16, "3670 (oracle) S 1 3670 3670 0 -1"..., 999) = 239
close(16)                               = 0
kill(3670, SIGKILL)                     = 0
kill(3670, SIGCONT)                     = -1 ESRCH (No such process)
open("/proc/3673/stat", O_RDONLY)       = 16
read(16, "3673 (oracle) S 1 3673 3673 0 -1"..., 999) = 239
close(16)                               = 0
kill(3673, SIGKILL)                     = 0
kill(3673, SIGCONT)                     = -1 ESRCH (No such process)
open("/proc/3650/stat", O_RDONLY)       = 16
read(16, "3650 (oracle) S 1 3650 3650 0 -1"..., 999) = 239
..................................................................
open("/s01/admin/G10R21/bdump/alert_G10R21.log", O_WRONLY|O_CREAT|O_APPEND, 0664) = 8
writev(8, [{"Instance terminated by CKPT, pid"..., 39}, {"\n", 1}], 2) = 40
semctl(262145, 0, IPC_RMID, 0)          = 0
close(17)                               = 0
close(13)                               = 0
shmctl(2129924, IPC_RMID, 0)            = 0
shmdt(0x60000000)                       = 0
open("/proc/3617/stat", O_RDONLY)       = 13
read(13, "3617 (oracle) R 1 3617 3617 0 -1"..., 999) = 224
close(13)                               = 0
kill(3617, SIGKILL 

ckpt trace:

*** 2012-03-03 22:13:17.711
Received ORADEBUG command 'call close 16' from process Unix process pid: 3914, image:
ORA-00206: error in writing (block 3, # blocks 1) of control file
ORA-00202: control file: '/s01/oradata/G10R21/controlfile/o1_mf_7ch80zvk_.ctl'
ORA-27072: File I/O error
Linux-x86_64 Error: 9: Bad file descriptor
Additional information: 4
Additional information: 3
Additional information: -1
error 221 detected in background process
ORA-00221: error on write to control file
ORA-00206: error in writing (block 3, # blocks 1) of control file
ORA-00202: control file: '/s01/oradata/G10R21/controlfile/o1_mf_7ch80zvk_.ctl'
ORA-27072: File I/O error
Linux-x86_64 Error: 9: Bad file descriptor
Additional information: 4
Additional information: 3
Additional information: -1

alert log:

Errors in file /s01/admin/G10R21/bdump/g10r21_ckpt_3617.trc:
ORA-00206: error in writing (block 3, # blocks 1) of control file
ORA-00202: control file: '/s01/oradata/G10R21/controlfile/o1_mf_7ch80zvk_.ctl'
ORA-27072: File I/O error
Linux-x86_64 Error: 9: Bad file descriptor
Additional information: 4
Additional information: 3
Additional information: -1
Sat Mar  3 22:13:20 2012
Errors in file /s01/admin/G10R21/bdump/g10r21_ckpt_3617.trc:
ORA-00221: error on write to control file
ORA-00206: error in writing (block 3, # blocks 1) of control file
ORA-00202: control file: '/s01/oradata/G10R21/controlfile/o1_mf_7ch80zvk_.ctl'
ORA-27072: File I/O error
Linux-x86_64 Error: 9: Bad file descriptor
Additional information: 4
Additional information: 3
Additional information: -1
Sat Mar  3 22:13:20 2012
CKPT: terminating instance due to error 221
Sat Mar  3 22:13:20 2012
Errors in file /s01/admin/G10R21/bdump/g10r21_psp0_3609.trc:
ORA-00221: error on write to control file
Sat Mar  3 22:13:20 2012
Errors in file /s01/admin/G10R21/bdump/g10r21_mman_3611.trc:
ORA-00221: error on write to control file
Instance terminated by CKPT, pid = 3617
ckpt strace :

close(16)                               = 0
gettimeofday({1330830800, 213316}, NULL) = 0
times(NULL)                             = 429677421
pwrite(16, "\25\302\0\0\3\0\0\0\0\0\0\0\0\0\1\4L\273\0\0\2\0\0\0\0\0\0\0\v\0\0\0"..., 16384, 49152) = -1 EBADF (Bad file descriptor)
gettimeofday({1330830800, 213988}, NULL) = 0
==========================================================================================

之后ckpt 会终止实例 , kill掉其他后台进程

times(NULL)                             = 429677424
open("/proc/3670/stat", O_RDONLY)       = 16
read(16, "3670 (oracle) S 1 3670 3670 0 -1"..., 999) = 239
close(16)                               = 0
kill(3670, SIGKILL)                     = 0
kill(3670, SIGCONT)                     = -1 ESRCH (No such process)
open("/proc/3673/stat", O_RDONLY)       = 16
read(16, "3673 (oracle) S 1 3673 3673 0 -1"..., 999) = 239
close(16)                               = 0
kill(3673, SIGKILL)                     = 0
kill(3673, SIGCONT)                     = -1 ESRCH (No such process)
open("/proc/3650/stat", O_RDONLY)       = 16
read(16, "3650 (oracle) S 1 3650 3650 0 -1"..., 999) = 239
..................................................................
open("/s01/admin/G10R21/bdump/alert_G10R21.log", O_WRONLY|O_CREAT|O_APPEND, 0664) = 8
writev(8, [{"Instance terminated by CKPT, pid"..., 39}, {"\n", 1}], 2) = 40
semctl(262145, 0, IPC_RMID, 0)          = 0
close(17)                               = 0
close(13)                               = 0
shmctl(2129924, IPC_RMID, 0)            = 0
shmdt(0x60000000)                       = 0
open("/proc/3617/stat", O_RDONLY)       = 13
read(13, "3617 (oracle) R 1 3617 3617 0 -1"..., 999) = 224
close(13)                               = 0
kill(3617, SIGKILL 

ckpt trace:

*** 2012-03-03 22:13:17.711
Received ORADEBUG command 'call close 16' from process Unix process pid: 3914, image:
ORA-00206: error in writing (block 3, # blocks 1) of control file
ORA-00202: control file: '/s01/oradata/G10R21/controlfile/o1_mf_7ch80zvk_.ctl'
ORA-27072: File I/O error
Linux-x86_64 Error: 9: Bad file descriptor
Additional information: 4
Additional information: 3
Additional information: -1
error 221 detected in background process
ORA-00221: error on write to control file
ORA-00206: error in writing (block 3, # blocks 1) of control file
ORA-00202: control file: '/s01/oradata/G10R21/controlfile/o1_mf_7ch80zvk_.ctl'
ORA-27072: File I/O error
Linux-x86_64 Error: 9: Bad file descriptor
Additional information: 4
Additional information: 3
Additional information: -1

alert log:

Errors in file /s01/admin/G10R21/bdump/g10r21_ckpt_3617.trc:
ORA-00206: error in writing (block 3, # blocks 1) of control file
ORA-00202: control file: '/s01/oradata/G10R21/controlfile/o1_mf_7ch80zvk_.ctl'
ORA-27072: File I/O error
Linux-x86_64 Error: 9: Bad file descriptor
Additional information: 4
Additional information: 3
Additional information: -1
Sat Mar  3 22:13:20 2012
Errors in file /s01/admin/G10R21/bdump/g10r21_ckpt_3617.trc:
ORA-00221: error on write to control file
ORA-00206: error in writing (block 3, # blocks 1) of control file
ORA-00202: control file: '/s01/oradata/G10R21/controlfile/o1_mf_7ch80zvk_.ctl'
ORA-27072: File I/O error
Linux-x86_64 Error: 9: Bad file descriptor
Additional information: 4
Additional information: 3
Additional information: -1
Sat Mar  3 22:13:20 2012
CKPT: terminating instance due to error 221
Sat Mar  3 22:13:20 2012
Errors in file /s01/admin/G10R21/bdump/g10r21_psp0_3609.trc:
ORA-00221: error on write to control file
Sat Mar  3 22:13:20 2012
Errors in file /s01/admin/G10R21/bdump/g10r21_mman_3611.trc:
ORA-00221: error on write to control file
Instance terminated by CKPT, pid = 3617
ckpt strace :

close(16)                               = 0
gettimeofday({1330830800, 213316}, NULL) = 0
times(NULL)                             = 429677421
pwrite(16, "\25\302\0\0\3\0\0\0\0\0\0\0\0\0\1\4L\273\0\0\2\0\0\0\0\0\0\0\v\0\0\0"..., 16384, 49152) = -1 EBADF (Bad file descriptor)
gettimeofday({1330830800, 213988}, NULL) = 0
==========================================================================================

之后ckpt 会终止实例 , kill掉其他后台进程

times(NULL)                             = 429677424
open("/proc/3670/stat", O_RDONLY)       = 16
read(16, "3670 (oracle) S 1 3670 3670 0 -1"..., 999) = 239
close(16)                               = 0
kill(3670, SIGKILL)                     = 0
kill(3670, SIGCONT)                     = -1 ESRCH (No such process)
open("/proc/3673/stat", O_RDONLY)       = 16
read(16, "3673 (oracle) S 1 3673 3673 0 -1"..., 999) = 239
close(16)                               = 0
kill(3673, SIGKILL)                     = 0
kill(3673, SIGCONT)                     = -1 ESRCH (No such process)
open("/proc/3650/stat", O_RDONLY)       = 16
read(16, "3650 (oracle) S 1 3650 3650 0 -1"..., 999) = 239
..................................................................
open("/s01/admin/G10R21/bdump/alert_G10R21.log", O_WRONLY|O_CREAT|O_APPEND, 0664) = 8
writev(8, [{"Instance terminated by CKPT, pid"..., 39}, {"\n", 1}], 2) = 40
semctl(262145, 0, IPC_RMID, 0)          = 0
close(17)                               = 0
close(13)                               = 0
shmctl(2129924, IPC_RMID, 0)            = 0
shmdt(0x60000000)                       = 0
open("/proc/3617/stat", O_RDONLY)       = 13
read(13, "3617 (oracle) R 1 3617 3617 0 -1"..., 999) = 224
close(13)                               = 0
kill(3617, SIGKILL 

ckpt trace:

*** 2012-03-03 22:13:17.711
Received ORADEBUG command 'call close 16' from process Unix process pid: 3914, image:
ORA-00206: error in writing (block 3, # blocks 1) of control file
ORA-00202: control file: '/s01/oradata/G10R21/controlfile/o1_mf_7ch80zvk_.ctl'
ORA-27072: File I/O error
Linux-x86_64 Error: 9: Bad file descriptor
Additional information: 4
Additional information: 3
Additional information: -1
error 221 detected in background process
ORA-00221: error on write to control file
ORA-00206: error in writing (block 3, # blocks 1) of control file
ORA-00202: control file: '/s01/oradata/G10R21/controlfile/o1_mf_7ch80zvk_.ctl'
ORA-27072: File I/O error
Linux-x86_64 Error: 9: Bad file descriptor
Additional information: 4
Additional information: 3
Additional information: -1

alert log:

Errors in file /s01/admin/G10R21/bdump/g10r21_ckpt_3617.trc:
ORA-00206: error in writing (block 3, # blocks 1) of control file
ORA-00202: control file: '/s01/oradata/G10R21/controlfile/o1_mf_7ch80zvk_.ctl'
ORA-27072: File I/O error
Linux-x86_64 Error: 9: Bad file descriptor
Additional information: 4
Additional information: 3
Additional information: -1
Sat Mar  3 22:13:20 2012
Errors in file /s01/admin/G10R21/bdump/g10r21_ckpt_3617.trc:
ORA-00221: error on write to control file
ORA-00206: error in writing (block 3, # blocks 1) of control file
ORA-00202: control file: '/s01/oradata/G10R21/controlfile/o1_mf_7ch80zvk_.ctl'
ORA-27072: File I/O error
Linux-x86_64 Error: 9: Bad file descriptor
Additional information: 4
Additional information: 3
Additional information: -1
Sat Mar  3 22:13:20 2012
CKPT: terminating instance due to error 221
Sat Mar  3 22:13:20 2012
Errors in file /s01/admin/G10R21/bdump/g10r21_psp0_3609.trc:
ORA-00221: error on write to control file
Sat Mar  3 22:13:20 2012
Errors in file /s01/admin/G10R21/bdump/g10r21_mman_3611.trc:
ORA-00221: error on write to control file
Instance terminated by CKPT, pid = 3617
ckpt strace :

close(16)                               = 0
gettimeofday({1330830800, 213316}, NULL) = 0
times(NULL)                             = 429677421
pwrite(16, "\25\302\0\0\3\0\0\0\0\0\0\0\0\0\1\4L\273\0\0\2\0\0\0\0\0\0\0\v\0\0\0"..., 16384, 49152) = -1 EBADF (Bad file descriptor)
gettimeofday({1330830800, 213988}, NULL) = 0
==========================================================================================

之后ckpt 会终止实例 , kill掉其他后台进程

times(NULL)                             = 429677424
open("/proc/3670/stat", O_RDONLY)       = 16
read(16, "3670 (oracle) S 1 3670 3670 0 -1"..., 999) = 239
close(16)                               = 0
kill(3670, SIGKILL)                     = 0
kill(3670, SIGCONT)                     = -1 ESRCH (No such process)
open("/proc/3673/stat", O_RDONLY)       = 16
read(16, "3673 (oracle) S 1 3673 3673 0 -1"..., 999) = 239
close(16)                               = 0
kill(3673, SIGKILL)                     = 0
kill(3673, SIGCONT)                     = -1 ESRCH (No such process)
open("/proc/3650/stat", O_RDONLY)       = 16
read(16, "3650 (oracle) S 1 3650 3650 0 -1"..., 999) = 239
..................................................................
open("/s01/admin/G10R21/bdump/alert_G10R21.log", O_WRONLY|O_CREAT|O_APPEND, 0664) = 8
writev(8, [{"Instance terminated by CKPT, pid"..., 39}, {"\n", 1}], 2) = 40
semctl(262145, 0, IPC_RMID, 0)          = 0
close(17)                               = 0
close(13)                               = 0
shmctl(2129924, IPC_RMID, 0)            = 0
shmdt(0x60000000)                       = 0
open("/proc/3617/stat", O_RDONLY)       = 13
read(13, "3617 (oracle) R 1 3617 3617 0 -1"..., 999) = 224
close(13)                               = 0
kill(3617, SIGKILL 

ckpt trace:

*** 2012-03-03 22:13:17.711
Received ORADEBUG command 'call close 16' from process Unix process pid: 3914, image:
ORA-00206: error in writing (block 3, # blocks 1) of control file
ORA-00202: control file: '/s01/oradata/G10R21/controlfile/o1_mf_7ch80zvk_.ctl'
ORA-27072: File I/O error
Linux-x86_64 Error: 9: Bad file descriptor
Additional information: 4
Additional information: 3
Additional information: -1
error 221 detected in background process
ORA-00221: error on write to control file
ORA-00206: error in writing (block 3, # blocks 1) of control file
ORA-00202: control file: '/s01/oradata/G10R21/controlfile/o1_mf_7ch80zvk_.ctl'
ORA-27072: File I/O error
Linux-x86_64 Error: 9: Bad file descriptor
Additional information: 4
Additional information: 3
Additional information: -1

alert log:

Errors in file /s01/admin/G10R21/bdump/g10r21_ckpt_3617.trc:
ORA-00206: error in writing (block 3, # blocks 1) of control file
ORA-00202: control file: '/s01/oradata/G10R21/controlfile/o1_mf_7ch80zvk_.ctl'
ORA-27072: File I/O error
Linux-x86_64 Error: 9: Bad file descriptor
Additional information: 4
Additional information: 3
Additional information: -1
Sat Mar  3 22:13:20 2012
Errors in file /s01/admin/G10R21/bdump/g10r21_ckpt_3617.trc:
ORA-00221: error on write to control file
ORA-00206: error in writing (block 3, # blocks 1) of control file
ORA-00202: control file: '/s01/oradata/G10R21/controlfile/o1_mf_7ch80zvk_.ctl'
ORA-27072: File I/O error
Linux-x86_64 Error: 9: Bad file descriptor
Additional information: 4
Additional information: 3
Additional information: -1
Sat Mar  3 22:13:20 2012
CKPT: terminating instance due to error 221
Sat Mar  3 22:13:20 2012
Errors in file /s01/admin/G10R21/bdump/g10r21_psp0_3609.trc:
ORA-00221: error on write to control file
Sat Mar  3 22:13:20 2012
Errors in file /s01/admin/G10R21/bdump/g10r21_mman_3611.trc:
ORA-00221: error on write to control file
Instance terminated by CKPT, pid = 3617
\v
ckpt strace :

close(16)                               = 0
gettimeofday({1330830800, 213316}, NULL) = 0
times(NULL)                             = 429677421
pwrite(16, "\25\302\0\0\3\0\0\0\0\0\0\0\0\0\1\4L\273\0\0\2\0\0\0\0\0\0\0\v\0\0\0"..., 16384, 49152) = -1 EBADF (Bad file descriptor)
gettimeofday({1330830800, 213988}, NULL) = 0
==========================================================================================

之后ckpt 会终止实例 , kill掉其他后台进程

times(NULL)                             = 429677424
open("/proc/3670/stat", O_RDONLY)       = 16
read(16, "3670 (oracle) S 1 3670 3670 0 -1"..., 999) = 239
close(16)                               = 0
kill(3670, SIGKILL)                     = 0
kill(3670, SIGCONT)                     = -1 ESRCH (No such process)
open("/proc/3673/stat", O_RDONLY)       = 16
read(16, "3673 (oracle) S 1 3673 3673 0 -1"..., 999) = 239
close(16)                               = 0
kill(3673, SIGKILL)                     = 0
kill(3673, SIGCONT)                     = -1 ESRCH (No such process)
open("/proc/3650/stat", O_RDONLY)       = 16
read(16, "3650 (oracle) S 1 3650 3650 0 -1"..., 999) = 239
..................................................................
open("/s01/admin/G10R21/bdump/alert_G10R21.log", O_WRONLY|O_CREAT|O_APPEND, 0664) = 8
writev(8, [{"Instance terminated by CKPT, pid"..., 39}, {"\n", 1}], 2) = 40
semctl(262145, 0, IPC_RMID, 0)          = 0
close(17)                               = 0
close(13)                               = 0
shmctl(2129924, IPC_RMID, 0)            = 0
shmdt(0x60000000)                       = 0
open("/proc/3617/stat", O_RDONLY)       = 13
read(13, "3617 (oracle) R 1 3617 3617 0 -1"..., 999) = 224
close(13)                               = 0
kill(3617, SIGKILL 

ckpt trace:

*** 2012-03-03 22:13:17.711
Received ORADEBUG command 'call close 16' from process Unix process pid: 3914, image:
ORA-00206: error in writing (block 3, # blocks 1) of control file
ORA-00202: control file: '/s01/oradata/G10R21/controlfile/o1_mf_7ch80zvk_.ctl'
ORA-27072: File I/O error
Linux-x86_64 Error: 9: Bad file descriptor
Additional information: 4
Additional information: 3
Additional information: -1
error 221 detected in background process
ORA-00221: error on write to control file
ORA-00206: error in writing (block 3, # blocks 1) of control file
ORA-00202: control file: '/s01/oradata/G10R21/controlfile/o1_mf_7ch80zvk_.ctl'
ORA-27072: File I/O error
Linux-x86_64 Error: 9: Bad file descriptor
Additional information: 4
Additional information: 3
Additional information: -1

alert log:

Errors in file /s01/admin/G10R21/bdump/g10r21_ckpt_3617.trc:
ORA-00206: error in writing (block 3, # blocks 1) of control file
ORA-00202: control file: '/s01/oradata/G10R21/controlfile/o1_mf_7ch80zvk_.ctl'
ORA-27072: File I/O error
Linux-x86_64 Error: 9: Bad file descriptor
Additional information: 4
Additional information: 3
Additional information: -1
Sat Mar  3 22:13:20 2012
Errors in file /s01/admin/G10R21/bdump/g10r21_ckpt_3617.trc:
ORA-00221: error on write to control file
ORA-00206: error in writing (block 3, # blocks 1) of control file
ORA-00202: control file: '/s01/oradata/G10R21/controlfile/o1_mf_7ch80zvk_.ctl'
ORA-27072: File I/O error
Linux-x86_64 Error: 9: Bad file descriptor
Additional information: 4
Additional information: 3
Additional information: -1
Sat Mar  3 22:13:20 2012
CKPT: terminating instance due to error 221
Sat Mar  3 22:13:20 2012
Errors in file /s01/admin/G10R21/bdump/g10r21_psp0_3609.trc:
ORA-00221: error on write to control file
Sat Mar  3 22:13:20 2012
Errors in file /s01/admin/G10R21/bdump/g10r21_mman_3611.trc:
ORA-00221: error on write to control file
Instance terminated by CKPT, pid = 3617
ckpt strace :

close(16)                               = 0
gettimeofday({1330830800, 213316}, NULL) = 0
times(NULL)                             = 429677421
pwrite(16, "\25\302\0\0\3\0\0\0\0\0\0\0\0\0\1\4L\273\0\0\2\0\0\0\0\0\0\0\v\0\0\0"..., 16384, 49152) = -1 EBADF (Bad file descriptor)
gettimeofday({1330830800, 213988}, NULL) = 0
==========================================================================================

之后ckpt 会终止实例 , kill掉其他后台进程

times(NULL)                             = 429677424
open("/proc/3670/stat", O_RDONLY)       = 16
read(16, "3670 (oracle) S 1 3670 3670 0 -1"..., 999) = 239
close(16)                               = 0
kill(3670, SIGKILL)                     = 0
kill(3670, SIGCONT)                     = -1 ESRCH (No such process)
open("/proc/3673/stat", O_RDONLY)       = 16
read(16, "3673 (oracle) S 1 3673 3673 0 -1"..., 999) = 239
close(16)                               = 0
kill(3673, SIGKILL)                     = 0
kill(3673, SIGCONT)                     = -1 ESRCH (No such process)
open("/proc/3650/stat", O_RDONLY)       = 16
read(16, "3650 (oracle) S 1 3650 3650 0 -1"..., 999) = 239
..................................................................
open("/s01/admin/G10R21/bdump/alert_G10R21.log", O_WRONLY|O_CREAT|O_APPEND, 0664) = 8
writev(8, [{"Instance terminated by CKPT, pid"..., 39}, {"\n", 1}], 2) = 40
semctl(262145, 0, IPC_RMID, 0)          = 0
close(17)                               = 0
close(13)                               = 0
shmctl(2129924, IPC_RMID, 0)            = 0
shmdt(0x60000000)                       = 0
open("/proc/3617/stat", O_RDONLY)       = 13
read(13, "3617 (oracle) R 1 3617 3617 0 -1"..., 999) = 224
close(13)                               = 0
kill(3617, SIGKILL 

ckpt trace:

*** 2012-03-03 22:13:17.711
Received ORADEBUG command 'call close 16' from process Unix process pid: 3914, image:
ORA-00206: error in writing (block 3, # blocks 1) of control file
ORA-00202: control file: '/s01/oradata/G10R21/controlfile/o1_mf_7ch80zvk_.ctl'
ORA-27072: File I/O error
Linux-x86_64 Error: 9: Bad file descriptor
Additional information: 4
Additional information: 3
Additional information: -1
error 221 detected in background process
ORA-00221: error on write to control file
ORA-00206: error in writing (block 3, # blocks 1) of control file
ORA-00202: control file: '/s01/oradata/G10R21/controlfile/o1_mf_7ch80zvk_.ctl'
ORA-27072: File I/O error
Linux-x86_64 Error: 9: Bad file descriptor
Additional information: 4
Additional information: 3
Additional information: -1

alert log:

Errors in file /s01/admin/G10R21/bdump/g10r21_ckpt_3617.trc:
ORA-00206: error in writing (block 3, # blocks 1) of control file
ORA-00202: control file: '/s01/oradata/G10R21/controlfile/o1_mf_7ch80zvk_.ctl'
ORA-27072: File I/O error
Linux-x86_64 Error: 9: Bad file descriptor
Additional information: 4
Additional information: 3
Additional information: -1
Sat Mar  3 22:13:20 2012
Errors in file /s01/admin/G10R21/bdump/g10r21_ckpt_3617.trc:
ORA-00221: error on write to control file
ORA-00206: error in writing (block 3, # blocks 1) of control file
ORA-00202: control file: '/s01/oradata/G10R21/controlfile/o1_mf_7ch80zvk_.ctl'
ORA-27072: File I/O error
Linux-x86_64 Error: 9: Bad file descriptor
Additional information: 4
Additional information: 3
Additional information: -1
Sat Mar  3 22:13:20 2012
CKPT: terminating instance due to error 221
Sat Mar  3 22:13:20 2012
Errors in file /s01/admin/G10R21/bdump/g10r21_psp0_3609.trc:
ORA-00221: error on write to control file
Sat Mar  3 22:13:20 2012
Errors in file /s01/admin/G10R21/bdump/g10r21_mman_3611.trc:
ORA-00221: error on write to control file
Instance terminated by CKPT, pid = 3617
ckpt strace :

close(16)                               = 0
gettimeofday({1330830800, 213316}, NULL) = 0
times(NULL)                             = 429677421
pwrite(16, "\25\302\0\0\3\0\0\0\0\0\0\0\0\0\1\4L\273\0\0\2\0\0\0\0\0\0\0\v\0\0\0"..., 16384, 49152) = -1 EBADF (Bad file descriptor)
gettimeofday({1330830800, 213988}, NULL) = 0
==========================================================================================

之后ckpt 会终止实例 , kill掉其他后台进程

times(NULL)                             = 429677424
open("/proc/3670/stat", O_RDONLY)       = 16
read(16, "3670 (oracle) S 1 3670 3670 0 -1"..., 999) = 239
close(16)                               = 0
kill(3670, SIGKILL)                     = 0
kill(3670, SIGCONT)                     = -1 ESRCH (No such process)
open("/proc/3673/stat", O_RDONLY)       = 16
read(16, "3673 (oracle) S 1 3673 3673 0 -1"..., 999) = 239
close(16)                               = 0
kill(3673, SIGKILL)                     = 0
kill(3673, SIGCONT)                     = -1 ESRCH (No such process)
open("/proc/3650/stat", O_RDONLY)       = 16
read(16, "3650 (oracle) S 1 3650 3650 0 -1"..., 999) = 239
..................................................................
open("/s01/admin/G10R21/bdump/alert_G10R21.log", O_WRONLY|O_CREAT|O_APPEND, 0664) = 8
writev(8, [{"Instance terminated by CKPT, pid"..., 39}, {"\n", 1}], 2) = 40
semctl(262145, 0, IPC_RMID, 0)          = 0
close(17)                               = 0
close(13)                               = 0
shmctl(2129924, IPC_RMID, 0)            = 0
shmdt(0x60000000)                       = 0
open("/proc/3617/stat", O_RDONLY)       = 13
read(13, "3617 (oracle) R 1 3617 3617 0 -1"..., 999) = 224
close(13)                               = 0
kill(3617, SIGKILL 

ckpt trace:

*** 2012-03-03 22:13:17.711
Received ORADEBUG command 'call close 16' from process Unix process pid: 3914, image:
ORA-00206: error in writing (block 3, # blocks 1) of control file
ORA-00202: control file: '/s01/oradata/G10R21/controlfile/o1_mf_7ch80zvk_.ctl'
ORA-27072: File I/O error
Linux-x86_64 Error: 9: Bad file descriptor
Additional information: 4
Additional information: 3
Additional information: -1
error 221 detected in background process
ORA-00221: error on write to control file
ORA-00206: error in writing (block 3, # blocks 1) of control file
ORA-00202: control file: '/s01/oradata/G10R21/controlfile/o1_mf_7ch80zvk_.ctl'
ORA-27072: File I/O error
Linux-x86_64 Error: 9: Bad file descriptor
Additional information: 4
Additional information: 3
Additional information: -1

alert log:

Errors in file /s01/admin/G10R21/bdump/g10r21_ckpt_3617.trc:
ORA-00206: error in writing (block 3, # blocks 1) of control file
ORA-00202: control file: '/s01/oradata/G10R21/controlfile/o1_mf_7ch80zvk_.ctl'
ORA-27072: File I/O error
Linux-x86_64 Error: 9: Bad file descriptor
Additional information: 4
Additional information: 3
Additional information: -1
Sat Mar  3 22:13:20 2012
Errors in file /s01/admin/G10R21/bdump/g10r21_ckpt_3617.trc:
ORA-00221: error on write to control file
ORA-00206: error in writing (block 3, # blocks 1) of control file
ORA-00202: control file: '/s01/oradata/G10R21/controlfile/o1_mf_7ch80zvk_.ctl'
ORA-27072: File I/O error
Linux-x86_64 Error: 9: Bad file descriptor
Additional information: 4
Additional information: 3
Additional information: -1
Sat Mar  3 22:13:20 2012
CKPT: terminating instance due to error 221
Sat Mar  3 22:13:20 2012
Errors in file /s01/admin/G10R21/bdump/g10r21_psp0_3609.trc:
ORA-00221: error on write to control file
Sat Mar  3 22:13:20 2012
Errors in file /s01/admin/G10R21/bdump/g10r21_mman_3611.trc:
ORA-00221: error on write to control file
Instance terminated by CKPT, pid = 3617
"..., 16384, 49152) = -1 EBADF (Bad file descriptor) gettimeofday({1330830800, 213988}, NULL) = 0 ========================================================================================== 之后ckpt 会终止实例 , kill掉其他后台进程 times(NULL) = 429677424 open("/proc/3670/stat", O_RDONLY) = 16 read(16, "3670 (oracle) S 1 3670 3670 0 -1"..., 999) = 239 close(16) = 0 kill(3670, SIGKILL) = 0 kill(3670, SIGCONT) = -1 ESRCH (No such process) open("/proc/3673/stat", O_RDONLY) = 16 read(16, "3673 (oracle) S 1 3673 3673 0 -1"..., 999) = 239 close(16) = 0 kill(3673, SIGKILL) = 0 kill(3673, SIGCONT) = -1 ESRCH (No such process) open("/proc/3650/stat", O_RDONLY) = 16 read(16, "3650 (oracle) S 1 3650 3650 0 -1"..., 999) = 239 .................................................................. open("/s01/admin/G10R21/bdump/alert_G10R21.log", O_WRONLY|O_CREAT|O_APPEND, 0664) = 8 writev(8, [{"Instance terminated by CKPT, pid"..., 39}, {"\n", 1}], 2) = 40 semctl(262145, 0, IPC_RMID, 0) = 0 close(17) = 0 close(13) = 0 shmctl(2129924, IPC_RMID, 0) = 0 shmdt(0x60000000) = 0 open("/proc/3617/stat", O_RDONLY) = 13 read(13, "3617 (oracle) R 1 3617 3617 0 -1"..., 999) = 224 close(13) = 0 kill(3617, SIGKILL ckpt trace: *** 2012-03-03 22:13:17.711 Received ORADEBUG command 'call close 16' from process Unix process pid: 3914, image: ORA-00206: error in writing (block 3, # blocks 1) of control file ORA-00202: control file: '/s01/oradata/G10R21/controlfile/o1_mf_7ch80zvk_.ctl' ORA-27072: File I/O error Linux-x86_64 Error: 9: Bad file descriptor Additional information: 4 Additional information: 3 Additional information: -1 error 221 detected in background process ORA-00221: error on write to control file ORA-00206: error in writing (block 3, # blocks 1) of control file ORA-00202: control file: '/s01/oradata/G10R21/controlfile/o1_mf_7ch80zvk_.ctl' ORA-27072: File I/O error Linux-x86_64 Error: 9: Bad file descriptor Additional information: 4 Additional information: 3 Additional information: -1 alert log: Errors in file /s01/admin/G10R21/bdump/g10r21_ckpt_3617.trc: ORA-00206: error in writing (block 3, # blocks 1) of control file ORA-00202: control file: '/s01/oradata/G10R21/controlfile/o1_mf_7ch80zvk_.ctl' ORA-27072: File I/O error Linux-x86_64 Error: 9: Bad file descriptor Additional information: 4 Additional information: 3 Additional information: -1 Sat Mar 3 22:13:20 2012 Errors in file /s01/admin/G10R21/bdump/g10r21_ckpt_3617.trc: ORA-00221: error on write to control file ORA-00206: error in writing (block 3, # blocks 1) of control file ORA-00202: control file: '/s01/oradata/G10R21/controlfile/o1_mf_7ch80zvk_.ctl' ORA-27072: File I/O error Linux-x86_64 Error: 9: Bad file descriptor Additional information: 4 Additional information: 3 Additional information: -1 Sat Mar 3 22:13:20 2012 CKPT: terminating instance due to error 221 Sat Mar 3 22:13:20 2012 Errors in file /s01/admin/G10R21/bdump/g10r21_psp0_3609.trc: ORA-00221: error on write to control file Sat Mar 3 22:13:20 2012 Errors in file /s01/admin/G10R21/bdump/g10r21_mman_3611.trc: ORA-00221: error on write to control file Instance terminated by CKPT, pid = 3617

 

 

这个例子告诉我们 oracle数据库实例虽然很复杂, 有时候甚至”神奇” , 但是在Unix上它本质上和其他简单如shell的程序一样是作为用户向OS 申请资源,处理IO的。 要充分理解Oracle为什么这样设计,在运行时又为什么那样处理, 那么肯定要对Unix环境的原理有所了解。 包括: 文件I/O 、进程控制和环境、信号、进程间通信等等知识。 我本人推荐在学习Oracle之余 可以对Unix上的C语言编程有所涉猎, 推荐大家读一读 Richard Stevens的《Unix环境高级编程(第2版)》,这本书教会了我 Unix上的思考方式。

Oracle Internal Event:10235 Heap Checking诊断事件

10235 (check memory manager internal structures) event内部诊断事件,当oracle SGA内存堆heap 管理操作发生错误时会触发该检测事件(For the sga heap, the internal error is signaled only when heap checking (event 10235) is enabled),其级别列表如下:

 

[oracle@nas ~]$ oerr ora 10235
10235, 00000, "check memory manager internal structures"
// *Cause:
// *Action:

List of Levels for Event 10235
Level 1: Check heap before heap is freed.
Level 2: Level 1 + fill allocation and frees with 0xff.
Level 3: level 2 + checks to make sure chunk belongs to heap prior to operation (free, grow, make chunk as
freeable with mark). Scan of extents, could get expensive
Level 4: level 3 + allocates permanent chunks as freeable chunks. This accomplishes two things:
all permanent allocations are in their own chunks and the comments for the permanent chunks appear in a heap dump.
The following levels should be OR'ed in together. For example, in order to use levels 2, 8 and 16,
event 10235 needs to be set to level 26.
Level 8: Check heap on every operation to the heap.
Level 16: Level 8 + check top PGA heap and SGA heap.
Level 32: Level 16 + check all heaps in the top pga heap, recursively; check SGA heap.
Level 64: Level 32 + check SGA heap if started up in single_process=true.
Level 256: align chunks at the bottom of a page and page protect the page that follows the chunk,
to catch writes off the end of a chunk, for all heaps.
Level 512: Like level 256, except for all heaps in the pga heap; so this event is useful for detecting
corruption in the pga heap and pga subheaps. This can also be enabled on specific heaps or chunks with
specific comments.
Level 65536: Enable CPM(Commented PerManent chunk) Keep comment for permanent allocation. It's useful for
detecting memory growth/overrun in permanent space. If you need to investigate the allocations for "library cache",
The combination of event 10049 level 10100 and this event will help to have more descriptive information.

 

注意只有当 怀疑oracle实例存在内存讹误(memory corruption)或者深入研究KGH堆管理内部原理的时候才有必要使用该10235事件,不要在生产库设置该诊断事件!! 不推荐在session会话级别设置该event 可能引起问题,具体的设置方法如下:

 

alter system set event=' 10235 trace name context forever,level 512 ' scope=spfile;
restart instance;

 

 

实际上绝大多数情况下我们不会用到10235 event, 即便是提交有关内存泄露(memory leak)或者内存讹误(memory corruption)的SR后 oracle Support 要求你上传一些heapdump的trace信息, 直接做heapdump转储也已经足够了, 具体用法如下:

 

 

alter session set event 'trace name immediate heapdump level <n>';

level 级别列表如下:

 1:          pga heap,  1025:            pga heap w/ contents
 2:          sga heap,  2050:            sga heap w/ contents
 4:          uga heap,  5000:            uga heap w/ contents
 8:          current call heap,  8200: current call heap w/ contents
16:         user call heap, 16400:     user call heap w/ contents
32:         large alloc heap, 32800:  large alloc heap w/ contents

若希望dump转储某个特定的subheap ,则先要知道该heap descriptor 的address地址

alter session set event 'trace name immediate headump_addr level <addr>';

Oracle为何会发生归档日志archivelog大小远小于联机重做日志online redo log size的情况?

《rac不到redo大小就切换是什么原因?》

 

这个帖子详细讨论了RAC或单机环境中归档日志没到online redolog size 大小就马上切换问题的成因, 主要是由于oracle内部算法所决定的,  一般不会造成实例 性能或可用性问题, 唯一的困扰可能是 生成了较多的小归档日志。

 

Know more about RBA redo block address

A “Redo Block Address” (RBA) describes a physical location within a redo log file. An RBA consists of the following:

Log sequence number
Block number within log file
Byte offset within block

The structure of the redo block address is given below :
1 – The number of the redo log.
[4 bytes]

2 – The block offset from the beginning of the redo log file.
[4 bytes]

3 – The byte offset within the block.
[2 bytes]

Log sequence number v$logmnr_contents.RBASQN
Block number within log file v$logmnr_contents.RBABLK
Byte offset within block v$logmnr_contents.RBABYTE
So in this case, RBA [0x19.2.10] maps to Log squence 25, Block number 2 with byte offset 16.
With this information you should be able to find the details from v$logmnr_contents.

In particular, the buffer header information includes two fields: low-RBA and high-SCN. The low-RBA field indicates the point in the redo-log corresponding to the block’s first update after it came into the buffer cache. The high-SCN field indicates the SCN of the last update to the block.

***************************************************************************
CHECKPOINT PROGRESS RECORDS
***************************************************************************
 (size = 8180, compat size = 8180, section max = 11, section in-use = 0,
  last-recid= 0, old-recno = 0, last-recno = 0)
 (extent = 1, blkno = 2, numrecs = 11)
THREAD #1 - status:0x2 flags:0x0 dirty:47
low cache rba:(0x37a.11c.0) on disk rba:(0x37a.151.0)
on disk scn: 0x0000.0045d2f2 12/19/2011 01:03:21
resetlogs scn: 0x0000.00080634 11/07/2011 04:08:50
heartbeat: 770246919 mount id: 2811012380
THREAD #2 - status:0x0 flags:0x0 dirty:0
low cache rba:(0x0.0.0) on disk rba:(0x0.0.0)
on disk scn: 0x0000.00000000 01/01/1988 00:00:00
resetlogs scn: 0x0000.00000000 01/01/1988 00:00:00
heartbeat: 0 mount id: 0

ORA-00600[kjpsod1]&ORA-44203错误一例

一套HPUX-Itanium平台上版本为10.2.0.2 的系统出现ORA-00600: internal error code, arguments: [kjpsod1], [], [], [], [], [], [], [],并伴随有”ORA-44203: timeout waiting for lock on cursor”.错误,详细的日志如下:

 

Database get error
Errors in file /s01/admin/udump/prod_ora_14084.trc:
ORA-00600: internal error code, arguments: [kjpsod1], [], [], [], [], [], [], []
ORA-44203: timeout waiting for lock on cursor
ORA-44203: timeout waiting for lock on cursor

44203, 0000, "timeout waiting for lock on cursor"
// *Document : Yes
// *Cause    : A timeout occured while waiting for a cursor to be compiled.  
//             This is usually caused by the SQL parse requiring access to
//             system resources which are locked by concurrently executing
//             sessions.
// *Action   : Investigate possible causes of resource contention. If
//             neccessary, contact support for additional information
//             on how to diagnose this problem.
/////////////////////////////////////////////////////////////////////////////
//    Reserving 44301 - 44400 for DBMS_SERVICES errors
/////////////////////////////////////////////////////////////////////////////

 

 

与该Internal error相关的知识如下:

 

CAUSE DETERMINATION
===================
A fix for “ORA-00600: internal error code, arguments: [kjpsod1],” issue is in a patch 5169475.

CAUSE JUSTIFICATION
===================
A fix for “ORA-00600: internal error code, arguments: [kjpsod1],” issue is in a patch 5169475.

PROPOSED SOLUTION(S)
====================
Apply patch 5169475

PROPOSED SOLUTION JUSTIFICATION(S)
==================================
A fix for “ORA-00600: internal error code, arguments: [kjpsod1],” issue is in a patch 5169475.

SOLUTION / ACTION PLAN
======================
— To implement the solution, please execute the following steps::
Apply patch 5169475

KNOWLEDGE CONTENT
=================
Errors in file /s01/admin/udump/prod_ora_14084.trc:
ORA-00600: internal error code, arguments: [kjpsod1], [], [], [], [], [], [], []
ORA-44203: timeout waiting for lock on cursor
ORA-44203: timeout waiting for lock on cursor

Solution: Apply patch 5169475

 Bug 5169475  HPUX-Itanium: trace files may not dump all memory
 This note gives a brief overview of bug 5169475.
 The content was last updated on: 30-APR-2008
 Click here for details of each of the sections below.
Affects:

    Product (Component)    Oracle Server (Rdbms)
    Range of versions believed to be affected    Versions >= 10.2.0.2 but < 11
    Versions confirmed as being affected    

        10.2.0.2

    Platforms affected    

        HPUX Itanium 64bit

     It is believed to be a regression in default behaviour thus:
       Regression introduced in 10.2.0.2

Fixed:

    This issue is fixed in    

        10.2.0.3 (Server Patch Set)
        11.1.0.6 (Base Release)

Symptoms:

Related To:

    Diagnostic Output Problem / Improvement

    Miscellaneous

Description

    This problem is introduced in 10.2.0.2 on HPUX Itanium.

    Trace files produced on HPUX Itanium may not dump all
    memory , may show memory as inaccessible (“**********”)
    when it is accessible and may abort early due with
    false errors such as
      ERROR, BAD EXTENT ADDRESS IN DS(800000010021e098)
    due to incorrect response from the internal slrac()
    function.

    This can make it hard to progress a problem as the trace
    may omit essential information.

    Sub-Note:
      See bug 5918964 for a similar issue on HPUX PA-Risc.

 

解决方案是在10.2.0.2 的基础上打5169475补丁(Patch 5169475: SLRAC() IS UNRELIABLE ON HPUX – FALSE KGHU ERRORS AND KGH MESSAGES),或者升级到Patchset 10.2.0.5+ psu 10.2.0.5.5。

 

 

如何绕过ORA-00701错误和降低bootstrap对象的高水位

如何绕过ORA-701错误来实施对数据库自举对象bootstrap object的一些修改呢?

 

[oracle@mlab1 ~]$ oerr ora 701
00701, 00000, "object necessary for warmstarting database cannot be altered"
// *Cause:  Attempt to alter or drop a database object (table, cluster, or
//          index) which are needed for warmstarting the database.
// *Action: None.

 

 

首先需要说明的是,这纯粹为了技术教学,在实际的产品环境中不要使用如下手段!!!

 

SQL> alter index SYS.I_H_OBJ#_COL# rebuild;
alter index SYS.I_H_OBJ#_COL# rebuild
*
ERROR at line 1:
ORA-00701: object necessary for warmstarting database cannot be altered

SQL> ! oerr ora 38003
38003, 00000, "CBO Disable column stats for the dictionary objects in recursive SQL"
// *Cause:
// *Action:

 

 

1. 可以通过38003 event禁用递归SQL中队数据字典对象字段信息的核实,但是注意设置该38003 event需要重启实例:

 

 

SQL> alter system set event='38003 trace name context forever, level 10' scope=spfile;

System altered.

SQL> shutdown immediate;
Database closed.
Database dismounted.
ORACLE instance shut down.
SQL> startup ;
ORACLE instance started.

Total System Global Area 1653518336 bytes
Fixed Size                  2228904 bytes
Variable Size             536874328 bytes
Database Buffers         1107296256 bytes
Redo Buffers                7118848 bytes
Database mounted.
Database opened.

SQL>  alter index SYS.I_H_OBJ#_COL# rebuild;

Index altered.

 

以上38003 event对于alter index rebuild这种操作绝大多数场景都适用,但是如果你想MOVE/SHRINK OBJ$表则做不到:

 

2. 使用第二种方法必须将数据库置于MIGRATE/UPDATE模式,这要求停机时间:

 

 

create table tab$                                             /* table table */
( obj#          number not null,                            /* object number */
  /* DO NOT CREATE INDEX ON DATAOBJ#  AS IT WILL BE UPDATED IN A SPACE
   * TRANSACTION DURING TRUNCATE */

cluster c_obj#(obj#)
/

shutdown immediate;
startup upgrade;

SQL> alter table c_obj# enable row movement;

Table altered.

SQL> alter table c_obj# shrink space;

Table altered.

 

由于tab$其实是c_obj的cluster table所以,只需要SHRINK C_OBJ#即可回收空间。 但是请注意如果为了回收几百兆的空间去操作这些关键性的自举对象是没有意义的, 千万不要在产品环境中这样操作!!

了解你所不知道的SMON功能(十):维护SMON_SCN_TIME字典基表

SMON后台进程的作用还包括维护SMON_SCN_TIME基表。

SMON_SCN_TIME基表用于记录过去时间段中SCN(system change number)与具体的时间戳(timestamp)之间的映射关系,因为是采样记录这种映射关系,所以SMON_SCN_TIME可以较为较为粗糙地(不精确地)定位某个SCN的时间信息。实际的SMON_SCN_TIME是一张cluster table簇表。

 

 

 

SMON_SCN_TIME时间映射表最大的用途是为闪回类型的查询(flashback type queries)提供一种将时间映射为SCN的途径(The SMON time mapping is mainly for flashback type queries to map a time to an SCN)。

 

Metalink文档<Error ORA-01466 while executing a flashback query. [ID 281510.1]>介绍了SMON更新SMON_SCN_TIME的规律:

  • 在版本10g中SMON_SCN_TIME每6秒钟被更新一次(In Oracle Database 10g, smon_scn_time is updated every 6 seconds hence that is the minimum time that the flashback query time needs to be behind the timestamp of the first change to the table.)
  • 在版本9.2中SMON_SCN_TIME每5分钟被更新一次(In Oracle Database 9.2, smon_scn_time is updated every 5 minutes hence the required delay between the flashback time and table properties change is at least 5 minutes.)

 

另外从10g开始SMON也会清理SMON_SCN_TIME中的记录了,SMON后台进程会每5分钟被唤醒一次,检查SMON_SCN_TIME在磁盘上的映射记录总数,若总数超过144000条,则会使用以下语句删除最老的一条记录(time_mp最小):

 

delete from smon_scn_time
 where thread = 0
   and time_mp = (select min(time_mp) from smon_scn_time where thread = 0)

 

若仅仅删除一条记录不足以获得足够的空间,那么SMON会反复多次执行以上DELETE语句。

 

触发场景

 

虽然Metalink文档<Error ORA-01466 while executing a flashback query. [ID 281510.1]>指出了在10g中SMON会以每6秒一次的频率更新SMON_SCN_TIME基表,但是实际观测可以发现更新频率与SCN的增长速率相关,在较为繁忙的实例中SCN的上升极快时SMON可能会以6秒一次的最短间隔频率更新 , 但是在空闲的实例中SCN增长较慢,则仍会以每5或10分钟一次频率更新,例如:

 

[oracle@vrh8 ~]$ ps -ef|grep smon|grep -v grep
oracle    3484     1  0 Nov12 ?        00:00:02 ora_smon_G10R21

SQL> select * from v$version;

BANNER
----------------------------------------------------------------
Oracle Database 10g Enterprise Edition Release 10.2.0.1.0 - 64bi
PL/SQL Release 10.2.0.1.0 - Production
CORE    10.2.0.1.0      Production
TNS for Linux: Version 10.2.0.1.0 - Production
NLSRTL Version 10.2.0.1.0 - Production

SQL> select * from global_name;

GLOBAL_NAME
--------------------------------------------------------------------------------
www.askmac.cn & www.askmac.cn

SQL> oradebug setospid 3484;
Oracle pid: 8, Unix process pid: 3484, image: oracle@vrh8.oracle.com (SMON)

SQL> oradebug event 10500 trace name context forever,level 10 : 10046 trace name context forever,level 12;
Statement processed.
SQL>
SQL> oradebug tracefile_name;
/s01/admin/G10R21/bdump/g10r21_smon_3484.trc

/* 等待一定时间 */

 

找出SMON trace文件中insert数据到SMON_SCN_TIME的记录:

 

 grep -A20 "insert into smon_scn_time" /s01/admin/G10R21/bdump/g10r21_smon_3484.trc

insert into smon_scn_time (thread, time_mp, time_dp, scn, scn_wrp, scn_bas,  num_mappings, tim_scn_map)
values (0, :1, :2, :3, :4, :5, :6, :7)
END OF STMT
PARSE #4:c=0,e=43,p=0,cr=0,cu=0,mis=0,r=0,dep=1,og=4,tim=1290280848899596
BINDS #4:
kkscoacd
 Bind#0
  oacdty=02 mxl=22(22) mxlc=00 mal=00 scl=00 pre=00
  oacflg=00 fl2=0001 frm=00 csi=00 siz=24 off=0
  kxsbbbfp=7fb29844edb8  bln=22  avl=06  flg=05
  value=767145793
 Bind#1
  oacdty=12 mxl=07(07) mxlc=00 mal=00 scl=00 pre=00
  oacflg=10 fl2=0001 frm=00 csi=00 siz=8 off=0
  kxsbbbfp=7fff023ae780  bln=07  avl=07  flg=09
  value="11/14/2011 0:3:13"
 Bind#2
  oacdty=02 mxl=22(04) mxlc=00 mal=00 scl=00 pre=00
  oacflg=10 fl2=0001 frm=00 csi=00 siz=24 off=0
  kxsbbbfp=7fff023ae70c  bln=22  avl=04  flg=09
  value=954389
 Bind#3
--
insert into smon_scn_time (thread, time_mp, time_dp, scn, scn_wrp, scn_bas,  num_mappings, tim_scn_map)
values (0, :1, :2, :3, :4, :5, :6, :7)
END OF STMT
PARSE #1:c=0,e=21,p=0,cr=0,cu=0,mis=0,r=0,dep=1,og=4,tim=1290281434933390
BINDS #1:
kkscoacd
 Bind#0
  oacdty=02 mxl=22(22) mxlc=00 mal=00 scl=00 pre=00
  oacflg=00 fl2=0001 frm=00 csi=00 siz=24 off=0
  kxsbbbfp=7fb29844edb8  bln=22  avl=06  flg=05
  value=767146393
 Bind#1
  oacdty=12 mxl=07(07) mxlc=00 mal=00 scl=00 pre=00
  oacflg=10 fl2=0001 frm=00 csi=00 siz=8 off=0
  kxsbbbfp=7fff023ae780  bln=07  avl=07  flg=09
  value="11/14/2011 0:13:13"
 Bind#2
  oacdty=02 mxl=22(04) mxlc=00 mal=00 scl=00 pre=00
  oacflg=10 fl2=0001 frm=00 csi=00 siz=24 off=0
  kxsbbbfp=7fff023ae70c  bln=22  avl=04  flg=09
  value=954720
 Bind#3
--
insert into smon_scn_time (thread, time_mp, time_dp, scn, scn_wrp, scn_bas,  num_mappings, tim_scn_map)
values (0, :1, :2, :3, :4, :5, :6, :7)
END OF STMT
PARSE #3:c=0,e=20,p=0,cr=0,cu=0,mis=0,r=0,dep=1,og=4,tim=1290281727955249
BINDS #3:
kkscoacd
 Bind#0
  oacdty=02 mxl=22(22) mxlc=00 mal=00 scl=00 pre=00
  oacflg=00 fl2=0001 frm=00 csi=00 siz=24 off=0
  kxsbbbfp=7fb29844e960  bln=22  avl=06  flg=05
  value=767146993
 Bind#1
  oacdty=12 mxl=07(07) mxlc=00 mal=00 scl=00 pre=00
  oacflg=10 fl2=0001 frm=00 csi=00 siz=8 off=0
  kxsbbbfp=7fff023ae780  bln=07  avl=07  flg=09
  value="11/14/2011 0:23:13"
 Bind#2
  oacdty=02 mxl=22(04) mxlc=00 mal=00 scl=00 pre=00
  oacflg=10 fl2=0001 frm=00 csi=00 siz=24 off=0
  kxsbbbfp=7fff023ae70c  bln=22  avl=04  flg=09
  value=954926
 Bind#3
insert into smon_scn_time (thread, time_mp, time_dp, scn, scn_wrp, scn_bas,  num_mappings, tim_scn_map)
values (0, :1, :2, :3, :4, :5, :6, :7)
END OF STMT
PARSE #4:c=0,e=30,p=0,cr=0,cu=0,mis=0,r=0,dep=1,og=4,tim=1290282313990553
BINDS #4:
kkscoacd
 Bind#0
  oacdty=02 mxl=22(22) mxlc=00 mal=00 scl=00 pre=00
  oacflg=00 fl2=0001 frm=00 csi=00 siz=24 off=0
  kxsbbbfp=7fb29844edb8  bln=22  avl=06  flg=05
  value=767147294
 Bind#1
  oacdty=12 mxl=07(07) mxlc=00 mal=00 scl=00 pre=00
  oacflg=10 fl2=0001 frm=00 csi=00 siz=8 off=0
  kxsbbbfp=7fff023ae780  bln=07  avl=07  flg=09
  value="11/14/2011 0:28:14"
 Bind#2
  oacdty=02 mxl=22(04) mxlc=00 mal=00 scl=00 pre=00
  oacflg=10 fl2=0001 frm=00 csi=00 siz=24 off=0
  kxsbbbfp=7fff023ae70c  bln=22  avl=04  flg=09
  value=955036
 Bind#3

 

 

可以通过以上INSERT语句的TIME_DP绑定变量值中发现其更新SMON_SCN_TIME的时间规律,一般为5或10分钟一次。这说明SMON_SCN_TIME的更细频率与数据库实例的负载有关,其最短的间隔是每6秒一次,最长的间隔为10分钟一次。

 

由于SMON_SCN_TIME的更新频率问题可能引起ORA-01466错误,详见:
Error ORA-01466 while executing a flashback query. [ID 281510.1]

 

由于SMON_SCN_TIME的数据不一致可能引起ORA-00600[6711]或频繁地执行”delete from smon_scn_time”删除语句,详见:
ORA-00600[6711]错误一例
High Executions Of Statement “delete from smon_scn_time…” [ID 375401.1]

 

SMON维护SMON_SCN_TIME时相关的Stack CALL,ktf_scn_time是更新SMON_SCN_TIME的主要函数:

 

ksedst ksedmp ssexhd kghlkremf kghalo kghgex kghalf kksLoadChild kxsGetRuntimeLock kksfbc
kkspsc0 kksParseCursor opiosq0 opiall0 opikpr opiodr rpidrus skgmstack rpidru rpiswu2 kprball
ktf_scn_time
ktmmon ktmSmonMain ksbrdp opirip opidrv sou2o opimai_real main main_opd_entry

 

SMON 还可能使用以下SQL语句维护SMON_SCN_TIME字典基表:

 

select smontabv.cnt,
       smontab.time_mp,
       smontab.scn,
       smontab.num_mappings,
       smontab.tim_scn_map,
       smontab.orig_thread
  from smon_scn_time smontab,
       (select max(scn) scnmax,
               count(*) + sum(NVL2(TIM_SCN_MAP, NUM_MAPPINGS, 0)) cnt
          from smon_scn_time
         where thread = 0) smontabv
 where smontab.scn = smontabv.scnmax
   and thread = 0

insert into smon_scn_time
  (thread,
   time_mp,
   time_dp,
   scn,
   scn_wrp,
   scn_bas,
   num_mappings,
   tim_scn_map)
values
  (0, :1, :2, :3, :4, :5, :6, :7)

update smon_scn_time
   set orig_thread  = 0,
       time_mp      = :1,
       time_dp      = :2,
       scn          = :3,
       scn_wrp      = :4,
       scn_bas      = :5,
       num_mappings = :6,
       tim_scn_map  = :7
 where thread = 0
   and scn = (select min(scn) from smon_scn_time where thread = 0)

delete from smon_scn_time
 where thread = 0
   and scn = (select min(scn) from smon_scn_time where thread = 0)

 

如何禁止SMON更新SMON_SCN_TIME基表

 

 

可以通过设置诊断事件event=’12500 trace name context forever, level 10’来禁止SMON更新SMON_SCN_TIME基表(Setting the 12500 event at system level should stop SMON from updating the SMON_SCN_TIME table.):

SQL>  alter system set events '12500 trace name context forever, level 10';

System altered.

 

一般我们不推荐禁止SMON更新SMON_SCN_TIME基表,因为这样会影响flashback Query闪回查询的正常使用,但是在某些异常恢复的场景中SMON_SCN_TIME数据讹误可能导致实例的Crash,那么可以利用以上12500事件做到不触发SMON_SCN_TIME被更新。

 

 

如何手动清除SMON_SCN_TIME的数据

 

因为SMON_SCN_TIME不是bootstrap自举核心对象,所以我们可以手动更新该表上的数据、及重建其索引。

如我在<ORA-00600[6711]错误一例>中介绍了因为SMON_SCN_TIME与其索引的数据不一致时,可以通过重建索引来解决问题:

connect / as sysdba
drop index smon_scn_time_scn_idx;
drop index smon_scn_time_tim_idx;
create unique index smon_scn_time_scn_idx on smon_scn_time(scn);
create unique index smon_scn_time_tim_idx on smon_scn_time(time_mp);
analyze table smon_scn_time validate structure cascade;

 

可以在设置了12500事件后手动删除SMON_SCN_TIME上的记录,重启实例后SMON会继续正常更新SMON_SCN_TIME。除非是因为SMON_SCN_TIME表上的记录与索引smon_scn_time_tim_idx或smon_scn_time_scn_idx上的不一致造成DELETE语句无法有效删除该表上的记录:文档<LOCK ON SYS.SMON_SCN_TIME [ID 747745.1]>说明了该问题,否则我们没有必要手动去清除SMON_SCN_TIME上的数据。

具体方法如下:

 

SQL> conn / as sysdba

/* Set the event at system level */

SQL> alter system set events '12500 trace name context forever, level 10';

/* Delete the records from SMON_SCN_TIME */

SQL> delete from smon_scn_time;

SQL> commit;

SQL> alter system set events '12500 trace name context off';

完成以上步骤后重启实例restart instance

shutdown immediate;
startup;

Oracle队列锁enq:US,Undo Segment

一般情况下 稳定系统中突然出现的Enq:US 是因为并发事务数突然变多导致的,而实际引发并发事务数变多的可能是某些DML语句遇到了性能问题,导致事务处理速度变慢,或者是commit变慢所致。

针对上述问题,enq:US本身只是性能问题所导致的“果”,而不是原因,一般解决了上述问题,enq:US的这个果也就消失了。

对于那些短期内无法解决根本原因的环境,可以通过如下手段调优Undo Segment的管理来减少US争用。

 

ALTER SYSTEM SET “_rollback_segment_count”= 2000;
–副作用是可能的数据库startup变慢
ALTER SYSTEM SET “_undo_autotune” = false;
–最好重建undo tabelspace
ALTER SYSTEM SET “_highthreshold_undoretention”=3600;
–1小时

 

 

有多少Undo enqueue resource?

每一个rollback segment对应一个 US enqueue 资源

 

有多少US enqueue lock?

当进程要求访问一个rollback segment RBS时对应一个US enqueue lock

 

谁使用该enqueue lock?

所有的前台进程,SMON和PMON

 

何时使用该US enqueue ?

US,Undo Segment enqueue的存在是为了对给定undo segment(rollback segment)的串行化DDL,该US enqueue 串行化以下操作:

– CREATE ROLLBACK SEGMENT
– DROP ROLLBACK SEGMENT
– ALTER ROLLBACK SEGMENT ONLINE
– ALTER ROLLBACK OFFLINE
– ALTER ROLLBACK SEGMENT SHRINK
– ALTER ROLLBACK SEGMENT STORAGE
– Offlining PENDING OFFLINE RBS by SMON
– SMON – abortive offline cleanup.
– STARTUP.

 

注意在UNDO_MANAGEMENT=AUTO的 AMU模式下当Undo Tablespace可用空间紧张时也可能引起剧烈的enq: US队列锁争用等待事件。

 

Id1, Id2 组合:

Undo Segment#, Always 0.

Lock Value Block:

No.

Init.ora Parameters:

gc_rollback_locks, rollback_segments.

Scope:

Local and Global Enqueue.

Deadlock Sensitive:

Yes.

Operations:

Synchronous.

沪ICP备14014813号-2

沪公网安备 31010802001379号