fast incremental backup failed on standby database

一套Linux上的11.1.0.7的physical standby物理备库在使用fast incremental backup进行高于0级的增量备份时会出现ORA-19648错误,其出错记录如下:

RMAN>  backup incremental level 1 database;

Starting backup at 22-MAR-11
using channel ORA_DISK_1
using channel ORA_DISK_2
channel ORA_DISK_1: starting incremental level 1 datafile backup set
channel ORA_DISK_1: specifying datafile(s) in backup set
input datafile file number=00002 name=/standby/oradata/SBDB2/datafile/o1_mf_sysaux_22m6ov92_.dbf
input datafile file number=00003 name=/standby/oradata/SBDB2/datafile/o1_mf_undotbs1_23m6ovap_.dbf
input datafile file number=00006 name=/standby/oradata/SBDB2/datafile/o1_mf_enc_25m6ovba_.dbf
channel ORA_DISK_1: starting piece 1 at 22-MAR-11
channel ORA_DISK_2: starting incremental level 1 datafile backup set
channel ORA_DISK_2: specifying datafile(s) in backup set
input datafile file number=00001 name=/standby/oradata/SBDB2/datafile/o1_mf_system_21m6ov92_.dbf
input datafile file number=00005 name=/standby/oradata/SBDB2/datafile/o1_mf_example_24m6ovar_.dbf
input datafile file number=00004 name=/standby/oradata/SBDB2/datafile/o1_mf_users_26m6ovba_.dbf
channel ORA_DISK_2: starting piece 1 at 22-MAR-11
RMAN-03009: failure of backup command on ORA_DISK_1 channel at 03/22/2011 20:20:28
ORA-19648: datafile 2: incremental-start SCN equals checkpoint SCN
ORA-19640: datafile checkpoint is SCN 1249353 time 03/09/2011 05:50:27
continuing other job steps, job failed will not be re-run
RMAN-00571: ===========================================================
RMAN-00569: =============== ERROR MESSAGE STACK FOLLOWS ===============
RMAN-00571: ===========================================================
RMAN-03009: failure of backup command on ORA_DISK_2 channel at 03/22/2011 20:22:33
ORA-19648: datafile 1: incremental-start SCN equals checkpoint SCN
ORA-19640: datafile checkpoint is SCN 1249352 time 03/09/2011 05:50:26

经过分析该ORA-19640->ORA-19648是由11.1.0.7上的Bug引起的,MOS已经确认其为Bug 9288598:

Affects:
Product (Component)	 Oracle Server (Rdbms)
Range of versions believed to be affected	 Versions BELOW 12.1
Versions confirmed as being affected	
11.1.0.7
Platforms affected	 Generic (all / most platforms affected)
Fixed:

This issue is fixed in	
12.1 (Future Release)
11.2.0.2 (Server Patch Set)

An RMAN Backup incremental level 1 can fail with ORA-19648 / ORA-19640 if the standby database is open read only rather than just skipping the datafile.
PROBLEM:
--------
Following error occurring during incremental backup:

Starting backup at 02-DEC-2009 20:37:44
allocated channel: ORA_SBT_TAPE_1
channel ORA_SBT_TAPE_1: SID=154 device type=SBT_TAPE
channel ORA_SBT_TAPE_1: Veritas NetBackup for Oracle - Release 6.5 
(2008052301)
channel ORA_SBT_TAPE_1: starting incremental level 1 datafile backup set
channel ORA_SBT_TAPE_1: specifying datafile(s) in backup set
input datafile file number=00002 
name=/data/ora_data01/KAHCB4P/KAHCB4P_sysaux_01.dbf
input datafile file number=00001 
name=/data/ora_data01/KAHCB4P/KAHCB4P_system_01.dbf
input datafile file number=00003 
name=/data/ora_data01/KAHCB4P/KAHCB4P_undo_01.dbf
input datafile file number=00004 
name=/data/ora_data01/KAHCB4P/KAHCB4P_users_01.dbf
input datafile file number=00005 
name=/data/ora_data01/KAHCB4P/KAHCB4P_tools_01.dbf
input datafile file number=00006 
name=/data/ora_data01/KAHCB4P/KAHCB4P_backup_test_01.dbf
channel ORA_SBT_TAPE_1: starting piece 1 at 02-DEC-2009 20:38:01
RMAN-3009: failure of backup command on ORA_SBT_TAPE_1 channel at 12/02/2009 
20:38:02
ORA-19648: datafile 2: incremental-start SCN equals checkpoint SCN
ORA-19640: datafile checkpoint is SCN 3911695 time 11/27/2009 12:36:48
continuing other job steps, job failed will not be re-run
channel ORA_SBT_TAPE_1: starting incremental level 1 datafile backup set
channel ORA_SBT_TAPE_1: specifying datafile(s) in backup set
including current control file in backup set
including current SPFILE in backup set

DIAGNOSTIC ANALYSIS:
--------------------
Same problem as Bug 6903819
Requested backport, but this hasn't fixed the problem.
Confirmed that the patch was installed correctly and relinked successfully.

RELEASE NOTES:
]]Backup incremental leve 1 fails with ORA-19648 and ORA-19640 when standby
]]database opened for read only.
*** 02/14/10 02:43 pm *** (ADD: Impact/Symptom->FEATURE UNUSABLE )

REDISCOVERY INFORMATION:
Backup incremental level 1 fails with ORA-19648 and the standby database is
open read-only.
WORKAROUND:
None

可以从上述Note中看到该Bug需要到11.2.0.2中才得到fix,那么在11.1.0.7上我们有什么办法能解决或者绕过该问题吗?
经过测试,我得到2种workaround的方法:
该ORA-19648错误首先可以通过在物理备库(physical standby)上停止介质恢复(MRP)进程的方式来workaround:

SQL>  alter database recover managed standby database cancel;
Database altered.

RMAN> backup incremental level 1 database;

Starting backup at 22-MAR-11
using target database control file instead of recovery catalog
allocated channel: ORA_DISK_1
channel ORA_DISK_1: SID=141 device type=DISK
allocated channel: ORA_DISK_2
channel ORA_DISK_2: SID=140 device type=DISK
channel ORA_DISK_1: starting incremental level 1 datafile backup set
channel ORA_DISK_1: specifying datafile(s) in backup set
input datafile file number=00002 name=/standby/oradata/SBDB2/datafile/o1_mf_sysaux_22m6ov92_.dbf
input datafile file number=00004 name=/standby/oradata/SBDB2/datafile/o1_mf_users_26m6ovba_.dbf
input datafile file number=00006 name=/standby/oradata/SBDB2/datafile/o1_mf_enc_25m6ovba_.dbf
channel ORA_DISK_1: starting piece 1 at 22-MAR-11
channel ORA_DISK_2: starting incremental level 1 datafile backup set
channel ORA_DISK_2: specifying datafile(s) in backup set
input datafile file number=00001 name=/standby/oradata/SBDB2/datafile/o1_mf_system_21m6ov92_.dbf
input datafile file number=00003 name=/standby/oradata/SBDB2/datafile/o1_mf_undotbs1_23m6ovap_.dbf
input datafile file number=00005 name=/standby/oradata/SBDB2/datafile/o1_mf_example_24m6ovar_.dbf
channel ORA_DISK_2: starting piece 1 at 22-MAR-11
channel ORA_DISK_1: finished piece 1 at 22-MAR-11
piece handle=/standby/backup/2km7suui_1_1.bak tag=TAG20110322T212538 comment=NONE
channel ORA_DISK_1: backup set complete, elapsed time: 00:00:33
channel ORA_DISK_2: finished piece 1 at 22-MAR-11
piece handle=/standby/backup/2lm7suv1_1_1.bak tag=TAG20110322T212538 comment=NONE
channel ORA_DISK_2: backup set complete, elapsed time: 00:00:27
Finished backup at 22-MAR-11

Starting Control File and SPFILE Autobackup at 22-MAR-11
piece handle=/standby/backup/c-157018592-20110322-01.bak comment=NONE
Finished Control File and SPFILE Autobackup at 22-MAR-11

该ORA-19648错误也可以通过禁用fast incremental backup的块跟踪(block change tracking)特性来绕过问题,当然在可能的情况下我们更推荐使用上面那种方法,因为毕竟还可以利用到fast incremental backup特性:

SQL> alter database recover managed standby database using current logfile disconnect;
Database altered.

SQL> select process,status from v$managed_standby;

PROCESS   STATUS
--------- ------------
ARCH	  CONNECTED
ARCH	  CONNECTED
ARCH	  CONNECTED
ARCH	  CONNECTED
MRP0	  APPLYING_LOG

SQL> select status from v$block_change_tracking;

STATUS
----------
ENABLED


SQL> alter database disable block change tracking;
Database altered.


RMAN>  backup incremental level 2 database;

Starting backup at 22-MAR-11
using channel ORA_DISK_1
using channel ORA_DISK_2
channel ORA_DISK_1: starting incremental level 2 datafile backup set
channel ORA_DISK_1: specifying datafile(s) in backup set
input datafile file number=00002 name=/standby/oradata/SBDB2/datafile/o1_mf_sysaux_22m6ov92_.dbf
input datafile file number=00004 name=/standby/oradata/SBDB2/datafile/o1_mf_users_26m6ovba_.dbf
input datafile file number=00006 name=/standby/oradata/SBDB2/datafile/o1_mf_enc_25m6ovba_.dbf
channel ORA_DISK_1: starting piece 1 at 22-MAR-11
channel ORA_DISK_2: starting incremental level 2 datafile backup set
channel ORA_DISK_2: specifying datafile(s) in backup set
input datafile file number=00001 name=/standby/oradata/SBDB2/datafile/o1_mf_system_21m6ov92_.dbf
input datafile file number=00003 name=/standby/oradata/SBDB2/datafile/o1_mf_undotbs1_23m6ovap_.dbf
input datafile file number=00005 name=/standby/oradata/SBDB2/datafile/o1_mf_example_24m6ovar_.dbf
channel ORA_DISK_2: starting piece 1 at 22-MAR-11
channel ORA_DISK_1: finished piece 1 at 22-MAR-11
piece handle=/standby/backup/2nm7sv8d_1_1.bak tag=TAG20110322T213052 comment=NONE
channel ORA_DISK_1: backup set complete, elapsed time: 00:00:15
channel ORA_DISK_2: finished piece 1 at 22-MAR-11
piece handle=/standby/backup/2om7sv8s_1_1.bak tag=TAG20110322T213052 comment=NONE
channel ORA_DISK_2: backup set complete, elapsed time: 00:00:15
Finished backup at 22-MAR-11

Starting Control File and SPFILE Autobackup at 22-MAR-11
piece handle=/standby/backup/c-157018592-20110322-02.bak comment=NONE
Finished Control File and SPFILE Autobackup at 22-MAR-11

Large Memory Footprints on AIX

Connor Mcdonald-一位Oracle极客为我们分享了一个AIX平台上11g独享服务进程内存占用过量的问题,该问题最后被确认为Bug”11G SERVER PROCESSES CONSUMING MUCH MORE MEMORY THAT 10G OR 9I”,相关文档如下:

 

Memory Footprint For Dedicated Server Processes More Than Doubled After 11g Upgrade On AIX Platform [ID 1246995.1]

Bug 9796810: 11G SERVER PROCESSES CONSUMING MUCH MORE MEMORY THAT 10G OR 9I

Bug 10190759: PROCESSES CONSUMING ADDITIONAL MEMORY DUE TO ‘USLA HEAP’

可以看到上述问题仅发生在从9i/10g升级到11g后,作为一个已确认的升级Bug值得我们大家去关注;最近几年这样的升级会越来越多,同时希望该Bug能在11.2.0.3中修复。

实际上我在10.2.0.3上就遇到过类似的Process Large Footprints问题:用户在打上一个one-off patch[6110331]后单个server process的rss量明显上升,主机的内存使用量大幅提高,虽然这个问题同样提交了SR,但最后没有确认为Bug;用户试图询问Oracle GCS关于rss上升的原因,但语焉而不详。

Search Criteria:AIX 11.2

Memory Footprint For Dedicated Server Processes More Than Doubled After 11g Upgrade On AIX Platform (Doc ID 1246995.1)

1. Have you installed the patch 10190759 ?

Review the note:
Memory Footprint For Dedicated Server Processes More Than Doubled After 11g Upgrade On AIX Platform (Doc ID 1246995.1)

If you have not installed the patch ?
–>>there is one available for 11.2.0.2.0, 11.2.0.2.2, 11.2.0.2.3

If you need me to review the patches you have installed you can upload the opatch listing?

opatch lsinventory -patch -detail

2. If you have already installed the patch 10190759 then

The additional memory seen allocated to oracle processes in the 11.2 release is a consequence of the additional link options added to the oracle link
line, -bexpfull and -brtllib. The two link options were specifically added in 11.2.0.1 to support the online patching feature.
Patch Name or Number: 10190759

 

Changes in the make file have been implemented such that you can relink without these options (-bexpfull and -brtllib) to avoid
additional memory overhead incurred by adding these options.These changes are available via a one-off patch.

This is a known bug: BUG:10190759 – PROCESSES CONSUMING ADDITIONAL MEMORY DUE TO ‘USLA HEAP’

Install  Patch: 10190759

expdp/impdp导入导出Long字段类型产生乱码问题

LONG作为一种数据类型,在LOB大对象出现之后就不再推荐使用了。Oracle对LONG的态度是废弃,以至于一些新的工具对LONG类型的支持也并不好。

例如10g中推出的数据泵功能data pump expdp/impdp,在2个字符集完全一致(ZHS16GBK AL16UTF16)的数据库间导入、导出LONG  字段也可能存在乱码,这个现象在10g中比较常见。

 

对于该expdp/impdp long字段产生乱码的问题, Workaroud 的是用 exp/imp 传统导入导出工具来替代expdp /impdp,虽然export/import工具的速度不如data pump,但是因为是传统工具所以对于LONG这种近乎废弃的数据类型支持较好。

在11g上测试了一下,该乱码问题似乎已经得到修复:

 

 


SQL*Plus: Release 11.2.0.3.0 Production on 星期五 2月 22 06:13:53 2013

Copyright (c) 1982, 2011, Oracle.  All rights reserved.


Connected to:
Oracle Database 11g Enterprise Edition Release 11.2.0.3.0 - 64bit Production
With the Partitioning, Real Application Clusters, Automatic Storage Management, OLAP,
Data Mining and Real Application Testing options


SQL> conn maclean/oracle
Connected.
SQL> 
SQL> 
SQL> 
SQL> 
SQL> create table tlong (t1 int, t2 long);

Table created.


SQL> insert into tlong values(1,'Maclean的测试');

1 row created.

SQL> commit;

Commit complete.

SQL> select t2 from tlong;

T2
--------------------------------------------------------------------------------
Maclean的测试



  1* select name,value$ from sys.props$ where name like '%CHARACTERSET%'
SQL> /

NAME                           VALUE$
------------------------------ --------------------------------------------------
NLS_CHARACTERSET               AL32UTF8
NLS_NCHAR_CHARACTERSET         UTF8


[oracle@vmac1 ~]$ expdp maclean/oracle dumpfile=tmp:tlong.dmp tables=maclean.tlong 

Export: Release 11.2.0.3.0 - Production on 星期五 2月 22 06:18:55 2013

Copyright (c) 1982, 2011, Oracle and/or its affiliates.  All rights reserved.

Connected to: Oracle Database 11g Enterprise Edition Release 11.2.0.3.0 - 64bit Production
With the Partitioning, Real Application Clusters, Automatic Storage Management, OLAP,
Data Mining and Real Application Testing options
启动 "MACLEAN"."SYS_EXPORT_TABLE_01":  maclean/******** dumpfile=tmp:tlong.dmp tables=maclean.tlong 
正在使用 BLOCKS 方法进行估计...
处理对象类型 TABLE_EXPORT/TABLE/TABLE_DATA
使用 BLOCKS 方法的总估计: 64 KB
处理对象类型 TABLE_EXPORT/TABLE/TABLE
. . 导出了 "MACLEAN"."TLONG"                           5.421 KB       1 行
已成功加载/卸载了主表 "MACLEAN"."SYS_EXPORT_TABLE_01" 
******************************************************************************
MACLEAN.SYS_EXPORT_TABLE_01 的转储文件集为:
  /tmp/tlong.dmp
作业 "MACLEAN"."SYS_EXPORT_TABLE_01" 已于 06:19:14 成功完成



[oracle@vmac1 ~]$ impdp dumpfile=tmp:tlong.dmp remap_table=maclean.tlong:tlong1         

Import: Release 11.2.0.3.0 - Production on 星期五 2月 22 06:21:34 2013

Copyright (c) 1982, 2011, Oracle and/or its affiliates.  All rights reserved.

Username: maclean
Password: 

Connected to: Oracle Database 11g Enterprise Edition Release 11.2.0.3.0 - 64bit Production
With the Partitioning, Real Application Clusters, Automatic Storage Management, OLAP,
Data Mining and Real Application Testing options
已成功加载/卸载了主表 "MACLEAN"."SYS_IMPORT_FULL_01" 
启动 "MACLEAN"."SYS_IMPORT_FULL_01":  maclean/******** dumpfile=tmp:tlong.dmp remap_table=maclean.tlong:tlong1 
处理对象类型 TABLE_EXPORT/TABLE/TABLE
处理对象类型 TABLE_EXPORT/TABLE/TABLE_DATA
. . 导入了 "MACLEAN"."TLONG1"                          5.421 KB       1 行
作业 "MACLEAN"."SYS_IMPORT_FULL_01" 已于 06:21:39 成功完成


SQL> select * from tlong1;




        T1
----------
T2
--------------------------------------------------------------------------------
         1
Maclean的测试


Applying online patch on 11gr2

在Oracle 11g中提出了online patch(也叫hot patch)的特性;Hot patching允许我们在实例始终在线的情况下安装,启用或禁用一个修复补丁或者诊断补丁。针对7*24在线的业务系统,hot patch为我们提供了一条既能避免当机时间而又可以实施补丁的途径。在Oracle 11g中我们可以使用Opatch命令行工具针对任意数据库实施在线补丁(前提是该补丁是一个hot patch)。一般来说在线补丁(hot patches)只能是那些代码修改范围小且复杂度很低的补丁,举例来说它们往往是一些诊断补丁(diagnostic patches)或者小bug的修复(small bug fixes)。值得注意的是hot patching将需要消耗额外的内存,决定其消耗内存数量的因素是:1.补丁本身的大小,2.实例中的进程总数;举例来说某个补丁的大小正好为一个OS page的大小(一般为4kB),那么当实例中运行的进程总数为1000时,则该hot patching所额外消耗的内存总数为4kB*1000=4MB。

hot patches与常规Conventional patches对比具有可在线实施和安装快的特性,如下图:
online patching

在实际生产环境中,相信没有多少朋友实施过hot patching,一来国内目前还没有普及11g的使用,二来hot patching的数量在所有interim patch中只占极少数;一直以来都想写这样一篇关于hot patching的博文,唯一妨碍我写作的问题是在11.2.0.1下找不到可实施的online interim patch;以MOS->patches&upgrade目前的分类我们很难找出某个base release下可用的hot patch,当然这并不妨碍补丁专栏的使用。为了这个令人郁闷的问题,我特意去提交了一个Service Request,得到的回复:

I have tried to find the patches which support online patching on 11.2.0.1 version,
but I also can not find them because there are too many patches and there is no catalog for the patches
which support online patching, and I can only check the patch readme to confirm whether that patch supports online patching.

I found one patch which supports online patching, but this patch is for 11.2.0.2 version.
The patch no. is 10188727.

Sorry for the inconvenience brought to you. Hope the above update can help you.
If the above patch is not what you want, then please update the SR and I will continue for your issue.

这其中提到的patch 10188727,可以从Note<RDBMS Online Patching Aka Hot Patching [ID 761111.1]>中找到,另外一个可找到的hot patch是11.1.0.6上的6198642<DUMMY PATCH FOR TESTING DB11 PATCHING>,不过很可惜该补丁只有Linux x86一个平台版本的。所以我不得不先将11.2.0.1的测试库升级到了11.2.0.2上:

SQL> select * from v$version;

BANNER
--------------------------------------------------------------------------------
Oracle Database 11g Enterprise Edition Release 11.2.0.2.0 - 64bit Production
PL/SQL Release 11.2.0.2.0 - Production
CORE    11.2.0.2.0      Production
TNS for Linux: Version 11.2.0.2.0 - Production
NLSRTL Version 11.2.0.2.0 - Production

[maclean@rh2 OPatch]$ ps -ef|grep pmon|grep -v grep
maclean  22481     1  0 19:19 ?        00:00:00 ora_pmon_PROD

[maclean@rh2 OPatch]$ pmap -d 22481
22481:   ora_pmon_PROD
Address           Kbytes Mode  Offset           Device    Mapping
0000000000400000  180232 r-x-- 0000000000000000 008:00002 oracle
000000000b602000    1820 rwx-- 000000000b002000 008:00002 oracle
000000000b7c9000     300 rwx-- 000000000b7c9000 000:00000   [ anon ]
000000000dbef000     436 rwx-- 000000000dbef000 000:00000   [ anon ]
0000000060000000 2050048 rwxs- 0000000000000000 000:00009   [ shmid=0x550001 ]
0000003e09a00000     112 r-x-- 0000000000000000 008:00001 ld-2.5.so
0000003e09c1b000       4 r-x-- 000000000001b000 008:00001 ld-2.5.so
0000003e09c1c000       4 rwx-- 000000000001c000 008:00001 ld-2.5.so
0000003e09e00000    1336 r-x-- 0000000000000000 008:00001 libc-2.5.so
0000003e09f4e000    2044 ----- 000000000014e000 008:00001 libc-2.5.so
0000003e0a14d000      16 r-x-- 000000000014d000 008:00001 libc-2.5.so
0000003e0a151000       4 rwx-- 0000000000151000 008:00001 libc-2.5.so
0000003e0a152000      20 rwx-- 0000003e0a152000 000:00000   [ anon ]
0000003e0a200000     520 r-x-- 0000000000000000 008:00001 libm-2.5.so
0000003e0a282000    2044 ----- 0000000000082000 008:00001 libm-2.5.so
0000003e0a481000       4 r-x-- 0000000000081000 008:00001 libm-2.5.so
0000003e0a482000       4 rwx-- 0000000000082000 008:00001 libm-2.5.so
0000003e0a600000       8 r-x-- 0000000000000000 008:00001 libdl-2.5.so
0000003e0a602000    2048 ----- 0000000000002000 008:00001 libdl-2.5.so
0000003e0a802000       4 r-x-- 0000000000002000 008:00001 libdl-2.5.so
0000003e0a803000       4 rwx-- 0000000000003000 008:00001 libdl-2.5.so
0000003e0aa00000      88 r-x-- 0000000000000000 008:00001 libpthread-2.5.so
0000003e0aa16000    2044 ----- 0000000000016000 008:00001 libpthread-2.5.so
0000003e0ac15000       4 r-x-- 0000000000015000 008:00001 libpthread-2.5.so
0000003e0ac16000       4 rwx-- 0000000000016000 008:00001 libpthread-2.5.so
0000003e0ac17000      16 rwx-- 0000003e0ac17000 000:00000   [ anon ]
0000003e0ae00000      28 r-x-- 0000000000000000 008:00001 librt-2.5.so
0000003e0ae07000    2048 ----- 0000000000007000 008:00001 librt-2.5.so
0000003e0b007000       4 r-x-- 0000000000007000 008:00001 librt-2.5.so
0000003e0b008000       4 rwx-- 0000000000008000 008:00001 librt-2.5.so
0000003e0da00000      84 r-x-- 0000000000000000 008:00001 libnsl-2.5.so
0000003e0da15000    2044 ----- 0000000000015000 008:00001 libnsl-2.5.so
0000003e0dc14000       4 r-x-- 0000000000014000 008:00001 libnsl-2.5.so
0000003e0dc15000       4 rwx-- 0000000000015000 008:00001 libnsl-2.5.so
0000003e0dc16000       8 rwx-- 0000003e0dc16000 000:00000   [ anon ]
00002abec920e000       8 rwx-- 00002abec920e000 000:00000   [ anon ]
00002abec9210000       4 r-x-- 0000000000000000 008:00002 libodmd11.so
00002abec9211000    1024 ----- 0000000000001000 008:00002 libodmd11.so
00002abec9311000       4 rwx-- 0000000000001000 008:00002 libodmd11.so
00002abec9312000     360 r-x-- 0000000000000000 008:00002 libcell11.so
00002abec936c000    1020 ----- 000000000005a000 008:00002 libcell11.so
00002abec946b000      36 rwx-- 0000000000059000 008:00002 libcell11.so
00002abec9474000       4 rwx-- 00002abec9474000 000:00000   [ anon ]
00002abec9475000     848 r-x-- 0000000000000000 008:00002 libskgxp11.so
00002abec9549000    1024 ----- 00000000000d4000 008:00002 libskgxp11.so
00002abec9649000       8 rwx-- 00000000000d4000 008:00002 libskgxp11.so
00002abec9665000       4 rwx-- 00002abec9665000 000:00000   [ anon ]
00002abec9666000    2580 r-x-- 0000000000000000 008:00002 libnnz11.so
00002abec98eb000    1020 ----- 0000000000285000 008:00002 libnnz11.so
00002abec99ea000     264 rwx-- 0000000000284000 008:00002 libnnz11.so
00002abec9a2c000       8 rwx-- 00002abec9a2c000 000:00000   [ anon ]
00002abec9a2e000      96 r-x-- 0000000000000000 008:00002 libclsra11.so
00002abec9a46000    1020 ----- 0000000000018000 008:00002 libclsra11.so
00002abec9b45000       4 rwx-- 0000000000017000 008:00002 libclsra11.so
00002abec9b46000       4 rwx-- 00002abec9b46000 000:00000   [ anon ]
00002abec9b47000     136 r-x-- 0000000000000000 008:00002 libdbcfg11.so
00002abec9b69000    1020 ----- 0000000000022000 008:00002 libdbcfg11.so
00002abec9c68000       8 rwx-- 0000000000021000 008:00002 libdbcfg11.so
00002abec9c6a000    6832 r-x-- 0000000000000000 008:00002 libhasgen11.so
00002abeca316000    1020 ----- 00000000006ac000 008:00002 libhasgen11.so
00002abeca415000     136 rwx-- 00000000006ab000 008:00002 libhasgen11.so
00002abeca437000      24 rwx-- 00002abeca437000 000:00000   [ anon ]
00002abeca43d000       8 r-x-- 0000000000000000 008:00002 libskgxn2.so
00002abeca43f000    1020 ----- 0000000000002000 008:00002 libskgxn2.so
00002abeca53e000       4 rwx-- 0000000000001000 008:00002 libskgxn2.so
00002abeca53f000       4 rwx-- 00002abeca53f000 000:00000   [ anon ]
00002abeca540000     656 r-x-- 0000000000000000 008:00002 libocr11.so
00002abeca5e4000    1020 ----- 00000000000a4000 008:00002 libocr11.so
00002abeca6e3000      12 rwx-- 00000000000a3000 008:00002 libocr11.so
00002abeca6e6000     628 r-x-- 0000000000000000 008:00002 libocrb11.so
00002abeca783000    1024 ----- 000000000009d000 008:00002 libocrb11.so
00002abeca883000       8 rwx-- 000000000009d000 008:00002 libocrb11.so
00002abeca885000      44 r-x-- 0000000000000000 008:00002 libocrutl11.so
00002abeca890000    1020 ----- 000000000000b000 008:00002 libocrutl11.so
00002abeca98f000       4 rwx-- 000000000000a000 008:00002 libocrutl11.so
00002abeca990000       4 rwx-- 00002abeca990000 000:00000   [ anon ]
00002abeca991000       4 r-x-- 0000000000000000 008:00001 libaio.so.1.0.1
00002abeca992000    2044 ----- 0000000000001000 008:00001 libaio.so.1.0.1
00002abecab91000       4 rwx-- 0000000000000000 008:00001 libaio.so.1.0.1
00002abecab92000      16 rwx-- 00002abecab92000 000:00000   [ anon ]
00002abecab96000      20 r-x-- 0000000000000000 008:00001 libnuma.so.1
00002abecab9b000    2044 ----- 0000000000005000 008:00001 libnuma.so.1
00002abecad9a000       4 rwx-- 0000000000004000 008:00001 libnuma.so.1
00002abecad9b000    1280 rwx-- 00002abecad9b000 000:00000   [ anon ]
00002abecaef5000      40 r-x-- 0000000000000000 008:00001 libnss_files-2.5.so
00002abecaeff000    2044 ----- 000000000000a000 008:00001 libnss_files-2.5.so
00002abecb0fe000       4 r-x-- 0000000000009000 008:00001 libnss_files-2.5.so
00002abecb0ff000       4 rwx-- 000000000000a000 008:00001 libnss_files-2.5.so
00002abecb100000    1700 rwx-- 00002abecb100000 000:00000   [ anon ]
00002abecb2a9000      28 rwx-- 0000000000000000 000:00011 zero
00002abecb2b0000      64 rwx-- 0000000000000000 000:00011 zero
00002abecb2c0000      64 rwx-- 0000000000000000 000:00011 zero
00002abecb2d0000      64 rwx-- 0000000000000000 000:00011 zero
00002abecb2e0000      64 rwx-- 0000000000000000 000:00011 zero
00002abecb2f0000      64 rwx-- 0000000000000000 000:00011 zero
00002abecb300000     164 rwx-- 0000000000057000 000:00011 zero
00002abecb329000       8 rwx-- 00002abecb329000 000:00000   [ anon ]
00002abecb32b000       4 rwxs- 0000000000000000 008:00002 hc_PROD.dat
00002abecb32c000      40 r-x-- 0000000000000000 008:00002 libnque11.so
00002abecb336000    1020 ----- 000000000000a000 008:00002 libnque11.so
00002abecb435000       4 rwx-- 0000000000009000 008:00002 libnque11.so
00002abecb436000    1048 rwx-- 00002abecb436000 000:00000   [ anon ]
00007fff3342d000      84 rwx-- 00007ffffffea000 000:00000   [ stack ]
ffffffffff600000    8192 ----- 0000000000000000 000:00000   [ anon ]
mapped: 2291488K    writeable/private: 7840K    shared: 2050052K

[maclean@rh2 ~]$  cd $ORACLE_HOME/OPatch
[maclean@rh2 OPatch]$ unzip p10188727_112020_Linux-x86-64.zip
Archive:  p10188727_112020_Linux-x86-64.zip
   creating: 10188727/
   creating: 10188727/files/
   creating: 10188727/files/lib/
   creating: 10188727/files/lib/libserver11.a/
  inflating: 10188727/files/lib/libserver11.a/kkopq.o
   creating: 10188727/etc/
   creating: 10188727/etc/config/
  inflating: 10188727/etc/config/inventory.xml
  inflating: 10188727/etc/config/actions.xml
  inflating: 10188727/etc/config/deploy.xml
   creating: 10188727/etc/xml/
  inflating: 10188727/etc/xml/GenericActions.xml
  inflating: 10188727/etc/xml/ShiphomeDirectoryStructure.xml
  inflating: 10188727/README.txt
   creating: 10188727/online/
   creating: 10188727/online/files/
   creating: 10188727/online/files/hpatch/
  inflating: 10188727/online/files/hpatch/bug10188727.pch
   creating: 10188727/online/etc/
   creating: 10188727/online/etc/config/
  inflating: 10188727/online/etc/config/inventory.xml
  inflating: 10188727/online/etc/config/actions.xml
  inflating: 10188727/online/etc/config/deploy.xml
   creating: 10188727/online/etc/xml/
  inflating: 10188727/online/etc/xml/GenericActions.xml
  inflating: 10188727/online/etc/xml/ShiphomeDirectoryStructure.xml 

[maclean@rh2 OPatch]$ opatch query 10188727 -all
Invoking OPatch 11.2.0.1.1

Oracle Interim Patch Installer version 11.2.0.1.1
Copyright (c) 2009, Oracle Corporation.  All rights reserved.

Oracle Home       : /s01/product/11.2.0/dbhome_2
Central Inventory : /s01/oraInventory
   from           : /etc/oraInst.loc
OPatch version    : 11.2.0.1.1
OUI version       : 11.2.0.2.0
OUI location      : /s01/product/11.2.0/dbhome_2/oui
Log file location : /s01/product/11.2.0/dbhome_2/cfgtoollogs/opatch/opatch2011-02-17_20-05-21PM.log

Patch history file: /s01/product/11.2.0/dbhome_2/cfgtoollogs/opatch/opatch_history.txt

--------------------------------------------------------------------------------
 Patch created on 2 Dec 2010, 01:44:13 hrs PST8PDT
 Need to shutdown Oracle instances: true
 Patch is roll-backable: true
 Patch is a "Patchset Update": false
 Patch is a rolling patch: true
 Patch has sql related actions: false
 Patch is an online patch: false
 Patch is a portal patch: false
 Patch is an "auto-enabled" patch: false

 List of platforms supported:
   226: Linux x86-64

 List of bugs to be fixed:
   10188727: AFTER UPGRADING TO 11.2.0.2 SOME SQLS FAIL WITH ORA-7445 [KKEIDC()+180] ERROR

 This patch is a "singleton" patch.

 This patch belongs to the "db" product family 

 List of executables affected:
   ORACLE_HOME/bin/oracle

 List of optional components:
   oracle.rdbms:  11.2.0.2.0

 List of optional actions:
   Update /s01/product/11.2.0/dbhome_2/lib/libserver11.a with /kkopq.o
   cd /s01/product/11.2.0/dbhome_2/rdbms/lib
     ; make -f ins_rdbms.mk ioracle ORACLE_HOME=/s01/product/11.2.0/dbhome_2

  Possible XML representation of the patch:

     10188727

--------------------------------------------------------------------------------
OPatch succeeded.

[maclean@rh2 OPatch]$ ./opatch query -is_online_patch 10188727
Invoking OPatch 11.2.0.1.1

Oracle Interim Patch Installer version 11.2.0.1.1
Copyright (c) 2009, Oracle Corporation.  All rights reserved.

Oracle Home       : /s01/product/11.2.0/dbhome_2
Central Inventory : /s01/oraInventory
   from           : /etc/oraInst.loc
OPatch version    : 11.2.0.1.1
OUI version       : 11.2.0.2.0
OUI location      : /s01/product/11.2.0/dbhome_2/oui
Log file location : /s01/product/11.2.0/dbhome_2/cfgtoollogs/opatch/opatch2011-02-17_19-45-33PM.log

Patch history file: /s01/product/11.2.0/dbhome_2/cfgtoollogs/opatch/opatch_history.txt

--------------------------------------------------------------------------------
 Patch is an online patch: false

OPatch succeeded.

/* 虽然Opatch返回"online patch: false",但实际上这是一个online patch,
    造成以上问题的原因可能是Opatch版本低 */

[maclean@rh2 OPatch]$ tree 10188727
10188727
|-- README.txt
|-- etc
|   |-- config
|   |   |-- actions.xml
|   |   |-- deploy.xml
|   |   `-- inventory.xml
|   `-- xml
|       |-- GenericActions.xml
|       `-- ShiphomeDirectoryStructure.xml
|-- files
|   `-- lib
|       `-- libserver11.a
|           `-- kkopq.o
`-- online
    |-- etc
    |   |-- config
    |   |   |-- actions.xml
    |   |   |-- deploy.xml
    |   |   `-- inventory.xml
    |   `-- xml
    |       |-- GenericActions.xml
    |       `-- ShiphomeDirectoryStructure.xml
    `-- files
        `-- hpatch
            `-- bug10188727.pch

/* 可以从以上目录结构中看到包含了online子目录,我们可以直接观察其inventory.xml信息文件 */

[maclean@rh2 OPatch]$ cat 10188727/online/etc/config/inventory.xml |grep instance
 <instance_shutdown>false</instance_shutdown>
 <instance_shutdown_message></instance_shutdown_message>

/* 以上instance_shutdown为false说明其可以作为online patch实施 */

[maclean@rh2 OPatch]$ cd 10188727

/* opatch online patching的具体语法如下 */
opatch apply online -connectString  <SID>:<USERNAME>:<PASSWORD>:<NODE1>, \
<SID2>:<USERNAME>:<PASSWORD>:<NODE2>,... 

[maclean@rh2 10188727]$ opatch apply online -connectString PROD:sys:password
Invoking OPatch 11.2.0.1.1

Oracle Interim Patch Installer version 11.2.0.1.1
Copyright (c) 2009, Oracle Corporation.  All rights reserved.

Oracle Home       : /s01/product/11.2.0/dbhome_2
Central Inventory : /s01/oraInventory
   from           : /etc/oraInst.loc
OPatch version    : 11.2.0.1.1
OUI version       : 11.2.0.2.0
OUI location      : /s01/product/11.2.0/dbhome_2/oui
Log file location : /s01/product/11.2.0/dbhome_2/cfgtoollogs/opatch/opatch2011-02-17_19-49-44PM.log

Patch history file: /s01/product/11.2.0/dbhome_2/cfgtoollogs/opatch/opatch_history.txt

The patch should be applied/rolled back in '-all_nodes' mode only.
Converting the RAC mode to '-all_nodes' mode.
ApplySession applying interim patch '10188727' to OH '/s01/product/11.2.0/dbhome_2'

Running prerequisite checks...

OPatch detected non-cluster Oracle Home from the inventory and will patch the local system only.

Backing up files and inventory (not for auto-rollback) for the Oracle Home
Backing up files affected by the patch '10188727' for restore. This might take a while...
Backing up files affected by the patch '10188727' for rollback. This might take a while...

Patching component oracle.rdbms, 11.2.0.2.0...
The patch will be installed on active database instances.
Installing and enabling the online patch 'bug10188727.pch', on database 'PROD'.

ApplySession adding interim patch '10188727' to inventory

Verifying the update...
Inventory check OK: Patch ID 10188727 is registered in Oracle Home inventory with proper meta-data.
Files check OK: Files from Patch ID 10188727 are present in Oracle Home.
OPatch succeeded.

[maclean@rh2 OPatch]$ opatch lsinventory -detail|tail -21
Interim patches (1) :

Patch (online) 10188727: applied on Thu Feb 17 19:49:52 CST 2011
Unique Patch ID:  13202318
   Created on 2 Dec 2010, 01:44:15 hrs PST8PDT
   Bugs fixed:
     10188727
   Files Touched:
     bug10188727.pch --> ORACLE_HOME/hpatch/bug10188727.pch
   Instances Patched:
     PROD
   Patch Location in Inventory:
     /s01/product/11.2.0/dbhome_2/inventory/oneoffs/10188727
   Patch Location in Storage area:
     /s01/product/11.2.0/dbhome_2/.patch_storage/10188727_Dec_2_2010_01_44_15

告警日志alert.log中的信息:
Patch file bug10188727.pch is out of sync with oracle binary; performing fixup
Patch file bug10188727.pch has been synced with oracle binary
Patch bug10188727.pch Installed - Update #1
Patch bug10188727.pch Enabled - Update #2

[maclean@rh2 OPatch]$ ps -ef|grep pmon|grep -v grep
maclean  22481     1  0 19:19 ?        00:00:00 ora_pmon_PROD

[maclean@rh2 OPatch]$ pmap -d 22481|tail -10
00002abecb435000       4 rwx-- 0000000000009000 008:00002 libnque11.so
00002abecb436000    1048 rwx-- 00002abecb436000 000:00000   [ anon ]
00002abecb53c000       8 r-x-- 000000000c64e000 008:00002 oracle
00002abecb53e000    5052 r-x-- 00000000000bd000 008:00002 oracle
00002abecba2d000     140 r-x-- 0000000000000000 008:00002 bug10188727.so
00002abecba50000    1024 ----- 0000000000023000 008:00002 bug10188727.so
00002abecbb50000       8 rwx-- 0000000000023000 008:00002 bug10188727.so
00007fff3342d000      84 rwx-- 00007ffffffea000 000:00000   [ stack ]
ffffffffff600000    8192 ----- 0000000000000000 000:00000   [ anon ]
mapped: 2297720K    writeable/private: 7848K    shared: 2050052K

/* 再次观察pmon进程的内存信息,可以看到pmap输出中多出了2个小的oracle正文镜像和
    名为bug10188727.so的共享库文件,而该后台进程的private memory由原来的7840k上升到7848k,
    实际增幅为8k */

/* 此外可以通过oradebug命令将该online patch禁用,虽然并不推荐这样做 */

SQL> oradebug patch disable  bug10188727.pch;
Statement processed.

SQL>  oradebug patch disable bug10188727.pch;
Patch file already disabled

[maclean@rh2 hpatch]$ pmap -d 22481|tail -8
00002abecb53c000       8 r-x-- 000000000c64e000 008:00002 oracle
00002abecb53e000    5052 r-x-- 00000000000bd000 008:00002 oracle
00002abecba2d000     140 r-x-- 0000000000000000 008:00002 bug10188727.so
00002abecba50000    1024 ----- 0000000000023000 008:00002 bug10188727.so
00002abecbb50000       8 rwx-- 0000000000023000 008:00002 bug10188727.so
00007fff3342d000      84 rwx-- 00007ffffffea000 000:00000   [ stack ]
ffffffffff600000    8192 ----- 0000000000000000 000:00000   [ anon ]
mapped: 2297720K    writeable/private: 7848K    shared: 2050052K

/* 但disable掉online patch并不会导致在线补丁额外消耗的内存被回收 */

/* 当然我们还可以很方便地启用它 */

SQL> oradebug patch enable bug10188727.pch;
Statement processed.
SQL>  oradebug patch enable bug10188727.pch;
Patch file already enabled

[maclean@rh2 ~]$ cd $ORACLE_HOME/hpatch

[maclean@rh2 hpatch]$ ls -l
total 368
-rw-r--r-- 1 maclean oinstall 177874 Feb 17 19:49 bug10188727.pch
-rwx------ 1 maclean oinstall      1 Feb 17 19:49 bug10188727.pchPROD.fixup
-rwx------ 1 maclean oinstall 176850 Feb 17 19:49 bug10188727.so
-rw------- 1 maclean oinstall    712 Feb 17 20:13 orapatchPROD.cfg

/* 注意不要在实例启动时删除以上hpatch目录及目录下任何文件,这可能导致instance出现意外 */

[maclean@rh2 hpatch]$ cd $ORACLE_HOME/OPatch

/* 我们还能够将online patch rollback回滚掉,如以下语法 */
opatch rollback -id <patchID> -connectString  <SID>:<USERNAME>:<PASSWORD>:<NODE1>, \
<SID2>:<USERNAME>:<PASSWORD>:<NODE2>,  ...

[maclean@rh2 OPatch]$ opatch rollback -id 10188727 -connectString PROD:sys:password -invPtrLoc /s01/product/11.2.0/dbhome_2/oraInst.loc
Invoking OPatch 11.2.0.1.1

Oracle Interim Patch Installer version 11.2.0.1.1
Copyright (c) 2009, Oracle Corporation.  All rights reserved.

Oracle Home       : /s01/product/11.2.0/dbhome_2
Central Inventory : /s01/oraInventory
   from           : /s01/product/11.2.0/dbhome_2/oraInst.loc
OPatch version    : 11.2.0.1.1
OUI version       : 11.2.0.2.0
OUI location      : /s01/product/11.2.0/dbhome_2/oui
Log file location : /s01/product/11.2.0/dbhome_2/cfgtoollogs/opatch/opatch2011-02-17_20-18-00PM.log

Patch history file: /s01/product/11.2.0/dbhome_2/cfgtoollogs/opatch/opatch_history.txt

RollbackSession rolling back interim patch '10188727' from OH '/s01/product/11.2.0/dbhome_2'

The patch should be applied/rolled back in '-all_nodes' mode only.
Converting the RAC mode to '-all_nodes' mode.

Running prerequisite checks...

OPatch detected non-cluster Oracle Home from the inventory and will patch the local system only.

Backing up files affected by the patch '10188727' for restore. This might take a while...

Patching component oracle.rdbms, 11.2.0.2.0...
The patch will be removed from active database instances.
Disabling and removing online patch 'bug10188727.pch', on database 'PROD'

RollbackSession removing interim patch '10188727' from inventory

OPatch succeeded.

告警日志alert.log中的remove信息:
Patch bug10188727.pch Disabled - Update #5
Patch bug10188727.pch Removed - Update #6
Thu Feb 17 20:18:07 2011
Online patch bug10188727.pch has been disabled
Online patch bug10188727.pch has been removed

[maclean@rh2 trace]$ pmap -d 22481|tail -8
00002abecb53c000       8 r-x-- 000000000c64e000 008:00002 oracle
00002abecb53e000    5052 r-x-- 00000000000bd000 008:00002 oracle
00002abecba2d000     140 r-x-- 0000000000000000 008:00002 bug10188727.so
00002abecba50000    1024 ----- 0000000000023000 008:00002 bug10188727.so
00002abecbb50000       8 rwx-- 0000000000023000 008:00002 bug10188727.so
00007fff3342d000      84 rwx-- 00007ffffffea000 000:00000   [ stack ]
ffffffffff600000    8192 ----- 0000000000000000 000:00000   [ anon ]
mapped: 2297720K    writeable/private: 7848K    shared: 2050052K

/* 显然rollback回滚掉该online interim patch也不足以回收内存,唯一的方法是重启实例 */

SQL> startup force;

[maclean@rh2 trace]$ ps -ef|grep pmon|grep -v grep
maclean  25563     1  0 20:22 ?        00:00:00 ora_pmon_PROD

[maclean@rh2 trace]$ pmap -d 25563|tail -8
00002aaf2f324000       4 rwxs- 0000000000000000 008:00002 hc_PROD.dat
00002aaf2f325000      40 r-x-- 0000000000000000 008:00002 libnque11.so
00002aaf2f32f000    1020 ----- 000000000000a000 008:00002 libnque11.so
00002aaf2f42e000       4 rwx-- 0000000000009000 008:00002 libnque11.so
00002aaf2f42f000    1048 rwx-- 00002aaf2f42f000 000:00000   [ anon ]
00007fffdfa84000      84 rwx-- 00007ffffffea000 000:00000   [ stack ]
ffffffffff600000    8192 ----- 0000000000000000 000:00000   [ anon ]
mapped: 2291488K    writeable/private: 7840K    shared: 2050052K

/* That's ok!
    ash to ash, dust to dust! */

References:<RDBMS Online Patching Aka Hot Patching [ID 761111.1]>

解决Oracle错误ORA-15061一例

一套Linux上的11.2.0.1系统,告警日志中出现以下错误:

ORA-00202: control file: '+DATA/controlfile/current.256.7446483424'
ORA-17505: ksfdrsz:1 Failed to resize file to size 612 blocks
ORA-15061: ASM operation not supported [41]
WARNING: Oracle Managed File +FRA in the recovery area is orphaned by the control file.
The control file can not keep all recovery area files due to space limitation.
krse.c
Archived Log entry 200 added for thread 1 sequence 200 ID 0x3739c2f0 dest 1:

RMAN backup failed with

Rman backup error:
RMAN-03009: failure of backup command on ORA_DISK_1 channel at 
ORA-19510: failed to set size of 6800 blocks for file "+FRA" (block size=8192)
ORA-17505: ksfdrsz:1 Failed to resize file to size 6800 blocks
ORA-15061: ASM operation not supported [41]

[oracle@rh2 ~]$ oerr ora 15061
15061, 00000, "ASM operation not supported [%s]"
// *Cause:  An ASM operation was attempted that is invalid or not supported
//          by this version of the ASM instance.
// *Action: This is an internal error code that is used for maintaining
//          compatibility between software versions and should never be
//          visible to the user; contact Oracle support Services.
//

提交SR后,根据Oracle GCS确认为BUG:9788316:

1.The following error indicates that it failed to resize the controlfile to 612 blocks. If the DB_BLOCK_SIZE is 8192, 
then 612 blocks is not more than 5MB. According to the results of the query on V$ASM_DISKGROUP in 'results01.txt' file, 
the ASM diskgroup +DATA has 108605 MB free space. So, the ASM diskgroup +DATA has enough space for the 612 blocks.

2. By the way, please confirm whether you have recently applied PSU #1. Anyway, 
please try to relink the Oracle executables, as shown here. Before you run the "relink" command, 
make sure to shutdown both the ASM instance and target database.

$ORACLE_HOME/bin/relink all

3 After relinking the Oracle executables, please confirm whether you are still 
experiencing the same ORA-15061 error.

ORA-15061: ASM Operation Not Supported [41] After Apply PSU #1 (Doc ID 1126113.1)
ORA-15061 reported while doing a file operation with 11.1 or 11.2 ASM after PSU applied in database home (Doc ID 1070880.1)

Hdr: 9788316 11.2.0.1.1 RDBMS 11.2.0.1.0 ASM PRODID-5 PORTID-267
Abstract: AFTER APPLY PSU 1 (11.2.0.1.1) ON RDBMS HOME UNABLE TO RESIZE ASM DATAFILE.

*** 06/07/10 02:09 pm ***
—-
3-1827355081

PROBLEM:
——–
1) After apply PSU 1 (11.1.0.2.1) on the 11.2.0.1.0 RDBMS Oracle Home
customer is unable to resize an ASM datafile:
============================================================
SQL> alter database datafile
‘+DATA/brg13ed/datafile/ts_brg_users.1279.711064345’ resize 560M;
alter database datafile ‘+DATA/brg13ed/datafile/ts_brg_users.1279.711064345’
resize 560M
*
ERROR at line 1:
ORA-1237: cannot extend datafile 4
ORA-1110: data file 4: ‘+DATA/brg13ed/datafile/ts_brg_users.1279.711064345’
ORA-17505: ksfdrsz:1 Failed to resize file to size 71680 blocks
ORA-15061: ASM operation not supported [41]

============================================================

2) DB alertlog reports (alert_brg13ed1.log):
============================================================

Mon Jun 07 11:27:57 2010
Stopping background process CJQ0
Mon Jun 07 12:30:28 2010
alter database datafile ‘+DATA/brg13ed/datafile/ts_brg_users.1279.711064345’
resize 536870915
ORA-1237 signalled during: alter database datafile
‘+DATA/brg13ed/datafile/ts_brg_users.1279.711064345’ resize 536870915…
Mon Jun 07 13:07:40 2010
alter database datafile ‘+DATA/brg13ed/datafile/ts_brg_users.1279.711064345’
resize 560m
ORA-1237 signalled during: alter database datafile
‘+DATA/brg13ed/datafile/ts_brg_users.1279.711064345’ resize 560m…
Mon Jun 07 13:49:50 2010
alter database datafile ‘+DATA/brg13ed/datafile/ts_brg_users.1279.711064345’
resize 800M
ORA-1237 signalled during: alter database datafile
‘+DATA/brg13ed/datafile/ts_brg_users.1279.711064345’ resize 800M…
Mon Jun 07 13:58:51 2010
alter database datafile ‘+DATA/brg13ed/datafile/ts_brg_users.1279.711064345’
resize 560M
ORA-1237 signalled during: alter database datafile
‘+DATA/brg13ed/datafile/ts_brg_users.1279.711064345’ resize 560M…
Mon Jun 07 14:25:43 2010

============================================================

3) ASM alert.log does not report any problem.

DIAGNOSTIC ANALYSIS:
——————–
(see below)

WORKAROUND:
———–
None

RELATED BUGS:
————-

REPRODUCIBILITY:
—————-

TEST CASE:
———-

STACK TRACE:
————

SUPPORTING INFORMATION:
———————–

24 HOUR CONTACT INFORMATION FOR P1 BUGS:
—————————————-

DIAL-IN INFORMATION:
——————–

IMPACT DATE:
————

*** 06/07/10 02:09 pm ***
4) I can confirm that at (Jun 07 10:55:59 EDT 2010) customer installed the
Patch: 9355126 (on the Grid OH), which includes the fix for Bug: 8898852:
============================================================

Invoking OPatch 11.2.0.1.2

Oracle Interim Patch Installer version 11.2.0.1.2

Oracle Home : /u01/grid/oracle/product/grid
Central Inventory : /u01/app/oracle/oraInventory
from : /var/opt/oracle/oraInst.loc
OPatch version : 11.2.0.1.2
OUI version : 11.2.0.1.0
OUI location : /u01/grid/oracle/product/grid/oui
——————————————————————————

Installed Top-level Products (1):

Oracle Grid Infrastructure
11.2.0.1.0
There are 1 products installed in this Oracle Home.

Installed Products (87):

There are 87 products installed in this Oracle Home.

Interim patches (1) :

Patch 9355126 : applied on Mon Jun 07 10:55:59 EDT 2010
Unique Patch ID: 12175902
Created on 5 Feb 2010, 07:38:30 hrs PST8PDT
Bugs fixed:
8974548, 8898852
============================================================

5) Also, I can confirm that at (Mon May 03 00:01:14 EDT 2010) customer
installed the 11.2.0.1.1 Patch Set Update (PSU #1) which include as well the
patch for Bug: 8898852 (on the RDBMS OH):
============================================================
Invoking OPatch 11.2.0.1.2

Oracle Interim Patch Installer version 11.2.0.1.2

Oracle Home : /u01/app/oracle/product/11.2.0/db1
Central Inventory : /u01/app/oracle/oraInventory
from : /var/opt/oracle/oraInst.loc
OPatch version : 11.2.0.1.2
OUI version : 11.2.0.1.0
OUI location : /u01/app/oracle/product/11.2.0/db1//oui

——————————————————————————

Installed Top-level Products (1):

Oracle Database 11g
11.2.0.1.0
There are 1 products installed in this Oracle Home.

Installed Products (134):

. 11.2.0.1.0
There are 134 products installed in this Oracle Home.

Interim patches (1) :

Patch 9352237 : applied on Mon May 03 00:01:14 EDT 2010
Unique Patch ID: 12366369
Created on 6 Apr 2010, 05:03:41 hrs PST8PDT
Bugs fixed:
8661168, 8769239, 8898852, 8801119, 9054253, 8706590, 8725286, 8974548
8778277, 8780372, 8769569, 9027691, 9454036, 9454037, 9454038, 8761974
7705591, 8496830, 8702892, 8639114, 8723477, 8729793, 8919682, 8818983
9001453, 8475069, 9328668, 8891929, 8798317, 8820324, 8733749, 8702535
8565708, 9036013, 8735201, 8684517, 8870559, 8773383, 8933870, 8812705
8405205, 8822365, 8813366, 8761260, 8790767, 8795418, 8913269, 8897784
8760714, 8717461, 8671349, 8775569, 8898589, 8861700, 8607693, 8642202
8780281, 9369797, 8780711, 8784929, 8834636, 9015983, 8891037, 8828328
8570322, 8832205, 8665189, 8717031, 8685253, 8718952, 8799099, 8633358
9032717, 9321701, 8588519, 8783738, 8796511, 8782971, 8756598, 9454385
8856497, 8703064, 9066116, 9007102, 8721315, 8818175, 8674263, 9352237
8753903, 8720447, 9057443, 8790561, 8733225, 9197917, 8928276, 8991997,
8837736
============================================================

6) But the problem persists:
============================================================
SQL> alter database datafile
‘+DATA/brg13ed/datafile/ts_brg_users.1279.711064345’ resize 560M;
alter database datafile ‘+DATA/brg13ed/datafile/ts_brg_users.1279.711064345’
resize 560M
*
ERROR at line 1:
ORA-1237: cannot extend datafile 4
ORA-1110: data file 4: ‘+DATA/brg13ed/datafile/ts_brg_users.1279.711064345’
ORA-17505: ksfdrsz:1 Failed to resize file to size 71680 blocks
ORA-15061: ASM operation not supported [41]

============================================================

7) 11.2.0.1.1 Patch Set Update (PSU #1) should include the fix for bug:
8898852 (ORA-15061: ASM operation not supported [41]).
This PSU contains a fix that adds an operation to the ASM protocol.
The problem is that the PSU has been applied correctly to the RDBMS
home, but the fix is not correctly applied to the ASM HOME.

So either install that patch to the ASM HOME, or if that has been done
check why it failed.

A common reason for failure is a permission problem. Check bug 9711074
and its duplicates for more background. And check documentation bug 8629483
for the solution.

How do I know the fix is not missing? This error message is the key:
ORA-15061: ASM operation not supported [41]
(especially with operation 41)

If the patch was applied to the grid home, then probably the relink failed
due to permission problems.

ORA-15061: ASM Operation Not Supported [41] After Apply PSU #1 [ID 1126113.1]

pplies to:
Oracle Server – Enterprise Edition – Version: 11.2.0.1 and later [Release: 11.2 and later ]
Information in this document applies to any platform.
Symptoms
1) After apply PSU 1 (11.2.0.1.1 ) on the Grid Infrastructure Oracle Home
customer is unable to resize an ASM datafile:

SQL> alter database datafile
‘+DATA/brg13ed/datafile/ts_brg_users.1279.711064345’ resize 560M;
alter database datafile ‘+DATA/brg13ed/datafile/ts_brg_users.1279.711064345’
resize 560M
*
ERROR at line 1:
ORA-1237: cannot extend datafile 4
ORA-1110: data file 4: ‘+DATA/brg13ed/datafile/ts_brg_users.1279.711064345’
ORA-17505: ksfdrsz:1 Failed to resize file to size 71680 blocks
ORA-15061: ASM operation not supported [41]

2) DB alertlog reports (alert_brg13ed1.log):

Mon Jun 07 11:27:57 2010
Stopping background process CJQ0
Mon Jun 07 12:30:28 2010
alter database datafile ‘+DATA/brg13ed/datafile/ts_brg_users.1279.711064345’
resize 536870915
ORA-1237 signalled during: alter database datafile
‘+DATA/brg13ed/datafile/ts_brg_users.1279.711064345’ resize 536870915…
Mon Jun 07 13:07:40 2010
alter database datafile ‘+DATA/brg13ed/datafile/ts_brg_users.1279.711064345’
resize 560m
ORA-1237 signalled during: alter database datafile
‘+DATA/brg13ed/datafile/ts_brg_users.1279.711064345’ resize 560m…
Mon Jun 07 13:49:50 2010
alter database datafile ‘+DATA/brg13ed/datafile/ts_brg_users.1279.711064345’
resize 800M
ORA-1237 signalled during: alter database datafile
‘+DATA/brg13ed/datafile/ts_brg_users.1279.711064345’ resize 800M…
Mon Jun 07 13:58:51 2010
alter database datafile ‘+DATA/brg13ed/datafile/ts_brg_users.1279.711064345’
resize 560M
ORA-1237 signalled during: alter database datafile
‘+DATA/brg13ed/datafile/ts_brg_users.1279.711064345’ resize 560M…
Mon Jun 07 14:25:43 2010

3) ASM alert.log does not report any problem.

Changes
PSU 1 (11.2.0.1.1 ) patchset was installed on the Grid Infrastructure Oracle Home
Cause

The PSU #1 (11.2.0.1.1 ) patch on the Grid Infrastructure Oracle Home was not correctly linked by opatch.

The PSU #1 (11.2.0.1.1 ) patch on the Grid Infrastructure Oracle Home needs to be relinked.
Solution
Relink the PSU #1 patch on the Grid Infrastructure Oracle Home as follow:

1) Shutdown the database instances.

2) Shutdown the ASM instance.

3) Relink the Grid Infrastructure Oracle Home as follow:

script /tmp/relink_GI.txt

env | sort

$ORACLE_HOME/bin/relink all

exit

4) Startup the ASM instance.

5) Startup the databases.

6) Resize the datafile:

SQL> alter database datafile ‘+DATA/brg13ed/datafile/ts_brg_users.1279.711064345’ resize 560M;

该bug一般是由于对grid infrastructure实施了PSU 1 (11.2.0.1.1 )后,Oracle binary没有正确link导致的,可行的解决方案是relink all,但这需要重启instance!

Oracle RAC内部错误:ORA-00600[keltnfy-ldmInit]一例

一套SUNOS上的2节点10.2.0.2 RAC系统日前出现ORA-00600: internal error code, arguments: [keltnfy-ldmInit], [46], [1], [], [], [], [], []内部错误,错误发生时系统操作人员误使用hostname命令修改了1号主机的主机名,之后陆续出现以上ora-00600错误,同时操作系统日志显示RAC CSS进程意外终止,具体日志如下:

================== OS Message=====================
Jan 10 11:15:10 cupd25k-a root: [ID 702911 user.error] Cluster Ready Services completed waiting on dependencies.
Jan 10 11:15:16 cupd25k-a root: [ID 702911 user.error] Duplicate Oracle CLSMON found. Killing and restarting it.
Jan 10 11:15:16 cupd25k-a root: [ID 702911 user.error] Oracle CSS daemon failed to start up. Check CRS logs for diagnostics.
Jan 10 11:15:16 cupd25k-a root: [ID 702911 user.error] Oracle CLSMON terminated with unexpected status 137. Respawning

/* 这里的Duplicate Oracle CLSMON found 因该指的是OCLSMON进程,
"In Oracle 10.2.0.2 and above there is an additional process called OCLSOMON
which monitors the CSS daemon for hangs or scheduling issues and can reboot a
node if there is a perceived hang. OCLSOMON is spawned in init.cssd and runs
as the Oracle user."
   oclsmon进程在10.2.0.2以后版本被引入,用以监视css进程,
   若发生hang或操作系统调度问题时该进程可能会reboot节点,
   oclsmon进程会被init.cssd脚本spawned.  */

==================oclsmon.log======================
2011-01-10 11:15:11.376
unspecified member number is (1)
Member 1 group OCLSMON_ in use. Is oclsmon already up?
2011-01-10 11:15:11.479
Internal Error Information:
  Category: 8
  Operation: skgxnreg: the member number is i
  Location: skgxnreg_7
  Other:
  Dep: 1
2011-01-10 11:15:11.737
unspecified member number is (1)
Member 1 group OCLSMON_ in use. Is oclsmon already up?
2011-01-10 11:15:11.751
Internal Error Information:
  Category: 8
  Operation: skgxnreg: the member number is i
  Location: skgxnreg_7
  Other:
  Dep: 1
2011-01-10 11:15:12.006
unspecified member number is (1)
Member 1 group OCLSMON_ in use. Is oclsmon already up?
2011-01-10 11:15:12.023
Internal Error Information:
  Category: 8
  Operation: skgxnreg: the member number is i
  Location: skgxnreg_7
  Other:
  Dep: 1
2011-01-10 11:15:12.278
unspecified member number is (1)
Member 1 group OCLSMON_ in use. Is oclsmon already up?
2011-01-10 11:15:12.293
Internal Error Information:
  Category: 8
  Operation: skgxnreg: the member number is i
  Location: skgxnreg_7
  Other:
  Dep: 1

/*  skgxn是Oracle Clusterware用以监视skgxn事件(即第三方CLUSTERWARE相关的事宜,他们应该有用sun的cluster);
    似乎是修改hostname导致了Oracle CSS出现了fatal error,并启动了一个以上的OCLSMON进程(Duplicate Oracle CLSMON found),
    最后"Oracle CSS daemon failed to start up. Check CRS logs for diagnostics",
    在Oracle instance启动的情况下25k-a节点的CSS进程意外终止,
    可能导致该节点上的所有实例的LMD(global Enqueue Service daemon)、LMON无法正常工作而导致实例hang住。*/

==========================alert.log====================
Errors in file /oracle/oracle/admin/BOCPCS/udump/bocpcs1_ora_12320.trc:
ORA-00600: internal error code, arguments: [keltnfy-ldmInit], [46], [1], [], [], [], [], []

=========================part of trace file===============
*** 2011-01-10 11:11:02.957
ksedmp: internal or fatal error
ORA-00600: internal error code, arguments: [keltnfy-ldmInit], [46], [1], [], [], [], [], []
Current SQL information unavailable - no session.
----- Call Stack Trace -----
calling              call     entry                argument values in hex
location             type     point                (? means dubious value)
-------------------- -------- -------------------- ----------------------------
ksedmp()+716         CALL     ksedst()             FFFFFFFF7FFF9D40 ?
                                                   000000000 ? 0FFFFFFFF ?
                                                   FFFFFFFF7FFF8EE8 ?
                                                   FFFFFFFF7FFFA640 ?
                                                   000000008 ?
kgerinv()+200        PTR_CALL 0000000000000000     000000002 ? 10638A1CC ?
                                                   000000001 ? 000000000 ?
                                                   10638A000 ? 10638A1CC ?
kgeasnmierr()+28     CALL     kgerinv()            106384B98 ? 000000000 ?
                                                   105D3B940 ? 000000002 ?
                                                   FFFFFFFF7FFFDFF0 ?
                                                   000001430 ?
keltnfy()+784        CALL     kgeasnmierr()        106384B98 ? 1064DCBF0 ?
                                                   105D3B940 ? 000000002 ?
                                                   000000000 ? 00000002E ?
kscnfy()+552         PTR_CALL 0000000000000000     10639B498 ? 38001E7A8 ?
                                                   1055AC5D0 ? 10639B498 ?
                                                   000102C00 ? 10638A1C0 ?
ksucrp()+2436        CALL     kscnfy()             000008000 ? 000808214 ?
                                                   100C4C220 ? 1055C6680 ?
                                                   00000000F ? 000000001 ?
opiino()+2056        CALL     ksucrp()             000106387 ? 380007608 ?
                                                   000000000 ? 000380000 ?
                                                   000106000 ? 106387618 ?
opiodr()+1488        PTR_CALL 0000000000000000     10555A000 ?
                                                   FFFFFFFF7FFFF1C8 ?
                                                   00010555A ? 000106000 ?
                                                   105C83000 ? 000000001 ?
opidrv()+828         CALL     opiodr()             106391000 ? 000000000 ?
                                                   106390DD8 ? 106390000 ?
                                                   106391BD0 ? 000106000 ?
sou2o()+80           CALL     opidrv()             106394358 ? 000000001 ?
                                                   00000003C ? 000000000 ?
                                                   00000003C ? 000106000 ?
opimai_real()+124    CALL     sou2o()              FFFFFFFF7FFFF788 ?
                                                   00000003C ? 000000004 ?
                                                   FFFFFFFF7FFFF7B0 ?
                                                   105C82000 ? 000105C82 ?
main()+152           CALL     opimai_real()        000000002 ?
                                                   FFFFFFFF7FFFF888 ?
                                                   103F1BBCC ? 10632DB10 ?
                                                   002411E44 ? 000014400 ?
_start()+380         CALL     main()               000000002 ? 000000008 ?
                                                   000000000 ?
                                                   FFFFFFFF7FFFF898 ?
                                                   FFFFFFFF7FFFF9A8 ?
                                                   FFFFFFFF7C700200 ?

/* 可以看到以上trace文件指出了no session,
    在服务进程启动阶段遭遇了该keltnfy-ldmInit内部错误*/

metalink文档Startup Database Produces Ora-00600: [Keltnfy-Ldminit] [ID 336447.1]
介绍了该内部错误一般由主机上的不当网络配置引起,很显然使用hostname命令修改了一个无法解析的
主机名时可能引发该ORA-00600[keltnfy-ldmInit]内部错误。

Applies to:
Oracle Server - Enterprise Edition - Version: 10.2.0.1 to 10.2.0.3 - Release: 10.2 to 10.2
Information in this document applies to any platform.
***Checked for relevance on 09-Jun-2010***
Symptoms

An startup nomount on Oracle 10g Release 2 database produces the following exception in alert log

Starting up ORACLE RDBMS Version: 10.2.0.1.0.
Errors in file /opt/oracle/10.2/admin/ORCL/udump/ORCL_ora_535.trc:
ORA-00600: internal error code, arguments: [keltnfy-ldmInit], [46], [1], [], [], [], [], []
USER: terminating instance due to error 600
Instance terminated by USER, pid = 535
Cause

The problem is related to getting host information.
In this case, ldmInit()/sldmInit() is failing with error 46 : LDMERR_HOST_NOT_FOUND

The following exception may also occur :

LDMERR_SOSD_INIT         OSD init failed to be specific in these OSD failures
 LDMERR_BAD_ADDR         bad address when system call gethostname failed
 LDMERR_HOST_NOT_FOUND   gethostbyname system call fails
 LDMERR_NO_SUPPORT       when specific address type is not supported

Development has fixed two bugs so far regarding this issue

Bug:5438154 - Abstract: ORA-600[KELTNFY-LDMINIT]  STARTING THE DB
Release Notes:
ldmInit returned LDMERR_HOST_NOT_FOUND for the machine huge alias list/address list
Workaround:
reduce the alais list of the machine

Bug:5486074 - Abstract: ORA-600 [KELTNFY-LDMINIT] WHEN DNS IS NOT AVAILABLE
Release Notes:
Internal error is raised by the Server Generated Alert subsystem when it can not determine Host Name or
Network Address. This can be caused by DNS server being unaavilable. 

Solution

The fix for 5486074 will not fix any underlying error from gethostbyname(), it just change the internal error to a warning message :

 "Warning: keltnfy call to ldmInit failed with error 46"

You will still need to fix the network config issue.  

These are the check you can do verify the host information 

      Check permission on /etc/hosts 

$ ls -l /etc/hosts
-rw-r--r--  2 root root 194 Oct 17  2006 /etc/hosts

      Check if /etc/hosts file is correctly configured

              ( all of this on one line ). 

Check the hostname:
$ hostname
$ ping `hostname`

Make sure you are able to ping the hostname
      Check if /etc/nodename is correctly configured

If you have DNS setup, ping is not a tool to diagnose DNS problem. A better tool to use is nslookup, dnsquery, or dig.

$ nslookup
$ nslookup
$ nslookup 

The forward and reverse lookup should succeed and return consistent address/info.  

 Check nsswitch.conf

$ more nsswitch.conf
hosts:      files dns
Make sure host lookup is also done through the /etc/hosts file and not just dns.  It is recommended that FILES come first before DNS.
Also, check the resolv.conf. This makes sure that the DNS is working properly.

显然在生产主机上使用hostname命令是危险的,因为你很难保证你在打字的时候不会因为同事的一下拍击而输错,有人说在生产环境中rm命令因该被禁用,那么这种特殊待遇对hostname命令也适用,我们可以用什么来代替hostname查看主机名呢?选择可以有非常多,这里我推荐一种:

-bash-3.00$ oslevel -r 
5300-07

-bash-3.00$ hostname
askmac.cn

-bash-3.00$ uname -n
askmac.cn

/* uname -n完全可以满足你的需要! */
That's great!

Oracle内部错误:ORA-00600[OSDEP_INTERNAL]一例

一套HP-UX上的9.2.0.5系统在shutdown abort时出现ORA-00600: internal error code, arguments: [OSDEP_INTERNAL], [], [], [], [], [], [], []内部错误,伴随有ORA-27302: failure occurred at: skgpwinit4,ORA-27303: additional information: attach to invalid skgp shared ctx,具体日志如下:

/opt/oracle/product/9.2.0.5/rdbms/log/ngende_ora_7669.trc
Oracle9i Enterprise Edition Release 9.2.0.5.0 - 64bit Production
With the Partitioning, OLAP and Oracle Data Mining options
JServer Release 9.2.0.5.0 - Production
ORACLE_HOME = /opt/oracle/product/9.2.0.5
System name: HP-UX
Node name: yictngd3
Release: B.11.23
Version: U
Machine: ia64
Instance name: nGende
Redo thread mounted by this instance: 0 
Oracle process number: 0
7669

*** 2010-09-08 00:10:02.985
ksedmp: internal or fatal error
ORA-00600: internal error code, arguments: [OSDEP_INTERNAL], [], [], [], [], [], [], []
ORA-27302: failure occurred at: skgpwinit4
ORA-27303: additional information: attach to invalid skgp shared ctx
Current SQL information unavailable - no session.

Call stack
--------------
ksedmp <- ksfdmp <- kgerinv <- kgerin <- kgerecoserr <- ksucrp <- ksucresg <- kpolna 
<- kpogsk <- opiodr <- ttcpip <- opitsk <- Cannot <- Cannot <- Cannot <- Cannot <- opiino 
<- opiodr <- opidrv <- sou2o <- main <- main_opd_entry

经查该内部错误与操作系统共享内存有关,相关的Note有:

Ora-00600: Internal Error Code, Arguments: [Osdep_internal] [ID 304027.1]
Applies to:
Oracle Server - Enterprise Edition - Version: 9.2.0.2 to 10.2.0.3 - Release: 9.2 to 10.2
Information in this document applies to any platform.
***Checked for relevance on 03-NOV-2010***

Getting ORA-600 [OSDEP_INTERNAL] errors while starting up the database:

ORA-00600: internal error code, arguments: [OSDEP_INTERNAL],
[], [], [], [], [], [], []
ORA-27302: failure occurred at: skgpwreset1
ORA-27303: additional information: invalid shared ctx
ORA-27146: post/wait initialization failed
ORA-27300: OS system dependent operation:semget failed with status: 28
ORA-27301: OS failure message: No space left on device
ORA-27302: failure occurred at: sskgpsemsper
Symptoms
Getting ORA-600 [OSDEP_INTERNAL]
Accompanied by the following errors
ORA-27302:Failure occured at: skgpwreset1
ORA-27303:additional information: invalid shared ctx
ORA-27146: post/wait initialization failed
ORA-27300: OS system dependent operation: segment failed with error 28
ORA-27301: OS system Failure message: No space left on device
ORA-27302: failure occured at: sskgpsemsper

Cause
The functions in the trace file generated point to the semaphore settings .
Smmns is set too low.

Solution
set semmns 32767
Arrange to make the changes persistent as per the Operating system then restart the server and check if the changes are persistent.
eg: Linux /etc/sysctl.conf

sem = semmsl semmns semopm semmni
kernel.sem = 256 32768 100 228

Getting ORA-00600 [OSDEP_INTERNAL]: Internal Error While Trying To Connect / As Sysdba [ID 253885.1]

Applies to:
Oracle Server - Enterprise Edition - Version: 9.2.0.3 and later   [Release: 9.2 and later ]
HP-UX PA-RISC (64-bit)
Symptoms
Getting following error while trying to connect as sysdba using sqlplus:

SQL> conn / as sysdba
ERROR:
ORA-01041: internal error. hostdef extension doesn't exist

Alert.log shows:

ORA-00600: internal error code, arguments: [OSDEP_INTERNAL], [], [], [], [], [],[], []
ORA-27302: failure occurred at: skgpwinit4
ORA-27303: additional information: attach to invalid skgp shared ctx
Cause
- Database was shutdown using "shutdown abort" option.
- Shared memory segment was not removed even though the instance was down.
Solution
+ Check which shared memory segments are owned by the oracle owner

Use the ipcs -bm command:

% ipcs -bm

m 34034336 0xf8f18468 --rw-r----- ORACLE dba 16777216

+ Delete the 'orphan' shared memory segments:

% ipcrm -m 34034336

If there is more than one instance running on the server and you are not sure how to identify the shared
memory segments then please contact support.

不恰当的设置OS VM参数可能导致该问题,而在HP-UX PA-RISC平台上使用'shotdown abort'命令时可能因为共享内存未能正常移除而出现该内部错误;因为实例还是以'abort'方式关闭的,仅仅是共享内存未能释放,所以只需要以ipcs->ipcrm等os命令将相应的共享内存段释放就可以了,不会造成其他影响。

Oracle内部错误:ORA-00600[15801], [1]一例

一套Sparc Solaris上的11.1.0.7系统,在创建索引时频繁出现ORA-00600: internal error code, arguments: [15801], [1], [], [], [], [], [], [], [], [], [], []内部错误,日志信息如下:

Tue Aug 17 17:34:21 2010
WARNING: Oracle executable binary mismatch detected.
Binary of new process does not match binary which started instance
issue alter system set "_disable_image_check" = true to disable these messages
Tue Aug 17 17:34:21 2010
Errors in file /u04/app/oracle/diag/rdbms/orahcmu/ORAHCMU/trace/ORAHCMU_p023_22262.trc (incident=12505):
ORA-00600: internal error code, arguments: [15801], [1], [], [], [], [], [], [], [], [], [], []
Incident details in: /u04/app/oracle/diag/rdbms/orahcmu/ORAHCMU/incident/incdir_12505/ORAHCMU_p023_22262_i12505.trc
Tue Aug 17 17:34:21 2010
Errors in file /u04/app/oracle/diag/rdbms/orahcmu/ORAHCMU/trace/ORAHCMU_p021_22258.trc (incident=12489):
ORA-00600: internal error code, arguments: [15801], [1], [], [], [], [], [], [], [], [], [], []
Incident details in: /u04/app/oracle/diag/rdbms/orahcmu/ORAHCMU/incident/incdir_12489/ORAHCMU_p021_22258_i12489.trc

Errors in file /u04/app/oracle/diag/rdbms/orahcmu/ORAHCMU/trace/ORAHCMU_p015_9328.trc (incident=19909):
ORA-00600: internal error code, arguments: [15801], [1], [], [], [], [], [], [], [], [], [], []
Errors in file /u04/app/oracle/diag/rdbms/orahcmu/ORAHCMU/trace/ORAHCMU_p043_9388.trc (incident=20133):
ORA-00600: internal error code, arguments: [15801], [1], [], [], [], [], [], [], [], [], [], []
Mon Aug 23 14:43:42 2010
Errors in file /u04/app/oracle/diag/rdbms/orahcmu/ORAHCMU/trace/ORAHCMU_p087_9668.trc (incident=20485):
ORA-00600: internal error code, arguments: [15801], [1], [], [], [], [], [], [], [], [], [], []
Mon Aug 23 14:43:42 2010
Errors in file /u04/app/oracle/diag/rdbms/orahcmu/ORAHCMU/trace/ORAHCMU_p012_9322.trc (incident=19885):
ORA-00600: internal error code, arguments: [15801], [1], [], [], [], [], [], [], [], [], [], []
Incident details in: /u04/app/oracle/diag/rdbms/orahcmu/ORAHCMU/incident/incdir_19789/ORAHCMU_ora_8602_i19789.trc
Mon Aug 23 14:43:43 2010
WARNING: Oracle executable binary mismatch detected.
Binary of new process does not match binary which started instance
issue alter system set "_disable_image_check" = true to disable these messages

Dump continued from file: /u04/app/oracle/diag/rdbms/orahcmu/ORAHCMU/trace/ORAHCMU_ora_8602.trc
ORA-00600: internal error code, arguments: [15801], [1], [], [], [], [], [], [], [], [], [], []

*** 2010-08-23 14:43:42.974
----- Current SQL Statement for this session (sql_id=00abhfx460qm9) -----
CREATE UNIQUE iNDEX PS_HM_BEN_GP_STG ON PS_HM_BEN_GP_STG (CAL_ID, GP_PAYGROUP, 
EMPLID, EMPL_RCD, HM_INCURRED_BY, HM_SUM_ASSURED) TABLESPACE PSINDEX STORAGE 
(INITIAL 40000 NEXT 100000 MAXEXTENTS UNLIMITED PCTINCREASE 0) PCTFREE 10 PARALLEL NOLOGGING

----- Call Stack Trace -----
ksedst1 ksedst dbkedDefDump dbgexPhaseII dbgexProcessError dbgePostErrorKGE kgeade kgerem
kxfpProcessError kxfpqidqr kxfpqdqr kxfxgs kxfxcp qerpxSendParse kxfpValidateSlaveGroup kxfpgsg
kxfrAllocSlaves kxfrialo kxfralo qerpx_rowsrc_start qerpxStart kdicrws kdicdrv opiexe opiosq0
kpooprx kpoal8 opiodr ttcpip opitsk opiino opiodr opidrv sou2o main



SO: 0x3bf0bbf20, type: 4, owner: 0x3bf5452d0, flag: INIT/-/-/0x00 if: 0x3 c: 0x3
proc=0x3bf5452d0, name=session, file=ksu.h LINE:10719 ID:, pg=0
(session) sid: 217 ser: 767 trans: 0x3bc0660f8, creator: 0x3bf5452d0
flags: (0x8000041) USR/- flags_idl: (0x1) BSY/-/-/-/-/-
flags2: (0x44008) DDLT1/-
DID: , short-term DID:
txn branch: 0x0
oct: 9, prv: 0, sql: 0x3b5d14510, psql: 0x3b6d59820, user: 31/SYSADM
ksuxds FALSE at location: 0
service name: ORAHCMU
client details:
O/S info: user: Administrator, term: UJWALTPVM, ospid: 304:2892
machine: WORKGROUP\UJWALTPVM program: pside.exe
client info: ujwal,Administrator,UJWALTPVM,,pside.exe,
application name: pside.exe, hash value=2824484291
Current Wait Stack:
Not in wait; last wait ended 2.475286 sec ago
Wait State:
auto_close=0 flags=0x21 boundary=0x0/-1
Session Wait History:
0: waited for 'lient'
=c8, =1, =0
wait_id=10483 seq_num=10484 snap_id=1
wait times: snap=0.168502 sec, exc=0.168502 sec, total=0.168502 sec
wait times: max=2.000000 sec
wait counts: calls=1 os=1
occurred after 0.000903 sec of elapsed time
1: waited for ' waiting for ruleset'
=10010063, =1, =0
wait_id=10482 seq_num=10483 snap_id=1
wait times: snap=0.008580 sec, exc=0.008580 sec, total=0.008580 sec
wait times: max=2.000000 sec
wait counts: calls=1 os=1
occurred after 0.000731 sec of elapsed time
2: waited for ' waiting for ruleset'
=1001004f, =4, =0
wait_id=10481 seq_num=10482 snap_id=1
wait times: snap=0.000132 sec, exc=0.000132 sec, total=0.000132 sec
wait times: max=2.000000 sec
wait counts: calls=1 os=1
occurred after 0.000074 sec of elapsed time
3: waited for ' waiting for ruleset'
=1001004f, =3, =0
wait_id=10480 seq_num=10481 snap_id=1
wait times: snap=0.000002 sec, exc=0.000002 sec, total=0.000002 sec
wait times: max=2.000000 sec
wait counts: calls=1 os=1
occurred after 0.000065 sec of elapsed time

----- Session Cursor Dump -----
Current cursor: 1, pgadep=0

Open cursors(pls, sys, hwm, max): 3(0, 2, 64, 300)
NULL=1 SYNTAX=0 PARSE=0 BOUND=1 FETCH=0 ROW=1
Cached frame pages(total, free):
4k(14, 14), 8k(1, 1), 16k(1, 1), 32k(0, 0)

----- Current Cursor -----


----- Plan Table -----

============
Plan Table
============
----------------------------------------------------+-----------------------------------+-------------------------+
| Id | Operation | Name | Rows | Bytes | Cost | Time | TQ |IN-OUT|PQ Distrib |
----------------------------------------------------+-----------------------------------+-------------------------+
| 0 | CREATE INDEX STATEMENT | | | | 2 | | | | |
| 1 | PX COORDINATOR | | | | | | | | |
| 2 | PX SEND QC (ORDER) | :TQ10001 | 82 | 4510 | | |:Q1001| P->S |QC (ORDER) |
| 3 | INDEX BUILD UNIQUE | PS_HM_BEN_GP_STG| | | | |:Q1001| PCWP | |
| 4 | SORT CREATE INDEX | | 82 | 4510 | | |:Q1001| PCWP | |
| 5 | PX RECEIVE | | 82 | 4510 | 2 | 00:00:01 |:Q1001| PCWP | |
| 6 | PX SEND RANGE | :TQ10000 | 82 | 4510 | 2 | 00:00:01 |:Q1000| P->P |RANGE |
| 7 | PX BLOCK ITERATOR | | 82 | 4510 | 2 | 00:00:01 |:Q1000| PCWC | |
| 8 | TABLE ACCESS FULL | PS_HM_BEN_GP_STG| 82 | 4510 | 2 | 00:00:01 |:Q1000| PCWP | |
----------------------------------------------------+-----------------------------------+-------------------------+


----------------------------------------
Cursor#1(0xffffffff7ce31928) state=BOUND curiob=0xffffffff7ce57d28
curflg=4c fl2=0 par=0x0 ses=0x3bf0bbf20
----- Dump Cursor sql_id=00abhfx460qm9 xsc=0xffffffff7ce57d28 cur=0xffffffff7ce31928 -----
Dump Parent Cursor sql_id=00abhfx460qm9 phd=0x3b5d14510 plk=0x3b0bb3318
sqltxt(0x3b5d14510)=CREATE UNIQUE iNDEX PS_HM_BEN_GP_STG ON PS_HM_BEN_GP_STG 
(CAL_ID, GP_PAYGROUP, EMPLID, EMPL_RCD, HM_INCURRED_BY, HM_SUM_ASSURED) 
TABLESPACE PSINDEX STORAGE (INITIAL 40000 NEXT 100000 MAXEXTENTS UNLIMITED PCTINCREASE 0) 
PCTFREE 10 PARALLEL NOLOGGING
hash=616eaa631fc21f4c0029707748605a69
parent=0x3ae539590 maxchild=01 plk=0x3b0bb3318 ppn=n
cursor instantiation=0xffffffff7ce57d28 used=1282545779 exec_id=16777216 exec=1
child#0(0x3b5d05e10) pcs=0x3b678c128
clk=0x3b7e200d0 ci=0x3b5b204c8 pn=0x39955d2b8 ctx=0x3b86ee988
kgsccflg=0 llk[0xffffffff7ce57d30,0xffffffff7ce57d30] idx=0
xscflg=c0102276 fl2=c000400 fl3=2202008 fl4=100
Frames pfr 0xffffffff7ce67098 siz=85976 efr 0xffffffff7ce66fb8 siz=85960
Cursor frame dump
enxt: 7.0x00000168 enxt: 6.0x00008000 enxt: 5.0x00008000 enxt: 4.0x00003978
enxt: 3.0x00000490 enxt: 2.0x000000b8 enxt: 1.0x00000fa0
pnxt: 1.0x00000010
kxscphp=0xffffffff7dd80a18 siz=984 inu=312 nps=312
kxscwhp=0xffffffff7ddd2cc8 siz=8136 inu=6264 nps=3968
kxscefhp=0xffffffff7ce51468 siz=88456 inu=86128 nps=86128


FileName
----------------
ORAHCMU_ora_8602.trc

FileComment
----------------------


Oracle Support - August 27, 2010 6:13:39 PM GMT+08:00 [ODM Data Collection]
Name
--------
=== ODM Data Collection ===

=== ODM Data Collection ===

Trace file /u04/app/oracle/diag/rdbms/orahcmu/ORAHCMU/trace/ORAHCMU_p012_9322.trc


*** 2010-08-23 14:43:00.472
WARNING: Oracle executable binary mismatch detected.
Binary of new process does not match binary which started instance
issue alter system set "_disable_image_check" = true to disable these messages
startup image information
iid info sz=245752512 inode=65458 ts=0x4c6df668
current process image information
iid info sz=245750720 inode=65427 ts=0x4c7204b0
set _disable_image_check = TRUE to disable this check
qksceLinearToCe error

*** 2010-08-23 14:43:42.974
*** SESSION ID:(220.111) 2010-08-23 14:43:42.974
*** CLIENT ID:(ujwal) 2010-08-23 14:43:42.974
*** SERVICE NAME:(ORAHCMU) 2010-08-23 14:43:42.974

DDE: Problem Key 'ORA 600 [15801]' was flood controlled (0x6) (incident: 19885)
ORA-00600: internal error code, arguments: [15801], [1], [], [], [], [], [], [], [], [], [], []
kxfxdss
KXFXSLAVESTATE dump [0, 0]
(pgakid: 0 oercnt: 0 oerrcd: -2224892588)
kxfxdss
no current cursor context.
kxfxdss
no cursors.

关于binary no match的问题已知是由于在实例启动情况下relink导致的;这个case提交了SR,metalink认为ORA-600 15801一般由QC与服务子进程通信问题引起:

The ORA-600 15801 is reporting a communication problem between QC and slaves related with messages sent/received.
Alert log reports several of the following error on the ASM instance:
ORA-600: internal error code, arguments: [15801], [1], [], [], [], [], [], 
[]

last wait was for 'eq: Msg Fragment' 

DIAGNOSTIC ANALYSIS:
--------------------
There were also several of the following message in the alert log:
WARNING: Oracle executable binary mismatch detected.
 Binary of new process does not match binary which started instance
issue alter system set "_disable_image_check" = true to disable these 
messages

So, I asked the customer to set the "_disable_image_check" = true 
This had no impact on the ora-600 errors as expected.

ORA-600 [15801] is signalled when a message overflow occurs between  PQ 
processes.

WORKAROUND:
-----------
none 
RELATED BUGS:
-------------
none
REPRODUCIBILITY:
----------------
intermittent but frequently - occurs at all different times of the day.
STACK TRACE:
------------
*** ID:(29.2904) 2006-07-05 15:50:57.972
qksceLinearToCe error
*** 15:50:58.233
ksedmp: internal or fatal error
ORA-600: internal error code, arguments: [15801], [1], [], [], [], [], [], 
[]
----- Call Stack Trace -----

kxfxGeter qks3tttdefReceive kxfxsui kxfxsp kxfxmai kxfprdp 

    SO: 0x67977018, type: 4, owner: 0x6793f208, flag: INIT/-/-/0x00
    (session) sid: 29 trans: (nil), creator: 0x6793f208, flag: (c0000041) 
USR/- BSY/-/-/-/-/-
              DID: 0000-0012-0000FADB, short-term DID: 0000-0000-00000000
              txn branch: (nil)
              oct: 3, prv: 0, sql: (nil), psql: (nil), user: 0/SYS
    O/S info: user: oracle, term: , ospid: 4558, machine: 
    last wait for 'eq: Msg Fragment' blocking sess=0x(nil) seq=2 
wait_time=4441 seconds since wait started=3
                ct path write=1002ffff, ct path write temp=2, Network=0
    Dumping Session Wait History
     for 'eq: Msg Fragment' count=1 wait_time=4441
                ct path write=1002ffff, ct path write temp=2, Network=0
     for 'eq: Msg Fragment' count=1 wait_time=31
                ct path write=1002ffff, ct path write temp=1, Network=0
    temporary object counter: 0

最后这个case通过设置10235和10501事件后错误不再产生了:

event = "10235 trace name context forever, level 2"  

10235, 00000, "check memory manager internal structures" 

event = "10501 trace name context forever, level 1"
  
10501, 00000, "periodically check selected heap"
// *Cause:
// *Action:
//    Level:  0x01 PGA
//            0x02 SGA
//            0x04 UGA
//            0x08 current call
//            0x10 user call
//            0x20 large allocation pool

Oracle内部错误:ORA-00600:[4097]一例

一套Linux上的10.2.0.4系统在异常恢复后(使用_allow_resetlogs_corruption隐藏参数打开后遭遇ORA-00600:[40xx]相关的内部错误,创建并切换到了新的撤销表空间上)出现ORA-00600: internal error code, arguments: [4097], [], [], [], [], [], [], []内部错误,当该非内部错误(non-fatal)出现100次以上时会在告警日志alert.log中出现记录。
并有可能导致实例crash,具体日志如下:

 

如果自己搞不定可以找诗檀软件专业ORACLE数据库修复团队成员帮您恢复!

 

诗檀软件专业数据库修复团队

 

服务热线 : 13764045638   QQ号:47079569    邮箱:service@parnassusdata.com

 

 

Errors in file /s01/10gdb/admin/clinica/bdump/clinica_smon_21463.trc:
ORA-00600: internal error code, arguments: [4097], [], [], [], [], [], [], []
Tue Jan  4 23:13:19 2011
Non-fatal internal error happenned while SMON was doing logging scn->time mapping.
SMON encountered 1 out of maximum 100 non-fatal internal errors.

clinica_smon_21463.trc:
Dump of buffer cache at level 4 for tsn=1, rdba=8388633
BH (0x91fdf428) file#: 2 rdba: 0x00800019 (2/25) class: 19 ba: 0x91c62000
  set: 3 blksize: 8192 bsi: 0 set-flg: 0 pwbcnt: 0
  dbwrid: 0 obj: -1 objn: 0 tsn: 1 afn: 2
  hash: [fcf7dd68,fcf7dd68] lru: [91fdf5b8,91fdf398]
  ckptq: [NULL] fileq: [NULL] objq: [f5b53d60,f5b53d60]
  use: [fa694970,fa694970] wait: [NULL]
  st: XCURRENT md: SHR tch: 0
  flags: gotten_in_current_mode
  LRBA: [0x0.0.0] HSCN: [0xffff.ffffffff] HSUB: [65535]
  buffer tsn: 1 rdba: 0x00800019 (2/25)
  scn: 0x0000.0352d07c seq: 0x01 flg: 0x00 tail: 0xd07c2601
  frmt: 0x02 chkval: 0x0000 type: 0x26=KTU SMU HEADER BLOCK

/* 这里dump了一个tsn=1,file#=2的数据块,
    可以看到它的类型是KTU SMU HEADER BLOCK即某个回滚段头
*/

Hex dump of block: st=0, typ_found=1
........................
ORA-00600: internal error code, arguments: [4097], [], [], [], [], [], [], []
Current SQL statement for this session:
insert into smon_scn_time (thread, time_mp, time_dp, scn, scn_wrp, scn_bas,  num_mappings, tim_scn_map) 
values (0, :1, :2, :3, :4, :5, :6, :7)
----- Call Stack Trace -----
calling              call     entry                argument values in hex
location             type     point                (? means dubious value)
-------------------- -------- -------------------- ----------------------------
ksedst()+31          call     ksedst1()            000000000 ? 000000001 ?
                                                   7FFFF53BC160 ? 7FFFF53BC1C0 ?
                                                   7FFFF53BC100 ? 000000000 ?
ksedmp()+610         call     ksedst()             000000000 ? 000000001 ?
                                                   7FFFF53BC160 ? 7FFFF53BC1C0 ?
                                                   7FFFF53BC100 ? 000000000 ?
ksfdmp()+21          call     ksedmp()             000000003 ? 000000001 ?
                                                   7FFFF53BC160 ? 7FFFF53BC1C0 ?
                                                   7FFFF53BC100 ? 000000000 ?
kgeriv()+176         call     ksfdmp()             000000003 ? 000000001 ?
                                                   7FFFF53BC160 ? 7FFFF53BC1C0 ?
                                                   7FFFF53BC100 ? 000000000 ?
kgesiv()+119         call     kgeriv()             0068C97C0 ? 2ABDF1D42BF0 ?
                                                   000000000 ? 0F4A33EA0 ?
                                                   7FFFF53BC100 ? 000000000 ?
ksesic0()+209        call     kgesiv()             0068C97C0 ? 2ABDF1D42BF0 ?
                                                   000001001 ? 000000000 ?
                                                   7FFFF53BCEE0 ? 000000000 ?
ktugti()+3200        call     ksesic0()            000001001 ? 0068C9940 ?
                                                   000000000 ? 00000009A ?
                                                   000000010 ? 101010101010101 ?
ktsftcmove()+4149    call     ktugti()             0B73F111C ? 7FFFF53BD278 ?
                                                   7FFFF53BD280 ? 000000000 ?
                                                   7FFFF53BD27C ? 7FFFF53BD270 ?
ktsf_gsp()+1937      call     ktsftcmove()         00000000A ? 000000000 ?
                                                   000000000 ? 000000000 ?
                                                   7FFFF53BD27C ? 7FFFF53BD270 ?
kdtgsp()+512         call     ktsf_gsp()           000000000 ? 7FFFF53BF460 ?
                                                   000000024 ? 000000002 ?
                                                   7FFFF53BF460 ? 000000000 ?
kdccak()+111         call     kdtgsp()             2ABDF1D6A2D8 ? 7FFF00000000 ?
                                                   2ABDF1D68530 ? 000000002 ?
                                                   7FFFF53BF460 ? 000000000 ?
kdcgcs()+5419        call     kdccak()             2ABDF1D6A2D8 ? 000000001 ?
                                                   0F4A3BBA8 ? 000000000 ?
                                                   2ABDF1D6A370 ? 000000000 ?
kdcgsp()+1372        call     kdcgcs()             2ABDF1D6A2D8 ? 000000001 ?
                                                   0F4A3BBA8 ? 000000000 ?
                                                   2ABDF1D6A370 ? 000000000 ?
kdtInsRow()+1808     call     kdcgsp()             2ABDF1D6A2D8 ? 000000001 ?
                                                   0F4A3BBA8 ? 000000000 ?
                                                   2ABDF1D6A370 ? 000000000 ?
insrow()+342         call     kdtInsRow()          2ABDF1D6A2D8 ? 000000001 ?
                                                   0F4A3BBA8 ? 000000000 ?
                                                   2ABDF1D6A370 ? 000000000 ?
insdrv()+594         call     insrow()             2ABDF1D6A2D8 ? 7FFFF53BFCC8 ?
                                                   000000000 ? 0F4A33DE0 ?
                                                   2ABDF1D6A370 ? 000000000 ?
inscovexe()+404      call     insdrv()             2ABDF1D6A2D8 ? 7FFFF53BFCC8 ?
                                                   000000000 ? 2ABDF1D6D908 ?
                                                   2ABDF1D6A370 ? 000000000 ?
insExecStmtExecIniE  call     inscovexe()          0F4A33DE0 ? 0F4A3C230 ?
ngine()+85                                         7FFFF53C0EF0 ? 2ABDF1D69F20 ?
                                                   2ABDF1D6A370 ? 000000000 ?
insexe()+386         call     insExecStmtExecIniE  0F4A33DE0 ? 0F4A3C230 ?
                              ngine()              2ABDF1D69F20 ? 2ABDF1D69F20 ?
                                                   2ABDF1D6A370 ? 000000000 ?
opiexe()+9182        call     insexe()             0F4A333A8 ? 7FFFF53C0EF0 ?
                                                   0F4A33DE0 ? 2ABDF1D69F20 ?
                                                   2ABDF1D6A370 ? 2ABDF1D69F20 ?
opiall0()+1842       call     opiexe()             000000049 ? 000000003 ?
                                                   7FFFF53C12F8 ? 000000001 ?
..............

针对该ORA-00600:[4097]内部错误,metalink上Note [ID 1030620.6]介绍了一种workaround的方法:

An ORA-600 [4097] can be encountered through various activities that use 
rollback segments.

Solution Description: 
===================== 

The most likely cause of this is BUG 427389.  This BUG is fixed in
version 7.3.3.3.  The BUG is caused when Rollback Segments are dropped and 
recreated after a shutdown abort.  It is encountered through a very specific 
set of circumstances: 

When an instance has a rollback segment offline and the instance crashes, or 
the user does a shutdown abort, the rollback segment wrap number does not get 
updated.  If that segment is then dropped and recreated immediately after the 
instance is restarted, the wrap number could be lower than existing wrap 
numbers.  This will cause the ORA-600[4097] to occur in subsequent 
transactions using Rollback. 

To avoid encountering this bug, rollback segments should only be dropped and 
recreated after the instance has been shutdown normal and restarted.  If you 
have already encountered the bug, use the following workaround:  

   Select segment_name, segment_id from dba_rollback_segs; 

   Drop all Rollback Segments except for SYSTEM.  

   Recreate dummy (small) rollback segments with the same names in their place. 

   Then, recreate additional rollback segments you want to keep with their 
   permanent storage parameters.   

   Now drop the dummy ones. This should ensure that the segment_ids are not 
   reused. 

If you ever want to add a rollback segment you have to use the workaround steps
again.  If you do not fill the dummy slots you may see the problem re-appear.

我们可以尝试drop异常恢复前已有的可能存在问题的rollback segment来规避这个问题,虽然在10g下使用AMU(automatic managed undo)但仍可以做到这一点:

SQL> alter system set "_smu_debug_mode"=4;
System altered.

/* 设置SMU debug模式为4以便能够手动管理回滚段 */

SQL> set heading off 

SQL> select 'drop rollback segment "'||segment_name||'";' from dba_rollback_segs where segment_name!='SYSTEM';

drop rollback segment "_SYSSMU1$";
drop rollback segment "_SYSSMU2$";
drop rollback segment "_SYSSMU3$";
drop rollback segment "_SYSSMU4$";
drop rollback segment "_SYSSMU5$";
drop rollback segment "_SYSSMU6$";
drop rollback segment "_SYSSMU7$";
drop rollback segment "_SYSSMU8$";
drop rollback segment "_SYSSMU9$";
drop rollback segment "_SYSSMU10$";
drop rollback segment "_SYSSMU11$";
drop rollback segment "_SYSSMU12$";
drop rollback segment "_SYSSMU13$";
drop rollback segment "_SYSSMU14$";
drop rollback segment "_SYSSMU15$";
drop rollback segment "_SYSSMU16$";
drop rollback segment "_SYSSMU17$";
drop rollback segment "_SYSSMU18$";
drop rollback segment "_SYSSMU19$";
drop rollback segment "_SYSSMU20$";
drop rollback segment "_SYSSMU21$";
drop rollback segment "_SYSSMU22$";
drop rollback segment "_SYSSMU23$";
drop rollback segment "_SYSSMU24$";
drop rollback segment "_SYSSMU25$";
drop rollback segment "_SYSSMU26$";
drop rollback segment "_SYSSMU27$";
drop rollback segment "_SYSSMU28$";
drop rollback segment "_SYSSMU29$";
drop rollback segment "_SYSSMU30$";

30 rows selected.

/* 依次执行以上的drop rollback segment回滚段的命令
    注意当前撤销表空间上的回滚段仅能offline而无法drop掉,
    实际上我们需要做的也仅仅是把之前undo表空间上有问题的回滚段drop掉
*/

SQL> alter rollback segment "_SYSSMU30$" offline;
Rollback segment altered.

SQL> drop rollback segment "_SYSSMU30$";
drop rollback segment "_SYSSMU30$"
*
ERROR at line 1:
ORA-30025: DROP segment '_SYSSMU30$' (in undo tablespace) not allowed

SQL> alter rollback segment "_SYSSMU30$" online;
Rollback segment altered.

经过以上drop问题回滚段rollback segment后,系统不再出现ORA-00600:[4097]内部错误,实例恢复正常。在系统正常后,我们有必要重置之前所设的”_smu_debug_mode”UNDO管理debug模式的隐藏参数。

Oracle内部错误:ORA-00600:[6033]一例

一套HP-UX上的9.2.0.8系统,某条查询语句执行时出现ORA-00600: internal error code, arguments: [6033], [], [], [], [], [], [], []内部错误,错误trace信息如下:

*** SESSION ID:(583.18281) 2010-12-20 22:49:01.364
*** 2010-12-20 22:49:01.364
ksedmp: internal or fatal error
ORA-00600: internal error code, arguments: [6033], [], [], [], [], [], [], []
Current SQL statement for this session:
SELECT INTERFACE_HEADER_ID, DOCUMENT_SUBTYPE, AGENT_ID, VENDOR_SITE_ID FROM PO_HEADERS_INTERFACE WHE
RE WF_GROUP_ID = :B1 ORDER BY INTERFACE_HEADER_ID
----- PL/SQL Call Stack -----
object line object
handle number name
c0000001067e3328 4332 package body APPS.PO_AUTOCREATE_DOC
c0000000fd267060 1 anonymous block
c000000108fe4d60 1979 package body APPS.WF_ENGINE_UTIL
c000000108fe4d60 1745 package body APPS.WF_ENGINE_UTIL
c000000108fe4d60 1099 package body APPS.WF_ENGINE_UTIL
c000000108fe4d60 560 package body APPS.WF_ENGINE_UTIL
c000000108fe4d60 1863 package body APPS.WF_ENGINE_UTIL
c000000108fe4d60 1099 package body APPS.WF_ENGINE_UTIL
c000000108fe4d60 560 package body APPS.WF_ENGINE_UTIL
c000000108fe4d60 1863 package body APPS.WF_ENGINE_UTIL
c000000108fe4d60 1099 package body APPS.WF_ENGINE_UTIL
c000000108fe4d60 560 package body APPS.WF_ENGINE_UTIL
c000000108fe4d60 1863 package body
PL/SQL call stack truncated after 1024 bytes.
----- Call Stack Trace -----
calling call entry argument values in hex
location type point (? means dubious value)
-------------------- -------- -------------------- ----------------------------
ksedmp()+184 ? ksedst() C0000000CEB36420 ?
400000000147994B ?

已知的ORA-00600:[6033]错误一般和索引逻辑讹误相关,metalink上有相关的Note建议在出现该错误后运行analyze table validate structure cascade语句以验证表与索引间的数据正确性。

ORA-600 [6033] "null value retrieved from index leaf lookup" [ID 45795.1]
Modified 03-JUN-2010 Type REFERENCE Status PUBLISHED
Note: For additional ORA-600 related information please read Note:146580.1

PURPOSE:
This article represents a partially published OERI note.
It has been published because the ORA-600 error has been
reported in at least one confirmed bug.
Therefore, the SUGGESTIONS section of this article may help
in terms of identifying the cause of the error.
This specific ORA-600 error may be considered for full publication
at a later date. If/when fully published, additional information
will be available here on the nature of this error.
SUGGESTIONS:
Run the ANALYZE command on any tables and indexes in the
trace file:
Example: ANALYZE TABLE
 VALIDATE STRUCTURE CASCADE;
Rebuild any corrupted indexes.
Index corruption.
Known Bugs

NB Bug Fixed Description
6401576 9.2.0.8.P22 OERI[ktbair1] / ORA-600 [6101] index corruption possible
5845232 9.2.0.8.P06 Block corruption / errors from concurrent dequeue operations
2718937 9.2.0.4, 10.1.0.2 OERI:6033 from SELECT on IOT with COMPRESSED PRIMARY KEY
1573283 8.1.7.2, 9.0.1.0 OERI:6033 from ALTER INDEX .. REBUILD ONLINE PARAMETERS ('OPTIMIZE FULL')
Certain index operations can lead to block corruption / memory corruption with varying symptoms such as ORA-600 [6033], ORA-600 [6101] , ORA-600 [ktbair1] , ORA-600 [kcbzpb_1], ORA-600 [4519] and ORA-600 [kcoapl_blkchk] if DB_BLOCK_CHECKING is enabled. Concurrent dequeue operations can lead to block corruption / memory corruption with varying symptoms such as ORA-600 [6033], ORA-600 [6101] and ORA-600 [kcoapl_blkchk] if DB_BLOCK_CHECKING is enabled. Note: This issue was previously fixed under bug 5559640 but that fix had a serious problem which could lead to SGA memory corruption. This fix supercedes the fix for bug 5559640. The problem with patch 5559640 is alerted in Note:414109.1 This fix is superceeded by the fix for bug 6401576.

通过analyze table validate structure cascade命令验证索引后若存在问题则会进一步产生相关的trace文件,一般这类索引逻辑讹误的问题可以通过drop-recreate索引来解决。

沪ICP备14014813号-2

沪公网安备 31010802001379号