Upgrade 11.2.0.1 GI/CRS to 11.2.0.2 in Linux

11.2.0.2 has been out for more than a year now and is considerably more stable than 11.2.0.1. When deploying new systems for customers we now generally recommend installing 11.2.0.2 directly (out of place) and applying the PSU recommended in <Oracle Recommended Patches — Oracle Database>.

For existing systems we recommend upgrading to 11.2.0.2 whenever the downtime window allows; of course, customers can also wait patiently for the release of 11.2.0.3.

The upgrade from 11.2.0.1 to 11.2.0.2 differs slightly from upgrades in 10g. For mission-critical databases you must rehearse the upgrade properly and take backups beforehand: upgrading Oracle database software has always been a complex and risky undertaking, and it should not be approached carelessly.

Upgrading a RAC database is also more involved than upgrading a single-instance database; the work breaks down into the following steps:

1. If you are running on Exadata Database Machine hardware, first check whether the Exadata Storage Software and InfiniBand Switch versions need to be upgraded; see <Database Machine and Exadata Storage Server 11g Release 2 (11.2) Supported Versions>

2. Complete the preparation for a rolling upgrade of Grid Infrastructure

3. Rolling-upgrade the Grid Infrastructure (GI) software

4. Complete the preparation for upgrading the RDBMS database software

5. Upgrade the RDBMS database software itself, including upgrading the data dictionary and recompiling invalid objects

Here we focus on the preparation and the actual steps for the rolling upgrade of the GI/CRS cluster software, because 11.2.0.2 is the first patchset for 11gR2 and also the first out-of-place patchset, so most people are not yet familiar with the new upgrade model.

 

Preparing to upgrade GI

 

1. Note that unexpected errors can occur during a rolling upgrade from 11.2.0.1 GI/CRS to 11.2.0.2; see <Pre-requsite for 11.2.0.1 to 11.2.0.2 ASM Rolling Upgrade>, quoted here in full:

Applies to:
Oracle Server - Enterprise Edition - Version: 11.2.0.1.0 to 11.2.0.2.0 - Release: 11.2 to 11.2
Oracle Server - Enterprise Edition - Version: 11.2.0.1 to 11.2.0.2   [Release: 11.2 to 11.2]
Information in this document applies to any platform.
Purpose
This note is to clarify the patch requirement when doing 11.2.0.1 to 11.2.0.2 rolling upgrade.
Scope and Application
Intended audience includes DBA, support engineers.
Pre-requsite for 11.2.0.1 to 11.2.0.2 ASM Rolling Upgrade

There has been some confusion as what patches need to be applied for 11.2.0.1 ASM rolling
upgrade to 11.2.0.2 to be successful. Documentation regarding this is not very clear
(at the time of writing) and a documentation bug has been filed and documentation will be updated in the future.

There are two bugs related to 11.2.0.1 ASM rolling upgrade to 11.2.0.2:

Unpublished bug 9413827: 11201 TO 11202 ASM ROLLING UPGRADE - OLD CRS STACK FAILS TO STOP

Unpublished bug 9706490: LNX64-11202-UD 11201 -> 11202, DG OFFLINE AFTER RESTART CRS STACK DURING UPGRADE

Some of the symptoms include error message when running rootupgrade.sh:

ORA-15154: cluster rolling upgrade incomplete (from bug: 9413827)

or

Diskgroup status is shown offline after the upgrade, crsd.log may have:

2010-05-12 03:45:49.029: [ AGFW][1506556224] Agfw Proxy Server sending the
last reply to PE for message:RESOURCE_START[ora.MYDG1.dg rwsdcvm44 1] ID 4098:1526
TextMessage[CRS-2674: Start of 'ora.MYDG1.dg' on 'rwsdcvm44' failed]
TextMessage[ora.MYDG1.dg rwsdcvm44 1]
ora.MYDG1.dg rwsdcvm44 1:

To overcome this issue, there are two actions you need to take:

a). apply proper patch.
b). change crsconfig_lib.pm

Applying Patch:

1). If $GI_HOME is on version 11.2.0.1.2 (i.e GI PSU2 is applied):

Action: You can apply Patch:9706490 for version 11.2.0.1.2.

Unpublished bug 9413827 is fixed in 11.2.0.1.2 GI PSU2. Patch:9706490 for version
11.2.0.1.2 is built on top of 11.2.0.1.2 GI PSU2 (i.e. includes the 11.2.0.1.2 GI PSU2,
hence includes the fix for 9413827). Applying Patch:9706490 includes both fixes.
opatch will recognize 9706490 is superset of 11.2.0.1.2 GI PSU2 (Patch: 9655006)
and rollback patch 9655006 before applying Patch: 9706490).

2). If $GI_HOME is on version 11.2.0.1.0 (i.e. no GI PSU applied).

Action: You can apply Patch:9706490 for version 11.2.0.1.2. This would make sure you have
applied 11.2.0.1.2 GI PSU2 plus both 9706490 and 9413827 (which is included in GI PSU2).

For platforms that do not have 11.2.0.1.2 GI PSU, then you can apply patch 9413827 on 11.2.0.1.0.

3). If $GI_HOME is on version 11.2.0.1.1 (GI PSU1) (this is rare since GI PSU1 was only
released for Linux platforms and was quite old).

Action: You can rollback GI PSU1 then apply Patch:9706490 on version 11.2.0.1.2
if your platform has 11.2.0.1.2 GI PSU. If your platform does not have 11.2.0.1.2 GI PSU,
then apply patch 9413827.

Modify crsconfig_lib.pm

After patch is applied, modify $11.2.0.2_GI_HOME/crs/install/crsconfig_lib.pm:

Before the change:
# grep for bugs 9655006 or 9413827
@cmdout = grep(/(9655006|9413827)/, @output);

After the change:
# grep for bugs 9655006 or 9413827 or 9706490
@cmdout = grep(/(9655006|9413827|9706490)/, @output);

This would prevent rootupgrade.sh from failing when it validates the pre-requsite patches.

Here we assume that the 11.2.0.1 GI in our environment has no PSU applied. To work around the "11201 TO 11202 ASM ROLLING UPGRADE - OLD CRS STACK FAILS TO STOP" bug and roll-upgrade GI successfully, the patch for bug 9413827 must be applied before the actual 11.2.0.2 patchset upgrade.

We also recommend using the latest opatch utility, to avoid the problem of the opatch shipped with 11.2.0.1 failing to recognize the relevant patches.
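To see which of the three cases in the note above applies to your environment, you can list the patches already registered in the existing GI home before downloading anything. A minimal check, assuming CRS_HOME is set in the grid user's environment as in the steps below (9655006 is the 11.2.0.1.2 GI PSU2 number quoted in the note):

su - grid
$CRS_HOME/OPatch/opatch lsinventory -oh $CRS_HOME | grep -E '9655006|9413827|9706490'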

So, to upgrade GI to 11.2.0.2 we first need to download three patches for the appropriate platform from MOS; they are:

1. 11.2.0.2.0 PATCH SET FOR ORACLE DATABASE SERVER (Patchset) (patch id 10098816). Note that the 11.2.0.2 patchset actually consists of as many as 7 zip files; on Linux x86-64, for example:

[Screenshot: Patch 10098816 11.2.0.2.0 PATCH SET FOR ORACLE DATABASE SERVER download page]

For this upgrade we only need to download zips 1-3: the first and second zips contain the out-of-place patchset for the RDBMS Database software, while the third zip contains the out-of-place patchset for the Grid Infrastructure/CRS software. Since this article only upgrades GI, only p10098816_112020_Linux-x86-64_3of7.zip will actually be used.

2. Patch 9413827: 11201 TO 11202 ASM ROLLING UPGRADE - OLD CRS STACK FAILS TO STOP (patch id 9413827)

3. Patch 6880880: OPatch 11.2 (patch id 6880880), the latest opatch utility
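Assuming the three zips are staged under /tmp on the node being patched first, as in the steps that follow, a quick check of the staging area looks like this (the file names are the Linux x86-64 ones; substitute your own platform's):

ls -lh /tmp/p10098816_112020_Linux-x86-64_3of7.zip \
       /tmp/p9413827_112010_Linux-x86-64.zip \
       /tmp/p6880880_112000_Linux-x86-64.zip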

2. Install the latest opatch on all nodes; this step does not require stopping any services:

Switch to the GI owner user, move the existing OPatch directory aside, and unpack the new OPatch into CRS_HOME

su - grid

[grid@vrh1 ~]$ mv $CRS_HOME/OPatch $CRS_HOME/OPatch_old
[grid@vrh1 ~]$ unzip /tmp/p6880880_112000_Linux-x86-64.zip -d $CRS_HOME

Verify the opatch version

[grid@vrh1 ~]$ $CRS_HOME/OPatch/opatch
Invoking OPatch 11.2.0.1.6

Oracle Interim Patch Installer version 11.2.0.1.6
Copyright (c) 2011, Oracle Corporation.  All rights reserved.
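Since the new OPatch is needed on every node, the same swap can be scripted for the remaining nodes. A sketch, assuming a second node vrh2, passwordless ssh between the grid users (already required for a RAC installation), and the zip staged under /tmp:

scp /tmp/p6880880_112000_Linux-x86-64.zip vrh2:/tmp/
ssh vrh2 "mv /g01/11.2.0/grid/OPatch /g01/11.2.0/grid/OPatch_old && unzip -q /tmp/p6880880_112000_Linux-x86-64.zip -d /g01/11.2.0/grid"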

3. Install the BUNDLE Patch for Base Bug 9413827 on all nodes in a rolling fashion:

1. Switch to the GI owner user and check which patches are already installed

su - grid 

opatch lsinventory -detail -oh $CRS_HOME

Invoking OPatch 11.2.0.1.6

Oracle Interim Patch Installer version 11.2.0.1.6
Copyright (c) 2011, Oracle Corporation.  All rights reserved.

Oracle Home       : /g01/11.2.0/grid
Central Inventory : /g01/oraInventory
   from           : /etc/oraInst.loc
OPatch version    : 11.2.0.1.6
OUI version       : 11.2.0.1.0
Log file location : /g01/11.2.0/grid/cfgtoollogs/opatch/opatch2011-09-04_19-08-33PM.log

Lsinventory Output file location :
/g01/11.2.0/grid/cfgtoollogs/opatch/lsinv/lsinventory2011-09-04_19-08-33PM.txt

--------------------------------------------------------------------------------
Installed Top-level Products (1): 

Oracle Grid Infrastructure                                           11.2.0.1.0
There are 1 products installed in this Oracle Home.
........................
###########################################################################

2. Unzip the p9413827_11201_$platform.zip patch downloaded earlier

 unzip p9413827_112010_Linux-x86-64.zip 

###########################################################################

3. Switch to the DB HOME owner and stop the RDBMS DB HOME-related resources on the local node:

su - oracle

Syntax:
 % [RDBMS_HOME]/bin/srvctl stop home -o [RDBMS_HOME] -s [status file location] -n [node_name]

srvctl stop home -o $ORACLE_HOME  -n vrh1 -s stop_db_res           

cat stop_db_res
db-vprod

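Before unlocking the GI home you can double-check that nothing from this RDBMS home is still running on the local node. A sketch, assuming srvctl status home is pointed at the same stop_db_res state file written by the stop command above:

srvctl status home -o $ORACLE_HOME -s stop_db_res -n vrh1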

###########################################################################

4. Switch to the root user and run the rootcrs.pl -unlock command

[root@vrh1 ~]# $CRS_HOME/crs/install/rootcrs.pl -unlock 

2011-09-04 20:46:53: Parsing the host name
2011-09-04 20:46:53: Checking for super user privileges
2011-09-04 20:46:53: User has super user privileges
Using configuration parameter file: /g01/11.2.0/grid/crs/install/crsconfig_params
CRS-2791: Starting shutdown of Oracle High Availability Services-managed resources on 'vrh1'
CRS-2673: Attempting to stop 'ora.crsd' on 'vrh1'
CRS-2790: Starting shutdown of Cluster Ready Services-managed resources on 'vrh1'
CRS-2673: Attempting to stop 'ora.LISTENER.lsnr' on 'vrh1'
CRS-2673: Attempting to stop 'ora.SYSTEMDG.dg' on 'vrh1'
CRS-2673: Attempting to stop 'ora.registry.acfs' on 'vrh1'
CRS-2673: Attempting to stop 'ora.DATA.dg' on 'vrh1'
CRS-2673: Attempting to stop 'ora.FRA.dg' on 'vrh1'
CRS-2677: Stop of 'ora.LISTENER.lsnr' on 'vrh1' succeeded
CRS-2673: Attempting to stop 'ora.vrh1.vip' on 'vrh1'
CRS-2677: Stop of 'ora.vrh1.vip' on 'vrh1' succeeded
CRS-2672: Attempting to start 'ora.vrh1.vip' on 'vrh2'
CRS-2677: Stop of 'ora.registry.acfs' on 'vrh1' succeeded
CRS-2676: Start of 'ora.vrh1.vip' on 'vrh2' succeeded
CRS-2677: Stop of 'ora.SYSTEMDG.dg' on 'vrh1' succeeded
CRS-2677: Stop of 'ora.FRA.dg' on 'vrh1' succeeded
CRS-2677: Stop of 'ora.DATA.dg' on 'vrh1' succeeded
CRS-2673: Attempting to stop 'ora.asm' on 'vrh1'
CRS-2677: Stop of 'ora.asm' on 'vrh1' succeeded
CRS-2673: Attempting to stop 'ora.ons' on 'vrh1'
CRS-2673: Attempting to stop 'ora.eons' on 'vrh1'
CRS-2677: Stop of 'ora.ons' on 'vrh1' succeeded
CRS-2673: Attempting to stop 'ora.net1.network' on 'vrh1'
CRS-2677: Stop of 'ora.net1.network' on 'vrh1' succeeded
CRS-2677: Stop of 'ora.eons' on 'vrh1' succeeded
CRS-2792: Shutdown of Cluster Ready Services-managed resources on 'vrh1' has completed
CRS-2677: Stop of 'ora.crsd' on 'vrh1' succeeded
CRS-2673: Attempting to stop 'ora.gpnpd' on 'vrh1'
CRS-2673: Attempting to stop 'ora.cssdmonitor' on 'vrh1'
CRS-2673: Attempting to stop 'ora.ctssd' on 'vrh1'
CRS-2673: Attempting to stop 'ora.evmd' on 'vrh1'
CRS-2673: Attempting to stop 'ora.asm' on 'vrh1'
CRS-2673: Attempting to stop 'ora.mdnsd' on 'vrh1'
CRS-2673: Attempting to stop 'ora.drivers.acfs' on 'vrh1'
CRS-2677: Stop of 'ora.cssdmonitor' on 'vrh1' succeeded
CRS-2677: Stop of 'ora.gpnpd' on 'vrh1' succeeded
CRS-2677: Stop of 'ora.evmd' on 'vrh1' succeeded
CRS-2677: Stop of 'ora.mdnsd' on 'vrh1' succeeded
CRS-2677: Stop of 'ora.ctssd' on 'vrh1' succeeded
CRS-2677: Stop of 'ora.drivers.acfs' on 'vrh1' succeeded
CRS-2677: Stop of 'ora.asm' on 'vrh1' succeeded
CRS-2673: Attempting to stop 'ora.cssd' on 'vrh1'
CRS-2677: Stop of 'ora.cssd' on 'vrh1' succeeded
CRS-2673: Attempting to stop 'ora.diskmon' on 'vrh1'
CRS-2673: Attempting to stop 'ora.gipcd' on 'vrh1'
CRS-2677: Stop of 'ora.gipcd' on 'vrh1' succeeded
CRS-2677: Stop of 'ora.diskmon' on 'vrh1' succeeded
CRS-2793: Shutdown of Oracle High Availability Services-managed resources on 'vrh1' has completed
CRS-4133: Oracle High Availability Services has been stopped.
Successfully unlock /g01/11.2.0/grid

###########################################################################

5. As the RDBMS HOME owner, run the prepatch.sh script from the patch directory

su - oracle

% custom/server/9413827/custom/scripts/prepatch.sh -dbhome [RDBMS_HOME]

[oracle@vrh1 tmp]$ 9413827/custom/server/9413827/custom/scripts/prepatch.sh -dbhome $ORACLE_HOME

9413827/custom/server/9413827/custom/scripts/prepatch.sh completed successfully.

###########################################################################

6. Apply the patch itself

Run the following commands as the GI/CRS owner user

 % opatch napply -local -oh [CRS_HOME] -id 9413827

su - grid

cd /tmp/9413827/

opatch napply -local -oh $CRS_HOME -id 9413827

Invoking OPatch 11.2.0.1.6

Oracle Interim Patch Installer version 11.2.0.1.6
Copyright (c) 2011, Oracle Corporation.  All rights reserved.

UTIL session

Oracle Home       : /g01/11.2.0/grid
Central Inventory : /g01/oraInventory
   from           : /etc/oraInst.loc
OPatch version    : 11.2.0.1.6
OUI version       : 11.2.0.1.0
Log file location : /g01/11.2.0/grid/cfgtoollogs/opatch/opatch2011-09-04_20-52-37PM.log

Verifying environment and performing prerequisite checks...
OPatch continues with these patches:   9413827  

Do you want to proceed? [y|n]
y
User Responded with: Y
All checks passed.
Provide your email address to be informed of security issues, install and
initiate Oracle Configuration Manager. Easier for you if you use your My
Oracle Support Email address/User Name.
Visit http://www.oracle.com/support/policies.html for details.
Email address/User Name: 

You have not provided an email address for notification of security issues.
Do you wish to remain uninformed of security issues ([Y]es, [N]o) [N]:  y

Please shutdown Oracle instances running out of this ORACLE_HOME on the local system.
(Oracle Home = '/g01/11.2.0/grid')

Is the local system ready for patching? [y|n]
y
User Responded with: Y
Backing up files...
Applying interim patch '9413827' to OH '/g01/11.2.0/grid'

Patching component oracle.crs, 11.2.0.1.0...
Patches 9413827 successfully applied.
Log file location: /g01/11.2.0/grid/cfgtoollogs/opatch/opatch2011-09-04_20-52-37PM.log

OPatch succeeded.

Run the following commands as the DB/RDBMS owner user

su - oracle
cd /tmp/9413827/

% opatch napply custom/server/ -local -oh [RDBMS_HOME] -id 9413827

opatch napply custom/server/ -local -oh $ORACLE_HOME -id 9413827

Verifying the update...
Inventory check OK: Patch ID 9413827 is registered in Oracle Home inventory with proper meta-data.
Files check OK: Files from Patch ID 9413827 are present in Oracle Home.
Running make for target install
Running make for target install

The local system has been patched and can be restarted.

UtilSession: N-Apply done.

OPatch succeeded.

###########################################################################

7. Configure the HOME directories

Run the following commands as root

 chmod +w $CRS_HOME/log/[nodename]/agent
 chmod +w $CRS_HOME/log/[nodename]/agent/crsd
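On the local node the [nodename] placeholder is simply the host's name as used under the GI log directory; a concrete sketch, assuming the directory follows the short hostname and using the GI home path from this example:

chmod +w /g01/11.2.0/grid/log/$(hostname -s)/agent
chmod +w /g01/11.2.0/grid/log/$(hostname -s)/agent/crsd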

Run the following commands as the DB/RDBMS owner user
su - oracle

 cd /tmp/9413827/

% custom/server/9413827/custom/scripts/postpatch.sh -dbhome [RDBMS_HOME]

[oracle@vrh1 9413827]$ custom/server/9413827/custom/scripts/postpatch.sh -dbhome $ORACLE_HOME
Reading /s01/orabase/product/11.2.0/dbhome_1/install/params.ora..
Reading /s01/orabase/product/11.2.0/dbhome_1/install/params.ora..
Parsing file /s01/orabase/product/11.2.0/dbhome_1/bin/racgwrap
Parsing file /s01/orabase/product/11.2.0/dbhome_1/bin/srvctl
Parsing file /s01/orabase/product/11.2.0/dbhome_1/bin/srvconfig
Parsing file /s01/orabase/product/11.2.0/dbhome_1/bin/cluvfy
Verifying file /s01/orabase/product/11.2.0/dbhome_1/bin/racgwrap
Verifying file /s01/orabase/product/11.2.0/dbhome_1/bin/srvctl
Verifying file /s01/orabase/product/11.2.0/dbhome_1/bin/srvconfig
Verifying file /s01/orabase/product/11.2.0/dbhome_1/bin/cluvfy
Reapplying file permissions on /s01/orabase/product/11.2.0/dbhome_1/bin/racgwrap
Reapplying file permissions on /s01/orabase/product/11.2.0/dbhome_1/bin/srvctl
Reapplying file permissions on /s01/orabase/product/11.2.0/dbhome_1/bin/srvconfig
Reapplying file permissions on /s01/orabase/product/11.2.0/dbhome_1/bin/cluvfy
Reapplying file permissions on /s01/orabase/product/11.2.0/dbhome_1/bin/racgmain
Reapplying file permissions on /s01/orabase/product/11.2.0/dbhome_1/bin/racgeut
Reapplying file permissions on /s01/orabase/product/11.2.0/dbhome_1/bin/diskmon.bin
Reapplying file permissions on /s01/orabase/product/11.2.0/dbhome_1/bin/lsnodes
Reapplying file permissions on /s01/orabase/product/11.2.0/dbhome_1/bin/osdbagrp
Reapplying file permissions on /s01/orabase/product/11.2.0/dbhome_1/bin/rawutl
Reapplying file permissions on /s01/orabase/product/11.2.0/dbhome_1/srvm/admin/ractrans
Reapplying file permissions on /s01/orabase/product/11.2.0/dbhome_1/srvm/admin/getcrshome
Reapplying file permissions on /s01/orabase/product/11.2.0/dbhome_1/bin/gnsd
Reapplying file permissions on /s01/orabase/product/11.2.0/dbhome_1/bin/crsdiag.pl
Reapplying file permissions on /s01/orabase/product/11.2.0/dbhome_1/lib/libhasgen11.so
Reapplying file permissions on /s01/orabase/product/11.2.0/dbhome_1/lib/libclsra11.so
Reapplying file permissions on /s01/orabase/product/11.2.0/dbhome_1/lib/libdbcfg11.so
Reapplying file permissions on /s01/orabase/product/11.2.0/dbhome_1/lib/libocr11.so
Reapplying file permissions on /s01/orabase/product/11.2.0/dbhome_1/lib/libocrb11.so
Reapplying file permissions on /s01/orabase/product/11.2.0/dbhome_1/lib/libocrutl11.so
Reapplying file permissions on /s01/orabase/product/11.2.0/dbhome_1/lib/libuini11.so
Reapplying file permissions on /s01/orabase/product/11.2.0/dbhome_1/lib/librdjni11.so
Reapplying file permissions on /s01/orabase/product/11.2.0/dbhome_1/lib/libgns11.so
Reapplying file permissions on /s01/orabase/product/11.2.0/dbhome_1/lib/libgnsjni11.so
Reapplying file permissions on /s01/orabase/product/11.2.0/dbhome_1/lib/libagfw11.so

###########################################################################

8. Restart the CRS stack as root

# $CRS_HOME/crs/install/rootcrs.pl -patch 

2011-09-04 21:03:32: Parsing the host name
2011-09-04 21:03:32: Checking for super user privileges
2011-09-04 21:03:32: User has super user privileges
Using configuration parameter file: /g01/11.2.0/grid/crs/install/crsconfig_params
CRS-4123: Oracle High Availability Services has been started.

# $ORACLE_HOME/bin/srvctl start home -o $ORACLE_HOME -s $STATUS_FILE -n nodename
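The start home command points at the same state file written by srvctl stop home in step 3 and is run as the RDBMS owner. A sketch for node vrh1, assuming stop_db_res sits in the oracle user's current directory (adjust the path if it was written elsewhere):

su - oracle
srvctl start home -o $ORACLE_HOME -s stop_db_res -n vrh1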

###########################################################################

9. Use opatch to confirm that the patch was applied successfully

 opatch lsinventory -detail -oh $CRS_HOME
 opatch lsinventory -detail -oh $RDBMS_HOME
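A quick way to confirm that both homes now carry the fix is to grep the inventory listing for the patch number (a sketch):

opatch lsinventory -oh $CRS_HOME   | grep 9413827
opatch lsinventory -oh $RDBMS_HOME | grep 9413827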

###########################################################################

10. Repeat the above steps on the remaining nodes until the patch has been applied successfully on every node

###########################################################################

Note that there are additional considerations on the AIX platform:

# Special Instruction for AIX
# ---------------------------
#
# During the application of this patch, should you see any errors with regard
# to files being locked or opatch being unable to copy files, then this
# could be the result of a process which requires termination or an additional
# file needing to be unloaded from the system cache.
#
# To try and identify the likely cause, please execute the following commands
# and provide the output to your support representative, who will be able to
# identify the corrective steps.
#
#
#     genld -l | grep [CRS_HOME]
#
#     genkld | grep [CRS_HOME]    ( full or partial path will do )
#
#
# Simple Case Resolution:
#
# If genld returns data then a currently executing process has something open
# in the [CRS_HOME] directory; please terminate the process as
# required/recommended.
#
#
#  If genkld returns data then please remove the entries from the
#  OS system cache by using the slibclean command as root;
#
#
#     slibclean
#
###########################################################################
#
#  Patch Deinstallation Instructions:
#  ----------------------------------
#
#  To roll back the patch, follow all of the above steps 1-5. In step 6,
#  invoke the following opatch commands to roll back the patch in all homes.
#
#  % opatch rollback -id 9413827 -local -oh [CRS_HOME]
#
#  % opatch rollback -id 9413827 -local -oh [RDBMS_HOME]
#
#  Afterwards, continue with steps 7-9 to complete the procedure.
#
###########################################################################
#
#  If you have any problems installing this PSE or are not sure
#  about inventory setup please call Oracle support.
#
###########################################################################

 

Performing the GI upgrade to 11.2.0.2

 

1. Unzip the software; as noted above, the third zip contains the grid software

unzip p10098816_112020_Linux-x86-64_3of7.zip

 

2. As the GI owner user, start the GI/CRS OUI installer and choose an out-of-place installation directory

(grid)$ unset ORACLE_HOME ORACLE_BASE ORACLE_SID
(grid)$ export DISPLAY=:0
(grid)$ cd /u01/app/oracle/patchdepot/grid
(grid)$ ./runInstaller
Starting Oracle Universal Installer…

On the "Select Installation Options" screen, choose Upgrade Oracle Grid Infrastructure or Oracle Automatic Storage Management

 

[Screenshot: upgrade_110202_GI]

[Screenshot: upgrade_110202_GI_a]

 

Choose a directory different from that of the existing GI software

 

[Screenshot: upgrade_110202_GI_b]

When the installation completes you will be prompted to run rootupgrade.sh as root

[Screenshot: upgrade_110202_GI_c]

3. Note that before rootupgrade.sh is actually run, the database service remains available on all nodes. While the rootupgrade.sh script is running, CRS on the local node is shut down briefly, which means that during the rolling upgrade at least one node is unavailable at any given time.

Because of unpublished bug 10011084 and unpublished bug 10128494, the crsconfig_lib.pm parameter file must be modified before running rootupgrade.sh, as follows:

cp $NEW_CRS_HOME/crs/install/crsconfig_lib.pm $NEW_CRS_HOME/crs/install/crsconfig_lib.pm.bak
vi $NEW_CRS_HOME/crs/install/crsconfig_lib.pm

Change the following lines in the above configuration file, and confirm the change with the diff command

From
 @cmdout = grep(/$bugid/, @output);
To
  @cmdout = grep(/(9655006|9413827)/, @output);

From
my @exp_func = qw(check_CRSConfig validate_olrconfig validateOCR
To
my @exp_func = qw(check_CRSConfig validate_olrconfig validateOCR read_file

$ diff crsconfig_lib.pm.orig crsconfig_lib.pm
699c699
< my @exp_func = qw(check_CRSConfig validate_olrconfig validateOCR
---
> my @exp_func = qw(check_CRSConfig validate_olrconfig validateOCR read_file
13277c13277
< @cmdout = grep(/$bugid/, @output);
---
> @cmdout = grep(/(9655006|9413827)/, @output);

cp /g01/11.2.0.2/grid/crs/install/crsconfig_lib.pm /g01/11.2.0.2/grid/crs/install/crsconfig_lib.pm.bak

Then copy the modified configuration file to all nodes
scp /g01/11.2.0.2/grid/crs/install/crsconfig_lib.pm vrh2:/g01/11.2.0.2/grid/crs/install/crsconfig_lib.pm
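A simple sanity check that both edits are present in the copied file on each node (a sketch; the second grep uses -F because the pattern contains regex metacharacters):

grep -n 'read_file' /g01/11.2.0.2/grid/crs/install/crsconfig_lib.pm | head -1
grep -nF '(9655006|9413827)' /g01/11.2.0.2/grid/crs/install/crsconfig_lib.pm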

If this seems like too much trouble, you can also download a ready-modified crsconfig_lib.pm from here

Because of bug 10056593 and bug 10241443, the following errors may also appear while running rootupgrade.sh

Due to bug 10056593, rootupgrade.sh will report this error and continue. This error is ignorable.

Failed to add (property/value):('OLD_OCR_ID/'-1') for checkpoint:ROOTCRS_OLDHOMEINFO.Error code is 256

Due to bug 10241443, rootupgrade.sh may report the following error when installing the cvuqdisk package.
This error is ignorable.

    ls: /usr/sbin/smartctl: No such file or directory
    /usr/sbin/smartctl not found.

The errors above can be safely ignored and do not affect the upgrade.

4. Run the rootupgrade.sh script; it is suggested to start with the more heavily loaded node

[root@vrh1 grid]# /g01/11.2.0.2/grid/rootupgrade.sh
Running Oracle 11g root script...

The following environment variables are set as:
ORACLE_OWNER= grid
ORACLE_HOME= /g01/11.2.0.2/grid

Enter the full pathname of the local bin directory: [/usr/local/bin]:
The contents of "dbhome" have not changed. No need to overwrite.
The contents of "oraenv" have not changed. No need to overwrite.
The contents of "coraenv" have not changed. No need to overwrite.

Entries will be added to the /etc/oratab file as needed by
Database Configuration Assistant when a database is created
Finished running generic part of root script.
Now product-specific root actions will be performed.
Using configuration parameter file: /g01/11.2.0.2/grid/crs/install/crsconfig_params
Creating trace directory
Failed to add (property/value):('OLD_OCR_ID/'-1') for checkpoint:ROOTCRS_OLDHOMEINFO.Error code is 256

ASM upgrade has started on first node.

CRS-2791: Starting shutdown of Oracle High Availability Services-managed resources on 'vrh1'
CRS-2673: Attempting to stop 'ora.crsd' on 'vrh1'
CRS-2790: Starting shutdown of Cluster Ready Services-managed resources on 'vrh1'
CRS-2673: Attempting to stop 'ora.LISTENER.lsnr' on 'vrh1'
CRS-2673: Attempting to stop 'ora.SYSTEMDG.dg' on 'vrh1'
CRS-2673: Attempting to stop 'ora.registry.acfs' on 'vrh1'
CRS-2677: Stop of 'ora.LISTENER.lsnr' on 'vrh1' succeeded
CRS-2673: Attempting to stop 'ora.vrh1.vip' on 'vrh1'
CRS-2677: Stop of 'ora.vrh1.vip' on 'vrh1' succeeded
CRS-2672: Attempting to start 'ora.vrh1.vip' on 'vrh2'
CRS-2677: Stop of 'ora.registry.acfs' on 'vrh1' succeeded
CRS-2676: Start of 'ora.vrh1.vip' on 'vrh2' succeeded
CRS-2677: Stop of 'ora.SYSTEMDG.dg' on 'vrh1' succeeded
CRS-2673: Attempting to stop 'ora.asm' on 'vrh1'
CRS-2677: Stop of 'ora.asm' on 'vrh1' succeeded
CRS-2673: Attempting to stop 'ora.ons' on 'vrh1'
CRS-2673: Attempting to stop 'ora.eons' on 'vrh1'
CRS-2677: Stop of 'ora.ons' on 'vrh1' succeeded
CRS-2673: Attempting to stop 'ora.net1.network' on 'vrh1'
CRS-2677: Stop of 'ora.net1.network' on 'vrh1' succeeded
CRS-2677: Stop of 'ora.eons' on 'vrh1' succeeded
CRS-2792: Shutdown of Cluster Ready Services-managed resources on 'vrh1' has completed
CRS-2677: Stop of 'ora.crsd' on 'vrh1' succeeded
CRS-2673: Attempting to stop 'ora.mdnsd' on 'vrh1'
CRS-2673: Attempting to stop 'ora.cssdmonitor' on 'vrh1'
CRS-2673: Attempting to stop 'ora.ctssd' on 'vrh1'
CRS-2673: Attempting to stop 'ora.evmd' on 'vrh1'
CRS-2673: Attempting to stop 'ora.asm' on 'vrh1'
CRS-2673: Attempting to stop 'ora.drivers.acfs' on 'vrh1'
CRS-2677: Stop of 'ora.cssdmonitor' on 'vrh1' succeeded
CRS-2677: Stop of 'ora.mdnsd' on 'vrh1' succeeded
CRS-2677: Stop of 'ora.evmd' on 'vrh1' succeeded
CRS-2677: Stop of 'ora.ctssd' on 'vrh1' succeeded
CRS-2677: Stop of 'ora.drivers.acfs' on 'vrh1' succeeded
CRS-2677: Stop of 'ora.asm' on 'vrh1' succeeded
CRS-2673: Attempting to stop 'ora.cssd' on 'vrh1'
CRS-2677: Stop of 'ora.cssd' on 'vrh1' succeeded
CRS-2673: Attempting to stop 'ora.gpnpd' on 'vrh1'
CRS-2673: Attempting to stop 'ora.diskmon' on 'vrh1'
CRS-2677: Stop of 'ora.diskmon' on 'vrh1' succeeded
CRS-2677: Stop of 'ora.gpnpd' on 'vrh1' succeeded
CRS-2673: Attempting to stop 'ora.gipcd' on 'vrh1'
CRS-2677: Stop of 'ora.gipcd' on 'vrh1' succeeded
CRS-2793: Shutdown of Oracle High Availability Services-managed resources on 'vrh1' has completed
CRS-4133: Oracle High Availability Services has been stopped.
Successfully deleted 1 keys from OCR.
Creating OCR keys for user 'root', privgrp 'root'..
Operation successful.
OLR initialization - successful
Adding daemon to inittab
ACFS-9200: Supported
ACFS-9300: ADVM/ACFS distribution files found.
ACFS-9312: Existing ADVM/ACFS installation detected.
ACFS-9314: Removing previous ADVM/ACFS installation.
ACFS-9315: Previous ADVM/ACFS components successfully removed.
ACFS-9307: Installing requested ADVM/ACFS software.
ACFS-9308: Loading installed ADVM/ACFS drivers.
ACFS-9321: Creating udev for ADVM/ACFS.
ACFS-9323: Creating module dependencies - this may take some time.
ACFS-9327: Verifying ADVM/ACFS devices.
ACFS-9309: ADVM/ACFS installation correctness verified.
clscfg: EXISTING configuration version 5 detected.
clscfg: version 5 is 11g Release 2.
Successfully accumulated necessary OCR keys.
Creating OCR keys for user 'root', privgrp 'root'..
Operation successful.
Preparing packages for installation...
cvuqdisk-1.0.9-1
Configure Oracle Grid Infrastructure for a Cluster ... succeeded

 

The node on which rootupgrade.sh is run last will show the following messages indicating that GI/CRS was upgraded successfully:

 

Successfully deleted 1 keys from OCR.
Creating OCR keys for user 'root', privgrp 'root'..
Operation successful.
OLR initialization - successful
Adding daemon to inittab
ACFS-9200: Supported
ACFS-9300: ADVM/ACFS distribution files found.
ACFS-9312: Existing ADVM/ACFS installation detected.
ACFS-9314: Removing previous ADVM/ACFS installation.
ACFS-9315: Previous ADVM/ACFS components successfully removed.
ACFS-9307: Installing requested ADVM/ACFS software.
ACFS-9308: Loading installed ADVM/ACFS drivers.
ACFS-9321: Creating udev for ADVM/ACFS.
ACFS-9323: Creating module dependencies - this may take some time.
ACFS-9327: Verifying ADVM/ACFS devices.
ACFS-9309: ADVM/ACFS installation correctness verified.
clscfg: EXISTING configuration version 5 detected.
clscfg: version 5 is 11g Release 2.
Successfully accumulated necessary OCR keys.
Creating OCR keys for user 'root', privgrp 'root'..
Operation successful.
Started to upgrade the Oracle Clusterware. This operation may take a few minutes.
Started to upgrade the CSS.
Started to upgrade the CRS.
The CRS was successfully upgraded.
Oracle Clusterware operating version was successfully set to 11.2.0.2.0

ASM upgrade has finished on last node.

Preparing packages for installation...
cvuqdisk-1.0.9-1
Configure Oracle Grid Infrastructure for a Cluster ... succeeded

5. Verify the GI/CRS version

su - grid

$ crsctl query crs activeversion
Oracle Clusterware active version on the cluster is [11.2.0.2.0]

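The per-node software version and the release version of the new home can be checked the same way (activeversion above is the cluster-wide operating version, softwareversion the binary version installed on each node):

crsctl query crs softwareversion
crsctl query crs releaseversion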

/g01/11.2.0.2/grid/OPatch/opatch lsinventory -oh /g01/11.2.0.2/grid
Invoking OPatch 11.2.0.1.1

Oracle Interim Patch Installer version 11.2.0.1.1
Copyright (c) 2009, Oracle Corporation.  All rights reserved.

Oracle Home       : /g01/11.2.0.2/grid
Central Inventory : /g01/oraInventory
   from           : /etc/oraInst.loc
OPatch version    : 11.2.0.1.1
OUI version       : 11.2.0.2.0
OUI location      : /g01/11.2.0.2/grid/oui
Log file location : /g01/11.2.0.2/grid/cfgtoollogs/opatch/opatch2011-09-05_02-17-19AM.log

Patch history file: /g01/11.2.0.2/grid/cfgtoollogs/opatch/opatch_history.txt

Lsinventory Output file location : /g01/11.2.0.2/grid/cfgtoollogs/opatch/lsinv/lsinventory2011-09-05_02-17-19AM.txt

--------------------------------------------------------------------------------
Installed Top-level Products (1): 

Oracle Grid Infrastructure                                           11.2.0.2.0
There are 1 products installed in this Oracle Home.

6. Update .bash_profile so that CRS_HOME, ORACLE_HOME, PATH and related variables point to the new GI home
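A minimal sketch of the relevant .bash_profile entries for the grid user, using the paths from this example (adjust to your own homes):

# grid user ~/.bash_profile: point everything at the new 11.2.0.2 GI home
export ORACLE_BASE=/g01/orabase
export CRS_HOME=/g01/11.2.0.2/grid
export ORACLE_HOME=$CRS_HOME
export PATH=$ORACLE_HOME/bin:$ORACLE_HOME/OPatch:$PATH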

Uninstall/Remove 11.2.0.2 Grid Infrastructure & Database in Linux

We may have installed 11gR2 Grid Infrastructure and a RAC Database on a platform for research or testing purposes. Because of the way GI is deployed, we cannot uninstall GI and the RAC Database software simply by deleting CRS_HOME and running a few scripts. Fortunately, 11gR2 provides a new deinstallation feature, Deinstall: running the deinstall script conveniently removes the various configuration files that Oracle software products leave on the system.

The detailed uninstallation steps are as follows:

1. Migrate any existing databases off the platform, or take physical/logical backups; if a database is no longer of any value, use DBCA to delete it and its associated services.

Log in as the oracle user, start DBCA, and choose RAC database:

[oracle@vrh2 ~]$ dbca

[Screenshot: deinstall_11gr2_rac_1]

On step 1 of 2: Operations, choose Delete a Database

[Screenshot: deinstall_11gr2_rac_2]

On step 2 of 2: List of cluster databases, select the database to be deleted

[Screenshot: deinstall_11gr2_rac_3]

Delete every database in the cluster environment, one by one
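If no GUI is available, DBCA can also drop a database in silent mode; a sketch, assuming 11.2 DBCA and substituting your own database unique name and SYSDBA credentials:

dbca -silent -deleteDatabase -sourceDB <db_unique_name> -sysDBAUserName sys -sysDBAPassword <password>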

2. Log in to any node as the oracle user and run the deinstall script located in the $ORACLE_HOME/deinstall directory


SQL> select * from v$version;

BANNER
--------------------------------------------------------------------------------
Oracle Database 11g Enterprise Edition Release 11.2.0.2.0 - 64bit Production
PL/SQL Release 11.2.0.2.0 - Production
CORE    11.2.0.2.0      Production
TNS for Linux: Version 11.2.0.2.0 - Production
NLSRTL Version 11.2.0.2.0 - Production

SQL> select * from global_name;

GLOBAL_NAME
--------------------------------------------------------------------------------
www.askmac.cn


[root@vrh2 ~]# su - oracle

[oracle@vrh2 ~]$ cd $ORACLE_HOME/deinstall

[oracle@vrh2 deinstall]$ ./deinstall

Checking for required files and bootstrapping ...
Please wait ...
Location of logs /g01/oraInventory/logs/

############ ORACLE DEINSTALL & DECONFIG TOOL START ############

######################### CHECK OPERATION START #########################
Install check configuration START

Checking for existence of the Oracle home location /s01/orabase/product/11.2.0/dbhome_1
Oracle Home type selected for de-install is: RACDB
Oracle Base selected for de-install is: /s01/orabase
Checking for existence of central inventory location /g01/oraInventory
Checking for existence of the Oracle Grid Infrastructure home /g01/11.2.0/grid
The following nodes are part of this cluster: vrh1,vrh2

Install check configuration END

Skipping Windows and .NET products configuration check

Checking Windows and .NET products configuration END

Network Configuration check config START

Network de-configuration trace file location:
/g01/oraInventory/logs/netdc_check2011-08-31_11-19-25-PM.log

Specify all RAC listeners (do not include SCAN listener) that are to be de-configured [CRS_LISTENER]:

Network Configuration check config END

Database Check Configuration START

Database de-configuration trace file location: /g01/oraInventory/logs/databasedc_check2011-08-31_11-19-39-PM.log

Use comma as separator when specifying list of values as input

Specify the list of database names that are configured in this Oracle home []:
Database Check Configuration END

Enterprise Manager Configuration Assistant START

EMCA de-configuration trace file location: /g01/oraInventory/logs/emcadc_check2011-08-31_11-19-46-PM.log 

Enterprise Manager Configuration Assistant END
Oracle Configuration Manager check START
OCM check log file location : /g01/oraInventory/logs//ocm_check131.log
Oracle Configuration Manager check END

######################### CHECK OPERATION END #########################

####################### CHECK OPERATION SUMMARY #######################
Oracle Grid Infrastructure Home is: /g01/11.2.0/grid
The cluster node(s) on which the Oracle home de-installation will be performed are:vrh1,vrh2
Oracle Home selected for de-install is: /s01/orabase/product/11.2.0/dbhome_1
Inventory Location where the Oracle home registered is: /g01/oraInventory
Skipping Windows and .NET products configuration check
Following RAC listener(s) will be de-configured: CRS_LISTENER
No Enterprise Manager configuration to be updated for any database(s)
No Enterprise Manager ASM targets to update
No Enterprise Manager listener targets to migrate
Checking the config status for CCR
vrh1 : Oracle Home exists with CCR directory, but CCR is not configured
vrh2 : Oracle Home exists with CCR directory, but CCR is not configured
CCR check is finished
Do you want to continue (y - yes, n - no)? [n]: y
A log of this session will be written to: '/g01/oraInventory/logs/deinstall_deconfig2011-08-31_11-19-23-PM.out'
Any error messages from this session will be written to: '/g01/oraInventory/logs/deinstall_deconfig2011-08-31_11-19-23-PM.err'

######################## CLEAN OPERATION START ########################

Enterprise Manager Configuration Assistant START

EMCA de-configuration trace file location: /g01/oraInventory/logs/emcadc_clean2011-08-31_11-19-46-PM.log 

Updating Enterprise Manager ASM targets (if any)
Updating Enterprise Manager listener targets (if any)
Enterprise Manager Configuration Assistant END
Database de-configuration trace file location: /g01/oraInventory/logs/databasedc_clean2011-08-31_11-20-00-PM.log

Network Configuration clean config START

Network de-configuration trace file location: /g01/oraInventory/logs/netdc_clean2011-08-31_11-20-00-PM.log

De-configuring RAC listener(s): CRS_LISTENER

De-configuring listener: CRS_LISTENER
    Stopping listener: CRS_LISTENER
    Listener stopped successfully.
    Unregistering listener: CRS_LISTENER
    Listener unregistered successfully.
Listener de-configured successfully.

De-configuring Listener configuration file on all nodes...
Listener configuration file de-configured successfully.

De-configuring Naming Methods configuration file on all nodes...
Naming Methods configuration file de-configured successfully.

De-configuring Local Net Service Names configuration file on all nodes...
Local Net Service Names configuration file de-configured successfully.

De-configuring Directory Usage configuration file on all nodes...
Directory Usage configuration file de-configured successfully.

De-configuring backup files on all nodes...
Backup files de-configured successfully.

The network configuration has been cleaned up successfully.

Network Configuration clean config END

Oracle Configuration Manager clean START
OCM clean log file location : /g01/oraInventory/logs//ocm_clean131.log
Oracle Configuration Manager clean END
Removing Windows and .NET products configuration END
Oracle Universal Installer clean START

Detach Oracle home '/s01/orabase/product/11.2.0/dbhome_1' from the central inventory on the local node : Done

Delete directory '/s01/orabase/product/11.2.0/dbhome_1' on the local node : Done

Delete directory '/s01/orabase' on the local node : Done

Detach Oracle home '/s01/orabase/product/11.2.0/dbhome_1' from the central inventory on the remote nodes 'vrh1' : Done

Delete directory '/s01/orabase/product/11.2.0/dbhome_1' on the remote nodes 'vrh1' : Done

Delete directory '/s01/orabase' on the remote nodes 'vrh1' : Done

Oracle Universal Installer cleanup was successful.

Oracle Universal Installer clean END

Oracle install clean START

Clean install operation removing temporary directory '/tmp/deinstall2011-08-31_11-19-18PM' on node 'vrh2'
Clean install operation removing temporary directory '/tmp/deinstall2011-08-31_11-19-18PM' on node 'vrh1'

Oracle install clean END

######################### CLEAN OPERATION END #########################

####################### CLEAN OPERATION SUMMARY #######################
Following RAC listener(s) were de-configured successfully: CRS_LISTENER
Cleaning the config for CCR
As CCR is not configured, so skipping the cleaning of CCR configuration
CCR clean is finished
Skipping Windows and .NET products configuration clean
Successfully detached Oracle home '/s01/orabase/product/11.2.0/dbhome_1' from the central inventory on the local node.
Successfully deleted directory '/s01/orabase/product/11.2.0/dbhome_1' on the local node.
Successfully deleted directory '/s01/orabase' on the local node.
Successfully detached Oracle home '/s01/orabase/product/11.2.0/dbhome_1' from the central inventory on the remote nodes 'vrh1'.
Successfully deleted directory '/s01/orabase/product/11.2.0/dbhome_1' on the remote nodes 'vrh1'.
Successfully deleted directory '/s01/orabase' on the remote nodes 'vrh1'.
Oracle Universal Installer cleanup was successful.

Oracle deinstall tool successfully cleaned up temporary directories.
#######################################################################

############# ORACLE DEINSTALL & DECONFIG TOOL END #############

The deinstall script above removes the RDBMS software under $ORACLE_HOME on all nodes and deregisters it from the central inventory; note that this operation is irreversible!

3. Log in as root and run "$ORA_CRS_HOME/crs/install/rootcrs.pl -verbose -deconfig -force" on every node except the last one; do not run this command on the last node. For example, if you have 2 nodes, you only need to run the command on one of them:

[root@vrh1 ~]# $ORA_CRS_HOME/crs/install/rootcrs.pl -verbose -deconfig -force

Using configuration parameter file: /g01/11.2.0/grid/crs/install/crsconfig_params
Network exists: 1/192.168.1.0/255.255.255.0/eth0, type static
VIP exists: /vrh1-vip/192.168.1.162/192.168.1.0/255.255.255.0/eth0, hosting node vrh1
VIP exists: /vrh2-vip/192.168.1.164/192.168.1.0/255.255.255.0/eth0, hosting node vrh2
VIP exists: /vrh3-vip/192.168.1.166/192.168.1.0/255.255.255.0/eth0, hosting node vrh3
GSD exists
ONS exists: Local port 6100, remote port 6200, EM port 2016
ACFS-9200: Supported
CRS-2673: Attempting to stop 'ora.registry.acfs' on 'vrh1'
CRS-2677: Stop of 'ora.registry.acfs' on 'vrh1' succeeded
CRS-2791: Starting shutdown of Oracle High Availability Services-managed resources on 'vrh1'
CRS-2673: Attempting to stop 'ora.crsd' on 'vrh1'
CRS-2790: Starting shutdown of Cluster Ready Services-managed resources on 'vrh1'
CRS-2673: Attempting to stop 'ora.oc4j' on 'vrh1'
CRS-2673: Attempting to stop 'ora.DATA.dg' on 'vrh1'
CRS-2673: Attempting to stop 'ora.FRA.dg' on 'vrh1'
CRS-2673: Attempting to stop 'ora.SYSTEMDG.dg' on 'vrh1'
CRS-2677: Stop of 'ora.oc4j' on 'vrh1' succeeded
CRS-2672: Attempting to start 'ora.oc4j' on 'vrh2'
CRS-2676: Start of 'ora.oc4j' on 'vrh2' succeeded
CRS-2677: Stop of 'ora.DATA.dg' on 'vrh1' succeeded
CRS-2677: Stop of 'ora.SYSTEMDG.dg' on 'vrh1' succeeded
CRS-2677: Stop of 'ora.FRA.dg' on 'vrh1' succeeded
CRS-2673: Attempting to stop 'ora.asm' on 'vrh1'
CRS-2677: Stop of 'ora.asm' on 'vrh1' succeeded
CRS-2792: Shutdown of Cluster Ready Services-managed resources on 'vrh1' has completed
CRS-2677: Stop of 'ora.crsd' on 'vrh1' succeeded
CRS-2673: Attempting to stop 'ora.ctssd' on 'vrh1'
CRS-2673: Attempting to stop 'ora.evmd' on 'vrh1'
CRS-2673: Attempting to stop 'ora.asm' on 'vrh1'
CRS-2673: Attempting to stop 'ora.mdnsd' on 'vrh1'
CRS-2673: Attempting to stop 'ora.drivers.acfs' on 'vrh1'
CRS-2677: Stop of 'ora.asm' on 'vrh1' succeeded
CRS-2673: Attempting to stop 'ora.cluster_interconnect.haip' on 'vrh1'
CRS-2677: Stop of 'ora.drivers.acfs' on 'vrh1' succeeded
CRS-2677: Stop of 'ora.evmd' on 'vrh1' succeeded
CRS-2677: Stop of 'ora.cluster_interconnect.haip' on 'vrh1' succeeded
CRS-2677: Stop of 'ora.mdnsd' on 'vrh1' succeeded
CRS-2677: Stop of 'ora.ctssd' on 'vrh1' succeeded
CRS-2673: Attempting to stop 'ora.cssd' on 'vrh1'
CRS-2677: Stop of 'ora.cssd' on 'vrh1' succeeded
CRS-2673: Attempting to stop 'ora.crf' on 'vrh1'
CRS-2673: Attempting to stop 'ora.diskmon' on 'vrh1'
CRS-2677: Stop of 'ora.diskmon' on 'vrh1' succeeded
CRS-2677: Stop of 'ora.crf' on 'vrh1' succeeded
CRS-2673: Attempting to stop 'ora.gipcd' on 'vrh1'
CRS-2677: Stop of 'ora.gipcd' on 'vrh1' succeeded
CRS-2673: Attempting to stop 'ora.gpnpd' on 'vrh1'
CRS-2677: Stop of 'ora.gpnpd' on 'vrh1' succeeded
CRS-2793: Shutdown of Oracle High Availability Services-managed resources on 'vrh1' has completed
CRS-4133: Oracle High Availability Services has been stopped.
Successfully deconfigured Oracle clusterware stack on this node

4. On the last node, run "$ORA_CRS_HOME/crs/install/rootcrs.pl -verbose -deconfig -force -lastnode" as root; this command clears the OCR and the voting disks:

[root@vrh2 ~]# $ORA_CRS_HOME/crs/install/rootcrs.pl -verbose -deconfig -force -lastnode

Using configuration parameter file: /g01/11.2.0/grid/crs/install/crsconfig_params
CRS resources for listeners are still configured
Network exists: 1/192.168.1.0/255.255.255.0/eth0, type static
VIP exists: /vrh2-vip/192.168.1.164/192.168.1.0/255.255.255.0/eth0, hosting node vrh2
VIP exists: /vrh3-vip/192.168.1.166/192.168.1.0/255.255.255.0/eth0, hosting node vrh3
GSD exists
ONS exists: Local port 6100, remote port 6200, EM port 2016
ACFS-9200: Supported
CRS-2673: Attempting to stop 'ora.registry.acfs' on 'vrh2'
CRS-2677: Stop of 'ora.registry.acfs' on 'vrh2' succeeded
CRS-2673: Attempting to stop 'ora.crsd' on 'vrh2'
CRS-2790: Starting shutdown of Cluster Ready Services-managed resources on 'vrh2'
CRS-2673: Attempting to stop 'ora.DATA.dg' on 'vrh2'
CRS-2673: Attempting to stop 'ora.FRA.dg' on 'vrh2'
CRS-2673: Attempting to stop 'ora.SYSTEMDG.dg' on 'vrh2'
CRS-2673: Attempting to stop 'ora.oc4j' on 'vrh2'
CRS-2677: Stop of 'ora.oc4j' on 'vrh2' succeeded
CRS-2677: Stop of 'ora.DATA.dg' on 'vrh2' succeeded
CRS-2677: Stop of 'ora.SYSTEMDG.dg' on 'vrh2' succeeded
CRS-2677: Stop of 'ora.FRA.dg' on 'vrh2' succeeded
CRS-2673: Attempting to stop 'ora.asm' on 'vrh2'
CRS-2677: Stop of 'ora.asm' on 'vrh2' succeeded
CRS-2792: Shutdown of Cluster Ready Services-managed resources on 'vrh2' has completed
CRS-2677: Stop of 'ora.crsd' on 'vrh2' succeeded
CRS-2673: Attempting to stop 'ora.ctssd' on 'vrh2'
CRS-2673: Attempting to stop 'ora.evmd' on 'vrh2'
CRS-2673: Attempting to stop 'ora.asm' on 'vrh2'
CRS-2677: Stop of 'ora.asm' on 'vrh2' succeeded
CRS-2673: Attempting to stop 'ora.cluster_interconnect.haip' on 'vrh2'
CRS-2677: Stop of 'ora.evmd' on 'vrh2' succeeded
CRS-2677: Stop of 'ora.cluster_interconnect.haip' on 'vrh2' succeeded
CRS-2677: Stop of 'ora.ctssd' on 'vrh2' succeeded
CRS-2673: Attempting to stop 'ora.cssd' on 'vrh2'
CRS-2677: Stop of 'ora.cssd' on 'vrh2' succeeded
CRS-2673: Attempting to stop 'ora.diskmon' on 'vrh2'
CRS-2677: Stop of 'ora.diskmon' on 'vrh2' succeeded
CRS-2672: Attempting to start 'ora.cssdmonitor' on 'vrh2'
CRS-2676: Start of 'ora.cssdmonitor' on 'vrh2' succeeded
CRS-2672: Attempting to start 'ora.cssd' on 'vrh2'
CRS-2672: Attempting to start 'ora.diskmon' on 'vrh2'
CRS-2676: Start of 'ora.diskmon' on 'vrh2' succeeded
CRS-2676: Start of 'ora.cssd' on 'vrh2' succeeded
CRS-4611: Successful deletion of voting disk +SYSTEMDG.
ASM de-configuration trace file location: /tmp/asmcadc_clean2011-08-31_11-55-52-PM.log
ASM Clean Configuration START
ASM Clean Configuration END

ASM with SID +ASM1 deleted successfully. Check /tmp/asmcadc_clean2011-08-31_11-55-52-PM.log for details.

CRS-2791: Starting shutdown of Oracle High Availability Services-managed resources on 'vrh2'
CRS-2673: Attempting to stop 'ora.mdnsd' on 'vrh2'
CRS-2673: Attempting to stop 'ora.ctssd' on 'vrh2'
CRS-2673: Attempting to stop 'ora.asm' on 'vrh2'
CRS-2677: Stop of 'ora.asm' on 'vrh2' succeeded
CRS-2677: Stop of 'ora.mdnsd' on 'vrh2' succeeded
CRS-2677: Stop of 'ora.ctssd' on 'vrh2' succeeded
CRS-2673: Attempting to stop 'ora.cssd' on 'vrh2'
CRS-2677: Stop of 'ora.cssd' on 'vrh2' succeeded
CRS-2673: Attempting to stop 'ora.gipcd' on 'vrh2'
CRS-2673: Attempting to stop 'ora.diskmon' on 'vrh2'
CRS-2677: Stop of 'ora.gipcd' on 'vrh2' succeeded
CRS-2673: Attempting to stop 'ora.gpnpd' on 'vrh2'
CRS-2677: Stop of 'ora.diskmon' on 'vrh2' succeeded
CRS-2677: Stop of 'ora.gpnpd' on 'vrh2' succeeded
CRS-2793: Shutdown of Oracle High Availability Services-managed resources on 'vrh2' has completed
CRS-4133: Oracle High Availability Services has been stopped.
Successfully deconfigured Oracle clusterware stack on this node

5. On any node, run the "$ORA_CRS_HOME/deinstall/deinstall" script as the Grid Infrastructure owner:

[root@vrh1 ~]# su - grid
[grid@vrh1 ~]$ cd $ORA_CRS_HOME
[grid@vrh1 grid]$ cd deinstall/

[grid@vrh1 deinstall]$ cat deinstall
#!/bin/sh
#
# $Header: install/utl/scripts/db/deinstall /main/3 2010/05/28 20:12:57 ssampath Exp $
#
# Copyright (c) 2005, 2010, Oracle and/or its affiliates. All rights reserved.
#
#    NAME
#      deinstall - wrapper script that calls deinstall tool.
#
#    DESCRIPTION
#      This script will set all the necessary variables and call the tools
#      entry point.
#
#    NOTES
#
#
#    MODIFIED   (MM/DD/YY)
#    mwidjaja    04/29/10 - XbranchMerge mwidjaja_bug-9579184 from
#                           st_install_11.2.0.1.0
#    mwidjaja    04/15/10 - Added SHLIB_PATH for HP-PARISC
#    mwidjaja    01/14/10 - XbranchMerge mwidjaja_bug-9269768 from
#                           st_install_11.2.0.1.0
#    mwidjaja    01/14/10 - Fix help message for params
#    ssampath    12/24/09 - Fix for bug 9227535. Remove legacy version_check
#                           function
#    ssampath    12/01/09 - XbranchMerge ssampath_bug-9167533 from
#                           st_install_11.2.0.1.0
#    ssampath    11/30/09 - Set umask to 022.
#    prsubram    10/12/09 - XbranchMerge prsubram_bug-9005648 from main
#    prsubram    10/08/09 - Compute ARCHITECTURE_FLAG in the script
#    prsubram    09/15/09 - Setting LIBPATH for AIX
#    prsubram    09/10/09 - Add AIX specific code check java version
#    prsubram    09/10/09 - Change TOOL_DIR to BOOTSTRAP_DIR in java cmd
#                           invocation of bug#8874160
#    prsubram    09/08/09 - Change the default shell to /usr/xpg4/bin/sh on
#                           SunOS
#    prsubram    09/03/09 - Removing -d64 for client32 homes for the bug8859294
#    prsubram    06/22/09 - Resolve port specific id cmd issue
#    ssampath    06/02/09 - Fix for bug 8566942
#    ssampath    05/19/09 - Move removal of /tmp/deinstall to java
#                           code.
#    prsubram    04/30/09 - Fix for the bug#8474891
#    mwidjaja    04/29/09 - Added user check between the user running the
#                           script and inventory owner
#    ssampath    04/29/09 - Changes to make error message better when deinstall
#                           tool is invoked from inside ORACLE_HOME and -home
#                           is passed.
#    ssampath    04/15/09 - Fix for bug 8414555
#    prsubram    04/09/09 - LD_LIBRARY_PATH is ported for sol,hp-ux & aix
#    mwidjaja    03/26/09 - Disallow -home for running from OH
#    ssampath    03/24/09 - Fix for bug 8339519
#    wyou        02/25/09 - restructure the ohome check
#    wyou        02/25/09 - change the error msg for directory existance check
#    wyou        02/12/09 - add directory existance check
#    wyou        02/09/09 - add the check for the writablity for the oracle
#                           home passed-in
#    ssampath    01/21/09 - Add oui/lib to LD_LIBRARY_PATH
#    poosrini    01/07/09 - LOG related changes
#    ssampath    11/24/08 - Create /main/osds/unix branch
#    dchriste    10/30/08 - eliminate non-generic tools like 'cut'
#    ssampath    08/18/08 - Pickup srvm.jar from JLIB directory.
#    ssampath    07/30/08 - Add http_client.jar and OraCheckpoint.jar to
#                           CLASSPATH
#    ssampath    07/08/08 - assistantsCommon.jar and netca.jar location has
#                           changed.
#    ssampath    04/11/08 - If invoking the tool from installed home, JRE_HOME
#                           should be set to $OH/jdk/jre.
#    ssampath    04/09/08 - Add logic to instantiate ORA_CRS_HOME, JAVA_HOME
#                           etc.,
#    ssampath    04/03/08 - Pick up ldapjclnt11.jar
#    idai        04/03/08 - remove assistantsdc.jar and netcadc.jar
#    bktripat    02/23/07 -
#    khsingh     07/18/06 - add osdbagrp fix
#    khsingh     07/07/06 - fix regression
#    khsingh     06/20/06 - fix bug 5228203
#    bktripat    06/12/06 - Fix for bug 5246802
#    bktripat    05/08/06 -
#    khsingh     05/08/06 - fix tool to run from any parent directory
#    khsingh     05/08/06 - fix LD_LIBRARY_PATH to have abs. path
#    ssampath    05/01/06 - Fix for bug 5198219
#    bktripat    04/21/06 - Fix for bug 5074246
#    khsingh     04/11/06 - fix bug 5151658
#    khsingh     04/08/06 - Add WA for bugs 5006414 & 5093832
#    bktripat    02/08/06 - Fix for bug 5024086 & 5024061
#    bktripat    01/24/06 -
#    mstalin     01/23/06 - Add lib to pick libOsUtils.so
#    bktripat    01/19/06 - adding library changes
#    rahgupta    01/19/06 -
#    bktripat    01/19/06 -
#    mstalin     01/17/06 - Modify the assistants deconfig jar file name
#    rahgupta    01/17/06 - updating emcp classpath
#    khsingh     01/17/06 - export ORACLE_HOME
#    khsingh     01/17/06 - fix for CRS deconfig.
#    hying       01/17/06 - netcadc.jar
#    bktripat    01/16/06 -
#    ssampath    01/16/06 -
#    bktripat    01/11/06 -
#    clo         01/10/06 - add EMCP entries
#    hying       01/10/06 - netcaDeconfig.jar
#    mstalin     01/09/06 - Add OraPrereqChecks.jar
#    mstalin     01/09/06 -
#    khsingh     01/09/06 -
#    mstalin     01/09/06 - Add additional jars for assistants
#    ssampath    01/09/06 - removing parseOracleHome temporarily
#    ssampath    01/09/06 -
#    khsingh     01/08/06 - fix for CRS deconfig
#    ssampath    12/08/05 - added java version check
#    ssampath    12/08/05 - initial run,minor bugs fixed
#    ssampath    12/07/05 - Creation
#

#MACROS

if [ -z "$UNAME" ]; then UNAME="/bin/uname"; fi
if [ -z "$ECHO" ]; then ECHO="/bin/echo"; fi
if [ -z "$AWK" ]; then AWK="/bin/awk"; fi
if [ -z "$ID" ]; then ID="/usr/bin/id"; fi
if [ -z "$DIRNAME" ]; then DIRNAME="/usr/bin/dirname"; fi
if [ -z "$FILE" ]; then FILE="/usr/bin/file"; fi

if [ "`$UNAME`" = "SunOS" ]
then
    if [ -z "${_xpg4ShAvbl_deconfig}" ]
    then
        _xpg4ShAvbl_deconfig=1
        export _xpg4ShAvbl_deconfig
        /usr/xpg4/bin/sh $0 "$@"
        exit $?
    fi
        AWK="/usr/xpg4/bin/awk"
fi 

# Set umask to 022 always.
umask 022

INSTALLED_VERSION_FLAG=true
ARCHITECTURE_FLAG=64

TOOL_ARGS=$* # initialize this always.

# Since the OTN and the installed version of the tool is same, only way to
# differentiate is through the instantated variable ORA_CRS_HOME.  If it is
# NOT instantiated, then the tool is a downloaded version.
# Set HOME_VER to true based on the value of $INSTALLED_VERSION_FLAG
if [ x"$INSTALLED_VERSION_FLAG" = x"true" ]
then
   ORACLE_HOME=/g01/11.2.0/grid
   HOME_VER=1     # HOME_VER
   TOOL_ARGS="$ORACLE_HOME $TOOL_ARGS"
else
   HOME_VER=0
fi

# Save current working directory
CURR_DIR=`pwd`

# If CURR_DIR is different from TOOL_DIR get that location and cd into it.
TOOL_REL_PATH=`$DIRNAME $0`
cd $TOOL_REL_PATH

DOT=`$ECHO $TOOL_REL_PATH | $AWK -F'/' '{ print $1}'`

if [ "$DOT" = "." ];
then
  TOOL_DIR=$CURR_DIR/$TOOL_REL_PATH
elif [ `expr "$DOT" : '.*'` -gt 0 ];
then
  TOOL_DIR=$CURR_DIR/$TOOL_REL_PATH
else
  TOOL_DIR=$TOOL_REL_PATH
fi

# Check if this script is run as root.  If so, then error out.
# This is fix for bug 5024086.

RUID=`$ID|$AWK -F\( '{print $2}'|$AWK -F\) '{print $1}'`
if [ ${RUID} = "root" ];then
 $ECHO "You must not be logged in as root to run $0."
 $ECHO "Log in as Oracle user and rerun $0."
 exit $ROOT_USER
fi

# DEFINE FUNCTIONS BELOW
computeArchFlag() {
   TOOL_HOME=$1
   case `$UNAME` in
      HP-UX)
         if [ "`/usr/bin/file $TOOL_HOME/bin/kfod | $AWK -F\: '{print $2}' | $AWK -F\- '{print $2}' | $AWK '{print $1}'`" = "64" ];then
            ARCHITECTURE_FLAG="-d64"
         fi
      ;;
      AIX)
         if [ "`/usr/bin/file $TOOL_HOME/bin/kfod | $AWK -F\: '{print $2}' | $AWK '{print $1}' | $AWK -F\- '{print $1}'`" = "64" ];then
            ARCHITECTURE_FLAG="-d64"
         fi
      ;;
      *)
         if [ "`/usr/bin/file $TOOL_HOME/bin/kfod | $AWK -F\: '{print $2}' | $AWK '{print $2}' | $AWK -F\- '{print $1}'`" = "64" ];then
            ARCHITECTURE_FLAG="-d64"
         fi
      ;;
   esac
}

if [ $HOME_VER = 1 ];
then
   $ECHO "Checking for required files and bootstrapping ..."
   $ECHO "Please wait ..."
   TEMP_LOC=`$ORACLE_HOME/perl/bin/perl $ORACLE_HOME/deinstall/bootstrap.pl $HOME_VER $TOOL_ARGS`
   TOOL_DIR=$TEMP_LOC
else
   TEMP_LOC=`$TOOL_DIR/perl/bin/perl $TOOL_DIR/bootstrap.pl $HOME_VER $TOOL_ARGS`
fi

computeArchFlag $TOOL_DIR

$TOOL_DIR/perl/bin/perl $TOOL_DIR/deinstall.pl $HOME_VER $TEMP_LOC $TOOL_DIR $ARCHITECTURE_FLAG $TOOL_ARGS

[grid@vrh1 deinstall]$ ./deinstall

Checking for required files and bootstrapping ...
Please wait ...
Location of logs /tmp/deinstall2011-08-31_11-59-55PM/logs/

############ ORACLE DEINSTALL & DECONFIG TOOL START ############

######################### CHECK OPERATION START #########################
Install check configuration START

Checking for existence of the Oracle home location /g01/11.2.0/grid
Oracle Home type selected for de-install is: CRS
Oracle Base selected for de-install is: /g01/orabase
Checking for existence of central inventory location /g01/oraInventory
Checking for existence of the Oracle Grid Infrastructure home /g01/11.2.0/grid
The following nodes are part of this cluster: vrh1,vrh2,vrh3

Install check configuration END

Skipping Windows and .NET products configuration check

Checking Windows and .NET products configuration END

Traces log file: /tmp/deinstall2011-08-31_11-59-55PM/logs//crsdc.log
Enter an address or the name of the virtual IP used on node "vrh1"[vrh1-vip]
 > 

The following information can be collected by running "/sbin/ifconfig -a" on node "vrh1"
Enter the IP netmask of Virtual IP "192.168.1.162" on node "vrh1"[255.255.255.0]
 > 

Enter the network interface name on which the virtual IP address "192.168.1.162" is active
 > 

Enter an address or the name of the virtual IP used on node "vrh2"[vrh2-vip]
 > 

The following information can be collected by running "/sbin/ifconfig -a" on node "vrh2"
Enter the IP netmask of Virtual IP "192.168.1.164" on node "vrh2"[255.255.255.0]
 > 

Enter the network interface name on which the virtual IP address "192.168.1.164" is active
 > 

Enter an address or the name of the virtual IP used on node "vrh3"[vrh3-vip]
 > 

The following information can be collected by running "/sbin/ifconfig -a" on node "vrh3"
Enter the IP netmask of Virtual IP "192.168.1.166" on node "vrh3"[255.255.255.0]
 > 

Enter the network interface name on which the virtual IP address "192.168.1.166" is active
 > 

Enter an address or the name of the virtual IP[]
 > 

Network Configuration check config START

Network de-configuration trace file location: /tmp/deinstall2011-08-31_11-59-55PM/logs/
netdc_check2011-09-01_12-01-50-AM.log

Specify all RAC listeners (do not include SCAN listener) that are to be de-configured [LISTENER,LISTENER_SCAN1]:

Network Configuration check config END

Asm Check Configuration START

ASM de-configuration trace file location: /tmp/deinstall2011-08-31_11-59-55PM/logs/
asmcadc_check2011-09-01_12-01-51-AM.log

ASM configuration was not detected in this Oracle home. Was ASM configured in this Oracle home (y|n) [n]:
ASM was not detected in the Oracle Home

######################### CHECK OPERATION END #########################

####################### CHECK OPERATION SUMMARY #######################
Oracle Grid Infrastructure Home is: /g01/11.2.0/grid
The cluster node(s) on which the Oracle home de-installation will be performed are:vrh1,vrh2,vrh3
Oracle Home selected for de-install is: /g01/11.2.0/grid
Inventory Location where the Oracle home registered is: /g01/oraInventory
Skipping Windows and .NET products configuration check
Following RAC listener(s) will be de-configured: LISTENER,LISTENER_SCAN1
ASM was not detected in the Oracle Home
Do you want to continue (y - yes, n - no)? [n]: y
A log of this session will be written to: '/tmp/deinstall2011-08-31_11-59-55PM/logs/deinstall_deconfig2011-09-01_12-01-15-AM.out'
Any error messages from this session will be written to: '/tmp/deinstall2011-08-31_11-59-55PM/logs/deinstall_deconfig2011-09-01_12-01-15-AM.err'

######################## CLEAN OPERATION START ########################
ASM de-configuration trace file location: /tmp/deinstall2011-08-31_11-59-55PM/logs/asmcadc_clean2011-09-01_12-02-00-AM.log
ASM Clean Configuration END

Network Configuration clean config START

Network de-configuration trace file location: /tmp/deinstall2011-08-31_11-59-55PM/logs/netdc_clean2011-09-01_12-02-00-AM.log

De-configuring RAC listener(s): LISTENER,LISTENER_SCAN1

De-configuring listener: LISTENER
    Stopping listener: LISTENER
    Warning: Failed to stop listener. Listener may not be running.
Listener de-configured successfully.

De-configuring listener: LISTENER_SCAN1
    Stopping listener: LISTENER_SCAN1
    Warning: Failed to stop listener. Listener may not be running.
Listener de-configured successfully.

De-configuring Naming Methods configuration file on all nodes...
Naming Methods configuration file de-configured successfully.

De-configuring Local Net Service Names configuration file on all nodes...
Local Net Service Names configuration file de-configured successfully.

De-configuring Directory Usage configuration file on all nodes...
Directory Usage configuration file de-configured successfully.

De-configuring backup files on all nodes...
Backup files de-configured successfully.

The network configuration has been cleaned up successfully.

Network Configuration clean config END

---------------------------------------->

The deconfig command below can be executed in parallel on all the remote nodes.
Execute the command on  the local node after the execution completes on all the remote nodes.

Run the following command as the root user or the administrator on node "vrh3".

/tmp/deinstall2011-08-31_11-59-55PM/perl/bin/perl -I/tmp/deinstall2011-08-31_11-59-55PM/perl/lib
-I/tmp/deinstall2011-08-31_11-59-55PM/crs/install /tmp/deinstall2011-08-31_11-59-55PM/crs/install/rootcrs.pl
-force  -deconfig -paramfile "/tmp/deinstall2011-08-31_11-59-55PM/response/deinstall_Ora11g_gridinfrahome1.rsp"

Run the following command as the root user or the administrator on node "vrh2".

/tmp/deinstall2011-08-31_11-59-55PM/perl/bin/perl -I/tmp/deinstall2011-08-31_11-59-55PM/perl/lib
-I/tmp/deinstall2011-08-31_11-59-55PM/crs/install /tmp/deinstall2011-08-31_11-59-55PM/crs/install/rootcrs.pl -force
-deconfig -paramfile "/tmp/deinstall2011-08-31_11-59-55PM/response/deinstall_Ora11g_gridinfrahome1.rsp"

Run the following command as the root user or the administrator on node "vrh1".

/tmp/deinstall2011-08-31_11-59-55PM/perl/bin/perl -I/tmp/deinstall2011-08-31_11-59-55PM/perl/lib
-I/tmp/deinstall2011-08-31_11-59-55PM/crs/install /tmp/deinstall2011-08-31_11-59-55PM/crs/install/rootcrs.pl
-force  -deconfig -paramfile "/tmp/deinstall2011-08-31_11-59-55PM/response/deinstall_Ora11g_gridinfrahome1.rsp"
-lastnode

Press Enter after you finish running the above commands

While deinstall is running it will ask you to execute the commands above as the root user on every node:

su - root

[root@vrh3 ~]# /tmp/deinstall2011-08-31_11-59-55PM/perl/bin/perl -I/tmp/deinstall2011-08-31_11-59-55PM/perl/lib
-I/tmp/deinstall2011-08-31_11-59-55PM/crs/install /tmp/deinstall2011-08-31_11-59-55PM/crs/install/rootcrs.pl -force
-deconfig -paramfile "/tmp/deinstall2011-08-31_11-59-55PM/response/deinstall_Ora11g_gridinfrahome1.rsp"
Using configuration parameter file: /tmp/deinstall2011-08-31_11-59-55PM/response/deinstall_Ora11g_gridinfrahome1.rsp
PRCR-1119 : Failed to look up CRS resources of ora.cluster_vip_net1.type type
PRCR-1068 : Failed to query resources
Cannot communicate with crsd
PRCR-1070 : Failed to check if resource ora.gsd is registered
Cannot communicate with crsd
PRCR-1070 : Failed to check if resource ora.ons is registered
Cannot communicate with crsd

ACFS-9200: Supported
CRS-4535: Cannot communicate with Cluster Ready Services
CRS-4000: Command Stop failed, or completed with errors.
CRS-4544: Unable to connect to OHAS
CRS-4000: Command Stop failed, or completed with errors.
Successfully deconfigured Oracle clusterware stack on this node

[root@vrh2 ~]# /tmp/deinstall2011-08-31_11-59-55PM/perl/bin/perl -I/tmp/deinstall2011-08-31_11-59-55PM/perl/lib -I/tmp/deinstall2011-08-31_11-59-55PM/crs/install /tmp/deinstall2011-08-31_11-59-55PM/crs/install/rootcrs.pl -force  -deconfig -paramfile
"/tmp/deinstall2011-08-31_11-59-55PM/response/deinstall_Ora11g_gridinfrahome1.rsp"
Using configuration parameter file: /tmp/deinstall2011-08-31_11-59-55PM/response/deinstall_Ora11g_gridinfrahome1.rsp
Usage: srvctl [command] [object] []
    commands: enable|disable|start|stop|status|add|remove|modify|getenv|setenv|unsetenv|config
    objects: database|service|asm|diskgroup|listener|home|ons
For detailed help on each command and object and its options use:
  srvctl [command] -h or
  srvctl [command] [object] -h
PRKO-2012 : nodeapps object is not supported in Oracle Restart
ACFS-9200: Supported
CRS-4047: No Oracle Clusterware components configured.
CRS-4000: Command Stop failed, or completed with errors.
CRS-4047: No Oracle Clusterware components configured.
CRS-4000: Command Stop failed, or completed with errors.
You must kill crs processes or reboot the system to properly
cleanup the processes started by Oracle clusterware
ACFS-9313: No ADVM/ACFS installation detected.
Either /etc/oracle/olr.loc does not exist or is not readable
Make sure the file exists and it has read and execute access
Failure in execution (rc=-1, 256, No such file or directory) for command 1 /etc/init.d/ohasd deinstall
error: package cvuqdisk is not installed
Successfully deconfigured Oracle clusterware stack on this node

[root@vrh1 ~]# /tmp/deinstall2011-08-31_11-59-55PM/perl/bin/perl -I/tmp/deinstall2011-08-31_11-59-55PM/perl/lib
-I/tmp/deinstall2011-08-31_11-59-55PM/crs/install /tmp/deinstall2011-08-31_11-59-55PM/crs/install/rootcrs.pl -force
-deconfig -paramfile "/tmp/deinstall2011-08-31_11-59-55PM/response/deinstall_Ora11g_gridinfrahome1.rsp" -lastnode
Using configuration parameter file: /tmp/deinstall2011-08-31_11-59-55PM/response/deinstall_Ora11g_gridinfrahome1.rsp
Adding daemon to inittab
crsexcl failed to start
Failed to start the Clusterware. Last 20 lines of the alert log follow:
2011-08-31 23:36:55.813
[ctssd(4067)]CRS-2408:The clock on host vrh1 has been updated by the Cluster Time Synchronization Service to be synchronous with the mean cluster time.
2011-08-31 23:38:23.855
[ctssd(4067)]CRS-2408:The clock on host vrh1 has been updated by the Cluster Time Synchronization Service to be synchronous with the mean cluster time.
2011-08-31 23:39:03.873
[ctssd(4067)]CRS-2408:The clock on host vrh1 has been updated by the Cluster Time Synchronization Service to be synchronous with the mean cluster time.
2011-08-31 23:39:11.707
[/g01/11.2.0/grid/bin/orarootagent.bin(4559)]CRS-5822:Agent '/g01/11.2.0/grid/bin/orarootagent_root'
disconnected from server. Details at (:CRSAGF00117:) {0:2:27} in
/g01/11.2.0/grid/log/vrh1/agent/crsd/orarootagent_root/orarootagent_root.log.
2011-08-31 23:39:12.725
[ctssd(4067)]CRS-2405:The Cluster Time Synchronization Service on host vrh1 is shutdown by user
2011-08-31 23:39:12.764
[mdnsd(3868)]CRS-5602:mDNS service stopping by request.
2011-08-31 23:39:13.987
[/g01/11.2.0/grid/bin/orarootagent.bin(3892)]CRS-5016:Process "/g01/11.2.0/grid/bin/acfsload"
spawned by agent "/g01/11.2.0/grid/bin/orarootagent.bin" for action "check" failed:
details at "(:CLSN00010:)" in "/g01/11.2.0/grid/log/vrh1/agent/ohasd/orarootagent_root/orarootagent_root.log"
2011-08-31 23:39:27.121
[cssd(3968)]CRS-1603:CSSD on node vrh1 shutdown by user.
2011-08-31 23:39:27.130
[ohasd(3639)]CRS-2767:Resource state recovery not attempted for 'ora.cssdmonitor' as its target state is OFFLINE
2011-08-31 23:39:31.926
[gpnpd(3880)]CRS-2329:GPNPD on node vrh1 shutdown. 

Usage: srvctl [command] [object] []
    commands: enable|disable|start|stop|status|add|remove|modify|getenv|setenv|unsetenv|config
    objects: database|service|asm|diskgroup|listener|home|ons
For detailed help on each command and object and its options use:
  srvctl [command] -h or
  srvctl [command] [object] -h
PRKO-2012 : scan_listener object is not supported in Oracle Restart
Usage: srvctl [command] [object] []
    commands: enable|disable|start|stop|status|add|remove|modify|getenv|setenv|unsetenv|config
    objects: database|service|asm|diskgroup|listener|home|ons
For detailed help on each command and object and its options use:
  srvctl [command] -h or
  srvctl [command] [object] -h
PRKO-2012 : scan_listener object is not supported in Oracle Restart
Usage: srvctl [command] [object] []
    commands: enable|disable|start|stop|status|add|remove|modify|getenv|setenv|unsetenv|config
    objects: database|service|asm|diskgroup|listener|home|ons
For detailed help on each command and object and its options use:
  srvctl [command] -h or
  srvctl [command] [object] -h
PRKO-2012 : scan object is not supported in Oracle Restart
Usage: srvctl [command] [object] []
    commands: enable|disable|start|stop|status|add|remove|modify|getenv|setenv|unsetenv|config
    objects: database|service|asm|diskgroup|listener|home|ons
For detailed help on each command and object and its options use:
  srvctl [command] -h or
  srvctl [command] [object] -h
PRKO-2012 : scan object is not supported in Oracle Restart
Usage: srvctl [command] [object] []
    commands: enable|disable|start|stop|status|add|remove|modify|getenv|setenv|unsetenv|config
    objects: database|service|asm|diskgroup|listener|home|ons
For detailed help on each command and object and its options use:
  srvctl [command] -h or
  srvctl [command] [object] -h
PRKO-2012 : nodeapps object is not supported in Oracle Restart
ACFS-9200: Supported
CRS-4047: No Oracle Clusterware components configured.
CRS-4000: Command Stop failed, or completed with errors.
CRS-4047: No Oracle Clusterware components configured.
CRS-4000: Command Delete failed, or completed with errors.
CRS-4047: No Oracle Clusterware components configured.
CRS-4000: Command Stop failed, or completed with errors.
CRS-4047: No Oracle Clusterware components configured.
CRS-4000: Command Modify failed, or completed with errors.
Adding daemon to inittab
crsexcl failed to start
Failed to start the Clusterware. Last 20 lines of the alert log follow:
[ctssd(4067)]CRS-2408:The clock on host vrh1 has been updated by the Cluster Time
Synchronization Service to be synchronous with the mean cluster time.
2011-08-31 23:38:23.855
[ctssd(4067)]CRS-2408:The clock on host vrh1 has been updated by the Cluster Time
Synchronization Service to be synchronous with the mean cluster time.
2011-08-31 23:39:03.873
[ctssd(4067)]CRS-2408:The clock on host vrh1 has been updated by the Cluster Time
Synchronization Service to be synchronous with the mean cluster time.
2011-08-31 23:39:11.707
[/g01/11.2.0/grid/bin/orarootagent.bin(4559)]CRS-5822:Agent '/g01/11.2.0/grid/bin/orarootagent_root'
disconnected from server. Details at (:CRSAGF00117:) {0:2:27} in
/g01/11.2.0/grid/log/vrh1/agent/crsd/orarootagent_root/orarootagent_root.log.
2011-08-31 23:39:12.725
[ctssd(4067)]CRS-2405:The Cluster Time Synchronization Service on host vrh1 is shutdown by user
2011-08-31 23:39:12.764
[mdnsd(3868)]CRS-5602:mDNS service stopping by request.
2011-08-31 23:39:13.987
[/g01/11.2.0/grid/bin/orarootagent.bin(3892)]CRS-5016:Process
"/g01/11.2.0/grid/bin/acfsload" spawned by agent "/g01/11.2.0/grid/bin/orarootagent.bin" for action
"check" failed: details at "(:CLSN00010:)" in
"/g01/11.2.0/grid/log/vrh1/agent/ohasd/orarootagent_root/orarootagent_root.log"
2011-08-31 23:39:27.121
[cssd(3968)]CRS-1603:CSSD on node vrh1 shutdown by user.
2011-08-31 23:39:27.130
[ohasd(3639)]CRS-2767:Resource state recovery not attempted for 'ora.cssdmonitor' as its target state is OFFLINE
2011-08-31 23:39:31.926
[gpnpd(3880)]CRS-2329:GPNPD on node vrh1 shutdown.
[client(13099)]CRS-10001:01-Sep-11 00:11 ACFS-9200: Supported

CRS-4047: No Oracle Clusterware components configured.
CRS-4000: Command Delete failed, or completed with errors.
crsctl delete for vds in SYSTEMDG ... failed
CRS-4047: No Oracle Clusterware components configured.
CRS-4000: Command Delete failed, or completed with errors.
CRS-4047: No Oracle Clusterware components configured.
CRS-4000: Command Stop failed, or completed with errors.
ACFS-9313: No ADVM/ACFS installation detected.
Either /etc/oracle/olr.loc does not exist or is not readable
Make sure the file exists and it has read and execute access
Failure in execution (rc=-1, 256, No such file or directory) for command 1 /etc/init.d/ohasd deinstall
error: package cvuqdisk is not installed
Successfully deconfigured Oracle clusterware stack on this node

Return to the terminal where deinstall was originally started and press Enter:

The deconfig command below can be executed in parallel on all the remote nodes.
Execute the command on  the local node after the execution completes on all the remote nodes.

Press Enter after you finish running the above commands

<----------------------------------------

Removing Windows and .NET products configuration END
Oracle Universal Installer clean START

Detach Oracle home '/g01/11.2.0/grid' from the central inventory on the local node : Done

Delete directory '/g01/11.2.0/grid' on the local node : Done

Delete directory '/g01/oraInventory' on the local node : Done

Delete directory '/g01/orabase' on the local node : Done

Detach Oracle home '/g01/11.2.0/grid' from the central inventory on the remote nodes 'vrh3,vrh2' : Done

Delete directory '/g01/11.2.0/grid' on the remote nodes 'vrh2,vrh3' : Done

Delete directory '/g01/oraInventory' on the remote nodes 'vrh3' : Done

Delete directory '/g01/oraInventory' on the remote nodes 'vrh2' : Failed <<<<

The directory '/g01/oraInventory' could not be deleted on the nodes 'vrh2'.
Delete directory '/g01/orabase' on the remote nodes 'vrh2' : Done

Delete directory '/g01/orabase' on the remote nodes 'vrh3' : Done

Oracle Universal Installer cleanup completed with errors.

Oracle Universal Installer clean END

Oracle install clean START

Clean install operation removing temporary directory '/tmp/deinstall2011-08-31_11-59-55PM' on node 'vrh1'
Clean install operation removing temporary directory '/tmp/deinstall2011-08-31_11-59-55PM' on node 'vrh2'
Clean install operation removing temporary directory '/tmp/deinstall2011-08-31_11-59-55PM' on node 'vrh3'

Oracle install clean END

######################### CLEAN OPERATION END #########################

####################### CLEAN OPERATION SUMMARY #######################
Following RAC listener(s) were de-configured successfully: LISTENER,LISTENER_SCAN1
Oracle Clusterware is stopped and successfully de-configured on node "vrh3"
Oracle Clusterware is stopped and successfully de-configured on node "vrh2"
Oracle Clusterware is stopped and successfully de-configured on node "vrh1"
Oracle Clusterware is stopped and de-configured successfully.
Skipping Windows and .NET products configuration clean
Successfully detached Oracle home '/g01/11.2.0/grid' from the central inventory on the local node.
Successfully deleted directory '/g01/11.2.0/grid' on the local node.
Successfully deleted directory '/g01/oraInventory' on the local node.
Successfully deleted directory '/g01/orabase' on the local node.
Successfully detached Oracle home '/g01/11.2.0/grid' from the central inventory on the remote nodes 'vrh3,vrh2'.
Successfully deleted directory '/g01/11.2.0/grid' on the remote nodes 'vrh2,vrh3'.
Successfully deleted directory '/g01/oraInventory' on the remote nodes 'vrh3'.
Failed to delete directory '/g01/oraInventory' on the remote nodes 'vrh2'.
Successfully deleted directory '/g01/orabase' on the remote nodes 'vrh2'.
Successfully deleted directory '/g01/orabase' on the remote nodes 'vrh3'.
Oracle Universal Installer cleanup completed with errors.

Run 'rm -rf /etc/oraInst.loc' as root on node(s) 'vrh1,vrh3' at the end of the session.

Run 'rm -rf /opt/ORCLfmap' as root on node(s) 'vrh1 vrh3 vrh2 ' at the end of the session.
Oracle deinstall tool successfully cleaned up temporary directories.
#######################################################################

############# ORACLE DEINSTALL & DECONFIG TOOL END #############

After deinstall finishes it prompts you to run "rm -rf /etc/oraInst.loc" and "rm -rf /opt/ORCLfmap" on the nodes it names; simply do so.
Once these scripts have completed, GI has been removed from every node, /etc/inittab has been restored to its pre-GI version, and the CRS-related scripts under /etc/init.d have been deleted as well.
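
For reference, the final cleanup is just the two commands printed in the summary above, run as root ("rm -rf /etc/oraInst.loc" on vrh1 and vrh3, "rm -rf /opt/ORCLfmap" on all three nodes):

su - root
rm -rf /etc/oraInst.loc
rm -rf /opt/ORCLfmap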

 

Reference:

<How to Proceed from Failed 11gR2 Grid Infrastructure (CRS) Installation [ID 942166.1]>

Script: loop.sh, a script for verifying RAC failover

The following script can be used to verify that failover works in a RAC environment:

loop.sh
  nohup sqlplus su/su@failover @verify.sql &
     sleep 1
  nohup sqlplus su/su@failover @verify.sql &
     sleep 1
  nohup sqlplus su/su@failover @verify.sql &
     sleep 1
  nohup sqlplus su/su@failover @verify.sql &
     sleep 1

verify.sql (the verification SQL)
  REM  set pagesize 1000
  REM  the following query is for TAF connection verification
  col sid format 999
  col serial# format 9999999
  col failover_type format a13
  col failover_method format a15
  col failed_over format a11
  select sid, serial#, failover_type, failover_method, failed_over
    from v$session where username = 'SU';

  REM  the following query is for load balancing verification
  select instance_name from v$instance;
  exit

  REM you can also combine two queries:
  col inst_id format 999
  col sid format 999
  col serial# format 9999999
  col failover_type format a13
  col failover_method format a15
  col failed_over format a11

  select inst_id, sid, serial#, failover_type, failover_method,
         failed_over from gv$session where username = 'SU';

  REM  a simple select to see the distribution of users when testing 
  REM  connection load balancing

  select inst_id, count(*) from gv$session group by inst_id;

Usage:
./loop.sh
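
Note that loop.sh assumes a TAF-enabled TNS alias named failover and a test account su/su, neither of which is defined above. A minimal sketch of such an alias, borrowing the VPROD service name used later in this article (the VIP host names and port are assumptions), might look like:

FAILOVER =
 (DESCRIPTION=
  (LOAD_BALANCE=off)
  (FAILOVER=on)
  (ADDRESS=(PROTOCOL=tcp)(HOST=vrh1-vip)(PORT=1521))
  (ADDRESS=(PROTOCOL=tcp)(HOST=vrh2-vip)(PORT=1521))
  (CONNECT_DATA=
     (SERVICE_NAME=VPROD)
     (FAILOVER_MODE=(TYPE=select)(METHOD=basic))))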

Script: improving the output of the crs_stat command

In 10g RAC we often use the crs_stat command to check the status of CRS resources, but crs_stat truncates the resource names in its output. The following script produces a complete, properly formatted listing:

--------------------------- Begin Shell Script -------------------------------

#!/usr/bin/ksh
#
# Sample 10g CRS resource status query script
#
# Description:
#    - Returns formatted version of crs_stat -t, in tabular
#      format, with the complete rsc names and filtering keywords
#   - The argument, $RSC_KEY, is optional and if passed to the script, will
#     limit the output to HA resources whose names match $RSC_KEY.
# Requirements:
#   - $ORA_CRS_HOME should be set in your environment 

RSC_KEY=$1
QSTAT=-u
AWK=/usr/xpg4/bin/awk    # if not available use /usr/bin/awk

# Table header:
echo ""
$AWK \
  'BEGIN {printf "%-45s %-10s %-18s\n", "HA Resource", "Target", "State";
          printf "%-45s %-10s %-18s\n", "-----------", "------", "-----";}'

# Table body:
$ORA_CRS_HOME/bin/crs_stat $QSTAT | $AWK \
 'BEGIN { FS="="; state = 0; }
  $1~/NAME/ && $2~/'$RSC_KEY'/ {appname = $2; state=1};
  state == 0 {next;}
  $1~/TARGET/ && state == 1 {apptarget = $2; state=2;}
  $1~/STATE/ && state == 2 {appstate = $2; state=3;}
  state == 3 {printf "%-45s %-10s %-18s\n", appname, apptarget, appstate; state=0;}'

--------------------------- End Shell Script -------------------------------
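
The script takes an optional resource-name filter as its first argument (RSC_KEY); note that /usr/xpg4/bin/awk exists only on Solaris, so on Linux set AWK=/usr/bin/awk as the comment suggests. Assuming you save it as crsstat.sh (the file name is arbitrary), usage looks like:

./crsstat.sh              # list all HA resources with their full names
./crsstat.sh ora.orcl     # only resources whose names match "ora.orcl" (hypothetical filter string)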

Using the Cluster Verification Utility to apply RAC best practices

The Cluster Verification Utility (CVU) is the cluster verification tool recommended by Oracle. It helps users validate the key components of a cluster at each stage of deployment: hardware setup, Clusterware installation, RDBMS installation, storage, and so on. We can use CVU before installing the cluster to confirm that the configured environment is correct and usable, and again after the software is installed to perform an acceptance check of the cluster.

CVU provides an extensible framework: the standard checks it performs are platform-independent, and it exposes vendor interfaces for storage and network verification.
The CVU tool does not depend on any other Oracle software; everything is driven by the single cluvfy command, for example cluvfy stage -pre crsinst -n vrh1,vrh2.

Deploying cluvfy is very simple: once it is installed on the local node, the tool automatically deploys itself to the remote hosts while it runs. The automatic deployment works as follows:

  1. The user installs CVU on the local node
  2. The user runs a verification command against multiple nodes
  3. CVU copies the files it needs to the remote nodes
  4. CVU performs the checks on all nodes and generates a report

CVU provides the following capabilities:

  1. Verifies that the cluster is properly configured so that the subsequent RAC installation, configuration, and operation go smoothly
  2. All types of checks
  3. Non-destructive checks
  4. An easy-to-use interface
  5. Support for RAC on a wide range of platforms and configurations, with well-defined, uniform behavior

Do not misunderstand cluvfy's role: it is only a verifier and does not perform any actual configuration or repair work:

  1. cluvfy does not perform any kind of cluster or RAC operation
  2. When a check fails or a problem is found, cluvfy takes no corrective action
  3. cluvfy is not a performance-tuning or monitoring tool
  4. cluvfy does not attempt to verify the internal structures of a RAC database

A real RAC deployment can be logically divided into several operational phases, known as "stages"; during the deployment each stage consists of a series of operations. Every stage has its own pre-check and post-check, as shown in the figure:

[Figure: cluvfy stage list]

 

We will use the different Cluster Verification Utility "stages" during the CRS and RAC database installation; the available stages can be listed with the cluvfy stage -list command:

cluvfy stage -list

USAGE:
cluvfy stage {-pre|-post} <stage-name> <stage-specific options>  [-verbose]

Valid Stages are:
      -pre cfs        : pre-check for CFS setup
      -pre crsinst    : pre-check for CRS installation
      -pre acfscfg    : pre-check for ACFS Configuration.
      -pre dbinst     : pre-check for database installation
      -pre dbcfg      : pre-check for database configuration
      -pre hacfg      : pre-check for HA configuration
      -pre nodeadd    : pre-check for node addition.
      -post hwos      : post-check for hardware and operating system
      -post cfs       : post-check for CFS setup
      -post crsinst   : post-check for CRS installation
      -post acfscfg   : post-check for ACFS Configuration.
      -post hacfg     : post-check for HA configuration
      -post nodeadd   : post-check for node addition.
      -post nodedel   : post-check for node deletion.

In a RAC cluster an independent subsystem or module is called a component. The availability, integrity, stability, and other characteristics of cluster components can all be verified with CVU. A component can be as simple as a single storage device or as complex as the CRS stack, which itself contains sub-components such as CRSD, EVMD, CSSD, and the OCR.

While CRS is running, checking one specific component of the cluster, or diagnosing a particular cluster subsystem in isolation, requires the appropriate component-check command; the available component checks can be listed with cluvfy comp -list:

 cluvfy comp -list

USAGE:
cluvfy comp  <component-name> <component-specific options>  [-verbose]

Valid Components are:
      nodereach       : checks reachability between nodes
      nodecon         : checks node connectivity
      cfs             : checks CFS integrity
      ssa             : checks shared storage accessibility
      space           : checks space availability
      sys             : checks minimum system requirements
      clu             : checks cluster integrity
      clumgr          : checks cluster manager integrity
      ocr             : checks OCR integrity
      olr             : checks OLR integrity
      ha              : checks HA integrity
      crs             : checks CRS integrity
      nodeapp         : checks node applications existence
      admprv          : checks administrative privileges
      peer            : compares properties with peers
      software        : checks software distribution
      acfs            : checks ACFS integrity
      asm             : checks ASM integrity
      gpnp            : checks GPnP integrity
      gns             : checks GNS integrity
      scan            : checks SCAN configuration
      ohasd           : checks OHASD integrity
      clocksync       : checks Clock Synchronization
      vdisk           : checks Voting Disk configuration and UDEV settings
      dhcp            : Checks DHCP configuration
      dns             : Checks DNS configuration
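
A single component can also be checked directly; for example (using the node names from this cluster):

cluvfy comp nodecon -n vrh1,vrh2 -verbose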

The latest version of cluvfy can be downloaded from the CVU page on OTN; unless you have a special requirement we always recommend using the latest version. After a RAC installation is complete, cluvfy can also be found in the following two locations:

Clusterware Home
<crs_home>/bin/cluvfy

Oracle Home
$ORACLE_HOME/bin/cluvfy
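
To confirm which copy is being invoked and its version (a quick sanity check, not part of the original procedure):

which cluvfy
cluvfy -version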

The most frequently used cluvfy commands are the following:

Verify the hardware and operating system: checks the hardware and operating system configuration

cluvfy stage -post hwos -n vrh1,vrh2

cluvfy stage -post hwos -n vrh1,vrh2

Performing post-checks for hardware and operating system setup 

Checking node reachability...
Node reachability check passed from node "vrh1"

Checking user equivalence...
User equivalence check passed for user "grid"

Checking node connectivity...

Checking hosts config file...

Verification of the hosts config file successful

Node connectivity passed for subnet "192.168.1.0" with node(s) vrh2,vrh1
TCP connectivity check passed for subnet "192.168.1.0"

Node connectivity passed for subnet "192.168.2.0" with node(s) vrh2,vrh1
TCP connectivity check passed for subnet "192.168.2.0"

Node connectivity passed for subnet "169.254.0.0" with node(s) vrh2,vrh1
TCP connectivity check passed for subnet "169.254.0.0"

Interfaces found on subnet "192.168.1.0" that are likely candidates for VIP are:
vrh2 eth0:192.168.1.163 eth0:192.168.1.164 eth0:192.168.1.166
vrh1 eth0:192.168.1.161 eth0:192.168.1.190 eth0:192.168.1.162

Interfaces found on subnet "169.254.0.0" that are likely candidates for VIP are:
vrh2 eth1:169.254.8.92
vrh1 eth1:169.254.175.195

Interfaces found on subnet "192.168.2.0" that are likely candidates for a private interconnect are:
vrh2 eth1:192.168.2.19
vrh1 eth1:192.168.2.18

Node connectivity check passed

Check for multiple users with UID value 0 passed
Time zone consistency check passed

Checking shared storage accessibility...

  Disk                                  Sharing Nodes (2 in count)
  ------------------------------------  ------------------------
  /dev/sdb                              vrh2 vrh1               

  Disk                                  Sharing Nodes (2 in count)
  ------------------------------------  ------------------------
  /dev/sdc                              vrh2 vrh1               

  Disk                                  Sharing Nodes (2 in count)
  ------------------------------------  ------------------------
  /dev/sdd                              vrh2 vrh1               

  Disk                                  Sharing Nodes (2 in count)
  ------------------------------------  ------------------------
  /dev/sde                              vrh2 vrh1               

  Disk                                  Sharing Nodes (2 in count)
  ------------------------------------  ------------------------
  /dev/sdf                              vrh2 vrh1               

  Disk                                  Sharing Nodes (2 in count)
  ------------------------------------  ------------------------
  /dev/sdg                              vrh2 vrh1               

  Disk                                  Sharing Nodes (2 in count)
  ------------------------------------  ------------------------
  /dev/sdh                              vrh2 vrh1               

  Disk                                  Sharing Nodes (2 in count)
  ------------------------------------  ------------------------
  /dev/sdi                              vrh2 vrh1               

  Disk                                  Sharing Nodes (2 in count)
  ------------------------------------  ------------------------
  /dev/sdj                              vrh2 vrh1               

  Disk                                  Sharing Nodes (2 in count)
  ------------------------------------  ------------------------
  /dev/sdk                              vrh2 vrh1               

  Disk                                  Sharing Nodes (2 in count)
  ------------------------------------  ------------------------
  /dev/sdl                              vrh2 vrh1               

  Disk                                  Sharing Nodes (2 in count)
  ------------------------------------  ------------------------
  /dev/sdm                              vrh2 vrh1               

  Disk                                  Sharing Nodes (2 in count)
  ------------------------------------  ------------------------
  /dev/sdn                              vrh2 vrh1               

  Disk                                  Sharing Nodes (2 in count)
  ------------------------------------  ------------------------
  /dev/sdo                              vrh2 vrh1               

  Disk                                  Sharing Nodes (2 in count)
  ------------------------------------  ------------------------
  /dev/sdp                              vrh2 vrh1               

  Disk                                  Sharing Nodes (2 in count)
  ------------------------------------  ------------------------
  /dev/sdq                              vrh2 vrh1               

  Disk                                  Sharing Nodes (2 in count)
  ------------------------------------  ------------------------
  /dev/sdr                              vrh2 vrh1               

  Disk                                  Sharing Nodes (2 in count)
  ------------------------------------  ------------------------
  /dev/sds                              vrh2 vrh1               

  Disk                                  Sharing Nodes (2 in count)
  ------------------------------------  ------------------------
  /dev/sdt                              vrh2 vrh1               
Shared storage check was successful on nodes "vrh2,vrh1"
Post-check for hardware and operating system setup was successful.

Cluster Installation Ready check on all nodes: run the following command before installing Clusterware

 cluvfy stage -pre crsinst -n vrh1,vrh2

cluvfy stage -pre crsinst -n vrh1,vrh2
Performing pre-checks for cluster services setup
Checking node reachability...
Node reachability check passed from node "vrh1"
Checking user equivalence...
User equivalence check passed for user "grid"
Checking node connectivity...
Checking hosts config file...
Verification of the hosts config file successful
Node connectivity passed for subnet "192.168.1.0" with node(s) vrh2,vrh1
TCP connectivity check passed for subnet "192.168.1.0"

Node connectivity passed for subnet "192.168.2.0" with node(s) vrh2,vrh1
TCP connectivity check passed for subnet "192.168.2.0"

Node connectivity passed for subnet "169.254.0.0" with node(s) vrh2,vrh1
TCP connectivity check passed for subnet "169.254.0.0"

Interfaces found on subnet "192.168.1.0" that are likely candidates for VIP are:
vrh2 eth0:192.168.1.163 eth0:192.168.1.164 eth0:192.168.1.166
vrh1 eth0:192.168.1.161 eth0:192.168.1.190 eth0:192.168.1.162

Interfaces found on subnet "169.254.0.0" that are likely candidates for VIP are:
vrh2 eth1:169.254.8.92
vrh1 eth1:169.254.175.195

Interfaces found on subnet "192.168.2.0" that are likely candidates for a private interconnect are:
vrh2 eth1:192.168.2.19
vrh1 eth1:192.168.2.18

Node connectivity check passed

Checking ASMLib configuration.
Check for ASMLib configuration passed.
Total memory check passed
Available memory check passed
Swap space check passed
Free disk space check passed for "vrh2:/tmp"
Free disk space check passed for "vrh1:/tmp"
Check for multiple users with UID value 54322 passed
User existence check passed for "grid"
Group existence check passed for "oinstall"
Group existence check passed for "dba"
Membership check for user "grid" in group "oinstall" [as Primary] failed
Check failed on nodes:
        vrh2,vrh1
Membership check for user "grid" in group "dba" failed
Check failed on nodes:
        vrh2,vrh1
Run level check passed
Hard limits check passed for "maximum open file descriptors"
Soft limits check passed for "maximum open file descriptors"
Hard limits check passed for "maximum user processes"
Soft limits check passed for "maximum user processes"
System architecture check passed
Kernel version check passed
Kernel parameter check passed for "semmsl"
Kernel parameter check passed for "semmns"
Kernel parameter check passed for "semopm"
Kernel parameter check passed for "semmni"
Kernel parameter check passed for "shmmax"
Kernel parameter check passed for "shmmni"
Kernel parameter check passed for "shmall"
Kernel parameter check passed for "file-max"
Kernel parameter check passed for "ip_local_port_range"
Kernel parameter check passed for "rmem_default"
Kernel parameter check passed for "rmem_max"
Kernel parameter check passed for "wmem_default"
Kernel parameter check passed for "wmem_max"
Kernel parameter check passed for "aio-max-nr"
Package existence check passed for "make-3.81( x86_64)"
Package existence check passed for "binutils-2.17.50.0.6( x86_64)"
Package existence check passed for "gcc-4.1.2 (x86_64)( x86_64)"
Package existence check passed for "libaio-0.3.106 (x86_64)( x86_64)"
Package existence check passed for "glibc-2.5-24 (x86_64)( x86_64)"
Package existence check passed for "compat-libstdc++-33-3.2.3 (x86_64)( x86_64)"
Package existence check passed for "elfutils-libelf-0.125 (x86_64)( x86_64)"
Package existence check passed for "elfutils-libelf-devel-0.125( x86_64)"
Package existence check passed for "glibc-common-2.5( x86_64)"
Package existence check passed for "glibc-devel-2.5 (x86_64)( x86_64)"
Package existence check passed for "glibc-headers-2.5( x86_64)"
Package existence check passed for "gcc-c++-4.1.2 (x86_64)( x86_64)"
Package existence check passed for "libaio-devel-0.3.106 (x86_64)( x86_64)"
Package existence check passed for "libgcc-4.1.2 (x86_64)( x86_64)"
Package existence check passed for "libstdc++-4.1.2 (x86_64)( x86_64)"
Package existence check passed for "libstdc++-devel-4.1.2 (x86_64)( x86_64)"
Package existence check passed for "sysstat-7.0.2( x86_64)"
Package existence check passed for "ksh-20060214( x86_64)"
Check for multiple users with UID value 0 passed
Current group ID check passed

Starting Clock synchronization checks using Network Time Protocol(NTP)...

NTP Configuration file check started...
No NTP Daemons or Services were found to be running

Clock synchronization check using Network Time Protocol(NTP) passed

Core file name pattern consistency check passed.

User "grid" is not part of "root" group. Check passed
Default user file creation mask check passed
Checking consistency of file "/etc/resolv.conf" across nodes

File "/etc/resolv.conf" does not have both domain and search entries defined
domain entry in file "/etc/resolv.conf" is consistent across nodes
search entry in file "/etc/resolv.conf" is consistent across nodes
All nodes have one search entry defined in file "/etc/resolv.conf"
The DNS response time for an unreachable node is within acceptable limit on all nodes

File "/etc/resolv.conf" is consistent across nodes

Time zone consistency check passed

Starting check for Huge Pages Existence ...

Check for Huge Pages Existence passed

Starting check for Hardware Clock synchronization at shutdown ...

Check for Hardware Clock synchronization at shutdown passed

Pre-check for cluster services setup was unsuccessful on all the nodes.

Database Installation Ready check on all nodes: run the following command before installing the RDBMS
cluvfy stage -pre dbinst -n vrh1,vrh2

cluvfy stage -pre dbinst -n vrh1,vrh2 

Performing pre-checks for database installation 

Checking node reachability...
Node reachability check passed from node "vrh1"

Checking user equivalence...
User equivalence check passed for user "grid"

Checking node connectivity...

Checking hosts config file...

Verification of the hosts config file successful

Check: Node connectivity for interface "eth0"
Node connectivity passed for interface "eth0"

Check: Node connectivity for interface "eth1"
Node connectivity passed for interface "eth1"

Node connectivity check passed

Total memory check passed
Available memory check passed
Swap space check passed
Free disk space check passed for "vrh2:/tmp"
Free disk space check passed for "vrh1:/tmp"
Check for multiple users with UID value 54322 passed
User existence check passed for "grid"
Group existence check passed for "asmadmin"
Group existence check passed for "dba"
Membership check for user "grid" in group "asmadmin" [as Primary] passed
Membership check for user "grid" in group "dba" failed
Check failed on nodes:
        vrh2,vrh1
Run level check passed
Hard limits check passed for "maximum open file descriptors"
Soft limits check passed for "maximum open file descriptors"
Hard limits check passed for "maximum user processes"
Soft limits check passed for "maximum user processes"
System architecture check passed
Kernel version check passed
Kernel parameter check passed for "semmsl"
Kernel parameter check passed for "semmns"
Kernel parameter check passed for "semopm"
Kernel parameter check passed for "semmni"
Kernel parameter check passed for "shmmax"
Kernel parameter check passed for "shmmni"
Kernel parameter check passed for "shmall"
Kernel parameter check passed for "file-max"
Kernel parameter check passed for "ip_local_port_range"
Kernel parameter check passed for "rmem_default"
Kernel parameter check passed for "rmem_max"
Kernel parameter check passed for "wmem_default"
Kernel parameter check passed for "wmem_max"
Kernel parameter check passed for "aio-max-nr"
Package existence check passed for "make-3.81( x86_64)"
Package existence check passed for "binutils-2.17.50.0.6( x86_64)"
Package existence check passed for "gcc-4.1.2 (x86_64)( x86_64)"
Package existence check passed for "libaio-0.3.106 (x86_64)( x86_64)"
Package existence check passed for "glibc-2.5-24 (x86_64)( x86_64)"
Package existence check passed for "compat-libstdc++-33-3.2.3 (x86_64)( x86_64)"
Package existence check passed for "elfutils-libelf-0.125 (x86_64)( x86_64)"
Package existence check passed for "elfutils-libelf-devel-0.125( x86_64)"
Package existence check passed for "glibc-common-2.5( x86_64)"
Package existence check passed for "glibc-devel-2.5 (x86_64)( x86_64)"
Package existence check passed for "glibc-headers-2.5( x86_64)"
Package existence check passed for "gcc-c++-4.1.2 (x86_64)( x86_64)"
Package existence check passed for "libaio-devel-0.3.106 (x86_64)( x86_64)"
Package existence check passed for "libgcc-4.1.2 (x86_64)( x86_64)"
Package existence check passed for "libstdc++-4.1.2 (x86_64)( x86_64)"
Package existence check passed for "libstdc++-devel-4.1.2 (x86_64)( x86_64)"
Package existence check passed for "sysstat-7.0.2( x86_64)"
Package existence check passed for "ksh-20060214( x86_64)"
Check for multiple users with UID value 0 passed
Current group ID check passed
Default user file creation mask check passed

Checking CRS integrity...

CRS integrity check passed

Checking Cluster manager integrity... 

Checking CSS daemon...
Oracle Cluster Synchronization Services appear to be online.

Cluster manager integrity check passed

Checking node application existence...

Checking existence of VIP node application (required)
VIP node application check passed

Checking existence of NETWORK node application (required)
NETWORK node application check passed

Checking existence of GSD node application (optional)
GSD node application is offline on nodes "vrh2,vrh1"

Checking existence of ONS node application (optional)
ONS node application check passed

Checking if Clusterware is installed on all nodes...
Check of Clusterware install passed

Checking if CTSS Resource is running on all nodes...
CTSS resource check passed

Querying CTSS for time offset on all nodes...
Query of CTSS for time offset passed

Check CTSS state started...
CTSS is in Active state. Proceeding with check of clock time offsets on all nodes...
Check of clock time offsets passed

Oracle Cluster Time Synchronization Services check passed
Checking consistency of file "/etc/resolv.conf" across nodes

File "/etc/resolv.conf" does not have both domain and search entries defined
domain entry in file "/etc/resolv.conf" is consistent across nodes
search entry in file "/etc/resolv.conf" is consistent across nodes
All nodes have one search entry defined in file "/etc/resolv.conf"
The DNS response time for an unreachable node is within acceptable limit on all nodes

File "/etc/resolv.conf" is consistent across nodes

Time zone consistency check passed
Checking VIP configuration.
Checking VIP Subnet configuration.
Check for VIP Subnet configuration passed.
Checking VIP reachability
Check for VIP reachability passed.

Pre-check for database installation was unsuccessful on all the nodes.

Verifying the OCR component with cluvfy:
cluvfy comp ocr

cluvfy comp ocr
Verifying OCR integrity

Checking OCR integrity...

Checking the absence of a non-clustered configuration...
All nodes free of non-clustered, local-only configurations

ASM Running check passed. ASM is running on all specified nodes

Checking OCR config file "/etc/oracle/ocr.loc"...

OCR config file "/etc/oracle/ocr.loc" check successful

Disk group for ocr location "+SYSTEMDG" available on all the nodes

Disk group for ocr location "+FRA" available on all the nodes

Disk group for ocr location "+DATA" available on all the nodes

NOTE:
This check does not verify the integrity of the OCR contents.
Execute 'ocrcheck' as a privileged user to verify the contents of OCR.

OCR integrity check passed

Verification of OCR integrity was successful.
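
As the NOTE in the output indicates, this check does not verify the OCR contents themselves; for that, run ocrcheck as a privileged user, e.g.:

su - root
ocrcheck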

Making connections to RAC without load balancing but with failover

Today a friend in a QQ group asked how to configure the client side so that the application connects to only one specific instance and treats the other instances purely as failover targets; someone remarked that this is just using RAC as an active/passive hot-standby pair. Actually, back in the 8i era OPS was often used as exactly that kind of upgraded hot standby: in a classic active/passive cluster only one server is ever doing work, whereas OPS let the otherwise idle standby machine be used to some extent and offered a shorter MTTR. So even though OPS at the time still had significant performance problems (no cache fusion), quite a few users ran it that way.

We can configure two service aliases like the following in tnsnames.ora to restrict the application to a specific instance while still supporting client-side TAF.

FINANCE =
 (DESCRIPTION=
  (ADDRESS=
       (PROTOCOL=tcp)  
       (HOST=VRH1)  
       (PORT=1522)) 
  (CONNECT_DATA=
     (SERVICE_NAME=VPROD) 
     (INSTANCE_NAME=VPROD1) 
     (FAILOVER_MODE=
       (BACKUP=HR) 
       (TYPE=select) 
       (METHOD=basic))))

HR =
 (DESCRIPTION=
  (ADDRESS=
       (PROTOCOL=tcp)  
       (HOST=VRH2)  
       (PORT=1522)) 
  (CONNECT_DATA=
     (SERVICE_NAME=VPROD) 
     (INSTANCE_NAME=VPROD2)
     (FAILOVER_MODE=
       (BACKUP=FINANCE) 
       (TYPE=select) 
       (METHOD=basic))))
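
To confirm the behavior, connect through one of the aliases and check which instance served the session and whether TAF is armed; this simply reuses the queries from verify.sql above (replace scott/tiger with a real test account):

sqlplus scott/tiger@FINANCE

SQL> select instance_name from v$instance;
SQL> select failover_type, failover_method, failed_over
     from v$session where username = 'SCOTT';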

For more information about load balancing and failover, see Dan Norris's <Oracle Real Application Clusters
Load Balancing and Failover Options> and Jeremy Schneider's <Oracle Services on RAC: Five
Things You Might Not Know>.

Adding a node to 11.2.0.2 Grid Infrastructure

In an earlier article I described the steps for adding a node to a 10g RAC cluster. In 11gR2 Oracle CRS has become Grid Infrastructure, which makes CRS resources such as VIPs and ASM easier to manage; it also means that adding a node to an 11.2 GI cluster differs considerably from 10gR2.

Here are the main points of an ADD NODE operation for GI in 11.2:

Part 1: Preparation

The preparation work cannot be skipped. The prerequisites I listed in the article on adding a node to a 10g RAC cluster still apply to 11.2 GI, but pay attention to the following points:

1. Configure user equivalence not only for the oracle user but also for the grid user (the GI installation owner), unless you install both GI and the RDBMS as oracle, which is not recommended.

2. 11.2 GI introduces the octssd (Oracle Cluster Time Synchronization Service daemon) time-synchronization service; if you intend to use octssd it is recommended to disable the ntpd service, as follows:

# service ntpd stop
Shutting down ntpd:                                        [  OK  ]
# chkconfig ntpd off
# mv /etc/ntp.conf /etc/ntp.conf.orig
# rm /var/run/ntpd.pid

3. Use the cluster verify utility to check whether the node to be added meets the cluster's requirements:

cluvfy stage -pre nodeadd -n <NEW NODE>

A concrete example:

su - grid

[grid@vrh1 ~]$ cluvfy stage -pre nodeadd -n vrh3

Performing pre-checks for node addition 

Checking node reachability...
Node reachability check passed from node "vrh1"

Checking user equivalence...
User equivalence check passed for user "grid"

Checking node connectivity...

Checking hosts config file...

Verification of the hosts config file successful

Check: Node connectivity for interface "eth0"
Node connectivity passed for interface "eth0"

Node connectivity check passed

Checking CRS integrity...

CRS integrity check passed

Checking shared resources...

Checking CRS home location...
The location "/g01/11.2.0/grid" is not shared but is present/creatable on all nodes
Shared resources check for node addition passed

Checking node connectivity...

Checking hosts config file...

Verification of the hosts config file successful

Check: Node connectivity for interface "eth0"
Node connectivity passed for interface "eth0"

Check: Node connectivity for interface "eth1"
Node connectivity passed for interface "eth1"

Node connectivity check passed

Total memory check passed
Available memory check passed
Swap space check passed
Free disk space check passed for "vrh3:/tmp"
Free disk space check passed for "vrh1:/tmp"
Check for multiple users with UID value 54322 passed
User existence check passed for "grid"
Run level check passed
Hard limits check failed for "maximum open file descriptors"
Check failed on nodes:
        vrh3
Soft limits check passed for "maximum open file descriptors"
Hard limits check passed for "maximum user processes"
Soft limits check passed for "maximum user processes"
System architecture check passed
Kernel version check passed
Kernel parameter check passed for "semmsl"
Kernel parameter check passed for "semmns"
Kernel parameter check passed for "semopm"
Kernel parameter check passed for "semmni"
Kernel parameter check passed for "shmmax"
Kernel parameter check passed for "shmmni"
Kernel parameter check passed for "shmall"
Kernel parameter check passed for "file-max"
Kernel parameter check passed for "ip_local_port_range"
Kernel parameter check passed for "rmem_default"
Kernel parameter check passed for "rmem_max"
Kernel parameter check passed for "wmem_default"
Kernel parameter check passed for "wmem_max"
Kernel parameter check passed for "aio-max-nr"
Package existence check passed for "make-3.81( x86_64)"
Package existence check passed for "binutils-2.17.50.0.6( x86_64)"
Package existence check passed for "gcc-4.1.2 (x86_64)( x86_64)"
Package existence check passed for "libaio-0.3.106 (x86_64)( x86_64)"
Package existence check passed for "glibc-2.5-24 (x86_64)( x86_64)"
Package existence check passed for "compat-libstdc++-33-3.2.3 (x86_64)( x86_64)"
Package existence check passed for "elfutils-libelf-0.125 (x86_64)( x86_64)"
Package existence check passed for "elfutils-libelf-devel-0.125( x86_64)"
Package existence check passed for "glibc-common-2.5( x86_64)"
Package existence check passed for "glibc-devel-2.5 (x86_64)( x86_64)"
Package existence check passed for "glibc-headers-2.5( x86_64)"
Package existence check passed for "gcc-c++-4.1.2 (x86_64)( x86_64)"
Package existence check passed for "libaio-devel-0.3.106 (x86_64)( x86_64)"
Package existence check passed for "libgcc-4.1.2 (x86_64)( x86_64)"
Package existence check passed for "libstdc++-4.1.2 (x86_64)( x86_64)"
Package existence check passed for "libstdc++-devel-4.1.2 (x86_64)( x86_64)"
Package existence check passed for "sysstat-7.0.2( x86_64)"
Package existence check passed for "ksh-20060214( x86_64)"
Check for multiple users with UID value 0 passed
Current group ID check passed

Checking OCR integrity...

OCR integrity check passed

Checking Oracle Cluster Voting Disk configuration...

Oracle Cluster Voting Disk configuration check passed
Time zone consistency check passed

Starting Clock synchronization checks using Network Time Protocol(NTP)...

NTP Configuration file check started...
No NTP Daemons or Services were found to be running

Clock synchronization check using Network Time Protocol(NTP) passed

User "grid" is not part of "root" group. Check passed
Checking consistency of file "/etc/resolv.conf" across nodes

File "/etc/resolv.conf" does not have both domain and search entries defined
domain entry in file "/etc/resolv.conf" is consistent across nodes
search entry in file "/etc/resolv.conf" is consistent across nodes
All nodes have one search entry defined in file "/etc/resolv.conf"
PRVF-5636 : The DNS response time for an unreachable node exceeded "15000" ms on following nodes: vrh3

File "/etc/resolv.conf" is not consistent across nodes

Pre-check for node addition was unsuccessful on all the nodes.

Generally speaking, if we are not using DNS for name resolution, the resolv.conf inconsistency can be ignored; in silent installation mode, however, it may prevent the operation from completing, as discussed below.

Part 2: Adding the new node to GI

Note that addNode.sh, the key script for adding a node to 11.2.0.2 GI, may be affected by a bug. According to the official documentation, to add a node in interactive mode you only need to run addNode.sh and the OUI welcome page will appear; in practice that is not what happens:

The documentation says:
Go to CRS_home/oui/bin and run the addNode.sh script on one of the existing nodes.
Oracle Universal Installer runs in add node mode and the Welcome page displays.
Click Next and the Specify Cluster Nodes for Node Addition page displays.

What we actually got:

addNode.sh must be run as the GI owner, normally the grid user, and must be started on an existing node where GI is already running.

[grid@vrh1 ~]$ cd $ORA_CRS_HOME/oui/bin

[grid@vrh1 bin]$ ./addNode.sh
ERROR:
Value for CLUSTER_NEW_NODES not specified.

USAGE:
/g01/11.2.0/grid/cv/cvutl/check_nodeadd.pl  {-pre|-post} 

/g01/11.2.0/grid/cv/cvutl/check_nodeadd.pl -pre [-silent] CLUSTER_NEW_NODES={}
/g01/11.2.0/grid/cv/cvutl/check_nodeadd.pl -pre [-silent] CLUSTER_NEW_NODES={} 
CLUSTER_NEW_VIRTUAL_HOSTNAMES={}

/g01/11.2.0/grid/cv/cvutl/check_nodeadd.pl -pre [-silent] -responseFile
/g01/11.2.0/grid/cv/cvutl/check_nodeadd.pl -post [-silent]

Our intention was to add the node through the graphical interactive OUI (runInstaller -addNode), yet addNode.sh asked us to supply parameters, and the check_nodeadd.pl script it calls runs in silent mode.

Searching MOS and Google, virtually every document recommends adding the node in silent mode, so we had to switch to a silent addition. In fact a silent add requires only a few parameters, which is probably one reason it is so widely recommended, but here we hit another problem:

Syntax:
./addNode.sh -silent
"CLUSTER_NEW_NODES={node2}" 
"CLUSTER_NEW_PRIVATE_NODE_NAMES={node2-priv}" "CLUSTER_NEW_VIRTUAL_HOSTNAMES={node2-vip}"

In our example the actual command is as follows:

./addNode.sh -silent
"CLUSTER_NEW_NODES={vrh3}" "CLUSTER_NEW_VIRTUAL_HOSTNAMES={vrh3-vip}"
"CLUSTER_NEW_PRIVATE_NODE_NAMES={vrh3-priv}" 

Because it runs in silent mode, the command above produces no on-screen output (it actually logs to /tmp/silentInstall.log). Removing the -silent argument:

./addNode.sh  "CLUSTER_NEW_NODES={vrh3}"
"CLUSTER_NEW_VIRTUAL_HOSTNAMES={vrh3-vip}" "CLUSTER_NEW_PRIVATE_NODE_NAMES={vrh3-priv}"

Performing pre-checks for node addition 

Checking node reachability...
Node reachability check passed from node "vrh1"

Checking user equivalence...
User equivalence check passed for user "grid"

Checking node connectivity...

Checking hosts config file...

Verification of the hosts config file successful

Check: Node connectivity for interface "eth0"
Node connectivity passed for interface "eth0"

Node connectivity check passed

Checking CRS integrity...

CRS integrity check passed

Checking shared resources...

Checking CRS home location...
The location "/g01/11.2.0/grid" is not shared but is present/creatable on all nodes
Shared resources check for node addition passed

Checking node connectivity...

Checking hosts config file...

Verification of the hosts config file successful

Check: Node connectivity for interface "eth0"
Node connectivity passed for interface "eth0"

Check: Node connectivity for interface "eth1"
Node connectivity passed for interface "eth1"

Node connectivity check passed

Total memory check passed
Available memory check passed
Swap space check passed
Free disk space check passed for "vrh3:/tmp"
Free disk space check passed for "vrh1:/tmp"
Check for multiple users with UID value 54322 passed
User existence check passed for "grid"
Run level check passed
Hard limits check failed for "maximum open file descriptors"
Check failed on nodes:
        vrh3
Soft limits check passed for "maximum open file descriptors"
Hard limits check passed for "maximum user processes"
Soft limits check passed for "maximum user processes"
System architecture check passed
Kernel version check passed
Kernel parameter check passed for "semmsl"
Kernel parameter check passed for "semmns"
Kernel parameter check passed for "semopm"
Kernel parameter check passed for "semmni"
Kernel parameter check passed for "shmmax"
Kernel parameter check passed for "shmmni"
Kernel parameter check passed for "shmall"
Kernel parameter check passed for "file-max"
Kernel parameter check passed for "ip_local_port_range"
Kernel parameter check passed for "rmem_default"
Kernel parameter check passed for "rmem_max"
Kernel parameter check passed for "wmem_default"
Kernel parameter check passed for "wmem_max"
Kernel parameter check passed for "aio-max-nr"
Package existence check passed for "make-3.81( x86_64)"
Package existence check passed for "binutils-2.17.50.0.6( x86_64)"
Package existence check passed for "gcc-4.1.2 (x86_64)( x86_64)"
Package existence check passed for "libaio-0.3.106 (x86_64)( x86_64)"
Package existence check passed for "glibc-2.5-24 (x86_64)( x86_64)"
Package existence check passed for "compat-libstdc++-33-3.2.3 (x86_64)( x86_64)"
Package existence check passed for "elfutils-libelf-0.125 (x86_64)( x86_64)"
Package existence check passed for "elfutils-libelf-devel-0.125( x86_64)"
Package existence check passed for "glibc-common-2.5( x86_64)"
Package existence check passed for "glibc-devel-2.5 (x86_64)( x86_64)"
Package existence check passed for "glibc-headers-2.5( x86_64)"
Package existence check passed for "gcc-c++-4.1.2 (x86_64)( x86_64)"
Package existence check passed for "libaio-devel-0.3.106 (x86_64)( x86_64)"
Package existence check passed for "libgcc-4.1.2 (x86_64)( x86_64)"
Package existence check passed for "libstdc++-4.1.2 (x86_64)( x86_64)"
Package existence check passed for "libstdc++-devel-4.1.2 (x86_64)( x86_64)"
Package existence check passed for "sysstat-7.0.2( x86_64)"
Package existence check passed for "ksh-20060214( x86_64)"
Check for multiple users with UID value 0 passed
Current group ID check passed

Checking OCR integrity...

OCR integrity check passed

Checking Oracle Cluster Voting Disk configuration...

Oracle Cluster Voting Disk configuration check passed
Time zone consistency check passed

Starting Clock synchronization checks using Network Time Protocol(NTP)...

NTP Configuration file check started...
No NTP Daemons or Services were found to be running

Clock synchronization check using Network Time Protocol(NTP) passed

User "grid" is not part of "root" group. Check passed
Checking consistency of file "/etc/resolv.conf" across nodes

File "/etc/resolv.conf" does not have both domain and search entries defined
domain entry in file "/etc/resolv.conf" is consistent across nodes
search entry in file "/etc/resolv.conf" is consistent across nodes
All nodes have one search entry defined in file "/etc/resolv.conf"
PRVF-5636 : The DNS response time for an unreachable node exceeded "15000" ms on following nodes: vrh3

File "/etc/resolv.conf" is not consistent across nodes

Checking VIP configuration.
Checking VIP Subnet configuration.
Check for VIP Subnet configuration passed.
Checking VIP reachability
Check for VIP reachability passed.

Pre-check for node addition was unsuccessful on all the nodes.

Before actually adding the node, addNode.sh also invokes cluvfy to verify that the new node meets the requirements, and refuses to continue if it does not. Since we have already verified the new node, we can safely skip addNode.sh's own pre-checks. Let's look at the content of the addNode.sh script:

[grid@vrh1 bin]$ cat addNode.sh 

#!/bin/sh
OHOME=/g01/11.2.0/grid
INVPTRLOC=$OHOME/oraInst.loc
ADDNODE="$OHOME/oui/bin/runInstaller -addNode -invPtrLoc $INVPTRLOC ORACLE_HOME=$OHOME $*"
if [ "$IGNORE_PREADDNODE_CHECKS" = "Y" -o ! -f "$OHOME/cv/cvutl/check_nodeadd.pl" ]
then
        $ADDNODE
else
        CHECK_NODEADD="$OHOME/perl/bin/perl $OHOME/cv/cvutl/check_nodeadd.pl -pre $*"
        $CHECK_NODEADD
        if [ $? -eq 0 ]
        then
        $ADDNODE
        fi
fi

As you can see, the IGNORE_PREADDNODE_CHECKS environment variable controls whether the pre-add-node checks are performed. We set this variable manually and run addNode.sh again:

export IGNORE_PREADDNODE_CHECKS=Y

./addNode.sh  "CLUSTER_NEW_NODES={vrh3}"
"CLUSTER_NEW_VIRTUAL_HOSTNAMES={vrh3-vip}" "CLUSTER_NEW_PRIVATE_NODE_NAMES={vrh3-priv}"
> add_node.log  2>&1

In another window we can follow the add-node log as it progresses:

tail -f add_node.log 

Starting Oracle Universal Installer...

Checking swap space: must be greater than 500 MB.   Actual 5951 MB    Passed
Checking monitor: must be configured to display at least 256 colors.    Actual 16777216    Passed
Oracle Universal Installer, Version 11.2.0.2.0 Production
Copyright (C) 1999, 2010, Oracle. All rights reserved.

Performing tests to see whether nodes vrh2,vrh3 are available
............................................................... 100% Done.

.
-----------------------------------------------------------------------------
Cluster Node Addition Summary
Global Settings
   Source: /g01/11.2.0/grid
   New Nodes
Space Requirements
   New Nodes
      vrh3
         /: Required 6.66GB : Available 32.40GB
Installed Products
   Product Names
      Oracle Grid Infrastructure 11.2.0.2.0
      Sun JDK 1.5.0.24.08
      Installer SDK Component 11.2.0.2.0
      Oracle One-Off Patch Installer 11.2.0.0.2
      Oracle Universal Installer 11.2.0.2.0
      Oracle USM Deconfiguration 11.2.0.2.0
      Oracle Configuration Manager Deconfiguration 10.3.1.0.0
      Enterprise Manager Common Core Files 10.2.0.4.3
      Oracle DBCA Deconfiguration 11.2.0.2.0
      Oracle RAC Deconfiguration 11.2.0.2.0
      Oracle Quality of Service Management (Server) 11.2.0.2.0
      Installation Plugin Files 11.2.0.2.0
      Universal Storage Manager Files 11.2.0.2.0
      Oracle Text Required Support Files 11.2.0.2.0
      Automatic Storage Management Assistant 11.2.0.2.0
      Oracle Database 11g Multimedia Files 11.2.0.2.0
      Oracle Multimedia Java Advanced Imaging 11.2.0.2.0
      Oracle Globalization Support 11.2.0.2.0
      Oracle Multimedia Locator RDBMS Files 11.2.0.2.0
      Oracle Core Required Support Files 11.2.0.2.0
      Bali Share 1.1.18.0.0
      Oracle Database Deconfiguration 11.2.0.2.0
      Oracle Quality of Service Management (Client) 11.2.0.2.0
      Expat libraries 2.0.1.0.1
      Oracle Containers for Java 11.2.0.2.0
      Perl Modules 5.10.0.0.1
      Secure Socket Layer 11.2.0.2.0
      Oracle JDBC/OCI Instant Client 11.2.0.2.0
      Oracle Multimedia Client Option 11.2.0.2.0
      LDAP Required Support Files 11.2.0.2.0
      Character Set Migration Utility 11.2.0.2.0
      Perl Interpreter 5.10.0.0.1
      PL/SQL Embedded Gateway 11.2.0.2.0
      OLAP SQL Scripts 11.2.0.2.0
      Database SQL Scripts 11.2.0.2.0
      Oracle Extended Windowing Toolkit 3.4.47.0.0
      SSL Required Support Files for InstantClient 11.2.0.2.0
      SQL*Plus Files for Instant Client 11.2.0.2.0
      Oracle Net Required Support Files 11.2.0.2.0
      Oracle Database User Interface 2.2.13.0.0
      RDBMS Required Support Files for Instant Client 11.2.0.2.0
      RDBMS Required Support Files Runtime 11.2.0.2.0
      XML Parser for Java 11.2.0.2.0
      Oracle Security Developer Tools 11.2.0.2.0
      Oracle Wallet Manager 11.2.0.2.0
      Enterprise Manager plugin Common Files 11.2.0.2.0
      Platform Required Support Files 11.2.0.2.0
      Oracle JFC Extended Windowing Toolkit 4.2.36.0.0
      RDBMS Required Support Files 11.2.0.2.0
      Oracle Ice Browser 5.2.3.6.0
      Oracle Help For Java 4.2.9.0.0
      Enterprise Manager Common Files 10.2.0.4.3
      Deinstallation Tool 11.2.0.2.0
      Oracle Java Client 11.2.0.2.0
      Cluster Verification Utility Files 11.2.0.2.0
      Oracle Notification Service (eONS) 11.2.0.2.0
      Oracle LDAP administration 11.2.0.2.0
      Cluster Verification Utility Common Files 11.2.0.2.0
      Oracle Clusterware RDBMS Files 11.2.0.2.0
      Oracle Locale Builder 11.2.0.2.0
      Oracle Globalization Support 11.2.0.2.0
      Buildtools Common Files 11.2.0.2.0
      Oracle RAC Required Support Files-HAS 11.2.0.2.0
      SQL*Plus Required Support Files 11.2.0.2.0
      XDK Required Support Files 11.2.0.2.0
      Agent Required Support Files 10.2.0.4.3
      Parser Generator Required Support Files 11.2.0.2.0
      Precompiler Required Support Files 11.2.0.2.0
      Installation Common Files 11.2.0.2.0
      Required Support Files 11.2.0.2.0
      Oracle JDBC/THIN Interfaces 11.2.0.2.0
      Oracle Multimedia Locator 11.2.0.2.0
      Oracle Multimedia 11.2.0.2.0
      HAS Common Files 11.2.0.2.0
      Assistant Common Files 11.2.0.2.0
      PL/SQL 11.2.0.2.0
      HAS Files for DB 11.2.0.2.0
      Oracle Recovery Manager 11.2.0.2.0
      Oracle Database Utilities 11.2.0.2.0
      Oracle Notification Service 11.2.0.2.0
      SQL*Plus 11.2.0.2.0
      Oracle Netca Client 11.2.0.2.0
      Oracle Net 11.2.0.2.0
      Oracle JVM 11.2.0.2.0
      Oracle Internet Directory Client 11.2.0.2.0
      Oracle Net Listener 11.2.0.2.0
      Cluster Ready Services Files 11.2.0.2.0
      Oracle Database 11g 11.2.0.2.0
-----------------------------------------------------------------------------

Instantiating scripts for add node (Monday, August 15, 2011 10:15:35 PM CST)
.                                                                 1% Done.
Instantiation of add node scripts complete

Copying to remote nodes (Monday, August 15, 2011 10:15:38 PM CST)
...............................................................................................                                 96% Done.
Home copied to new nodes

Saving inventory on nodes (Monday, August 15, 2011 10:21:02 PM CST)
.                                                               100% Done.
Save inventory complete
WARNING:A new inventory has been created on one or more nodes in this session.
However, it has not yet been registered as the central inventory of this system.
To register the new inventory please run the script at '/g01/oraInventory/orainstRoot.sh'
with root privileges on nodes 'vrh3'.
If you do not register the inventory, you may not be able to update or
patch the products you installed.
The following configuration scripts need to be executed as the "root" user in each cluster node.
/g01/oraInventory/orainstRoot.sh #On nodes vrh3
/g01/11.2.0/grid/root.sh #On nodes vrh3
To execute the configuration scripts:
    1. Open a terminal window
    2. Log in as "root"
    3. Run the scripts in each cluster node

The Cluster Node Addition of /g01/11.2.0/grid was successful.
Please check '/tmp/silentInstall.log' for more details.

The GI software itself has now been installed successfully. Next we still need to run two key scripts on the newly added node; do not forget this step:

Both the orainstRoot.sh and root.sh scripts must be run as root:
su - root 

[root@vrh3]# cat /etc/oraInst.loc
inventory_loc=/g01/oraInventory                     -- location of the oraInventory
inst_group=asmadmin

[root@vrh3 ~]# cd /g01/oraInventory

[root@vrh3 oraInventory]# ./orainstRoot.sh
Creating the Oracle inventory pointer file (/etc/oraInst.loc)
Changing permissions of /g01/oraInventory.
Adding read,write permissions for group.
Removing read,write,execute permissions for world.

Changing groupname of /g01/oraInventory to asmadmin.
The execution of the script is complete.

Run the root.sh script under CRS_HOME; warnings may appear, but they are not critical:

[root@vrh3 ~]# cd $ORA_CRS_HOME

[root@vrh3 g01]# /g01/11.2.0/grid/root.sh
Running Oracle 11g root script...

The following environment variables are set as:
    ORACLE_OWNER= grid
    ORACLE_HOME=  /g01/11.2.0/grid

Enter the full pathname of the local bin directory: [/usr/local/bin]:
   Copying dbhome to /usr/local/bin ...
   Copying oraenv to /usr/local/bin ...
   Copying coraenv to /usr/local/bin ...

Creating /etc/oratab file...
Entries will be added to the /etc/oratab file as needed by
Database Configuration Assistant when a database is created
Finished running generic part of root script.
Now product-specific root actions will be performed.

Using configuration parameter file: /g01/11.2.0/grid/crs/install/crsconfig_params
Creating trace directory
LOCAL ADD MODE
Creating OCR keys for user 'root', privgrp 'root'..
Operation successful.
OLR initialization - successful
Adding daemon to inittab
ACFS-9200: Supported
ACFS-9300: ADVM/ACFS distribution files found.
ACFS-9307: Installing requested ADVM/ACFS software.
ACFS-9308: Loading installed ADVM/ACFS drivers.
ACFS-9321: Creating udev for ADVM/ACFS.
ACFS-9323: Creating module dependencies - this may take some time.
ACFS-9327: Verifying ADVM/ACFS devices.
ACFS-9309: ADVM/ACFS installation correctness verified.
CRS-4402: The CSS daemon was started in exclusive mode but found an active CSS daemon on node vrh1, number 1, and is terminating
An active cluster was found during exclusive startup, restarting to join the cluster
clscfg: EXISTING configuration version 5 detected.
clscfg: version 5 is 11g Release 2.
Successfully accumulated necessary OCR keys.
Creating OCR keys for user 'root', privgrp 'root'..
Operation successful.
/g01/11.2.0/grid/bin/srvctl start listener -n vrh3 ... failed
Failed to perform new node configuration at /g01/11.2.0/grid/crs/install/crsconfig_lib.pm line 8255.
/g01/11.2.0/grid/perl/bin/perl -I/g01/11.2.0/grid/perl/lib -I/g01/11.2.0/grid/crs/install 
/g01/11.2.0/grid/crs/install/rootcrs.pl execution failed

Two minor errors show up in the output above:

1. The LISTENER start failure on the new node can be ignored: the RDBMS_HOME has not been installed yet, but CRS already tries to start the associated listener (a sketch for retrying the start later follows the error output below).

[root@vrh3 g01]# /g01/11.2.0/grid/bin/srvctl start listener -n vrh3
PRCR-1013 : Failed to start resource ora.CRS_LISTENER.lsnr
PRCR-1064 : Failed to start resource ora.CRS_LISTENER.lsnr on node vrh3
CRS-5010: Update of configuration file "/s01/orabase/product/11.2.0/dbhome_1/network/admin/listener.ora" failed: details at "(:CLSN00014:)" in "/g01/11.2.0/grid/log/vrh3/agent/crsd/oraagent_oracle/oraagent_oracle.log"
CRS-5013: Agent "/g01/11.2.0/grid/bin/oraagent.bin" failed to start process "/s01/orabase/product/11.2.0/dbhome_1/bin/lsnrctl" for action "check": details at "(:CLSN00008:)" in "/g01/11.2.0/grid/log/vrh3/agent/crsd/oraagent_oracle/oraagent_oracle.log"
CRS-2674: Start of 'ora.CRS_LISTENER.lsnr' on 'vrh3' failed
CRS-5013: Agent "/g01/11.2.0/grid/bin/oraagent.bin" failed to start process "/s01/orabase/product/11.2.0/dbhome_1/bin/lsnrctl" for action "clean": details at "(:CLSN00008:)" in "/g01/11.2.0/grid/log/vrh3/agent/crsd/oraagent_oracle/oraagent_oracle.log"
CRS-5013: Agent "/g01/11.2.0/grid/bin/oraagent.bin" failed to start process "/s01/orabase/product/11.2.0/dbhome_1/bin/lsnrctl" for action "check": details at "(:CLSN00008:)" in "/g01/11.2.0/grid/log/vrh3/agent/crsd/oraagent_oracle/oraagent_oracle.log"
CRS-2678: 'ora.CRS_LISTENER.lsnr' on 'vrh3' has experienced an unrecoverable failure
CRS-0267: Human intervention required to resume its availability.
PRCC-1015 : LISTENER was already running on vrh3
PRCR-1004 : Resource ora.LISTENER.lsnr is already running
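
Once the RDBMS home (/s01/orabase/product/11.2.0/dbhome_1 in this environment) has actually been installed on vrh3, the failed resource can simply be started again. A minimal sketch, assuming the resource name ora.CRS_LISTENER.lsnr and node name vrh3 shown in the output above:

# check the current state of the listener resource, then retry the start that failed during root.sh
/g01/11.2.0/grid/bin/crsctl stat res ora.CRS_LISTENER.lsnr -t
/g01/11.2.0/grid/bin/srvctl start listener -n vrh3
/g01/11.2.0/grid/bin/srvctl status listener -n vrh3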

2. If the rootcrs.pl script fails, rerunning it once is usually enough:

[root@vrh3 bin]# /g01/11.2.0/grid/perl/bin/perl -I/g01/11.2.0/grid/perl/lib
-I/g01/11.2.0/grid/crs/install /g01/11.2.0/grid/crs/install/rootcrs.pl

Using configuration parameter file: /g01/11.2.0/grid/crs/install/crsconfig_params
PRKO-2190 : VIP exists for node vrh3, VIP name vrh3-vip
PRKO-2420 : VIP is already started on node(s): vrh3
Preparing packages for installation...
cvuqdisk-1.0.9-1
Configure Oracle Grid Infrastructure for a Cluster ... succeeded

3. It is recommended to restart CRS on the new node and verify with cluvfy that the node addition (nodeadd) completed successfully:

[root@vrh3 ~]# crsctl stop crs

[root@vrh3 ~]# crsctl start crs

[root@vrh3 ~]# su - grid

[grid@vrh3 ~]$ cluvfy stage -post nodeadd -n vrh1,vrh2,vrh3

Performing post-checks for node addition 

Checking node reachability...
Node reachability check passed from node "vrh1"

Checking user equivalence...
User equivalence check passed for user "grid"

Checking node connectivity...

Checking hosts config file...

Verification of the hosts config file successful

Check: Node connectivity for interface "eth0"
Node connectivity passed for interface "eth0"

Node connectivity check passed

Checking cluster integrity...

Cluster integrity check passed

Checking CRS integrity...

CRS integrity check passed

Checking shared resources...

Checking CRS home location...
The location "/g01/11.2.0/grid" is not shared but is present/creatable on all nodes
Shared resources check for node addition passed

Checking node connectivity...

Checking hosts config file...

Verification of the hosts config file successful

Check: Node connectivity for interface "eth0"
Node connectivity passed for interface "eth0"

Check: Node connectivity for interface "eth1"
Node connectivity passed for interface "eth1"

Node connectivity check passed

Checking node application existence...

Checking existence of VIP node application (required)
VIP node application check passed

Checking existence of NETWORK node application (required)
NETWORK node application check passed

Checking existence of GSD node application (optional)
GSD node application is offline on nodes "vrh3,vrh2,vrh1"

Checking existence of ONS node application (optional)
ONS node application check passed

Checking Single Client Access Name (SCAN)...

Checking TCP connectivity to SCAN Listeners...
TCP connectivity to SCAN Listeners exists on all cluster nodes

Checking name resolution setup for "vrh.cluster.oracle.com"...

ERROR:
PRVF-4664 : Found inconsistent name resolution entries for SCAN name "vrh.cluster.oracle.com"

ERROR:
PRVF-4657 : Name resolution setup check for "vrh.cluster.oracle.com" (IP address: 192.168.1.190) failed

ERROR:
PRVF-4664 : Found inconsistent name resolution entries for SCAN name "vrh.cluster.oracle.com"

Verification of SCAN VIP and Listener setup failed

User "grid" is not part of "root" group. Check passed

Checking if Clusterware is installed on all nodes...
Check of Clusterware install passed

Checking if CTSS Resource is running on all nodes...
CTSS resource check passed

Querying CTSS for time offset on all nodes...
Query of CTSS for time offset passed

Check CTSS state started...
CTSS is in Active state. Proceeding with check of clock time offsets on all nodes...
Check of clock time offsets passed

Oracle Cluster Time Synchronization Services check passed

Post-check for node addition was successful.
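
The PRVF-4664 / PRVF-4657 errors in the SCAN check above deserve a quick look even though the post-check as a whole succeeded: they typically mean the SCAN name vrh.cluster.oracle.com does not resolve consistently on every node, for example because it is defined in /etc/hosts instead of (or in addition to) DNS. A small sketch for comparing the resolution, using only standard tools and the paths from this post; run it on each of vrh1, vrh2 and vrh3 and compare the output:

nslookup vrh.cluster.oracle.com                      # what DNS returns for the SCAN name
grep -i "vrh.cluster.oracle.com" /etc/hosts          # whether /etc/hosts also defines it
/g01/11.2.0/grid/bin/srvctl config scan              # how the SCAN is registered in the cluster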

A case of INS-35354 during 11gR2 RAC installation

Today, while installing an 11.2.0.2 RAC database, the INS-35354 error appeared:
(screenshot: 11gR2-GI-INS-35354)

Since the 11.2.0.2 GI had already been installed successfully and everything in the cluster was in a normal state, the error came as somewhat of a surprise:

[grid@vrh1 ~]$ crsctl check crs
CRS-4638: Oracle High Availability Services is online
CRS-4537: Cluster Ready Services is online
CRS-4529: Cluster Synchronization Services is online
CRS-4533: Event Manager is online

A search on MOS suggested that it may be caused by inventory.xml in the oraInventory not having been updated correctly:

Applies to:
Oracle Server - Enterprise Edition - Version: 11.2.0.1 to 11.2.0.2 - Release: 11.2 to 11.2
Information in this document applies to any platform.
Symptoms

Installing 11gR2 database software in a Grid Infrastructure environment fails with the error INS-35354:

The system on which you are attempting to install Oracle RAC is not part of a valid cluster.

Grid Infrastructure (Oracle Clusterware) is running on all nodes in the cluster which can be verified with:

crsctl check crs

Changes
This is a new install.
Cause
As per 11gR2 documentation the error description is:

INS-35354: The system on which you are attempting to install Oracle RAC is not part of a valid cluster.

Cause: Prior to installing Oracle RAC, you must create a valid cluster. 
This is done by deploying Grid Infrastructure software, 
which will allow configuration of Oracle Clusterware and Automatic Storage Management.

However, the problem at hand may be that the central inventory is missing the "CRS=true" flag 
(for the Grid Infrastructure Home).
<inventory.xml>
-------------

<HOME_LIST>
<HOME NAME="Ora11g_gridinfrahome1" LOC="/u01/grid" TYPE="O" IDX="1">
<NODE_LIST>
<NODE NAME="node1"/>
<NODE NAME="node2"/>
</NODE_LIST>

 -------------

From the inventory.xml, we see that the HOME NAME line is missing the CRS="true" flag.

The error INS-35354 will occur when the central inventory entry for the Grid Infrastructure 
home is missing the flag that identifies it as CRS-type home.
Solution
Use the -updateNodeList option of the installer command to fix the inventory.

The full syntax is:

./runInstaller -updateNodeList "CLUSTER_NODES={node1,node2}"
ORACLE_HOME="" ORACLE_HOME_NAME="" LOCAL_NODE="Node_Name" CRS=[true|false]

Execute the command on any node in the cluster.

Examples:

For a two-node RAC cluster on UNIX:

Node1:
cd /u01/grid/oui/bin
./runInstaller -updateNodeList "CLUSTER_NODES={node1,node2}" ORACLE_HOME="/u01/crs" 
ORACLE_HOME_NAME="GI_11201" LOCAL_NODE="node1" CRS=true

For a 2-node RAC cluster on Windows:

Node 1:
cd e:\app\11.2.0\grid\oui\bin
e:\app\11.2.0\grid\oui\bin\setup -updateNodeList "CLUSTER_NODES={RACNODE1,RACNODE2}" 
ORACLE_HOME="e:\app\11.2.0\grid" ORACLE_HOME_NAME="OraCrs11g_home1" LOCAL_NODE="RACNODE1" CRS=true

The contents of inventory.xml in my environment were as follows:

[grid@vrh1 ContentsXML]$ cat inventory.xml 
<?xml version="1.0" standalone="yes" ?>
<!-- Copyright (c) 1999, 2010, Oracle. All rights reserved. -->
<!-- Do not modify the contents of this file by hand. -->
<INVENTORY>
<VERSION_INFO>
   <SAVED_WITH>11.2.0.2.0</SAVED_WITH>
   <MINIMUM_VER>2.1.0.6.0</MINIMUM_VER>
</VERSION_INFO>
<HOME_LIST>
<HOME NAME="Ora11g_gridinfrahome1" LOC="/g01/11.2.0/grid" TYPE="O" IDX="1" >
   <NODE_LIST>
      <NODE NAME="vrh1"/>
      <NODE NAME="vrh2"/>
   </NODE_LIST>
</HOME>
</HOME_LIST>
</INVENTORY>

Clearly the CRS="true" flag is missing from the <HOME NAME= entry, so the OUI installer concluded during its checks that GI was not installed on this node.

The fix is actually very simple: just add CRS="true" and restart runInstaller; there is no need for the more elaborate runInstaller -updateNodeList command combination described in the note.

[grid@vrh1 ContentsXML]$ cat /g01/oraInventory/ContentsXML/inventory.xml 
<?xml version="1.0" standalone="yes" ?>
<!-- Copyright (c) 1999, 2010, Oracle. All rights reserved. -->
<!-- Do not modify the contents of this file by hand. -->
<INVENTORY>
<VERSION_INFO>
   <SAVED_WITH>11.2.0.2.0</SAVED_WITH>
   <MINIMUM_VER>2.1.0.6.0</MINIMUM_VER>
</VERSION_INFO>
<HOME_LIST>
<HOME NAME="Ora11g_gridinfrahome1" LOC="/g01/11.2.0/grid" TYPE="O" IDX="1" CRS="true">
   <NODE_LIST>
      <NODE NAME="vrh1"/>
      <NODE NAME="vrh2"/>
   </NODE_LIST>
</HOME>
</HOME_LIST>
</INVENTORY>

After making the change above, the problem was resolved and the installer screen behaved normally:
(screenshot: 11gr2-RAC-Installing-db-step-4-10)
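
If editing inventory.xml by hand seems risky (the file itself warns against manual modification), the same result can be obtained with the documented -updateNodeList syntax quoted above, filled in with this cluster's values. A sketch, assuming the home name and node list shown in the inventory listing; back up the inventory first and run the command as the grid owner on vrh1 (adjust LOCAL_NODE if you run it elsewhere):

cp /g01/oraInventory/ContentsXML/inventory.xml /g01/oraInventory/ContentsXML/inventory.xml.bak
cd /g01/11.2.0/grid/oui/bin
./runInstaller -updateNodeList "CLUSTER_NODES={vrh1,vrh2}" ORACLE_HOME="/g01/11.2.0/grid" \
  ORACLE_HOME_NAME="Ora11g_gridinfrahome1" LOCAL_NODE="vrh1" CRS=true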

The global cache cr request wait event

The session is looking for a consistent read version of data but cannot find it in the local cache. The local instance has made a request to other RAC instances for the data and is waiting on its return.

Solutions

Collect more information about the average time the instances are waiting for this type of request:

select b1.inst_id, b2.value "GCS CR BLOCKS RECEIVED",
b1.value "GCS CR BLOCK RECEIVE TIME",
((b1.value / b2.value) * 10) "AVG CR BLOCK RECEIVE TIME (ms)"
from gv$sysstat b1, gv$sysstat b2
where b1.name = 'global cache cr block receive time'
and b2.name = 'global cache cr blocks received'
and b1.inst_id = b2.inst_id;

If the average cr block receive time is more than 15 milliseconds, the session is waiting excessively for block requests to be satisfied by other instances. If the average time is lower, an excessive number of block requests is likely occurring. In both cases, the following are suggestions for decreasing wait times:

If a SQL statement is not tuned properly it may request more data blocks than necessary. Review the explain plan for the SQL and tune the statement accordingly.

Reduce the latency on the network interconnect between instances. Excessive latency could be caused by:

Slow network technology is being used for the interconnect, or RAC has chosen the incorrect one. To verify the interconnect being used, search the alert.log file for "cluster interconnect" (a short sketch follows this list of causes).

Under-configured network settings at the OS level.

Errors occurring on the interconnect.
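
For the interconnect check mentioned above, two quick ways to confirm which network the instances are actually using (a sketch, assuming for illustration the GI home and database ORACLE_BASE paths used elsewhere in this document):

grep -i "cluster interconnect" /s01/orabase/diag/rdbms/*/*/trace/alert_*.log   # what the instance reports at startup
/g01/11.2.0/grid/bin/oifcfg getif                                              # interfaces registered with the clusterware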

If there are excessive waits on the remote instance’s buffer cache, increase the DB_CACHE_SIZE parameter on the remote database.

Tune the Lock Management System (LMS) being used to handle lock requests. If the system is heavily loaded or other scheduling delays for the LMS are present, consider increasing the number of LMS processes or increasing the priority so they get more CPU time. The parameter _LM_LMS can be used to set the number of LMS processes.

Distribute the workload across instances to achieve better localized block access.

If the average wait time for another wait event named "global cache null to x" is low (under 15ms), then you may be experiencing an Oracle statistics bug. This is a problem in the way statistics are reported and does not impact performance. More information can be found in Metalink Note 243593.1. Also review Metalink Note 181489.1 for other possible Oracle bugs.

The gc cr request wait event

The session is looking for a consistent read version of data but cannot find it in the local cache. The local instance has made a request to other RAC instances for the data and is waiting on its return.

Solutions

Collect more information about the average time the instances are waiting for this type of request:

select b1.inst_id, b2.value "GCS CR BLOCKS RECEIVED",
b1.value "GCS CR BLOCK RECEIVE TIME",
((b1.value / b2.value) * 10) "AVG CR BLOCK RECEIVE TIME (ms)"
from gv$sysstat b1, gv$sysstat b2
where b1.name = 'global cache cr block receive time'
and b2.name = 'global cache cr blocks received'
and b1.inst_id = b2.inst_id;

If the average cr block receive time is more than 15 milliseconds, the session is waiting excessively for block requests to be satisfied by other instances. If the average time is lower, an excessive number of block requests is likely occurring. In both cases, the following are suggestions for decreasing wait times:

If a SQL statement is not tuned properly it may request more data blocks than necessary. Review the explain plan for the SQL and tune the statement accordingly.

Reduce the latency on the network interconnect between instances.  Excessive latency could be caused by:

Slow network technology is being used for the interconnect or RAC has chosen the incorrect one. To verify the interconnect being used, search the alert.log file for “cluster interconnect”.

Under-configured network settings at the OS level.

Errors occurring on the interconnect.

If there are excessive waits on the remote instance’s buffer cache, increase the DB_CACHE_SIZE parameter on the remote database.

Tune the Lock Management System (LMS) being used to handle lock requests. If the system is heavily loaded or other scheduling delays for the LMS are present, consider increasing the number of LMS processes or increasing the priority so they get more CPU time. The parameter _LM_LMS can be used to set the number of LMS processes.

Distribute the workload across instances to achieve better localized block access.

If the average wait time for another wait event named "gc null to x" is low (under 15ms), then you may be experiencing an Oracle statistics bug. This is a problem in the way statistics are reported and does not impact performance. More information can be found in Metalink Note 243593.1. Also review Metalink Note 181489.1 for other possible Oracle bugs.
