Experiencing RAC Best Practices with the Cluster Verification Utility

The Cluster Verification Utility (CVU) is the cluster verification tool recommended by Oracle. It helps users validate the key components of a cluster at each stage of deployment, including hardware setup, Clusterware installation, RDBMS installation, storage, and so on. We can use CVU before the cluster installation to confirm that the configured environment is correct and usable, and we can also use it after the software installation as an acceptance check of the cluster.

CVU provides an extensible framework: the common checks it performs are platform independent, and it exposes Vendor Interfaces for storage and network verification.
CVU does not depend on any other Oracle software and is driven by the single cluvfy command, for example cluvfy stage -pre crsinst -n vrh1,vrh2.

Deploying cluvfy is very simple: it is installed on the local node, and while running it automatically deploys itself to the remote hosts. The automatic deployment flow is as follows:

  1. The user installs CVU on the local node
  2. The user runs a verification command against multiple nodes
  3. CVU copies the files it needs to the remote nodes
  4. CVU runs the checks on all nodes and generates a report
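For example, before any software is installed the checks can be launched straight from the unzipped installation media on one node. A minimal sketch (the staging path is only an example; the node names are the ones used throughout this article):

cd /stage/grid                                   # hypothetical location of the unzipped media
./runcluvfy.sh stage -pre crsinst -n vrh1,vrh2 -verbose
# runcluvfy.sh wraps cluvfy so it can run before any Oracle home exists;
# as described in the steps above, it copies what it needs to the remote nodes by itself.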

CVU provides the following capabilities:

  1. Verifies that the cluster is properly configured so that the subsequent RAC installation, configuration and operation proceed smoothly
  2. A full range of verification types
  3. Non-destructive checks
  4. An easy-to-use interface
  5. Support for RAC on all kinds of platforms and configurations, with well-defined, uniform behavior

Do not misread what cluvfy is for: it is only a verifier and does not perform any actual configuration or repair work:

  1. cluvfy does not perform any kind of cluster or RAC operation
  2. When a check finds a problem or fails, cluvfy takes no corrective action
  3. cluvfy is not a performance tuning or monitoring tool
  4. cluvfy does not attempt to verify the internal structures of a RAC database

The actual deployment of RAC can be logically divided into several operational phases known as "stages"; during deployment, each stage consists of a series of operations. Every stage has its own pre-check and post-check, as shown in the figure below:

[Figure: list of cluvfy stages with their pre-checks and post-checks]

We will use the different "stages" of the Cluster Verification Utility during the CRS and RAC database installation; the available stages can be listed with the cluvfy stage -list command:

cluvfy stage -list

USAGE:
cluvfy stage {-pre|-post} <stage-name> <stage-specific options>  [-verbose]

Valid Stages are:
      -pre cfs        : pre-check for CFS setup
      -pre crsinst    : pre-check for CRS installation
      -pre acfscfg    : pre-check for ACFS Configuration.
      -pre dbinst     : pre-check for database installation
      -pre dbcfg      : pre-check for database configuration
      -pre hacfg      : pre-check for HA configuration
      -pre nodeadd    : pre-check for node addition.
      -post hwos      : post-check for hardware and operating system
      -post cfs       : post-check for CFS setup
      -post crsinst   : post-check for CRS installation
      -post acfscfg   : post-check for ACFS Configuration.
      -post hacfg     : post-check for HA configuration
      -post nodeadd   : post-check for node addition.
      -post nodedel   : post-check for node deletion.

Within a RAC cluster, an independent subsystem or module is called a component. The availability, integrity, stability and other characteristics of cluster components can all be verified with CVU. A component can be as simple as a single storage device, or as complex as the CRS stack, which itself contains sub-components such as CRSD, EVMD, CSSD and the OCR.

To verify a specific component of the cluster while CRS is running, or to diagnose a single cluster subsystem in isolation, the appropriate component check command is needed; the available component checks can be listed with the cluvfy comp -list command:

 cluvfy comp -list

USAGE:
cluvfy comp  <component-name> <component-specific options>  [-verbose]

Valid Components are:
      nodereach       : checks reachability between nodes
      nodecon         : checks node connectivity
      cfs             : checks CFS integrity
      ssa             : checks shared storage accessibility
      space           : checks space availability
      sys             : checks minimum system requirements
      clu             : checks cluster integrity
      clumgr          : checks cluster manager integrity
      ocr             : checks OCR integrity
      olr             : checks OLR integrity
      ha              : checks HA integrity
      crs             : checks CRS integrity
      nodeapp         : checks node applications existence
      admprv          : checks administrative privileges
      peer            : compares properties with peers
      software        : checks software distribution
      acfs            : checks ACFS integrity
      asm             : checks ASM integrity
      gpnp            : checks GPnP integrity
      gns             : checks GNS integrity
      scan            : checks SCAN configuration
      ohasd           : checks OHASD integrity
      clocksync       : checks Clock Synchronization
      vdisk           : checks Voting Disk configuration and UDEV settings
      dhcp            : Checks DHCP configuration
      dns             : Checks DNS configuration

The latest version of cluvfy can be downloaded from the CVU page on OTN; unless there is a special requirement, we always recommend using the latest version. After a RAC installation is complete, cluvfy can also be found in the following two locations:

Clusterware Home
<crs_home>/bin/cluvfy

Oracle Home
$ORACLE_HOME/bin/cluvfy
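Whichever copy is used, a quick sanity check before scripting anything larger might look like this (paths and node names are examples):

which cluvfy                                  # confirm which copy is first on the PATH
cluvfy comp nodereach -n vrh1,vrh2 -verbose   # lightweight component check between the two nodes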

The most frequently used cluvfy commands are the following:

Verify the hardware and operating system: checks the hardware and operating system configuration

cluvfy stage -post hwos -n vrh1,vrh2

cluvfy stage -post hwos -n vrh1,vrh2

Performing post-checks for hardware and operating system setup 

Checking node reachability...
Node reachability check passed from node "vrh1"

Checking user equivalence...
User equivalence check passed for user "grid"

Checking node connectivity...

Checking hosts config file...

Verification of the hosts config file successful

Node connectivity passed for subnet "192.168.1.0" with node(s) vrh2,vrh1
TCP connectivity check passed for subnet "192.168.1.0"

Node connectivity passed for subnet "192.168.2.0" with node(s) vrh2,vrh1
TCP connectivity check passed for subnet "192.168.2.0"

Node connectivity passed for subnet "169.254.0.0" with node(s) vrh2,vrh1
TCP connectivity check passed for subnet "169.254.0.0"

Interfaces found on subnet "192.168.1.0" that are likely candidates for VIP are:
vrh2 eth0:192.168.1.163 eth0:192.168.1.164 eth0:192.168.1.166
vrh1 eth0:192.168.1.161 eth0:192.168.1.190 eth0:192.168.1.162

Interfaces found on subnet "169.254.0.0" that are likely candidates for VIP are:
vrh2 eth1:169.254.8.92
vrh1 eth1:169.254.175.195

Interfaces found on subnet "192.168.2.0" that are likely candidates for a private interconnect are:
vrh2 eth1:192.168.2.19
vrh1 eth1:192.168.2.18

Node connectivity check passed

Check for multiple users with UID value 0 passed
Time zone consistency check passed

Checking shared storage accessibility...

  Disk                                  Sharing Nodes (2 in count)
  ------------------------------------  ------------------------
  /dev/sdb                              vrh2 vrh1               

  Disk                                  Sharing Nodes (2 in count)
  ------------------------------------  ------------------------
  /dev/sdc                              vrh2 vrh1               

  Disk                                  Sharing Nodes (2 in count)
  ------------------------------------  ------------------------
  /dev/sdd                              vrh2 vrh1               

  Disk                                  Sharing Nodes (2 in count)
  ------------------------------------  ------------------------
  /dev/sde                              vrh2 vrh1               

  Disk                                  Sharing Nodes (2 in count)
  ------------------------------------  ------------------------
  /dev/sdf                              vrh2 vrh1               

  Disk                                  Sharing Nodes (2 in count)
  ------------------------------------  ------------------------
  /dev/sdg                              vrh2 vrh1               

  Disk                                  Sharing Nodes (2 in count)
  ------------------------------------  ------------------------
  /dev/sdh                              vrh2 vrh1               

  Disk                                  Sharing Nodes (2 in count)
  ------------------------------------  ------------------------
  /dev/sdi                              vrh2 vrh1               

  Disk                                  Sharing Nodes (2 in count)
  ------------------------------------  ------------------------
  /dev/sdj                              vrh2 vrh1               

  Disk                                  Sharing Nodes (2 in count)
  ------------------------------------  ------------------------
  /dev/sdk                              vrh2 vrh1               

  Disk                                  Sharing Nodes (2 in count)
  ------------------------------------  ------------------------
  /dev/sdl                              vrh2 vrh1               

  Disk                                  Sharing Nodes (2 in count)
  ------------------------------------  ------------------------
  /dev/sdm                              vrh2 vrh1               

  Disk                                  Sharing Nodes (2 in count)
  ------------------------------------  ------------------------
  /dev/sdn                              vrh2 vrh1               

  Disk                                  Sharing Nodes (2 in count)
  ------------------------------------  ------------------------
  /dev/sdo                              vrh2 vrh1               

  Disk                                  Sharing Nodes (2 in count)
  ------------------------------------  ------------------------
  /dev/sdp                              vrh2 vrh1               

  Disk                                  Sharing Nodes (2 in count)
  ------------------------------------  ------------------------
  /dev/sdq                              vrh2 vrh1               

  Disk                                  Sharing Nodes (2 in count)
  ------------------------------------  ------------------------
  /dev/sdr                              vrh2 vrh1               

  Disk                                  Sharing Nodes (2 in count)
  ------------------------------------  ------------------------
  /dev/sds                              vrh2 vrh1               

  Disk                                  Sharing Nodes (2 in count)
  ------------------------------------  ------------------------
  /dev/sdt                              vrh2 vrh1               
Shared storage check was successful on nodes "vrh2,vrh1"
Post-check for hardware and operating system setup was successful.

Cluster Installation Ready check on all nodes: run the following command before installing Clusterware

 cluvfy stage -pre crsinst -n vrh1,vrh2

cluvfy stage -pre crsinst -n vrh1,vrh2
Performing pre-checks for cluster services setup
Checking node reachability...
Node reachability check passed from node "vrh1"
Checking user equivalence...
User equivalence check passed for user "grid"
Checking node connectivity...
Checking hosts config file...
Verification of the hosts config file successful
Node connectivity passed for subnet "192.168.1.0" with node(s) vrh2,vrh1
TCP connectivity check passed for subnet "192.168.1.0"

Node connectivity passed for subnet "192.168.2.0" with node(s) vrh2,vrh1
TCP connectivity check passed for subnet "192.168.2.0"

Node connectivity passed for subnet "169.254.0.0" with node(s) vrh2,vrh1
TCP connectivity check passed for subnet "169.254.0.0"

Interfaces found on subnet "192.168.1.0" that are likely candidates for VIP are:
vrh2 eth0:192.168.1.163 eth0:192.168.1.164 eth0:192.168.1.166
vrh1 eth0:192.168.1.161 eth0:192.168.1.190 eth0:192.168.1.162

Interfaces found on subnet "169.254.0.0" that are likely candidates for VIP are:
vrh2 eth1:169.254.8.92
vrh1 eth1:169.254.175.195

Interfaces found on subnet "192.168.2.0" that are likely candidates for a private interconnect are:
vrh2 eth1:192.168.2.19
vrh1 eth1:192.168.2.18

Node connectivity check passed

Checking ASMLib configuration.
Check for ASMLib configuration passed.
Total memory check passed
Available memory check passed
Swap space check passed
Free disk space check passed for "vrh2:/tmp"
Free disk space check passed for "vrh1:/tmp"
Check for multiple users with UID value 54322 passed
User existence check passed for "grid"
Group existence check passed for "oinstall"
Group existence check passed for "dba"
Membership check for user "grid" in group "oinstall" [as Primary] failed
Check failed on nodes:
        vrh2,vrh1
Membership check for user "grid" in group "dba" failed
Check failed on nodes:
        vrh2,vrh1
Run level check passed
Hard limits check passed for "maximum open file descriptors"
Soft limits check passed for "maximum open file descriptors"
Hard limits check passed for "maximum user processes"
Soft limits check passed for "maximum user processes"
System architecture check passed
Kernel version check passed
Kernel parameter check passed for "semmsl"
Kernel parameter check passed for "semmns"
Kernel parameter check passed for "semopm"
Kernel parameter check passed for "semmni"
Kernel parameter check passed for "shmmax"
Kernel parameter check passed for "shmmni"
Kernel parameter check passed for "shmall"
Kernel parameter check passed for "file-max"
Kernel parameter check passed for "ip_local_port_range"
Kernel parameter check passed for "rmem_default"
Kernel parameter check passed for "rmem_max"
Kernel parameter check passed for "wmem_default"
Kernel parameter check passed for "wmem_max"
Kernel parameter check passed for "aio-max-nr"
Package existence check passed for "make-3.81( x86_64)"
Package existence check passed for "binutils-2.17.50.0.6( x86_64)"
Package existence check passed for "gcc-4.1.2 (x86_64)( x86_64)"
Package existence check passed for "libaio-0.3.106 (x86_64)( x86_64)"
Package existence check passed for "glibc-2.5-24 (x86_64)( x86_64)"
Package existence check passed for "compat-libstdc++-33-3.2.3 (x86_64)( x86_64)"
Package existence check passed for "elfutils-libelf-0.125 (x86_64)( x86_64)"
Package existence check passed for "elfutils-libelf-devel-0.125( x86_64)"
Package existence check passed for "glibc-common-2.5( x86_64)"
Package existence check passed for "glibc-devel-2.5 (x86_64)( x86_64)"
Package existence check passed for "glibc-headers-2.5( x86_64)"
Package existence check passed for "gcc-c++-4.1.2 (x86_64)( x86_64)"
Package existence check passed for "libaio-devel-0.3.106 (x86_64)( x86_64)"
Package existence check passed for "libgcc-4.1.2 (x86_64)( x86_64)"
Package existence check passed for "libstdc++-4.1.2 (x86_64)( x86_64)"
Package existence check passed for "libstdc++-devel-4.1.2 (x86_64)( x86_64)"
Package existence check passed for "sysstat-7.0.2( x86_64)"
Package existence check passed for "ksh-20060214( x86_64)"
Check for multiple users with UID value 0 passed
Current group ID check passed

Starting Clock synchronization checks using Network Time Protocol(NTP)...

NTP Configuration file check started...
No NTP Daemons or Services were found to be running

Clock synchronization check using Network Time Protocol(NTP) passed

Core file name pattern consistency check passed.

User "grid" is not part of "root" group. Check passed
Default user file creation mask check passed
Checking consistency of file "/etc/resolv.conf" across nodes

File "/etc/resolv.conf" does not have both domain and search entries defined
domain entry in file "/etc/resolv.conf" is consistent across nodes
search entry in file "/etc/resolv.conf" is consistent across nodes
All nodes have one search entry defined in file "/etc/resolv.conf"
The DNS response time for an unreachable node is within acceptable limit on all nodes

File "/etc/resolv.conf" is consistent across nodes

Time zone consistency check passed

Starting check for Huge Pages Existence ...

Check for Huge Pages Existence passed

Starting check for Hardware Clock synchronization at shutdown ...

Check for Hardware Clock synchronization at shutdown passed

Pre-check for cluster services setup was unsuccessful on all the nodes.
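Note that the only failures in the transcript above are the two group-membership checks for the grid user; whether they matter depends on the OS group scheme chosen for role separation. If they are genuine problems, a hedged fix (the group layout here is only an example) is to adjust the memberships as root on every node and re-run the check:

usermod -g oinstall -G asmadmin,asmdba,dba grid   # hypothetical group layout, adjust to your own scheme
id grid                                           # confirm the new memberships
# then re-run: cluvfy stage -pre crsinst -n vrh1,vrh2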

Database Installation Ready check on all nodes: run the following command before installing the RDBMS
cluvfy stage -pre dbinst -n vrh1,vrh2

cluvfy stage -pre dbinst -n vrh1,vrh2 

Performing pre-checks for database installation 

Checking node reachability...
Node reachability check passed from node "vrh1"

Checking user equivalence...
User equivalence check passed for user "grid"

Checking node connectivity...

Checking hosts config file...

Verification of the hosts config file successful

Check: Node connectivity for interface "eth0"
Node connectivity passed for interface "eth0"

Check: Node connectivity for interface "eth1"
Node connectivity passed for interface "eth1"

Node connectivity check passed

Total memory check passed
Available memory check passed
Swap space check passed
Free disk space check passed for "vrh2:/tmp"
Free disk space check passed for "vrh1:/tmp"
Check for multiple users with UID value 54322 passed
User existence check passed for "grid"
Group existence check passed for "asmadmin"
Group existence check passed for "dba"
Membership check for user "grid" in group "asmadmin" [as Primary] passed
Membership check for user "grid" in group "dba" failed
Check failed on nodes:
        vrh2,vrh1
Run level check passed
Hard limits check passed for "maximum open file descriptors"
Soft limits check passed for "maximum open file descriptors"
Hard limits check passed for "maximum user processes"
Soft limits check passed for "maximum user processes"
System architecture check passed
Kernel version check passed
Kernel parameter check passed for "semmsl"
Kernel parameter check passed for "semmns"
Kernel parameter check passed for "semopm"
Kernel parameter check passed for "semmni"
Kernel parameter check passed for "shmmax"
Kernel parameter check passed for "shmmni"
Kernel parameter check passed for "shmall"
Kernel parameter check passed for "file-max"
Kernel parameter check passed for "ip_local_port_range"
Kernel parameter check passed for "rmem_default"
Kernel parameter check passed for "rmem_max"
Kernel parameter check passed for "wmem_default"
Kernel parameter check passed for "wmem_max"
Kernel parameter check passed for "aio-max-nr"
Package existence check passed for "make-3.81( x86_64)"
Package existence check passed for "binutils-2.17.50.0.6( x86_64)"
Package existence check passed for "gcc-4.1.2 (x86_64)( x86_64)"
Package existence check passed for "libaio-0.3.106 (x86_64)( x86_64)"
Package existence check passed for "glibc-2.5-24 (x86_64)( x86_64)"
Package existence check passed for "compat-libstdc++-33-3.2.3 (x86_64)( x86_64)"
Package existence check passed for "elfutils-libelf-0.125 (x86_64)( x86_64)"
Package existence check passed for "elfutils-libelf-devel-0.125( x86_64)"
Package existence check passed for "glibc-common-2.5( x86_64)"
Package existence check passed for "glibc-devel-2.5 (x86_64)( x86_64)"
Package existence check passed for "glibc-headers-2.5( x86_64)"
Package existence check passed for "gcc-c++-4.1.2 (x86_64)( x86_64)"
Package existence check passed for "libaio-devel-0.3.106 (x86_64)( x86_64)"
Package existence check passed for "libgcc-4.1.2 (x86_64)( x86_64)"
Package existence check passed for "libstdc++-4.1.2 (x86_64)( x86_64)"
Package existence check passed for "libstdc++-devel-4.1.2 (x86_64)( x86_64)"
Package existence check passed for "sysstat-7.0.2( x86_64)"
Package existence check passed for "ksh-20060214( x86_64)"
Check for multiple users with UID value 0 passed
Current group ID check passed
Default user file creation mask check passed

Checking CRS integrity...

CRS integrity check passed

Checking Cluster manager integrity... 

Checking CSS daemon...
Oracle Cluster Synchronization Services appear to be online.

Cluster manager integrity check passed

Checking node application existence...

Checking existence of VIP node application (required)
VIP node application check passed

Checking existence of NETWORK node application (required)
NETWORK node application check passed

Checking existence of GSD node application (optional)
GSD node application is offline on nodes "vrh2,vrh1"

Checking existence of ONS node application (optional)
ONS node application check passed

Checking if Clusterware is installed on all nodes...
Check of Clusterware install passed

Checking if CTSS Resource is running on all nodes...
CTSS resource check passed

Querying CTSS for time offset on all nodes...
Query of CTSS for time offset passed

Check CTSS state started...
CTSS is in Active state. Proceeding with check of clock time offsets on all nodes...
Check of clock time offsets passed

Oracle Cluster Time Synchronization Services check passed
Checking consistency of file "/etc/resolv.conf" across nodes

File "/etc/resolv.conf" does not have both domain and search entries defined
domain entry in file "/etc/resolv.conf" is consistent across nodes
search entry in file "/etc/resolv.conf" is consistent across nodes
All nodes have one search entry defined in file "/etc/resolv.conf"
The DNS response time for an unreachable node is within acceptable limit on all nodes

File "/etc/resolv.conf" is consistent across nodes

Time zone consistency check passed
Checking VIP configuration.
Checking VIP Subnet configuration.
Check for VIP Subnet configuration passed.
Checking VIP reachability
Check for VIP reachability passed.

Pre-check for database installation was unsuccessful on all the nodes.

Use cluvfy to verify the OCR component:
cluvfy comp ocr

cluvfy comp ocr
Verifying OCR integrity

Checking OCR integrity...

Checking the absence of a non-clustered configuration...
All nodes free of non-clustered, local-only configurations

ASM Running check passed. ASM is running on all specified nodes

Checking OCR config file "/etc/oracle/ocr.loc"...

OCR config file "/etc/oracle/ocr.loc" check successful

Disk group for ocr location "+SYSTEMDG" available on all the nodes

Disk group for ocr location "+FRA" available on all the nodes

Disk group for ocr location "+DATA" available on all the nodes

NOTE:
This check does not verify the integrity of the OCR contents.
Execute 'ocrcheck' as a privileged user to verify the contents of OCR.

OCR integrity check passed

Verification of OCR integrity was successful.
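As the NOTE in the output points out, this check does not look at the logical contents of the OCR; a minimal follow-up, run as a privileged user, is simply:

ocrcheck    # reports the OCR version, total/used space, configured locations and an integrity status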

Adding a Node to a 10g RAC Cluster

The RAC architecture opens up a new way to scale a system: simply keep adding servers to the cluster. Here we describe how to add a node to a 10g RAC cluster; it is really not complicated, and could even be called easy. Adding a node to an existing Oracle 10g RAC roughly involves the following steps:

  1. Configure the hardware and operating system environment on the new server node
  2. Add the node to the cluster
  3. Install the Oracle Database software on the new node
  4. Configure a LISTENER for the new node
  5. Add an instance on the new node through DBCA

We assume that the current RAC environment has two nodes, vrh1 and vrh2, that both Clusterware and the database are at version 10.2.0.4, and that the new node to be added to the cluster is vrh3.

I. Configure the operating system environment on the new server node

  1. This includes configuring the public network and private network interfaces the node will use. Do not forget to add the network information of the existing nodes to the hosts file, and to copy the complete hosts file to the nodes already in the cluster so that every node can resolve every other node
  2. Extend the user equivalence of the oracle (or other DBA) user to the new node: append the keys from the id_rsa.pub and id_dsa.pub files generated on the new node to authorized_keys, and make sure an identical authorized_keys file exists on every node (see the sketch after this list)
  3. Adjust the kernel parameters on the new node so that they satisfy the memory requirements of the instance that will run there and the UDP network settings recommended for a 10g RAC cluster
  4. Adjust the system time on the new node to match the other nodes, or configure the NTP service
  5. Make sure CRS is running normally on all existing cluster nodes, otherwise addNode will raise errors
  6. Create the installation directories for the Clusterware and database software; the paths must match those on the existing nodes
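For step 2, a hedged sketch of extending user equivalence to the new node (host names and the oracle user follow the environment assumed in this article):

# On the new node vrh3, as oracle, generate the key pairs if they do not exist yet:
ssh-keygen -t rsa -N "" -f ~/.ssh/id_rsa
ssh-keygen -t dsa -N "" -f ~/.ssh/id_dsa
# Append vrh3's public keys to the authorized_keys file maintained on vrh1:
cat ~/.ssh/id_rsa.pub ~/.ssh/id_dsa.pub | ssh oracle@vrh1 'cat >> ~/.ssh/authorized_keys'
# Pull back the complete file and push the identical copy to every node:
scp oracle@vrh1:.ssh/authorized_keys ~/.ssh/authorized_keys
chmod 600 ~/.ssh/authorized_keys
for h in vrh1 vrh2; do scp ~/.ssh/authorized_keys oracle@$h:.ssh/; done
# Verify passwordless SSH from the new node before running addNode.sh:
ssh vrh1 date; ssh vrh2 date; ssh vrh3 date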

II. Add the new node to the RAC cluster
1. Log on to an existing RAC node (vrh1) as the oracle user, set a suitable DISPLAY environment variable, and run the addNode.sh script from the $ORA_CRS_HOME/oui/bin directory:

[oracle@vrh1 ~]$ export DISPLAY=IP:0.0
[oracle@vrh1 ~]$ cd $ORA_CRS_HOME/oui/bin
[oracle@vrh1 bin]$ ./addNode.sh

2. If OUI starts properly the Welcome screen appears; click Next:
[Screenshot]

3. On the "Specify Cluster Nodes to Add to Installation" screen, enter the public node name, private node name and VIP node name (these should already be configured in the /etc/hosts file on every node):
[Screenshot]

4. On the "Cluster Node Addition Summary" screen, review the summary information and click Next:
[Screenshot]

5. The "Cluster Node Addition Progress" screen appears. After OUI has copied the installed Clusterware software to the new node, it prompts the user to run several scripts: orainstRoot.sh (on the new node), rootaddnode.sh (on the existing node running OUI) and root.sh (on the new node):
[Screenshot]

6. Run the three scripts from the prompt:

/* Run the orainstRoot.sh script on the new node */

[root@vrh3 ~]# /s01/oraInventory/orainstRoot.sh 
Changing permissions of /s01/oraInventory to 770.
Changing groupname of /s01/oraInventory to oinstall.
The execution of the script is complete

/* Run the rootaddnode.sh script on the existing node where OUI is running */

[root@vrh1 ~]# /s01/oracle/product/10.2.0/crs/install/rootaddnode.sh
clscfg: EXISTING configuration version 3 detected.
clscfg: version 3 is 10G Release 2.
Attempting to add 1 new nodes to the configuration
Using ports: CSS=49895 CRS=49896 EVMC=49898 and EVMR=49897.
node {nodenumber}: {nodename} {private interconnect name} {hostname}
node 3: vrh3 vrh3-priv vrh3
Creating OCR keys for user 'root', privgrp 'root'..
Operation successful.
/s01/oracle/product/10.2.0/crs/bin/srvctl add nodeapps -n vrh3 -A vrh3-vip/255.255.255.0/eth0 -o /s01/oracle/product/10.2.0/crs

/* Run the root.sh script on the new node */

[root@vrh3 ~]# /s01/oracle/product/10.2.0/crs/root.sh 
WARNING: directory '/s01/oracle/product/10.2.0' is not owned by root
WARNING: directory '/s01/oracle/product' is not owned by root
WARNING: directory '/s01/oracle' is not owned by root
WARNING: directory '/s01' is not owned by root
Checking to see if Oracle CRS stack is already configured
/etc/oracle does not exist. Creating it now.
OCR LOCATIONS =  /dev/raw/raw1
OCR backup directory '/s01/oracle/product/10.2.0/crs/cdata/crs' does not exist. Creating now
Setting the permissions on OCR backup directory
Setting up NS directories
Oracle Cluster Registry configuration upgraded successfully
WARNING: directory '/s01/oracle/product/10.2.0' is not owned by root
WARNING: directory '/s01/oracle/product' is not owned by root
WARNING: directory '/s01/oracle' is not owned by root
WARNING: directory '/s01' is not owned by root
clscfg: EXISTING configuration version 3 detected.
clscfg: version 3 is 10G Release 2.
Successfully accumulated necessary OCR keys.
Using ports: CSS=49895 CRS=49896 EVMC=49898 and EVMR=49897.
node {nodenumber}: {nodename} {private interconnect name} {hostname}
node 1: vrh1 vrh1-priv vrh1
node 2: vrh2 vrh2-priv vrh2
clscfg: Arguments check out successfully.

NO KEYS WERE WRITTEN. Supply -force parameter to override.
-force is destructive and will destroy any previous cluster
configuration.
Oracle Cluster Registry for cluster has already been initialized
Startup will be queued to init within 30 seconds.
Adding daemons to inittab
Expecting the CRS daemons to be up within 600 seconds.
CSS is active on these nodes.
        vrh1
        vrh2
        vrh3
CSS is active on all nodes.
Waiting for the Oracle CRSD and EVMD to start
Oracle CRS stack installed and running under init(1M)
Running vipca(silent) for configuring nodeapps

Creating VIP application resource on (0) nodes.
Creating GSD application resource on (2) nodes...
Creating ONS application resource on (2) nodes...
Starting VIP application resource on (2) nodes...
Starting GSD application resource on (2) nodes...
Starting ONS application resource on (2) nodes...
Done.

7. After the scripts above have been run, return to the OUI screen and click OK; the installation success screen appears:
[Screenshot]

III. Install the Oracle Database software on the new node
1. Log on to an existing RAC node as the oracle user, set a suitable DISPLAY environment variable, and run the addNode.sh script from the $ORACLE_HOME/oui/bin directory:

[oracle@vrh1 ~]$ export DISPLAY=IP:0.0

[oracle@vrh1 ~]$ cd $ORACLE_HOME/oui/bin
[oracle@vrh1 bin]$ ./addNode.sh 

2. If OUI starts properly the Welcome screen appears; click Next:
[Screenshot]

3. On the "Specify Cluster Nodes to Add to Installation" screen, select the node on which the Oracle Database software should be installed, here vrh3:
[Screenshot]

4. On the "Cluster Node Addition Summary" screen, review the installation summary and click Next:
[Screenshot]

5. The "Cluster Node Addition Progress" screen appears; once the Oracle Database software has been transferred and installed remotely, the user is prompted to run the root.sh script:
[Screenshot]

6. Run the root.sh script from the prompt on the new node (here vrh3)

IV. Configure the LISTENER on the new node (vrh3) with the netca tool. We do not expand on this here, but a quick verification sketch follows.
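A hedged way to confirm the result after netca has run (the listener name is only an example of the LISTENER_<node> pattern netca typically generates in 10g):

srvctl status nodeapps -n vrh3    # VIP, GSD, ONS and listener status on the new node
lsnrctl status LISTENER_VRH3      # query the new listener directly on vrh3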

V. Add an instance on the new node with the DBCA tool
1. Log on to an existing RAC node as the oracle user, set a suitable DISPLAY environment variable, and run the dbca command:

2. The DBCA Welcome screen appears; select "Oracle Real Application Clusters" and click Next:
[Screenshot]

3. Select the Instance Management operation and click Next:
[Screenshot]

4. Select "Add an instance" and click Next:
[Screenshot]

5. Select the database to which the instance should be added and enter the credentials of a SYSDBA user:
[Screenshot]

6. Confirm the correct instance name and node, and click Next

7. Review the storage information and click Next

8. Review the summary information and click OK

9. Wait for the progress bar to finish; when asked whether to "perform another operation", choose No

10. Finally, verify that the node was added successfully by querying the gv$instance view; if the addition succeeded, all node instances will be visible, as in the sketch below.
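A minimal verification sketch, run from any node (node and instance names follow this article's environment); with vrh3 added successfully, three instances should be reported:

sqlplus -s / as sysdba <<'EOF'
select inst_id, instance_name, host_name, status from gv$instance order by inst_id;
EOF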

[Oracle Data Recovery] A Case of ORA-00600 [6711]

On a 10.2.0.4 system running on Linux, the ORA-00600 [6711] internal error kept appearing in the alert log:

Wed Sep  1 21:24:30 2010
Errors in file /s01/10gdb/admin/YOUYUS/bdump/youyus_smon_5622.trc:
ORA-00600: internal error code, arguments: [6711], [4256248], [1], [4256242], [0], [], [], []
Wed Sep  1 21:24:31 2010
Non-fatal internal error happenned while SMON was doing logging scn->time mapping.

There is only a very brief note on MOS about internal error 6711; it states that the error most likely indicates latent corruption in some of the cluster-type data dictionary tables, and it does not even explain what the error's arguments mean.
We can still work them out, though. Since this is a corruption-related error, the factors involved are most likely obj#, file# and block#, and the two numbers 4256248 and 4256242 look very much like Data Block Addresses. Treated as DBAs, they point to blocks 61938 and 61944 of datafile 1.
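A hedged way to double-check that arithmetic from SQL*Plus, using DBMS_UTILITY to split each argument into a file number and a block number (both values are copied verbatim from the error):

sqlplus -s / as sysdba <<'EOF'
select dbms_utility.data_block_address_file(4256248)  file#,
       dbms_utility.data_block_address_block(4256248) block#
  from dual
union all
select dbms_utility.data_block_address_file(4256242),
       dbms_utility.data_block_address_block(4256242)
  from dual;
EOF

With the file and block numbers in hand, let's see which object these blocks belong to: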

SQL> set linesize 200;
SQL> select segment_name, segment_type
  2    from dba_extents
  3   where relative_fno = 1
  4     and (61938 between block_id and block_id + blocks or
  5         61944 between block_id and block_id + blocks);

SEGMENT_NAME                                                                      SEGMENT_TYPE
--------------------------------------------------------------------------------- ------------------
SMON_SCN_TO_TIME                                                                  CLUSTER

As expected, it is a cluster. SMON_SCN_TO_TIME is the base cluster of the SMON_SCN_TIME table, which records the timestamps corresponding to SCNs in the database. We can look directly at the sql.bsq file used to create the data dictionary to learn more about their structure:

cat $ORACLE_HOME/rdbms/admin/sql.bsq|grep -A 24 "create cluster smon_scn_to_time"
create cluster smon_scn_to_time (
  thread number                         /* thread, compatibility */
)
/
create index smon_scn_to_time_idx on cluster smon_scn_to_time
/
create table smon_scn_time (
  thread number,                         /* thread, compatibility */
  time_mp number,                        /* time this recent scn represents */
  time_dp date,                          /* time as date, compatibility */
  scn_wrp number,                        /* scn.wrp, compatibility */
  scn_bas number,                        /* scn.bas, compatibility */
  num_mappings number,
  tim_scn_map raw(1200),
  scn number default 0,                  /* scn */
  orig_thread number default 0           /* for downgrade */
) cluster smon_scn_to_time (thread)
/

create unique index smon_scn_time_tim_idx on smon_scn_time(time_mp)
/

create unique index smon_scn_time_scn_idx on smon_scn_time(scn)
/

The script above shows that several indexes exist on this cluster; we need to validate all of these objects further:

SQL> analyze table SMON_SCN_TIME validate structure;
Table analyzed.

SQL> analyze table SMON_SCN_TIME validate structure cascade;
Table analyzed.

SQL> analyze cluster SMON_SCN_TO_TIME validate structure;
Cluster analyzed.

SQL> analyze cluster SMON_SCN_TO_TIME validate structure cascade;
analyze cluster SMON_SCN_TO_TIME validate structure cascade
*
ERROR at line 1:
ORA-01499: table/index cross reference failure - see trace file

At this point the problem is clear: it lies in smon_scn_to_time_idx, the index on the SMON_SCN_TO_TIME cluster, which most likely contains a logical corruption. Fortunately only an index is affected, so once the cause has been pinpointed the fix is much easier:

SQL> alter index smon_scn_to_time_idx rebuild ;

Index altered.

/* When an index is corrupted, merely rebuilding it is often not enough; while we were rebuilding, ORA-00600 [6711] appeared again in the alert log !!! */

/* We need to drop the problematic index completely and create it again !!! */

SQL> drop index smon_scn_to_time_idx ;

Index dropped.

SQL> create index smon_scn_to_time_idx on cluster smon_scn_to_time;

Index created.

/* At this point the problem is solved and the error no longer appears in the alert log! */

/* That's great! */
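One hedged way to confirm the repair is to re-run the cascade validation that failed earlier; it should now complete without ORA-01499:

sqlplus -s / as sysdba <<'EOF'
analyze cluster SMON_SCN_TO_TIME validate structure cascade;
EOF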
