Ignore gsd resource failed to start above 10g

On : 10.2.0.1 version, Real Application Cluster

When attempting to start gsd resource.
the following error occurs.

ERROR
———————–
Auto-start failed for the CRS resource .

Trac the issue with note:
Tracing GSD, SRVCTL, GSDCTL, VIPCA and SRVCONFIG (Doc ID 178683.1)

Tracing GSD, SRVCTL, GSDCTL, VIPCA and SRVCONFIG

PURPOSE
-------

The Purpose of this document is to assist in debugging SRVCTL, GSD, GSDCTL, VIPCA,
and SRVCONFIG problems.

SCOPE & APPLICATION
-------------------

This document is for support analysts to troubleshoot SRVCTL, GSD, GSDCTL, VIPCA,
and SRVCONFIG issues.

TRACING GSD, SRVCTL, GSDCTL, VIPCA, and SRVCONFIG
------------------------------------------

To provide verbose output for SRVCTL, GSD, GSDCTL, VIPCA, or SRVCONFIG, tracing can
be enabled to provide additional screen output.

--------------------------------------------------------------------------

10g:

Just set the environment variable SRVM_TRACE to true to trace all of the
SRVM files like gsd, srvctl, vipca, and ocrconfig.

--------------------------------------------------------------------------

9i:

To Trace GSD:
-------------
1. vi the gsd.sh file in the $ORACLE_HOME/bin directory.

   For Windows:  Right click on the OraHomebingsd.bat file and choose Edit.

2. At the end of the file, look for the following line:

  exec $JRE -classpath $CLASSPATH oracle.ops.mgmt.daemon.OPSMDaemon $MY_OHOME

3. Add the following just before the -classpath in the 'exec $JRE' line:

  -DTRACING.ENABLED=true -DTRACING.LEVEL=2

4. At the end of the gsd.sh file, the string should now look like this:

  exec $JRE -DTRACING.ENABLED=true -DTRACING.LEVEL=2 -classpath.....

5. Test this by running gsd.sh:

 [opcbsol1]/u01/home/usupport> gsd.sh
 [main][9:31:8:860] Daemon: argument is /u01/32bit/app/oracle/product/9.0.1
 [main][9:31:8:893] tracing is true; at level 2
 [main][9:31:8:893] trace file is /u01/32bit/app/oracle/product/9.0.1/srvm/log/gsdaemon.log
 cont...

To Trace SRVCTL:
---------------
1. vi the srvctl file in the $ORACLE_HOME/bin directory.

   For Windows:  Right click on the OraHomebinsrvctl.bat file and choose Edit.

2. At the end of the file, look for the following line:

  $JRE -classpath $CLASSPATH oracle.ops.opsctl.OPSCTLDriver "$@"

3. Add the following just before the -classpath in the '$JRE' line:

  -DTRACING.ENABLED=true -DTRACING.LEVEL=2

4. At the end of the srvctl file, the string should now look like this:

  $JRE -DTRACING.ENABLED=true -DTRACING.LEVEL=2 -classpath.....

5. Test this by running srvctl:

 [opcbsol1]/u01/home/usupport> srvctl status -p V90321
 [main][9:33:2:968] srvctl: tracing is true at level 2
 [main][9:33:3:38] Going into GetActiveNodes constructor...
 [main][9:33:3:59] Detected Cluster
 [main][9:33:3:60] Cluster existence = true
 [main][9:33:3:95] loaded library
 [main][9:33:3:108] Inside GetActiveNodes.initializeCluster
 [main][9:33:3:264] The status string is: 1
 [main][9:33:3:265] The result string is: Everything ok So Far 1
 cont...

To Trace GSDCTL:
---------------
1. vi the gsdctl file in the $ORACLE_HOME/bin directory.

   For Windows:  Right click on the OraHomebingsdctl.bat file and choose Edit.

2. At the end of the file, look for the following line:

  $JRE -classpath $CLASSPATH oracle.ops.mgmt.daemon.GSDCTLDriver...

3. Add the following just before the -classpath in the '$JRE' line:

  -DTRACING.ENABLED=true -DTRACING.LEVEL=2

4. At the end of the gsdctl file, the string should now look like this:

  $JRE -DTRACING.ENABLED=true -DTRACING.LEVEL=2 -classpath.....

5. Test this by running gsdctl:

  [opcbsol1]/u02/32bit/app/oracle/product/9.2.0/bin> gsdctl stat
  [main] [15:41:34:849] [GetActiveNodes.create:Compile]  Going into GetActiveNodes
  [main] [15:41:34:918] [sQueryCluster.:Compile]  Detected Cluster
  [main] [15:41:34:922] [sQueryCluster.isCluster:Compile]  Cluster existence = true
  cont...

To Trace SRVCONFIG:
-------------------
1. vi the srvconfig file in the $ORACLE_HOME/bin directory.

   For Windows:  Right click on the OraHomebinsrvconfig.bat file and choose Edit.

2. At the end of the file, look for the following line:

  $JRE -classpath $CLASSPATH oracle.ops.mgmt.rawdevice.RawDeviceUtil $*

3. Add the following just before the -classpath in the '$JRE' line:

  -DTRACING.ENABLED=true -DTRACING.LEVEL=2

4. At the end of the srvconfig file, the string should now look like this:

  $JRE -DTRACING.ENABLED=true -DTRACING.LEVEL=2 -classpath.....

5. Test this by running srvconfig:

  [opcbsol1]/u02/32bit/app/oracle/product/9.2.0/bin> srvconfig -version
  [main] [16:0:58:395] [RawDeviceUtil.getDeviceName:Compile]
  [main] [16:0:58:454] [sQueryCluster.:Compile]  Detected Cluster
  [main] [16:0:58:457] [sQueryCluster.isCluster:Compile]  Cluster existence = true
  cont...

Failed to start GSD on local node

PROBLEM
-------

AIX 5L cannot successfully start gsd on any node of the cluster.
Get error "Failed to start GSD on local node"

SOLUTION
--------
Ensure that the user (oracle) is added to the HAGSUSER UNIX group.

If the gsd still fails, turn on tracing of the GSD.
Simply turning on GSD tracing, allowed for the GSD to start successfully.

Look at note 178683.1 for how to enable GSD tracing.

LOG FILE
-----------------------
Filename =crsd.log
See the following error:
2009-01-02 08:08:27.838: [ CRSCOMM][12351]32Receive message header caa_clsrecv ret 11
2009-01-02 08:08:27.838: [ CRSCOMM][12351]32Error reading response IOException : Didn't receive header part of message
(File: caa_Message.cpp, line: 711

2009-01-02 08:08:27.838: [ CRSEVT][12351]32invokepeer ret 300
2009-01-02 08:08:27.838: [ CRSRES][12351]32Remote start failed to execute on ccdb_b: X_E2E_NoResponse :
(File: caa_CmdRTI.cpp, line: 507

2009-01-02 08:08:27.839: [ CRSRES][12351][ALERT]32Remote start for `ora.ccdb_b.gsd` failed on member `ccdb_b`
2009-01-02 08:08:27.914: [ OCRMAS][3611]th_master:13: I AM THE NEW OCR MASTER at incar 6. Node Number 1
2009-01-02 08:08:27.915: [ OCRRAW][3611]proprioo: for disk 0 (/dev/ro_ocr_raw), id match (1), my id set
(1731740172,1028247821) total id sets (1), 1st set (1731740172,1028247821), 2nd set (0,0) my votes (2), total votes (2)
2009-01-02 08:08:27.916: [ OCRRAW][3611]rrecovernumpage: numpage on device is not correct (0); recalculate (262075)
2009-01-02 08:08:27.922: [ OCRMAS][3611]th_master: Deleted ver keys from cache (master)
2009-01-02 08:08:30.996: [ CLSVER][527]32Returned from grpstat with event 1
2009-01-02 08:08:30.996: [ CLSVER][527]32Doing grpstat on crs_version group
2009-01-02 08:08:58.400: [ CRSCOMM][13127]32CLEANUP: Searching for connections to failed node ccdb_b
2009-01-02 08:08:58.400: [ CRSEVT][13127]32Processing member leave for ccdb_b, incarnation: 7
2009-01-02 08:08:58.402: [ CRSD][13127]32SM: recovery in process: 8
2009-01-02 08:08:58.402: [ CRSEVT][13127]32Do failover for: ccdb_b
2009-01-02 08:08:58.418: [ CRSRES][13127]32 startup = 0
2009-01-02 08:08:58.435: [ CRSRES][13127]32Not failing resource ora.ccdb_a.gsd because it was locked.
2009-01-02 08:08:58.435: [ CRSRES][13127]32X_RES_Unavailable : Resource ora.ccdb_a.gsd is locked
(File: rti.cpp, line: 976

2009-01-02 08:08:58.438: [ CRSRES][13127]32 startup = 0
2009-01-02 08:08:58.444: [ CRSRES][13127]32 startup = 0
2009-01-02 08:08:58.491: [ CRSRES][13898]32startRunnable: setting CLI values

On the customer ‘s environment other Aix platform got the same issues as this machine .
Due to this reason ,we considered the issue is cause of setups and gsd resource won’t impact the oracle or other applications above the version (10G) .

Work arounds
Manually disable the gsd resource :
1.Use crs_unregister to delete the resource from CRS then CRS won’t attempt to start the gsd resource .
Hard code the during checking the status
2.Hard code the gsd.sh return the status Online ,to show the status Online ;

GSD resource won’t impace the CRS or Database above the version 10g

Comment

*

沪ICP备14014813号-2

沪公网安备 31010802001379号