Recreate failovered primary database using Flashback Database

很多朋友一直对DataGuard的fast-start failover持怀疑的态度;通过observer的观察,(Fast-Start Failover)FSFO提供了一种在primary数据库不可用情况下自动故障切换到standby数据库的能力。造成很多朋友不愿意使用FSFO的原因之一是故障切换后不得不重建原primary数据库,不过如果我们能配合使用10g中的闪回数据库特性的话,这一重建数据库的工作便也可以一劳永逸了。

/* 启用fast_start failover的前提之一是在primary和standby库上都启用flashback database */

SQL> select * from v$version;

BANNER
--------------------------------------------------------------------------------
Oracle Database 11g Enterprise Edition Release 11.2.0.2.0 - 64bit Production
PL/SQL Release 11.2.0.2.0 - Production
CORE    11.2.0.2.0      Production
TNS for Linux: Version 11.2.0.2.0 - Production
NLSRTL Version 11.2.0.2.0 - Production

/* 当前的primary库情况 */

SQL> select open_mode ,database_role,flashback_on from v$database;
OPEN_MODE            DATABASE_ROLE    FLASHBACK_ON
-------------------- ---------------- ------------------
READ WRITE           PRIMARY          YES


SQL> alter session set nls_date_format='YYYY-MM-DD hh24:mi:ss';

Session altered.

SQL> select OLDEST_FLASHBACK_TIME  from v$flashback_database_log;

OLDEST_FLASHBACK_TI
-------------------
2011-02-19 22:40:39

/* 当前的standby库情况 */
SQL> select open_mode ,database_role,flashback_on from v$database;

OPEN_MODE            DATABASE_ROLE    FLASHBACK_ON
-------------------- ---------------- ------------------
READ ONLY WITH APPLY PHYSICAL STANDBY YES

SQL> alter session set nls_date_format='YYYY-MM-DD hh24:mi:ss';

Session altered.
SQL> select OLDEST_FLASHBACK_TIME  from v$flashback_database_log;

OLDEST_FLASHBACK_TI
-------------------
2011-02-19 22:34:56


DGMGRL> edit configuration set property faststartfailoverthreshold=10;
Property "faststartfailoverthreshold" updated

DGMGRL> show configuration verbose;

Configuration - pro

  Protection Mode: MaxAvailability
  Databases:
    sbdb - Primary database
    prod - Physical standby database

Fast-Start Failover: DISABLED

Configuration Status:
SUCCESS

/* 另开一个终端启动observer */

DGMGRL> start observer
Observer started

DGMGRL> show fast_start failover

Fast-Start Failover: DISABLED

  Threshold:        10 seconds
  Target:           (none)
  Observer:         rh3.oracle.com
  Lag Limit:        30 seconds
  Shutdown Primary: TRUE
  Auto-reinstate:   TRUE

Configurable Failover Conditions
  Health Conditions:
    Corrupted Controlfile          YES
    Corrupted Dictionary           YES
    Inaccessible Logfile            NO
    Stuck Archiver                  NO
    Datafile Offline               YES

  Oracle Error Conditions:
    (none)


/* 启动快速故障切换  */

DGMGRL> enable fast_start failover
Enabled.

/* 记录实际切换前的时间 */

[maclean@rh3 ~]$ date
Sat Feb 19 23:35:44 CST 2011

[maclean@rh2 ~]$ ps -ef|grep pmon|grep -v grep
maclean  25165     1  0 22:58 ?        00:00:00 ora_pmon_SBDB
maclean  26080     1  0 23:04 ?        00:00:00 ora_pmon_PROD

[maclean@rh2 ~]$ kill -9 25165

/* 通过kill pmon进程造成主库crash,引发fast-start failover */

/* observer观察到primary库意外终止后,实施了快速故障切换 */

23:37:27.92  Saturday, February 19, 2011
Initiating Fast-Start Failover to database "prod"...
Performing failover NOW, please wait...
Failover succeeded, new primary is "prod"
23:37:33.69  Saturday, February 19, 2011
23:39:17.98  Saturday, February 19, 2011
Initiating reinstatement for database "sbdb"...
Reinstating database "sbdb", please wait...
Error: ORA-16653: failed to reinstate database

Failed.
Reinstatement of database "sbdb" failed
23:39:26.30  Saturday, February 19, 2011

23:40:05.00  Saturday, February 19, 2011
Initiating reinstatement for database "sbdb"...
Reinstating database "sbdb", please wait...
Error: ORA-16653: failed to reinstate database

Failed.
Reinstatement of database "sbdb" failed
23:40:09.24  Saturday, February 19, 2011

/* 接下来我们通过闪回数据库来将失败切换后的primary库flashback到faliover之前以便重用*/

SQL> startup mount;

SQL> flashback database to timestamp to_timestamp('2011-02-19 23:30:44','YYYY-MM-DD hh24:mi:ss');
flashback database to timestamp to_timestamp('2011-02-19 23:35:44','YYYY-MM-DD hh24:mi:ss')
*
ERROR at line 1:
ORA-38754: FLASHBACK DATABASE not started; required redo log is not available
ORA-38762: redo logs needed for SCN 1143987 to SCN End-of-Redo
ORA-38761: redo log sequence 10 in thread 1, incarnation 3 could not be
accessed

/* 缺少最近的归档日志而无法完成闪回,需要到目前的primary库中查找 */

SQL> select name from v$archived_log where FIRST_CHANGE#<1143987 and NEXT_CHANGE#>1143987;

NAME
--------------------------------------------------------------------------------
/s01/fast_recovery_area/PROD/archivelog/2011_02_19/o1_mf_1_10_6ozpzh9c_.arc

SQL> alter database register physical logfile '/s01/fast_recovery_area/PROD/archivelog/2011_02_19/o1_mf_1_10_6ozpzh9c_.arc';
Database altered.


SQL> flashback database to timestamp to_timestamp('2011-02-19 23:30:44','YYYY-MM-DD hh24:mi:ss');
Flashback complete.

/* 成功闪回后,observer将主动去尝试reinstate这个目前可用的standby库,如以下日志*/

23:46:20.16  Saturday, February 19, 2011
Initiating reinstatement for database "sbdb"...
Reinstating database "sbdb", please wait...
Operation requires shutdown of instance "SBDB" on database "sbdb"
Shutting down instance "SBDB"...
ORA-01109: database not open

Database dismounted.
ORACLE instance shut down.
Operation requires startup of instance "SBDB" on database "sbdb"
Starting instance "SBDB"...
ORACLE instance started.
Database mounted.
Continuing to reinstate database "sbdb" ...
Reinstatement of database "sbdb" succeeded
23:47:13.49  Saturday, February 19, 2011

DGMGRL> show configuration verbose

Configuration - pro

  Protection Mode: MaxAvailability
  Databases:
    prod - Primary database
    sbdb - (*) Physical standby database

  (*) Fast-Start Failover target

Fast-Start Failover: ENABLED

  Threshold:        10 seconds
  Target:           sbdb
  Observer:         rh3.oracle.com
  Lag Limit:        30 seconds (not in use)
  Shutdown Primary: TRUE
  Auto-reinstate:   TRUE

Configuration Status:
SUCCESS

/*可以通过show configuration命令监控到当前primary与standby库都处于可用状态,
   且fast_start failover也为启用状态 */

沪ICP备14014813号-2

沪公网安备 31010802001379号