dataguard主库丢失archivelog，如何不重建备库？

12 月 2, 2012

—

由

昨天群中有网友问到dataguard环境中，由于主库archivelog丢失，且尚未同步到standby，问如何在避免
重建standby的情况下来将standby恢复成功的。下面是我的测试过程，供参考！

—-主库

SQL> select database_role from v$database;

DATABASE_ROLE
----------------
PRIMARY

SQL> archive log list;
Database log mode              Archive Mode
Automatic archival             Enabled
Archive destination            /arch
Oldest online log sequence     382
Next log sequence to archive   383
Current log sequence           383
SQL> select count(1) from roger.test;

  COUNT(1)
----------
        30

—-备库

SQL> select database_role from v$database;

DATABASE_ROLE
----------------
PHYSICAL STANDBY

SQL> archive log list;
Database log mode              Archive Mode
Automatic archival             Enabled
Archive destination            /arch2
Oldest online log sequence     382
Next log sequence to archive   0
Current log sequence           383
SQL>
SQL> alter database recover managed standby database cancel;

Database altered.

SQL> alter database open read only;

Database altered.

SQL> select count(1) from roger.test;

  COUNT(1)
----------
        30

模拟主库丢失归档的情况：

—主库

SQL> select * from v$Log;

    GROUP#    THREAD#  SEQUENCE#      BYTES    MEMBERS ARC STATUS           FIRST_CHANGE# FIRST_TIME
---------- ---------- ---------- ---------- ---------- --- ---------------- ------------- ------------
         1          1        383   10485760          1 NO  CURRENT                 740638 10-OCT-12
         2          1        382   20971520          1 YES INACTIVE                740633 10-OCT-12

SQL> archive log list;
Database log mode              Archive Mode
Automatic archival             Enabled
Archive destination            /arch
Oldest online log sequence     382
Next log sequence to archive   383
Current log sequence           383
SQL> alter system set log_archive_dest_state_2 = 'defer';

System altered.

SQL> delete from roger.test where rownum < 11;

10 rows deleted.

SQL> commit;

Commit complete.

SQL> select count(*) from roger.test;

  COUNT(*)
----------
        20

SQL> alter system switch logfile;

System altered.

SQL> /

System altered.

SQL> /

System altered.

SQL> !
[oracle@primarydb ~]$ cd /arch
[oracle@primarydb arch]$ ls -ltr|tail -10
-rw-r-----  1 oracle dba  3456000 Oct 10 15:18 1_374_726529113.dbf
-rw-r-----  1 oracle dba   593408 Oct 10 15:23 1_375_726529113.dbf
-rw-r-----  1 oracle dba     4608 Oct 10 15:23 1_376_726529113.dbf
-rw-r-----  1 oracle dba     1024 Oct 10 15:23 1_377_726529113.dbf
-rw-r-----  1 oracle dba    68096 Oct 10 15:25 1_378_726529113.dbf
-rw-r-----  1 oracle dba  1297408 Oct 10 15:34 1_379_726529113.dbf
-rw-r-----  1 oracle dba     2048 Oct 10 18:35 1_382_726529113.dbf
-rw-r-----  1 oracle dba   166400 Oct 10 18:44 1_383_726529113.dbf
-rw-r-----  1 oracle dba     2560 Oct 10 18:44 1_384_726529113.dbf
-rw-r-----  1 oracle dba     1024 Oct 10 18:44 1_385_726529113.dbf
[oracle@primarydb arch]$ rm 1_383_726529113.dbf  ---删除archivelog
[oracle@primarydb arch]$ rm 1_384_726529113.dbf
[oracle@primarydb arch]$ exit
exit

SQL> alter system set log_archive_dest_state_2 = 'enable';

System altered.

SQL>

—备库

SQL> alter database recover managed standby database disconnect from session;

Database altered.

SQL>

此时alert log信息如下：
Tue Nov 13 14:45:26 2012
MRP0: Background Managed Standby Recovery process started (test)
Managed Standby Recovery not using Real Time Apply
 parallel recovery started with 2 processes
Tue Nov 13 14:45:31 2012
Waiting for all non-current ORLs to be archived...
Media Recovery Waiting for thread 1 sequence 383
Tue Nov 13 14:45:32 2012
Completed: alter database recover managed standby database disconnect from session
Tue Nov 13 14:45:59 2012
Redo Shipping Client Connected as PUBLIC
-- Connected User is Valid
RFS[2]: Assigned to RFS process 9542
RFS[2]: Identified database type as 'physical standby'
RFS[2]: Archived Log: '/arch2/1_385_726529113.dbf'
Tue Nov 13 14:46:02 2012
Fetching gap sequence in thread 1, gap sequence 383-384
Tue Nov 13 14:47:02 2012
FAL[client]: Failed to request gap sequence
 GAP - thread 1 sequence 383-384
 DBID 2024668720 branch 726529113
FAL[client]: All defined FAL servers have been attempted.
-------------------------------------------------------------
Check that the CONTROL_FILE_RECORD_KEEP_TIME initialization
parameter is defined to a value that is sufficiently large
enough to maintain adequate log switch information to resolve
archivelog gaps.

如何在重建standby的情况下搞好备库呢？mos上,其实也有文章进行描述的,就是利用rman进行增量scn的恢复,下面我来进行展示：

1)首先定位到scn

SQL> select SEQUENCE#,FIRST_CHANGE#,NEXT_CHANGE# from v$archived_log where SEQUENCE# > 382 order by 1;

 SEQUENCE# FIRST_CHANGE# NEXT_CHANGE#
---------- ------------- ------------
       383        740638       740911
       384        740911       740915
       385        740915       740917
       385        740915       740917

2)根据scn，进行rman增量备份

[oracle@primarydb ~]$ cd $ORACLE_HOME/bin
[oracle@primarydb bin]$ ./rman target /

Recovery Manager: Release 10.2.0.4.0 - Production on Wed Oct 10 18:51:57 2012

Copyright (c) 1982, 2007, Oracle.  All rights reserved.

connected to target database: TEST (DBID=2024668720)

RMAN> backup device type disk incremental from scn 740638 database format '/tmp/test_db_incre.bbk';

Starting backup at 10-OCT-12
using target database control file instead of recovery catalog
allocated channel: ORA_DISK_1
channel ORA_DISK_1: sid=147 devtype=DISK
channel ORA_DISK_1: starting full datafile backupset
channel ORA_DISK_1: specifying datafile(s) in backupset
input datafile fno=00004 name=/oracle/product/oradata/test/test01.dbf
input datafile fno=00007 name=/oracle/product/oradata/test/sysaux01.dbf
input datafile fno=00009 name=/oracle/product/oradata/test/test03.dbf
input datafile fno=00006 name=/oracle/product/oradata/test/undo02.dbf
input datafile fno=00001 name=/oracle/product/oradata/test/system01.dbf
input datafile fno=00005 name=/oracle/product/oradata/test/perfstat.dbf
input datafile fno=00003 name=/oracle/product/oradata/test/rman.dbf
input datafile fno=00002 name=/oracle/product/oradata/test/undo01.dbf
input datafile fno=00008 name=/oracle/product/oradata/test/test02.dbf
input datafile fno=00010 name=/oracle/product/oradata/test/test04.dbf
input datafile fno=00011 name=/oracle/product/oradata/test/test05.dbf
channel ORA_DISK_1: starting piece 1 at 10-OCT-12

channel ORA_DISK_1: finished piece 1 at 10-OCT-12
piece handle=/tmp/test_db_incre.bbk tag=TAG20121010T185204 comment=NONE
channel ORA_DISK_1: backup set complete, elapsed time: 00:03:26
Finished backup at 10-OCT-12

RMAN>

3) 拷贝增量备份到standby

[oracle@primarydb bin]$ cd /tmp
[oracle@primarydb tmp]$ scp test_db_incre.bbk 192.168.3.176:/tmp/backup
The authenticity of host '192.168.3.176 (192.168.3.176)' can't be established.
RSA key fingerprint is a4:54:6b:bf:12:34:42:73:f5:ba:5f:38:c7:28:9c:b5.
Are you sure you want to continue connecting (yes/no)? yes
Warning: Permanently added '192.168.3.176' (RSA) to the list of known hosts.
oracle@192.168.3.176's password:
test_db_incre.bbk                             100%  736KB 736.0KB/s   00:00
[oracle@primarydb tmp]$

4) standby进行recover

[oracle@standbydb bin]$ sqlplus "/as sysdba"

SQL*Plus: Release 10.2.0.4.0 - Production on Tue Nov 13 15:03:46 2012

Copyright (c) 1982, 2007, Oracle.  All Rights Reserved.


Connected to:
Oracle Database 10g Enterprise Edition Release 10.2.0.4.0 - Production
With the Partitioning, OLAP, Data Mining and Real Application Testing options

SQL> alter database recover managed standby database cancel;

Database altered.

SQL> exit
Disconnected from Oracle Database 10g Enterprise Edition Release 10.2.0.4.0 - Production
With the Partitioning, OLAP, Data Mining and Real Application Testing options
[oracle@standbydb ~]$ cd $ORACLE_HOME/bin
[oracle@standbydb bin]$ ./rman target /

Recovery Manager: Release 10.2.0.4.0 - Production on Tue Nov 13 14:58:25 2012

Copyright (c) 1982, 2007, Oracle.  All rights reserved.

connected to target database: TEST (DBID=2024668720, not open)

RMAN> catalog backuppiece '/tmp/backup/test_db_incre.bbk';

using target database control file instead of recovery catalog
cataloged backuppiece
backup piece handle=/tmp/backup/test_db_incre.bbk recid=49 stamp=799253957

RMAN>  recover database noredo;

Starting recover at 13-NOV-12
using target database control file instead of recovery catalog
allocated channel: ORA_DISK_1
channel ORA_DISK_1: sid=97 devtype=DISK
channel ORA_DISK_1: starting incremental datafile backupset restore
channel ORA_DISK_1: specifying datafile(s) to restore from backup set
destination for restore of datafile 00001: /oracle/product/oradata/test/system01.dbf
destination for restore of datafile 00002: /oracle/product/oradata/test/undo01.dbf
destination for restore of datafile 00003: /oracle/product/oradata/test/rman.dbf
destination for restore of datafile 00004: /oracle/product/oradata/test/test01.dbf
destination for restore of datafile 00005: /oracle/product/oradata/test/perfstat.dbf
destination for restore of datafile 00006: /oracle/product/oradata/test/undo02.dbf
destination for restore of datafile 00007: /oracle/product/oradata/test/sysaux01.dbf
destination for restore of datafile 00008: /oracle/product/oradata/test/test02.dbf
destination for restore of datafile 00009: /oracle/product/oradata/test/test03.dbf
destination for restore of datafile 00010: /oracle/product/oradata/test/test04.dbf
destination for restore of datafile 00011: /oracle/product/oradata/test/test05.dbf
channel ORA_DISK_1: reading from backup piece /tmp/backup/test_db_incre.bbk
channel ORA_DISK_1: restored backup piece 1
piece handle=/tmp/backup/test_db_incre.bbk tag=TAG20121010T185204
channel ORA_DISK_1: restore complete, elapsed time: 00:00:01
Finished recover at 13-NOV-12

RMAN>

如下信息是recover时，alert log的记录信息：
Tue Nov 13 15:04:05 2012
Completed: alter database recover managed standby database cancel
Tue Nov 13 15:04:55 2012
Incremental restore complete of datafile 2 /oracle/product/oradata/test/undo01.dbf
  checkpoint is 741134
Incremental restore complete of datafile 8 /oracle/product/oradata/test/test02.dbf
  checkpoint is 741134
Incremental restore complete of datafile 10 /oracle/product/oradata/test/test04.dbf
  checkpoint is 741134
Incremental restore complete of datafile 11 /oracle/product/oradata/test/test05.dbf
  checkpoint is 741134
Incremental restore complete of datafile 3 /oracle/product/oradata/test/rman.dbf
  checkpoint is 741134
Incremental restore complete of datafile 5 /oracle/product/oradata/test/perfstat.dbf
  checkpoint is 741134
Incremental restore complete of datafile 1 /oracle/product/oradata/test/system01.dbf
  checkpoint is 741134
Incremental restore complete of datafile 6 /oracle/product/oradata/test/undo02.dbf
  checkpoint is 741134
Incremental restore complete of datafile 4 /oracle/product/oradata/test/test01.dbf
  checkpoint is 741134
Incremental restore complete of datafile 7 /oracle/product/oradata/test/sysaux01.dbf
  checkpoint is 741134
Incremental restore complete of datafile 9 /oracle/product/oradata/test/test03.dbf
  checkpoint is 741134

5) 开启standby同步，检查是否ok。

SQL> alter database recover managed standby database disconnect from session;

Database altered.

SQL> select database_role,open_mode from v$database;

DATABASE_ROLE    OPEN_MODE
---------------- ----------
PHYSICAL STANDBY MOUNTED

--主库切换日志
SQL> select database_role,open_mode from v$database;

DATABASE_ROLE    OPEN_MODE
---------------- ----------
PRIMARY          READ WRITE

SQL>
SQL> alter system switch logfile;

System altered.
SQL> select max(al.sequence#) "Last Seq Recieved",
  2         max(lh.sequence#) "Last Seq Applied"
  3  from v$archived_log al, v$log_history lh;

Last Seq Recieved Last Seq Applied
----------------- ----------------
              386              386

SQL>

--备库

SQL> select max(al.sequence#) "Last Seq Recieved",
  2         max(lh.sequence#) "Last Seq Applied"
  3  from v$archived_log al, v$log_history lh;

Last Seq Recieved Last Seq Applied
----------------- ----------------
              386              382

SQL>
SQL> select * from v$archive_gap;

   THREAD# LOW_SEQUENCE# HIGH_SEQUENCE#
---------- ------------- --------------
         1           383            384

备库archivelog：
[oracle@standbydb arch2]$ ls -ltr |tail -5
-rw-r-----  1 oracle dba     2048 Nov 13 14:33 1_381_726529113.dbf
-rw-r-----  1 oracle dba     2048 Nov 13 14:34 1_382_726529113.dbf
-rw-r-----  1 oracle dba     1024 Nov 13 14:45 1_385_726529113.dbf
-rw-r-----  1 oracle dba  1262080 Nov 13 15:06 1_386_726529113.dbf

备库alert log此时的信息：
Tue Nov 13 15:06:17 2012
alter database recover managed standby database disconnect from session
Tue Nov 13 15:06:17 2012
Attempt to start background Managed Standby Recovery process (test)
MRP0 started with pid=19, OS id=13497
Tue Nov 13 15:06:17 2012
MRP0: Background Managed Standby Recovery process started (test)
Managed Standby Recovery not using Real Time Apply
 parallel recovery started with 2 processes
Tue Nov 13 15:06:22 2012
Waiting for all non-current ORLs to be archived...
Media Recovery Waiting for thread 1 sequence 383
Fetching gap sequence in thread 1, gap sequence 383-384
Tue Nov 13 15:06:23 2012
Completed: alter database recover managed standby database disconnect from session
Tue Nov 13 15:06:54 2012
Redo Shipping Client Connected as PUBLIC
-- Connected User is Valid
RFS[3]: Assigned to RFS process 13596
RFS[3]: Identified database type as 'physical standby'
Primary database is in MAXIMUM PERFORMANCE mode
Primary database is in MAXIMUM PERFORMANCE mode
RFS[3]: Successfully opened standby log 4: '/oracle/product/oradata/test/redo04.log'
Tue Nov 13 15:06:54 2012
RFS[2]: Successfully opened standby log 3: '/oracle/product/oradata/test/redo03.log'
Tue Nov 13 15:07:23 2012
FAL[client]: Failed to request gap sequence
 GAP - thread 1 sequence 383-384
 DBID 2024668720 branch 726529113
FAL[client]: All defined FAL servers have been attempted.
-------------------------------------------------------------
Check that the CONTROL_FILE_RECORD_KEEP_TIME initialization
parameter is defined to a value that is sufficiently large
enough to maintain adequate log switch information to resolve
archivelog gaps.

我们可以看到，虽然备库，仍然在提示383,384是gap ，但是实际上已经是同步的了。

6)最后来验证下数据

SQL> alter database recover managed standby database cancel;

Database altered.

SQL> select file#,checkpoint_change# from v$datafile order by 1;

     FILE# CHECKPOINT_CHANGE#
---------- ------------------
         1             740638
         2             740638
         3             740638
         4             740638
         5             740638
         6             740638
         7             740638
         8             740638
         9             740638
        10             740638
        11             740638

11 rows selected.

SQL> select file#,checkpoint_change# from v$datafile_header order by 1;

     FILE# CHECKPOINT_CHANGE#
---------- ------------------
         1             741134
         2             741134
         3             741134
         4             741134
         5             741134
         6             741134
         7             741134
         8             741134
         9             741134
        10             741134
        11             741134

11 rows selected.

SQL> alter database open read only;
alter database open read only
*
ERROR at line 1:
ORA-16004: backup database requires recovery
ORA-01152: file 1 was not restored from a sufficiently old backup
ORA-01110: data file 1: '/oracle/product/oradata/test/system01.dbf'

重建备库的standby controlfile：
SQL> alter database create standby controlfile as '/tmp/standby.ctl';

Database altered.

SQL> exit
Disconnected from Oracle Database 10g Enterprise Edition Release 10.2.0.4.0 - Production
With the Partitioning, OLAP, Data Mining and Real Application Testing options
[oracle@primarydb tmp]$ scp standby.ctl 192.168.3.176:/tmp/backup
oracle@192.168.3.176's password:
standby.ctl                                                 100% 3928KB   3.8MB/s   00:00
[oracle@primarydb tmp]$

[oracle@standbydb bin]$ ./rman target /

Recovery Manager: Release 10.2.0.4.0 - Production on Tue Nov 13 15:19:09 2012

Copyright (c) 1982, 2007, Oracle.  All rights reserved.

connected to target database (not started)

RMAN> startup nomount

Oracle instance started

Total System Global Area     264241152 bytes

Fixed Size                     1266944 bytes
Variable Size                209718016 bytes
Database Buffers              50331648 bytes
Redo Buffers                   2924544 bytes

RMAN> restore controlfile from '/tmp/backup/standby.ctl';

Starting restore at 13-NOV-12
using target database control file instead of recovery catalog
allocated channel: ORA_DISK_1
channel ORA_DISK_1: sid=101 devtype=DISK

channel ORA_DISK_1: copied control file copy
output filename=/oracle/product/oradata/test/control01.ctl
Finished restore at 13-NOV-12

RMAN> startup mount

database is already started
database mounted
released channel: ORA_DISK_1

RMAN> exit


Recovery Manager complete.
[oracle@standbydb bin]$ sqlplus "/as sysdba"

SQL*Plus: Release 10.2.0.4.0 - Production on Tue Nov 13 15:20:02 2012

Copyright (c) 1982, 2007, Oracle.  All Rights Reserved.


Connected to:
Oracle Database 10g Enterprise Edition Release 10.2.0.4.0 - Production
With the Partitioning, OLAP, Data Mining and Real Application Testing options

SQL> alter database recover managed standby database disconnect from session;

Database altered.

SQL> alter database recover managed standby database cancel;

Database altered.

SQL>
SQL>
SQL> alter database open read only;

Database altered.

SQL> select count(1) from roger.test;

  COUNT(1)
----------
        20

SQL>

最后，我们可以看到，在主库archivelog丢失无法同步到备库时，可以利用增量scn的方式，来避免重建standby。

《 “dataguard主库丢失archivelog，如何不重建备库？” 》有 10 条评论

Dong_2

2012年12月2日

好东西啊，牛

回复
Kingtsi

2012年12月2日

好文，一年半前做过一次，当时就是一股倔劲整出来了，呵呵

回复
Lingdugudu

2012年12月3日

学习了。多谢群主。

回复
我想喝一杯

2012年12月3日

你如茶一杯，清淡滋味，溢满幽香，我想喝一杯；你如花一朵，含苞待放，清新魅力，我想走近你；你如水一湾，宁静温婉，幽静四周，我想游一游。…

回复
saup007

2013年1月21日

此文真心赞~！

回复
saup007

2013年1月22日

补充一下。如果primary、standby都使用OMF管理数据文件的话，创建standby control file时，要rename datafile、rename tempfile、rename redo 、create standby logfile。

回复
1. admin
  
  2013年1月22日
  
  感谢补充~~~~
  
  回复
2. Oracledba
  
  2013年1月22日
  
  感谢补充~~~~
  
  回复
鹏涛雷

2013年12月5日

controlfile也是要恢复滴 – -！

回复
dophin

2014年10月15日

中间给表空间新增了数据文件，也能用增量备份的方式修复从库吗，从库恢复后应用日志时报不能识别这个新增的数据文件；

回复