来自雪山之巅的Oracle恢复案例

某西藏客户去年6月份就联系过我们,需要进行数据恢复;然而中间由于种种原因而放弃;中间沟通了多次;今年又再次找到我们需要恢复。这次终于可以帮用户解决问题了。总的来讲是一个较为简单的恢复,由于写过太多类似的案例了,这里我就简单叙述一下,不做过多解释了;不太明白的朋友请看之前的blog文章。首先我们来看下alert log:

00:56:01.742 Sun Apr 12 00:54:33 2020
00:56:01.742 alter database open resetlogs
00:56:01.757 RESETLOGS is being done without consistancy checks. This may result
00:56:01.757 in a corrupted database. The database should be recreated.
00:56:01.768 RESETLOGS after incomplete recovery UNTIL CHANGE 3134302272
00:56:05.381 Online log /home/oracle/redo02.log: Thread 1 Group 2 was previously cleared
00:56:05.381 Online log /home/oracle/redo03.log: Thread 1 Group 3 was previously cleared
00:56:05.741 Sun Apr 12 00:54:37 2020
00:56:05.741 Setting recovery target incarnation to 2
00:56:05.820 Sun Apr 12 00:54:37 2020
00:56:05.820 Assigning activation ID 3134737258 (0xbad84b6a)
00:56:05.992 Thread 1 opened at log sequence 1
00:56:05.992   Current log# 1 seq# 1 mem# 0: /home/oracle/redo01.log
00:56:05.992 Successful open of redo thread 1
00:56:05.993 MTTR advisory is disabled because FAST_START_MTTR_TARGET is not set
00:56:06.001 Sun Apr 12 00:54:37 2020
00:56:06.001 SMON: enabling cache recovery
00:56:06.067 Errors in file /home/oracle/diag/rdbms/kanms/kanms/trace/kanms_ora_16196.trc:
00:56:06.067 ORA-00704: bootstrap process failure
00:56:06.067 ORA-00704: bootstrap process failure
00:56:06.068 ORA-00604: error occurred at recursive SQL level 1
00:56:06.068 ORA-01173: data dictionary indicates missing data file from system tablespace
00:56:06.068 Errors in file /home/oracle/diag/rdbms/kanms/kanms/trace/kanms_ora_16196.trc:
00:56:06.068 ORA-00704: bootstrap process failure
00:56:06.068 ORA-00704: bootstrap process failure
00:56:06.068 ORA-00604: error occurred at recursive SQL level 1
00:56:06.068 ORA-01173: data dictionary indicates missing data file from system tablespace
00:56:06.068 Error 704 happened during db open, shutting down database
00:56:06.069 USER (ospid: 16196): terminating the instance due to error 704
00:56:07.288 Instance terminated by USER, pid = 16196
00:56:07.288 ORA-1092 signalled during: alter database open resetlogs...
00:56:07.289 opiodr aborting process unknown ospid (16196) as a result of ORA-1092
00:56:08.669 Sun Apr 12 00:54:40 2020
00:56:08.670 ORA-1092 : opitsk aborting process
00:59:02.059 Sun Apr 12 00:57:33 2020

上面的错误都非常常见;首先我们先使用常规恢复方式;添加undo相关参数(_corrupted_rollback_segments、undo_managment)来屏蔽错误;试图打开数据库,但是没成功:

01:00:58.315 MTTR advisory is disabled because FAST_START_MTTR_TARGET is not set
01:00:58.330 Sun Apr 12 00:59:29 2020
01:00:58.335 SMON: enabling cache recovery
01:00:58.452 ORA-01555 caused by SQL statement below (SQL ID: 4krwuz0ctqxdt, SCN: 0x0000.bad1a843):
01:00:58.453 select ctime, mtime, stime from obj$ where obj# = :1
01:00:58.470 Errors in file /home/oracle/diag/rdbms/kanms/kanms/trace/kanms_ora_16595.trc:
01:00:58.470 ORA-00704: bootstrap process failure
01:00:58.470 ORA-00704: bootstrap process failure
01:00:58.470 ORA-00604: error occurred at recursive SQL level 1
01:00:58.471 ORA-01555: snapshot too old: rollback segment number 9 with name "_SYSSMU9_2996678720$" too small
01:00:58.471 Errors in file /home/oracle/diag/rdbms/kanms/kanms/trace/kanms_ora_16595.trc:
01:00:58.471 ORA-00704: bootstrap process failure
01:00:58.471 ORA-00704: bootstrap process failure
01:00:58.471 ORA-00604: error occurred at recursive SQL level 1
01:00:58.472 ORA-01555: snapshot too old: rollback segment number 9 with name "_SYSSMU9_2996678720$" too small
01:00:58.472 Error 704 happened during db open, shutting down database
01:00:58.473 USER (ospid: 16595): terminating the instance due to error 704
01:00:59.673 Instance terminated by USER, pid = 16595
01:00:59.674 ORA-1092 signalled during: alter database open resetlogs..

可以看到Oracle 在bootstrap时仍然失败了,说明部分事务可能是涉及到数据字典的操作;通过一般隐含参数难以绕过这个问题。由于之前是安排其他同事在处理;尝试多次没成功之后;直接通过ODU进行了数据抽取;但是最终处理index,package,trigger,function等元数据时候比较麻烦,需要去拼接脚本。由于system本身坏块并不是太多;因此通过非常规手段打开数据库,然后导出元数据应该是没问题的;如下是简单的过程:

 ++++重建cf cf.sql
 CREATE CONTROLFILE REUSE DATABASE "kanms" RESETLOGS  noA
      MAXLOGFILES 16
      MAXLOGMEMBERS 3
      MAXDATAFILES 100
      MAXINSTANCES 8
      MAXLOGHISTORY 292
      LOGFILE
    GROUP 1 '/home/oracle/redo01.log'  SIZE 50M,
    GROUP 2 '/home/oracle/redo02.log'  SIZE 50M,
    GROUP 3 '/home/oracle/redo03.log'  SIZE 50M
  -- STANDBY LOGFILE
  DATAFILE
    '/home/oracle/oradata/kanms/system01.dbf.bak',
    '/home/oracle/oradata/kanms/sysaux01.dbf',
    '/home/oracle/oradata/kanms/undotbs01.dbf'
     CHARACTER SET ZHS16GBK
  ;

+++修改checkpoint
 BBED> set filename '/home/oracle/oradata/kanms/sysaux01.dbf'
         FILENAME        /home/oracle/oradata/kanms/sysaux01.dbf

 BBED> p kcvfhrls
 struct kcvfhrls, 8 bytes                    @116
    ub4 kscnbas                              @116      0xbad1a849
    ub2 kscnwrp                              @120      0x0000

 BBED> d offset 116 count 16
  File: /home/oracle/oradata/kanms/sysaux01.dbf (0)
  Block: 1                Offsets:  116 to  131           Dba:0x00000000
 ------------------------------------------------------------------------
  49a8d1ba 00000000 00000000 00000000

  <32 bytes per line>

 BBED> d offset 112 count 16
  File: /home/oracle/oradata/kanms/sysaux01.dbf (0)
  Block: 1                Offsets:  112 to  127           Dba:0x00000000
 ------------------------------------------------------------------------
  c7ebd63d 49a8d1ba 00000000 00000000

  <32 bytes per line>

 BBED> set filename '/home/oracle/oradata/kanms/system01.dbf.bak'
         FILENAME        /home/oracle/oradata/kanms/system01.dbf.bak

 BBED> set mode edit
         MODE            Edit

 BBED> d offset 116 count 16
  File: /home/oracle/oradata/kanms/system01.dbf.bak (0)
  Block: 1                Offsets:  116 to  131           Dba:0x00000000
 ------------------------------------------------------------------------
  41a8d1ba 00000000 00000000 00000000

  <32 bytes per line>

 BBED> m /x 49a8d1ba offset 116
  File: /home/oracle/oradata/kanms/system01.dbf.bak (0)
  Block: 1                Offsets:  116 to  131           Dba:0x00000000
 ------------------------------------------------------------------------
  49a8d1ba 00000000 00000000 00000000

  <32 bytes per line>

 BBED> m /x c7ebd63d offset 112
 BBED-00209: invalid number (c7ebd63d)


 BBED>  m /x c7eb offset 112
  File: /home/oracle/oradata/kanms/system01.dbf.bak (0)
  Block: 1                Offsets:  112 to  127           Dba:0x00000000
 ------------------------------------------------------------------------
  c7ebd73d 49a8d1ba 00000000 00000000

  <32 bytes per line>

 BBED>  m /x d63d offset 114
  File: /home/oracle/oradata/kanms/system01.dbf.bak (0)
  Block: 1                Offsets:  114 to  129           Dba:0x00000000
 ------------------------------------------------------------------------
  d63d49a8 d1ba0000 00000000 00000000

  <32 bytes per line>

 BBED> sum apply
 Check value for File 0, Block 1:
 current = 0xf685, required = 0xf685
 BBED> exit


SQL> startup nomount force pfile='/tmp/init.ora'
ORACLE instance started.

Total System Global Area 1.3429E+10 bytes
Fixed Size                  2217992 bytes
Variable Size            6845106168 bytes
Database Buffers         6509559808 bytes
Redo Buffers               71770112 bytes
SQL> @cf

Control file created.

ORA-00279: change 3134302276 generated at 04/12/2020 08:54:23 needed for thread
1
ORA-00289: suggestion :
/home/oracle/product/11.2.0/db_1/dbs/arch1_1_1037495239.dbf
ORA-00280: change 3134302276 for thread 1 is in sequence #1


Specify log: {<RET>=suggested | filename | AUTO | CANCEL}
cancel
Media recovery cancelled.
SQL> alter database open resetlogs;
alter database open resetlogs
*
ERROR at line 1:
ORA-00600: internal error code, arguments: [kcvorl_2], [0], [3134302277], [0],
[3134302281], [], [], [], [], [], [], []

SQL> select file#,checkpoint_change# from v$datafile_header;

     FILE# CHECKPOINT_CHANGE#
---------- ------------------
         1         3134302276     --修改system文件的checkpoint scn
         2         3134302284
         3         3134302281

BBED> d offset 484
 File: /home/oracle/oradata/kanms/system01.dbf.bak (0)
 Block: 1                Offsets:  484 to  995           Dba:0x00000000
------------------------------------------------------------------------
 44a8d1ba 00000000 3f59d73d 01000000 01000000 02000000 10000000 02000000
 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000


 <32 bytes per line>


BBED> set mode edit
        MODE            Edit

BBED> m /x 49 offset 484
Warning: contents of previous BIFILE will be lost. Proceed? (Y/N) y
 File: /home/oracle/oradata/kanms/system01.dbf.bak (0)
 Block: 1                Offsets:  484 to  995           Dba:0x00000000
------------------------------------------------------------------------
 49a8d1ba 00000000 3f59d73d 01000000 01000000 02000000 10000000 02000000
 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000
 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000

BBED> sum apply
Check value for File 0, Block 1:
current = 0xf2a0, required = 0xf2a0

BBED> exit

SQL> select file#,checkpoint_change#,resetlogs_change# from v$datafile_header;

     FILE# CHECKPOINT_CHANGE# RESETLOGS_CHANGE#
---------- ------------------ -----------------
         1         3134302281        3134302281
         2         3134302284        3134302281
         3         3134302281        3134302281


+++尝试打开数据库
 SQL> startup nomount pfile='/tmp/init.ora';
 ORACLE instance started.

 Total System Global Area 1.3429E+10 bytes
 Fixed Size                  2217992 bytes
 Variable Size            6845106168 bytes
 Database Buffers         6509559808 bytes
 Redo Buffers               71770112 bytes
 SQL> @cf

 Control file created.

 SQL> oradebug setmypid;
 Statement processed.
 SQL> oradebug dumpvar sga kcsgscn_
 kcslf kcsgscn_ [060017F58, 060017F88) = 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000 60017C38 00000000
 SQL> oradebug poke 0x060017F58 8 3134400000
 BEFORE: [060017F58, 060017F60) = 00000000 00000000
 AFTER:  [060017F58, 060017F60) = BAD32600 00000000
 SQL> recover database using backup controlfile;
 ORA-00279: change 3134302285 generated at 04/12/2020 09:16:36 needed for thread
 1
 ORA-00289: suggestion :
 /home/oracle/product/11.2.0/db_1/dbs/arch1_1_1037524592.dbf
 ORA-00280: change 3134302285 for thread 1 is in sequence #1


 Specify log: {<RET>=suggested | filename | AUTO | CANCEL}
 cancel
 Media recovery cancelled.
 SQL> alter database open resetlogs;
 Database altered.

顺利打开数据库后,直接exp相关元数据即可。

export NLS_LANG=AMERICAN_AMERICA.ZHS16GBK
exp \'/ as sysdba\' owner=ISHARE,GESS,ITHINK_ALARM,UAPWEB,NRES,MOP_INM,TOPO,ELEAP,FLOW,STD_INM,STG_INM,SECU,ITHINK_RULE,KANMS_SHEET,KANMS_APP,STD_INM_RES,APPMESSAGE buffer=102400000 file=meta.dmp log=meta.log rows=n

由此完成了整个恢复。实际上大家可以看到,这个case是比较easy的,其实我们完全不需要借助数据抽取工具。比较简单,给大家随便分享一下吧。

 

这里我要简单说一下,这个case并非是我来完成,由团队人员李翔宇、许玉晨、高达完成!很高兴能看到大家的进步。


评论

发表回复

您的电子邮箱地址不会被公开。 必填项已用 * 标注