很久没恢复过Oracle 9的数据库了

上周末一位同事反馈说某客户维保的一套数据库无法启动了;本来没有什么兴趣继续看了。但听说是Oracle 9的老库;很多年没玩这么老的版本了。远程支持一下吧;整个恢复非常简单,就不做过多描述了。先来看看当时的报错信息:

Instance terminated by DBW0, pid = 4232
Sun Mar 22 22:25:40 2020
Starting ORACLE instance (normal)
Starting up ORACLE RDBMS Version: 9.0.1.1.1.
System parameters with non-default values:
  processes                = 150
  timed_statistics         = TRUE
  shared_pool_size         = 46137344
  large_pool_size          = 1048576
  java_pool_size           = 33554432
  control_files            = D:\oracle\oradata\wlyfwdb\CONTROL01.CTL, D:\oracle\oradata\wlyfwdb\CONTROL02.CTL, D:\oracle\oradata\wlyfwdb\CONTROL03.CTL
  db_block_size            = 4096
  db_cache_size            = 33554432
  compatible               = 9.0.0
  fast_start_mttr_target   = 300
  undo_management          = AUTO
  undo_tablespace          = UNDOTBS
  remote_login_passwordfile= EXCLUSIVE
  db_domain                = push
  instance_name            = wlyfwdb
  dispatchers              = (PROTOCOL=TCP)(SER=MODOSE), (PROTOCOL=TCP)(PRE=oracle.aurora.server.GiopServer), (PROTOCOL=TCP)(PRE=oracle.aurora.server.SGiopServer)
  background_dump_dest     = D:\oracle\admin\wlyfwdb\bdump
  user_dump_dest           = D:\oracle\admin\wlyfwdb\udump
  core_dump_dest           = D:\oracle\admin\wlyfwdb\cdump
  sort_area_size           = 524288
  db_name                  = wlyfwdb
  open_cursors             = 300
PMON started with pid=2
DBW0 started with pid=3
LGWR started with pid=4
CKPT started with pid=5
SMON started with pid=6
RECO started with pid=7
Sun Mar 22 22:25:42 2020
starting up 1 shared server(s) ...
starting up 1 dispatcher(s) for network address '(ADDRESS=(PARTIAL=YES)(PROTOCOL=TCP))'...
starting up 1 dispatcher(s) for network address '(ADDRESS=(PARTIAL=YES)(PROTOCOL=TCP))'...
starting up 1 dispatcher(s) for network address '(ADDRESS=(PARTIAL=YES)(PROTOCOL=TCP))'...
Sun Mar 22 22:25:44 2020
ALTER DATABASE   MOUNT
Sun Mar 22 22:25:49 2020
Successful mount of redo thread 1, with mount id 2927408233.
Sun Mar 22 22:25:49 2020
Database mounted in Exclusive Mode.
Completed: ALTER DATABASE   MOUNT
Sun Mar 22 22:41:52 2020
alter database recover datafile list
 clear

Sun Mar 22 22:41:52 2020
Completed: alter database recover datafile list
 clear
Sun Mar 22 22:41:52 2020
alter database recover datafile list
 1 , 2 , 3 , 4 , 5 , 6 , 7 , 8 , 9 , 10 , 11 , 12

Media Recovery Datafile: 1
Media Recovery Datafile: 2
Media Recovery Datafile: 3
Media Recovery Datafile: 4
Media Recovery Datafile: 5
Media Recovery Datafile: 6
Media Recovery Datafile: 7
Media Recovery Datafile: 8
Media Recovery Datafile: 9
Media Recovery Datafile: 10
Media Recovery Datafile: 11
Media Recovery Datafile: 12
Completed: alter database recover datafile list
 1 , 2 , 3 ,
Sun Mar 22 22:41:52 2020
alter database recover if needed
 start

Media Recovery Start
Successfully started datafile 1 media recovery
Datafile #1: 'D:\ORACLE\ORADATA\WLYFWDB\SYSTEM01.DBF'
Successfully started datafile 2 media recovery
Datafile #2: 'D:\ORACLE\ORADATA\WLYFWDB\UNDOTBS01.DBF'
Successfully started datafile 3 media recovery
Datafile #3: 'D:\ORACLE\ORADATA\WLYFWDB\CWMLITE01.DBF'
Successfully started datafile 4 media recovery
Datafile #4: 'D:\ORACLE\ORADATA\WLYFWDB\DRSYS01.DBF'
Successfully started datafile 5 media recovery
Datafile #5: 'D:\ORACLE\ORADATA\WLYFWDB\EXAMPLE01.DBF'
Successfully started datafile 6 media recovery
Datafile #6: 'D:\ORACLE\ORADATA\WLYFWDB\INDX01.DBF'
Successfully started datafile 7 media recovery
Datafile #7: 'D:\ORACLE\ORADATA\WLYFWDB\TOOLS01.DBF'
Successfully started datafile 8 media recovery
Datafile #8: 'D:\ORACLE\ORADATA\WLYFWDB\USERS01.DBF'
Successfully started datafile 9 media recovery
Datafile #9: 'D:\ORACLE\ORADATA\WLYFWDB\WLYWL.ORA'
Successfully started datafile 10 media recovery
Datafile #10: 'D:\ORACLE\ORADATA\WLYFWDB\WLYWL.01.ORA'
Successfully started datafile 11 media recovery
Datafile #11: 'D:\ORACLE\ORADATA\WLYFWDB\NWM.ORA'
Successfully started datafile 12 media recovery
Datafile #12: 'D:\ORACLE\ORADATA\WLYFWDB\INDEX_2D.ORA'
Media Recovery Log
Recovery of Online Redo Log: Thread 1 Group 2 Seq 10416 Reading mem 0
  Mem# 0 errs 0: D:\ORACLE\ORADATA\WLYFWDB\REDO02.LOG
Sun Mar 22 22:41:55 2020
Errors in file D:\oracle\admin\wlyfwdb\udump\ORA04352.TRC:
ORA-00600: ÄÚ²¿´íÎó´úÂ룬²ÎÊý: [2037], [8388610], [164], [8454146], [214], [0], [0], [0]

***
Corrupt block relative dba: 0x00928008 (file 2, block 1212424)
Bad header found during preparing block for write
Data in bad block -
 type: 110 format: 133 rdba: 0x7c820000
 last change scn: 0x000c.00000006 seq: 0x14 flg: 0x00
 consistency value in tail: 0x0092800e
 check value in block header: 0x18, block checksum disabled
 spare1: 0x23, spare2: 0x34, spare3: 0xc
***
Sun Mar 22 22:42:03 2020
Errors in file D:\oracle\admin\wlyfwdb\bdump\wlyfwdbDBW0.TRC:
ORA-00600: internal error code, arguments: [kcbzpb_1], [9601032], [4], [1], [], [], [], []

Sun Mar 22 22:42:08 2020
Errors in file D:\oracle\admin\wlyfwdb\bdump\wlyfwdbDBW0.TRC:
ORA-00600: internal error code, arguments: [kcbzpb_1], [9601032], [4], [1], [], [], [], []

从上述报错信息来看很明显是坏块,而且是undotbs出现了问题;后面的ora-00600 [kcbzpb_1]其实也坏块导致;大家从函数名称就可以猜出来了。从目前来看是Oracle在open时通过undo进行事务恢复时遇到坏块导致异常终止了。

如果是10g或者11g的库,那你在recover database时可以allow xx corruption的方式来跳过部分坏块(10g只能允许1个,11g可以允许跳过多个坏块);通过破坏事务一致性的方法来完成恢复。很可惜这是oracle 9.0 版本。

处理方法也简单,设置undo_managment参数即可并加上强制open参数;在open时发现报错数据块scn问题:

Sun Mar 22 23:13:57 2020
SMON: enabling cache recovery
Sun Mar 22 23:13:57 2020
Errors in file D:\oracle\admin\wlyfwdb\udump\ORA05500.TRC:
ORA-00600: ÄÚ²¿´íÎó´úÂ룬²ÎÊý: [2662], [0], [874738679], [0], [874742070], [8388913], [], []

Sun Mar 22 23:14:06 2020
Errors in file D:\oracle\admin\wlyfwdb\udump\ORA05500.TRC:
ORA-00600: ÄÚ²¿´íÎó´úÂ룬²ÎÊý: [2662], [0], [874738679], [0], [874742070], [8388913], [], []

这个错误也是老生常谈了;以前blog写过太多了,这里不再进行描述了。

处理方法有几种:

1、bbed修改block的事务(这里8388913就是dba 地址,为10进制,转换一下即可)

2、推进整个db的scn来绕过这个错误(oradebug、event 10015等等均可)

由于这里是Oracle 9,可以使用event 10015,所以我直接使用该event了。非常顺利就open了。

Sun Mar 22 23:20:55 2020
RESETLOGS is being done without consistancy checks. This may result
in a corrupted database. The database should be recreated.
RESETLOGS after incomplete recovery UNTIL CHANGE 874738658
Resetting resetlogs activation ID 2927415173 (0xae7ccf85)
Online log 1 of thread 1 was previously cleared
Online log 3 of thread 1 was previously cleared
Sun Mar 22 23:20:56 2020
Assigning activation ID 2927417093 (0xae7cd705)
Thread 1 opened at log sequence 1
  Current log# 2 seq# 1 mem# 0: D:\ORACLE\ORADATA\WLYFWDB\REDO02.LOG
Successful open of redo thread 1.
Sun Mar 22 23:20:57 2020
SMON: enabling cache recovery
Sun Mar 22 23:20:57 2020
Debugging event used to advance scn to 10737418240
Dictionary check beginning
Dictionary check complete
Sun Mar 22 23:20:57 2020
SMON: enabling tx recovery
SMON: about to recover undo segment 10
Sun Mar 22 23:20:57 2020
replication_dependency_tracking turned off (no async multimaster replication found)
Sun Mar 22 23:20:58 2020
SMON: about to recover undo segment 10
SMON: mark undo segment 10 as available
SMON: about to recover undo segment 10
SMON: mark undo segment 10 as available
SMON: about to recover undo segment 10
SMON: mark undo segment 10 as available
SMON: about to recover undo segment 10
SMON: mark undo segment 10 as available
SMON: about to recover undo segment 10

打开数据库后重建一下undo tablespace即可;同事客户开始进行数据库导出重建的工作。

备注:event 10015 在11.2.0.3.2+版本不再支持,已经被废弃,可以通过oradebug poke来推进scn。


评论

发表回复

您的电子邮箱地址不会被公开。 必填项已用 * 标注