ASM startup fails with ORA-600 [kfcInitPba15]

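Early this morning a friend told me that one of their customers' databases would not start after a storage replacement, and sent me the error below. This was the first time I had run into this error myself, so I am writing a short note about it here.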

2022-10-13T08:35:14.566784+08:00
WARNING: failed to online diskgroup resource ora.ARCH.dg (unable to communicate with CRSD/OHASD)
WARNING: failed to online diskgroup resource ora.CRS_NEW.dg (unable to communicate with CRSD/OHASD)
WARNING: failed to online diskgroup resource ora.DATA.dg (unable to communicate with CRSD/OHASD)
2022-10-13T08:35:14.665540+08:00
NOTE: Attempting voting file refresh on diskgroup CRS_NEW
NOTE: Refresh completed on diskgroup CRS_NEW. Found 3 voting file(s).
NOTE: Voting file relocation is required in diskgroup CRS_NEW
NOTE: Attempting voting file relocation on diskgroup CRS_NEW
Errors in file /u01/app/grid/diag/asm/+asm/+ASM1/trace/+ASM1_rbal_81437.trc  (incident=5424364):
ORA-00600: internal error code, arguments: [kfcInitPba15], [2], [0], [], [], [], [], [], [], [], [], []
Incident details in: /u01/app/grid/diag/asm/+asm/+ASM1/incident/incdir_5424364/+ASM1_rbal_81437_i5424364.trc
Use ADRCI or Support Workbench to package the incident.
See Note 411.1 at My Oracle Support for error and packaging details.
2022-10-13T08:35:15.066313+08:00
System state dump requested by (instance=1, osid=81437 (RBAL)), summary=[incident=5424364 (kfcInitPba)].
System State dumped to trace file /u01/app/grid/diag/asm/+asm/+ASM1/trace/+ASM1_diag_81405_20221013083515.trc
2022-10-13T08:35:15.441939+08:00
Errors in file /u01/app/grid/diag/asm/+asm/+ASM1/trace/+ASM1_rbal_81437.trc:
ORA-00600: internal error code, arguments: [kfcInitPba15], [2], [0], [], [], [], [], [], [], [], [], []
2022-10-13T08:35:15.448669+08:00
ERROR: ORA-600 thrown in RBAL for group number 2
2022-10-13T08:35:15.448768+08:00
Errors in file /u01/app/grid/diag/asm/+asm/+ASM1/trace/+ASM1_rbal_81437.trc:
ORA-00600: internal error code, arguments: [kfcInitPba15], [2], [0], [], [], [], [], [], [], [], [], []
2022-10-13T08:35:15.448986+08:00
Errors in file /u01/app/grid/diag/asm/+asm/+ASM1/trace/+ASM1_rbal_81437.trc:
ORA-00600: internal error code, arguments: [kfcInitPba15], [2], [0], [], [], [], [], [], [], [], [], []
2022-10-13T08:35:15.459548+08:00
Errors in file /u01/app/grid/diag/asm/+asm/+ASM1/trace/+ASM1_rbal_81437.trc  (incident=5424365):
ORA-488 [] [] [] [] [] [] [] [] [] [] [] []
Incident details in: /u01/app/grid/diag/asm/+asm/+ASM1/incident/incdir_5424365/+ASM1_rbal_81437_i5424365.trc
2022-10-13T08:35:15.972683+08:00
Dumping diagnostic data in directory=[cdmp_20221013083515], requested by (instance=1, osid=81437 (RBAL)), summary=[incident=5424364].
2022-10-13T08:35:16.189427+08:00
USER (ospid: 81437): terminating the instance due to error 488
2022-10-13T08:35:16.258267+08:00
System state dump requested by (instance=1, osid=81437 (RBAL)), summary=[abnormal instance termination].
System State dumped to trace file /u01/app/grid/diag/asm/+asm/+ASM1/trace/+ASM1_diag_81405_20221013083516.trc
2022-10-13T08:35:16.340898+08:00
Dumping diagnostic data in directory=[cdmp_20221013083516], requested by (instance=1, osid=81437 (RBAL)), summary=[abnormal instance termination].
2022-10-13T08:35:17.403366+08:00
Instance terminated by USER, pid = 81437
2022-10-13T08:35:22.225371+08:00
MEMORY_TARGET defaulting to 1128267776.
WARNING: ASM does not support ipclw. Switching to skgxp
WARNING: ASM does not support ipclw. Switching to skgxp
WARNING: ASM does not support ipclw. Switching to skgxp
ksxp_exafusion_enabled_dcf: ipclw_enabled=0
WARNING: ASM does not support ipclw. Switching to skgxp
WARNING: ASM does not support ipclw. Switching to skgxp
WARNING: ASM does not support ipclw. Switching to skgxp
* instance_number obtained from CSS = 1, checking for the existence of node 0...
* node 0 does not exist. instance_number = 1
Starting ORACLE instance (normal) (OS id: 82216)
2022-10-13T08:35:22.231051+08:00
CLI notifier numLatches:7 maxDescs:1311
2022-10-13T08:35:22.240013+08:00
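
When the same ORA-00600 line repeats many times in an alert log, it can help to collapse the repeats into distinct argument sets before reading further. The following is a minimal Python sketch that does this for lines in the format shown above. The function names `parse_ora600` and `summarize` are made up for this illustration. The meaning of the individual arguments is not documented here; the only observation from this log is that the first numeric argument, 2, coincides with the "group number 2" reported by RBAL.

```python
import re

# Alert-log form of an ORA-00600 line; the capture holds the bracketed argument list.
ORA600_LINE = re.compile(r"ORA-00600: internal error code, arguments: (.*)")
BRACKET_ARG = re.compile(r"\[([^\]]*)\]")


def parse_ora600(line):
    """Return the non-empty bracketed arguments of an ORA-00600 line, or None."""
    match = ORA600_LINE.search(line)
    if match is None:
        return None
    return [arg for arg in BRACKET_ARG.findall(match.group(1)) if arg]


def summarize(lines):
    """Count how often each distinct ORA-00600 argument tuple appears."""
    counts = {}
    for line in lines:
        args = parse_ora600(line)
        if args:
            key = tuple(args)
            counts[key] = counts.get(key, 0) + 1
    return counts


sample = [
    "ORA-00600: internal error code, arguments: [kfcInitPba15], [2], [0], [], [], [], [], [], [], [], [], []",
    "ERROR: ORA-600 thrown in RBAL for group number 2",
    "ORA-00600: internal error code, arguments: [kfcInitPba15], [2], [0], [], [], [], [], [], [], [], [], []",
]
print(summarize(sample))  # {('kfcInitPba15', '2', '0'): 2}
```

To run it against a real alert log, pass the open file object to `summarize` instead of the sample list.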

Looking at the corresponding trace file, we can see the call stack:

Error Descriptor: ORA-600 [kfcInitPba15] [2] [0] [] [] [] [] [] [] [] [] []
Error class: 0
Problem Key # of args: 1
Number of actions: 18
----- Incident Context Dump -----
Address: 0x7ffd0d901148
Incident ID: 5424364
Problem Key: ORA 600 [kfcInitPba15]
Error: ORA-600 [kfcInitPba15] [2] [0] [] [] [] [] [] [] [] [] []
[00]: dbgeEndDDEInvocationImpl [diag_dde]
[01]: kfcInitPba [KFC]<-- Signaling
[02]: kfcKfblTranslatePriv [KFC]
[03]: kfcReadBuffer [KFC]
[04]: kfcGet0Priv [KFC]
[05]: kfcGet1Priv [KFC]
[06]: kfcGetPriv [KFC]
[07]: kfdvaBpcFbpGetPriv [KFDVA]
[08]: kfdAllocateAuSet [KFD]
[09]: kfdvaAtbGetXeq [KFDVA]
[10]: kfdvfStartAllocation [KFD]
[11]: kfdvfAllocDisk [KFD]
[12]: kfdvfGeneratePlan [KFD]
[13]: kfdvfValidateReloc [KFD]
[14]: kfdvfReloc [KFD]
[15]: kfgbVotingFileRelocateNow [KFGB]
[16]: kfgbTryFn [KFGB]
[17]: kfgbTimeout [KFGB]
[18]: kfgbDriver [KFGB]
[19]: ksbcti [background_proc]
[20]: ksbabs [background_proc]
[21]: ksbrdp [background_proc]
[22]: opirip [OPI]
[23]: opidrv [OPI]
[24]: sou2o []
[25]: opimai_real [OPI]
[26]: ssthrdmain []
[27]: main []

The call stack suggests a problem with metadata synchronization. I later found a MOS document that describes the error briefly:

SYMPTOMS

While mounting the ASM diskgroup with compatible.asm set to 12.1 or above .It was failing with below ORA-600,

ORA-00600: internal error code, arguments: [kfcInitPba15], [3], [7], [], [], [], [], [], [], [], [], []
ORA-00600: internal error code, arguments: [kfcInitPba15], [3], [1], [], [], [], [], [], [], [], [], []
ORA-00600: internal error code, arguments: [kfcInitPba15], [3], [3], [], [], [], [], [], [], [], [], []
ORA-00600: internal error code, arguments: [kfcInitPba15], [3], [4], [], [], [], [], [], [], [], [], []

With compatible.asm set to 12.1 or above , phys_meta_replicated attribute get enabled by default.
And copy complete allocation unit 0 of each asm disk within a diskgroup to allocation unit 11.

CAUSE

This issue most of the time observed due to IO completion failure allocation unit 11.

SOLUTION

Please open a SR and engage Oracle Support to review your issue and assist you.

 

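The document does not go into much detail. From the analysis, starting with Oracle 12c the disk header data is backed up in AU 11, so the copy from AU 0 to AU 11 may be where the problem lies.

My guess is that something went wrong earlier, when the disks were swapped with the replace disk operation. In the end I suggested that they clear the data at the front of the disk header with dd and then rerun root.sh, which solved the problem.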
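
To make the AU 0 / AU 11 relationship concrete, here is a rough Python sketch that reads both allocation units from a file and reports the first block that differs. It is only an illustration under several assumptions: the AU size and block size are parameters you would have to confirm for the disk group (for example from `V$ASM_DISKGROUP`), the helper names are invented, the demo works on a small throwaway file rather than a real device, and whether every block of AU 11 is expected to match AU 0 on a healthy disk is taken from the MOS wording above, not verified. On a real system Oracle's `kfed` utility is the usual tool for inspecting these blocks.

```python
import os
import tempfile


def read_au(path, au_index, au_size):
    """Read one allocation unit from a disk or image file (opened read-only)."""
    with open(path, "rb") as f:
        f.seek(au_index * au_size)
        return f.read(au_size)


def first_mismatch(path, au_size, block_size=4096, src_au=0, copy_au=11):
    """Return the index of the first differing block between two AUs, or None."""
    a = read_au(path, src_au, au_size)
    b = read_au(path, copy_au, au_size)
    if len(a) != au_size or len(b) != au_size:
        raise ValueError("file is too small to contain both allocation units")
    for off in range(0, au_size, block_size):
        if a[off:off + block_size] != b[off:off + block_size]:
            return off // block_size
    return None


def build_demo_image(path, au_size, au_count=12):
    """Demo only: zero-filled file with identical content in AU 0 and AU 11."""
    header = bytes(range(256)) * (au_size // 256)
    with open(path, "wb") as f:
        f.write(b"\0" * (au_size * au_count))
        f.seek(0)
        f.write(header)
        f.seek(11 * au_size)
        f.write(header)


if __name__ == "__main__":
    au = 8192  # tiny AU size for the demo; real disk groups use 1 MiB or more
    with tempfile.TemporaryDirectory() as d:
        img = os.path.join(d, "disk.img")
        build_demo_image(img, au)
        print(first_mismatch(img, au))  # None: the two AUs are identical
        with open(img, "r+b") as f:     # damage the second block of AU 11
            f.seek(11 * au + 4096)
            f.write(b"\xff")
        print(first_mismatch(img, au))  # 1
```

The sketch only reads from the file it is pointed at; the single write happens on the temporary demo image.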

