案例:troubleshooting LGWR terminating the instance due to error 4021 on standby

该案例来自南区某客户的standby环境,数据库版本为11.2.0.4,凌晨实例发生了crash。

Sun Apr 24 04:40:48 2022
Errors in file /opt/app/oracle/diag/rdbms/cqwmsprd22/cqwmsprd/trace/cqwmsprd_lgwr_16079.trc:
ORA-04021: timeout occurred while waiting to lock object
Sun Apr 24 04:40:48 2022
MRP0: Background Media Recovery terminated with error 449
Errors in file /opt/app/oracle/diag/rdbms/cqwmsprd22/cqwmsprd/trace/cqwmsprd_pr00_16241.trc:
ORA-00449: background process 'LGWR' unexpectedly terminated with error 4021
LGWR (ospid: 16079): terminating the instance due to error 4021
Managed Standby Recovery not using Real Time Apply
Sun Apr 24 04:40:48 2022
System state dump requested by (instance=1, osid=16079 (LGWR)), summary=[abnormal instance termination].
System State dumped to trace file /opt/app/oracle/diag/rdbms/cqwmsprd22/cqwmsprd/trace/cqwmsprd_diag_16063_20220424044048.trc
Instance terminated by LGWR, pid = 16079

lgwr由于ORA-04021而终止了实例。diag进程收集了SystemState dump。

[oracle@rac1 ~]$ oerr ora 4021
04021, 00000, "timeout occurred while waiting to lock object %s%s%s%s%s"
// *Cause:  While waiting to lock a library object, a timeout occurred.
// *Action: Retry the operation later.

分析LGWR的process和session的state dump

PROCESS 14: LGWR
  ----------------------------------------
  SO: 0x15fa9b778, type: 2, owner: (nil), flag: INIT/-/-/0x00 if: 0x3 c: 0x3
   proc=0x15fa9b778, name=process, file=ksu.h LINE:12721, pg=0
  (process) Oracle pid:14, ser:1, calls cur/top: 0x15467a718/0x15467a718
            flags : (0x6) SYSTEM
            flags2: (0x0),  flags3: (0x10)
            intr error: 0, call error: 0, sess error: 0, txn error 0
            intr queue: empty
    ksudlp FALSE at location: 0
  (post info) last post received: 0 0 26
              last post received-location: ksa2.h LINE:285 ID:ksasnd
              last process to post me: 0x15aab64b8 1 0
              last post sent: 0 0 259
              last post sent-location: kgl.h LINE:8873 ID:kgllkdl: post after freeing latch
              last process posted by me: 0x15faa0b10 1 0
    (latch info) wait_event=0 bits=0x0
    Process Group: DEFAULT, pseudo proc: 0x158d3c5e8
    O/S info: user: oracle, term: UNKNOWN, ospid: 16079
    OSD pid info: Unix process pid: 16079, image: oracle@redprairie-chongqing-db03 (LGWR)

 SO: 0x156622740, type: 4, owner: 0x15fa9b778, flag: INIT/-/-/0x00 if: 0x3 c: 0x3
     proc=0x15fa9b778, name=session, file=ksu.h LINE:12729, pg=0
    (session) sid: 2647 ser: 1 trans: (nil), creator: 0x15fa9b778
              flags: (0x51) USR/- flags_idl: (0x1) BSY/-/-/-/-/-
              flags2: (0x409) -/-/INC
              DID: , short-term DID:
              txn branch: (nil)
              edition#: 0              oct: 0, prv: 0, sql: (nil), psql: (nil), user: 0/SYS
    ksuxds FALSE at location: 0
    service name: SYS$BACKGROUND
    Current Wait Stack:
      Not in wait; last wait ended 1.320661 sec ago
    There are 14 sessions blocked by this session.
    Dumping one waiter:
      inst: 1, sid: 191, ser: 3
      wait event: 'library cache lock'
        p1: 'handle address'=0x15ee73a58
        p2: 'lock address'=0x15da21970
        p3: '100*mode+namespace'=0x2004a0002
      row_wait_obj#: 4294967295, block#: 0, row#: 0, file# 0
      min_blocked_time: 896 secs, waiter_cache_ver: 59364
    Wait State:
      fixed_waits=0 flags=0x21 boundary=(nil)/-1
    Session Wait History:
        elapsed time of 1.320716 sec since last wait
     0: waited for 'library cache lock'
        handle address=0x15ee73a58, lock address=0x15cf510c8, 100*mode+namespace=0x2004a0003
        wait_id=9474667 seq_num=37628 snap_id=1
        wait times: snap=15 min 0 sec, exc=15 min 0 sec, total=15 min 0 sec
        wait times: max=15 min 0 sec
        wait counts: calls=302 os=302
        occurred after 0.000005 sec of elapsed time
     1: waited for 'library cache: mutex X'
        idn=0xc489f911, value=0x0, where=0x40
        wait_id=9474666 seq_num=37627 snap_id=1
        wait times: snap=0.000119 sec, exc=0.000119 sec, total=0.000119 sec
        wait times: max=infinite
        wait counts: calls=0 os=0
        occurred after 1 min 56 sec of elapsed time
     2: waited for 'rdbms ipc message'
        timeout=0x12c, =0x0, =0x0
        wait_id=9474665 seq_num=37626 snap_id=1
        wait times: snap=2.158775 sec, exc=2.158775 sec, total=2.158775 sec
        wait times: max=3.000000 sec
        wait counts: calls=1 os=1
        occurred after 0.000040 sec of elapsed time
     3: waited for 'rdbms ipc message'
        timeout=0x12c, =0x0, =0x0
        wait_id=9474664 seq_num=37625 snap_id=1
        wait times: snap=3.000774 sec, exc=3.000774 sec, total=3.000774 sec
        wait times: max=3.000000 sec
        wait counts: calls=1 os=1
        occurred after 0.000036 sec of elapsed time
     4: waited for 'rdbms ipc message'
        timeout=0x12c, =0x0, =0x0
        wait_id=9474663 seq_num=37624 snap_id=1
        wait times: snap=3.005809 sec, exc=3.005809 sec, total=3.005809 sec
        wait times: max=3.000000 sec
        wait counts: calls=1 os=1
        occurred after 0.000036 sec of elapsed time
     5: waited for 'rdbms ipc message'
        timeout=0x12c, =0x0, =0x0
        wait_id=9474662 seq_num=37623 snap_id=1
        wait times: snap=3.000757 sec, exc=3.000757 sec, total=3.000757 sec
        wait times: max=3.000000 sec
        wait counts: calls=1 os=1
        occurred after 0.000045 sec of elapsed time
     6: waited for 'rdbms ipc message'
        timeout=0x12c, =0x0, =0x0
        wait_id=9474661 seq_num=37622 snap_id=1
        wait times: snap=3.001774 sec, exc=3.001774 sec, total=3.001774 sec
        wait times: max=3.000000 sec
        wait counts: calls=1 os=1
        occurred after 0.000038 sec of elapsed time
     7: waited for 'rdbms ipc message'
        timeout=0x12c, =0x0, =0x0
        wait_id=9474660 seq_num=37621 snap_id=1
        wait times: snap=3.000749 sec, exc=3.000749 sec, total=3.000749 sec
        wait times: max=3.000000 sec
        wait counts: calls=1 os=1
        occurred after 0.000037 sec of elapsed time
     8: waited for 'rdbms ipc message'
        timeout=0x12c, =0x0, =0x0
        wait_id=9474659 seq_num=37620 snap_id=1
        wait times: snap=3.000751 sec, exc=3.000751 sec, total=3.000751 sec
        wait times: max=3.000000 sec
        wait counts: calls=1 os=1
        occurred after 0.000040 sec of elapsed time
     9: waited for 'rdbms ipc message'
        timeout=0x12c, =0x0, =0x0
        wait_id=9474658 seq_num=37619 snap_id=1
        wait times: snap=3.001728 sec, exc=3.001728 sec, total=3.001728 sec
        wait times: max=3.000000 sec
        wait counts: calls=1 os=1
        occurred after 0.000041 sec of elapsed time

LGWR长时间等待library cache lock,从namespace的p3分析可以看到lgwr正在以X模式请求dbinstance(这也是standby的一大特色),由于长时间(长达15分钟)无法请求到所以导致了ORA-04021从而crash实例。

通常standby的lgwr会在下列情况下去请求X模式的dbinstance library cache lock

Typical items that require an instancelevel
library cache lock in the standby before 18c and can cause these
symptoms:
1.DDL on primary
2.Statistics gathering on primary
3.Grants / revokes on primary
4.Logon on primary (the resulting updates to last login time / user metadata in standby require it)
5.EditionBased Redefinition changes on primary
6.Query parsing, specifically hard or failed parses (focus of this note)
7.Advancement of query SCN
8.Apply/Recovery
9.Administration(Start/Cancellation) of Managed Recovery

关于standby中由于LGWR长时间以X模式持有dbinstance引起故障的问题非常多,比如anbob的blog中的这篇文章https://www.anbob.com/archives/5126.html,供参考。但是本case恰好相反,是LGWR长时间请求不到dbinstance的library cache lock。

我们知道library cache lock和row cache lock都属于是一种enqueue,S会阻塞X,X也会阻塞S。所以肯定有进程长时间持有S模式的lc lock,或者阻塞队列过长。

从trace中我们可发现有session以S模式持有了handle为0x15ee73a58的lock,并且能够定位到该session的SO:0x15646da20

      SO: 0x15ecb4090, type: 78, owner: 0x14af0a590, flag: INIT/-/-/0x00 if: 0x3 c: 0x3
       proc=0x15aabfb30, name=LIBRARY OBJECT LOCK, file=kgl.h LINE:8751, pg=0

      LibraryObjectLock:  Address=0x15ecb4090 Handle=0x15ee73a58 Mode=S CanBeBrokenCount=1 Incarnation=1 ExecutionCount=0

User=0x15646da20 Session=0x15646da20 ReferenceCount=1
Flags=CNB/[0001] SavepointNum=2
      LibraryHandle:  Address=0x15ee73a58 Hash=c489f911 LockMode=S PinMode=0 LoadLockMode=0 Status=0
        ObjectName:  Name=SYS.cqwmsprd
          FullHashValue=cba69cb3d98177886e595a37c489f911 Namespace=DBINSTANCE(74) Type=CURSOR(00) Identifier=2 OwnerIdn=0
        Statistics:  InvalidationCount=0 ExecutionCount=0 LoadCount=0 ActiveLocks=6 TotalLockCount=537229 TotalPinCount=0
        Counters:  BrokenCount=1 RevocablePointer=1 KeepDependency=0 Version=0 BucketInUse=4458 HandleInUse=4458 HandleReferenceCount=0
        Concurrency:  DependencyMutex=0x15ee73b08(0, 0, 0, 0) Mutex=0x15ee73b98(0, 1106247, 111, 0)

搜索0x15646da20可以找到该session

 SO: 0x15646da20, type: 4, owner: 0x15aabfb30, flag: INIT/-/-/0x00 if: 0x3 c: 0x3
     proc=0x15aabfb30, name=session, file=ksu.h LINE:12729, pg=0
    (session) sid: 2085 ser: 15009 trans: (nil), creator: 0x15aabfb30
              flags: (0x41) USR/- flags_idl: (0x1) BSY/-/-/-/-/-
              flags2: (0x9) -/-/INC
              DID: , short-term DID:
              txn branch: (nil)
              edition#: 0              oct: 3, prv: 0, sql: 0x15d9e9a28, psql: (nil), user: 0/SYS
    ksuxds FALSE at location: 0
    service name: SYS$USERS
    client details:
      O/S info: user: root, term: pts/0, ospid: 14273
      machine: lwp-web-api-cfab8-5c7c6ffff5-tvvwj program: python3@lwp-web-api-cfab8-5c7c6ffff5-tvvwj (TNS
      application name: python3@lwp-web-api-cfab8-5c7c6ffff5-tvvwj (TNS , hash value=3770380047
    Current Wait Stack:
      Not in wait; last wait ended 3.125650 sec ago
    There are 148 sessions blocked by this session.
    Dumping one waiter:
      inst: 1, sid: 1710, ser: 5839
      wait event: 'library cache load lock'
        p1: 'object address'=0x1377e3440
        p2: 'lock address'=0x137921060
        p3: '100*mask+namespace'=0x450003
      row_wait_obj#: 4294967295, block#: 0, row#: 0, file# 0
      min_blocked_time: 896 secs, waiter_cache_ver: 59364
    Wait State:
      fixed_waits=0 flags=0x21 boundary=(nil)/-1
    Session Wait History:
        elapsed time of 3.125710 sec since last wait
     0: waited for 'db file sequential read'
        file#=0x1, block#=0xaa9, blocks=0x1
        wait_id=29 seq_num=32 snap_id=1
        wait times: snap=0.016092 sec, exc=0.016092 sec, total=0.016092 sec
        wait times: max=infinite
        wait counts: calls=0 os=0
        occurred after 0.000073 sec of elapsed time
     1: waited for 'read by other session'
        file#=0x1, block#=0xb02, class#=0x1
        wait_id=28 seq_num=31 snap_id=1
        wait times: snap=0.009356 sec, exc=0.009356 sec, total=0.009356 sec
        wait times: max=infinite
        wait counts: calls=1 os=1
        occurred after 0.000009 sec of elapsed time
     2: waited for 'latch: cache buffers chains'
        address=0x155f89310, number=0xb1, tries=0x0
        wait_id=27 seq_num=30 snap_id=1
        wait times: snap=0.000043 sec, exc=0.000043 sec, total=0.000043 sec
        wait times: max=infinite
        wait counts: calls=0 os=0
        occurred after 0.000028 sec of elapsed time
     3: waited for 'read by other session'
        file#=0x1, block#=0xb01, class#=0x1
        wait_id=26 seq_num=29 snap_id=1
        wait times: snap=0.000301 sec, exc=0.000301 sec, total=0.000301 sec
        wait times: max=infinite
        wait counts: calls=1 os=1
        occurred after 0.030419 sec of elapsed time
     4: waited for 'latch: row cache objects'
        address=0x1475525f8, number=0x135, tries=0x0
        wait_id=25 seq_num=28 snap_id=1
        wait times: snap=0.027819 sec, exc=0.027819 sec, total=0.027819 sec
        wait times: max=infinite
        wait counts: calls=0 os=0
        occurred after 0.000029 sec of elapsed time
     5: waited for 'db file sequential read'
        file#=0x1, block#=0x7950, blocks=0x1
        wait_id=24 seq_num=27 snap_id=1
        wait times: snap=0.002374 sec, exc=0.002374 sec, total=0.002374 sec
        wait times: max=infinite
        wait counts: calls=0 os=0
        occurred after 0.000081 sec of elapsed time
     6: waited for 'db file sequential read'
        file#=0x1, block#=0x794b, blocks=0x1
        wait_id=23 seq_num=26 snap_id=1
        wait times: snap=0.029142 sec, exc=0.029142 sec, total=0.029142 sec
        wait times: max=infinite
        wait counts: calls=0 os=0
        occurred after 0.089584 sec of elapsed time
     7: waited for 'row cache lock'
        cache id=0x3, mode=0x0, request=0x3
        wait_id=21 seq_num=25 snap_id=2
        wait times: snap=0.000000 sec, exc=11 min 48 sec, total=11 min 48 sec
        wait times: max=infinite
        wait counts: calls=237 os=237
        occurred after 0.000000 sec of elapsed time
     8: waited for 'latch: row cache objects'
        address=0x1475525f8, number=0x135, tries=0x0
        wait_id=22 seq_num=24 snap_id=1
        wait times: snap=0.010451 sec, exc=0.010451 sec, total=0.010451 sec
        wait times: max=infinite
        wait counts: calls=0 os=0
        occurred after 0.000000 sec of elapsed time
     9: waited for 'row cache lock'
        cache id=0x3, mode=0x0, request=0x3
        wait_id=21 seq_num=23 snap_id=1
        wait times: snap=11 min 48 sec, exc=11 min 48 sec, total=11 min 48 sec
        wait times: max=infinite
        wait counts: calls=237 os=237
        occurred after 0.004089 sec of elapsed time

从wait history可以看到会话2085在持有dbinstance的library cache lock的情况下,长时间请求不到s模式的row cache lock,使得LGWR请求dbinstance的library cache lock超时。row cache lock对应的dc为dc_rollback_segs。

当时肯定有session以X模式持有row cache lock,但是由于这是wait history,所以无法从此时的SystemState dump中去寻找到哪个session以X模式去请求访问dc_rollback_segs。

由于这是一个standby,是一个只读的数据库,为何还会有S/X模式去请求dc_rollback_segs的row cache lock呢?我也没想明白,之前处理过相关问题的同事发给我这样一段描述

On a primary with high transaction-througput but low transaction-concurrency (few undo segments) we would end up cycling transction slots more often.

Since we have non-logged cleanouts kicking in on the primary, this may not affect adversely queries on the primary.

However, since non-logged cleanouts do not appear on the standby, we would end up having to perform more cleanouts (for consistent read) on a standby than on the primary.

This in itself could lead to more undo header gets.

Additionally we could end up doing a lot of transaction table rollbacks.
There is no fix for this issue.  Only workarounds can be provided.
As total no of undo segments are not too high, we can ask them to set _rollback_segment_count=1000

似乎与备库没有redo也没有non-logged cleanouts有关。

关于non-logged cleanouts其实就是Fast Block Cleanout on Commit

Fast Block Cleanout on Commit
Cleanout at commit time is attempted by function kcbnlc() when the transaction, after committing,
deletes its List of Blocks SOs (ktldbl).
For each block found in the transaction’s BL State Object, the Oracle server tries a «no-logging
cleanout» by calling kcbnlc, passing it the relevant block context. Then:
• The statistic “commit cleanouts” is incremented.
• If cleanout is successful, the statistic “commit cleanouts successfully completed” is
incremented.
The ratio between the two statistics allows to measure the success rate of No-logging cleanout.
After 3 failures to perform cleanout on a specific BL State Object (possibly because the block
entry’s information is too old), the Oracle server will proceed to another state object (for the same
transaction) if exists.

此外还有一种情况就是dblink了,我们知道dblink会在本地产生事务

SQL> select 1 from dual@to_test;

         1
----------
         1

SQL> select status,xid from v$transaction where ses_addr in (select saddr from v$session where sid in (select sid from v$mystat where rownum=1));

STATUS           XID
---------------- ----------------
ACTIVE           8803010002000000

并且阻塞LGWR的session,正好也正在持有handle为0x1377e3440,namespace为dblink的library cache lock,并且长时间的X模式持有handle 0x1377e3440的library cache load lock。

User=0x15646da20 Session=0x15646da20 ReferenceCount=0
Flags=[0000] SavepointNum=2
Flags=[0000] SavepointNum=2
      LibraryHandle:  Address=0x1377e3440 Hash=3b1dfeba LockMode=S PinMode=0 LoadLockMode=X Status=0
        ObjectName:  Name=.RECO.ORACLE.COM
          FullHashValue=022bfb39389939832aaa659c3b1dfeba Namespace=DBLINK(69) Type=CURSOR(00) Identifier=0 OwnerIdn=0
        Statistics:  InvalidationCount=0 ExecutionCount=0 LoadCount=0 ActiveLocks=134 TotalLockCount=134 TotalPinCount=0

 

会话2085对应的sql为:

      LibraryObjectLock:  Address=0x15ecb3f90 Handle=0x15d9e9a28 Mode=N CanBeBrokenCount=1 Incarnation=1 ExecutionCount=0

User=0x15646da20 Session=0x155445ec0 ReferenceCount=1
Flags=CNB/[0001] SavepointNum=626460cb
      LibraryHandle:  Address=0x15d9e9a28 Hash=592bf3ba LockMode=N PinMode=0 LoadLockMode=0 Status=VALD
        ObjectName:  Name=select host,userid,password,flag,authusr,authpwd, passwordx, authpwdx from link$ where owner#=:1 and name=:2
          FullHashValue=45b5ea77f78a8282b1fdfd0f592bf3ba Namespace=SQL AREA(00) Type=CURSOR(00) Identifier=1496052666 OwnerIdn=0
        Statistics:  InvalidationCount=0 ExecutionCount=0 LoadCount=2 ActiveLocks=1 TotalLockCount=1 TotalPinCount=1
        Counters:  BrokenCount=1 RevocablePointer=1 KeepDependency=1 Version=0 BucketInUse=0 HandleInUse=0 HandleReferenceCount=0
        Concurrency:  DependencyMutex=0x15d9e9ad8(0, 1, 0, 0) Mutex=0x15d9e9b68(0, 20, 0, 0)

正在查询link$,并没有找到带dblink的语句,说明这是一个以standby为源端的dblink查询。并且也阻塞了很多dblink操作的session,event为library cache load lock

=================================================================================================================================================
Process              Waiter Details                                     Wait Event                               Time Waited          Blocker Details
-------              ---------------                                    ----------------                         -----------          --------------
PROCESS 30:          sid: 2648 ser: 9645                                library cache load lock                  15 min 1 sec         inst: 1, sid: 2085, ser: 15009
PROCESS 35:          sid: 570 ser: 17                                   library cache load lock                  14 min 49 sec        inst: 1, sid: 2085, ser: 15009
PROCESS 56:          sid: 1519 ser: 18263                               library cache load lock                  15 min 2 sec         inst: 1, sid: 2085, ser: 15009
PROCESS 57:          sid: 1710 ser: 5839                                library cache load lock                  15 min 2 sec         inst: 1, sid: 2085, ser: 15009
PROCESS 58:          sid: 1898 ser: 3237                                library cache load lock                  15 min 2 sec         inst: 1, sid: 2085, ser: 15009
PROCESS 64:          sid: 2 ser: 15153                                  library cache load lock                  15 min 3 sec         inst: 1, sid: 2085, ser: 15009
PROCESS 66:          sid: 388 ser: 14487                                library cache load lock                  15 min 3 sec         inst: 1, sid: 2085, ser: 15009
PROCESS 81:          sid: 195 ser: 15563                                library cache load lock                  15 min 3 sec         inst: 1, sid: 2085, ser: 15009
PROCESS 82:          sid: 386 ser: 33757                                library cache load lock                  15 min 4 sec         inst: 1, sid: 2085, ser: 15009
PROCESS 83:          sid: 576 ser: 14539                                library cache load lock                  15 min 4 sec         inst: 1, sid: 2085, ser: 15009
PROCESS 84:          sid: 764 ser: 8677                                 library cache load lock                  15 min 4 sec         inst: 1, sid: 2085, ser: 15009
PROCESS 85:          sid: 953 ser: 10021                                library cache load lock                  15 min 4 sec         inst: 1, sid: 2085, ser: 15009
PROCESS 86:          sid: 1142 ser: 4645                                library cache load lock                  15 min 4 sec         inst: 1, sid: 2085, ser: 15009
PROCESS 87:          sid: 1332 ser: 4175                                library cache load lock                  15 min 4 sec         inst: 1, sid: 2085, ser: 15009
PROCESS 88:          sid: 1518 ser: 4243                                library cache load lock                  15 min 4 sec         inst: 1, sid: 2085, ser: 15009
PROCESS 89:          sid: 1708 ser: 8511                                library cache load lock                  15 min 4 sec         inst: 1, sid: 2085, ser: 15009
PROCESS 90:          sid: 1895 ser: 11751                               library cache load lock                  15 min 4 sec         inst: 1, sid: 2085, ser: 15009
PROCESS 91:          sid: 2087 ser: 421                                 library cache load lock                  15 min 4 sec         inst: 1, sid: 2085, ser: 15009
PROCESS 92:          sid: 2274 ser: 10251                               library cache load lock                  15 min 4 sec         inst: 1, sid: 2085, ser: 15009
PROCESS 94:          sid: 2654 ser: 10465                               library cache load lock                  15 min 4 sec         inst: 1, sid: 2085, ser: 15009
PROCESS 95:          sid: 2841 ser: 2665                                library cache load lock                  14 min 51 sec        inst: 1, sid: 2085, ser: 15009
PROCESS 96:          sid: 7 ser: 813                                    library cache load lock                  14 min 51 sec        inst: 1, sid: 2085, ser: 15009
PROCESS 97:          sid: 197 ser: 3                                    library cache load lock                  14 min 51 sec        inst: 1, sid: 2085, ser: 15009
PROCESS 98:          sid: 385 ser: 7461                                 library cache load lock                  14 min 51 sec        inst: 1, sid: 2085, ser: 15009
PROCESS 99:          sid: 571 ser: 25                                   library cache load lock                  14 min 50 sec        inst: 1, sid: 2085, ser: 15009
PROCESS 100:         sid: 765 ser: 1                                    library cache load lock                  14 min 50 sec        inst: 1, sid: 2085, ser: 15009
PROCESS 101:         sid: 947 ser: 10833                                library cache load lock                  14 min 50 sec        inst: 1, sid: 2085, ser: 15009
PROCESS 102:         sid: 1144 ser: 7                                   library cache load lock                  14 min 50 sec        inst: 1, sid: 2085, ser: 15009
PROCESS 103:         sid: 1335 ser: 1                                   library cache load lock                  14 min 50 sec        inst: 1, sid: 2085, ser: 15009
PROCESS 104:         sid: 1522 ser: 467                                 library cache load lock                  14 min 50 sec        inst: 1, sid: 2085, ser: 15009
PROCESS 105:         sid: 1711 ser: 35                                  library cache load lock                  14 min 50 sec        inst: 1, sid: 2085, ser: 15009
PROCESS 106:         sid: 1896 ser: 6687                                library cache load lock                  14 min 50 sec        inst: 1, sid: 2085, ser: 15009
PROCESS 107:         sid: 2088 ser: 1                                   library cache load lock                  14 min 45 sec        inst: 1, sid: 2085, ser: 15009
PROCESS 108:         sid: 2278 ser: 1                                   library cache load lock                  14 min 45 sec        inst: 1, sid: 2085, ser: 15009
PROCESS 109:         sid: 2464 ser: 187                                 library cache load lock                  14 min 45 sec        inst: 1, sid: 2085, ser: 15009
PROCESS 110:         sid: 2653 ser: 1099                                library cache load lock                  14 min 43 sec        inst: 1, sid: 2085, ser: 15009
PROCESS 111:         sid: 2844 ser: 1                                   library cache load lock                  13 min 51 sec        inst: 1, sid: 2085, ser: 15009
PROCESS 112:         sid: 5 ser: 1275                                   library cache load lock                  13 min 51 sec        inst: 1, sid: 2085, ser: 15009
PROCESS 113:         sid: 198 ser: 1                                    library cache load lock                  13 min 28 sec        inst: 1, sid: 2085, ser: 15009
PROCESS 114:         sid: 387 ser: 79                                   library cache load lock                  12 min 12 sec        inst: 1, sid: 2085, ser: 15009
PROCESS 115:         sid: 577 ser: 849                                  library cache load lock                  10 min 57 sec        inst: 1, sid: 2085, ser: 15009
PROCESS 116:         sid: 766 ser: 1                                    library cache load lock                  10 min 51 sec        inst: 1, sid: 2085, ser: 15009
PROCESS 117:         sid: 954 ser: 39                                   library cache load lock                  10 min 51 sec        inst: 1, sid: 2085, ser: 15009
PROCESS 118:         sid: 1145 ser: 1                                   library cache load lock                  10 min 51 sec        inst: 1, sid: 2085, ser: 15009
PROCESS 119:         sid: 1336 ser: 1                                   library cache load lock                  10 min 51 sec        inst: 1, sid: 2085, ser: 15009
PROCESS 120:         sid: 1521 ser: 35                                  library cache load lock                  10 min 51 sec        inst: 1, sid: 2085, ser: 15009
PROCESS 121:         sid: 1709 ser: 9721                                library cache load lock                  10 min 51 sec        inst: 1, sid: 2085, ser: 15009
PROCESS 122:         sid: 1899 ser: 1                                   library cache load lock                  10 min 51 sec        inst: 1, sid: 2085, ser: 15009
PROCESS 123:         sid: 2089 ser: 1                                   library cache load lock                  10 min 51 sec        inst: 1, sid: 2085, ser: 15009
PROCESS 124:         sid: 2279 ser: 1                                   library cache load lock                  10 min 51 sec        inst: 1, sid: 2085, ser: 15009
PROCESS 125:         sid: 2467 ser: 1                                   library cache load lock                  10 min 51 sec        inst: 1, sid: 2085, ser: 15009
PROCESS 126:         sid: 2655 ser: 1                                   library cache load lock                  10 min 51 sec        inst: 1, sid: 2085, ser: 15009
PROCESS 127:         sid: 2845 ser: 1                                   library cache load lock                  10 min 51 sec        inst: 1, sid: 2085, ser: 15009
PROCESS 128:         sid: 10 ser: 1                                     library cache load lock                  10 min 51 sec        inst: 1, sid: 2085, ser: 15009
PROCESS 129:         sid: 199 ser: 1                                    library cache load lock                  10 min 51 sec        inst: 1, sid: 2085, ser: 15009
PROCESS 130:         sid: 389 ser: 97                                   library cache load lock                  10 min 51 sec        inst: 1, sid: 2085, ser: 15009
PROCESS 131:         sid: 578 ser: 1                                    library cache load lock                  10 min 51 sec        inst: 1, sid: 2085, ser: 15009
PROCESS 132:         sid: 767 ser: 1                                    library cache load lock                  9 min 51 sec         inst: 1, sid: 2085, ser: 15009
PROCESS 133:         sid: 955 ser: 1                                    library cache load lock                  9 min 51 sec         inst: 1, sid: 2085, ser: 15009
PROCESS 134:         sid: 1146 ser: 1                                   library cache load lock                  9 min 51 sec         inst: 1, sid: 2085, ser: 15009
PROCESS 135:         sid: 1337 ser: 1                                   library cache load lock                  9 min 51 sec         inst: 1, sid: 2085, ser: 15009
PROCESS 136:         sid: 1523 ser: 1                                   library cache load lock                  9 min 51 sec         inst: 1, sid: 2085, ser: 15009
PROCESS 137:         sid: 1712 ser: 1                                   library cache load lock                  9 min 51 sec         inst: 1, sid: 2085, ser: 15009
PROCESS 138:         sid: 1900 ser: 1                                   library cache load lock                  9 min 51 sec         inst: 1, sid: 2085, ser: 15009
PROCESS 139:         sid: 2090 ser: 1                                   library cache load lock                  9 min 51 sec         inst: 1, sid: 2085, ser: 15009
PROCESS 140:         sid: 2280 ser: 1                                   library cache load lock                  9 min 51 sec         inst: 1, sid: 2085, ser: 15009
PROCESS 141:         sid: 2466 ser: 7                                   library cache load lock                  9 min 51 sec         inst: 1, sid: 2085, ser: 15009
PROCESS 142:         sid: 2656 ser: 1                                   library cache load lock                  9 min 51 sec         inst: 1, sid: 2085, ser: 15009
PROCESS 143:         sid: 2846 ser: 1                                   library cache load lock                  9 min 51 sec         inst: 1, sid: 2085, ser: 15009
PROCESS 144:         sid: 11 ser: 1                                     library cache load lock                  9 min 51 sec         inst: 1, sid: 2085, ser: 15009
PROCESS 145:         sid: 200 ser: 1                                    library cache load lock                  9 min 51 sec         inst: 1, sid: 2085, ser: 15009
PROCESS 146:         sid: 390 ser: 1                                    library cache load lock                  9 min 52 sec         inst: 1, sid: 2085, ser: 15009
PROCESS 147:         sid: 579 ser: 1                                    library cache load lock                  9 min 51 sec         inst: 1, sid: 2085, ser: 15009
PROCESS 148:         sid: 768 ser: 1                                    library cache load lock                  9 min 42 sec         inst: 1, sid: 2085, ser: 15009
PROCESS 149:         sid: 956 ser: 1                                    library cache load lock                  8 min 51 sec         inst: 1, sid: 2085, ser: 15009
PROCESS 150:         sid: 1147 ser: 1                                   library cache load lock                  8 min 51 sec         inst: 1, sid: 2085, ser: 15009
PROCESS 151:         sid: 1338 ser: 1                                   library cache load lock                  8 min 27 sec         inst: 1, sid: 2085, ser: 15009
PROCESS 152:         sid: 1524 ser: 1                                   library cache load lock                  7 min 11 sec         inst: 1, sid: 2085, ser: 15009
PROCESS 153:         sid: 1713 ser: 1                                   library cache load lock                  5 min 55 sec         inst: 1, sid: 2085, ser: 15009
PROCESS 154:         sid: 1901 ser: 1                                   library cache load lock                  5 min 52 sec         inst: 1, sid: 2085, ser: 15009
PROCESS 155:         sid: 2091 ser: 1                                   library cache load lock                  5 min 52 sec         inst: 1, sid: 2085, ser: 15009
PROCESS 156:         sid: 2281 ser: 1                                   library cache load lock                  5 min 52 sec         inst: 1, sid: 2085, ser: 15009
PROCESS 157:         sid: 2469 ser: 1                                   library cache load lock                  5 min 52 sec         inst: 1, sid: 2085, ser: 15009
PROCESS 158:         sid: 2657 ser: 1                                   library cache load lock                  5 min 52 sec         inst: 1, sid: 2085, ser: 15009
PROCESS 159:         sid: 2847 ser: 1                                   library cache load lock                  5 min 52 sec         inst: 1, sid: 2085, ser: 15009
PROCESS 160:         sid: 12 ser: 1                                     library cache load lock                  5 min 52 sec         inst: 1, sid: 2085, ser: 15009
PROCESS 161:         sid: 201 ser: 1                                    library cache load lock                  5 min 52 sec         inst: 1, sid: 2085, ser: 15009
PROCESS 162:         sid: 391 ser: 1                                    library cache load lock                  5 min 52 sec         inst: 1, sid: 2085, ser: 15009
PROCESS 163:         sid: 580 ser: 1                                    library cache load lock                  5 min 52 sec         inst: 1, sid: 2085, ser: 15009
PROCESS 164:         sid: 769 ser: 1                                    library cache load lock                  5 min 52 sec         inst: 1, sid: 2085, ser: 15009
PROCESS 165:         sid: 957 ser: 1                                    library cache load lock                  5 min 52 sec         inst: 1, sid: 2085, ser: 15009
PROCESS 166:         sid: 1148 ser: 1                                   library cache load lock                  5 min 52 sec         inst: 1, sid: 2085, ser: 15009
PROCESS 167:         sid: 1339 ser: 1                                   library cache load lock                  5 min 52 sec         inst: 1, sid: 2085, ser: 15009
PROCESS 168:         sid: 1525 ser: 1                                   library cache load lock                  5 min 52 sec         inst: 1, sid: 2085, ser: 15009
PROCESS 169:         sid: 1714 ser: 1                                   library cache load lock                  5 min 52 sec         inst: 1, sid: 2085, ser: 15009
PROCESS 170:         sid: 1902 ser: 1                                   library cache load lock                  4 min 52 sec         inst: 1, sid: 2085, ser: 15009
PROCESS 171:         sid: 2092 ser: 1                                   library cache load lock                  4 min 52 sec         inst: 1, sid: 2085, ser: 15009
PROCESS 172:         sid: 2282 ser: 1                                   library cache load lock                  4 min 52 sec         inst: 1, sid: 2085, ser: 15009
PROCESS 173:         sid: 2468 ser: 7                                   library cache load lock                  4 min 52 sec         inst: 1, sid: 2085, ser: 15009
PROCESS 174:         sid: 2658 ser: 1                                   library cache load lock                  4 min 52 sec         inst: 1, sid: 2085, ser: 15009
PROCESS 175:         sid: 2848 ser: 1                                   library cache load lock                  4 min 52 sec         inst: 1, sid: 2085, ser: 15009
PROCESS 176:         sid: 13 ser: 1                                     library cache load lock                  4 min 52 sec         inst: 1, sid: 2085, ser: 15009

猜测当时也有很多的dblink查询操作。但是从trace中发现实例启动以来并没有dblink完整执行。应该都还在解析阶段。

0 bytes sent via SQL*Net to dblink
0 bytes received via SQL*Net from dblink
0 SQL*Net roundtrips to/from dblink
0 bytes via SQL*Net vector to client
664395087840 bytes via SQL*Net vector from client
0 bytes via SQL*Net vector to dblink
0 bytes via SQL*Net vector from dblink

由于是standby并且已经重启,无法使用v$sysstat、v$active_session_history等视图来分析,也给分析带来了不少难度。

综上分析,结合mos文档LGWR (ospid: Xxx): Terminating The Instance Due To Error 4021 (Doc ID 2177473.1),我给的建议是:

ALTER SYSTEM set "_adg_parselock_timeout"=3000 scope=both;
ALTER SYSTEM set "_rollback_segment_count"=1000 scope=both;

关于_adg_parselock_timeout的算法,文章前推荐的anbob的那篇文章https://www.anbob.com/archives/5126.html中有详细描述。增大_adg_parselock_timeout只是增加了15分钟的timeout crash时间,如果长时间LGWR还是无法请求到X模式dbinstance的library cache lock,也可以尝试手动kill blocker。

 

 


评论

发表回复

您的电子邮箱地址不会被公开。 必填项已用 * 标注