One case of troubleshooting row cache lock (2)

本案例来自东区某客户,数据库版本为rac 19.7。周日业务反应有大量业务阻塞,通过一线同事发来的wait chain信息可以看到大量的row cache lock(CID=16)和library cache lock,并且final blocking event是row cache lock。final block session是not wait的。

首先row cache lock在之前的一篇row cache lock分析中已经介绍过了基本分析思路,这里就不多说明了,直接进入正题http://www.minniebaby.tech/2021/10/27/%e6%a1%88%e4%be%8b%ef%bc%9a%e4%b8%80%e6%ac%a1row-cache-lock%e5%bc%82%e5%b8%b8%e5%a4%84%e7%90%86/

ash分析:

select instance_NUMBER,program,event,sql_id,sql_exec_start,min(sample_time),max(sample_time),seq#,count(*)*10 from dba_hist_active_sess_history
where event='row cache lock' and p3=5 and sample_time>trunc(sysdate-1)
group by instance_NUMBER,program,event,sql_id,sql_exec_start,seq#
order by sql_exec_start;

INSTANCE_NUMBER PROGRAM 		       EVENT		    SQL_ID	  SQL_EXEC_START      MIN(SAMPLE_TIME)		     MAX(SAMPLE_TIME)			  SEQ# COUNT(*)*10
--------------- ------------------------------ -------------------- ------------- ------------------- ------------------------------ ------------------------------ ---------- -----------
	      2 oracle@prdres2 (J003)	       row cache lock	    b6usrg82hwsa3 2023-05-07 10:07:45 07-MAY-23 10.09.03.948 AM      07-MAY-23 04.14.06.719 PM		  1239	     21890
	      2 oracle@prdres2 (J003)	       row cache lock	    b6usrg82hwsa3 2023-05-07 10:07:45 07-MAY-23 04.14.16.729 PM      07-MAY-23 06.31.25.760 PM		  1241	      8230

final block session(INSTANCE_NUMBER=2、SID=2018)执行的是自动统计信息任务。以X模式请求dc_histogram_defs的row cache lock,并且持续了非常长时间的请求。一直持续到客户重启数据库。

*** 2023-05-07T10:55:08.609919+08:00 ((1))
    (session) sid: 2018 ser: 26444 trans: 0x2b8ee7da0, creator: 0x310d9c5a8
              flags: (0x8210041) USR/- flags2: (0x40009) -/-/INC
              flags_idl: (0x1) status: BSY/-/-/- kill: -/-/-/-
              DID: 0002-00B0-0003F5E70002-00B0-0003F5E6, short-term DID:
              txn branch: (nil)
              con_id/con_uid/con_name: 3/2615539819/PDBRES
              con_logonuid: 2615539819 con_logonid: 3
              con_scuid: 1 con_scid: 1
              edition#: 134              user#/name: 0/SYS
              oct: 170, prv: 0, sql: 0x19a891ae8, psql: 0x48a3146c0
              stats: 0x1bdfa4d10, PX stats: 0x12f05d04
    service name: SYS$USERS
    client details:
      O/S info: user: oracle, term: UNKNOWN, ospid: 392
      machine: prdres2 program: oracle@prdres2 (J003)
      application name: DBMS_SCHEDULER, hash value=2478762354
      action name: ORA$AT_OS_OPT_SY_5210, hash value=513032857
    Current Wait Stack:
     0: waiting for 'row cache lock'
        cache id=0x10, mode=0x0, request=0x5
        pdb uid=0x9be5f86b, master hint=0x0
        wait_id=1230 seq_num=1239 snap_id=1
        wait times: snap=46 min 13 sec, exc=46 min 13 sec, total=46 min 13 sec
        wait times: max=infinite, heur=46 min 13 sec
        wait counts: calls=5570 os=5570
        in_wait=1 iflags=0x215a2
    There are 5 sessions blocked by this session.
    Dumping one waiter:
      inst: 2, sid: 3781, ser: 11134
      wait event: 'library cache lock'
        p1: 'handle address'=0x18c056a30
        p2: 'lock address'=0x17d3fdf58
        p3: '100*mode+namespace'=0x1600010002
      row_wait_obj#: 4294967295, block#: 0, row#: 0, file# 0
      min_blocked_time: 97 secs, waiter_cache_ver: 49043
    Wait State:
      fixed_waits=0 flags=0x22 boundary=(nil)/-1
    Session Wait History:
        elapsed time of 0.000027 sec since current wait
     0: waited for 'row cache lock'
        cache id=0x10, mode=0x0, request=0x5
        pdb uid=0x9be5f86b, master hint=0x0
        wait_id=1229 seq_num=1238 snap_id=1
        wait times: snap=0.000211 sec, exc=0.000211 sec, total=0.000211 sec
        wait times: max=0.000000 sec
        wait counts: calls=2 os=2
        occurred after 0.000012 sec of elapsed time
     1: waited for 'KJC: Wait for msg sends to complete'
        msg=0x353bf7860, dest|rcvr=0x10000, mtype=0x6a
        wait_id=1228 seq_num=1237 snap_id=1
        wait times: snap=0.000045 sec, exc=0.000045 sec, total=0.000045 sec
        wait times: max=5 min 1 sec
        wait counts: calls=2 os=2
        occurred after 0.000004 sec of elapsed time
     2: waited for 'DLM cross inst call completion'
        caller instance number=0x2, cluster incarnation number=0x7, request identifier=0x14f735f
        wait_id=1227 seq_num=1236 snap_id=1
        wait times: snap=0.000004 sec, exc=0.000004 sec, total=0.000004 sec
        wait times: max=0.000000 sec
        wait counts: calls=0 os=0
        occurred after 0.000021 sec of elapsed time
     3: waited for 'row cache lock'
        cache id=0x10, mode=0x0, request=0x5
        pdb uid=0x9be5f86b, master hint=0x0
        wait_id=1226 seq_num=1235 snap_id=1
        wait times: snap=0.000269 sec, exc=0.000269 sec, total=0.000269 sec
        wait times: max=0.000000 sec
        wait counts: calls=2 os=2
        occurred after 0.000033 sec of elapsed time
     4: waited for 'KJC: Wait for msg sends to complete'
        msg=0x353afe5a8, dest|rcvr=0x10000, mtype=0x6a
        wait_id=1225 seq_num=1234 snap_id=1
        wait times: snap=0.000042 sec, exc=0.000042 sec, total=0.000042 sec
        wait times: max=5 min 1 sec
        wait counts: calls=2 os=2
        occurred after 0.000013 sec of elapsed time
     5: waited for 'DLM cross inst call completion'
        caller instance number=0x2, cluster incarnation number=0x7, request identifier=0x14f735e
        wait_id=1224 seq_num=1233 snap_id=1
        wait times: snap=0.000011 sec, exc=0.000011 sec, total=0.000011 sec
        wait times: max=0.000000 sec
        wait counts: calls=0 os=0
        occurred after 0.000064 sec of elapsed time
     6: waited for 'row cache lock'
        cache id=0x10, mode=0x0, request=0x5
        pdb uid=0x9be5f86b, master hint=0x0
        wait_id=1223 seq_num=1232 snap_id=1
        wait times: snap=0.000219 sec, exc=0.000219 sec, total=0.000219 sec
        wait times: max=0.000000 sec
        wait counts: calls=2 os=2
        occurred after 0.000016 sec of elapsed time
     7: waited for 'KJC: Wait for msg sends to complete'
        msg=0x353f0abe0, dest|rcvr=0x10000, mtype=0x6a
        wait_id=1222 seq_num=1231 snap_id=1
        wait times: snap=0.000057 sec, exc=0.000057 sec, total=0.000057 sec
        wait times: max=5 min 1 sec
        wait counts: calls=2 os=2
        occurred after 0.000007 sec of elapsed time
     8: waited for 'DLM cross inst call completion'
        caller instance number=0x2, cluster incarnation number=0x7, request identifier=0x14f735d
        wait_id=1221 seq_num=1230 snap_id=1
        wait times: snap=0.000004 sec, exc=0.000004 sec, total=0.000004 sec
        wait times: max=0.000000 sec
        wait counts: calls=0 os=0
        occurred after 0.000035 sec of elapsed time
     9: waited for 'row cache lock'
        cache id=0x10, mode=0x0, request=0x5
        pdb uid=0x9be5f86b, master hint=0x0
        wait_id=1220 seq_num=1229 snap_id=1
        wait times: snap=0.000372 sec, exc=0.000372 sec, total=0.000372 sec
        wait times: max=0.000000 sec
        wait counts: calls=1 os=1
        occurred after 0.000752 sec of elapsed time

10点55分已经等待了46分钟的row cache lock

  • P1=16(dc_histogram_defs)
  • P3=5(X模式)
SOC: 0x8cb9e628, type: row cache enqueues (111), map: 0x17bf35d28
     state: LIVE (0x99fc), flags: INIT (0x1)
row cache enqueue: count=1 session=0x351263950 object=0x197150cc0, request=X
flag=03 WAI/TRC/-/-/-/-/-/- savepoint=0x4076e
type=MULTI-INSTANCE instance locked=F handle=0x8cb9e6f8
row cache parent object: addr=0x197150cc0 cid=16(dc_histogram_defs) conid=3 conuid=2615539819
hash=a3de4515 typ=61 transaction=(nil) flags=00000002 inc=1, pdbinc=1
version=2 mtx version=15688
own=0x197150d90[0x197150d90,0x197150d90] wat=0x197150da0[0x8cb9e6a8,0x8cb9e6a8] mode=N req=X
status=VALID/-/-/-/-/-/-/-/-  KGH pinned
instance lock=QQ 6a7e84e3 f2908b26
set=0, complete=FALSE
set=1, complete=FALSE

hash=a3de4515这个是关键信息,用于后续分析的匹配和搜索。

那么为什么J003请求不到X模式的row cache lock呢?原因只能是当前有进程正在以S或者X模式持有了该队列。那么在ash blocking为空的情况下如何找出这个进程呢?

我的方法是使用hash=a3de4515去搜搜trace下面的所有trace看看能不能有发现,最终在diag trace中找到了持有的进程。

SOC: 0x3398f0508, type: row cache enqueues (111), map: 0x19faa3ba8
               state: LIVE (0x99fc), flags: INIT (0x1)
          row cache enqueue: count=1 session=0x1c106b318 object=0x4d9d89fb0, mode=S
          flag=80 -/-/-/-/-/-/-/ILK savepoint=0x8fe64da
          type=MULTI-INSTANCE instance locked=T handle=0x3398f05d8
          row cache parent object: addr=0x4d9d89fb0 cid=16(dc_histogram_defs) conid=3 conuid=2615539819
          hash=a3de4515 typ=61 transaction=(nil) flags=00000002 inc=1, pdbinc=1
          version=2 mtx version=698
          own=0x4d9d8a080[0x3398f0588,0x3398f0588] wat=0x4d9d8a090[0x4d9d8a090,0x4d9d8a090] mode=S req=N
          status=VALID/-/-/-/-/-/-/-/-  KGH pinned
          instance lock=QQ 6a7e84e3 f2908b26
          set=0, complete=FALSE
          set=1, complete=FALSE

hash和instance lock和请求的完全匹配,通过SO和OWNER,往上搜索可以定位到process为GEN1,再通过process的SO可以找到gen1的session state信息。

 SO: 0x1bff52c30, type: session (4), map: 0x1c106b318
state: LIVE (0x4532), flags: 0x1
owner: 0x1bfee0180, proc: 0x1bfee0180
link: 0x1bff52c50[0x1bf6fb2a8, 0x1bf6fb228]
child list count: 7, link: 0x1bff52ca0[0x19faa3a48, 0x1bd6d1020]
conid: 1, conuid: 1, SGA version=(1,0), pg: 0
SOC: 0x1c106b318, type: session (4), map: 0x1bff52c30
state: LIVE (0x99fc), flags: INIT (0x1)
(session) sid: 631 ser: 35354 trans: (nil), creator: 0x1b0ea3d88
flags: (0x51) USR/- flags2: (0x409) -/-/INC
flags_idl: (0x1) status: BSY/-/-/- kill: -/-/-/-
0001-000A-000000020001-000A-00000003              DID: , short-term DID:
txn branch: (nil)
con_id/con_uid/con_name: 3/2615539819/PDBRES
con_logonuid: 1 con_logonid: 1
con_scuid: 1 con_scid: 1
edition#: 134              user#/name: 0/SYS
oct: 3, prv: 0, sql: 0x3398059e8, psql: (nil)
stats: 0x1dfe645a0, PX stats: 0x12f05d04
service name: SYS$BACKGROUND
Current Wait Stack:
0: waiting for 'gc cr request'
file#=0x9, block#=0x8a06, class#=0x1
wait_id=50342261 seq_num=11438 snap_id=1
wait times: snap=0.147084 sec, exc=0.147084 sec, total=0.147084 sec
wait times: max=infinite, heur=0.147084 sec
wait counts: calls=4 os=4
in_wait=1 iflags=0x5a0
Wait State:
fixed_waits=0 flags=0x22 boundary=(nil)/-1
Session Wait History:
elapsed time of 0.000099 sec since current wait
0: waited for 'gc cr block lost'
=0x9, =0x8a06, =0x1
wait_id=50342260 seq_num=11437 snap_id=1
wait times: snap=0.512788 sec, exc=0.512788 sec, total=0.512788 sec
wait times: max=infinite
wait counts: calls=5 os=5
occurred after 0.000090 sec of elapsed time
1: waited for 'gc cr block lost'
=0x9, =0x8a06, =0x1
wait_id=50342259 seq_num=11436 snap_id=1
wait times: snap=0.515190 sec, exc=0.515190 sec, total=0.515190 sec
wait times: max=infinite
wait counts: calls=5 os=5
occurred after 0.000062 sec of elapsed time
2: waited for 'gc cr block lost'
=0x9, =0x8a06, =0x1
wait_id=50342258 seq_num=11435 snap_id=1
wait times: snap=0.518847 sec, exc=0.518847 sec, total=0.518847 sec
wait times: max=infinite
wait counts: calls=5 os=5
occurred after 0.000065 sec of elapsed time
3: waited for 'gc cr block lost'
=0x9, =0x8a06, =0x1
wait_id=50342257 seq_num=11434 snap_id=1
wait times: snap=0.517242 sec, exc=0.517242 sec, total=0.517242 sec
wait times: max=infinite
wait counts: calls=5 os=5
occurred after 0.000060 sec of elapsed time
4: waited for 'gc cr block lost'
=0x9, =0x8a06, =0x1
wait_id=50342256 seq_num=11433 snap_id=1
wait times: snap=0.512952 sec, exc=0.512952 sec, total=0.512952 sec
wait times: max=infinite
wait counts: calls=5 os=5
occurred after 0.000060 sec of elapsed time
5: waited for 'gc cr block lost'
=0x9, =0x8a06, =0x1
wait_id=50342255 seq_num=11432 snap_id=1
wait times: snap=0.515456 sec, exc=0.515456 sec, total=0.515456 sec
wait times: max=infinite
wait counts: calls=5 os=5
occurred after 0.000074 sec of elapsed time
6: waited for 'gc cr block lost'
=0x9, =0x8a06, =0x1
wait_id=50342254 seq_num=11431 snap_id=1
wait times: snap=0.515304 sec, exc=0.515304 sec, total=0.515304 sec
wait times: max=infinite
wait counts: calls=5 os=5
occurred after 0.000063 sec of elapsed time
7: waited for 'gc cr block lost'
=0x9, =0x8a06, =0x1
wait_id=50342253 seq_num=11430 snap_id=1
wait times: snap=0.517334 sec, exc=0.517334 sec, total=0.517334 sec
wait times: max=infinite
wait counts: calls=5 os=5
occurred after 0.000057 sec of elapsed time
8: waited for 'gc cr block lost'
=0x9, =0x8a06, =0x1
wait_id=50342252 seq_num=11429 snap_id=1
wait times: snap=0.517039 sec, exc=0.517039 sec, total=0.517039 sec
wait times: max=infinite
wait counts: calls=5 os=5
occurred after 0.000067 sec of elapsed time
9: waited for 'gc cr block lost'
=0x9, =0x8a06, =0x1
wait_id=50342251 seq_num=11428 snap_id=1
wait times: snap=0.512986 sec, exc=0.512986 sec, total=0.512986 sec
wait times: max=infinite
wait counts: calls=5 os=5
occurred after 0.000066 sec of elapsed time
Sampled Session History of session 631 serial 35354
---------------------------------------------------
The sampled session history is constructed by sampling
the target session every 1 second. The sampling process
captures at each sample if the session is in a non-idle wait,
an idle wait, or not in a wait. If the session is in a
non-idle wait then one interval is shown for all the samples
the session was in the same non-idle wait. If the
session is in an idle wait or not in a wait for
consecutive samples then one interval is shown for all
the consecutive samples. Though we display these consecutive
samples  in a single interval the session may NOT be continuously
idle or not in a wait (the sampling process does not know).
The history is displayed in reverse chronological order.
sample interval: 1 sec, max history 120 sec
---------------------------------------------------
[1 sample,                                                      11:12:58]
waited for 'gc cr request', seq_num: 11438
p1: 'file#'=0x9
p2: 'block#'=0x8a06
p3: 'class#'=0x1
[1 sample,                                                      11:12:57]
waited for 'gc cr block lost', seq_num: 11436
p1: ''=0x9
p2: ''=0x8a06
p3: ''=0x1
[1 sample,                                                      11:12:56]
waited for 'gc cr block lost', seq_num: 11434
p1: ''=0x9
p2: ''=0x8a06
p3: ''=0x1
[1 sample,                                                      11:12:55]
waited for 'gc cr block lost', seq_num: 11432
p1: ''=0x9
p2: ''=0x8a06
p3: ''=0x1
[1 sample,                                                      11:12:54]
waited for 'gc cr block lost', seq_num: 11430
p1: ''=0x9
p2: ''=0x8a06
p3: ''=0x1
[1 sample,                                                      11:12:53]
waited for 'gc cr block lost', seq_num: 11428
p1: ''=0x9
p2: ''=0x8a06
p3: ''=0x1
[1 sample,                                                      11:12:52]
waited for 'gc cr block lost', seq_num: 11426
p1: ''=0x9
p2: ''=0x8a06
p3: ''=0x1
[1 sample,                                                      11:12:51]
waited for 'gc cr block lost', seq_num: 11424
p1: ''=0x9
p2: ''=0x8a06
p3: ''=0x1
[1 sample,                                                      11:12:50]
waited for 'gc cr block lost', seq_num: 11422
p1: ''=0x9
p2: ''=0x8a06
p3: ''=0x1
[1 sample,                                                      11:12:49]
waited for 'gc cr block lost', seq_num: 11420
p1: ''=0x9
p2: ''=0x8a06
p3: ''=0x1
[1 sample,                                                      11:12:48]
waited for 'gc cr block lost', seq_num: 11418
p1: ''=0x9
p2: ''=0x8a06
p3: ''=0x1
[1 sample,                                                      11:12:47]
waited for 'gc cr block lost', seq_num: 11416
p1: ''=0x9
p2: ''=0x8a06
p3: ''=0x1
[1 sample,                                                      11:12:46]
waited for 'gc cr block lost', seq_num: 11414
p1: ''=0x9
p2: ''=0x8a06
p3: ''=0x1
[1 sample,                                                      11:12:45]
waited for 'gc cr block lost', seq_num: 11412
p1: ''=0x9
p2: ''=0x8a06
p3: ''=0x1
[1 sample,                                                      11:12:44]
waited for 'gc cr block lost', seq_num: 11410
p1: ''=0x9
p2: ''=0x8a06
p3: ''=0x1
[1 sample,                                                      11:12:43]
waited for 'gc cr block lost', seq_num: 11409
p1: ''=0x9
p2: ''=0x8a06
p3: ''=0x1
[1 sample,                                                      11:12:42]
waited for 'gc cr block lost', seq_num: 11407
p1: ''=0x9
p2: ''=0x8a06
p3: ''=0x1
[1 sample,                                                      11:12:41]
waited for 'gc cr block lost', seq_num: 11405
p1: ''=0x9
p2: ''=0x8a06
p3: ''=0x1
[1 sample,                                                      11:12:40]
waited for 'gc cr block lost', seq_num: 11403
p1: ''=0x9
p2: ''=0x8a06
p3: ''=0x1
[1 sample,                                                      11:12:39]
waited for 'gc cr block lost', seq_num: 11401
p1: ''=0x9
p2: ''=0x8a06
p3: ''=0x1
[1 sample,                                                      11:12:38]
waited for 'gc cr block lost', seq_num: 11399
p1: ''=0x9
p2: ''=0x8a06
p3: ''=0x1
[1 sample,                                                      11:12:37]
waited for 'gc cr block lost', seq_num: 11397
p1: ''=0x9
p2: ''=0x8a06
p3: ''=0x1
[1 sample,                                                      11:12:36]
waited for 'gc cr block lost', seq_num: 11395
p1: ''=0x9
p2: ''=0x8a06
p3: ''=0x1
[1 sample,                                                      11:12:35]
waited for 'gc cr block lost', seq_num: 11393
p1: ''=0x9
p2: ''=0x8a06
p3: ''=0x1
[1 sample,                                                      11:12:34]
waited for 'gc cr block lost', seq_num: 11391
p1: ''=0x9
p2: ''=0x8a06
p3: ''=0x1
[1 sample,                                                      11:12:33]
waited for 'gc cr block lost', seq_num: 11389
p1: ''=0x9
p2: ''=0x8a06
p3: ''=0x1
[1 sample,                                                      11:12:32]
waited for 'gc cr block lost', seq_num: 11387
p1: ''=0x9
p2: ''=0x8a06
p3: ''=0x1
[1 sample,                                                      11:12:31]
waited for 'gc cr block lost', seq_num: 11385
p1: ''=0x9
p2: ''=0x8a06
p3: ''=0x1
[1 sample,                                                      11:12:30]
waited for 'gc cr block lost', seq_num: 11383
p1: ''=0x9
p2: ''=0x8a06
p3: ''=0x1
[1 sample,                                                      11:12:29]
waited for 'gc cr block lost', seq_num: 11381
p1: ''=0x9
p2: ''=0x8a06
p3: ''=0x1
[1 sample,                                                      11:12:28]
waited for 'gc cr block lost', seq_num: 11380
p1: ''=0x9
p2: ''=0x8a06
p3: ''=0x1
[1 sample,                                                      11:12:27]
waited for 'gc cr block lost', seq_num: 11378
p1: ''=0x9
p2: ''=0x8a06
p3: ''=0x1
[1 sample,                                                      11:12:26]
waited for 'gc cr block lost', seq_num: 11376
p1: ''=0x9
p2: ''=0x8a06
p3: ''=0x1
[1 sample,                                                      11:12:25]
waited for 'gc cr block lost', seq_num: 11375
p1: ''=0x9
p2: ''=0x8a06
p3: ''=0x1
[1 sample,                                                      11:12:24]
waited for 'gc cr block lost', seq_num: 11373
p1: ''=0x9
p2: ''=0x8a06
p3: ''=0x1
[1 sample,                                                      11:12:23]
waited for 'gc cr block lost', seq_num: 11372
p1: ''=0x9
p2: ''=0x8a06
p3: ''=0x1
[1 sample,                                                      11:12:22]
waited for 'gc cr block lost', seq_num: 11370
p1: ''=0x9
p2: ''=0x8a06
p3: ''=0x1
[1 sample,                                                      11:12:21]
waited for 'gc cr block lost', seq_num: 11368
p1: ''=0x9
p2: ''=0x8a06
p3: ''=0x1
[1 sample,                                                      11:12:20]
waited for 'gc cr block lost', seq_num: 11366
p1: ''=0x9
p2: ''=0x8a06
p3: ''=0x1
[1 sample,                                                      11:12:19]
waited for 'gc cr block lost', seq_num: 11364
p1: ''=0x9
p2: ''=0x8a06
p3: ''=0x1

大量的gc cr block lost,通常出现block lost都与网络丢包有关系。但是我反复查看了oswnetstat,并未发现异常,而且gc cr block lost只在gen1进程的等待中出现。

查看gen1的trace发现异常阶段有大量的如下输出:

*** 2023-05-07T08:34:12.831763+08:00 (PDBRES(3))
IPCLW:[0.30861]{-}[WAIT]:PROTO: [1683419652831702]ipclw_data_chunk_process:1165 Discarding msg with seq # 3515351735, expecting 544659863
*** 2023-05-07T08:34:13.347836+08:00 (PDBRES(3))
IPCLW:[0.30862]{-}[WAIT]:PROTO: [1683419653347772]ipclw_data_chunk_process:1165 Discarding msg with seq # 3515352003, expecting 544659863
IPCLW:[0.30863]{-}[WAIT]:PROTO: [1683419653863530]ipclw_data_chunk_process:1165 Discarding msg with seq # 3515352455, expecting 544659863
*** 2023-05-07T08:34:14.379122+08:00 (PDBRES(3))
IPCLW:[0.30864]{-}[WAIT]:PROTO: [1683419654379038]ipclw_data_chunk_process:1165 Discarding msg with seq # 3515352766, expecting 544659863
IPCLW:[0.30865]{-}[WAIT]:PROTO: [1683419654896105]ipclw_data_chunk_process:1165 Discarding msg with seq # 3515353098, expecting 544659863
*** 2023-05-07T08:34:15.409055+08:00 (PDBRES(3))
IPCLW:[0.30866]{-}[WAIT]:PROTO: [1683419655408955]ipclw_data_chunk_process:1165 Discarding msg with seq # 3515353384, expecting 544659863
IPCLW:[0.30867]{-}[WAIT]:PROTO: [1683419655926455]ipclw_data_chunk_process:1165 Discarding msg with seq # 3515353701, expecting 544659863
*** 2023-05-07T08:34:16.437884+08:00 (PDBRES(3))
IPCLW:[0.30868]{-}[WAIT]:PROTO: [1683419656437736]ipclw_data_chunk_process:1165 Discarding msg with seq # 3515353931, expecting 544659863
IPCLW:[0.30869]{-}[WAIT]:PROTO: [1683419656951721]ipclw_data_chunk_process:1165 Discarding msg with seq # 3515354273, expecting 544659863
*** 2023-05-07T08:34:17.473483+08:00 (PDBRES(3))
IPCLW:[0.30870]{-}[WAIT]:PROTO: [1683419657473412]ipclw_data_chunk_process:1165 Discarding msg with seq # 3515354587, expecting 544659863
IPCLW:[0.30871]{-}[WAIT]:PROTO: [1683419657986986]ipclw_data_chunk_process:1165 Discarding msg with seq # 3515354878, expecting 544659863

时间与gen1开始反复等待gc cr block lost吻合。搜索mos发现与Bug 34482043 : SOME “GC CR MULTI BLOCK MIXED” WAIT OUTLIERS FOLLOWED BY “IPCLW DISCARDING MSG” CAUSE APERFORMANCE REGRESSION IN 19C非常相似。

@ RELEASE NOTES:
@ ]] CR Multi block read performance degradation.
@ ]]
@ .
@ INTERNAL PROBLEM DESCRIPTION:
@ Discarding out of order messages in IPCLW causes CR lost block. The block
@ will be requested
@ again only after receiving the side channel message which by default was
@ sent 500 milliseconds later.
@
@ .
@ INTERNAL FIX DESCRIPTION:
@ Deliver out of order messages if they are received within a certain window
@ size.

workaround是升级到19.18。

gc cr block lost只有gen1进程的sql出现过,sql是同一个,我忘记记录了,但是我很清楚的记得与user$.SPARE6有关系,个人猜测可能禁用掉”_disable_last_successful_login_time”特性也可以。

over!!!

 


评论

发表回复

您的电子邮箱地址不会被公开。 必填项已用 * 标注