本案例来自东区某客户,数据库版本为rac 19.7。周日业务反应有大量业务阻塞,通过一线同事发来的wait chain信息可以看到大量的row cache lock(CID=16)和library cache lock,并且final blocking event是row cache lock。final block session是not wait的。
首先row cache lock在之前的一篇row cache lock分析中已经介绍过了基本分析思路,这里就不多说明了,直接进入正题http://www.minniebaby.tech/2021/10/27/%e6%a1%88%e4%be%8b%ef%bc%9a%e4%b8%80%e6%ac%a1row-cache-lock%e5%bc%82%e5%b8%b8%e5%a4%84%e7%90%86/
ash分析:
select instance_NUMBER,program,event,sql_id,sql_exec_start,min(sample_time),max(sample_time),seq#,count(*)*10 from dba_hist_active_sess_history where event='row cache lock' and p3=5 and sample_time>trunc(sysdate-1) group by instance_NUMBER,program,event,sql_id,sql_exec_start,seq# order by sql_exec_start; INSTANCE_NUMBER PROGRAM EVENT SQL_ID SQL_EXEC_START MIN(SAMPLE_TIME) MAX(SAMPLE_TIME) SEQ# COUNT(*)*10 --------------- ------------------------------ -------------------- ------------- ------------------- ------------------------------ ------------------------------ ---------- ----------- 2 oracle@prdres2 (J003) row cache lock b6usrg82hwsa3 2023-05-07 10:07:45 07-MAY-23 10.09.03.948 AM 07-MAY-23 04.14.06.719 PM 1239 21890 2 oracle@prdres2 (J003) row cache lock b6usrg82hwsa3 2023-05-07 10:07:45 07-MAY-23 04.14.16.729 PM 07-MAY-23 06.31.25.760 PM 1241 8230
final block session(INSTANCE_NUMBER=2、SID=2018)执行的是自动统计信息任务。以X模式请求dc_histogram_defs的row cache lock,并且持续了非常长时间的请求。一直持续到客户重启数据库。
*** 2023-05-07T10:55:08.609919+08:00 ((1)) (session) sid: 2018 ser: 26444 trans: 0x2b8ee7da0, creator: 0x310d9c5a8 flags: (0x8210041) USR/- flags2: (0x40009) -/-/INC flags_idl: (0x1) status: BSY/-/-/- kill: -/-/-/- DID: 0002-00B0-0003F5E70002-00B0-0003F5E6, short-term DID: txn branch: (nil) con_id/con_uid/con_name: 3/2615539819/PDBRES con_logonuid: 2615539819 con_logonid: 3 con_scuid: 1 con_scid: 1 edition#: 134 user#/name: 0/SYS oct: 170, prv: 0, sql: 0x19a891ae8, psql: 0x48a3146c0 stats: 0x1bdfa4d10, PX stats: 0x12f05d04 service name: SYS$USERS client details: O/S info: user: oracle, term: UNKNOWN, ospid: 392 machine: prdres2 program: oracle@prdres2 (J003) application name: DBMS_SCHEDULER, hash value=2478762354 action name: ORA$AT_OS_OPT_SY_5210, hash value=513032857 Current Wait Stack: 0: waiting for 'row cache lock' cache id=0x10, mode=0x0, request=0x5 pdb uid=0x9be5f86b, master hint=0x0 wait_id=1230 seq_num=1239 snap_id=1 wait times: snap=46 min 13 sec, exc=46 min 13 sec, total=46 min 13 sec wait times: max=infinite, heur=46 min 13 sec wait counts: calls=5570 os=5570 in_wait=1 iflags=0x215a2 There are 5 sessions blocked by this session. Dumping one waiter: inst: 2, sid: 3781, ser: 11134 wait event: 'library cache lock' p1: 'handle address'=0x18c056a30 p2: 'lock address'=0x17d3fdf58 p3: '100*mode+namespace'=0x1600010002 row_wait_obj#: 4294967295, block#: 0, row#: 0, file# 0 min_blocked_time: 97 secs, waiter_cache_ver: 49043 Wait State: fixed_waits=0 flags=0x22 boundary=(nil)/-1 Session Wait History: elapsed time of 0.000027 sec since current wait 0: waited for 'row cache lock' cache id=0x10, mode=0x0, request=0x5 pdb uid=0x9be5f86b, master hint=0x0 wait_id=1229 seq_num=1238 snap_id=1 wait times: snap=0.000211 sec, exc=0.000211 sec, total=0.000211 sec wait times: max=0.000000 sec wait counts: calls=2 os=2 occurred after 0.000012 sec of elapsed time 1: waited for 'KJC: Wait for msg sends to complete' msg=0x353bf7860, dest|rcvr=0x10000, mtype=0x6a wait_id=1228 seq_num=1237 snap_id=1 wait times: snap=0.000045 sec, exc=0.000045 sec, total=0.000045 sec wait times: max=5 min 1 sec wait counts: calls=2 os=2 occurred after 0.000004 sec of elapsed time 2: waited for 'DLM cross inst call completion' caller instance number=0x2, cluster incarnation number=0x7, request identifier=0x14f735f wait_id=1227 seq_num=1236 snap_id=1 wait times: snap=0.000004 sec, exc=0.000004 sec, total=0.000004 sec wait times: max=0.000000 sec wait counts: calls=0 os=0 occurred after 0.000021 sec of elapsed time 3: waited for 'row cache lock' cache id=0x10, mode=0x0, request=0x5 pdb uid=0x9be5f86b, master hint=0x0 wait_id=1226 seq_num=1235 snap_id=1 wait times: snap=0.000269 sec, exc=0.000269 sec, total=0.000269 sec wait times: max=0.000000 sec wait counts: calls=2 os=2 occurred after 0.000033 sec of elapsed time 4: waited for 'KJC: Wait for msg sends to complete' msg=0x353afe5a8, dest|rcvr=0x10000, mtype=0x6a wait_id=1225 seq_num=1234 snap_id=1 wait times: snap=0.000042 sec, exc=0.000042 sec, total=0.000042 sec wait times: max=5 min 1 sec wait counts: calls=2 os=2 occurred after 0.000013 sec of elapsed time 5: waited for 'DLM cross inst call completion' caller instance number=0x2, cluster incarnation number=0x7, request identifier=0x14f735e wait_id=1224 seq_num=1233 snap_id=1 wait times: snap=0.000011 sec, exc=0.000011 sec, total=0.000011 sec wait times: max=0.000000 sec wait counts: calls=0 os=0 occurred after 0.000064 sec of elapsed time 6: waited for 'row cache lock' cache id=0x10, mode=0x0, request=0x5 pdb uid=0x9be5f86b, master hint=0x0 wait_id=1223 seq_num=1232 snap_id=1 wait times: snap=0.000219 sec, exc=0.000219 sec, total=0.000219 sec wait times: max=0.000000 sec wait counts: calls=2 os=2 occurred after 0.000016 sec of elapsed time 7: waited for 'KJC: Wait for msg sends to complete' msg=0x353f0abe0, dest|rcvr=0x10000, mtype=0x6a wait_id=1222 seq_num=1231 snap_id=1 wait times: snap=0.000057 sec, exc=0.000057 sec, total=0.000057 sec wait times: max=5 min 1 sec wait counts: calls=2 os=2 occurred after 0.000007 sec of elapsed time 8: waited for 'DLM cross inst call completion' caller instance number=0x2, cluster incarnation number=0x7, request identifier=0x14f735d wait_id=1221 seq_num=1230 snap_id=1 wait times: snap=0.000004 sec, exc=0.000004 sec, total=0.000004 sec wait times: max=0.000000 sec wait counts: calls=0 os=0 occurred after 0.000035 sec of elapsed time 9: waited for 'row cache lock' cache id=0x10, mode=0x0, request=0x5 pdb uid=0x9be5f86b, master hint=0x0 wait_id=1220 seq_num=1229 snap_id=1 wait times: snap=0.000372 sec, exc=0.000372 sec, total=0.000372 sec wait times: max=0.000000 sec wait counts: calls=1 os=1 occurred after 0.000752 sec of elapsed time
10点55分已经等待了46分钟的row cache lock
- P1=16(dc_histogram_defs)
- P3=5(X模式)
SOC: 0x8cb9e628, type: row cache enqueues (111), map: 0x17bf35d28 state: LIVE (0x99fc), flags: INIT (0x1) row cache enqueue: count=1 session=0x351263950 object=0x197150cc0, request=X flag=03 WAI/TRC/-/-/-/-/-/- savepoint=0x4076e type=MULTI-INSTANCE instance locked=F handle=0x8cb9e6f8 row cache parent object: addr=0x197150cc0 cid=16(dc_histogram_defs) conid=3 conuid=2615539819 hash=a3de4515 typ=61 transaction=(nil) flags=00000002 inc=1, pdbinc=1 version=2 mtx version=15688 own=0x197150d90[0x197150d90,0x197150d90] wat=0x197150da0[0x8cb9e6a8,0x8cb9e6a8] mode=N req=X status=VALID/-/-/-/-/-/-/-/- KGH pinned instance lock=QQ 6a7e84e3 f2908b26 set=0, complete=FALSE set=1, complete=FALSE
hash=a3de4515这个是关键信息,用于后续分析的匹配和搜索。
那么为什么J003请求不到X模式的row cache lock呢?原因只能是当前有进程正在以S或者X模式持有了该队列。那么在ash blocking为空的情况下如何找出这个进程呢?
我的方法是使用hash=a3de4515去搜搜trace下面的所有trace看看能不能有发现,最终在diag trace中找到了持有的进程。
SOC: 0x3398f0508, type: row cache enqueues (111), map: 0x19faa3ba8 state: LIVE (0x99fc), flags: INIT (0x1) row cache enqueue: count=1 session=0x1c106b318 object=0x4d9d89fb0, mode=S flag=80 -/-/-/-/-/-/-/ILK savepoint=0x8fe64da type=MULTI-INSTANCE instance locked=T handle=0x3398f05d8 row cache parent object: addr=0x4d9d89fb0 cid=16(dc_histogram_defs) conid=3 conuid=2615539819 hash=a3de4515 typ=61 transaction=(nil) flags=00000002 inc=1, pdbinc=1 version=2 mtx version=698 own=0x4d9d8a080[0x3398f0588,0x3398f0588] wat=0x4d9d8a090[0x4d9d8a090,0x4d9d8a090] mode=S req=N status=VALID/-/-/-/-/-/-/-/- KGH pinned instance lock=QQ 6a7e84e3 f2908b26 set=0, complete=FALSE set=1, complete=FALSE
hash和instance lock和请求的完全匹配,通过SO和OWNER,往上搜索可以定位到process为GEN1,再通过process的SO可以找到gen1的session state信息。
SO: 0x1bff52c30, type: session (4), map: 0x1c106b318 state: LIVE (0x4532), flags: 0x1 owner: 0x1bfee0180, proc: 0x1bfee0180 link: 0x1bff52c50[0x1bf6fb2a8, 0x1bf6fb228] child list count: 7, link: 0x1bff52ca0[0x19faa3a48, 0x1bd6d1020] conid: 1, conuid: 1, SGA version=(1,0), pg: 0 SOC: 0x1c106b318, type: session (4), map: 0x1bff52c30 state: LIVE (0x99fc), flags: INIT (0x1) (session) sid: 631 ser: 35354 trans: (nil), creator: 0x1b0ea3d88 flags: (0x51) USR/- flags2: (0x409) -/-/INC flags_idl: (0x1) status: BSY/-/-/- kill: -/-/-/- 0001-000A-000000020001-000A-00000003 DID: , short-term DID: txn branch: (nil) con_id/con_uid/con_name: 3/2615539819/PDBRES con_logonuid: 1 con_logonid: 1 con_scuid: 1 con_scid: 1 edition#: 134 user#/name: 0/SYS oct: 3, prv: 0, sql: 0x3398059e8, psql: (nil) stats: 0x1dfe645a0, PX stats: 0x12f05d04 service name: SYS$BACKGROUND Current Wait Stack: 0: waiting for 'gc cr request' file#=0x9, block#=0x8a06, class#=0x1 wait_id=50342261 seq_num=11438 snap_id=1 wait times: snap=0.147084 sec, exc=0.147084 sec, total=0.147084 sec wait times: max=infinite, heur=0.147084 sec wait counts: calls=4 os=4 in_wait=1 iflags=0x5a0 Wait State: fixed_waits=0 flags=0x22 boundary=(nil)/-1 Session Wait History: elapsed time of 0.000099 sec since current wait 0: waited for 'gc cr block lost' =0x9, =0x8a06, =0x1 wait_id=50342260 seq_num=11437 snap_id=1 wait times: snap=0.512788 sec, exc=0.512788 sec, total=0.512788 sec wait times: max=infinite wait counts: calls=5 os=5 occurred after 0.000090 sec of elapsed time 1: waited for 'gc cr block lost' =0x9, =0x8a06, =0x1 wait_id=50342259 seq_num=11436 snap_id=1 wait times: snap=0.515190 sec, exc=0.515190 sec, total=0.515190 sec wait times: max=infinite wait counts: calls=5 os=5 occurred after 0.000062 sec of elapsed time 2: waited for 'gc cr block lost' =0x9, =0x8a06, =0x1 wait_id=50342258 seq_num=11435 snap_id=1 wait times: snap=0.518847 sec, exc=0.518847 sec, total=0.518847 sec wait times: max=infinite wait counts: calls=5 os=5 occurred after 0.000065 sec of elapsed time 3: waited for 'gc cr block lost' =0x9, =0x8a06, =0x1 wait_id=50342257 seq_num=11434 snap_id=1 wait times: snap=0.517242 sec, exc=0.517242 sec, total=0.517242 sec wait times: max=infinite wait counts: calls=5 os=5 occurred after 0.000060 sec of elapsed time 4: waited for 'gc cr block lost' =0x9, =0x8a06, =0x1 wait_id=50342256 seq_num=11433 snap_id=1 wait times: snap=0.512952 sec, exc=0.512952 sec, total=0.512952 sec wait times: max=infinite wait counts: calls=5 os=5 occurred after 0.000060 sec of elapsed time 5: waited for 'gc cr block lost' =0x9, =0x8a06, =0x1 wait_id=50342255 seq_num=11432 snap_id=1 wait times: snap=0.515456 sec, exc=0.515456 sec, total=0.515456 sec wait times: max=infinite wait counts: calls=5 os=5 occurred after 0.000074 sec of elapsed time 6: waited for 'gc cr block lost' =0x9, =0x8a06, =0x1 wait_id=50342254 seq_num=11431 snap_id=1 wait times: snap=0.515304 sec, exc=0.515304 sec, total=0.515304 sec wait times: max=infinite wait counts: calls=5 os=5 occurred after 0.000063 sec of elapsed time 7: waited for 'gc cr block lost' =0x9, =0x8a06, =0x1 wait_id=50342253 seq_num=11430 snap_id=1 wait times: snap=0.517334 sec, exc=0.517334 sec, total=0.517334 sec wait times: max=infinite wait counts: calls=5 os=5 occurred after 0.000057 sec of elapsed time 8: waited for 'gc cr block lost' =0x9, =0x8a06, =0x1 wait_id=50342252 seq_num=11429 snap_id=1 wait times: snap=0.517039 sec, exc=0.517039 sec, total=0.517039 sec wait times: max=infinite wait counts: calls=5 os=5 occurred after 0.000067 sec of elapsed time 9: waited for 'gc cr block lost' =0x9, =0x8a06, =0x1 wait_id=50342251 seq_num=11428 snap_id=1 wait times: snap=0.512986 sec, exc=0.512986 sec, total=0.512986 sec wait times: max=infinite wait counts: calls=5 os=5 occurred after 0.000066 sec of elapsed time Sampled Session History of session 631 serial 35354 --------------------------------------------------- The sampled session history is constructed by sampling the target session every 1 second. The sampling process captures at each sample if the session is in a non-idle wait, an idle wait, or not in a wait. If the session is in a non-idle wait then one interval is shown for all the samples the session was in the same non-idle wait. If the session is in an idle wait or not in a wait for consecutive samples then one interval is shown for all the consecutive samples. Though we display these consecutive samples in a single interval the session may NOT be continuously idle or not in a wait (the sampling process does not know). The history is displayed in reverse chronological order. sample interval: 1 sec, max history 120 sec --------------------------------------------------- [1 sample, 11:12:58] waited for 'gc cr request', seq_num: 11438 p1: 'file#'=0x9 p2: 'block#'=0x8a06 p3: 'class#'=0x1 [1 sample, 11:12:57] waited for 'gc cr block lost', seq_num: 11436 p1: ''=0x9 p2: ''=0x8a06 p3: ''=0x1 [1 sample, 11:12:56] waited for 'gc cr block lost', seq_num: 11434 p1: ''=0x9 p2: ''=0x8a06 p3: ''=0x1 [1 sample, 11:12:55] waited for 'gc cr block lost', seq_num: 11432 p1: ''=0x9 p2: ''=0x8a06 p3: ''=0x1 [1 sample, 11:12:54] waited for 'gc cr block lost', seq_num: 11430 p1: ''=0x9 p2: ''=0x8a06 p3: ''=0x1 [1 sample, 11:12:53] waited for 'gc cr block lost', seq_num: 11428 p1: ''=0x9 p2: ''=0x8a06 p3: ''=0x1 [1 sample, 11:12:52] waited for 'gc cr block lost', seq_num: 11426 p1: ''=0x9 p2: ''=0x8a06 p3: ''=0x1 [1 sample, 11:12:51] waited for 'gc cr block lost', seq_num: 11424 p1: ''=0x9 p2: ''=0x8a06 p3: ''=0x1 [1 sample, 11:12:50] waited for 'gc cr block lost', seq_num: 11422 p1: ''=0x9 p2: ''=0x8a06 p3: ''=0x1 [1 sample, 11:12:49] waited for 'gc cr block lost', seq_num: 11420 p1: ''=0x9 p2: ''=0x8a06 p3: ''=0x1 [1 sample, 11:12:48] waited for 'gc cr block lost', seq_num: 11418 p1: ''=0x9 p2: ''=0x8a06 p3: ''=0x1 [1 sample, 11:12:47] waited for 'gc cr block lost', seq_num: 11416 p1: ''=0x9 p2: ''=0x8a06 p3: ''=0x1 [1 sample, 11:12:46] waited for 'gc cr block lost', seq_num: 11414 p1: ''=0x9 p2: ''=0x8a06 p3: ''=0x1 [1 sample, 11:12:45] waited for 'gc cr block lost', seq_num: 11412 p1: ''=0x9 p2: ''=0x8a06 p3: ''=0x1 [1 sample, 11:12:44] waited for 'gc cr block lost', seq_num: 11410 p1: ''=0x9 p2: ''=0x8a06 p3: ''=0x1 [1 sample, 11:12:43] waited for 'gc cr block lost', seq_num: 11409 p1: ''=0x9 p2: ''=0x8a06 p3: ''=0x1 [1 sample, 11:12:42] waited for 'gc cr block lost', seq_num: 11407 p1: ''=0x9 p2: ''=0x8a06 p3: ''=0x1 [1 sample, 11:12:41] waited for 'gc cr block lost', seq_num: 11405 p1: ''=0x9 p2: ''=0x8a06 p3: ''=0x1 [1 sample, 11:12:40] waited for 'gc cr block lost', seq_num: 11403 p1: ''=0x9 p2: ''=0x8a06 p3: ''=0x1 [1 sample, 11:12:39] waited for 'gc cr block lost', seq_num: 11401 p1: ''=0x9 p2: ''=0x8a06 p3: ''=0x1 [1 sample, 11:12:38] waited for 'gc cr block lost', seq_num: 11399 p1: ''=0x9 p2: ''=0x8a06 p3: ''=0x1 [1 sample, 11:12:37] waited for 'gc cr block lost', seq_num: 11397 p1: ''=0x9 p2: ''=0x8a06 p3: ''=0x1 [1 sample, 11:12:36] waited for 'gc cr block lost', seq_num: 11395 p1: ''=0x9 p2: ''=0x8a06 p3: ''=0x1 [1 sample, 11:12:35] waited for 'gc cr block lost', seq_num: 11393 p1: ''=0x9 p2: ''=0x8a06 p3: ''=0x1 [1 sample, 11:12:34] waited for 'gc cr block lost', seq_num: 11391 p1: ''=0x9 p2: ''=0x8a06 p3: ''=0x1 [1 sample, 11:12:33] waited for 'gc cr block lost', seq_num: 11389 p1: ''=0x9 p2: ''=0x8a06 p3: ''=0x1 [1 sample, 11:12:32] waited for 'gc cr block lost', seq_num: 11387 p1: ''=0x9 p2: ''=0x8a06 p3: ''=0x1 [1 sample, 11:12:31] waited for 'gc cr block lost', seq_num: 11385 p1: ''=0x9 p2: ''=0x8a06 p3: ''=0x1 [1 sample, 11:12:30] waited for 'gc cr block lost', seq_num: 11383 p1: ''=0x9 p2: ''=0x8a06 p3: ''=0x1 [1 sample, 11:12:29] waited for 'gc cr block lost', seq_num: 11381 p1: ''=0x9 p2: ''=0x8a06 p3: ''=0x1 [1 sample, 11:12:28] waited for 'gc cr block lost', seq_num: 11380 p1: ''=0x9 p2: ''=0x8a06 p3: ''=0x1 [1 sample, 11:12:27] waited for 'gc cr block lost', seq_num: 11378 p1: ''=0x9 p2: ''=0x8a06 p3: ''=0x1 [1 sample, 11:12:26] waited for 'gc cr block lost', seq_num: 11376 p1: ''=0x9 p2: ''=0x8a06 p3: ''=0x1 [1 sample, 11:12:25] waited for 'gc cr block lost', seq_num: 11375 p1: ''=0x9 p2: ''=0x8a06 p3: ''=0x1 [1 sample, 11:12:24] waited for 'gc cr block lost', seq_num: 11373 p1: ''=0x9 p2: ''=0x8a06 p3: ''=0x1 [1 sample, 11:12:23] waited for 'gc cr block lost', seq_num: 11372 p1: ''=0x9 p2: ''=0x8a06 p3: ''=0x1 [1 sample, 11:12:22] waited for 'gc cr block lost', seq_num: 11370 p1: ''=0x9 p2: ''=0x8a06 p3: ''=0x1 [1 sample, 11:12:21] waited for 'gc cr block lost', seq_num: 11368 p1: ''=0x9 p2: ''=0x8a06 p3: ''=0x1 [1 sample, 11:12:20] waited for 'gc cr block lost', seq_num: 11366 p1: ''=0x9 p2: ''=0x8a06 p3: ''=0x1 [1 sample, 11:12:19] waited for 'gc cr block lost', seq_num: 11364 p1: ''=0x9 p2: ''=0x8a06 p3: ''=0x1
大量的gc cr block lost,通常出现block lost都与网络丢包有关系。但是我反复查看了oswnetstat,并未发现异常,而且gc cr block lost只在gen1进程的等待中出现。
查看gen1的trace发现异常阶段有大量的如下输出:
*** 2023-05-07T08:34:12.831763+08:00 (PDBRES(3)) IPCLW:[0.30861]{-}[WAIT]:PROTO: [1683419652831702]ipclw_data_chunk_process:1165 Discarding msg with seq # 3515351735, expecting 544659863 *** 2023-05-07T08:34:13.347836+08:00 (PDBRES(3)) IPCLW:[0.30862]{-}[WAIT]:PROTO: [1683419653347772]ipclw_data_chunk_process:1165 Discarding msg with seq # 3515352003, expecting 544659863 IPCLW:[0.30863]{-}[WAIT]:PROTO: [1683419653863530]ipclw_data_chunk_process:1165 Discarding msg with seq # 3515352455, expecting 544659863 *** 2023-05-07T08:34:14.379122+08:00 (PDBRES(3)) IPCLW:[0.30864]{-}[WAIT]:PROTO: [1683419654379038]ipclw_data_chunk_process:1165 Discarding msg with seq # 3515352766, expecting 544659863 IPCLW:[0.30865]{-}[WAIT]:PROTO: [1683419654896105]ipclw_data_chunk_process:1165 Discarding msg with seq # 3515353098, expecting 544659863 *** 2023-05-07T08:34:15.409055+08:00 (PDBRES(3)) IPCLW:[0.30866]{-}[WAIT]:PROTO: [1683419655408955]ipclw_data_chunk_process:1165 Discarding msg with seq # 3515353384, expecting 544659863 IPCLW:[0.30867]{-}[WAIT]:PROTO: [1683419655926455]ipclw_data_chunk_process:1165 Discarding msg with seq # 3515353701, expecting 544659863 *** 2023-05-07T08:34:16.437884+08:00 (PDBRES(3)) IPCLW:[0.30868]{-}[WAIT]:PROTO: [1683419656437736]ipclw_data_chunk_process:1165 Discarding msg with seq # 3515353931, expecting 544659863 IPCLW:[0.30869]{-}[WAIT]:PROTO: [1683419656951721]ipclw_data_chunk_process:1165 Discarding msg with seq # 3515354273, expecting 544659863 *** 2023-05-07T08:34:17.473483+08:00 (PDBRES(3)) IPCLW:[0.30870]{-}[WAIT]:PROTO: [1683419657473412]ipclw_data_chunk_process:1165 Discarding msg with seq # 3515354587, expecting 544659863 IPCLW:[0.30871]{-}[WAIT]:PROTO: [1683419657986986]ipclw_data_chunk_process:1165 Discarding msg with seq # 3515354878, expecting 544659863
时间与gen1开始反复等待gc cr block lost吻合。搜索mos发现与Bug 34482043 : SOME “GC CR MULTI BLOCK MIXED” WAIT OUTLIERS FOLLOWED BY “IPCLW DISCARDING MSG” CAUSE APERFORMANCE REGRESSION IN 19C非常相似。
@ RELEASE NOTES: @ ]] CR Multi block read performance degradation. @ ]] @ . @ INTERNAL PROBLEM DESCRIPTION: @ Discarding out of order messages in IPCLW causes CR lost block. The block @ will be requested @ again only after receiving the side channel message which by default was @ sent 500 milliseconds later. @ @ . @ INTERNAL FIX DESCRIPTION: @ Deliver out of order messages if they are received within a certain window @ size.
workaround是升级到19.18。
gc cr block lost只有gen1进程的sql出现过,sql是同一个,我忘记记录了,但是我很清楚的记得与user$.SPARE6有关系,个人猜测可能禁用掉”_disable_last_successful_login_time”特性也可以。
over!!!
发表回复