加入收藏 | 设为首页 | 会员中心 | 我要投稿 PHP编程网 - 黄冈站长网 (http://www.0713zz.com/)- 数据应用、建站、人体识别、智能机器人、语音技术!
当前位置: 首页 > 百科 > 正文

log file sync等待超高案例浅析

发布时间:2020-12-31 02:45:29 所属栏目:百科 来源:网络整理
导读:监控工具DPA发现海外一台Oracle数据库服务器DB Commit Time指标告警,超过红色告警线(40毫秒左右,黄色告警是10毫秒,红色告警线是20毫秒),如下截图所示,生成了对应的时段的AWR报告,发现Top 5 Timed Events里面,log file sync等待事件的平均等待时间

监控工具DPA发现海外一台Oracle数据库服务器DB Commit Time指标告警,超过红色告警线(40毫秒左右,黄色告警是10毫秒,红色告警线是20毫秒),如下截图所示,生成了对应的时段的AWR报告,发现Top 5 Timed Events里面,log file sync等待事件的平均等待时间为37毫秒,log file parallel write等待事件的平均等待时间为40毫秒

?

log file sync等待超高案例浅析

?

log file sync等待超高案例浅析

?

?

如果对Tanel Poder::Understanding LGWR,Log File Sync Waits and Commit Performance这篇文章所讲述的内容很熟悉的话(经典图如下),那么通过等待事件log file sync与log file parallel write的Avg Wait(ms)指标: 37ms & 40ms,基本上可以判断就是redo log所在的磁盘I/O出现了性能问题

?

?

log file sync等待超高案例浅析

?

?

然后在用lfsdiag.sql脚本分析一下详细的统计数据,如下所示:

?

log file sync等待超高案例浅析

?

?

?

HISTOGRAM DATA?FOR LFS AND OTHER RELATED WAITS:
?
APPROACH: Look at the wait distribution for log file sync waits
by looking at "wait_time_milli". Look at the high wait times then
see if you can correlate those with other related wait events.
?? INST_ID EVENT??????????????????????????????????? WAIT_TIME_MILLI WAIT_COUNT
---------- ---------------------------------------- --------------- ----------
???????? 1 log file sync????????????????????????????????????????? 1?????? 4363
???????? 1 log file sync????????????????????????????????????????? 2??????? 835
???????? 1 log file sync????????????????????????????????????????? 4?????? 1650
???????? 1 log file sync????????????????????????????????????????? 8?????? 4937
???????? 1 log file sync???????????????????????????????????????? 16???? 146252
???????? 1 log file sync???????????????????????????????????????? 32???? 606674
???????? 1 log file sync???????????????????????????????????????? 64???? 263377
???????? 1 log file sync??????????????????????????????????????? 128???? 253254
???????? 1 log file sync??????????????????????????????????????? 256????????? 2
???????? 1 log file switch completion???????????????????????????? 1??????? 124
???????? 1 log file switch completion???????????????????????????? 2????????? 9
???????? 1 log file switch completion???????????????????????????? 4???????? 19
???????? 1 log file switch completion???????????????????????????? 8???????? 21
???????? 1 log file switch completion??????????????????????????? 16???????? 35
???????? 1 log file switch completion??????????????????????????? 32???????? 97
???????? 1 log file switch completion??????????????????????????? 64??????? 133
???????? 1 log file switch completion?????????????????????????? 128??????? 326
???????? 1 log file switch completion?????????????????????????? 256?????? 1736
???????? 1 log file switch completion?????????????????????????? 512?????? 3042
???????? 1 log file switch completion????????????????????????? 1024?????? 2020
???????? 1 log file parallel write??????????????????????????????? 1????????? 0
???????? 1 log file parallel write??????????????????????????????? 2????????? 0
???????? 1 log file parallel write??????????????????????????????? 4???????? 80
???????? 1 log file parallel write??????????????????????????????? 8?????? 2142
???????? 1 log file parallel write?????????????????????????????? 16???? 170987
???????? 1 log file parallel write?????????????????????????????? 32???? 779205
???????? 1 log file parallel write?????????????????????????????? 64???? 311463
???????? 1 log file parallel write????????????????????????????? 128????? 79688
???????? 1 log file parallel write????????????????????????????? 256????? 42763
???????? 1 log file parallel write????????????????????????????? 512????? 13052
???????? 1 log file parallel write???????????????????????????? 1024????? 20468
???????? 1 log file parallel write???????????????????????????? 2048????? 14020
???????? 1 log file parallel write???????????????????????????? 4096??????? 921
???????? 1 log file parallel write???????????????????????????? 8192???????? 96
???????? 1 log file parallel write??????????????????????????? 16384???????? 18
???????? 1 log file parallel write??????????????????????????? 32768???????? 18
???????? 1 log file parallel write??????????????????????????? 65536????????? 8
???????? 1 log file parallel write?????????????????????????? 131072????????? 2
???????? 1 LGWR wait for redo copy??????????????????????????????? 1?????? 8516
???????? 1 LGWR wait for redo copy??????????????????????????????? 2???????? 20
???????? 1 LGWR wait for redo copy??????????????????????????????? 4???????? 19
???????? 1 LGWR wait for redo copy??????????????????????????????? 8???????? 20
???????? 1 LGWR wait for redo copy?????????????????????????????? 16???????? 11
ORDERED BY WAIT_TIME_MILLI
???????? 1 LGWR wait for redo copy??????????????????????????????? 1?????? 8516
???????? 1 log file sync????????????????????????????????????????? 2??????? 835
???????? 1 log file switch completion???????????????????????????? 2????????? 9
???????? 1 log file parallel write??????????????????????????????? 2????????? 0
???????? 1 LGWR wait for redo copy??????????????????????????????? 4???????? 19
???????? 1 log file sync????????????????????????????????????????? 8?????? 4937
???????? 1 log file switch completion???????????????????????????? 8???????? 21
???????? 1 log file parallel write??????????????????????????????? 8?????? 2142
???????? 1 LGWR wait for redo copy?????????????????????????????? 16???????? 11
???????? 1 log file sync???????????????????????????????????????? 32???? 606674
???????? 1 log file switch completion??????????????????????????? 32???????? 97
???????? 1 log file parallel write?????????????????????????????? 32???? 779205
???????? 1 log file sync???????????????????????????????????????? 64???? 263377
???????? 1 log file switch completion??????????????????????????? 64??????? 133
???????? 1 log file parallel write?????????????????????????????? 64???? 311463
???????? 1 log file sync??????????????????????????????????????? 128???? 253254
???????? 1 log file switch completion?????????????????????????? 128??????? 326
???????? 1 log file parallel write????????????????????????????? 128????? 79688
???????? 1 log file sync??????????????????????????????????????? 256????????? 2
???????? 1 log file switch completion?????????????????????????? 256?????? 1736
???????? 1 log file parallel write????????????????????????????? 256????? 42763
???????? 1 log file switch completion????????????????????????? 1024?????? 2020
???????? 1 log file parallel write???????????????????????????? 1024????? 20468
???????? 1 log file parallel write???????????????????????????? 2048????? 14020
???????? 1 log file parallel write???????????????????????????? 4096??????? 921
???????? 1 log file parallel write???????????????????????????? 8192???????? 96
???????? 1 log file parallel write??????????????????????????? 16384???????? 18
???????? 1 log file parallel write??????????????????????????? 32768???????? 18
???????? 1 log file parallel write??????????????????????????? 65536????????? 8
???????? 1 log file parallel write?????????????????????????? 131072????????? 2
REDO WRITE STATS
"redo write?time" in centiseconds (100 per second)
11.1: "redo write broadcast ack time" in centiseconds (100 per second)
11.2: "redo write broadcast ack time" in microseconds (1000 per millisecond)
VERSION????????????? INST_ID NAME???????????????????????????????????????????????????? VALUE??????? MILLISECONDS
----------------- ---------- ---------------------------------------- --------------------- -------------------
10.2.0.5.0???????????????? 1 redo write time??????????????????????????????????????? 9551524??????? 95515240.000
10.2.0.5.0???????????????? 1 redo writer latching time?????????????????????????????????? 51
10.2.0.5.0???????????????? 1 redo writes??????????????????????????????????????????? 1434931
AWR WORST AVG LOG FILE SYNC SNAPS:

?

上面数据可以看到,log file sync等待事件数量最多的是32ms这个区间的,log file parallel write等待事件发生最多的也是32ms这个区间的,其实这个值已经远远超过7ms,极其不正常。log file parallel write?事件是LGWR进程专属的等待事件,发生在LGWR将log_buffer中的重做日志信息写入联机重做日志文件组的成员文件,LGWR在该事件上等待该写入过程的完成。该事件等待时间过长,说明日志文件所在磁盘缓慢或存在争用。log file sync和log file parallel write是相互关联的。换句话讲,假设log file parallel write的时间非常长,那么必定导致log file sync等待时间拉长。如果log file parallel write 等待非常高,那么可能一般是物理磁盘I/O的问题

?

另外,我们也检查了一下redo log的切换频率,如下所示,redo log切换的次数并不频繁,生成的归档日志的量也并不大。大部分时候一小时切换零次或一次。

?

然后我们找了一台机器(上述指标正常的服务器)简单对测试了一下IO的速度,这个方法极其简单,就是看看生成一个大文件需要多长时间,简单测试一下I/O性能(没有考虑cache等,测试采样也不详尽),但是对比数据也基本能验证、反馈磁盘IO存在问题)。

?

?

?

问题服务器:

?

# time dd if=/dev/zero of=./test bs=512k count=2048 oflag=direct

2048+0 records in

2048+0 records out

1073741824 bytes (1.1 GB) copied,88.271 seconds,12.2 MB/s

?

real??? 1m28.273s

user??? 0m0.010s

sys???? 0m0.655s

?

?

?

对比服务器(正常的服务器):

?

?

?

?

# time dd if=/dev/zero of=./test bs=512k count=2048 oflag=direct

2048+0 records in

2048+0 records out

1073741824 bytes (1.1 GB) copied,2.48344 seconds,432 MB/s

?

real??? 0m2.485s

user??? 0m0.004s

sys???? 0m0.386s

?

?

? 如上对比所示,两台服务器生成同样一个大小文件,耗费的时间,I/O性能差别非常大,完全验证了告警的服务器所在的存储I/O存在性能问题,但是公司分工非常明确,DBA也不清楚底层存储出了什么问题,只能将这个问题反馈出来,等待海外负责维护系统和存储的同事的回复。

(编辑:PHP编程网 - 黄冈站长网)

【声明】本站内容均来自网络,其相关言论仅代表作者个人观点,不代表本站立场。若无意侵犯到您的权利,请及时与联系站长删除相关内容!

    热点阅读