Ai
1 Star 1 Fork 0

冯际成/Oracle_SQL_Scripts

加入 Gitee
与超过 1200万 开发者一起发现、参与优秀开源项目,私有仓库也完全免费 :)
免费加入
文件
该仓库未声明开源许可证文件(LICENSE),使用请关注具体项目描述及其代码上游依赖。
克隆/下载
latchprof_output.txt 11.69 KB
一键复制 编辑 原始数据 按行查看 历史
Abdul mohammed 提交于 2016-02-06 08:58 +08:00 . Fri Feb 5 18:58:16 CST 2016
LatchProf takes four parameters:
Parameter 1 specifies which columns from V$LATCHHOLDER to report and group by. In the case below I just want to report latch holds by latch name (and not even break it down by SID for starters).
Parameter 2 specifies which SIDs to monitor. In the case below, I am interested in any SID which holds a latch (%).
Parameter 3 specifies which latches to monitor. This can be set either to latch name or latch address in memory. In the case below, I monitor all latches (%).
Parameter 4 specifies how many times to sample V$LATCHHOLDER. I use 100000 samples below, which completed in a couple of seconds on my test database. The sampling speed depends on your server CPU/memory bus speed and the value of processes parameter. You should start from lower number like 1000 and adjust it so that LatchProf would complete its sampling in a couple of seconds, and that is usually enough for diagnosing ongoing latch contention problems. You shouldn't keep sampling for long periods since LatchProf runs constantly on the CPU.
Just for demo purposes I start with listing general latchholder stats (not broken down by holding SID):
SQL> @latchprof name % % 100000
NAME Held Gets Held % Held ms Avg hold ms
------------------------------ ---------- ---------- ------- ----------- -----------
cache buffers chains 21248 17418 21.25 267.725 .015
simulator lru latch 7192 7169 7.19 90.619 .013
simulator hash latch 66 66 .07 .832 .013
enqueue hash chains 4 4 .00 .050 .013
ges resource hash list 3 3 .00 .038 .013
process allocation 2 2 .00 .025 .013
library cache 2 2 .00 .025 .013
name-service namespace bucket 2 2 .00 .025 .013
name-service memory objects 1 1 .00 .013 .013
ges caches resource lists 1 1 .00 .013 .013
library cache pin 1 1 .00 .013 .013
11 rows selected.
The output column meanings are as follows:
Name - Latch name
Held - During how many samples out of total samples (100000) the particular latch was held by somebody
Gets - How many latch gets against that latch were detected during LatchProf sampling
Held % - How much % of time was the latch held by somebody during the sampling. This is the main column you want to be looking at in order to see who/what holds the latch the most (the latchprof output is reverse-ordered by that column)
Held ms - How many milliseconds in total was the latch held during the sampling
Avg hold ms - Average latch hold time in milliseconds (normally latches are held from a few to few hundred microseconds)
LatchProf allows us to break down the latch-holder statistics by SID,
and because I am interested only in the library cache latches for now, I can
monitor only this latch (as shown in bold below):
SQL> @latchprof sid,name % library cache 100000
-- LatchProf 1.20 by Tanel Poder ( http://www.tanelpoder.com )
SID NAME Held Gets Held % Avg hold ms
---------- -------------------- ---------- ---------- ------- -------------
120 library cache 5059 4618 5.06 .037
120 library cache lock 189 188 .19 .034
120 library cache pin 135 135 .14 .034
139 library cache lock 131 131 .13 .034
139 library cache 109 109 .11 .034
113 library cache lock 60 60 .06 .034
113 library cache 48 48 .05 .034
7 rows selected.
lets break down the individual latch holding sessions now:
SQL> @latchprof sid,name % % 100000
SID NAME Held Gets Held % Held ms Avg hold ms
---------- ------------------------------ ---------- ---------- ------- ----------- -----------
81 cache buffers chains 14916 14915 14.92 187.942 .013
81 simulator lru latch 6797 6745 6.80 85.642 .013
85 cache buffers chains 6195 2284 6.20 78.057 .034
85 simulator lru latch 524 506 .52 6.602 .013
81 simulator hash latch 121 85 .12 1.525 .018
112 name-service namespace bucket 2 2 .00 .025 .013
100 enqueue hash chains 2 2 .00 .025 .013
98 library cache 2 2 .00 .025 .013
112 process allocation 2 2 .00 .025 .013
112 ges caches resource lists 1 1 .00 .013 .013
112 enqueues 1 1 .00 .013 .013
104 messages 1 1 .00 .013 .013
100 cache buffers chains 1 1 .00 .013 .013
100 active service list 1 1 .00 .013 .013
100 enqueues 1 1 .00 .013 .013
85 simulator hash latch 1 1 .00 .013 .013
100 library cache 1 1 .00 .013 .013
17 rows selected.
Lets include latch address (child latch address) into picture and query only cache related latches. This returns lots of rows, so I normally press CTRL+C after first pageful of results as they are most important (the results are ordered by most busy ones first):
SQL> @latchprof sid,name,laddr % cache 100000
SID NAME LADDR Held Gets Held % Held ms Avg hold ms
---------- ------------------------------ -------- ---------- ---------- ------- ----------- -----------
81 cache buffers chains 41AACEEC 15061 15017 15.06 186.756 .012
85 cache buffers chains 41B2C37C 34 34 .03 .422 .012
85 cache buffers chains 41B85D64 31 31 .03 .384 .012
85 cache buffers chains 41BCF7FC 30 30 .03 .372 .012
85 cache buffers chains 41B415EC 28 28 .03 .347 .012
85 cache buffers chains 41BCA658 25 25 .03 .310 .012
85 cache buffers chains 41B4E738 25 25 .03 .310 .012
85 cache buffers chains 41AE4694 25 25 .03 .310 .012
85 cache buffers chains 41AB02E0 25 9 .03 .310 .034
85 cache buffers chains 427FA1F8 24 24 .02 .298 .012
85 cache buffers chains 427F6E04 22 22 .02 .273 .012
85 cache buffers chains 41B78D94 22 22 .02 .273 .012
85 cache buffers chains 41BA2220 21 21 .02 .260 .012
85 cache buffers chains 41BEDA68 21 21 .02 .260 .012
85 cache buffers chains 41BC8D1C 21 21 .02 .260 .012
85 cache buffers chains 41B486BC 20 20 .02 .248 .012
85 cache buffers chains 42B99698 19 19 .02 .236 .012
85 cache buffers chains 41AE6EA8 19 14 .02 .236 .017
85 cache buffers chains 41B1F230 19 19 .02 .236 .012
85 cache buffers chains 41B09E44 19 17 .02 .236 .014
85 cache buffers chains 41AE7A88 19 19 .02 .236 .012
85 cache buffers chains 41B69E98 18 18 .02 .223 .012
[...snip...]
And voila, we see SID 81 is hitting one single child latch in maniac fashion while SID 85 is behaving more normally, just hitting lots of different CBC latches (you see the number of gets of a single child during 100k sampling doesnt exceed 34). Interesting.
So now that we know the child latch address (LADDR), we can refine our query to only monitor that child latch to see if any other sessions could be affected:
SQL> @latchprof sid,name,laddr % 41AACEEC 100000
SID NAME LADDR Held Gets Held % Held ms Avg hold ms
---------- ------------------------------ -------- ---------- ---------- ------- ----------- -----------
81 cache buffers chains 41AACEEC 14058 14047 14.06 177.131 .013
85 cache buffers chains 41AACEEC 18 18 .02 .227 .013
As of now it looks like noone else except the maniac SID 81 and victim SID 85 contend for the latch (otherwise one would expect to see at least few successful holds against that latch).
So, as SID 81 is so evidently the troublemaker here, lets check what its doing:
SQL> select sql_text from v$sql where hash_value = (select sql_hash_value from v$session where sid = 81);
SQL_TEXT
---------------------------------------------------------------------------------------------------------
select count(*) X from kill_cpu connect by n > prior n start with n = 1
1 row selected.
Let's assume that we could get this code fixed, but our monitoring system still reports some ongoing library cache latch contention and we don't see an obvious troublemaker anymore (all sessions reported are holding the monitored latches roughly equal amount of time) as seen below:
I will proceed with option 2, and will just specify SQLID (or SQLHASH in 9i) in the first parameter to LatchProf:
SQL> @latchprof name,sqlid % "library cache" 100000
-- LatchProf 1.20 by Tanel Poder ( http://www.tanelpoder.com )
NAME SQLID Held Gets Held % Avg hold ms
-------------------- ------------- ----- ------ ------- -----------
library cache 5csdgx2jquqvj 858 858 .86 .028
library cache lock 8vdny41vqbq7t 377 377 .38 .028
library cache lock 5csdgx2jquqvj 369 369 .37 .028
This shows me that the main statement causing library cache latches to be held has SQL_ID '5csdgx2jquqvj'.
Using the SQL ID I can find the text of the query as shown below:
SQL> select executions, sql_text from v$sql where sql_id = '5csdgx2jquqvj';
EXECUTIONS SQL_TEXT
---------- -------------------------------------------------
40522096 select count(*) from dual where rownum = 1
Apparently someone is running the above query really frequently! (Of course I should see how much the executions increases right now instead of just looking into the total number of executions since that cursor was loaded into the library cache).
The common reasons for such frequently re-executed, but useless SQL can be either someone's really bad piece of code or bad connection pool design feature, which constantly polls DUAL table to verify if the connection is still alive.
Loading...
马建仓 AI 助手
尝试更多
代码解读
代码找茬
代码优化
1
https://gitee.com/sapall/Oracle_SQL_Scripts.git
git@gitee.com:sapall/Oracle_SQL_Scripts.git
sapall
Oracle_SQL_Scripts
Oracle_SQL_Scripts
master

搜索帮助