同样的查询每次都产生大量物理读的调优示例

2009/08/09 by admin 6 Comments

12月中旬用户反映综合传输网管库上的一个查询影响迟缓，具体现象表现为当多个用户在应用界面上同时点下查询后，结果返回耗时长，影响正常业务的运作。经过初步分析该操作主要的等待事件在db file sequential read上，为了进一步明确问题，我们在系统的高峰时段使用性能报告工具抓取了统计信息，以下为top3等待事件:

Event Waits Timeouts Time (s) (ms)

—————————- ———— ———- ———- ——

enqueue 542 402 1,406 2595

db file sequential read 446,099 0 391 1

db file scattered read 156,634 0 209 1

可以看到db file sequential read事件仅次于数据库队列事件为主要的数据库性能瓶颈，以下列出缓存占用较高的典型SQL:

Selecta.objectid,a.emsalarm_time,a.emsend_time,c.label_cn,c.alias,a.alarm_name,a.alm_devinfo from traph c,alarm_to_traph b,current_alarm a where a.cuid=b.related_alarm_cuid and b.related_traph_cuid=c.cuid and (c.ext_ids=’,8,’ or c.ext_ids=’,9,’ or c.ext_ids=’,12,’ or c.ext_ids=’,19,’ or c.ext_ids=’,25,’) and a.emsend_time is null and a.emsalarm_time > to_date(‘2008-11-19′,’yyyy-MM-dd’)

经过进一步追踪我们发现以上SQL正是应用界面点击查询所做的操作，这就明确了此次优化的主要目的，即分析并尽可能降低该语句所产生的物理读和逻辑读，保证应用的正常运行。

具体分析

该句查询语句涉及到三个表的连接，因为应用设计使用的是基于RULE的优化模式，故执行计划倾向于使用索引来代替全表扫描，在表与表的连接方式上倾向使用嵌套循环即（NESTED LOOP）,具体执行计划如下图：

经过查询缓存池中各个表的具体缓存状况发现，表ALARM_TO_TRAPH上的索引ALARM_TO_TRAPH_INDEX1与索引TRAPH_CUID均已被缓存，而在该执行计划中需要做全表扫描操作的CURRENT_ALARM则只有部分块被缓存，充分说明了引起物理读需求的正是对CURRENT_ALARM的全表扫描操作，为了进一步证实这一点，我们做了一次10046事件的trace,该事件可以记录SQL语句执行过程中详细的等待事件以及相关信息,在trace中发现大量db file sequential read等待时间，摘录如下：

以上记录中p1,p2对应了数据文件顺序读具体发生的文件号和块号，经过查询确定以上相关文件号和数据块号具对应于表CURRENT_ALARM,至此本次性能问题的主要原因已十分明确，即CURRENT_ALARM表未被完全缓存，引起问题的可能有两种：1.表中存在大量的chained rows即链式行;2.数据库高速缓存在系统高峰时段存在瓶颈。

对于前一种可能性，我们首先分析了表的结构，CURRENT_ALARM表包括96个列，其中包括大量varchar2(255)类型的长列，理论单行长度可能达到14K字节，的确可能引起严重的链式行，为了进一步证实，我们对该表做了一次链式行扫描，操作如下：

ANALYZE TABLE CURRENT_ALARM LIST CHAINED ROWS INTO chained_rows;

该分析操作会将表上的链式行记录到临时表chained_rows中，查询chained_rows发现没有记录，说明表CURRENT_ALARM上没有链式行的问题。其后我们通过函数计算CURRENT_ALARM表上的行长度，发现最长的一行占用1367个字节，就目前来说仍不至于发生严重的链式行问题。

经过以上分析问题已经定位到了高峰时段数据库高速缓存的使用状况上了，通过在早晨，上班高峰时段以及下班时段的进一步观察，我们发现在早晨上班之前，buffer_cache中大约有600个free块可以立即使用，在该时段做上述查询不存在物理读的操作，而在上班高峰时段free块的数量下降到个位数乃至为零，即需要加载数据块时没有可以立即使用的空闲块，同时因为其他查询更为活跃，CURRNET_ALARM表上的数据块在读入后迅速被刷出，没有被缓存住以便于下次的查询，导致了性能问题的最终产生。

总结

由于数据库高速缓存在高峰时段没有空闲块导致需要全表扫描的表未被完全缓存，进而在多用户同时查询的情况下产生大量的物理读等待，影响了正常业务的运行。针对该问题提出以下建议：

增大SGA总量，增大数据库高速缓存即DB_CACHE_SIZE的空间，并扩展高速缓存的keep池，将表CURRENT_ALARM保存在keep池中，保证不被刷出。
定期分析CURRENT_ALARM表的行扩展情况，从而避免可能的链式行的产生。
在可能的情况下，为上述查询语句加上执行计划暗示即ALL_ROWS的hit,使其在连接方式上倾向于使用哈希连接，可以降低三分之一的逻辑读。
增大数据库使用内存总量的同时可能引发CPU的使用量有一定上升，需要密切关注主机的负载情况。

[gview file=”http://askmac.cn/wp-content/uploads/resource/BUFFPOOL.pdf”]

Filed Under: Oracle Tagged With: RAC, sql

Comments

admin says

2010/08/22 at 00:36

How To Identify a Hot Block Within The Database Buffer Cache

Applies to:

Oracle Server – Enterprise Edition
Information in this document applies to any platform.

Goal

How to identify blocks which cause latch contention on the ‘cache buffers chains’ latch.
How to identify a hot block within the database buffer cache.

Solution

Possible hot blocks in the buffer cache normally can be identified by a high or
rapid increasing wait count on the CACHE BUFFERS CHAINS latch.

This latch is acquired when searching for data blocks cached in the buffer cache.
Since the Buffer cache is implemented as a sum of chains of blocks, each of those
chains is protected by a child of this latch when needs to be scanned. Contention
in this latch can be caused by very heavy access to a single block. This can
require the application to be reviewed.

To solve a hot block, the application maybe need to be reviewed.

By examining the waits on this latch, information about the segment and the
specific block can be obtained using the following queries.

First determine which latch id(ADDR) are interesting by examining the number of
sleeps for this latch. The higher the sleep count, the more interesting the
latch id(ADDR) is:

SQL> select CHILD#  “cCHILD”

,      ADDR    “sADDR”

,      GETS     “sGETS”

,      MISSES   “sMISSES”

,      SLEEPS   “sSLEEPS”

from v$latch_children

where name = ‘cache buffers chains’

order by 5, 1, 2, 3;

Run the above query a few times to to establish the id(ADDR) that has the most
consistent amount of sleeps. Once the id(ADDR) with the highest sleep count is found
then this latch address can be used to get more details about the blocks
currently in the buffer cache protected by this latch.
The query below should be run just after determining the ADDR with
the highest sleep count.

SQL> column segment_name format a35

select /*+ RULE */

e.owner ||’.’|| e.segment_name  segment_name,

e.extent_id  extent#,

x.dbablk – e.block_id + 1  block#,

x.tch,

l.child#

from

sys.v$latch_children  l,

sys.x$bh  x,

sys.dba_extents  e

where

x.hladdr  = ‘ &ADDR’ and

e.file_id = x.file# and

x.hladdr = l.addr and

x.dbablk between e.block_id and e.block_id + e.blocks -1

order by x.tch desc ;

Example of the output :
SEGMENT_NAME EXTENT#      BLOCK#       TCH    CHILD#
——————————– ———— ———— —— ———-
SCOTT.EMP_PK                     5            474          17     7,668
SCOTT.EMP                          1            449           2     7,668

Depending on the TCH column (The number of times the block is hit by a SQL
statement), you can identify a hotblock. The higher the value of the TCH column,
the more frequent the block is accessed by SQL statements.

In order to reduce contention for this object the following mechanisms can be put in place:

1) Examine the application to see if the execution of certain DML and SELECT statements

can be reorganized to eliminate contention on the object.

2) Decrease the buffer cache -although this may only help in a small amount of cases.

3) DBWR throughput may have a factor in this as well.

If using multiple DBWR’s then increase the number of DBWR’s.

4) Increase the PCTFREE for the table storage parameters via ALTER TABLE

or rebuild. This will result in less rows per block.

5) Consider implementing reverse key indexes

(if range scans aren’t commonly used against the segment)

Related bugs :
Bug 3611471 : High latch waits for “cache buffers chain” latch possible originating from “kcbgtcr: kslbegin ..”

30 min statspack shows
NoWait Waiter
Latch Name           Where                      Misses Sleeps     Sleeps
——————– ————————– ——- ———- ——–
cache buffers chains kcbgtcr: kslbegin excl        0      206,281   280,674

Bug 1967363 “CACHE BUFFERS CHAINS” LATCH CONTENTION AFTER UPGRADE
TO 8.1.7 FROM 8.0.6

The following query joins with DBA_OBJECTS :

SQL> with bh_lc as

(select /*+ ORDERED */

lc.addr, lc.child#, lc.gets, lc.misses, lc.immediate_gets,

lc.immediate_misses, lc.spin_gets, lc.sleeps,

bh.hladdr, bh.tch tch, bh.file#, bh.dbablk, bh.class,

bh.state, bh.obj

from

x$kslld ld,

v$session_wait sw,

v$latch_children lc,

x$bh bh

where lc.addr =sw.p1raw

and sw.p2= ld.indx

and ld.kslldnam=’cache buffers chains’

and lower(sw.event) like ‘%latch%’

— and state=’WAITING’

and bh.hladdr=lc.addr

)

select bh_lc.hladdr, bh_lc.tch, o.owner, o.object_name, o.object_type,

bh_lc.child#, bh_lc.gets,

bh_lc.misses, bh_lc.immediate_gets,

bh_lc.immediate_misses, spin_gets, sleeps

from

bh_lc,

dba_objects o

where bh_lc.obj = o.object_id(+)

union

select bh_lc.hladdr, bh_lc.tch, o.owner, o.object_name, o.object_type,

bh_lc.child#, bh_lc.gets, bh_lc.misses, bh_lc.immediate_gets,

bh_lc.immediate_misses, spin_gets, sleeps

from

bh_lc,

dba_objects o

where bh_lc.obj = o.data_object_id(+)

order by 1,2 desc;

回复
admin says

2010/08/22 at 00:42

SCRIPT: Script to Report SGA Buffer Summary

Execution Environment:

Access Privileges:
This script needs to be run connected as SYS.

Usage:
sqlplus SYS/ @[SCRIPTFILE]

PROOFREAD THIS SCRIPT BEFORE USING IT! Due to differences in the way text
editors, e-mail packages, and operating systems handle text formatting (spaces,
tabs, and carriage returns), this script may not be in an executable state
when you first receive it. Check over the script to ensure that errors of
this type are corrected.

Description

The following script reports statistics about the buffers in the SGA. It will
return information about the types of buffers in the SGA, how many of them
are on the dirty queue and how many are not.

Sample Output
=============

Class Not Dirty Dirty On Dirty Total
———- ———- ———- ———- ———-
7 1 00 1
Data 2194 598 0 2792
Header 35 0 0 35
Rollback 370 2 0 372

References

Script

SET ECHO off
REM NAME: TFSSGABF.SQL
REM USAGE:”@path/tfssgabf”
REM ————————————————————————
REM REQUIREMENTS:
REM Needs to be run as SYS
REM ————————————————————————
REM AUTHOR:
REM Virag Saksena, Oracle Corporation
REM (c)1996 Oracle Corporation
REM ————————————————————————
REM PURPOSE:
REM Reports statistics about the buffers in the SGA. It will print
REM the information about the types of buffers in the SGA, how many of
REM them are on the dirty queue and how many are not.
REM ————————————————————————
REM DISCLAIMER:
REM This script is provided for educational purposes only. It is NOT
REM supported by Oracle World Wide Technical Support.
REM The script has been tested and appears to work as intended.
REM You should always run new scripts on a test instance initially.
REM ————————————————————————
REM Main text of script follows:

col class form A10
select decode(greatest(class,10),10,decode(class,1,’Data’,2
,’Sort’,4,’Header’,to_char(class)),’Rollback’) “Class”,
sum(decode(bitand(flag,1),1,0,1)) “Not Dirty”,
sum(decode(bitand(flag,1),1,1,0)) “Dirty”,
sum(dirty_queue) “On Dirty”,count(*) “Total”
from x$bh
group by decode(greatest(class,10),10,decode(class,1,’Data’,2
,’Sort’,4,’Header’,to_char(class)),’Rollback’)
/

回复
admin says

2010/08/22 at 00:42

What is “v$bh”? How should it be used?
PURPOSE

This document tries to identify some parameters that need to be looked into
when tuning ops in an oracle7 or oracle8 environment and talks about what
the view v$bh is.

Introduction

In version 7, v$ping, v$cache and v$bh (dynamic performance views)
were defined as views whose base object was ext_to_obj (a view itself).

The base table for the ext_to_obj view are the tables:
1. obj$
2. uet$
3. file$
4. undo$

In version8, the base objects for v$cache and v$ping are v$bh, obj$ and undo$.
This avoids the necessity to continuously recreate the ext_to_obj view
(to maintain dynamism of the view) which is an expensive operation
(as Oracle stores the view definition in the dictionary as the text of the
query that defines the view. When a view is referenced in an SQL statement,
Oracle merges the statement with the query that defines the view and
then parses the statement in the shared pool and then executes it).
v$bh now being an internally defined fixed view is not recreated again
and again.

v$bh –> what it is and where is it used?
=========================================

This is a very important view when we are looking at performance
monitoring in a parallel server environment.

v$bh gives the status of local forced writes and forced reads
for every buffer in the buffer cache.

In version 8 as compared to version 7 (the base tables for this view
are the tables x$bh and x$le), the view is defined internally. The columns
are still the same though.

Important columns in this view are as follows:

1. The ‘forced_write’ column which counts the number of times DBWR had to
write this this block to disk because this instance had dirtied the block
and another instance had requested the lock on the block in a conflicting
mode.

2. The ‘forced_read’ column which counts the number of times the block had
to be re-read from disk because another instance had forced it out of this
instance’s cache by requesting the PCM lock on the block in exclusive mode.

3. The ‘lock_element_addr’ column contains the address of the lock element
element that is locking this buffer.

If two buffers have the same lock_element_addr, then they are being
protected by the same lock. Anytime two buffers are covered by the PCM lock,
you can have false collisions between the buffers.

4. The ‘status’ column says whether the block is currently :

a) not currently in use (FREE)
b) exclusive (XCUR)
c) shared current (SCUR)
d) in consistent read mode (CR)
e) being read from disk (READ)
f) in media recover mode or (MREC)
g) in instance recovery mode. (IREC)

Columns 1 and 2 together represent the number of disk i/o’s an instance had to
perform on each block in the cache due to conflicting lock requests by other
instances. These i/o’s are wasteful since they occur only due to lock activity
and thus they need to be avoided.

Thus, the columns force_write and force_read basically determine how
gc_files_to_locks need to be allocated to various files and also serve as
an important parameter to determine what the value of gc_db_locks should be.

Search Words:
=============

v$bh, ops, ext_to_obj

回复
admin says

2010/08/22 at 00:43

Select database objects which are candidates for the recycle buffer cache
PURPOSE
——-

To select database objects which are candidates for the recycle database
buffer cache.

SCOPE & APPLICATION
——————-

This article is intended for Server Technologies support analysts and
customers.

Which objects should be put in the recycle cache
————————————————-

Objects with single-touch buffer counts and totaling more than 5% of the
buffer cache are candidates of the recycle buffer. The touch count is
new since version 8.0. The touch count is increased each time a buffer
is accessed. This touch count is usded in the following statement

SELECT obj object,
COUNT(1) buffers,
(COUNT(1)/totsize) * 100 pct_cache
FROM x$bh,
(SELECT value totsize
FROM v$parameter
WHERE name = ‘db_block_buffers’)
WHERE tch = 1
OR (tch = 0 AND lru_flag < 8) GROUP BY obj, totsize HAVING (COUNT(1)/totsize) * 100 > 5;

This query returns object id’s. To get the corresponding names of the
database objects you have to perform the follwoing select:

SELECT object_id, owner, object_name, object_type
FROM dba_objects
WHERE data_object_id = return-value-from-last-query;

回复
admin says

2010/08/22 at 00:44

Script to Identify Objects and Amount of Blocks in the Buffer Pools – Default, Keep, Recycle, nK Cache

Applies to:

Oracle Server – Enterprise Edition – Version: 9.2.0.1 to 10.2.0.1
Information in this document applies to any platform.

Purpose

This script will provide the objects that are currently pinned to the buffer pools as well as the number of blocks from the object that are currently loaded into memory.

Software Requirements/Prerequisites

Tested on :
9.2.0.7
10.1.0.3
10.2.0.1

Configuring the Script

The base (X$) table is used; therefore, this must be performed by connecting as :
SQL> conn / as sysdba

Running the Script

1. Copy the provided script into the textpad/editor and save it as :
dbbuffer.sql
2. Start a SQL*Plus session.
$ sqlplus /nolog
3. Connect to the database with the SYSDBA privilege.
SQL> conn / as sysdba
4. Run the script :
SQL> @dbbuffer.sql
5. Inspect the dbbuffer.log for success/failure of the script.
$ more dbbuffer.log

Caution

This script is provided for educational purposes only and not supported by Oracle Support Services. It has been tested internally, however, and works as documented. We do not guarantee that it will work for you, so be sure to test it in your environment before relying on it.

Proofread this script before using it! Due to the differences in the way text editors, e-mail packages and operating systems handle text formatting (spaces, tabs and carriage returns), this script may not be in an executable state when you first receive it. Check over the script to ensure that errors of this type are corrected.

Script

Save the following text as : dbbuffer.sql

spool dbbuffer.log

select decode(pd.bp_id,1,’KEEP’,2,’RECYCLE’,3,’DEFAULT’,

4,’2K SUBCACHE’,5,’4K SUBCACHE’,6,’8K SUBCACHE’,

7,’16K SUBCACHE’,8,’32KSUBCACHE’,’UNKNOWN’) subcache,

bh.object_name,bh.blocks
from x$kcbwds ds,x$kcbwbpd pd,(select /*+ use_hash(x) */ set_ds,
o.name object_name,count(*) BLOCKS

from obj$ o, x$bh x where o.dataobj# = x.obj

and x.state !=0 and o.owner# !=0

group by set_ds,o.name) bh
where ds.set_id >= pd.bp_lo_sid
and ds.set_id <= pd.bp_hi_sid
and pd.bp_size != 0
and ds.addr=bh.set_ds;

spool off;

Script Output

SQL> conn / as sysdba
Connected.
SQL> @dbbuffer.sql

SUBCACHE OBJECT_NAME BLOCKS
———— —————————— ———-
DEFAULT SQLPLUS_PRODUCT_PROFILE 1
DEFAULT ABCD 6
DEFAULT SDO_TOPO_METADATA_TABLE 1
KEEP ABCD 2
DEFAULT RLM$SCHACTIONORDER 1
DEFAULT RLM$JOINQKEY 1
DEFAULT TEST 8
DEFAULT REPCAT$_REPPROP 1
DEFAULT XDB$CONFIG 6
DEFAULT AQ$_QUEUES 2
DEFAULT UNIQUE_GEORASTERS 1

11 rows selected.

SQL>

回复
admin says

2010/08/22 at 00:46

How To Give Grant Select On X$ Objects In Oracle 10g?

Applies to:

Oracle Server – Enterprise Edition – Version: 10.2.0
Information in this document applies to any platform.

Goal

How to give grant select on X$ objects in Oracle 10g?

Solution

1) You cannot grant a select privilege on X$* tables to another user due to they are internally
protected.

– If you try grant select on x$ objects then you will receive the following error:

SQL> grant select on sys.x$bh to scott;
grant select on sys.x$bh to scott
*
ERROR at line 1:
ORA-02030: can only select from fixed tables/views

2) Alternatively, you can create a view and then grant a select privilege on that new view to
another user as follow:

Connected to:
Oracle Database 10g Enterprise Edition Release 10.1.0.4.0 – 64bit Production
With the Partitioning, OLAP and Data Mining options

SQL> show user
USER is “SYS”
SQL> create user scott identified by tiger default tablespace users temporary tablespace temp;

User created.

SQL> grant resource to scott;

Grant succeeded.

SQL> grant connect to scott;

Grant succeeded.

SQL> connect scott/tiger
Connected.

SQL> show user
USER is “SCOTT”

SQL> select * from sys.x$bh;
select * from sys.x$bh
*
ERROR at line 1:
ORA-00942: table or view does not exist

==================================================================

SQL> show user
USER is “SYS”
SQL> grant select on sys.x$bh to scott;
grant select on sys.x$bh to scott
*
ERROR at line 1:
ORA-02030: can only select from fixed tables/views

SQL> create view vw_x$bh as select * from sys.x$bh; <(=== This works

View created.

SQL> grant select on sys.vw_x$bh to scott; <(=== This works

Grant succeeded.

SQL> connect scott/tiger
Connected.
SQL> select * from sys.vw_x$bh; <(=== This works

回复

同样的查询每次都产生大量物理读的调优示例

总结

Comments

Applies to:

Goal

Solution

Applies to:

Purpose

Software Requirements/Prerequisites

Configuring the Script

Running the Script

Caution

Script

Script Output

Applies to:

Goal

Solution

Comment 取消回复