同样的查询每次都产生大量物理读的调优示例

12月中旬用户反映综合传输网管库上的一个查询影响迟缓,具体现象表现为当多个用户在应用界面上同时点下查询后,结果返回耗时长,影响正常业务的运作。经过初步分析该操作主要的等待事件在db file sequential read上,为了进一步明确问题,我们在系统的高峰时段使用性能报告工具抓取了统计信息,以下为top3等待事件:

Event                               Waits   Timeouts   Time (s)   (ms)

—————————- ———— ———- ———- ——

enqueue                           542        402      1,406   2595

db file sequential read           446,099          0        391      1

db file scattered read            156,634          0        209      1

可以看到db file sequential read事件仅次于数据库队列事件为主要的数据库性能瓶颈,以下列出缓存占用较高的典型SQL:

Selecta.objectid,a.emsalarm_time,a.emsend_time,c.label_cn,c.alias,a.alarm_name,a.alm_devinfo from traph c,alarm_to_traph b,current_alarm a where a.cuid=b.related_alarm_cuid and b.related_traph_cuid=c.cuid and (c.ext_ids=’,8,’ or c.ext_ids=’,9,’ or c.ext_ids=’,12,’ or c.ext_ids=’,19,’ or c.ext_ids=’,25,’) and a.emsend_time is null and a.emsalarm_time > to_date(‘2008-11-19′,’yyyy-MM-dd’)

经过进一步追踪我们发现以上SQL正是应用界面点击查询所做的操作,这就明确了此次优化的主要目的,即分析并尽可能降低该语句所产生的物理读和逻辑读,保证应用的正常运行。

具体分析
该句查询语句涉及到三个表的连接,因为应用设计使用的是基于RULE的优化模式,故执行计划倾向于使用索引来代替全表扫描,在表与表的连接方式上倾向使用嵌套循环即(NESTED LOOP),具体执行计划如下图:

经过查询缓存池中各个表的具体缓存状况发现,表ALARM_TO_TRAPH上的索引ALARM_TO_TRAPH_INDEX1与索引TRAPH_CUID均已被缓存,而在该执行计划中需要做全表扫描操作的CURRENT_ALARM则只有部分块被缓存,充分说明了引起物理读需求的正是对CURRENT_ALARM的全表扫描操作,为了进一步证实这一点,我们做了一次10046事件的trace,该事件可以记录SQL语句执行过程中详细的等待事件以及相关信息,在trace中发现大量db file sequential read等待时间,摘录如下:

以上记录中p1,p2对应了数据文件顺序读具体发生的文件号和块号,经过查询确定以上相关文件号和数据块号具对应于表CURRENT_ALARM,至此本次性能问题的主要原因已十分明确,即CURRENT_ALARM表未被完全缓存,引起问题的可能有两种:1.表中存在大量的chained rows即链式行;2.数据库高速缓存在系统高峰时段存在瓶颈。

对于前一种可能性,我们首先分析了表的结构,CURRENT_ALARM表包括96个列,其中包括大量varchar2(255)类型的长列,理论单行长度可能达到14K字节,的确可能引起严重的链式行,为了进一步证实,我们对该表做了一次链式行扫描,操作如下:

ANALYZE TABLE CURRENT_ALARM  LIST CHAINED ROWS INTO chained_rows;

该分析操作会将表上的链式行记录到临时表chained_rows中,查询chained_rows发现没有记录,说明表CURRENT_ALARM上没有链式行的问题。其后我们通过函数计算CURRENT_ALARM表上的行长度,发现最长的一行占用1367个字节,就目前来说仍不至于发生严重的链式行问题。

经过以上分析问题已经定位到了高峰时段数据库高速缓存的使用状况上了,通过在早晨,上班高峰时段以及下班时段的进一步观察,我们发现在早晨上班之前,buffer_cache中大约有600个free块可以立即使用,在该时段做上述查询不存在物理读的操作,而在上班高峰时段free块的数量下降到个位数乃至为零,即需要加载数据块时没有可以立即使用的空闲块,同时因为其他查询更为活跃,CURRNET_ALARM表上的数据块在读入后迅速被刷出,没有被缓存住以便于下次的查询,导致了性能问题的最终产生。

总结

由于数据库高速缓存在高峰时段没有空闲块导致需要全表扫描的表未被完全缓存,进而在多用户同时查询的情况下产生大量的物理读等待,影响了正常业务的运行。针对该问题提出以下建议:

  • 增大SGA总量,增大数据库高速缓存即DB_CACHE_SIZE的空间,并扩展高速缓存的keep池,将表CURRENT_ALARM保存在keep池中,保证不被刷出。
  • 定期分析CURRENT_ALARM表的行扩展情况,从而避免可能的链式行的产生。
  • 在可能的情况下,为上述查询语句加上执行计划暗示即ALL_ROWS的hit,使其在连接方式上倾向于使用哈希连接,可以降低三分之一的逻辑读。
  • 增大数据库使用内存总量的同时可能引发CPU的使用量有一定上升,需要密切关注主机的负载情况。

[gview file=”http://askmac.cn/wp-content/uploads/resource/BUFFPOOL.pdf”]

Comments

  1. admin says

    How To Identify a Hot Block Within The Database Buffer Cache

    Applies to:

    Oracle Server – Enterprise Edition
    Information in this document applies to any platform.

    Goal

    How to identify blocks which cause latch contention on the ‘cache buffers chains’ latch.
    How to identify a hot block within the database buffer cache.

    Solution

    Possible hot blocks in the buffer cache normally can be identified by a high or
    rapid increasing wait count on the CACHE BUFFERS CHAINS latch.

    This latch is acquired when searching for data blocks cached in the buffer cache.
    Since the Buffer cache is implemented as a sum of chains of blocks, each of those
    chains is protected by a child of this latch when needs to be scanned. Contention
    in this latch can be caused by very heavy access to a single block. This can
    require the application to be reviewed.

    To solve a hot block, the application maybe need to be reviewed.

    By examining the waits on this latch, information about the segment and the
    specific block can be obtained using the following queries.

    First determine which latch id(ADDR) are interesting by examining the number of
    sleeps for this latch. The higher the sleep count, the more interesting the
    latch id(ADDR) is:

    SQL> select CHILD#  “cCHILD”
        
    ,      ADDR    “sADDR”
        
    ,      GETS     “sGETS”
        
    ,      MISSES   “sMISSES”
        
    ,      SLEEPS   “sSLEEPS” 
        
    from v$latch_children 
        
    where name = ‘cache buffers chains’
        
    order by 5, 1, 2, 3;

    Run the above query a few times to to establish the id(ADDR) that has the most
    consistent amount of sleeps. Once the id(ADDR) with the highest sleep count is found
    then this latch address can be used to get more details about the blocks
    currently in the buffer cache protected by this latch.
    The query below should be run just after determining the ADDR with
    the highest sleep count.

    SQL> column segment_name format a35
        
    select /*+ RULE */
          
    e.owner ||’.’|| e.segment_name  segment_name,
          
    e.extent_id  extent#,
          
    x.dbablk – e.block_id + 1  block#,
          
    x.tch,
          
    l.child#
        
    from
          
    sys.v$latch_children  l,
          
    sys.x$bh  x,
          
    sys.dba_extents  e
        
    where
          
    x.hladdr  = ‘ &ADDR’ and
          
    e.file_id = x.file# and
          
    x.hladdr = l.addr and
          
    x.dbablk between e.block_id and e.block_id + e.blocks -1
        
    order by x.tch desc ;

    Example of the output :
    SEGMENT_NAME                     EXTENT#      BLOCK#       TCH    CHILD#
    ——————————– ———— ———— —— ———-
    SCOTT.EMP_PK                       5            474          17     7,668
    SCOTT.EMP                          1            449           2     7,668

    Depending on the TCH column (The number of times the block is hit by a SQL
    statement), you can identify a hotblock. The higher the value of the TCH column,
    the more frequent the block is accessed by SQL statements.

    In order to reduce contention for this object the following mechanisms can be put in place:

    1) Examine the application to see if the execution of certain DML and SELECT statements
       
    can be reorganized to eliminate contention on the object.

    2) Decrease the buffer cache -although this may only help in a small amount of cases.

    3) DBWR throughput may have a factor in this as well.
       
    If using multiple DBWR’s then increase the number of DBWR’s.

    4) Increase the PCTFREE for the table storage parameters via ALTER TABLE
       
    or rebuild. This will result in less rows per block.

    5) Consider implementing reverse key indexes
      
    (if range scans aren’t commonly used against the segment)

    Related bugs :
    Bug 3611471 : High latch waits for “cache buffers chain” latch possible originating from “kcbgtcr: kslbegin ..”

    30 min statspack shows
    NoWait Waiter
    Latch Name           Where                      Misses  Sleeps     Sleeps
    ——————– ————————– ——- ———- ——–
    cache buffers chains kcbgtcr: kslbegin excl        0      206,281   280,674

    Bug 1967363 “CACHE BUFFERS CHAINS” LATCH CONTENTION AFTER UPGRADE
    TO 8.1.7 FROM 8.0.6

    The following query joins with DBA_OBJECTS :

    SQL> with bh_lc as
          
    (select /*+ ORDERED */
             
    lc.addr, lc.child#, lc.gets, lc.misses, lc.immediate_gets,
             
    lc.immediate_misses, lc.spin_gets, lc.sleeps,
             
    bh.hladdr, bh.tch tch, bh.file#, bh.dbablk, bh.class,
             
    bh.state, bh.obj
           
    from
             
    x$kslld ld,
             
    v$session_wait sw,
             
    v$latch_children lc,
             
    x$bh bh
           
    where lc.addr =sw.p1raw
             
    and sw.p2= ld.indx
             
    and ld.kslldnam=’cache buffers chains’
             
    and lower(sw.event) like ‘%latch%’
           
    — and state=’WAITING’
             
    and bh.hladdr=lc.addr
          
    )
        
    select bh_lc.hladdr, bh_lc.tch, o.owner, o.object_name, o.object_type,
               
    bh_lc.child#, bh_lc.gets,
               
    bh_lc.misses, bh_lc.immediate_gets,
               
    bh_lc.immediate_misses, spin_gets, sleeps
        
    from
          
    bh_lc,
          
    dba_objects o
        
    where bh_lc.obj = o.object_id(+)
      
    union
        
    select bh_lc.hladdr, bh_lc.tch, o.owner, o.object_name, o.object_type,
               
    bh_lc.child#, bh_lc.gets, bh_lc.misses, bh_lc.immediate_gets,
               
    bh_lc.immediate_misses, spin_gets, sleeps
        
    from
          
    bh_lc,
          
    dba_objects o
        
    where bh_lc.obj = o.data_object_id(+)
     
    order by 1,2 desc;

     

  2. admin says

    SCRIPT: Script to Report SGA Buffer Summary

    Execution Environment:

    Access Privileges:
    This script needs to be run connected as SYS.

    Usage:
    sqlplus SYS/ @[SCRIPTFILE]

    PROOFREAD THIS SCRIPT BEFORE USING IT! Due to differences in the way text
    editors, e-mail packages, and operating systems handle text formatting (spaces,
    tabs, and carriage returns), this script may not be in an executable state
    when you first receive it. Check over the script to ensure that errors of
    this type are corrected.

    Description

    The following script reports statistics about the buffers in the SGA. It will
    return information about the types of buffers in the SGA, how many of them
    are on the dirty queue and how many are not.

    Sample Output
    =============

    Class Not Dirty Dirty On Dirty Total
    ———- ———- ———- ———- ———-
    7 1 00 1
    Data 2194 598 0 2792
    Header 35 0 0 35
    Rollback 370 2 0 372

    References

    Script

    SET ECHO off
    REM NAME: TFSSGABF.SQL
    REM USAGE:”@path/tfssgabf”
    REM ————————————————————————
    REM REQUIREMENTS:
    REM Needs to be run as SYS
    REM ————————————————————————
    REM AUTHOR:
    REM Virag Saksena, Oracle Corporation
    REM (c)1996 Oracle Corporation
    REM ————————————————————————
    REM PURPOSE:
    REM Reports statistics about the buffers in the SGA. It will print
    REM the information about the types of buffers in the SGA, how many of
    REM them are on the dirty queue and how many are not.
    REM ————————————————————————
    REM DISCLAIMER:
    REM This script is provided for educational purposes only. It is NOT
    REM supported by Oracle World Wide Technical Support.
    REM The script has been tested and appears to work as intended.
    REM You should always run new scripts on a test instance initially.
    REM ————————————————————————
    REM Main text of script follows:

    col class form A10
    select decode(greatest(class,10),10,decode(class,1,’Data’,2
    ,’Sort’,4,’Header’,to_char(class)),’Rollback’) “Class”,
    sum(decode(bitand(flag,1),1,0,1)) “Not Dirty”,
    sum(decode(bitand(flag,1),1,1,0)) “Dirty”,
    sum(dirty_queue) “On Dirty”,count(*) “Total”
    from x$bh
    group by decode(greatest(class,10),10,decode(class,1,’Data’,2
    ,’Sort’,4,’Header’,to_char(class)),’Rollback’)
    /

  3. admin says

    What is “v$bh”? How should it be used?
    PURPOSE

    This document tries to identify some parameters that need to be looked into
    when tuning ops in an oracle7 or oracle8 environment and talks about what
    the view v$bh is.

    Introduction

    In version 7, v$ping, v$cache and v$bh (dynamic performance views)
    were defined as views whose base object was ext_to_obj (a view itself).

    The base table for the ext_to_obj view are the tables:
    1. obj$
    2. uet$
    3. file$
    4. undo$

    In version8, the base objects for v$cache and v$ping are v$bh, obj$ and undo$.
    This avoids the necessity to continuously recreate the ext_to_obj view
    (to maintain dynamism of the view) which is an expensive operation
    (as Oracle stores the view definition in the dictionary as the text of the
    query that defines the view. When a view is referenced in an SQL statement,
    Oracle merges the statement with the query that defines the view and
    then parses the statement in the shared pool and then executes it).
    v$bh now being an internally defined fixed view is not recreated again
    and again.

    v$bh –> what it is and where is it used?
    =========================================

    This is a very important view when we are looking at performance
    monitoring in a parallel server environment.

    v$bh gives the status of local forced writes and forced reads
    for every buffer in the buffer cache.

    In version 8 as compared to version 7 (the base tables for this view
    are the tables x$bh and x$le), the view is defined internally. The columns
    are still the same though.

    Important columns in this view are as follows:

    1. The ‘forced_write’ column which counts the number of times DBWR had to
    write this this block to disk because this instance had dirtied the block
    and another instance had requested the lock on the block in a conflicting
    mode.

    2. The ‘forced_read’ column which counts the number of times the block had
    to be re-read from disk because another instance had forced it out of this
    instance’s cache by requesting the PCM lock on the block in exclusive mode.

    3. The ‘lock_element_addr’ column contains the address of the lock element
    element that is locking this buffer.

    If two buffers have the same lock_element_addr, then they are being
    protected by the same lock. Anytime two buffers are covered by the PCM lock,
    you can have false collisions between the buffers.

    4. The ‘status’ column says whether the block is currently :

    a) not currently in use (FREE)
    b) exclusive (XCUR)
    c) shared current (SCUR)
    d) in consistent read mode (CR)
    e) being read from disk (READ)
    f) in media recover mode or (MREC)
    g) in instance recovery mode. (IREC)

    Columns 1 and 2 together represent the number of disk i/o’s an instance had to
    perform on each block in the cache due to conflicting lock requests by other
    instances. These i/o’s are wasteful since they occur only due to lock activity
    and thus they need to be avoided.

    Thus, the columns force_write and force_read basically determine how
    gc_files_to_locks need to be allocated to various files and also serve as
    an important parameter to determine what the value of gc_db_locks should be.

    Search Words:
    =============

    v$bh, ops, ext_to_obj

  4. admin says

    Select database objects which are candidates for the recycle buffer cache
    PURPOSE
    ——-

    To select database objects which are candidates for the recycle database
    buffer cache.

    SCOPE & APPLICATION
    ——————-

    This article is intended for Server Technologies support analysts and
    customers.

    Which objects should be put in the recycle cache
    ————————————————-

    Objects with single-touch buffer counts and totaling more than 5% of the
    buffer cache are candidates of the recycle buffer. The touch count is
    new since version 8.0. The touch count is increased each time a buffer
    is accessed. This touch count is usded in the following statement

    SELECT obj object,
    COUNT(1) buffers,
    (COUNT(1)/totsize) * 100 pct_cache
    FROM x$bh,
    (SELECT value totsize
    FROM v$parameter
    WHERE name = ‘db_block_buffers’)
    WHERE tch = 1
    OR (tch = 0 AND lru_flag < 8) GROUP BY obj, totsize HAVING (COUNT(1)/totsize) * 100 > 5;

    This query returns object id’s. To get the corresponding names of the
    database objects you have to perform the follwoing select:

    SELECT object_id, owner, object_name, object_type
    FROM dba_objects
    WHERE data_object_id = return-value-from-last-query;

  5. admin says

    Script to Identify Objects and Amount of Blocks in the Buffer Pools – Default, Keep, Recycle, nK Cache

    Applies to:

    Oracle Server – Enterprise Edition – Version: 9.2.0.1 to 10.2.0.1
    Information in this document applies to any platform.

    Purpose

    This script will provide the objects that are currently pinned to the buffer pools as well as the number of blocks from the object that are currently loaded into memory.

    Software Requirements/Prerequisites

    Tested on :
    9.2.0.7
    10.1.0.3
    10.2.0.1

    Configuring the Script

    The base (X$) table is used; therefore, this must be performed by connecting as :
    SQL> conn / as sysdba

    Running the Script

    1. Copy the provided script into the textpad/editor and save it as :
    dbbuffer.sql
    2. Start a SQL*Plus session.
    $ sqlplus /nolog
    3. Connect to the database with the SYSDBA privilege.
    SQL> conn / as sysdba
    4. Run the script :
    SQL> @dbbuffer.sql
    5. Inspect the dbbuffer.log for success/failure of the script.
    $ more dbbuffer.log

    Caution

    This script is provided for educational purposes only and not supported by Oracle Support Services. It has been tested internally, however, and works as documented. We do not guarantee that it will work for you, so be sure to test it in your environment before relying on it.

    Proofread this script before using it! Due to the differences in the way text editors, e-mail packages and operating systems handle text formatting (spaces, tabs and carriage returns), this script may not be in an executable state when you first receive it. Check over the script to ensure that errors of this type are corrected.

    Script

    Save the following text as : dbbuffer.sql

    spool dbbuffer.log

    select decode(pd.bp_id,1,’KEEP’,2,’RECYCLE’,3,’DEFAULT’,
            
    4,’2K SUBCACHE’,5,’4K SUBCACHE’,6,’8K SUBCACHE’,
            
    7,’16K SUBCACHE’,8,’32KSUBCACHE’,’UNKNOWN’) subcache,
            
    bh.object_name,bh.blocks
    from x$kcbwds ds,x$kcbwbpd pd,(select /*+ use_hash(x) */ set_ds,
    o.name object_name,count(*) BLOCKS
            
    from obj$ o, x$bh x where o.dataobj# = x.obj
            
    and x.state !=0 and o.owner# !=0
             
    group by set_ds,o.name) bh 
    where ds.set_id >= pd.bp_lo_sid
    and ds.set_id <= pd.bp_hi_sid
    and pd.bp_size != 0
    and ds.addr=bh.set_ds;

    spool off;

    Script Output

    SQL> conn / as sysdba
    Connected.
    SQL> @dbbuffer.sql

    SUBCACHE OBJECT_NAME BLOCKS
    ———— —————————— ———-
    DEFAULT SQLPLUS_PRODUCT_PROFILE 1
    DEFAULT ABCD 6
    DEFAULT SDO_TOPO_METADATA_TABLE 1
    KEEP ABCD 2
    DEFAULT RLM$SCHACTIONORDER 1
    DEFAULT RLM$JOINQKEY 1
    DEFAULT TEST 8
    DEFAULT REPCAT$_REPPROP 1
    DEFAULT XDB$CONFIG 6
    DEFAULT AQ$_QUEUES 2
    DEFAULT UNIQUE_GEORASTERS 1

    11 rows selected.

    SQL>

     

  6. admin says

    How To Give Grant Select On X$ Objects In Oracle 10g?

    Applies to:

    Oracle Server – Enterprise Edition – Version: 10.2.0
    Information in this document applies to any platform.

    Goal

    How to give grant select on X$ objects in Oracle 10g?

    Solution

    1) You cannot grant a select privilege on X$* tables to another user due to they are internally
    protected.

    – If you try grant select on x$ objects then you will receive the following error:

    SQL> grant select on sys.x$bh to scott;
    grant select on sys.x$bh to scott
    *
    ERROR at line 1:
    ORA-02030: can only select from fixed tables/views

    2) Alternatively, you can create a view and then grant a select privilege on that new view to
    another user as follow:

    Connected to:
    Oracle Database 10g Enterprise Edition Release 10.1.0.4.0 – 64bit Production
    With the Partitioning, OLAP and Data Mining options

    SQL> show user
    USER is “SYS”
    SQL> create user scott identified by tiger default tablespace users temporary tablespace temp;

    User created.

    SQL> grant resource to scott;

    Grant succeeded.

    SQL> grant connect to scott;

    Grant succeeded.

    SQL> connect scott/tiger
    Connected.

    SQL> show user
    USER is “SCOTT”

    SQL> select * from sys.x$bh;
    select * from sys.x$bh
    *
    ERROR at line 1:
    ORA-00942: table or view does not exist

    ==================================================================

    SQL> show user
    USER is “SYS”
    SQL> grant select on sys.x$bh to scott;
    grant select on sys.x$bh to scott
    *
    ERROR at line 1:
    ORA-02030: can only select from fixed tables/views

    SQL> create view vw_x$bh as select * from sys.x$bh; <(=== This works

    View created.

    SQL> grant select on sys.vw_x$bh to scott; <(=== This works

    Grant succeeded.

    SQL> connect scott/tiger
    Connected.
    SQL> select * from sys.vw_x$bh; <(=== This works

     

Comment

*

沪ICP备14014813号-2

沪公网安备 31010802001379号