hint

解决Oracle中Split Partition缓慢的问题

2011/11/17 by admin 4 Comments

有这样一个case ，用户的10g产品数据库中有一张按照月份划分的RANGE分区表，其范围为Less than (maxvalue)的最后一个Partition分区总是为空Empty的，用户每隔半年会对这个MaxValue Partition 执行Split Partition 操作，以便存放后续月份的数据，同时这个表上还有一个非分区索引Nonpartitioned indexes。

满以为Split 这样一个Empry Partition会是一个回车就结束的任务，但是Performance issue就在这平淡无奇的分区维护工作中出现了，实际在执行”Alter Table Split partition Pn at (value) into …” 的DDL语句时，发现需要花费超过十多分钟才能完成一次Split。问题就在于，如果是有大量数据的Partition分区， Split 操作慢一些也是正常的(expected预期内的) ，但是实际这里的Max Partition总是为空的，没有任何一条数据，为什么Split 一个空的分区也会是这种蜗牛的速度呢？

我们来模拟这个现象，首先创建一张分区表，Maxvalue的Partition是Empty的，且有一个普通索引:

SQL> select * from v$version;

BANNER
----------------------------------------------------------------
Oracle Database 10g Enterprise Edition Release 10.2.0.1.0 - 64bi
PL/SQL Release 10.2.0.1.0 - Production
CORE    10.2.0.1.0      Production
TNS for Linux: Version 10.2.0.1.0 - Production
NLSRTL Version 10.2.0.1.0 - Production

SQL> select * from global_name;

GLOBAL_NAME
--------------------------------------------------------------------------------
www.askmac.cn & www.askmac.cn

SQL>   CREATE TABLE Maclean
  2     (    "OWNER" VARCHAR2(30),
  3          "OBJECT_NAME" VARCHAR2(128),
  4          "SUBOBJECT_NAME" VARCHAR2(30),
  5          "OBJECT_ID" NUMBER,
  6          "DATA_OBJECT_ID" NUMBER,
  7          "OBJECT_TYPE" VARCHAR2(19),
  8          "CREATED" DATE,
  9          "LAST_DDL_TIME" DATE,
 10          "TIMESTAMP" VARCHAR2(19),
 11          "STATUS" VARCHAR2(7),
 12          "TEMPORARY" VARCHAR2(1),
 13          "GENERATED" VARCHAR2(1),
 14          "SECONDARY" VARCHAR2(1)
 15     ) nologging
 16     partition by range(object_id)
 17     (partition p1 values less than (99999) tablespace users,
 18     partition p2 values less than (maxvalue) tablespace users);

Table created.

SQL> insert /*+ append */ into maclean select * from maclean1;

38736384 rows created.

SQL> commit; 

Commit complete.

SQL> create index ind_obj on maclean(DATA_OBJECT_ID,OBJECT_ID,LAST_DDL_TIME,TIMESTAMP,object_type,owner,status)
nologging parallel
  2  ;

Index created.

SQL> alter index ind_obj noparallel;

Index altered.

SQL> exec dbms_stats.gather_table_stats('SYS','MACLEAN',cascade=>true,degree=>2);

PL/SQL procedure successfully completed.

SQL> select num_rows from dba_tables where table_name='MACLEAN';

  NUM_ROWS
----------
  38818438

SQL>  select * from maclean partition (p2);

no rows selected

/* Maclean表有大量的数据 ，但是都在p1分区中， p2分区没有任何数据 */

我们执行Split partition 的DDL 语句，并使用10046 level 12 event监控该过程：

 oradebug setmypid;
 oradebug event 10046 trace name context forever,level 12;

SQL>  alter table maclean split partition p2 at (100001)
  2   into (partition p3, partition p4);

Table altered.

[oracle@vrh8 ~]$ tkprof /s01/admin/G10R21/udump/g10r21_ora_4896.trc g10r21_ora_4896.tkf

TKPROF: Release 10.2.0.1.0 - Production on Thu Nov 17 23:42:48 2011

Copyright (c) 1982, 2005, Oracle.  All rights reserved.

从tkf 文件中可以找出以下内容：

 alter table maclean split partition p2 at (100001)
 into (partition p3, partition p4)

call     count       cpu    elapsed       disk      query    current        rows
------- ------  -------- ---------- ---------- ---------- ----------  ----------
Parse        1      0.13       0.30         20       1139          0           0
Execute      1      0.01       0.18          3          6         33           0
Fetch        0      0.00       0.00          0          0          0           0
------- ------  -------- ---------- ---------- ---------- ----------  ----------
total        2      0.14       0.48         23       1145         33           0

select /*+ FIRST_ROWS PARALLEL("MACLEAN", 1) */ 1
from
 "SYS"."MACLEAN" PARTITION ("P2")  where ( (  (  ( "OBJECT_ID" < 100001 )  )
  ) ) and rownum < 2

call     count       cpu    elapsed       disk      query    current        rows
------- ------  -------- ---------- ---------- ---------- ----------  ----------
Parse        1      0.00       0.00          0          0          0           0
Execute      1      0.00       0.00          0          0          0           0
Fetch        1     24.85      39.15     371276     385828          0           0
------- ------  -------- ---------- ---------- ---------- ----------  ----------
total        3     24.85      39.15     371276     385828          0           0

Misses in library cache during parse: 1
Optimizer mode: FIRST_ROWS
Parsing user id: SYS   (recursive depth: 1)

Rows     Row Source Operation
-------  ---------------------------------------------------
      0  COUNT STOPKEY (cr=385828 pr=371276 pw=0 time=39153836 us)
      0   TABLE ACCESS BY GLOBAL INDEX ROWID MACLEAN PARTITION: 2 2 (cr=385828 pr=371276 pw=0 time=39153817 us)
38736384    INDEX FULL SCAN IND_OBJ (cr=385828 pr=371276 pw=0 time=309891137 us)(object id 52832)

Elapsed times include waiting on following events:
  Event waited on                             Times   Max. Wait  Total Waited
  ----------------------------------------   Waited  ----------  ------------
  db file sequential read                    371276        0.08         19.46
  latch: cache buffers lru chain                  1        0.00          0.00

select /*+ FIRST_ROWS PARALLEL("MACLEAN", 1) */ 1
from
 "SYS"."MACLEAN" PARTITION ("P2")  where ( (  (  ( "OBJECT_ID" >= 100001 OR
  "OBJECT_ID" IS NULL  )  )  ) ) and rownum < 2

call     count       cpu    elapsed       disk      query    current        rows
------- ------  -------- ---------- ---------- ---------- ----------  ----------
Parse        1      0.00       0.00          0          0          0           0
Execute      1      0.00       0.00          0          0          0           0
Fetch        1      0.00       0.00          0          3          0           0
------- ------  -------- ---------- ---------- ---------- ----------  ----------
total        3      0.00       0.00          0          3          0           0

Misses in library cache during parse: 1
Optimizer mode: FIRST_ROWS
Parsing user id: SYS   (recursive depth: 1)

Rows     Row Source Operation
-------  ---------------------------------------------------
      0  COUNT STOPKEY (cr=3 pr=0 pw=0 time=83 us)
      0   PARTITION RANGE SINGLE PARTITION: 2 2 (cr=3 pr=0 pw=0 time=63 us)
      0    TABLE ACCESS FULL MACLEAN PARTITION: 2 2 (cr=3 pr=0 pw=0 time=36 us)

可以看到在执行”Alter table Split partition”的时候该DDL 语句产生了另外2条递归SQL(recursive sql)被调用，即上例中红色标注的SQL语句，这2条递归SQL分别以 “OBJECT_ID” >= 100001 OR “OBJECT_ID” IS NULL 和 “OBJECT_ID” < 100001 作为条件查询P2分区中的数据， Split Partition的DDL需要使用这2条SQL来找出是否存在满足分隔条件过滤的数据(注意语句中有rownum<2 ，所以实际最多也只返回1条数据，Oracle这样来判定分隔条件的左端或右端是否有数据)。

但是这里可以看到，其中以”OBJECT_ID” < 100001 作为条件的语句运行耗时39.15s，产生了大量的逻辑和物理读，究其原因是该SQL的执行计划采用了Index FULL SCAN ，而另外一条以 “OBJECT_ID” >= 100001 OR “OBJECT_ID” IS NULL 作为条件的递归SQL语句则采用了TABLE ACCESS FULL MACLEAN PARTITION，因为实际P2分区中是没有任何数据的，所以后者运行时间是us级别的，而前者所要扫描的是一整个没有分区的索引，这产生了大量的”db file sequential read”等待事件，我们再来看一下其执行计划：

SQL> explain plan for
  2  select /*+ FIRST_ROWS PARALLEL("MACLEAN", 1) */
  3   1
  4    from "SYS"."MACLEAN" PARTITION("P2")
  5   where (((("OBJECT_ID" < 100001))))
  6     and rownum < 2; 

SQL> select * from table(dbms_xplan.display());

PLAN_TABLE_OUTPUT
-------------------------------------------------
Plan hash value: 985826631

---------------------------------------------------------------------------------------------------------------
| Id  | Operation                           | Name    | Rows  | Bytes | Cost (%CPU)| Time     | Pstart| Pstop |
---------------------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT                    |         |     1 |    13 |   380K  (1)| 01:16:01 |       |       |
|*  1 |  COUNT STOPKEY                      |         |       |       |            |          |       |       |
|   2 |   TABLE ACCESS BY GLOBAL INDEX ROWID| MACLEAN |     1 |    13 |   380K  (1)| 01:16:01 |     2 |     2 |
|*  3 |    INDEX FULL SCAN                  | IND_OBJ |    38M|       |   380K  (1)| 01:16:01 |       |       |
---------------------------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

      1 - filter(ROWNUM<2)
   3 - access("OBJECT_ID"<100001)
       filter("OBJECT_ID"<100001)

注意以上递归SQL中的Hint “FIRST_ROWS PARALLEL(“MACLEAN”, 1)”是该Recursive SQL所固有的，换句话说是写死在Split Partition的Oracle代码层里的。

我们可以分析该Recursive SQL 采用INDEX FULL SCAN的原因可能有2种：
1. Split 所指定的分区的分区统计信息不准确，或者已经被清除。注意一旦我们Split 某个分区后该分区原有统计信息将失效，且被清除。

如下面的例子：

SQL> exec dbms_stats.gather_table_stats('SYS','MACLEAN',cascade=>true,degree=>2);

PL/SQL procedure successfully completed.

SQL> col high_value for a20

SQL> select partition_name,high_value,num_rows,blocks from dba_tab_partitions where table_name='MACLEAN';

PARTITION_NAME                 HIGH_VALUE             NUM_ROWS     BLOCKS
------------------------------ -------------------- ---------- ----------
P1                             99999                  38789142     533240
P3                             100001                        0          0
P4                             MAXVALUE                      0          0

SQL>  alter table maclean split partition p4 at (100010)
  2   into (partition p5, partition p4);

这里我们再次Split 当前的MAXVALUE分区p4

SQL> select partition_name,high_value,num_rows,blocks from dba_tab_partitions where table_name='MACLEAN';

PARTITION_NAME                 HIGH_VALUE             NUM_ROWS     BLOCKS
------------------------------ -------------------- ---------- ----------
P1                             99999                  38789142     533240
P3                             100001                        0          0
P4                             MAXVALUE
P5                             100010

可以发现Split Partiton 会导致原Partiton的统计信息失效，即便使用dbms_stats.lock_table_stats锁住统计信息也无法避免这种失效。
且单个partiton的统计信息失效并不会导致动态采用的发生(dynamic sampling)：

SQL> show parameter dyna 

NAME                                 TYPE        VALUE
------------------------------------ ----------- ------------------------------
optimizer_dynamic_sampling           integer     2

SQL> select * from maclean partition (p4);

no rows selected

Execution Plan
----------------------------------------------------------
Plan hash value: 3900731449

--------------------------------------------------------------------------------------------------
| Id  | Operation              | Name    | Rows  | Bytes | Cost (%CPU)| Time     | Pstart| Pstop |
--------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT       |         |  9699K|   860M| 29715   (2)| 00:05:57 |       |       |
|   1 |  PARTITION RANGE SINGLE|         |  9699K|   860M| 29715   (2)| 00:05:57 |     4 |     4 |
|   2 |   TABLE ACCESS FULL    | MACLEAN |  9699K|   860M| 29715   (2)| 00:05:57 |     4 |     4 |
--------------------------------------------------------------------------------------------------

Statistics
----------------------------------------------------------
        124  recursive calls
          0  db block gets
         23  consistent gets
          0  physical reads
          0  redo size
       1139  bytes sent via SQL*Net to client
        458  bytes received via SQL*Net from client
          1  SQL*Net roundtrips to/from client
          0  sorts (memory)
          0  sorts (disk)
          0  rows processed

SQL> alter system flush shared_pool;

System altered.

SQL> alter session set optimizer_dynamic_sampling=10;

Session altered.

SQL> select * from maclean partition (p4);

no rows selected

Execution Plan
----------------------------------------------------------
Plan hash value: 3900731449

--------------------------------------------------------------------------------------------------
| Id  | Operation              | Name    | Rows  | Bytes | Cost (%CPU)| Time     | Pstart| Pstop |
--------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT       |         |  9699K|   860M| 29715   (2)| 00:05:57 |       |       |
| 1 | PARTITION RANGE SINGLE | | 9699K| 860M| 29715 (2)| 00:05:57 | 4 | 4 |
|   2 |   TABLE ACCESS FULL    | MACLEAN |  9699K|   860M| 29715   (2)| 00:05:57 |     4 |     4 |
--------------------------------------------------------------------------------------------------

Statistics
----------------------------------------------------------
       5812  recursive calls
          0  db block gets
       1141  consistent gets
         22  physical reads
          0  redo size
       1139  bytes sent via SQL*Net to client
        458  bytes received via SQL*Net from client
          1  SQL*Net roundtrips to/from client
        139  sorts (memory)
          0  sorts (disk)
          0  rows processed

SQL> exec dbms_stats.gather_table_stats('SYS','MACLEAN',partname=>'P4');

PL/SQL procedure successfully completed.

SQL> select * from maclean partition (p4);

no rows selected

Execution Plan
----------------------------------------------------------
Plan hash value: 3900731449

--------------------------------------------------------------------------------------------------
| Id  | Operation              | Name    | Rows  | Bytes | Cost (%CPU)| Time     | Pstart| Pstop |
--------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT       |         |     1 |   177 |     2   (0)| 00:00:01 |       |       |
|   1 |  PARTITION RANGE SINGLE|         |     1 |   177 |     2   (0)| 00:00:01 |     4 |     4 |
|   2 |   TABLE ACCESS FULL    | MACLEAN |     1 |   177 |     2   (0)| 00:00:01 |     4 |     4 |
--------------------------------------------------------------------------------------------------

Statistics
----------------------------------------------------------
          0  recursive calls
          0  db block gets
          3  consistent gets
          0  physical reads
          0  redo size
       1139  bytes sent via SQL*Net to client
        458  bytes received via SQL*Net from client
          1  SQL*Net roundtrips to/from client
          0  sorts (memory)
          0  sorts (disk)
          0  rows processed

由于Split Partition 本身会导致分区的统计信息失效，这叫造成由该Split DDL所生成的递归查询SQL语句在解析时CBO Optimizer优化器无法了解该分区的必要统计信息，所以优化器会根据整张Table的统计信息来估算(根据算法来估，而没有实际的统计操作)，实际统计信息中整张表有38818438 行，且共有4个分区，所以P4分区就顺利成长的拥有38818438 /4 = 9704k Rows了，实际上例中红色标注的估算值时9699k rows，因为CBO 优化器得到的统计信息是该分区中有大量的数据，这导致其最终选择了 FULL INDEX SCAN的执行计划，而不是去扫描其实是空空如也的分区。

这是我们在 10g 中执行Split Partition 操作时需要密切注意的一个问题，解决方法是没执行一次Split Partition DDL语句之前都收集一遍MaxValue Partiton 的统计信息，因为该分区是空的，所以这种统计是十分迅速的：

exec dbms_stats.gather_table_stats('&OWNER','&TABNAME',partname=>'&PARNAME');

另一种手段则是在每次Split 之前都手动修改Maxvalue分区的统计信息，这样做会更快一些:

exec dbms_stats.set_table_stats(ownname => '&OWNER',tabname => '&TABNAME',partname => '&PARNAME',
numrows => 0,numblks => 0,force => TRUE);

2. 另一个原因是相关的递归SQL语句被嵌入了”First Rows”的hint ，该提示会让CBO更倾向于使用索引以便快速返回前几行的结果，注意因为这些递归SQL实际只要求返回一行结果，所以First Rows 可以说是恰当且明智的；另外在分区表+本地分区的情景中，即便这个分区是非空的且存有大量的数据，那么使用索引都可以说是正确的选择。

但是在这里选择INDEX FULL SCAN 恰恰是不明智的，很显然Oracle开发部门没有为分区表+ 非分区索引(Non-partitioned Indexes) 或全局索引(global partitioned indexes)的场景考虑该First Rows提示可能带来的后果，已知在版本10.2.0.2 和 10.2.0.3 上都存在该不恰当的递归SQL hint造成的Split Partiton性能问题，Bug Note<Bug 6435544: SPLIT PARTITION SLOW BECAUSE OF HINTED RECURSIVE SQL>说明了该问题:

Hdr: 6435544 10.2.0.2.0 RDBMS 10.2.0.2.0 QRY OPTIMIZER PRODID-5 PORTID-226
Abstract: SPLIT PARTITION SLOW BECAUSE OF HINTED RECURSIVE SQL

PROBLEM:
--------
Split partition operation took more than 45 minutes to complete. Almost all
the time is taken up by the following SQL -
select /*+ FIRST_ROWS PARALLEL("D_CUSTOMER_ORDER_ITEM_CHANNELS", 8) */ 1
from
 "BOOKER"."D_CUSTOMER_ORDER_ITEM_CHANNELS" PARTITION ("COIC101_MAX")
where ( ( ( ( "LEGAL_ENTITY_ID" < 101 )  )  OR  ( "LEGAL_ENTITY_ID" = 101 AND
 (
 "ORDER_DAY" < TO_DATE(' 2007-11-01 00:00:00', 'SYYYY-MM-DD HH24:MI:SS',
 'NLS_CALENDAR=GREGORIAN') )  )  ) ) and rownum < 2

DIAGNOSTIC ANALYSIS:
--------------------
a. Table is partitioned on - (LEGAL_ENTITY_ID, ORDER_DAY).
   Index PK_D_CUST_ORDER_ITEM_CHANNELS is on -
       (CUSTOMER_ORDER_ITEM_ID, MARKETPLACE_ID, LEGAL_ENTITY_ID, ORDER_DAY)

b. Table has 555 partitions. Index is a global index. 

c. From the tkprof output -

call   count       cpu    elapsed       disk      query  current  rows
----- ------  -------- ---------- ---------- ---------- --------  ----
Parse      1      0.00       0.00          0          0        0     0
Execute    1      0.00       0.00          0          0        0     0
Fetch      1    405.50    2796.28    1408207    1444973        0     0
----- ------  -------- ---------- ---------- ---------- --------  ----
total      3    405.50    2796.28    1408207    1444973        0     0

Misses in library cache during parse: 1
Optimizer mode: FIRST_ROWS
Parsing user id: 108     (recursive depth: 1)

Rows     Row Source Operation
-------  ---------------------------------------------------
time=6 us)(object id 1160811)

d. Although we have partition_name in the query, we go for INDEX FULL scan
because of hard coded hint FIRST_ROWS. A FTS of the concerned partition would
be much faster in this case.

WORKAROUND:
-----------
Drop the global index which would force an FTS on a single partition.

RELEASE NOTES:
]] Poor performance was possible for ALTER TABLE SPLIT PARTITION if there
]] was an index on the partition key.
REDISCOVERY INFORMATION:
If you get poor performance for an ALTER TABLE SPLIT PARTITION operation
and there is an index on the partition key, you may be hitting this bug.
WORKAROUND:
None

Metalink给出的Workaround方案是将分区表上的global index 全局索引drop 掉，这样可以让CBO只能选择对single partition的FULL TABLE SCAN。

实际上这个Solution并很不能让人满意，在产品环境中我们不可能随意drop掉某张关键表上的索引，所以这个solution的可操作性很低。

补充：我们来看一下First Rows Hint 在CBO计算成本时如何起作用的：

SQL> oradebug setmypid;
Statement processed.
SQL> oradebug event 10053 trace name context forever,level 1;
Statement processed.
SQL> explain plan for select /*+ FIRST_ROWS PARALLEL("MACLEAN", 1) */
  2   1
  3    from "SYS"."MACLEAN" PARTITION("P4")
  4   where (((("OBJECT_ID" < 100010))))
  5     and rownum < 2; Explained. SQL> oradebug tracefile_name;

10053 trace content ================================================

  *************************************
  PARAMETERS WITH ALTERED VALUES
  ******************************
  optimizer_mode_hinted               = true
  optimizer_mode                      = first_rows

***************************************
BASE STATISTICAL INFORMATION
***********************
Table Stats::
  Table: MACLEAN  Alias: MACLEAN  Partition [3]
    #Rows: 0  #Blks:  1  AvgRowLen:  0.00
    #Rows: 0  #Blks:  1  AvgRowLen:  0.00
Index Stats::
  Index: IND_OBJ  Col#: 5 4 8 9 6 1 10
    LVLS: 3  #LB: 380544  #DK: 50447  LB/K: 7.00  DB/K: 777.00  CLUF: 39208542.00
***************************************
SINGLE TABLE ACCESS PATH
  Column (#4): OBJECT_ID(NUMBER)  Part#: 3
    AvgLen: 22.00 NDV: 0 Nulls: 0 Density: 0.0000e+00 Min: 0 Max: 0
  Column (#4): OBJECT_ID(NUMBER)
    AvgLen: 22.00 NDV: 0 Nulls: 0 Density: 0.0000e+00 Min: 0 Max: 0
  Table: MACLEAN  Alias: MACLEAN     
    Card: Original: 0  Rounded: 1  Computed: 0.00  Non Adjusted: 0.00
  Access Path: TableScan
    Cost:  2.00  Resp: 2.00  Degree: 0
      Cost_io: 2.00  Cost_cpu: 7121
      Resp_io: 2.00  Resp_cpu: 7121
kkofmx: index filter:"MACLEAN"."OBJECT_ID"<100010 AND ROWNUM<2
  Access Path: index (skip-scan)
    SS sel: 0.0000e+00  ANDV (#skips): 4073
    SS io: 32584.00 vs. table scan io: 2.00
    Skip Scan rejected
  Access Path: index (FullScan)
    Index: IND_OBJ
    resc_io: 380547.00  resc_cpu: 10551751028
    ix_sel: 1  ix_sel_with_filters: 1
    Cost: 382007.38  Resp: 382007.38  Degree: 1
  Best:: AccessPath: IndexRange  Index: IND_OBJ
         Cost: 382007.38  Degree: 1  Resp: 382007.38  Card: 0.00  Bytes: 0
***************************************
OPTIMIZER STATISTICS AND COMPUTATIONS
***************************************
GENERAL PLANS
***************************************
Considering cardinality-based initial join order.
***********************
Join order[1]:  MACLEAN[MACLEAN]#0
***********************
Best so far: Table#: 0  cost: 382007.3822  card: 0.0000  bytes: 13
(newjo-stop-1) k:0, spcnt:0, perm:1, maxperm:80000
*********************************
Number of join permutations tried: 1
*********************************
prefetching is on for IND_OBJ
Final - First Rows Plan:  Best join order: 1
  Cost: 382007.3822  Degree: 1  Card: 1.0000  Bytes: 13
  Resc: 382007.3822  Resc_io: 380547.0000  Resc_cpu: 12512178128
  Resp: 382007.3822  Resp_io: 380547.0000  Resc_cpu: 12512178128
kkoipt: Query block SEL$1 (#0)
******* UNPARSED QUERY IS *******
SELECT /*+ FIRST_ROWS NO_PARALLEL ("MACLEAN") */ 1 "1" FROM "SYS"."MACLEAN" PARTITION ("P4") 
"MACLEAN" WHERE ROWNUM<2 AND "MACLEAN"."OBJECT_ID"<100010
kkoqbc-end
          : call(in-use=46464, alloc=49080), compile(in-use=39288, alloc=40552)
apadrv-end: call(in-use=46464, alloc=49080), compile(in-use=40072, alloc=40552)

sql_id=ff1ft3uxsq105.
Current SQL statement for this session:
explain plan for select /*+ FIRST_ROWS PARALLEL("MACLEAN", 1) */
 1
  from "SYS"."MACLEAN" PARTITION("P4")
 where (((("OBJECT_ID" < 100010))))
   and rownum < 2

============
Plan Table
============
-------------------------------------------------------+-----------------------------------+---------------+
| Id  | Operation                            | Name    | Rows  | Bytes | Cost  | Time      | Pstart| Pstop |
-------------------------------------------------------+-----------------------------------+---------------+
| 0   | SELECT STATEMENT                     |         |       |       |  373K |           |       |       |
| 1   |  COUNT STOPKEY                       |         |       |       |       |           |       |       |
| 2   |   TABLE ACCESS BY GLOBAL INDEX ROWID | MACLEAN |     1 |    13 |  373K |  01:16:25 | 4     | 4     |
| 3   |    INDEX FULL SCAN                   | IND_OBJ |   37M |       |  373K |  01:16:25 |       |       |
-------------------------------------------------------+-----------------------------------+---------------+
Predicate Information:
----------------------
1 - filter(ROWNUM<2)
3 - access("OBJECT_ID"<100010)
3 - filter("OBJECT_ID"<100010)

Content of other_xml column
===========================
  db_version     : 10.2.0.1
  parse_schema   : SYS
  plan_hash      : 985826631
  Outline Data:
  /*+
    BEGIN_OUTLINE_DATA
      IGNORE_OPTIM_EMBEDDED_HINTS
      OPTIMIZER_FEATURES_ENABLE('10.2.0.1')
      FIRST_ROWS
      OUTLINE_LEAF(@"SEL$1")
      INDEX(@"SEL$1" "MACLEAN"@"SEL$1" ("MACLEAN"."DATA_OBJECT_ID" "MACLEAN"."OBJECT_ID"
"MACLEAN"."LAST_DDL_TIME" "MACLEAN"."TIMESTAMP" "MACLEAN"."OBJECT_TYPE" "MACLE
AN"."OWNER" "MACLEAN"."STATUS"))
    END_OUTLINE_DATA
  */

可以看到虽然INDEX FULL SCAN的成本(cost:382007)大于 Access Path: TableScan (cost : 2.00) 很多，但是optimizer 最终仍然选择了Index Full Scan ，因为其是满足First Rows 要求的执行计划(红色标注部分)。

于是我开始自己寻找workaround的路径，目标是让优化器忽略”First Rows”的影响。我一开始寄望于能够通过设置一些影响CBO计算cost的优化器参数来让optimizer 迷途知返，包括设置optimizer_index_cost_adj和”_db_file_optimizer_read_count”的值到一个很大水平，但发现并不起作用:

SQL> alter session set "_db_file_optimizer_read_count"=65535;

Session altered.

SQL>  alter session set optimizer_index_cost_adj=10000;

Session altered.

SQL> alter system flush shared_pool;

System altered.

SQL> select /*+ FIRST_ROWS PARALLEL("MACLEAN", 1) */
  2   1
  3    from "SYS"."MACLEAN" PARTITION("P4")
  4   where (((("OBJECT_ID" < 100010))))
  5     and rownum < 2;

no rows selected

Execution Plan
----------------------------------------------------------
Plan hash value: 985826631

---------------------------------------------------------------------------------------------------------------
| Id  | Operation                           | Name    | Rows  | Bytes | Cost (%CPU)| Time     | Pstart| Pstop |
---------------------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT                    |         |     1 |    13 |    38M  (1)|127:20:09 |       |       |
|*  1 |  COUNT STOPKEY                      |         |       |       |            |          |       |       |
|   2 |   TABLE ACCESS BY GLOBAL INDEX ROWID| MACLEAN |     1 |    13 |    38M  (1)|127:20:09 |     4 |     4 |
|*  3 |    INDEX FULL SCAN                  | IND_OBJ |    39M|       |    38M  (1)|127:20:09 |       |       |
---------------------------------------------------------------------------------------------------------------

得益于好奇心，我以’%optimizer%ignore’的Like语句去查了Oracle的隐式参数表，果然有志者事竟成，最终有所斩获:

SQL> col name for a40
SQL> col value for a20
SQL> col describ for a60
SQL> set linesize 200 pagesize 1400
SQL> SELECT x.ksppinm NAME, y.ksppstvl VALUE, x.ksppdesc describ
  2   FROM SYS.x$ksppi x, SYS.x$ksppcv y
  3   WHERE x.inst_id = USERENV ('Instance')
  4   AND y.inst_id = USERENV ('Instance')
  5   AND x.indx = y.indx
  6  AND x.ksppinm LIKE '%optimizer%ignore%';

NAME                                     VALUE                DESCRIB
---------------------------------------- -------------------- ---------------------
_optimizer_ignore_hints                  TRUE                 enables the embedded hints to be ignored

在metalink上查了下没有关于该”_optimizer_ignore_hints” 隐式参数的任何有用信息，就注释来看是可以启用是否忽略SQL中嵌入的HINT提示信息，我们来具体看以下是否其作用:

SQL> alter system flush shared_pool;

System altered.

SQL> explain plan for select /*+ FIRST_ROWS PARALLEL("MACLEAN", 1) */
  2   1
  3    from "SYS"."MACLEAN" PARTITION("P4")
  4   where (((("OBJECT_ID" < 100010))))
  5     and rownum < 2; 

Explained. 

SQL> select * from table(dbms_xplan.display());

PLAN_TABLE_OUTPUT
---------------------------------------------
Plan hash value: 985826631

---------------------------------------------------------------------------------------------------------------
| Id  | Operation                           | Name    | Rows  | Bytes | Cost (%CPU)| Time     | Pstart| Pstop |
---------------------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT                    |         |     1 |    13 |    38M  (1)|127:20:09 |       |       |
|*  1 |  COUNT STOPKEY                      |         |       |       |            |          |       |       |
|   2 |   TABLE ACCESS BY GLOBAL INDEX ROWID| MACLEAN |     1 |    13 |    38M  (1)|127:20:09 |     4 |     4 |
|*  3 |    INDEX FULL SCAN                  | IND_OBJ |    39M|       |    38M  (1)|127:20:09 |       |       |
---------------------------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   1 - filter(ROWNUM<2)
   3 - access("OBJECT_ID"<100010)
       filter("OBJECT_ID"<100010)

SQL> alter session set "_optimizer_ignore_hints"=true;

Session altered.

SQL> oradebug setmypid;
Statement processed.

SQL> oradebug event 10053 trace name context forever , level 1;
Statement processed.

SQL> explain plan for select /*+ FIRST_ROWS PARALLEL("MACLEAN", 1) */
  2   1
  3    from "SYS"."MACLEAN" PARTITION("P4")
  4   where (((("OBJECT_ID" < 100010))))
  5     and rownum < 2; Explained. SQL> select * from table(dbms_xplan.display());

PLAN_TABLE_OUTPUT
-----------------------------------------------
Plan hash value: 4280157877

---------------------------------------------------------------------------------------------------
| Id  | Operation               | Name    | Rows  | Bytes | Cost (%CPU)| Time     | Pstart| Pstop |
---------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT        |         |     1 |    13 |     2   (0)| 00:00:01 |       |       |
|*  1 |  COUNT STOPKEY          |         |       |       |            |          |       |       |
|   2 |   PARTITION RANGE SINGLE|         |     1 |    13 |     2   (0)| 00:00:01 |     4 |     4 |
|*  3 |    TABLE ACCESS FULL    | MACLEAN |     1 |    13 |     2   (0)| 00:00:01 |     4 |     4 |
---------------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   1 - filter(ROWNUM<2)
   3 - filter("OBJECT_ID"<100010)

惊喜！　该 “_optimizer_ignore_hints”参数确实起到无视HINT提示的作用，且可以直接在session级别修改，十分方便，我们透过10053 event来观察该参数是如何其作用的：

***************************************
PARAMETERS USED BY THE OPTIMIZER
********************************
  *************************************
  PARAMETERS WITH ALTERED VALUES
  ******************************
  _optimizer_ignore_hints             = true

***************************************
BASE STATISTICAL INFORMATION
***********************
Table Stats::
  Table: MACLEAN  Alias: MACLEAN  Partition [3]
    #Rows: 0  #Blks:  1  AvgRowLen:  0.00
    #Rows: 0  #Blks:  1  AvgRowLen:  0.00
Index Stats::
  Index: IND_OBJ  Col#: 5 4 8 9 6 1 10
    LVLS: 3  #LB: 380544  #DK: 50447  LB/K: 7.00  DB/K: 777.00  CLUF: 39208542.00
***************************************
SINGLE TABLE ACCESS PATH
  Column (#4): OBJECT_ID(NUMBER)  Part#: 3
    AvgLen: 22.00 NDV: 0 Nulls: 0 Density: 0.0000e+00 Min: 0 Max: 0
  Column (#4): OBJECT_ID(NUMBER)
    AvgLen: 22.00 NDV: 0 Nulls: 0 Density: 0.0000e+00 Min: 0 Max: 0
  Table: MACLEAN  Alias: MACLEAN
    Card: Original: 0  Rounded: 1  Computed: 0.00  Non Adjusted: 0.00
  Access Path: TableScan
    Cost:  2.00  Resp: 2.00  Degree: 0
      Cost_io: 2.00  Cost_cpu: 7121
      Resp_io: 2.00  Resp_cpu: 7121
kkofmx: index filter:"MACLEAN"."OBJECT_ID"<100010 AND ROWNUM<2
  Access Path: index (skip-scan)
    SS sel: 0.0000e+00  ANDV (#skips): 4073
    SS io: 32584.00 vs. table scan io: 2.00
    Skip Scan rejected
  Access Path: index (FullScan)
    Index: IND_OBJ
    resc_io: 380547.00  resc_cpu: 10551751028
    ix_sel: 1  ix_sel_with_filters: 1
    Cost: 382007.38  Resp: 382007.38  Degree: 1
  Best:: AccessPath: TableScan
***************************************
OPTIMIZER STATISTICS AND COMPUTATIONS
***************************************
GENERAL PLANS
***************************************
Considering cardinality-based initial join order.
***********************
Join order[1]:  MACLEAN[MACLEAN]#0
***********************
Best so far: Table#: 0  cost: 2.0008  card: 0.0000  bytes: 13
(newjo-stop-1) k:0, spcnt:0, perm:1, maxperm:80000
*********************************
Number of join permutations tried: 1
*********************************
Final - All Rows Plan:  Best join order: 1
  Cost: 2.0008  Degree: 1  Card: 1.0000  Bytes: 13
  Resc: 2.0008  Resc_io: 2.0000  Resc_cpu: 7121
  Resp: 2.0008  Resp_io: 2.0000  Resc_cpu: 7121

Current SQL statement for this session:
explain plan for select /*+ FIRST_ROWS PARALLEL("MACLEAN", 1) */
 1
  from "SYS"."MACLEAN" PARTITION("P4")
 where (((("OBJECT_ID" < 100010))))
   and rownum < 2

============
Plan Table
============
-------------------------------------------+-----------------------------------+---------------+
| Id  | Operation                | Name    | Rows  | Bytes | Cost  | Time      | Pstart| Pstop |
-------------------------------------------+-----------------------------------+---------------+
| 0   | SELECT STATEMENT         |         |       |       |     2 |           |       |       |
| 1   |  COUNT STOPKEY           |         |       |       |       |           |       |       |
| 2   |   PARTITION RANGE SINGLE |         |     1 |    13 |     2 |  00:00:01 | 4     | 4     |
| 3   |    TABLE ACCESS FULL     | MACLEAN |     1 |    13 |     2 |  00:00:01 | 4     | 4     |
-------------------------------------------+-----------------------------------+---------------+
Predicate Information:
----------------------
1 - filter(ROWNUM<2)
3 - filter("OBJECT_ID"<100010)

Content of other_xml column
===========================
  db_version     : 10.2.0.1
  parse_schema   : SYS
  plan_hash      : 4280157877
  Outline Data:
  /*+
    BEGIN_OUTLINE_DATA
      IGNORE_OPTIM_EMBEDDED_HINTS
      OPTIMIZER_FEATURES_ENABLE('10.2.0.1')
      ALL_ROWS
      OUTLINE_LEAF(@"SEL$1")
      FULL(@"SEL$1" "MACLEAN"@"SEL$1")
    END_OUTLINE_DATA
  */

Dumping Hints
=============
  atom_hint=(@=0x988453c0 err=20 resol=1 used=0 token=453 org=1 lvl=2 txt=FIRST_ROWS )
  atom_hint=(@=0x988451f8 err=0 resol=1 used=1 token=177 org=1 lvl=3 txt=NO_PARALLEL ("MACLEAN") )
********** WARNING: SOME HINTS HAVE ERRORS *********

实际 “_optimizer_ignore_hints”参数是起到的作用是使SQL 隐式地回归到默认的optimizer_mode=ALL_ROWS上来，我们终于战胜了嵌入在SQL语句中的”First Rows” Hint 。

总结

在针对10g 多个版本上的Split Partition 可能因其DDL所附带的递归SQL 使用了固定的而又不恰当的”First rows Hint” 提示而造成语句执行缓慢的问题，我们可以通过以下3个方案解决：

A. 通过每次执行Split之前都收集指定分区的统计，并设置”_optimizer_ignore_hints” =true 来规避分区统计信息失效和不恰当的”First Rows hint” 所可能带来的问题，方法如下:

exec dbms_stats.gather_table_stats('&OWNER','&TABNAME',partname=>'&PARNAME');

alter session set "_optimizer_ignore_hints"=true;

推荐使用方案A

补充：

有同学反映：

”隐藏参数_optimizer_ignore_hints在分区表的split操作中并未启到作用。

从我的测试结果来看，虽然导致split操作慢的根源是FIRST_ROWS优化器模式下的分区表select语句：

select /*+ FIRST_ROWS PARALLEL(“MACLEAN”, 1) */ 1
from
MACLEAN PARTITION (“P4”) where ( ( ( (
“OBJECT_ID” < ‘1000010’ ) ) ) ) and rownum < 2;

不过以上的sql语句在_optimizer_ignore_hints参数调整后仍使用FIRST_ROWS的优化器模式，即没有生效。

我觉得这可能与oracle在执行自己内部命令时，会忽略该参数的设置。（直接执行select的sql语句确实会忽略hint）”

实际通过10053 事件追踪该SPLIT PARTITION DDL语句所生产的递归SQL语句，发现带有”FIRST_ROWS”提示的SELECT语句甚至没有解析的过程，很有可能是该递归SQL语句直接使用了内部硬编码的存储大纲OUTLINES所导致的。

换句话说之前_optimizer_ignore_hints的隐式参数针对我们手动执行的带有FIRST_ROWS HINT的SELECT语句是有效的，而对于DDL所附带的递归SQL无效。

注：实际在对嵌入了HINT的非recursive SQL语句做调优时，若我们无法修改该SQL的HINT，则还是可以利用到”_optimizer_ignore_hints”的。

想了一下可以通过设置较旧的优化器特性了解决该问题(alter session set optimizer_features_enable=’8.0.0′;)，该optimizer_features_enable参数同样可以在session级别设置，如：

TEST A:

SQL> set timing on;

SQL> oradebug setmypid;
Statement processed.

SQL> oradebug event 10046 trace name context forever,level 12;
Statement processed.

SQL> oradebug tracefile_name;
/s01/admin/G10R21/udump/g10r21_ora_13646.trc

SQL> alter table maclean split partition p4 at (10000081) into (partition p14, partition p4);

Table altered.

Elapsed: 00:00:42.50

TEST B:

SQL> set timing on;

SQL> oradebug setmypid;
Statement processed.

SQL> oradebug tracefile_name;
/s01/admin/G10R21/udump/g10r21_ora_13656.trc

SQL> alter session set optimizer_features_enable='8.0.0';

Session altered.

Elapsed: 00:00:00.01

SQL> alter table maclean split partition p4 at (10000091) into (partition p15, partition p4);

Table altered.

Elapsed: 00:00:00.05

PARSING IN CURSOR #2 len=152 dep=1 uid=0 oct=3 lid=0 tim=1291531645417871 hv=2124209225 ad='a74b41f0'

select /*+ FIRST_ROWS PARALLEL("MACLEAN", 1) */ 1 from "SYS"."MACLEAN" PARTITION ("P4") 
where ( (  (  ( "OBJECT_ID" < 10000091 )  )  ) ) and rownum < 2
END OF STMT
PARSE #2:c=1000,e=684,p=0,cr=0,cu=0,mis=1,r=0,dep=1,og=2,tim=1291531645417864
BINDS #2:
EXEC #2:c=0,e=71,p=0,cr=0,cu=0,mis=0,r=0,dep=1,og=2,tim=1291531645417984
FETCH #2:c=0,e=92,p=0,cr=3,cu=0,mis=0,r=0,dep=1,og=2,tim=1291531645418094
STAT #2 id=1 cnt=0 pid=0 pos=1 obj=0 op='COUNT STOPKEY (cr=3 pr=0 pw=0 time=122 us)'
STAT #2 id=2 cnt=0 pid=1 pos=1 obj=0 op='PARTITION RANGE SINGLE PARTITION: 13 13 (cr=3 pr=0 pw=0 time=95 us)'
STAT #2 id=3 cnt=0 pid=2 pos=1 obj=52809 op='TABLE ACCESS FULL MACLEAN PARTITION: 13 13 (cr=3 pr=0 pw=0 time=59 us)'
=====================

PARSING IN CURSOR #2 len=177 dep=1 uid=0 oct=3 lid=0 tim=1291531645418799 hv=339345368 ad='a74cedd8'
select /*+ FIRST_ROWS PARALLEL("MACLEAN", 1) */ 1 from "SYS"."MACLEAN" PARTITION ("P4")
 where ( (  (  ( "OBJECT_ID" >= 10000091 OR "OBJECT_ID" IS NULL  )  )  ) ) and
rownum < 2
END OF STMT
PARSE #2:c=1000,e=589,p=0,cr=0,cu=0,mis=1,r=0,dep=1,og=2,tim=1291531645418792
BINDS #2:
EXEC #2:c=0,e=67,p=0,cr=0,cu=0,mis=0,r=0,dep=1,og=2,tim=1291531645418908
FETCH #2:c=0,e=66,p=0,cr=3,cu=0,mis=0,r=0,dep=1,og=2,tim=1291531645418992
STAT #2 id=1 cnt=0 pid=0 pos=1 obj=0 op='COUNT STOPKEY (cr=3 pr=0 pw=0 time=92 us)'
STAT #2 id=2 cnt=0 pid=1 pos=1 obj=0 op='PARTITION RANGE SINGLE PARTITION: 13 13 (cr=3 pr=0 pw=0 time=64 us)'
STAT #2 id=3 cnt=0 pid=2 pos=1 obj=52809 op='TABLE ACCESS FULL MACLEAN PARTITION: 13 13 (cr=3 pr=0 pw=0 time=36 us)'
WAIT #1: nam='control file sequential read' ela= 31 file#=0 block#=1 blocks=1 obj#=-1 tim=1291531645419383
WAIT #1: nam='control file sequential read' ela= 12 file#=1 block#=1 blocks=1 obj#=-1 tim=1291531645419432
WAIT #1: nam='control file sequential read' ela= 23 file#=0 block#=15 blocks=1 obj#=-1 tim=1291531645419494
WAIT #1: nam='control file sequential read' ela= 10 file#=0 block#=17 blocks=1 obj#=-1 tim=1291531645419556

B. 如果确实遇到了该问题，也可以将Index FULL SCAN 所使用的全局索引drop 掉来达到强制使用FULL single table partition SCAN的目的，实际使用中不推荐

C. 避免使用Global Partitioned index 或 Non-partitioned Index ，而采用Local partitioned index ，这似乎更难做到

D. 也可以通过将原Maxvalue的分区drop掉之后(前提是该分区真的是空的)，再添加新分区的做法来绕过该问题

Filed Under: Oracle Tagged With: _optimizer_ignore_hints, cbo, hint, partition

ORA-00600:[32695], [hash aggregation can't be done]错误一例

2010/09/03 by admin 3 Comments

还是那个hash group by算法的问题，日志文件中出现以下记录:

*** ACTION NAME:(SQL 窗口 - 新建) 2010-09-03 14:27:54.594
*** MODULE NAME:(PL/SQL Developer) 2010-09-03 14:27:54.594
*** SERVICE NAME:(HQYDB1) 2010-09-03 14:27:54.594
*** SESSION ID:(3205.17923) 2010-09-03 14:27:54.594
*** 2010-09-03 14:27:54.594
ksedmp: internal or fatal error
ORA-00600: internal error code, arguments: [32695], [hash aggregation can't be done], [], [], [], [], [], []
Current SQL statement for this session:
create table zou_201008_cell_id as
select /* g_all_cdr02,60 */
calling_num mobile_number,
lac,
lpad(cell_id,5,'0') cell_id,
count(*) c,
sum(call_duration) call_duration,
sum(decode(record_type,'00',1,0)*call_duration) moc_call_duration,
sum(decode(record_type,'01',1,0)*call_duration) mtc_call_duarion
from  g_all_cdr02
where substr(calling_num,1,7) in (select mobile_prefix from zou_mobile_prefix)
group by
calling_num ,
lac,
lpad(cell_id,5,'0')
----- Call Stack Trace -----
calling              call     entry                argument values in hex
location             type     point                (? means dubious value)
-------------------- -------- -------------------- ----------------------------
ksedst+001c          bl       ksedst1              100000000 ? 11055A9A0 ?
ksedmp+0290          bl       ksedst               104A54870 ?
ksfdmp+0018          bl       03F30204
kgeriv+0108          bl       _ptrgl
kgesiv+0080          bl       kgeriv               1050BE654 ? 1050BE604 ?
                                                   0000027E5 ? 080000000 ?
                                                   07FFFFFFF ?
ksesic1+0060         bl       kgesiv               43300000FFFF5310 ?
                                                   4530000000000000 ?
                                                   000000071 ? 000000001 ?
                                                   000000000 ?
qeshPartitionBuildH  bl       01F9CA24
D+04bc
qeshGBYOpenScan2+02  bl       qeshPartitionBuildH  0000027E5 ? 1105C06C0 ?
34                            D
qeshGBYOpenScan+001  bl       qeshGBYOpenScan2     FFFFFFFFFFF5740 ? 11055A938 ?
8                                                  000000000 ? 000000010 ?
qerghFetch+05e8      bl       qeshGBYOpenScan      000001000 ?
rwsfcd+0054          bl       _ptrgl
qerltFetch+036c      bl       03F2EB1C
ctcdrv+4160          bl       01F9C898
opiexe+2884          bl       ctcdrv               100000001 ? 100000001 ?
                                                   110467F30 ?
opiosq0+19f0         bl       opiexe               FFFFFFFFFFF8B50 ?
                                                   2824422142420820 ?
                                                   FFFFFFFFFFF8C10 ?
kpooprx+0168         bl       opiosq0              300000000 ? 000000000 ?
                                                   000000000 ? A4000000000000 ?
kpoal8+0400          bl       kpooprx              FFFFFFFFFFFB464 ?
                                                   FFFFFFFFFFFB068 ?
                                                   1BF000001BF ? 100000001 ?
                                                   000000000 ? A40000000000A4 ?
                                                   000000000 ? 1103878F8 ?
opiodr+0ae0          bl       _ptrgl
ttcpip+1020          bl       _ptrgl
opitsk+1124          bl       01F9F2A0
opiino+0990          bl       opitsk               000000000 ? 000000000 ?
opiodr+0ae0          bl       _ptrgl
opidrv+0484          bl       01F9E0E8
sou2o+0090           bl       opidrv               3C02DC1BBC ? 44065F000 ?
                                                   FFFFFFFFFFFF3A0 ?
opimai_real+01bc     bl       01F9B9F4
main+0098            bl       opimai_real          000000000 ? 000000000 ?
__start+0098         bl       main                 000000000 ? 000000000 ?

--------------------- Binary Stack Dump ---------------------

这次是因为应用人员不了解alter session的作用域，在PL/SQL Developer工具中的不同窗口(也就是不在同一会话中)中执行了”alter session set “_gby_hash_aggregation_enabled” = false;”和涉及group by操作的SQL，并导致了unpublished bug:6471770被触发。
我们比较容易地workaround绕过这个Bug:


/* 在会话级别设置优化参数_gby_hash_aggregation_enabled */

alter session set "_gby_hash_aggregation_enabled" = false;

/* 或者在语句中加入NO_USE_HASH_AGGREGATION的 hint */

select  /*+ NO_USE_HASH_AGGREGATION */ ....

以上提及的unpublished bug:6471770据称在10.2.0.5,11.1.0.7,11.2.0.1版本中被修正了。

Filed Under: Oracle Tagged With: cbo, hash group by, hint, internal error, optimizer, Oracle Bug, sort, SQL tuning

11g 新特性IGNORE_ROW_ON_DUPKEY_INDEX提示

2010/08/16 by admin 1 Comment

11g中引入一些功能强大的hint提示，其中就包括了IGNORE_ROW_ON_DUPKEY_INDEX。其官方定义为:
“The IGNORE_ROW_ON_DUPKEY_INDEX hint applies only to single-table INSERT operations. It is not supported for UPDATE, DELETE, MERGE, or multitable insert operations. IGNORE_ROW_ON_DUPKEY_INDEX causes the statement to ignore a unique key violation for a specified set of columns or for a specified index. When a unique key violation is encountered, a row-level rollback occurs and execution resumes with the next input row. If you specify this hint when inserting data with DML error logging enabled, then the unique key violation is not logged and does not cause statement termination.”

针对具有唯一性约束的键，若程序设计时没有考虑到插入具有重复键值的行会引发ORA-00001 unique constraint violated错误，进而可能导致程序过程终止的问题的话；直接修改程序将会是十分复杂的工程。所幸我们在11g中有了”IGNORE_ROW_ON_DUPKEY_INDEX”提示，在INSERT单表的语句中加入该hint可以让Oracle静默地(silently)忽略那些具有重复键值的插入行，而不触发ORA-00001错误，允许程序继续运行下去，这可以说是一种十分简便的折中方案。要在生产环境中使用该特性，我们有必要对比一下其同使用Exception处理违反唯一约束间的性能差别。

SQL> drop table youyus ;

Table dropped.

SQL> create table youyus (t1 int ,t2 varchar2(20),t3 varchar2(30)) tablespace users;

Table created

SQL> create unique index youyus_uk on youyus(t1) tablespace users;

Index created
/*清理现场，添加唯一约束索引*/

SQL> alter system set optimizer_dynamic_sampling=1;

System altered.

SQL> alter system flush shared_pool;

System altered.

SQL> alter system flush buffer_cache;

System altered.

/* 以下过程在0-30000的整数内随机取200000次值，可以确保尝试INSERT大量重复t1键值的行，
以便测试使用DUP_VAL_ON_INDEX Exception时的各项性能参数;此处以及之后我们都将commit置于loop循环外，
从而避免大量commit影响我们的实验结果*/

declare
  rnd int;
begin
  /* 使用exception处理重复键值插入违反约束的问题*/
  for i in 1 .. 200000 loop
    BEGIN
      select round(dbms_random.value * 30000) into rnd from dual;
      insert into youyus
        (t1, t2, t3)
      values
        (rnd, 'DUPLICATE', 'INSERT TEST');
    exception
      when DUP_VAL_ON_INDEX then
        continue;
    end;
  end loop;
  commit;
end;

SQL> select plsql_exec_time,cpu_time,elapsed_time,user_io_wait_time from v$sql where sql_text like 'declare%exception%';

PLSQL_EXEC_TIME   CPU_TIME ELAPSED_TIME USER_IO_WAIT_TIME
--------------- ---------- ------------ -----------------
        4392268   88296566     92345066            621020
/* 逝去时间92s,CPU时间为88s,PLSQL执行时间4s*/

SQL> select count(*) from youyus;

  COUNT(*)
----------
     29958

SQL> truncate table youyus;

Table truncated.

SQL> alter system flush shared_pool;

System altered.

SQL> alter system flush buffer_cache;

System altered.

/* 在插入前判断插入值是否违反唯一约束应当是一种不错的想法，不过写起来多少有些"麻烦"*/

declare
  dup_count int;
  rnd       int;
begin
   /* 使用插入前判断(check before insert)是否违反唯一约束的方式*/
  for i in 1 .. 200000 loop
    BEGIN
      select round(dbms_random.value * 30000) into rnd from dual;
      select count(*) into dup_count from youyus where t1 = rnd;
      IF (dup_count = 0) then
        insert into youyus
          (t1, t2, t3)
        values
          (rnd, 'DUPLICATE', 'INSERT TEST');
      END IF;
    END;
  END LOOP;
  commit;
end;

SQL> select plsql_exec_time,cpu_time,elapsed_time,user_io_wait_time  from v$sql where sql_text like 'declare%check%';

PLSQL_EXEC_TIME   CPU_TIME ELAPSED_TIME USER_IO_WAIT_TIME
--------------- ---------- ------------ -----------------
        2153769   15709301     18265730            679813

/* PLSQL执行时间缩短到2s,整个过程的CPU时间大幅减少到15s*/
/***  以上对比可以得出Exception处理是一种CPU敏感操作的结论  ***/

SQL> select count(*) from youyus;

  COUNT(*)
----------
     29968

SQL> truncate table youyus;

Table truncated.

SQL> alter system flush shared_pool;

System altered.

SQL> alter system flush buffer_cache;

System altered.

declare
  rnd int;
begin
/* 使用IGNORE_ROW_ON_DUPKEY_INDEX hint方式*/
  for i in 1 .. 200000 loop
    select round(dbms_random.value * 30000) into rnd from dual;
    insert /*+ IGNORE_ROW_ON_DUPKEY_INDEX(YOUYUS,YOUYUS_UK) */
    into youyus
      (t1, t2, t3)
    values
      (rnd, 'DUPLICATE', 'INSERT TEST');
  end loop;
  commit;
end;

SQL> select plsql_exec_time,cpu_time,elapsed_time,user_io_wait_time  from v$sql where sql_text like 'declare%IGNORE%';

PLSQL_EXEC_TIME   CPU_TIME ELAPSED_TIME USER_IO_WAIT_TIME
--------------- ---------- ------------ -----------------
        2377262   78452903     84209306            623539

SQL> select count(*) from youyus;

  COUNT(*)
----------
     29959

/*** 
IGNORE_ROW_ON_DUPKEY_INDEX hint模式下，
CPU_TIME对比Exception模式时减少11%，但仍远高于插入前预检查模式;
就修改程序的复杂度而言IGNORE_ROW_ON_DUPKEY_INDEX模式要低于使用Exception模式,
而Exception模式又要低于CHECK_BEFORE_INSERT模式；CHECK_BEFORE_INSERT模式的CPU成本最低，但修改程序时的成本时间最高 
                                                                                                  ***/

/*需要注意的是IGNORE_ROW_ON_DUPKEY_INDEX提示与我们以往使用的hint略有不同，不正确使用它将导致报错*/

declare
  rnd int;
begin
/* 使用IGNORE_ROW_ON_DUPKEY_INDEX hint方式*/
  for i in 1 .. 200000 loop
    select round(dbms_random.value * 30000) into rnd from dual;
    insert /*+ IGNORE_ROW_ON_DUPKEY_INDEX(YOUYUS,I_AM_ERROR) */
    into youyus
      (t1, t2, t3)
    values
      (rnd, 'DUPLICATE', 'INSERT TEST');
  end loop;
  commit;
end;
/
ERROR at line 1:
ORA-38913: Index specified in the index hint is invalid
ORA-06512: at line 7

上述三者各环节耗时图示:

总结一句，IGNORE_ROW_ON_DUPKEY_INDEX为lazy developer专备。

Filed Under: Oracle, Oracle SQL性能调优 Tagged With: duplicate key, hint, New Feature, Oracle 11g新特性

OPT_PARAM Hint

2009/09/11 by admin 1 Comment

Applies to:
Oracle Server – Enterprise Edition – Version: 10.2.0.1 to 10.2.0.3 – Release: 10.2 to 10.2
Information in this document applies to any platform.
Goal
This article is explains the new optimizer hint “OPT_PARAM” introduced in 10g R2.
Solution
“OPT_PARAM” is a new optimizer hint introduced in 10g Release 2. This hint behaves the same way as
setting a parameter (e.g, using alter session) except that the effect is for the statement only. The hint only works for optimizer parameters. Global parameters such as optimizer_features_enable are not covered but optimizer_features_enable specifically has its own hint:

/*+ optimizer_features_enable(‘9.2.0’) */

@For an INTERNAL list of usable parameters see: Note:986618.1 Parameters useable by OPT_PARAM hint
Hint Syntax
The syntax is:

/*+ opt_param( [,] ) */

parameter_name is the name of a parameter
parameter_value is its value.

If the parameter contains a numeric value, the parameter value has to be specified without quotes.
The hint can be used to set multiple parameters by repeating the hint, i.e.

/*+ opt_param( [,] )
opt_param( [,] )
*/

Basic Usage Example
For example, the following hint sets <> to ‘false’ when added to a statement:

/*+ opt_param(‘hash_join_enabled’,’false’) */

e.g.:
Without the hint:


SQL> select empno from emp e, dept d where e.ename=d.dname

---------------------------------------------------------------------------
| Id  | Operation          | Name | Rows  | Bytes | Cost (%CPU)| Time     |
---------------------------------------------------------------------------
|   0 | SELECT STATEMENT   |      |     8 |   160 |     7  (15)| 00:00:01 |
|*  1 |  HASH JOIN         |      |     8 |   160 |     7  (15)| 00:00:01 |
|   2 |   TABLE ACCESS FULL| DEPT |     4 |    40 |     3   (0)| 00:00:01 |
|   3 |   TABLE ACCESS FULL| EMP  |    28 |   280 |     3   (0)| 00:00:01 |
---------------------------------------------------------------------------

With the hint the hash join is disabled choosing a different plan:


SQL> select /*+ opt_param('hash_join_enabled','false') */ empno 
from emp e, dept d where e.ename=d.dname;


----------------------------------------------------------------------------
| Id  | Operation           | Name | Rows  | Bytes | Cost (%CPU)| Time     |
----------------------------------------------------------------------------
|   0 | SELECT STATEMENT    |      |     8 |   160 |     8  (25)| 00:00:01 |
|   1 |  MERGE JOIN         |      |     8 |   160 |     8  (25)| 00:00:01 |
|   2 |   SORT JOIN         |      |     4 |    40 |     4  (25)| 00:00:01 |
|   3 |    TABLE ACCESS FULL| DEPT |     4 |    40 |     3   (0)| 00:00:01 |
|*  4 |   SORT JOIN         |      |    28 |   280 |     4  (25)| 00:00:01 |
|   5 |    TABLE ACCESS FULL| EMP  |    28 |   280 |     3   (0)| 00:00:01 |
----------------------------------------------------------------------------



Multiple Parameter Settings example
The OPT_PARAM hint can be specified more than once Time to adjust more than one parameter at once as follows:


/*+ OPT_PARAM('_always_semi_join' 'off')
      OPT_PARAM('_b_tree_bitmap_plans' 'false')
      OPT_PARAM('query_rewrite_enabled' 'false')
      OPT_PARAM('_new_initial_join_orders' 'false')
      OPT_PARAM('optimizer_dynamic_sampling' 1)
      OPT_PARAM('optimizer_index_cost_adj' 1) */

Filed Under: Oracle, Oracle SQL性能调优 Tagged With: hint, skill, SQL tuning