11.2.0.2 “_datafile_write_errors_crash_instance”

可以通过隐藏参数_datafile_write_errors_crash_instance控制DBWR进程写出到数据文件遇到I/O问题时的表现:
在版本11.2.0.2中:
1. 默认情况下,若是RAC或未启用归档的实例或出现问题的是SYSTEM表空间的数据文件会导致实例终止;其他情况下,会造成相关数据文件被OFFLINE。
2.若该参数设置为TRUE,则在DBWR写文件出现问题时总是终止本实例
3. 若设置为FALSE,仅在未启用归档或SYSTEM表空间的数据文件的情况下会终止实例;

 

” The new behavior is defined by _datafile_write_errors_crash_instance
parameter. Parameter has no default value and 3 different behaviors depending
if it is set and to which value. For default behavior (param not set) the
instance is crashed if RAC or archivelog mode is disabled or this is a system
tablespace. Otherwise the file is offlined. If param is set to TRUE, it is
crashed in all cases. If set to FALSE, it is crashed only if archivelog mode
is disabled or this is a system tablespace.”

 

参数并不能完全保证实例不受IO错误影响而意外终止,且设置该参数可能导致数据文件被OFFLINE。 具体设置方法,该参数可以在线设置:

 

SQL> alter system set “_datafile_write_errors_crash_instance”=false;

[repost]Oracle RDBMS:Generic Large Object (LOB) Performance Guidelines

Oracle的Giri Mandalika给我们介绍了LOB大型对象的一些调优注意事项,之前我一直对_shared_io_pool_size这个undocumented参数不甚了解,以为它是一个shared pool相关的参数,根本原因是甚至没有任何一个公开的Mos Note介绍了这几个隐藏参数。

而这篇文章给出了比较好的解释,这里引用一下:

This blog post is generic in nature and based on my recent experience with a content management system where securefile BLOBs are critical in storing and retrieving the checked in content. It is stro ngly suggested to check the official documentation in addition to these brief guidelines. In general, Oracle Database SecureFiles and Large Objects Developer’s Guide 11g Release 2 (11.2) is a good starting point when creating tables involving SecureFiles and LOBs.

Guidelines

  • Tablespace: create the LOB in a different tablespace isolated from the rest of the database
  • Block size: consider larger block size (default 8 KB) if the expected size of the LOB is big
  • Chunk size: consider larger chunk size (default 8 KB) if larger LOBs are expected to be stored and retrieved
  • Inline or Out-of-line: choose “DISABLE STORAGE IN ROW” (out-of-line) if the average LOB size is expected to be > 4 KB. The default inlining is fine for smaller LOBs
  • CACHE or NOCACHE: consider bypassing the database buffer cache (NOCACHE) if large number of LOBs are stored and not expected to be retrieved frequently
  • COMPRESS or NOCOMPRESS: choose COMPRESS option if storage capacity is a concern and a constraint. It saves disk space at the expense of some performance overhead. In a RAC database environment, it is recommended to compress the LOBs to reduce the interconnect traffic
  • De-duplication: by default, duplicate LOBs are stored as a separate copy in the database. Choosing DEDUPLICATE option enables sharing the same data blocks for similar files thus reducing storage overhead and simplifying storage management
  • Partitioning: consider partitioning the parent table to maximize application performance. Hash partitioning is one of the options if there is no potential partition key in the table
  • Zero-Copy I/O protocol: turned on by default. Turning it off in a RAC database environment could be beneficial. Set the initialization parameter _use_zero_copy_io=FALSE to turn o ff the Zero-Copy I/O protocol
  • Shared I/O pool: database uses the shared I/O pool to perform large I/O operations on securefile LOBs. The shared I/O pool uses shared memory segments. If this pool is not large enough or if there is not enough memory available in this pool for a securefile LOB I/O operation, Oracle uses a portion of PGA until there is sufficient memory available in the shared I/O pool. Hence it is recommen ded to size the shared I/O pool appropriately by monitoring the database during the peak activity. Relevant initialization parameters: _shared_io_pool_size and _shared_iop_max_size

Important parameters For Oracle BRM Application in 11gR2

什么是Oracle BRM?

BRM 指的是Billing and Revenue Management (BRM) ,是一套专门针对通信行业设计的集成的终端到终端的企业软件套件。

Oracle 公司在2006年收购了Portal Software软件公司后,成为了Portal BRM产品的拥有者。 BRM最早被叫做Infranet(版本6.5, 6.7和更早),之后被称作Portal(在版本7.0, 7.2, 7.3, 7.4时代), 官方第一次使用BRM这一代号是从版本7.3.1开始。

客户有一套BRM系统运行在4节点的Exadata X2-8 Half Rack上,数据库版本是11.2.0.1 。

但是这套系统使用了Exadata默认的配置参数,而没有使用BRM系统专用的初始化参数。

这里我有必要提一下,一般来说大型的应用程序(Application)特别是Oracle自身的产品都会有经过Oracle公司自身验证过的一套推荐参数,譬如说Oracle Ebs Suite 11i 最早是在Oracle database 8i/9i 上设计的,一般来说在安装介质上就会附带有11i 在databse 8i/9i上的推荐配置参数,而如果你要将11i 迁移到10g上那么就需要到MOS上找出是否这一软件组合已经经过Oracle公司的认证,如果认证了那么一般都会有推荐参数。

假设计划在Oracle Database 11gR2上部署Oracle EBS R12的应用,那么可以从MOS上找到<Interoperability Notes EBS R12 with Database 11gR2 [ID 1058763.1]> 这个指南。

其他一些非Oracle的应用程序产品,如Sap这类流行的大型应用,Oracle也会进行一系列的认证,但是未必就有非常完整的Recommended Parameter列表了,当然如果您是SAP的用户的话,也可以从Sap哪里获得必要的支持。

<Questions About BRM Support For Oracle DB 11GR2 And Exadata [ID 1319678.1]>介绍了BRM应用程序在11GR2+Exadata上的一些常见问题和推荐的参数设置,这里引用一下:

Applies to:

Oracle Communications Billing and Revenue Management – Version: 7.3.1.0.0 to 7.4.0.0.0 – Release: 7.3.1 to 7.4.0
Information in this document applies to any platform.

Purpose

The purpose is to address some common queries around the supportability of BRM with respect to 11gR2 plus Exadata.

Questions and Answers

1) About 10G client support: BRM(731 and 74) is certified against 11G R2 DB environment using the provided 10G R2 client software. We would like to know about the support for the client itself; because the 10G server environment is out of support end of this year(2011), so does this out of support phrase also apply to the client software.

From BRM point of view , YES , we will support customers using BRM 731 + 11gR2(server) + 10gR2(client)

2) We would like to know of any pre-requisite patches of the BRM application to support the 11G database.

On BRM, there is no specific patches needed for 11G support. You would still continue to use the dm_oracle10g.so library (using the 10gR2 DB client).

3) Is it correct that a base installation of oracle 11G Release 2 is needed, and no additional patches are needed on the DB ( to successfully support pipeline batch processing on the BRM side) ?

Yes, 11gR2 is enough for BRM. Please check with the database group if they would advice any later patches for performance improvements.

4) This question is more on the relationship between the Application (BRM) connecting with ExaData ; are there are no specific patches on Exadata with respect to BRM?

BRM is certified on Exadata . It is certified against 11gR2 only and with BRM 7.3.1 onwards. No DB patches are required from a BRM perspective. You would need to check with the Exadata team/documentation to see if that particular product has any pre-requisites.

5) The certification overview does not specially mention RAC support on 11G, can you confirm if RAC is supported on the 11G database environment?

Yes, we do support RAC also on 11G database.

6) Can you elaborate on the AQ consequences in running them on the 11G environment?

There is nothing about consequences that we can tell. However we do support it ( AQ ) even in 11g.

7) As per the recent press release at : http://www.oracle.com/us/corporate/press/364536 ; about the BRM certification and benchmarking against Exadata; this is referring to BRM 7.4 and not to 7.3.1 .  How about the same with respect to BRM731 in combination with Exadata?

We did not benchmark BRM 731 on Exadata; therefore, we cannot tell if it will show the same performance figures as seen in the tests on 74. However, the reason of good performance on BRM 7.4 is because Exadata offers an excellent database server capability, not that BRM 7.4 has made specific code change to take advantage of the Exadata hardware. For example, the optimization technique such as using FLASH disk to handle Redo log is equally applicable to BRM 731. Therefore, we envision that Exadata will also provide performance benefit to BRM 731.

8) Are there any specific settings we need to configure in BRM 7.3.1 or in Exadata itself besides the logical ones like DB hostname to work properly with Exadata and gain (more) of the advantages of Exadata?

Please see the white paper at the location “http://www.oracle.com/us/industries/communications/brm-exadata-performance-wp-362789.pdf”. This is also part of the press release mentioned before.
While there are no general guidelines about specific settings on BRM or Exadata ; below is a set of the DB configuration for the benchmark, if you may be interested. However, please note that this configurations work well for the benchmark, with the hardware setup that is specific to the benchmark. While doing similar testing at your end, it is advisable to review this configuration and obtain professional services opinion before deploying the configuration in a production system.

BRM DB Configuration :
Important parameters from init.ora in alphabetic order are:

_b_tree_bitmap_plans=false
_disk_sector_size_override=true
_file_size_increase_increment=2143289344
_gc_policy_time=0
_optimizer_skip_scan_enabled=false
cluster_database_instances=2
compatible=’11.2.0.2.0′
db_block_size=8192
db_cache_size=511101108224
fast_start_mttr_target=3600
lock_sga=true
log_buffer=1073741824
open_cursors=2000
optimizer_index_caching=90
optimizer_index_cost_adj=25
pga_aggregate_target=102400M
pre_page_sga=false
processes=2048
session_cached_cursors=400
sga_target=524288M
shared_pool_size=32768M
_b_tree_bitmap_plans=false so the BRM sqls don’t get executed using plans that involve bitmap conversions, which in general result in longer sql execution time for BRM.
_disk_sector_size_override=true so db objects like redo logs can be created in the Flash Disk Group using larger BLOCKSIZE like 4KB, which improves log flush efficiency from the default 512B BLOCKSIZE.
_file_size_increase_increment=2143289344 for faster db backup / restore.
_gc_policy_time=0 to disable DRM.
_optimizer_skip_scan_enabled=false so the BRM SQL does not get executed using plans that opt for index skip scan, which is usually not the optimal access path for BRM SQL.
compatible must be ‘11.2.0.2.0’ to get the DB created. Because the ASM Disk Groups were created
to be 11.2.0.2 compatible which enabled extra functionality, but restricted usage to databases with
this level of compatibility or higher.
pre_page_sga=false. False is the default setting of this parameter. Hugepage was configured for
the DB SGA.

 

可以看到以上对几个隐藏参数有了更具体的描述,值得注意的是_disk_sector_size_override和_gc_policy_time;

_gc_policy_time是11.1之后才出现的参数,他的前身是_gc_affinity_time(_gc_policy_time in 11g),最大的作用是禁用DRM
_disk_sector_size_override决定了创建在Flash Disk Group 上 redo log的BlockSize。

Know GCS AND GES structure size in shared pool

RAC环境中共享池很大一部分被gcs和ges资源所占用,一般来说这些资源对象都是永久的(perm)的,所以我们无法期待LRU或flush shared_pool操作能够清理这些资源。

在使用大缓存(large buffer cache)的RAC实例环境中,查询v$sgastat内存动态性能视图时总是能发现’gcs resources’、’gcs shadows’、’ ges resource’、’ges enqueues ‘这些组件占用了共享池中的大量内存,为了避免shared pool出现著名的ORA-04031错误,Oracle推荐在RAC环境中设置较大的shared_pool_size初始化参数,此外显示地设置较大的GCS和GES资源结构的初始化分配数(INITIAL_ALLOCATION)也有利于避免ORA-4031。

这些控制GES和GCS资源结构初始化分配数量的参数主要包括:

  • _gcs_resources  number of gcs resources to be allocated GCS Resources Number of GCS resource structures determined by
    _gcs_resources parameter
    Stored in segmented array
    Externalized in X$KJBR
    Number of free GCS resource structures in X$KJBRFX
  • _gcs_shadow_locks number of pcm shadow locks to be allocated GCS Enqueues (Shadows/Clients) Number of GCS enqueue structures determined by  _gcs_shadow_locks parameter Stored in segmented array
    Externalized in X$KJBL
    Number of free GCS enqueue structures in X$KJBLFX
  • _lm_ress number of resources configured for cluster database LM_RESS controls the number of resources that can be locked by each lock manager instance. These resources include lock resources allocated for DML, DDL (data dictionary locks), data dictionary, and library cache locks plus the file and log management locks. Stored in heap
    Externalized in X$KJIRFT
  • _lm_locks number of enqueues configured for cluster database Stored in segmented array
    Externalized in X$KJILKFT

为了更好地在RAC环境中设置shared_pool_size共享池的大小(手动设置该参数并不会disable AMM or ASMM),我们很有必要评估大量初始化分配的全局资源本身将占用shared pool多大的空间。

我们可以通过v$resource_limit视图了解这些GES、GCS全局资源的分配情况:

SQL> select * from v$version;

BANNER
--------------------------------------------------------------------------------
Oracle Database 11g Enterprise Edition Release 11.2.0.2.0 - 64bit Production
PL/SQL Release 11.2.0.2.0 - Production
CORE    11.2.0.2.0      Production
TNS for Linux: Version 11.2.0.2.0 - Production
NLSRTL Version 11.2.0.2.0 - Production

SQL> select * from global_name;

GLOBAL_NAME
--------------------------------------------------------------------------------
www.askmac.cn



SQL> select * from v$resource_limit where resource_name in ('gcs_resources', 'gcs_shadows','ges_ress','ges_locks'); 

RESOURCE_NAME                  CURRENT_UTILIZATION MAX_UTILIZATION INITIAL_ALLOCATION        LIMIT_VALUE
------------------------------ ------------------- --------------- ------------------------- ------------------
ges_ress                                      7223            7486    1000000                 UNLIMITED
ges_locks                                     4944            5027    1000000                 UNLIMITED
gcs_resources                                 4021            4021     114466                    114466
gcs_shadows                                   3925            3925     114466                    114466

可以通过v$sgastat视图了解这些全局资源占用了多少空间:

select *
  from v$sgastat
 where name in
       ('ges resource ', 'ges enqueues', 'gcs resources', 'gcs shadows');

POOL         NAME                            BYTES
------------ -------------------------- ----------
shared pool  gcs resources                16483232
shared pool  gcs shadows                  11904560
shared pool  ges enqueues                 47809680
shared pool  ges resource                288405768

单个gcs_resources结构大约占用120 bytes
单个gcs_shadows 结构大约占用72 bytes
单个ges_resource 结构大约占用288 bytes

我们可以使用一下初步估算GES、GCS资源结构将至少占用多大的共享池资源:

‘gcs_resources’ = initial_allocation * 120 bytes = “_gcs_resources parameter” * 120 bytes
‘gcs_shadows’ = initial_allocation * 72 bytes = “_gcs_shadow_locks parameter” * 72 bytes
‘ges_resource’= initial_allocation * 288 bytes = “_lm_ress parameter ” * 288 bytes

注意这里计算出的仅仅是理论的最小值,实际值因为内存分配的机制所以必然会远大于计算值

如上例中 gcs resources = 114466 * 120 =13735920 << 实际值的16483232
gcs_shadows = 114466 * 72 = 8241552 << 实际值的11904560
ges_resource = 1000000 * 288 = 288000000 < 实际的288405768

一般来说我们将计算值 * 160% 后可以得出一个较为客观的估算值。

注意以上公式只是为我们在RAC环境中调优共享池的大小提供参考的依据。当我们观察v$resource_limit视图并认为需要提高GES、GSC资源的初始化分配数目时,可以参照上述方式估算出必要的shared_pool_size或sga_target大小。

Oracle内部错误:ORA-07445[kcflfi()+466] [INT_DIVIDE_BY_ZERO]一例

一套Windows上的11.2.0.1单实例数据库在database open阶段出现了ORA-07445:core dump [kcflfi()+466] [INT_DIVIDE_BY_ZERO] [] [PC:0x500282E] [] []内部错误,具体的出错日志如下:

LOG CONTENT

=======================ALERT.LOG============================

Starting ORACLE instance (normal)
LICENSE_MAX_SESSION = 0
LICENSE_SESSIONS_WARNING = 0
Picked latch-free SCN scheme 2
Using LOG_ARCHIVE_DEST_1 parameter default value as USE_DB_RECOVERY_FILE_DEST
ARCH: Warning; less destinations available than specified
by LOG_ARCHIVE_MIN_SUCCEED_DEST init.ora parameter
Autotune of undo retention is turned on. 
IMODE=BR
ILAT =84
2011-08-01 13:13:47.068000 +08:00
LICENSE_MAX_USERS = 0
SYS auditing is disabled
Starting up:
Oracle Database 11g Enterprise Edition Release 11.2.0.1.0 - Production
With the Partitioning, OLAP, Data Mining and Real Application Testing options.
Using parameter settings in server-side spfile C:\APP\PRODUCT\11.2.0\DBHOME_1\DATABASE\SPFILEG11R2.ORA
System parameters with non-default values:
  _spin_count              = 2000
  processes                = 500
  event                    = "10500 trace name context forever,level 8:10013 trace name context forever,level 10:
10015 trace name context forever,level 10"
  sga_max_size             = 600M
  shared_pool_size         = 152M
  large_pool_size          = 32M
  java_pool_size           = 4M
  streams_pool_size        = 0
  _db_file_direct_io_count = 12
  sga_target               = 0
  memory_target            = 0
  control_files            = "C:\APP\ORADATA\G11R2\CONTROLFILE\O1_MF_6VWCSH9J_.CTL"
  control_files            = "C:\APP\FLASH_RECOVERY_AREA\G11R2\CONTROLFILE\O1_MF_6VWCSHNF_.CTL"
  db_block_checksum        = "TRUE"
  db_block_size            = 8192
  db_cache_size            = 196M
  _shared_io_pool_size     = 0
  compatible               = "11.2.0.0.0"
  log_archive_dest_2       = "service=stdby optional lgwr sync affirm valid_for=(online_logfiles,all_roles)"
  log_buffer               = 10485760
  db_create_file_dest      = "C:\app\oradata"
  db_recovery_file_dest    = "C:\app\flash_recovery_area"
  db_recovery_file_dest_size= 500000M
  undo_tablespace          = "UNDOTBS1"
  _kgl_bucket_count        = 2
  remote_login_passwordfile= "EXCLUSIVE"
  db_domain                = ""
  session_cached_cursors   = 300
  audit_file_dest          = "C:\APP\ADMIN\G11R2\ADUMP"
  optimizer_features_enable= "10.2.0.4"
  audit_trail              = "DB"
  cell_offload_plan_display= "ALWAYS"
  db_name                  = "G11R2"
  open_cursors             = 3000
  _optimizer_extended_cursor_sharing_rel= "NONE"
  pga_aggregate_target     = 300M
  diagnostic_dest          = "C:\APP"
2011-08-01 13:13:48.164000 +08:00
PMON started with pid=2, OS id=984 
VKTM started with pid=3, OS id=3656 at elevated priority
VKTM running at (10)millisec precision with DBRM quantum (100)ms
GEN0 started with pid=4, OS id=5824 
DIAG started with pid=5, OS id=5832 
DBRM started with pid=6, OS id=2784 
PSP0 started with pid=7, OS id=2500 
DIA0 started with pid=8, OS id=5320 
MMAN started with pid=9, OS id=4128 
DBW0 started with pid=10, OS id=5852 
LGWR started with pid=11, OS id=3960 
CKPT started with pid=12, OS id=4472 
SMON started with pid=13, OS id=5788 
RECO started with pid=14, OS id=6036 
MMON started with pid=15, OS id=5740 
MMNL started with pid=16, OS id=2112 
ORACLE_BASE from environment = C:\app
alter database mount exclusive
2011-08-01 13:13:52.390000 +08:00
Sweep [inc][135908]: completed
NSS2 started with pid=19, OS id=2728 
Sweep [inc][135901]: completed
Successful mount of redo thread 1, with mount id 2704081164
Database mounted in Exclusive Mode
2011-08-01 13:13:53.413000 +08:00
Lost write protection disabled
2011-08-01 13:13:54.578000 +08:00
Sweep [inc][135897]: completed
Sweep [inc2][135908]: completed
Sweep [inc2][135901]: completed
Sweep [inc2][135897]: completed
2011-08-01 13:13:55.788000 +08:00
Completed: alter database mount exclusive
alter database open
Beginning crash recovery of 1 threads
 parallel recovery started with 3 processes
2011-08-01 13:13:56.959000 +08:00
Started redo scan
Completed redo scan
 read 0 KB redo, 0 data blocks need recovery
Started redo application at
 Thread 1: logseq 867, block 88140, scn 9122496
Recovery of Online Redo Log: Thread 1 Group 3 Seq 867 Reading mem 0
  Mem# 0: C:\APP\ORADATA\G11R2\ONLINELOG\O1_MF_3_6VWCSMPO_.LOG
  Mem# 1: C:\APP\FLASH_RECOVERY_AREA\G11R2\ONLINELOG\O1_MF_3_6VWCSNGX_.LOG
Completed redo application of 0.00MB
Completed crash recovery at
 Thread 1: logseq 867, block 88140, scn 9142497
 0 data blocks read, 0 data blocks written, 0 redo k-bytes read
2011-08-01 13:13:58.738000 +08:00
LGWR: STARTING ARCH PROCESSES
ARC0 started with pid=22, OS id=4784 
2011-08-01 13:13:59.765000 +08:00
ARC0: Archival started
LGWR: STARTING ARCH PROCESSES COMPLETE
ARC0: STARTING ARCH PROCESSES
ARC1 started with pid=24, OS id=2780 
ARC2 started with pid=25, OS id=1288 
ARC1: Archival started
LGWR: Primary database is in MAXIMUM AVAILABILITY mode
ARC2: Archival started
ARC1: Becoming the 'no FAL' ARCH
ARC1: Becoming the 'no SRL' ARCH
ARC2: Becoming the heartbeat ARCH
LGWR: Destination LOG_ARCHIVE_DEST_1 is not serviced by LGWR
ARC3 started with pid=26, OS id=3876 
2011-08-01 13:14:00.828000 +08:00
ARC3: Archival started
ARC0: STARTING ARCH PROCESSES COMPLETE
NSS2 started with pid=19, OS id=5156 
2011-08-01 13:14:29.008000 +08:00
ORA-16198: LGWR received timedout error from KSR
2011-08-01 13:14:35.980000 +08:00
Errors in file c:\app\diag\rdbms\g11r2\g11r2\trace\g11r2_lgwr_3960.trc:
ORA-16198: Timeout incurred on internal channel during remote archival
LGWR: Error 16198 verifying archivelog destination LOG_ARCHIVE_DEST_2
Destination LOG_ARCHIVE_DEST_2 is UNSYNCHRONIZED
LGWR: Continuing...
ARCH: LGWR is scheduled to archive destination LOG_ARCHIVE_DEST_2 after log switch
2011-08-01 13:14:38.629000 +08:00
Trying to expand controlfile section 11 for Oracle Managed Files
Exception [type: INT_DIVIDE_BY_ZERO, ] [] [PC:0x500282E, __VInfreq__kcflfi()+466]
Errors in file c:\app\diag\rdbms\g11r2\g11r2\trace\g11r2_arc0_4784.trc  (incident=136091):
ORA-07445: exception encountered: core dump [kcflfi()+466] [INT_DIVIDE_BY_ZERO] [] [PC:0x500282E] [] []
Incident details in: c:\app\diag\rdbms\g11r2\g11r2\incident\incdir_136091\g11r2_arc0_4784_i136091.trc
2011-08-01 13:14:40.283000 +08:00
Trace dumping is performing id=[cdmp_20110801131440]
2011-08-01 13:14:52.417000 +08:00
Sweep [inc][136091]: completed
Sweep [inc2][136091]: completed
2011-08-01 13:14:59.805000 +08:00
ARC2: Detected ARCH process failure
ARC2: STARTING ARCH PROCESSES
ARC0 started with pid=19, OS id=5016 
2011-08-01 13:15:00.836000 +08:00
ARC0: Archival started
ARC2: STARTING ARCH PROCESSES COMPLETE
2011-08-01 13:15:36.689000 +08:00
Deleted Oracle managed file C:\APP\FLASH_RECOVERY_AREA\G11R2\ARCHIVELOG\2011_08_01\O1_MF_1_866_73DFKWRK_.ARC
2011-08-01 13:15:38.013000 +08:00
Error 12154 received logging on to the standby
Errors in file c:\app\diag\rdbms\g11r2\g11r2\trace\g11r2_ora_4852.trc:
ORA-12154: TNS:could not resolve the connect identifier specified
ARCH: Error 12154 Creating archive log file to 'stdby'
Trying to expand controlfile section 11 for Oracle Managed Files
Exception [type: INT_DIVIDE_BY_ZERO, ] [] [PC:0x500282E, __VInfreq__kcflfi()+466]
Errors in file c:\app\diag\rdbms\g11r2\g11r2\trace\g11r2_ora_4852.trc  (incident=136051):
ORA-07445: exception encountered: core dump [kcflfi()+466] [INT_DIVIDE_BY_ZERO] [] [PC:0x500282E] [] []
Incident details in: c:\app\diag\rdbms\g11r2\g11r2\incident\incdir_136051\g11r2_ora_4852_i136051.trc
2011-08-01 13:15:39.680000 +08:00
Trace dumping is performing id=[cdmp_20110801131539]
2011-08-01 13:15:42.782000 +08:00
PMON (ospid: 984): terminating the instance due to error 397
2011-08-01 13:15:50.520000 +08:00
Instance terminated by PMON, pid = 984

=============================g11r2_ora_4852_i136051.trc=============================

Dump file c:\app\diag\rdbms\g11r2\g11r2\incident\incdir_136051\g11r2_ora_4852_i136051.trc
Oracle Database 11g Enterprise Edition Release 11.2.0.1.0 - Production
With the Partitioning, OLAP, Data Mining and Real Application Testing options
Windows NT Version V6.1 Service Pack 1 
CPU                 : 4 - type 586, 2 Physical Cores
Process Affinity    : 0x0x00000000
Memory (Avail/Total): Ph:2122M/3566M, Ph+PgF:5413M/7130M, VA:1084M/2047M 
Instance name: g11r2
Redo thread mounted by this instance: 1
Oracle process number: 17
Windows thread id: 4852, image: ORACLE.EXE (SHAD)

*** 2011-08-01 13:15:38.527
*** SESSION ID:(197.1) 2011-08-01 13:15:38.527
*** CLIENT ID:() 2011-08-01 13:15:38.527
*** SERVICE NAME:() 2011-08-01 13:15:38.527
*** MODULE NAME:(oradim.exe) 2011-08-01 13:15:38.527
*** ACTION NAME:() 2011-08-01 13:15:38.527

Dump continued from file: c:\app\diag\rdbms\g11r2\g11r2\trace\g11r2_ora_4852.trc
ORA-07445: exception encountered: core dump [kcflfi()+466] [INT_DIVIDE_BY_ZERO] [] [PC:0x500282E] [] []

========= Dump for incident 136051 (ORA 7445 [kcflfi()+466]) ========
----- Beginning of Customized Incident Dump(s) -----
Exception [type: INT_DIVIDE_BY_ZERO, ] [] [PC:0x500282E, __VInfreq__kcflfi()+466]

Oracle Database 11g Enterprise Edition Release 11.2.0.1.0 - Production
Process Id: 0x000010bc  Thread Id : 0x000012f4    Time : Mon Aug 01 13:15:38 
Excp. Code: 0xc0000094  Excp. Type: INT_DIVIDE    Flags: 0x00000000

------------------- Registers ----------------------------
eip = 0500282e esp = 0d9f525c ebp = 0d9f577c edi = 37eefe00 esi = 00000265
eax = 00000265 ebx = 00000000 ecx = 089ee234 edx = 00000000
ecs = 0000001b eds = 00000023 ees = 00000023 ess = 00000023
egs = 00000000 efs = 0000003b
eflags = 00010246
------------------- End of Registers ---------------------

*** 2011-08-01 13:15:38.536
dbkedDefDump(): Starting a non-incident diagnostic dump (flags=0x3, level=3, mask=0x0)
----- Current SQL Statement for this session (sql_id=a01hp0psv0rrh) -----
alter database open
----------- messages from pre-loading .sym files:
Symbol file C:\app\product\11.2.0\dbhome_1\RDBMS\ADMIN\oracommon11.SYM does not match binary.
 Symbol TimeStamp=4bb5eaac, Module TimeStamp=0 are different
Symbol file C:\app\product\11.2.0\dbhome_1\RDBMS\ADMIN\oraclsra11.SYM does not match binary.
 Symbol TimeStamp=4bb4cf99, Module TimeStamp=0 are different
----------- end of messages from pre-loading .sym files
----- Call Stack Trace -----
calling              call     entry                argument values in hex      
location             type     point                (? means dubious value)     
-------------------- -------- -------------------- ----------------------------
Symbol file C:\app\product\11.2.0\dbhome_1\BIN\oracommon11.SYM does not match binary.
 Symbol TimeStamp=4bb5eaac, Module TimeStamp=0 are different
Symbol file C:\app\product\11.2.0\dbhome_1\BIN\oraclsra11.SYM does not match binary.
 Symbol TimeStamp=4bb4cf99, Module TimeStamp=0 are different
EnumerateLoadedModules64 failed with error -1073741819
Symbol file oraclsra11.SYM does not match binary.
 Symbol TimeStamp=4bb4cf99, Module TimeStamp=0 are different
Symbol file oracommon11.SYM does not match binary.
 Symbol TimeStamp=4bb5eaac, Module TimeStamp=0 are different
__VInfreq__kcflfi()           00000000             
+466                                               
_kccrszf()+287       CALLrel  _kcflfi()            0 318345B8 34 31C0DD40 4000
                                                   265 4 7FFFFFFF 1 0 0
_kccrsd_expd()+1418  CALLrel  _kccrszf()           D9F7CEC 268 264
_kccwnc_reuse_expan  CALLrel  _kccrsd_expd()       D9F7CEC B 38
d()+640                                            
__VInfreq__kccwnc()  CALLrel  _kccwnc_reuse_expan  D9F7CEC B 26
+235                          d()                  
_krse_arc_complete(  CALLrel  _kccwnc()            D9F7CEC D9F6D38 B
)+1615                                             
_krse_arc_driver_co  CALLrel  _krse_arc_complete(  D9F78AC
re()+1307                     )                    
_krse_arc_driver()+  CALLrel  _krse_arc_driver_co  D9F7CEC 1 D9F7C6C 0 0 D9F7CC8
274                           re()                 0 0 0 0 0 0 0
_krsq_arch_to_force  CALLrel  _krse_arc_driver()   D9F7CEC 1 D9F7C6C 0 0 D9F7CC8
_switch()+196                                      0 0 0 0 0 0 0
__VInfreq__kcttsc()  CALLrel  _krsq_arch_to_force  D9F7CEC 1
+129                          _switch()            
_kcfopd()+1504       CALLrel  _kcttsc()            2
_adbdrv()+16700      CALLrel  _kcfopd()            0 0 0 0 D9FBBF8
_opiexe()+13594      CALLrel  _adbdrv()            4A C0000094 33644518 D9FBD38
                                                   6D60697 2F3FC5F0
_opiosq0()+6248      CALLrel  _opiexe()            4 0 D9FC704
_kpooprx()+277       CALLrel  _opiosq0()           3 E D9FC970 A4 0
_kpoal8()+632        CALLrel  _kpooprx()           D9FF074 D9FD3F8 13 1 0 A4
_opiodr()+1248       CALLreg  00000000             5E 1C D9FF070
___dyn_tls_init_cal  CALLreg  00000000             5E 1C D9FF070 1
lback()+2935122                                    
_opitsk()+1404       CALL???  00000000             C9A10E8 5E D9FF070 0 D9FED00
                                                   D9FF19C 53E52E 0 D9FF1C8
_opiino()+980        CALLrel  _opitsk()            0 0
_opiodr()+1248       CALLreg  00000000             3C 4 D9FFBC4
_opidrv()+1201       CALLrel  _opiodr()            3C 4 D9FFBC4 0
_sou2o()+55          CALLrel  _opidrv()            3C 4 D9FFBC4
_opimai_real()+124   CALLrel  _sou2o()             D9FFBD4 3C 4 D9FFBC4
_opimai()+125        CALLrel  _opimai_real()       2 D9FFBFC
_OracleThreadStart@  CALLrel  _opimai()            2 D9FFF3C 0 70 FFFFFFFF
4()+830                                            FFFFFFFF
___dyn_tls_init_cal  CALLptr  00000000             901FF6C D9FFFD4 776437F5
lback()+366382316                                  901FF6C 765D34CB 0
___dyn_tls_init_cal  CALLreg  00000000             901FF6C 765D34CB 0 0 901FF6C
lback()+367384440                                  0
___dyn_tls_init_cal  CALLrel  ___dyn_tls_init_cal  401326 901FF6C 0 0 0 0
lback()+367384392             lback()+367384403    
00000000             CALL???  00000000             

--------------------- Binary Stack Dump ---------------------
..................

从以上日志中可以看到在”Trying to expand controlfile section 11 for Oracle Managed Files“扩扎控制文件过程中出现了
_kccwnc_reuse_expan->_kccrsd_expd->_kccrszf->_kcflfi->_VInfreq__kcflfi()
函数的7445错误,kcf意为(manages and coordinates operations on the control file(s),kcf.c),是在处理日志文件中引发了INT_DIVIDE_BY_ZERO除数为零的代码bug。

通过7445和kcflfi关键词在MOS上搜索没有太大的发现,说明该Bug的处罚几率非常低,正好让我碰到说明是某些特殊参数的设置引起了该问题。

目标锁定启动日志中的非默认隐藏参数”_db_file_direct_io_count”,该参数决定了直接路径读写的IO大小,从9i开始该参数的单位调整为bytes而非原先的blocks,之前因为对该参数进行一些测试所以设置了一个较小值。

Parameter: DB_FILE_DIRECT_IO_COUNT
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

Versions:	8.0 - 8.1
                This parameter is hidden in 9.0 onwards.

 Parameter type:        integer
 Parameter class:       dynamic, scope = ALTER SYSTEM DEFERRED
 Default value:         64
 Range of values:       operating system-dependent

Description:
~~~~~~~~~~~~
DB_FILE_DIRECT_IO_COUNT is used to specify the number of blocks to be used
for IO operations done by backup, restore or direct path read and write
functions. The IO buffer size is a product of DB_FILE_DIRECT_IO_COUNT and
DB_BLOCK_SIZE. The IO buffer size cannot exceed max_IO_size for your
platform.

Assigning a high value to this parameter results in greater use of PGA or
SGA memory.

o In Oracle8i, minimize the number of I/O requests by setting the
  DB_FILE_DIRECT_IO_COUNT instance parameter so that

  DB_BLOCK_SIZE x DB_FILE_DIRECT_IO_COUNT = max_io_size of system

  In Oracle8i the default for this is 64 blocks.

  (In Oracle9i, it is replaced by _DB_FILE_DIRECT_IO_COUNT which governs
   the size of direct I/Os in BYTES (not blocks). The default is 1Mb but
   will be sized down if the max_io_size of the system is smaller.)

ORA-19863 during RMAN duplicate

Applies to:
Oracle Server - Enterprise Edition - Version: 10.2.0.3
This problem can occur on any platform.
Symptoms
-- Problem Statement:
Duplicate failed during the datafile restore stage:

Starting restore at 2008-Apr-09 09:28:24
using channel ORA_AUX_DISK_1

channel ORA_AUX_DISK_1: starting datafile backupset restore
channel ORA_AUX_DISK_1: specifying datafile(s) to restore from backup set
restoring datafile 00003 to /u06/oradata/hcmprdc/sysaux01.dbf
...
restoring datafile 00121 to /u06/oradata/hcmprdc/waapp.dbf
channel ORA_AUX_DISK_1: reading from backup piece
/u04/oradata/flash_recovery_area/HCMPRD/mdjd8s5v_1_1
RMAN-00571: ===========================================================
RMAN-00569: =============== ERROR MESSAGE STACK FOLLOWS ===============
RMAN-00571: ===========================================================
RMAN-03002: failure of Duplicate Db command at 04/09/2008 09:28:28
RMAN-03015: error occurred in stored script Memory Script
ORA-19870: error reading backup piece /u04/oradata/flash_recovery_area/HCMPRD/mdjd8s5v_1_1
ORA-19863: device block size 1040384 is larger than max allowed: 262144

Cause
The database parameter _db_file_direct_io_count in the target and auxiliary instance does not match.
Solution

-- To implement the solution:

Ensure that parameter _db_file_direct_io_count on the target and auxiliary database the same

_DB_FILE_DIRECT_IO_COUNT need to be set to the same value between the source database 
where the backup was taken and the target database where the backup is being restored.

2.0 Size of Input/Output Buffers
================================

a. input buffers
----------------

NOTE : DB_FILE_DIRECT_IO_COUNT is not available in Oracle9i onwards.
       In Oracle9i, it is replaced by a hidden _DB_FILE_DIRECT_IO_COUNT which 
       governs the size of direct I/Os in BYTES (not blocks). The default is 
       1Mb butwill be sized down if the max_io_size of the system is smaller.

The input buffer size is:
  buffersize = db_block_size * db_file_direct_io_count

As there are 4 input buffers, the total input buffer memory use per channel is:
 memory(input) = #buffers * #files * buffersize
               = 4 * #files * buffersize

For example, if 2 channels are used, and each of these channels backs up 3 
files, then for each channel

 memory(input) = 4 * 3 * db_block_size * db_file_direct_io_count

b. output buffers
-----------------

For disk channels, the output buffer size is:
  buffersize = db_block_size * db_file_direct_io_count

For SBT_TAPE channels, the output buffer size in Oracle8/8i is o/s dependant. (On Solaris,
this defaults to 64k) On 9i/10g it defaults to 256k for all platforms. The BLKSIZE argument to 'allocate channel...' can be
used to override the default value.

As there are 4 output buffers,
  memory(output) = #buffers * buffersize
                 = 4 * buffersize

一般来说使用该隐藏参数的默认值即可,通过重置该参数后修复启动问题:

SQL> select * from v$version;

BANNER
--------------------------------------------------------------------------------
Oracle Database 11g Enterprise Edition Release 11.2.0.1.0 - Production
PL/SQL Release 11.2.0.1.0 - Production
CORE    11.2.0.1.0      Production
TNS for 32-bit Windows: Version 11.2.0.1.0 - Production
NLSRTL Version 11.2.0.1.0 - Production

SQL> alter system reset "_db_file_direct_io_count" scope=spfile;

System altered.

SELECT x.ksppinm NAME, y.ksppstvl VALUE, x.ksppdesc describ
 FROM SYS.x$ksppi x, SYS.x$ksppcv y
 WHERE x.inst_id = USERENV ('Instance')
 AND y.inst_id = USERENV ('Instance')
 AND x.indx = y.indx
AND x.ksppinm LIKE '%db_file_direct_io_count%'
/

NAME                           VALUE                DESCRIB
------------------------------ -------------------- ------------------------------
_db_file_direct_io_count       1048576              Sequential I/O buf size

windows上的11gr2默认该参数为1MB

11g新动态性能视图V$SQL_MONITOR,V$SQL_PLAN_MONITOR

11g中引入了新的动态性能视图V$SQL_MONITOR,该视图用以显示Oracle监视的SQL语句信息。SQL监视会对那些并行执行或者消耗5秒以上cpu时间或I/O时间的SQL语句自动启动,同时在V$SQL_MONITOR视图中产生一条记录。当SQL语句正在执行,V$SQL_MONITOR视图中的统计信息将被实时刷新,频率为每秒1次。SQL语句执行完成后,监视信息将不会被立即删除,Oracle会保证相关记录保存一分钟(由参数_sqlmon_recycle_time所控制,默认为60s),最终这些记录都会被删除并被重用。这一新的SQL性能监视特性仅在CONTROL_MANAGEMENT_PACK_ACCESS为DIAGNOSTIC+TUNING和STATISTICS_LEVEL为ALL|TYPICAL时被启用。
[Read more…]

【Oracle数据恢复】ORA-600[4194]错误一例

ORA-600[4194]内部错误一般由重做记录与回滚记录不匹配引发。Oracle在验证Undo record number时,会对比redo change 和回滚段中的undo record number,若发现2者存在差异则报该4194错误。其错误argument[a][b],a代表回滚块中的最大undo record number,b代表重做日志中记录的undo record number。这个错误可能由回滚段或者redo log日志文件讹误引起。

ORA-00600[4194]错误的根本原因是 redo记录与回滚段(rollback/undo)记录之间的不一致。当ORACLE在验证undo记录时相对应的变化需要应用到undo数据块的最大undo记录上,此时若检验出错则会报ORA-00600[4194]

 

 

 

此错误不像ORA-600[2662]或ORA-600[4000]错误那样必然导致数据库无法打开,因为它很少出现在前滚阶段;当数据库被打开,smon开始执行事务恢复或一些回滚段的管理工作时则很有可能触发该错误。

 

ORA-600[4194]的2个的含义:

Arg [a] Maximum Undo record number in Undo block
Arg [b] Undo record number from Redo block

 

这个ORA-600[4194] 报错属于ORACLE内核从cache层到事务undo处理,可能的影响是进程失败或者可能的回滚段坏块。

 

可能的bug 包括:

 

8240762  10.2.0.5,
11.1.0.7.10,
11.2.0.1
Undo corruptions with ORA-600 [4193]/ORA-600 [4194] or ORA-600 [4137] /
SMON may spin to recover transaction

 

3210520 9.2.0.5, 10.1.0.2 OERI[kjccqmg:esm] / OERI[4194] / corruption possible in RAC

792610 8.0.6.0, 8.1.6.0 Rollback segment corruption

 

对于非自举对象non-bootstrap对象对应的undo记录可以通过如下方法搞定,如果涉及到的对象是bootstrap系统对象则可能需要手动 bbed来修复, 如果自己搞不定可以找ASKMACLEAN专业数据库修复团队成员帮您恢复

 

 

 

来具体看一下错误记录:

 

 

 

Thu Aug 26 18:58:50 2010
Errors in file /s01/10gdb/admin/YOUYUS/bdump/youyus_smon_6587.trc:
ORA-01595: error freeing extent (3) of rollback segment (4))
ORA-00600: internal error code, arguments: [4194], [53], [41], [], [], [], [], []
Thu Aug 26 18:58:50 2010
..............
Errors in file /s01/10gdb/admin/YOUYUS/bdump/youyus_j000_6630.trc:
ORA-00354: corrupt redo log block header
ORA-00353: log corruption near block 2 change 1617922 time 08/26/2010 18:35:39
ORA-00334: archived log: '/s01/10gdb/flash_recovery_area/YOUYUS/onlinelog/o1_mf_3_65psr4on_.log'
Thu Aug 26 19:00:31 2010
Errors in file /s01/10gdb/admin/YOUYUS/bdump/youyus_j000_6630.trc:
ORA-00600: internal error code, arguments: [4194], [53], [41], [], [], [], [], []
Thu Aug 26 19:00:34 2010
Errors in file /s01/10gdb/admin/YOUYUS/bdump/youyus_j000_6630.trc:
ORA-00354: corrupt redo log block header
ORA-00353: log corruption near block 2 change 1617922 time 08/26/2010 18:35:39
ORA-00334: archived log: '/s01/10gdb/flash_recovery_area/YOUYUS/onlinelog/o1_mf_3_65psr4on_.log'
ORA-00600: internal error code, arguments: [4194], [53], [41], [], [], [], [], []
Thu Aug 26 19:00:35 2010
Errors in file /s01/10gdb/admin/YOUYUS/bdump/youyus_j000_6630.trc:
ORA-00354: corrupt redo log block header
ORA-00353: log corruption near block 2 change 1617922 time 08/26/2010 18:35:39
ORA-00334: archived log: '/s01/10gdb/flash_recovery_area/YOUYUS/onlinelog/o1_mf_3_65psr4on_.log'

ORA-00600: internal error code, arguments: [4194], [53], [41], [], [], [], [], []

 

 

 

如果你因为ORA-600[4194]错误导致数据库无法打开,那么可以尝试设置以下事件:

 

 

SQL> alter system set event='10513 trace name context forever,level 2 : 10512 trace name context forever,level 1: 10511 trace name context forever,level 2: 10510 trace name context forever,level 1' scope=spfile;
System altered.

/* 10513事件用以阻止SMON在启动数据库后执行事务恢复(transaction recovery) */
/* 10512事件用以阻止SMON shrink rollback segment */
/* 10511事件用以阻止SMON check to cleanup undo dictionary */
/* 10500事件用以阻止SMON check to offline pending offline rollback segment */

SQL> alter system set undo_management=MANUAL scope=spfile;
System altered.

SQL> shutdown immediate;
ORA-03113: end-of-file on communication channel

SQL> startup mount;
ORACLE instance started.

Total System Global Area 2634022912 bytes
Fixed Size                  2086288 bytes
Variable Size            2382367344 bytes
Database Buffers          234881024 bytes
Redo Buffers               14688256 bytes
Database mounted.
SQL> alter database open;

Database altered.

SQL>  create undo tablespace undoc datafile size 300M;

SQL> alter system set undo_management=AUTO scope=spfile;
System altered.

SQL>  alter system set undo_tablespace=undoc scope=spfile;
System altered.

SQL> shutdown immediate;
Database closed.
Database dismounted.
ORACLE instance shut down.
SQL> startup mount;
ORACLE instance started.

Total System Global Area 2634022912 bytes
Fixed Size                  2086288 bytes
Variable Size            2382367344 bytes
Database Buffers          234881024 bytes
Redo Buffers               14688256 bytes
Database mounted.

SQL> alter database open;
Database altered.

/* 通过重建undo表空间可以避免一些4194错误,但不是全部 */

/* 这个库目前处于随时会crash的不可控状态,我们必须要导出数据并导入到新库中 * /

/* 这种情况下direct方式 可能可以规避一些意外错误 */

[maclean@rh2 dump]$ exp maclean/maclean file=full_maclean.dmp owner=maclean  direct=y statistics=none
Export: Release 10.2.0.4.0 - Production on Thu Aug 26 21:18:40 2010
Copyright (c) 1982, 2007, Oracle.  All rights reserved.
Connected to: Oracle Database 10g Enterprise Edition Release 10.2.0.4.0 - 64bit Production
With the Partitioning, OLAP, Data Mining and Real Application Testing options
Export done in ZHS16GBK character set and UTF8 NCHAR character set

About to export specified users ...
. exporting pre-schema procedural objects and actions
. exporting foreign function library names for user MACLEAN
. exporting PUBLIC type synonyms
. exporting private type synonyms
. exporting object type definitions for user MACLEAN
About to export MACLEAN's objects ...
. exporting database links
. exporting sequence numbers
. exporting cluster definitions
. about to export MACLEAN's tables via Direct Path ...
Table SYS_EXPORT_TABLE_01 will be exported in conventional path.
. . exporting table            SYS_EXPORT_TABLE_01        256 rows exported
Table SYS_EXPORT_TABLE_02 will be exported in conventional path.
. . exporting table            SYS_EXPORT_TABLE_02        257 rows exported
Table SYS_EXPORT_TABLE_03 will be exported in conventional path.
..............
exporting refresh groups and children
. exporting dimensions
. exporting post-schema procedural objects and actions
. exporting statistics
Export terminated successfully with warnings.

/* we are lucky! */

Recommended Hidden Parameters for 11gR1

Question #1:
==========
_optimizer_cost_based_transformation=false Currently set to false, should we keep or remove it for 11g upgrade?
It is a workarond for several bugs, including ORA-600 bug 6666870 fixed only in 11,2?

ANSWER
=======
_OPTIMIZER_COST_BASED_TRANSFORMATION controls whether or not the
optimizer tries different transformations against a query
using the cost with and without the transformations in order
to determine if a transformation is useful or not.
The parameter can be set to any of:
“exhaustive”, “iterative”, “linear”, “on”, “off”
giving some control over how much effort is given to costing
various transformations.

Cost based transformation can add a high overhead at parse time
but can yeild considerable benefits by way of a better plan
for the statement.

Known bugs
6666870 11.2 OERI:qctcte1 from cost based transformation
8541212 11.2 OERI [qctcte1] with function based index and OLD style join push predicate

Question #2: _undo_autotune=false Currently set to false, should we remove it for 11g upgrade?
Search key: _undo_autotune 11.1.0.7
Bug.8430038/7291739 ORA-1628 MAX # EXTENTS 32765 REACHED FOR ROLLBACK SEGMENT _SYSSMU105_123755639:
Fixed in 11.2
Patch available

If you leave the _undo_autotune=false in the parameter file in 11.1.0.7, then you will have to manually adjust
UNDO_RETENTION, and none of the historical information would be captured in undostats. It is better to remove this
parameter and allow AUM to administer the tuned retention for you.
However, in 11.1.0.7 there is a bug that can occur for which the workaround is to set it to false. This is Bug 7291739.

So my recommendation is to remove the parameter, allowing _undo_autotune to default to true, then install the fix for
Bug 7291739 in 11.1.0.7.

Question #3:
===========
_unnest_subquery=false – Currently NOT set, but recommended by PeopleSoft in note ID 749100.1
“Operating System, RDBMS & Additional Component Patches Required for Installation PeopleTools 8.49”

ANSWER
=======
_UNNEST_SUBQUERY
This parameter controls whether the optimizer attempts to unnest
correlated subqueries or not.

Known bugs
8245217 11.2 Dump [vopcpl] unnesting subquery

隐藏参数_high_priority_processes与oradism

运行在操作系统上的进程存在2种系统时序优先级模式:即 实时模式 Real Time(RT) mode, 与分时模式 Time Sharing(TS) mode.
绝大多数Oracle进程运行在TS模式下:

[oracle@rh1 ~]$ ps -efc|grep ora_|grep -v grep
oracle    8510     1 TS   23 Mar27 ?        00:00:02 ora_pmon_PROD
oracle    8512     1 TS   23 Mar27 ?        00:00:00 ora_psp0_PROD
oracle    8514     1 TS   23 Mar27 ?        00:00:00 ora_mman_PROD
oracle    8516     1 TS   23 Mar27 ?        00:00:02 ora_dbw0_PROD
oracle    8518     1 TS   23 Mar27 ?        00:00:04 ora_lgwr_PROD
oracle    8520     1 TS   23 Mar27 ?        00:00:04 ora_ckpt_PROD
oracle    8522     1 TS   23 Mar27 ?        00:00:08 ora_smon_PROD
oracle    8524     1 TS   23 Mar27 ?        00:00:00 ora_reco_PROD
oracle    8526     1 TS   23 Mar27 ?        00:00:34 ora_cjq0_PROD
oracle    8528     1 TS   23 Mar27 ?        00:00:06 ora_mmon_PROD
oracle    8530     1 TS   24 Mar27 ?        00:00:07 ora_mmnl_PROD
oracle    8538     1 TS   23 Mar27 ?        00:00:00 ora_arc0_PROD
oracle    8540     1 TS   23 Mar27 ?        00:00:00 ora_arc1_PROD
oracle    8548     1 TS   23 Mar27 ?        00:00:00 ora_qmnc_PROD
oracle    8555     1 TS   23 Mar27 ?        00:00:00 ora_q000_PROD
oracle    8559     1 TS   23 Mar27 ?        00:00:00 ora_q001_PROD
oracle   30500     1 TS   23 22:10 ?        00:00:00 ora_j000_PROD

如上所示所有进程均运行在TS模式下且priority均为23|24.
Oracle一般不推荐使用RT模式,因为虽然个别进程可以通过这种方式获得更多的CPU资源,但往往系统的瓶颈并非CPU,即尽管CPU使用率高了,但实际系统TPS并未得到提升。
在10gr2版本后RAC中的LMS进程成为唯一一个使用RT模式的Oracle进程,我们可以通过查询参数_high_priority_processes了解相关信息:

SQL> col name format a40
SQL> SELECT x.ksppinm NAME, y.ksppstvl VALUE
  2   FROM SYS.x$ksppi x, SYS.x$ksppcv y
  3   WHERE x.inst_id = USERENV ('Instance')
  4   AND y.inst_id = USERENV ('Instance')
  5   AND x.indx = y.indx
  6  AND x.ksppinm LIKE '%priority%';

NAME                                     VALUE
---------------------------------------- ----------
_high_priority_processes                 LMS*
_os_sched_high_priority                  1

_high_priority_processes通过进程功能名进行匹配,下面我们将提高LGWR及PMON进程的优先级:

SQL> alter system set "_high_priority_processes"='LMS*|LGWR|PMON' scope=spfile;

System altered.

SQL> startup force;
ORACLE instance started.

Total System Global Area  281018368 bytes
Fixed Size                  2083336 bytes
Variable Size             150996472 bytes
Database Buffers          121634816 bytes
Redo Buffers                6303744 bytes
Database mounted.
Database opened.
SQL> !ps -efc|grep ora_|grep -v grep
oracle   31441     1 RR   41 22:50 ?        00:00:00 ora_pmon_PROD
oracle   31445     1 TS   23 22:50 ?        00:00:00 ora_psp0_PROD
oracle   31447     1 TS   23 22:50 ?        00:00:00 ora_mman_PROD
oracle   31449     1 TS   23 22:50 ?        00:00:00 ora_dbw0_PROD
oracle   31451     1 RR   41 22:50 ?        00:00:00 ora_lgwr_PROD
oracle   31455     1 TS   23 22:50 ?        00:00:00 ora_ckpt_PROD
oracle   31457     1 TS   23 22:50 ?        00:00:00 ora_smon_PROD
oracle   31459     1 TS   22 22:50 ?        00:00:00 ora_reco_PROD
oracle   31461     1 TS   23 22:50 ?        00:00:01 ora_cjq0_PROD
oracle   31463     1 TS   23 22:50 ?        00:00:01 ora_mmon_PROD
oracle   31465     1 TS   24 22:50 ?        00:00:00 ora_mmnl_PROD
oracle   31471     1 TS   24 22:50 ?        00:00:00 ora_p000_PROD
oracle   31473     1 TS   24 22:50 ?        00:00:00 ora_p001_PROD
oracle   31475     1 TS   24 22:50 ?        00:00:00 ora_arc0_PROD
oracle   31477     1 TS   22 22:50 ?        00:00:00 ora_arc1_PROD
oracle   31481     1 TS   23 22:50 ?        00:00:00 ora_qmnc_PROD
oracle   31488     1 TS   23 22:50 ?        00:00:00 ora_q000_PROD
oracle   31490     1 TS   23 22:50 ?        00:00:00 ora_q001_PROD
oracle   31500     1 TS   23 22:50 ?        00:00:00 ora_j000_PROD

好了lgwr和pmon进程也进入实时模式了,同时priority值上升到了41.
注意:
Oracle默认仅允许LMS进程(11g中多了VKTM进程)使用RT模式是有它的原因的,所以如果不是Oracle support 推荐,您没有任何修改隐式参数的理由。
其次根据Oracle文档[ID 602419.1]的描述,oradism文件(该文件位于$ORACLE_HOME/bin目录下)不正确的权限将导致RT模式无法被正确使用,该文件默认属于root用户并具有s权限。如下测试:

[oracle@rh1 bin]$ ls -la oradism
-r-sr-s---  1 root oinstall 14931 Mar 11  2008 oradism
[oracle@rh1 bin]$ su - root
Password:
[root@rh1 ~]# chown oracle:oinstall /s01/oracle/product/10.2.0/db_1/bin/oradism
[root@rh1 ~]# exit
logout
[oracle@rh1 bin]$ ls -la oradism
-r-xr-x---  1 oracle oinstall 14931 Mar 11  2008 oradism
[oracle@rh1 bin]$ sqlplus / as sysdba

SQL*Plus: Release 10.2.0.4.0 - Production on Sun Mar 28 23:07:03 2010

Copyright (c) 1982, 2007, Oracle.  All Rights Reserved.


Connected to:
Oracle Database 10g Enterprise Edition Release 10.2.0.4.0 - 64bit Production
With the Partitioning, OLAP, Data Mining and Real Application Testing options

SQL>
SQL> shutdown immediate;
Database closed.
Database dismounted.
ORACLE instance shut down.
SQL> startup;
ORACLE instance started.

Total System Global Area  281018368 bytes
Fixed Size                  2083336 bytes
Variable Size             150996472 bytes
Database Buffers          121634816 bytes
Redo Buffers                6303744 bytes
Database mounted.
Database opened.
SQL> col name format a35;
SQL> col value format a10;
SQL> SELECT x.ksppinm NAME, y.ksppstvl VALUE
  2   FROM SYS.x$ksppi x, SYS.x$ksppcv y
  3   WHERE x.inst_id = USERENV ('Instance')
  4   AND y.inst_id = USERENV ('Instance')
  5   AND x.indx = y.indx
  6  AND x.ksppinm LIKE '%priority%';

NAME                                VALUE
----------------------------------- ----------
_high_priority_processes            LMS*|LGWR|PMON
_os_sched_high_priority             1
SQL> !ps -efc|grep ora_|grep -v grep
oracle   31994     1 TS   23 23:07 ?        00:00:00 ora_pmon_PROD
oracle   31998     1 TS   23 23:07 ?        00:00:00 ora_psp0_PROD
oracle   32000     1 TS   23 23:07 ?        00:00:00 ora_mman_PROD
oracle   32002     1 TS   23 23:07 ?        00:00:00 ora_dbw0_PROD
oracle   32004     1 TS   24 23:07 ?        00:00:00 ora_lgwr_PROD
oracle   32008     1 TS   22 23:07 ?        00:00:00 ora_ckpt_PROD
oracle   32010     1 TS   23 23:07 ?        00:00:00 ora_smon_PROD
oracle   32012     1 TS   22 23:07 ?        00:00:00 ora_reco_PROD
oracle   32014     1 TS   23 23:07 ?        00:00:01 ora_cjq0_PROD
oracle   32016     1 TS   23 23:07 ?        00:00:01 ora_mmon_PROD
oracle   32018     1 TS   24 23:07 ?        00:00:00 ora_mmnl_PROD
oracle   32026     1 TS   24 23:07 ?        00:00:00 ora_arc0_PROD
oracle   32028     1 TS   23 23:07 ?        00:00:00 ora_arc1_PROD
oracle   32032     1 TS   23 23:07 ?        00:00:00 ora_qmnc_PROD
oracle   32045     1 TS   23 23:07 ?        00:00:00 ora_q000_PROD
oracle   32065     1 TS   23 23:08 ?        00:00:00 ora_q001_PROD
oracle   32072     1 TS   23 23:08 ?        00:00:00 ora_j000_PROD

that’s great, 显然oradism不仅为Oracle实例提供了内存资源控制功能,还包括了进程优先级分配的权限。
我们应当再次声明hidden parameter不应“滥用”于production environment.

Know more about _in_memory_undo

set parameter _in_memory_undo = FALSE to disable IMU

Workaround: Disable IMU (set _in_memory_undo=FALSE)

PLEASE NOTE: This bug applies to single instance databases and not RAC as IMU is not enabled in RAC.

注意在RAC系统中IMU是不可用的,所以也就不必要去设置_in_memory_undo=FALSE

The workaround will prevent the problem, but will not fix it.

Note: _in_memory_undo is a dynamic parameter for 10g with values of TRUE or FALSE. It specifies whether there should be in memory undo for transactions. Setting this value to FALSE will disable this feature. This will cause excess redo generation.

_in_memory_undo is applicable when

compatibility >= 10.0
undo_management = AUTO
cluster_database = FALSE

Running IMU transactions may generate out-of-order redo records

Disabling in memory undo (_in_memory_undo=false)
can help to eliminate “In memory undo latch” contention
but there may still be “undo global data” latch contention
as that latch is used regardless of the setting of
_in_memory_undo. The fix for this bug can help reduce
contention on both latches.

沪ICP备14014813号-2

沪公网安备 31010802001379号