11gR2 New Feature: Heavy swapping observed on system in last 5 mins.

In 11gR2, the DBRM process (Database Resource Manager, a new background process introduced in 11gR2; see 《Learning 11g New Background Processes》) reports in the alert.log whether the operating system has experienced heavy swap activity during the last 5 minutes. The warning looks like this:

 

WARNING: Heavy swapping observed on system in last 5 mins.
pct of memory swapped in [3.07%] pct of memory swapped out [4.44%].
Please make sure there is no memory pressure and the SGA and PGA
are configured correctly. Look at DBRM trace file for more details.
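
To check whether DBRM has logged this warning recently, you can scan the text alert log; a minimal sketch (the path follows the diagnostic_dest used in this example system, substitute your own):

grep -n "Heavy swapping observed" /s01/orabase/diag/rdbms/vprod/VPROD2/trace/alert_VPROD2.log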

 

For further diagnosis, look at the trace file of the DBRM background process:

 

[oracle@vrh2 trace]$ cat VPROD2_dbrm_5466.trc
Trace file /s01/orabase/diag/rdbms/vprod/VPROD2/trace/VPROD2_dbrm_5466.trc
Oracle Database 11g Enterprise Edition Release 11.2.0.3.0 - 64bit Production
With the Partitioning, Real Application Clusters, Automatic Storage Management, OLAP,
Data Mining and Real Application Testing options
ORACLE_HOME = /s01/orabase/product/11.2.0/dbhome_1
System name:    Linux
Node name:      vrh2.oracle.com
Release:        2.6.32-200.13.1.el5uek
Version:        #1 SMP Wed Jul 27 21:02:33 EDT 2011
Machine:        x86_64
Instance name: VPROD2
Redo thread mounted by this instance: 2
Oracle process number: 7
Unix process pid: 5466, image: oracle@vrh2.oracle.com (DBRM)

*** 2011-12-29 22:08:14.627
*** SESSION ID:(165.1) 2011-12-29 22:08:14.627
*** CLIENT ID:() 2011-12-29 22:08:14.627
*** SERVICE NAME:() 2011-12-29 22:08:14.627
*** MODULE NAME:() 2011-12-29 22:08:14.627
*** ACTION NAME:() 2011-12-29 22:08:14.627

kgsksysstop: blocking mode (2) timestamp: 1325214494612191
kgsksysstop: successful
kgsksysresume: successful

*** 2011-12-29 22:08:43.869
PQQ: Active Services changed
PQQ: Old service table
SvcIdx  SvcId Active ActDop
     5      5      1      0
     6      6      1      0
PQQ: New service table
SvcIdx  SvcId Active ActDop
     1      1      1      0
     2      2      1      0
     5      5      1      0
     6      6      1      0
2012-01-02 01:49:39.805820 : GSIPC:KSXPCB: msg 0x9bc353f0 status 34, type 12, dest 1, rcvr 0

*** 2012-01-02 01:49:54.509
PQQ: Skipping service checks
Trace file /s01/orabase/diag/rdbms/vprod/VPROD2/trace/VPROD2_dbrm_5466.trc
Oracle Database 11g Enterprise Edition Release 11.2.0.3.0 - 64bit Production
With the Partitioning, Real Application Clusters, Automatic Storage Management, OLAP,
Data Mining and Real Application Testing options
ORACLE_HOME = /s01/orabase/product/11.2.0/dbhome_1
System name:    Linux
Node name:      vrh2.oracle.com
Release:        2.6.32-200.13.1.el5uek
Version:        #1 SMP Wed Jul 27 21:02:33 EDT 2011
Machine:        x86_64
Instance name: VPROD2
Redo thread mounted by this instance: 2
Oracle process number: 7
Unix process pid: 5466, image: oracle@vrh2.oracle.com (DBRM)

*** 2012-01-03 03:05:54.518
*** SESSION ID:(165.1) 2012-01-03 03:05:54.518
*** CLIENT ID:() 2012-01-03 03:05:54.518
*** SERVICE NAME:() 2012-01-03 03:05:54.518
*** MODULE NAME:() 2012-01-03 03:05:54.518
*** ACTION NAME:() 2012-01-03 03:05:54.518

PQQ: Skipping service checks
kgsksysstop: blocking mode (2) timestamp: 1325577954530079
kgsksysstop: successful
kgsksysresume: successful

*** 2012-01-03 03:05:59.270
PQQ: Active Services changed
PQQ: Old service table
SvcIdx  SvcId Active ActDop
     5      5      1      0
     6      6      1      0
PQQ: New service table
SvcIdx  SvcId Active ActDop
     1      1      1      0
     2      2      1      0
     5      5      1      0
     6      6      1      0
PQQ: Checking service limits

*** 2012-01-07 02:06:51.856
PQQ: Skipping service checks
PQQ: Checking service limits

*** 2012-01-08 23:12:11.302
PQQ: Skipping service checks
Heavy swapping observed in last 5 mins:    [pct of total memory][bytes]

*** 2012-01-09 22:39:51.619
total swpin [ 3.07%][124709K], total swpout [ 4.44%][180120K]
vm stats captured every 30 secs for last 5 mins:
swpin:                 swpout:  
[ 0.27%][     11096K]  [ 0.25%][     10451K]
[ 0.27%][     11240K]  [ 0.29%][     12000K]
[ 0.29%][     12001K]  [ 0.02%][       853K]
[ 0.16%][      6849K]  [ 0.02%][       966K]
[ 0.53%][     21604K]  [ 0.09%][      4031K]
[ 0.10%][      4415K]  [ 0.03%][      1414K]
[ 0.43%][     17808K]  [ 0.37%][     15016K]
[ 0.64%][     25972K]  [ 1.61%][     65515K]
[ 0.26%][     10560K]  [ 0.88%][     36051K]
[ 0.07%][      3164K]  [ 0.83%][     33823K]

 

As you can see, DBRM collects short-term swap-in and swap-out statistics, which makes it easier to diagnose performance problems or hangs caused by swapping.
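
On Linux the same activity can be cross-checked at the OS level; a minimal sketch using vmstat and sar (sar requires the sysstat package):

# Sample swap-in/swap-out (the si/so columns, in KB/s) every 30 seconds, 10 times --
# the same 5-minute window that DBRM summarises in its trace.
vmstat 30 10

# sar keeps a history of pages swapped in/out per second (pswpin/s, pswpout/s).
sar -W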

 

Some approaches to resolving heavy OS swapping:

1.  Identify any processes that are leaking memory and fix the leaks
2.  Tune the SGA/PGA to reduce Oracle's memory footprint
3.  Use echo 3 > /proc/sys/vm/drop_caches to temporarily release some of the memory held by OS caches
4.  Adjust the OS virtual memory management parameters, for example the following parameters in sysctl.conf on Linux

vm.min_free_kbytes: Raising the value in /proc/sys/vm/min_free_kbytes causes the system to start reclaiming memory earlier than it otherwise would.

vm.vfs_cache_pressure: At the default value of vfs_cache_pressure = 100 the kernel attempts to reclaim dentries and inodes at a "fair" rate with respect to pagecache and swapcache reclaim. Decreasing vfs_cache_pressure causes the kernel to prefer retaining dentry and inode caches; increasing it beyond 100 causes the kernel to prefer reclaiming dentries and inodes.

vm.swappiness: default 60. /proc/sys/vm/swappiness controls how aggressively the kernel swaps out process memory. Decreasing the swappiness setting may improve performance, as the kernel keeps more of the server processes in memory for longer before swapping them out.

Set the following values to reduce the likelihood of out-of-memory conditions:

# Oracle-Validated setting for vm.min_free_kbytes is 51200 to avoid OOM killer
vm.min_free_kbytes = 51200
#vm.swappiness = 40
vm.vfs_cache_pressure = 200
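
A minimal sketch for applying and verifying the settings above on Linux (run as root):

# Apply immediately, without a reboot:
sysctl -w vm.min_free_kbytes=51200
sysctl -w vm.vfs_cache_pressure=200
# sysctl -w vm.swappiness=40          # optional, matching the commented-out line above

# After persisting the same values in /etc/sysctl.conf, reload and verify:
sysctl -p
sysctl vm.min_free_kbytes vm.vfs_cache_pressure vm.swappiness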

Interpreting swap -s on Solaris

The fields in the output of the Solaris swap -s command are interpreted as follows:

swap -s
total: 53609376k bytes allocated + 16159792k reserved = 69769168k used, 17837288k available

 

bytes allocated : The total amount of swap space in 1024-byte blocks that is currently allocated as backing store (disk-backed swap space).
reserved: The total amount of swap space in 1024-byte blocks not currently allocated, but claimed by memory for possible future use.
used: The total amount of swap space in 1024-byte blocks that is either allocated or reserved.
available: The total amount of swap space in 1024-byte blocks that is currently available for future reservation and allocation.

 

In general, swap utilization can be calculated with the following formula:

 

Output of ‘swap -s’ is:
total: 2514952k bytes allocated + 202368k reserved = 2717320k used, 7021424k available

Swap Utilization (%) is:
(2717320/(2717320+7021424))*100
= 27.9%
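
The same figure can be computed directly from the swap -s output; a minimal sketch in awk, assuming the standard single-line output format shown above:

swap -s | awk '{ gsub(/k/, "");                       # strip the "k" suffixes
                 used = $9; avail = $11;              # "... = <used>k used, <avail>k available"
                 printf "Swap utilization: %.1f%%\n", used / (used + avail) * 100 }'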

In reality, however, total virtual swap = RAM-backed swap + disk-backed swap.

swap -l reports disk-backed swap usage only; it does not report virtual swap usage.

Physical disk swap configured:
# /usr/sbin/swap -l

swapfile dev swaplo blocks free
/dev/zvol/dsk/uppool/swap 181,3 8 163839992 163839992

Total Disk backed swap: 163839992 x 512 = 78G
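
The same calculation can be scripted; a minimal sketch that sums the blocks column of swap -l across all swap devices (Solaris swap -l reports 512-byte blocks):

swap -l | awk 'NR > 1 { blocks += $4 }
               END    { printf "Disk-backed swap: %.1f GB\n", blocks * 512 / 1024 / 1024 / 1024 }'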

 

It is still recommended to monitor paging activity with vmstat -p:

 

# vmstat 5
 kthr      memory            page            disk          faults      cpu
 r b w   swap  free  re  mf pi po fr de sr s0 s1 s2 s3   in   sy   cs us sy id

0 0 0 3296516 38201892 4321 49454 0 0 0 0 0 0 0 6 0 11521 164084 69372 11 31 59
0 0 0 3361076 38193196 3034 34037 0 0 0 0 0 0 0 47 0 9639 107575 37481 8 24 68
0 0 0 3501776 38286380 3325 36763 0 0 0 0 0 0 0 5 0 12679 113673 42466 8 25 67
0 0 0 3545612 38326200 4935 57916 0 0 0 0 0 0 0 63 0 13688 111744 35804 12 31 56 <<

Available virtual swap: 3545612 KB =~ 3G
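
The plain vmstat output above shows the overall picture; for the per-page-type breakdown, a minimal vmstat -p sketch:

# Sample paging detail every 5 seconds. Watch the anonymous paging columns
# api/apo (anonymous page-ins/page-outs) and a sustained non-zero sr (scan rate);
# these indicate genuine memory pressure rather than ordinary file system paging.
vmstat -p 5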

Also keep an eye on tmpfs file system usage: on Solaris the /tmp directory can consume a large amount of swap.

The space remains allocated until you either delete the files from the /tmp directory or restart the server; at restart the swap space backing /tmp is cleaned out.

Example: if an export is performed into the /tmp directory, the available swap space will decrease by the size of the export dump file.
=====
solaris_user>swap -s
total: 2879320k bytes allocated + 277104k reserved = 3156424k used, 771104k available

solaris_user>dd if=/dev/zero of=/tmp/test.out count=100
100+0 records in
100+0 records out

solaris_user>swap -s
total: 2879416k bytes allocated + 277072k reserved = 3156488k used, 771040k available
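
A minimal sketch for finding which files under /tmp are consuming the swap-backed space:

# List the 20 largest entries under /tmp, sizes in KB:
du -ak /tmp 2>/dev/null | sort -rn | head -20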

 

High swap-space usage does not necessarily mean the system needs additional physical memory or that such usage is the reason for bad performance. Heavy swap-in and swap-out activity (observable with vmstat -p), however, can lead to performance problems: some processes have to wait for swapping activity to finish before they can proceed. Moreover, swapping is a single-threaded activity.

In some cases, you must also be aware of the available swap space. For example, the system runs hundreds or even thousands of Oracle session processes or Apache processes, and each process needs to reserve or allocate some swap space. In such cases, you must allocate an adequate swap device or add multiple swap devices.

Tmpfs

One difference between Solaris and other operating systems is /tmp, which is a nonpersistent, memory-based file system on Solaris (tmpfs). Tmpfs is designed for the situation in which a large number of short-lived files (like PHP sessions) need to be written and accessed on a fast file system. You can also create your own tmpfs file system and specify the size. See the man page for mount_tmpfs(1M).
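
A minimal sketch for creating such a size-capped tmpfs mount (the /scratch mount point is only an example):

# Mount a tmpfs limited to 512 MB so that runaway temporary files
# cannot exhaust virtual swap:
mkdir -p /scratch
mount -F tmpfs -o size=512m swap /scratch
# Add a matching line to /etc/vfstab to make the mount persistent across reboots.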

Solaris also provides a ramdisk facility. You can create a ramdisk with ramdiskadm(1M) as a block device. The ramdisk uses physical memory only. By default, at most 25 percent of available physical memory can be allocated to ramdisks. The tmpfs file system uses virtual memory resources that include physical memory and swap space.

Large-sized files placed in tmpfs can affect the amount of memory space left over for program execution. Likewise, programs requiring large amounts of memory use up the space available to tmpfs. If you encounter this constraint (for example, running out of space on tmpfs), you can allocate more swap space by using the swap(1M) command. Avoid swapping in this case because swapping indicates shortage of physical memory and hurts performance even if swap space is sufficient.

tmpfs Filesystem
The tmpfs file system also reports virtual swap usage. tmpfs is a memory-resident file system: it uses the page cache for caching file data, so files created in a tmpfs file system avoid physical disk reads and writes. The primary design goal of tmpfs was to improve read/write performance for short-lived files without invoking network or disk I/O. tmpfs does not use dedicated memory such as a "RAM disk"; instead it uses the virtual memory (VM) maintained by the kernel, which allows it to take advantage of the kernel's VM and resource allocation policies.

tmpfs files are written to and read directly from kernel memory, and pages allocated to tmpfs files are treated the same way as any other physical memory pages. Physical memory assigned to tmpfs files uses anonymous memory to store the file data, and the kernel does not differentiate tmpfs file data from the page cache. Under memory pressure, tmpfs pages can be freed and written back to the physical swap device if the page daemon selects them as candidates. It is the user's responsibility to back up tmpfs files by copying them to a disk-based file system such as UFS; otherwise, tmpfs files are lost in the event of a crash or reboot.

Tmpfs size changes dynamically depending upon how much virtual swap is available.

The "kbytes" column in "df -k /tmp" output is the amount of swap space available, rather than the total.

The tmpfs file system also has a minfree reserve, so the total is slightly less than the amount of swap available. The "kbytes" column of the "df -k /tmp" output actually corresponds to the "available" figure in the "swap -s" output. Normally, these two numbers are pretty close.

The difference is due to the tmpfs_minfree value, which is 2MB by default.
# df -kl -Z /tmp

Filesystem  kbytes    used     avail    capacity     Mounted on
swap        3449940   116     3449824     1%         /tmp

When a process releases memory, df -k /tmp will also show that the total file system size has increased.

