The avahi-daemon Service on Linux

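avahi-daemon is a Linux daemon that runs on client machines and discovers Zeroconf services on the network. It implements the DNS Service Discovery and Multicast DNS specifications for Zeroconf networking. User applications are notified of discovered network services and resources through Linux D-Bus message passing. The daemon also caches replies on behalf of applications, which helps reduce the network traffic those replies would otherwise generate.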

 

 

 

See the full description below:

 

Description

The avahi-daemon Linux service runs on client machines to perform network-based Zeroconf service discovery. Avahi is an implementation of the DNS Service Discovery and Multicast DNS specifications for Zeroconf Networking.  User applications receive notice of discovered network services and resources using the Linux D-Bus message passing. The daemon coordinates application efforts in caching replies, helping minimize network traffic.

Avahi provides a set of language bindings, including Python and Mono.  Because of its modularized architecture, Avahi is already integrated in major desktop components like GNOME’s Virtual File System or KDE’s input/output architecture.

Refer to http://avahi.org/ for further specifications.

The avahi RPM package provides the /usr/sbin/avahi-daemon daemon and its configuration files.

Nature

This is a service to run the avahi-daemon(8) daemon.

Service Control

To manage the avahi-daemon service on demand, use the service(8) tool or run the /etc/init.d/avahi-daemon script directly:

# /sbin/service avahi-daemon help
Usage: /etc/init.d/avahi-daemon {start|stop|status|restart|condrestart}
# /etc/init.d/avahi-daemon help
Usage: /etc/init.d/avahi-daemon {start|stop|status|restart|condrestart}

The available commands are:

Command      Description
start        Start the avahi-daemon(8) daemon.
stop         Stop the avahi-daemon(8) daemon.
status       Report if the avahi-daemon(8) daemon is running.
restart      Equivalent to a stop and then a start command sequence.
condrestart  If the avahi-daemon(8) daemon is currently running, this is the same as a restart command. If the daemon is not running, no action is taken. Often used in RPM package installation to avoid starting a service not already running.

Configuration

To manage the avahi-daemon service at boot time, use the chkconfig(8) tool:

# /sbin/chkconfig --list avahi-daemon
avahi-daemon 0:off 1:off 2:off 3:off 4:off 5:off 6:off
# /sbin/chkconfig avahi-daemon on
# /sbin/chkconfig --list avahi-daemon
avahi-daemon 0:off 1:off 2:on 3:on 4:on 5:on 6:off
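
For scripted checks, the runlevel states printed by chkconfig --list can be turned into a small lookup table. The following is a minimal Python sketch, assuming the single-line, whitespace-separated output format shown above; the function name parse_chkconfig_line is hypothetical and not part of any tool.

```python
def parse_chkconfig_line(line):
    """Parse one line of `chkconfig --list <service>` output.

    Returns (service_name, {runlevel: enabled}).
    Assumes the whitespace-separated `N:on` / `N:off` format shown above.
    """
    fields = line.split()
    name, states = fields[0], fields[1:]
    runlevels = {}
    for state in states:
        level, _, flag = state.partition(":")
        runlevels[int(level)] = (flag == "on")
    return name, runlevels


if __name__ == "__main__":
    sample = "avahi-daemon 0:off 1:off 2:on 3:on 4:on 5:on 6:off"
    name, runlevels = parse_chkconfig_line(sample)
    enabled = sorted(level for level, on in runlevels.items() if on)
    print(name, "is enabled in runlevels", enabled)
```

In a real check the sample string would be replaced by the captured output of the chkconfig command.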

Configuration file /etc/avahi/avahi-daemon.conf

# $Id: avahi-daemon.conf 1155 2006-02-22 22:54:56Z lennart $
#
# This file is part of avahi.
#
# avahi is free software; you can redistribute it and/or modify it
# under the terms of the GNU Lesser General Public License as
# published by the Free Software Foundation; either version 2 of the
# License, or (at your option) any later version.
#
# avahi is distributed in the hope that it will be useful, but WITHOUT
# ANY WARRANTY; without even the implied warranty of MERCHANTABILITY
# or FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public
# License for more details.
#
# You should have received a copy of the GNU Lesser General Public
# License along with avahi; if not, write to the Free Software
# Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA 02111-1307
# USA.

# See avahi-daemon.conf(5) for more information on this configuration
# file!

[server]
#host-name=foo
#domain-name=local
browse-domains=0pointer.de, zeroconf.org
use-ipv4=yes
use-ipv6=yes
#check-response-ttl=no
#use-iff-running=no
#enable-dbus=yes
#disallow-other-stacks=no
#allow-point-to-point=no

[wide-area]
enable-wide-area=yes

[publish]
#disable-publishing=no
#disable-user-service-publishing=no
#add-service-cookie=yes
#publish-addresses=yes
#publish-hinfo=yes
#publish-workstation=yes
#publish-domain=yes
#publish-dns-servers=192.168.50.1, 192.168.50.2
#publish-resolv-conf-dns-servers=yes

[reflector]
#enable-reflector=no
#reflect-ipv=no

[rlimits]
#rlimit-as=
rlimit-core=0
rlimit-data=4194304
rlimit-fsize=0
rlimit-nofile=30
rlimit-stack=4194304
rlimit-nproc=3
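
Because most lines in this file are commented-out defaults, it can be useful to list only the settings that are actually active. The sketch below uses Python's standard configparser module and assumes the file is plain INI syntax as in the listing above; SAMPLE is a shortened excerpt and active_settings is a hypothetical helper name.

```python
import configparser

# Shortened excerpt of the listing above, used only for illustration.
SAMPLE = """\
[server]
#host-name=foo
browse-domains=0pointer.de, zeroconf.org
use-ipv4=yes
use-ipv6=yes

[wide-area]
enable-wide-area=yes

[rlimits]
rlimit-core=0
rlimit-nofile=30
"""


def active_settings(text):
    """Return {section: {key: value}} for every uncommented setting."""
    # Lines starting with '#' are treated as comments by configparser.
    parser = configparser.ConfigParser(interpolation=None)
    parser.read_string(text)
    return {section: dict(parser.items(section)) for section in parser.sections()}


if __name__ == "__main__":
    for section, values in active_settings(SAMPLE).items():
        for key, value in values.items():
            print("[%s] %s = %s" % (section, key, value))
```

To inspect a live system, the text of /etc/avahi/avahi-daemon.conf would be passed in instead of SAMPLE.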

Oracle Enterprise Linux Version(s)

  • OEL 5

Notes

Avahi is a free software implementation of Zeroconf networking, the technology that Apple ships as Bonjour (formerly Rendezvous).

References

man 8 avahi-daemon
man 5 avahi-daemon.conf
http://avahi.org/
http://en.wikipedia.org/wiki/Avahi_%28software%29

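Operating System Versions on Exadata Database Machine Hosts

A colleague once asked me which operating system Exadata runs.

The original Exadata V1, built by Oracle together with HP, ran Oracle Enterprise Linux (OEL). The Oracle-Sun Exadata V2 currently still offers only OEL, but Solaris 11 Express has already passed testing on Exadata V2, so a Solaris option for Exadata V2 should be available soon.

Most existing Exadata X2-2 and X2-8 machines run one of two OEL 5 minor releases.

Earlier shipments use OEL 5.3: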
# cat /etc/enterprise-release
Enterprise Linux Enterprise Linux Server release 5.3 (Carthage)

Recent shipments use OEL 5.5:

# cat /etc/enterprise-release
Enterprise Linux Enterprise Linux Server release 5.5 (Carthage)

# uname -a
Linux vrh1.us.oracle.com 2.6.18-128.1.16.0.1.el5 #1 SMP Tue x86_64 x86_64 x86_64 GNU/Linux
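
When the release has to be checked on many hosts, the string printed from /etc/enterprise-release can be parsed programmatically. This is a small Python sketch that assumes the "release X.Y" wording shown above; parse_release is a hypothetical helper name.

```python
import re

# Matches the "release 5.5" part of the line shown above.
RELEASE_RE = re.compile(r"release\s+(\d+)\.(\d+)")


def parse_release(text):
    """Extract (major, minor) from an /etc/enterprise-release style line."""
    match = RELEASE_RE.search(text)
    if match is None:
        raise ValueError("no release number found in: %r" % text)
    return int(match.group(1)), int(match.group(2))


if __name__ == "__main__":
    line = "Enterprise Linux Enterprise Linux Server release 5.5 (Carthage)"
    major, minor = parse_release(line)
    print("OEL %d Update %d" % (major, minor))
```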

 

The InfiniBand (IB) card should be one of the compatible cards specified in Note 888828.1.
If you build a backup server, it is best to build as close a clone of the Exadata compute nodes as you can get.
That is, install OEL 5 Update 5 and one of the IB cards specified in the note, and you will have the correct OFED versions and kernel.
This will guarantee interoperability and correct operation with the kernel and OFED drivers.
From the doc:
InfiniBand OFED Software
Exadata Storage Servers and database servers will interoperate with different InfiniBand OFED software versions, however, Oracle recommends that all versions be the same unless performing a rolling upgrade. Review Note 1262380.1 for database server software and firmware guidelines.

InfiniBand HCA
Exadata Storage Servers and database servers will interoperate with different InfiniBand host channel adapter (HCA) firmware versions, however, Oracle recommends that all versions be the same unless performing a rolling upgrade. Review Note 1262380.1 for database server software and firmware guidelines.

For a complete list of the Oracle QDR InfiniBand adapters, see here:

http://www.oracle.com/technetwork/documentation/oracle-net-sec-hw-190016.html#infinibandadp

For the compute nodes, all firmware updates must be done via the bundle patches described in Doc 888828.1.
So I would advise upgrading to the latest supported bundle patch.

For your backup server, choose the same model of card that came with the X2 compute nodes.
Install Oracle Enterprise Linux Release 5 Update 5.
Upgrade the firmware to the same version as on the X2, or higher, if it is not already the same.

Database Machine and Exadata Storage Server 11g Release 2 (11.2) Supported Versions [ID 888828.1]

Why ASMLIB and why not?

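ASMLIB is a kernel support library, delivered as a Linux kernel module, designed specifically for the Oracle Automatic Storage Management (ASM) feature.

Our understanding of ASMLIB has long been incomplete, so let us look at the concrete advantages and disadvantages of using it.

In theory, the ASMLIB API offers the following benefits:

  1. It always uses direct, asynchronous I/O
  2. It solves the persistent device naming problem, even when device names change after a reboot
  3. It solves file permission and ownership problems
  4. It reduces context switches from user mode to kernel mode during I/O, which may lower CPU usage
  5. It reduces the number of file handles in use
  6. The ASMLIB API makes it possible to pass meta-information, such as I/O priority, down to the storage device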

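Although ASMLIB can in theory deliver performance gains, in practice the advantage is almost negligible: no performance report shows ASMLIB having any performance advantage over udev, the native Linux device management service. A thread on the official Oracle forums, <ASMLib and Linux block devices>, discusses the performance benefit of ASMLIB, and there you can find the comment "asmlib wouldn’t necessarily give you much of an io performance benefit, it’s mainly for ease of management as it will find/discover the right devices for you, the io effect of asmlib is large the same as doing async io to raw devices." In other words, there is no real performance difference between using ASMLIB and using raw devices directly.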

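Potential drawbacks of ASMLIB:

  1. For multipath devices, ORACLEASM_SCANORDER and ORACLEASM_SCANEXCLUDE must be set in the /etc/sysconfig/oracleasm-_dev_oracleasm configuration file so that ASMLIB can find the correct device files; for details see Metalink Note <How To Setup ASM & ASMLIB On Native Linux Multipath Mapper disks? [ID 602952.1]>
  2. The ASM instance uses the ASM disks provided by ASMLIB, which adds an extra layer
  3. Every Linux kernel update requires a matching new ASMLIB package
  4. It increases the chance of downtime caused by human error
  5. Using ASMLIB means spending more time on setup and maintenance
  6. The presence of ASMLIB may introduce more bugs, which is the last thing we want
  7. The disk header of a disk created with ASMLIB is no different from an ordinary ASM disk header, except for an extra ASMLIB attribute area at the start of the header.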

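Conclusion:
My personal view is to avoid ASMLIB wherever possible, although this is of course not something a DBA can decide alone. It also comes down to habit. I had assumed that the early RHEL 4 releases shipped no device management service like udev, and that this was why so many ASM+RAC systems on RHEL 4 use ASMLIB. Readers have since pointed out that udev was introduced as a new feature of kernel 2.6 and was already included in the initial RHEL 4 release, although in the RHEL 4 era it was not widely used in practice ("In Linux 2.6, a new feature was introduced to simplify device management and hot plug capabilities. This feature is called udev and is a standard package in RHEL4 or Oracle Enterprise Linux 4 (OEL4) as well as Novell’s SLES9 and SLES10."). On RHEL/OEL 5 you already have good reason to use udev and drop ASMLIB.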

Reference:
ASMLIB Performance vs Udev
RAC+ASM 3 years in production Stories to share
How To Setup ASM & ASMLIB On Native Linux Multipath Mapper disks? [ID 602952.1]
ASMLib and Linux block devices

Understand Oracle Validated Configurations

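Oracle Validated Configurations aims to give enterprises simpler, faster and lower-cost solutions based on Linux and Oracle VM. The program provides tested and validated architectures, with accompanying documentation that describes best-practice configurations for the hardware, software, storage and network components involved, helping systems improve performance and scalability while lowering cost. From an industry standpoint, the validated configurations and best-practice documents are accepted and endorsed by Oracle partners. Oracle Validated Configurations also provides deployment details for recommended hardware and software combinations, and these have proven to be highly beneficial.

What benefits do Oracle Validated Configurations provide?

Oracle Validated Configurations gives strong assurance that the underlying system components work well under heavy load, and the recommended configurations have also proven in practice to be quick and easy to deploy. It helps to:

  1. Deliver standardized, scalable, highly available and low-cost solutions
  2. Speed up and simplify the deployment of Oracle software on Linux
  3. Lower the high cost of system testing for end users
  4. Shift risk away from the user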

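How do Oracle Validated Configurations differ from the earlier Oracle Product Certification?

Traditional Oracle Product Certification certifies an operating system platform, confirming that the relevant Oracle products fully support that platform. Oracle Validated Configurations goes a step further: through testing and validation it provides complete information about the component stack, including recommended versions, settings and patches for software, hardware and storage. These recommendations come from stress testing in the Linux test labs of Oracle and its partners.

How do we use Oracle Validated Configurations?

We can browse or subscribe to <Browse Published Validated Configurations> to learn which hardware and software combinations have already been validated under OVC.

In addition, Oracle Enterprise Linux provides the oracle-validated RPM package. Using this package from the OEL DVD media makes it much simpler to set up the package environment needed to install Oracle products, and it makes installing Oracle Database in particular very straightforward.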

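The steps for using the oracle-validated package from the installation DVD are as follows:
1. Create the mount point /media/disk: mkdir /media/disk
2. Insert the OEL DVD
3. Mount the DVD: mount /dev/cdrom /media/disk
4. Run touch /etc/yum.repos.d/public-yum-el5.repo and add the following content to the file: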

[oel5]
name = Enterprise Linux 5.5 DVD
baseurl=file:///media/disk/Server/
gpgcheck=0
enabled=1

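Note that the OEL version in the name above (5.5) may differ from the version of the DVD you have at hand. This is generally harmless, but make sure the running operating system matches the installation media exactly.

5. Install the oracle-validated package environment with the yum install oracle-validated command: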

yum install oracle-validated
Loaded plugins: security
Setting up Install Process
Resolving Dependencies
--> Running transaction check
---> Package oracle-validated.x86_64 0:1.0.0-22.el5 set to be updated
--> Processing Dependency: /usr/lib/libaio.so for package: oracle-validated
--> Running transaction check
---> Package libaio-devel.i386 0:0.3.106-5 set to be updated
--> Finished Dependency Resolution

Dependencies Resolved

==========================================================================================
 Package                    Arch             Version                  Repository     Size
==========================================================================================
Installing:
 oracle-validated           x86_64           1.0.0-22.el5             ol5            16 k
Installing for dependencies:
 libaio-devel               i386             0.3.106-5                ol5            12 k

Transaction Summary
==========================================================================================
Install       2 Package(s)
Upgrade       0 Package(s)

Total download size: 27 k
Is this ok [y/N]: y
Downloading Packages:
------------------------------------------------------------------------------------------
Total                                                      12 MB/s |  27 kB     00:00
Running rpm_check_debug
Running Transaction Test
Finished Transaction Test
Transaction Test Succeeded
Running Transaction
  Installing     : libaio-devel                                                       1/2
  Installing     : oracle-validated                                                   2/2 

Installed:
  oracle-validated.x86_64 0:1.0.0-22.el5                                                  

Dependency Installed:
  libaio-devel.i386 0:0.3.106-5                                                           

Complete!

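Once oracle-validated is installed, the operating system package environment falls into place. The oracle-validated package also sets the necessary Linux kernel parameters: specifically, it changes the parameters in /etc/sysctl.conf to the values Oracle recommends. For reference, here is sysctl.conf after the oracle-validated installation: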

# Kernel sysctl configuration file for Oracle Enterprise Linux
#
# For binary values, 0 is disabled, 1 is enabled.  See sysctl(8) and
# sysctl.conf(5) for more details.

# Controls IP packet forwarding
net.ipv4.ip_forward = 0

# Controls source route verification
net.ipv4.conf.default.rp_filter = 1

# Do not accept source routing
net.ipv4.conf.default.accept_source_route = 0

# Controls the System Request debugging functionality of the kernel

# Controls whether core dumps will append the PID to the core filename
# Useful for debugging multi-threaded applications
kernel.core_uses_pid = 1

# Controls the use of TCP syncookies
net.ipv4.tcp_syncookies = 1

# Controls the maximum size of a message, in bytes

# Controls the default maxmimum size of a mesage queue

# Controls the maximum shared segment size, in bytes

# Controls the maximum number of shared memory segments, in pages

# For 11g, Oracle-Validated setting for fs.file-max is 6815744
# For 10g, uncomment 'fs.file-max = 327679', and comment 'fs.file-max = 6553600' entry and re-run sysctl -p
# fs.file-max = 327679
fs.file-max = 6815744

# Oracle-Validated setting for kernel.msgmni is 2878
kernel.msgmni = 2878

# Oracle-Validated setting for kernel.msgmax is 8192
kernel.msgmax = 8192

# Oracle-Validated setting for kernel.msgmnb is 65536
kernel.msgmnb = 65536

# Oracle-Validated setting for kernel.sem is '250 32000 100 142'
kernel.sem = 250 32000 100 142

# Oracle-Validated setting for kernel.shmmni is 4096
kernel.shmmni = 4096

# Oracle-Validated setting for kernel.shmall is 1073741824
kernel.shmall = 1073741824

# Oracle-Validated setting for kernel.shmmax is 4398046511104 on x86_64 and 4294967295 on i386 architecture. Refer Note id 567506.1
kernel.shmmax = 4398046511104

# Oracle-Validated setting for kernel.sysrq is 1
kernel.sysrq = 1

# Oracle-Validated setting for net.core.rmem_default is 262144
net.core.rmem_default = 262144

# For 11g, Oracle-Validated setting for net.core.rmem_max is 4194304
# For 10g, uncomment 'net.core.rmem_max = 2097152', comment 'net.core.rmem_max = 4194304' entry and re-run sysctl -p
# net.core.rmem_max = 2097152
net.core.rmem_max = 4194304

# Oracle-Validated setting for net.core.wmem_default is 262144
net.core.wmem_default = 262144

# For 11g, Oracle-Validated setting for net.core.wmem_max is 1048576
# For 10g, uncomment 'net.core.wmem_max = 262144', comment 'net.core.wmem_max = 1048576' entry for this parameter and re-run sysctl -p
# net.core.wmem_max = 262144
net.core.wmem_max = 1048576

# Oracle-Validated setting for fs.aio-max-nr is 3145728
fs.aio-max-nr = 3145728

# For 11g, Oracle-Validated setting for net.ipv4.ip_local_port_range is 9000 65500
# For 10g, uncomment 'net.ipv4.ip_local_port_range = 1024 65000', comment 'net.ipv4.ip_local_port_range = 9000 65500' entry and re-run sysctl -p
# net.ipv4.ip_local_port_range = 1024 65000
net.ipv4.ip_local_port_range = 9000 65500

# Oracle-Validated setting for vm.min_free_kbytes is 51200 to avoid OOM killer
vm.min_free_kbytes = 51200
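
To confirm that a host still carries these values later on, the file can be parsed and compared against a reference table. The following Python sketch assumes the simple "key = value" layout of the listing above; EXPECTED holds only a few of the listed values, and parse_sysctl and mismatches are hypothetical helper names.

```python
def parse_sysctl(text):
    """Parse sysctl.conf-style text into {key: value-string}."""
    settings = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#") or line.startswith(";"):
            continue  # skip blank lines and comments
        key, sep, value = line.partition("=")
        if not sep:
            continue  # not a key = value line
        # Normalize internal whitespace so "250  32000" equals "250 32000".
        settings[key.strip()] = " ".join(value.split())
    return settings


# A few of the values from the listing above (not the full set).
EXPECTED = {
    "fs.file-max": "6815744",
    "kernel.sem": "250 32000 100 142",
    "kernel.shmmni": "4096",
    "net.ipv4.ip_local_port_range": "9000 65500",
}


def mismatches(settings, expected=EXPECTED):
    """Return {key: (actual, expected)} for every key that differs."""
    return {
        key: (settings.get(key), want)
        for key, want in expected.items()
        if settings.get(key) != want
    }


if __name__ == "__main__":
    sample = "fs.file-max = 6815744\nkernel.sem = 250 32000 100 142\n"
    for key, (actual, want) in sorted(mismatches(parse_sysctl(sample)).items()):
        print("%s: found %r, expected %r" % (key, actual, want))
```

On a real system the text of /etc/sysctl.conf would be passed to parse_sysctl; note that the file only records intended values, while the running kernel's values are reported by sysctl itself.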

The oracle-validated package also modifies /etc/security/limits.conf to give sensible shell limits:

[oracle@rh2 ~]$ ulimit  -a
core file size          (blocks, -c) 0
data seg size           (kbytes, -d) unlimited
scheduling priority             (-e) 0
file size               (blocks, -f) unlimited
pending signals                 (-i) 31668
max locked memory       (kbytes, -l) 50000000
max memory size         (kbytes, -m) unlimited
open files                      (-n) 131072
pipe size            (512 bytes, -p) 8
POSIX message queues     (bytes, -q) 819200
real-time priority              (-r) 0
stack size              (kbytes, -s) 8192
cpu time               (seconds, -t) unlimited
max user processes              (-u) 131072
virtual memory          (kbytes, -v) unlimited
file locks                      (-x) unlimited

[oracle@rh2 ~]$ cat /etc/security/limits.conf

# Oracle-Validated setting for nofile soft limit is 131072
oracle   soft   nofile    131072

# Oracle-Validated setting for nofile hard limit is 131072
oracle   hard   nofile    131072

# Oracle-Validated setting for nproc soft limit is 131072
oracle   soft   nproc    131072

# Oracle-Validated setting for nproc hard limit is 131072
oracle   hard   nproc    131072

# Oracle-Validated setting for core soft limit is unlimited
oracle   soft   core    unlimited

# Oracle-Validated setting for core hard limit is unlimited
oracle   hard   core    unlimited

# Oracle-Validated setting for memlock soft limit is 50000000
oracle   soft   memlock    50000000

# Oracle-Validated setting for memlock hard limit is 50000000
oracle   hard   memlock    50000000

SHMALL, SHMMAX and SGA sizing

Question:

I need to confirm my Linux kernel settings and also get pointers/explanation on how I need to set up my kernel for proper operation of the Oracle Server.
My aim for the SR is not so much to get actual answers on how to set values. Rather, I need help to clear up the concepts behind the numbers.

From the output of the commands below it can be seen that the server has 12 GB of memory and after the kernel is configured (see below output of ipcs -lms command), I have SHMMAX set at 8589933568.
After consulting various documents I have come to understand the following, please verify:

– The largest SGA size is that defined by PAGESIZE*kernel.shmall (in this case 16GB, which is a mistake apparently as the system only has 12GB of RAM)
– It is OK for shmmax to be smaller than the requested SGA. If additional size is needed, then the space will be allocated in multiple pages, as long as the size does not exceed PAGESIZE*kernel.shmall
– If more than one Oracle instance resides on the same server, then Linux kernel settings will have to cater for the largest instance SGA, since different instances will hold completely different memory segments, which will have to separately adhere to kernel limitations; therefore the kernel limitations do not care for multiple instances, as those are different memory areas
– Memory for SGA is allocated completely by setting SGA_TARGET. In a different case, it will be allocated as needed

$ free
             total       used       free     shared    buffers     cached
Mem:      12299352    8217844    4081508          0     190816    6799828
-/+ buffers/cache:    1227200   11072152
Swap:     16775764      90912   16684852

$ ipcs -lms

------ Shared Memory Limits --------
max number of segments = 4096
max seg size (kbytes) = 8388607
max total shared memory (kbytes) = 16777216
min seg size (bytes) = 1

------ Semaphore Limits --------
max number of arrays = 128
max semaphores per array = 250
max semaphores system wide = 32000
max ops per semop call = 32000
semaphore max value = 32767

Also, 'getconf PAGESIZE' returns 4096.

Answer:

– The largest SGA size is that defined by PAGESIZE*kernel.shmall (in this case 16GB, which is a mistake apparently as the system only has 12GB of RAM)

Comment :
Yes, this needs to comply with the formula
kernel.shmall = physical RAM size / pagesize, as per NOTE:339510.1.
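
As a worked example of this formula, the Python sketch below plugs in the figures from the question: the "Mem: total" value from free, the page size from getconf, and the total shared memory limit from ipcs. The function name shmall_pages is hypothetical.

```python
def shmall_pages(ram_bytes, page_size):
    """kernel.shmall expressed in pages, per shmall = RAM size / page size."""
    return ram_bytes // page_size


if __name__ == "__main__":
    page_size = 4096                  # from `getconf PAGESIZE`
    ram_bytes = 12299352 * 1024       # "Mem: total" from `free`, reported in kB
    configured_kb = 16777216          # "max total shared memory" from `ipcs -lms`

    print("shmall matching RAM :", shmall_pages(ram_bytes, page_size), "pages")
    print("shmall as configured:", shmall_pages(configured_kb * 1024, page_size), "pages")
```

With these inputs the formula gives 3074838 pages for the installed RAM, whereas the configured limit corresponds to 4194304 pages (16 GB), which is the oversizing the question refers to.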

– It is OK for shmmax to be smaller than the requested SGA. If additional size is needed, then the space will be allocated in multiple pages, as long as the size does not exceed PAGESIZE*kernel.shmall

Comment :
Yes, it is OK to have SHMMAX < SGA size, as per NOTE:567506.1.
The allocation will be done in multiple shared segments, either contiguous
or non-contiguous, as per NOTE:15566.1.
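
To make the arithmetic concrete, the sketch below computes the smallest number of segments that could hold an SGA when SHMMAX caps the size of each one. It is only a lower bound under the assumption that every segment is filled to SHMMAX; the layout Oracle actually chooses is described in the notes cited above and may use more segments. The 10 GiB SGA is a made-up figure for illustration, and min_segments is a hypothetical name.

```python
def min_segments(sga_bytes, shmmax_bytes):
    """Smallest number of shared memory segments that can hold the SGA."""
    if sga_bytes <= 0 or shmmax_bytes <= 0:
        raise ValueError("sizes must be positive")
    return -(-sga_bytes // shmmax_bytes)  # ceiling division


if __name__ == "__main__":
    shmmax = 8589933568        # SHMMAX from the question, in bytes
    sga = 10 * 1024 ** 3       # hypothetical 10 GiB SGA
    print("segments needed:", min_segments(sga, shmmax))
```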

– If more than one Oracle instance resides on the same server, then Linux kernel settings will have to cater for the largest instance SGA, since
different instances will hold completely different memory segments, which will have to separately adhere to kernel limitations; therefore the kernel limitations do not care for multiple instances, as those are different memory areas.

Comment :
Yes, that is valid for SHMMAX, but SHMALL is a system-wide
kernel variable governed by the physical memory and the page size.

– Memory for SGA is allocated completely by setting SGA_TARGET. In a different case, it will be allocated as needed.

Comment :

Memory for the SGA is allocated completely according to SGA_MAX_SIZE.


 

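Analyzing Hardware Detection Logs on Linux

In day-to-day database operations, database administrators inevitably deal with the operating system and even the hardware, and operating system logs are often an important source of clues when analyzing a database failure.
Taking Linux as an example, its logging (syslog) subsystem falls roughly into three categories: 1) user connection logs, 2) process accounting logs, and 3) system and service logs.
The first two are useful for security auditing and user monitoring, whereas for database failures caused by operating system or hardware problems we usually need to look at the system and service logs. On Linux the file we analyze most often is /var/log/messages, which contains info-level messages from the system and its services (except services such as mail and cron). The file introduced here is /var/log/dmesg, which records whether hardware was initialized successfully at boot, together with driver information for network cards, optical drives and floppy drives and configuration information for RAID, LVM, IPv6 and so on. These messages are held in the kernel ring buffer and are mainly used for diagnosing hardware faults. You can view them either with cat /var/log/dmesg or directly with the dmesg command; note that dmesg prints the current ring buffer, so it also includes messages logged after boot. For example: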

[root@nas ~]# dmesg |egrep "sd|eth"
SCSI device sda: 625142448 512-byte hdwr sectors (320073 MB)
sda: Write Protect is off
sda: Mode Sense: 00 3a 00 00
SCSI device sda: drive cache: write back
SCSI device sda: 625142448 512-byte hdwr sectors (320073 MB)
sda: Write Protect is off
sda: Mode Sense: 00 3a 00 00
SCSI device sda: drive cache: write back
 sda: sda1 sda2 sda3 sda4
sd 0:0:0:0: Attached scsi disk sda
eth0: RTL8168d/8111d at 0xffffc20000032000, b8:ac:6f:dc:8b:43, XID 081000c0 IRQ 50
sd 0:0:0:0: Attached scsi generic sg0 type 0
SCSI device sdb: 976773168 512-byte hdwr sectors (500108 MB)
sdb: Write Protect is off
sdb: Mode Sense: 10 00 00 00
sdb: assuming drive cache: write through
SCSI device sdb: 976773168 512-byte hdwr sectors (500108 MB)
sdb: Write Protect is off
sdb: Mode Sense: 10 00 00 00
sdb: assuming drive cache: write through
 sdb: sdb1 sdb2
sd 2:0:0:0: Attached scsi disk sdb
sd 2:0:0:0: Attached scsi generic sg2 type 0
EXT3 FS on sda1, internal journal
EXT3 FS on sda2, internal journal
Adding 5116692k swap on /dev/sda3.  Priority:-1 extents:1 across:5116692k
r8169: eth0: link up
r8169: eth0: link up
eth0: no IPv6 routers present

/* The output above lists the SCSI disks and the network card recognized by the system */


[root@nas ~]# cat /var/log/messages |grep -i fail
Jan 17 03:04:03 nas udevd-event[2943]: wait_for_sysfs: waiting for 
'/sys/devices/pci0000:00/0000:00:1d.7/usb2/2-1/2-1:1.0/host3/target3:0:0/3:0:0:0/ioerr_cnt' failed
Jan 18 04:45:08 nas udevd-event[5138]: wait_for_sysfs: waiting for 
'/sys/devices/pci0000:00/0000:00:1d.7/usb2/2-1/2-1:1.0/host8/target8:0:0/8:0:0:0/ioerr_cnt' failed
Jan 18 04:45:08 nas kernel: sdb : READ CAPACITY failed.
Jan 18 04:45:08 nas kernel: sdb : READ CAPACITY failed.

/* The output above lists hardware detection failures */

[root@nas ~]# dmesg |grep -i err
ACPI: IRQ0 used by override.
ACPI: IRQ2 used by override.
ACPI: IRQ9 used by override.
Using local APIC timer interrupts.
ACPI: Using IOAPIC for interrupt routing
ACPI: PCI Interrupt Routing Table [\_SB_.PCI0._PRT]
ACPI: PCI Interrupt Routing Table [\_SB_.PCI0.P0P1._PRT]
ACPI: PCI Interrupt Routing Table [\_SB_.PCI0.P0P4._PRT]
ACPI: PCI Interrupt Routing Table [\_SB_.PCI0.P0P6._PRT]
ACPI: PCI Interrupt Link [LNKA] (IRQs 3 4 6 7 10 *11 12 14 15)
ACPI: PCI Interrupt Link [LNKB] (IRQs *5)
ACPI: PCI Interrupt Link [LNKC] (IRQs 3 4 6 7 *10 11 12 14 15)
ACPI: PCI Interrupt Link [LNKD] (IRQs 3 4 6 7 10 11 12 14 *15)
ACPI: PCI Interrupt Link [LNKE] (IRQs 3 4 6 7 10 11 12 14 15) *0, disabled.
ACPI: PCI Interrupt Link [LNKF] (IRQs 3 4 6 7 10 11 12 *14 15)
ACPI: PCI Interrupt Link [LNKG] (IRQs *3 4 6 7 10 11 12 14 15)
ACPI: PCI Interrupt Link [LNKH] (IRQs 3 4 6 *7 10 11 12 14 15)
ACPI: PCI Interrupt 0000:00:1c.0[A] -> GSI 17 (level, low) -> IRQ 169
ACPI: PCI Interrupt 0000:00:1c.2[C] -> GSI 18 (level, low) -> IRQ 177
ide: Assuming 33MHz system bus speed for PIO modes; override with idebus=xx
ACPI: PCI Interrupt 0000:00:1a.7[C] -> GSI 18 (level, low) -> IRQ 177
ACPI: PCI Interrupt 0000:00:1d.7[A] -> GSI 23 (level, low) -> IRQ 209
ACPI: PCI Interrupt 0000:00:1a.0[A] -> GSI 16 (level, low) -> IRQ 217
ACPI: PCI Interrupt 0000:00:1a.1[B] -> GSI 21 (level, low) -> IRQ 225
ACPI: PCI Interrupt 0000:00:1d.0[A] -> GSI 23 (level, low) -> IRQ 209
ACPI: PCI Interrupt 0000:00:1d.1[B] -> GSI 19 (level, low) -> IRQ 233
ACPI: PCI Interrupt 0000:00:1d.2[C] -> GSI 18 (level, low) -> IRQ 177
ACPI: PCI Interrupt 0000:00:1f.2[B] -> GSI 19 (level, low) -> IRQ 233
ACPI: PCI Interrupt 0000:00:1f.3[C] -> GSI 18 (level, low) -> IRQ 177
ACPI: PCI Interrupt 0000:02:00.0[A] -> GSI 18 (level, low) -> IRQ 177
ACPI: PCI Interrupt 0000:00:1b.0[A] -> GSI 22 (level, low) -> IRQ 58

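/* The output above is what a search for "err" returns; these lines match only because "Interrupt" and "override" contain that substring, and none of them reports an actual error */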

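The format of the /var/log/dmesg hardware detection log is fairly simple, generally "device name: message text". Device names commonly seen in the log include SCSI, PCI, Memory, loop, Kernel, EXT3, DMA, CPU, Console, BIOS, ata2, ata1, ACPI, floppy and Time. ACPI stands for Advanced Configuration and Power Interface. The ACPI lines returned by the search for "err" are routine interrupt-routing messages that match only because "Interrupt" and "override" contain that substring, so they do not indicate a fault. The removable disk sdb, on the other hand, did report "READ CAPACITY failed." (judging by the preceding log entries, probably because the external USB disk was not ready); if the problem persists, the disk may fail to mount.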
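
A plain substring search for "err" also matches words such as "Interrupt" and "override", which is why the dmesg search above returns ACPI interrupt-routing lines. One way to cut down such false positives is to match whole words only. The following Python sketch illustrates the idea; the word list in PROBLEM_RE is an assumption chosen for this example, and problem_lines is a hypothetical helper name.

```python
import re

# Whole-word match, so "Interrupt" and "override" are not reported.
PROBLEM_RE = re.compile(
    r"\b(?:err|error|errors|fail|failed|failure)\b", re.IGNORECASE
)


def problem_lines(lines):
    """Return only the log lines that contain a whole-word error keyword."""
    return [line for line in lines if PROBLEM_RE.search(line)]


if __name__ == "__main__":
    sample = [
        "ACPI: IRQ0 used by override.",
        "ACPI: PCI Interrupt Link [LNKB] (IRQs *5)",
        "Jan 18 04:45:08 nas kernel: sdb : READ CAPACITY failed.",
    ]
    for line in problem_lines(sample):
        print(line)
```

The same function can be fed an open log file, since iterating over a file yields its lines.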

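First Impressions of Installing Oracle Database 11g R2

The installation software comes as two zip files. Both must be extracted before installing; extraction produces a directory named database.

linux.x64_11gR2_database_1of2.zip

linux.x64_11gR2_database_2of2.zip

The most obvious change is the look of the installer: the overall style is now almost pure white, unlike the blue-and-white style of R1.

[Screenshot: 11g1]

Metalink integration plays a bigger role in R2: security patch updates are now tied to a Metalink account.

[Screenshot: 11g2]

We choose to install the single-instance software only, without creating a database.

There is a new screen for selecting product languages. This is not the same as choosing the database character set; it mainly determines the set of languages used for help messages.

Generally, whether for development or testing, you should use Enterprise Edition, to keep environments uniform and avoid trouble.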

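Choose the installation directory; by default the software is installed under the directory given by the ORACLE_BASE variable:

[Screenshot: 11g4]

The installation prerequisite checks have changed considerably:

[Screenshot: 11g5]

The memory requirement rises from 512 MB in 10g to 1 GB, the swap requirement equals the host's physical memory size, and the tmp directory requires 1 GB.

The shell hard limit for max open files rises to 65536, which requires changing the oracle user's settings in /etc/security/limits.conf.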
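
The open-files hard limit can be checked from the account that will run the installer before starting it. Below is a minimal Python sketch using the standard resource module (available on Unix-like systems); the threshold is the 65536 figure quoted above, and nofile_ok is a hypothetical helper name, not part of the Oracle installer.

```python
import resource

REQUIRED_NOFILE_HARD = 65536  # value quoted in the text above


def nofile_ok(hard_limit, required=REQUIRED_NOFILE_HARD):
    """True if the hard open-files limit satisfies the requirement."""
    if hard_limit == resource.RLIM_INFINITY:
        return True  # an unlimited hard limit always satisfies it
    return hard_limit >= required


if __name__ == "__main__":
    # Limits of the current process, i.e. of the user running this script.
    soft, hard = resource.getrlimit(resource.RLIMIT_NOFILE)
    status = "OK" if nofile_ok(hard) else "too low"
    print("nofile soft=%s hard=%s -> %s" % (soft, hard, status))
```

Run it as the oracle user, since limits.conf entries apply per user.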

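The required value of the kernel parameter fs.file-max rises to 6815744, the port range parameter net.ipv4.ip_local_port_range changes from 1024 65000 to 9000 65500, net.core.rmem_default rises to 262144, and so on. On top of the 10g requirements, two more RPM packages are required. The first is elfutils-libelf-devel-0.97, whose purpose is described as follows: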

The elfutils-libelf-devel package contains the libraries to create
applications for handling compiled objects. libelf allows you to
access the internals of the ELF object file format, so you can see the
different sections of an ELF file.

Download: elf package

The other is unixODBC-devel-2.2.11, described as:

The unixODBC package can be used to access databases through ODBC
drivers. If you want to develop programs that will access data through
ODBC, you need to install this package.

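Download: unixodbc

In addition, the installer now generates a script that fixes the parameters automatically. Click "Fix & Check Again" and it tells you that the runfixup.sh file under /tmp/CVU_11.2.0.1.0_oracle can fix the relevant parameters, which simplifies the installation considerably.

The sysctl.conf file after the changes reads as follows: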

net.ipv4.ip_forward = 0

# Controls source route verification
net.ipv4.conf.default.rp_filter = 1

# Do not accept source routing
net.ipv4.conf.default.accept_source_route = 0

# Controls the System Request debugging functionality of the kernel
kernel.sysrq = 0

# Controls whether core dumps will append the PID to the core filename.
# Useful for debugging multi-threaded applications.
kernel.core_uses_pid = 1

kernel.shmall = 2097152
kernel.shmmax = 4589934592
kernel.shmmni = 4096
kernel.sem = 250 32000 100 128
fs.file-max = 6815744
net.ipv4.ip_local_port_range = 9000 65500
net.core.rmem_default = 262144
net.core.rmem_max = 4194304
net.core.wmem_default = 262144
net.core.wmem_max = 1048576

fs.aio-max-nr = 1048576

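Even without the two RPMs above, the Oracle software installs successfully, although we do not recommend using the ignore option.

The current installer can also save a response file, which is convenient for future silent installations.

[Screenshot: 11g6]

Finally, click Finish and go have a cup of coffee. Installing 11g takes much longer than 10g because it includes many more components, which will be covered later.

Installation progress screen:

[Screenshot: 11g7]

When the installation completes, run root.sh as the root user and close the installer.

Comparing the 10g and 11g directories, 11g adds subdirectories such as deinstall, dc_ocm, apex and sqldeveloper.

apex is Oracle Application Express, which is now bundled with the 11g server.

The deinstall script in the deinstall directory removes the current Oracle software and cleans up its information in oraInventory.

sqldeveloper is a graphical development and administration tool, a GUI counterpart to sqlplus. It takes about 80 MB of space and is generally not used.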

When reposting, please cite the source: https://www.askmac.cn
