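I got to the office first thing this morning, opened my mailbox, and found a pile of alert emails: one of the standby databases had gone down.
I logged in to check the primary database, whose alert log had recorded the following errors: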
*** 2006-10-30 07:32:10.614
kcrrfail: dest:2 err:12560 force:0
ORA-12560: TNS:protocol adapter error
*** 2006-10-30 07:34:10.615
Error 12541 connecting to destination LOG_ARCHIVE_DEST_2 standby host 'bmarksb'
Error 12541 attaching to destination LOG_ARCHIVE_DEST_2 standby host 'bmarksb'
Heartbeat failed to connect to standby 'bmarksb'. Error is 12541.
*** 2006-10-30 07:34:10.615
kcrrfail: dest:2 err:12541 force:0
ORA-12541: TNS:no listener
*** 2006-10-30 07:36:10.615
Error 12541 connecting to destination LOG_ARCHIVE_DEST_2 standby host 'bmarksb'
Error 12541 attaching to destination LOG_ARCHIVE_DEST_2 standby host 'bmarksb'
Heartbeat failed to connect to standby 'bmarksb'. Error is 12541.
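To see at a glance which TNS errors the primary ran into, the log text can be tallied by error code. This is only a rough sketch: the function name tally_ora_errors is my own, it assumes a grep that supports -o, and it reads whatever log text is piped into it.

```shell
#!/bin/sh
# Tally the distinct ORA- error codes found in log text read from stdin.
# Prints one "count code" pair per line, most frequent first,
# with ties broken by error code.
# tally_ora_errors is a hypothetical helper name, not an Oracle tool.
tally_ora_errors() {
    grep -o 'ORA-[0-9][0-9]*' \
        | sort \
        | uniq -c \
        | awk '{print $1, $2}' \
        | sort -k1,1nr -k2,2
}
```

It would be used as tally_ora_errors < some_log_file, where the file is whichever log or trace file holds the messages above.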
I immediately logged in to the standby host and started the standby database by hand:
[oracle@wapcom2 bdump]$ sqlplus "/ as sysdba"
SQL*Plus: Release 9.2.0.6.0 - Production on Mon Oct 30 08:17:24 2006
Copyright (c) 1982, 2002, Oracle Corporation. All rights reserved.
Connected to an idle instance.
SQL> startup nomount;
ORACLE instance started.
Total System Global Area 470881780 bytes
Fixed Size 452084 bytes
Variable Size 167772160 bytes
Database Buffers 301989888 bytes
Redo Buffers 667648 bytes
SQL> alter database mount standby database;
Database altered.
SQL> alter database recover managed standby database disconnect from session;
Database altered.
SQL> exit
Disconnected from Oracle9i Enterprise Edition Release 9.2.0.6.0 - Production
With the Partitioning, OLAP and Oracle Data Mining options
JServer Release 9.2.0.6.0 - Production
[oracle@wapcom2 bdump]$ lsnrctl start
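Since the same few statements have to be typed every time the standby host comes back up, they could be kept in a small script. The sketch below only prints the statements from the session above, in the same order; the function name standby_startup_sql is made up for illustration, and nothing here has been run against a real instance.

```shell
#!/bin/sh
# Print the SQL*Plus statements from the manual session above,
# in the order they were entered.
# standby_startup_sql is a hypothetical helper name.
standby_startup_sql() {
    cat <<'EOF'
startup nomount;
alter database mount standby database;
alter database recover managed standby database disconnect from session;
exit
EOF
}

# Possible usage on the standby host (assumes OS authentication works
# for the oracle user, as it did in the session above):
#   standby_startup_sql | sqlplus -S "/ as sysdba"
#   lsnrctl start
```

Keeping the SQL in a function that only prints makes it easy to review what will run before piping it into sqlplus.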
Watching the standby's alert log, I could see the archived logs being applied automatically:
[oracle@wapcom2 bdump]$ tail -f alert_bmark.log
Standby Database mounted.
Completed: alter database mount standby database
Mon Oct 30 08:19:23 2006
alter database recover managed standby database disconnect from session
Attempt to start background Managed Standby Recovery process
MRP0 started with pid=12
MRP0: Background Managed Standby Recovery process started
Media Recovery Waiting for thread 1 seq# 5151
Mon Oct 30 08:19:29 2006
Completed: alter database recover managed standby database di
Mon Oct 30 08:22:58 2006
Media Recovery Log /opt/oracle/oradata/bmark/stdarch/1_5151.arc
Media Recovery Log /opt/oracle/oradata/bmark/stdarch/1_5152.arc
Media Recovery Log /opt/oracle/oradata/bmark/stdarch/1_5153.arc
Media Recovery Log /opt/oracle/oradata/bmark/stdarch/1_5154.arc
Media Recovery Log /opt/oracle/oradata/bmark/stdarch/1_5155.arc
Media Recovery Waiting for thread 1 seq# 5156
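Rather than reading the tail output by eye, the applied and pending sequence numbers can be pulled out of the alert log. This is a minimal sketch that assumes the message wording shown above and archive file names ending in a number, an underscore, the sequence number and .arc, as in this log; the two function names are my own.

```shell
#!/bin/sh
# Both helpers read alert-log text from stdin. The names are hypothetical.

# Print the sequence number of every archived log reported as applied
# ("Media Recovery Log .../1_5151.arc" prints 5151).
applied_sequences() {
    sed -n 's/^Media Recovery Log .*\/[0-9][0-9]*_\([0-9][0-9]*\)\.arc$/\1/p'
}

# Print the sequence number recovery was most recently waiting for
# (the last "Media Recovery Waiting for thread N seq# M" line).
waiting_sequence() {
    sed -n 's/^Media Recovery Waiting for thread [0-9][0-9]* seq# \([0-9][0-9]*\)$/\1/p' \
        | tail -n 1
}
```

For example, applied_sequences < alert_bmark.log lists the applied sequences, and waiting_sequence < alert_bmark.log shows where recovery currently stands.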
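Digging further into the cause, it turned out that the host itself had a problem and had been rebooting over and over during the night: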
-bash-2.05b$ last |grep reboot
reboot system boot 2.4.21-15.ELsmp Mon Oct 30 08:10 (02:14)
reboot system boot 2.4.21-15.ELsmp Mon Oct 30 07:51 (02:32)
reboot system boot 2.4.21-15.ELsmp Mon Oct 30 07:38 (02:45)
reboot system boot 2.4.21-15.ELsmp Mon Oct 30 07:35 (02:48)
reboot system boot 2.4.21-15.ELsmp Mon Oct 30 07:21 (03:02)
reboot system boot 2.4.21-15.ELsmp Mon Oct 30 07:18 (03:05)
reboot system boot 2.4.21-15.ELsmp Mon Oct 30 06:39 (03:44)
reboot system boot 2.4.21-15.ELsmp Mon Oct 30 06:37 (03:46)
reboot system boot 2.4.21-15.ELsmp Mon Oct 30 06:32 (03:51)
reboot system boot 2.4.21-15.ELsmp Mon Oct 30 06:03 (04:21)
reboot system boot 2.4.21-15.ELsmp Mon Oct 30 01:48 (08:36)
reboot system boot 2.4.21-15.ELsmp Mon Oct 30 01:23 (09:01)
reboot system boot 2.4.21-15.ELsmp Mon Oct 30 00:39 (09:44)
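To make the pattern easier to see, the reboot lines can be grouped by hour. The sketch below assumes the field layout shown above (month, day and boot time in the sixth, seventh and eighth whitespace-separated fields); other versions of last may lay the line out differently, and the function name reboots_per_hour is hypothetical.

```shell
#!/bin/sh
# Count reboot records per hour from `last` output read on stdin.
# Assumes lines shaped like:
#   reboot system boot 2.4.21-15.ELsmp Mon Oct 30 08:10 (02:14)
# Prints "Mon DD HH:00 count", sorted as plain text (not chronologically
# across months).
reboots_per_hour() {
    awk '$1 == "reboot" {
             split($8, t, ":")                  # t[1] is the hour
             n[$6 " " $7 " " t[1] ":00"]++
         }
         END { for (k in n) print k, n[k] }' | sort
}
```

Used as last | reboots_per_hour. For the thirteen boots listed above, the busiest hours are 07:00 with five reboots and 06:00 with four.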
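At first glance this looks like a hardware fault. Hardware failures have been extremely frequent lately, and with the end of the year approaching we are entering the season when incidents pile up.
A reminder to everyone to keep a close eye on your own systems too.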
Reference:
http://www.eygle.com/ha/dataguard-step-by-step.htm
-The End-