Oracle Explorer: RAC database hang issue

One node of a two-node Oracle 10.2.0.4 RAC database hung and restarted around 1:20 AM this morning. Here are some of the error messages reported in the trace files, alert logs, and other log files:

Received ORADEBUG command 'dump errorstack 1' from process Unix process pid: 16209, image: *** 2011-02-02 10:21:13.667

In alert_node1.log file:

Tue Feb 1 16:52:47 2011
Tue Feb 1 23:15:27 2011

Trace dumping is performing id=[cdmp_20110201162127]

Wed Feb 2 01:10:51 2011

Errors in file /oracle/admin/imsr/bdump/imsr1_pmon_10547.trc:

ORA-00482: LMD* process terminated with error

Wed Feb 2 01:21:46 2011

PMON: terminating instance due to error 482

Wed Feb 2 01:21:46 2011

System state dump is made for local instance

System State dumped to trace file /oracle/admin/imsr/bdump/imsr1_diag_10549.trc
Wed Feb 2 01:21:51 2011

Instance terminated by PMON, pid = 10547

Wed Feb 2 01:21:52 2011

Instance terminated by USER, pid = 16125

Wed Feb 2 01:21:54 2011

Starting ORACLE instance (normal)
Wed Feb 2 09:52:33 2011

Thread 1 advanced to log sequence 12344 (LGWR switch)

Current log# 2 seq# 12344 mem# 0: /oradata/imsr/redo02.log

Thread 1 cannot allocate new log, sequence 12345

Checkpoint not complete

Current log# 2 seq# 12344 mem# 0: /oradata/imsr/redo02.log

Wed Feb 2 09:52:42 2011

Thread 1 advanced to log sequence 12345 (LGWR switch)

Current log# 1 seq# 12345 mem# 0: /oradata/imsr/redo01.log

Wed Feb 2 10:08:26 2011

IPC Send timeout detected. Receiver ospid 16215

Wed Feb 2 10:21:13 2011

Errors in file /oracle/admin/imsr/bdump/imsr1_lmd0_16215.trc:

Wed Feb 2 10:27:38 2011

Trace dumping is performing id=[cdmp_20110202094318]

In alert_node2.log file:

Waiting for instances to leave:
IPC Send timeout detected.Sender: ospid 11471

Receiver: inst 1 binc 1824903189 ospid 1621
Wed Feb 2 10:16:22 2011

MMNL absent for 1807 secs; Foregrounds taking over

Wed Feb 2 10:16:33 2011

Waiting for instances to leave:
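
To line up events like the ones above across both nodes, it can help to pair each message with the timestamp header that precedes it. Below is a minimal Python sketch. It assumes the 10g alert log layout seen in these excerpts: a timestamp line such as `Wed Feb 2 01:21:46 2011`, followed by one or more message lines. The names `parse_alert_log` and `interesting`, the keyword list, and the `SAMPLE` text (copied from the node1 excerpt) are illustrative only and are not part of any Oracle tool.

```python
import re
from datetime import datetime

# Timestamp header as it appears in the excerpts, e.g. "Wed Feb 2 01:21:46 2011".
TS_RE = re.compile(r"^[A-Z][a-z]{2} [A-Z][a-z]{2} +\d{1,2} \d{2}:\d{2}:\d{2} \d{4}$")


def parse_alert_log(text):
    """Return (timestamp, message) pairs.

    Messages that appear before the first timestamp header get None.
    """
    events = []
    current = None
    for raw in text.splitlines():
        line = raw.strip()
        if not line:
            continue  # the excerpts contain many blank lines
        if TS_RE.match(line):
            # Collapse repeated spaces so strptime sees single separators.
            current = datetime.strptime(" ".join(line.split()),
                                        "%a %b %d %H:%M:%S %Y")
            continue
        events.append((current, line))
    return events


def interesting(events, keywords=("ORA-", "terminat", "IPC Send timeout")):
    """Keep only messages containing one of the (assumed) keywords."""
    return [(ts, msg) for ts, msg in events
            if any(k in msg for k in keywords)]


# A few lines copied from the alert_node1.log excerpt.
SAMPLE = """\
Wed Feb 2 01:10:51 2011
Errors in file /oracle/admin/imsr/bdump/imsr1_pmon_10547.trc:
ORA-00482: LMD* process terminated with error
Wed Feb 2 01:21:46 2011
PMON: terminating instance due to error 482
Wed Feb 2 01:21:54 2011
Starting ORACLE instance (normal)
"""

if __name__ == "__main__":
    for ts, msg in interesting(parse_alert_log(SAMPLE)):
        print(ts, msg)
```

Running the same function over both alert logs and sorting the combined list by timestamp gives a single timeline for the two instances. Note that `%a` and `%b` depend on the locale, so this assumes English day and month names.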

Here are some Metalink notes and articles I found related to the above error messages:

Based on the above links, the following are the likely causes of this database hang:

1 MAXBYTES is smaller than BYTES

set lines 300
col file_name format a50
select file_name, tablespace_name, bytes/1024/1024 as size_mb, maxbytes/1024/1024 as max_size_mb from dba_data_files;

2 Hit Oracle bugs (very likely)

3 Automatic SGA setting caused crash

To be continued.....................

Labels: db hang, RAC

Wednesday, February 2, 2011
