Scientific Linux Forum.org




> Unwanted system reboot
gentleman
 Posted: May 30 2013, 05:37 PM


SLF Newbie


Group: Members
Posts: 2
Member No.: 2528
Joined: 30-May 13









I am having a problem with an SL 5.5 installation on a desktop machine. The CPU is an AMD Phenom II X4 945. The OS is installed on a RAID 1 volume. The kernel is:

Linux node01.cluster 2.6.18-348.4.1.el5 #1 SMP Tue Apr 16 15:42:58 EDT 2013 x86_64 x86_64 x86_64 GNU/Linux

The system reboots on seemingly random days, but apparently always at the same time of day. From /var/log/messages:

May 12 04:02:02 node01 syslogd 1.4.1: restart.

The day is random but the time 04:02 is when the system daily cron jobs are run:

/etc/cron.daily/0anacron
/etc/cron.daily/0logwatch
/etc/cron.daily/cups
/etc/cron.daily/logrotate
/etc/cron.daily/makewhatis.cron
/etc/cron.daily/mlocate.cron
/etc/cron.daily/prelink
/etc/cron.daily/rpm
/etc/cron.daily/tetex.cron
/etc/cron.daily/tmpwatch
/etc/cron.daily/yum.cron

These are the default system cron jobs that come with SL 5.5. After checking them, the only candidate I can see for sending a reboot signal is prelink. However, I cannot identify why it would send one, and I am not sure that prelink is actually the script responsible. I do not see any other message immediately before the reboot time. Do you have any ideas or suggestions?
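To check whether the restarts really cluster at 04:02, the log can be scanned for lines of the form quoted above. This is only a minimal sketch: it assumes the restart lines look exactly like the one shown from /var/log/messages, it assumes a Python 3 interpreter (not the SL 5 default), and the function names `restart_times` and `count_by_time` are made up for illustration. It counts "restart." lines only and does not by itself establish what caused each one.

```python
import re
from collections import Counter

# Matches lines shaped like the one quoted in this post, e.g.
# "May 12 04:02:02 node01 syslogd 1.4.1: restart."
RESTART_RE = re.compile(
    r"^(?P<month>[A-Z][a-z]{2})\s+(?P<day>\d{1,2})\s+"
    r"(?P<hour>\d{2}):(?P<minute>\d{2}):\d{2}\s+\S+\s+"
    r"syslogd\s+\S+\s+restart\."
)


def restart_times(lines):
    """Return (month, day, 'HH:MM') for every syslogd restart line."""
    found = []
    for line in lines:
        m = RESTART_RE.match(line)
        if m:
            hhmm = "%s:%s" % (m.group("hour"), m.group("minute"))
            found.append((m.group("month"), int(m.group("day")), hhmm))
    return found


def count_by_time(lines):
    """Count restart lines per time of day (HH:MM)."""
    return Counter(hhmm for _, _, hhmm in restart_times(lines))
```

Passing an open file handle for /var/log/messages (and any rotated copies) to `count_by_time` gives a tally per time of day; if nearly all entries fall on 04:02, that supports the link with the daily cron run.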
redman
 Posted: May 30 2013, 09:09 PM


Retired SLF Administrator

Group: Admins
Posts: 1276
Member No.: 2
Joined: 8-April 11









QUOTE (gentleman @ May 30 2013, 07:37 PM)
I do not see any other message immediately before the reboot time.

That is a pity... My first thought was that faulty hardware might be causing it (for example, a faulty hard disk that gets confused and causes the reboot when some read/write action is performed).

--------------------
"Sometimes the best helping hand you can give is a good, firm push."
tux99
 Posted: May 31 2013, 04:50 AM


SLF Moderator

Group: Moderators
Posts: 1277
Member No.: 224
Joined: 28-May 11









I don't think it's any of the scripts. What's more likely is a hardware fault that gets triggered by the high system load when those scripts run. 'mlocate' in particular can put quite a high load on the hard disks.

I'd suggest running some hardware diagnostics on your PC. Maybe start with memtest86 and let it run for 12-24 hours (not just 1 pass), and then try something that puts a high load on the CPU, like building a custom kernel.

--------------------
My personal SL6 repository, specialized in audio/video software: http://pkgrepo.linuxtech.net/el6/
(can be used together with EPEL and ELRepo repositories) - repository mirror: http://linuxsoft.cern.ch/linuxtech/el6/
gentleman
 Posted: May 31 2013, 08:59 PM


SLF Newbie


Group: Members
Posts: 2
Member No.: 2528
Joined: 30-May 13









Is there a way to check for hard drive I/O failures? memtest86 only checks for RAM failures under heavy load. Also, if a hard disk failed under heavy load, I would expect to notice it because the RAID volume would become degraded.
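For the RAID part, a degraded array can be detected from the text of /proc/mdstat. The following is a rough sketch, assuming Linux software RAID (md), the usual status line such as `[2/2] [UU]`, and a Python 3 interpreter; the function name `degraded_arrays` is made up for illustration. It only reports arrays that are currently degraded and would not catch transient I/O errors that the array survived.

```python
import re

# An array header line starts with the device name, e.g. "md0 : active raid1 ..."
ARRAY_RE = re.compile(r"^(md\S*)\s*:")
# The status line carries "[configured/active] [UU]", with "_" for a missing member
STATE_RE = re.compile(r"\[(\d+)/(\d+)\]\s+\[([U_]+)\]")


def degraded_arrays(mdstat_text):
    """Return the names of md arrays that have missing or failed members."""
    degraded = []
    current = None
    for line in mdstat_text.splitlines():
        header = ARRAY_RE.match(line)
        if header:
            current = header.group(1)
            continue
        state = STATE_RE.search(line)
        if state and current is not None:
            configured = int(state.group(1))
            active = int(state.group(2))
            flags = state.group(3)
            if active < configured or "_" in flags:
                degraded.append(current)
            current = None
    return degraded
```

Calling `degraded_arrays(open("/proc/mdstat").read())` after one of the unexpected restarts would show whether a member had dropped out of the RAID 1 volume.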