
| Hellboy |
Posted: Jan 2 2012, 04:37 PM
|
|
|
SLF Rookie ![]() Group: Members Posts: 23 Member No.: 329 Joined: 22-June 11 |
On my home server/NAS, based on an HP MicroServer, I have SL 6.1 installed with the latest security updates. I have NFS, Samba, sabnzbd and OpenVPN up and running, but I have been getting kernel errors lately.
Kernel panic - not syncing: stack-protector: Kernel stack is corrupted. I verified that there are no hardware problems etc. Does anyone have an idea? |
|
| Jcink |
Posted: Jan 2 2012, 05:03 PM
|
|||
|
SLF IRC Team ![]() ![]() ![]() ![]() Group: Members Posts: 212 Member No.: 15 Joined: 10-April 11 |
This is a tough one. I searched around for this problem and could only find this happening to people who had bad wireless drivers, and at some point in 2009 there was a kernel bug, but it was fixed.
What kernel are you running right now? Post it here by showing the output of:
How long does it take for this issue to happen as well? A few times daily, or at random? If you were fine before and you still have all of the old kernels installed, I'd go back to one of them and then see if this still happens. |
|||
| Hellboy |
Posted: Jan 2 2012, 08:36 PM
|
|
|
SLF Rookie ![]() Group: Members Posts: 23 Member No.: 329 Joined: 22-June 11 |
I think I found the problem. I added some disks, created a RAID1 and added some parameters to /etc/sysctl.conf:
dev.raid.speed_limit_min = 50000
dev.raid.speed_limit_max = 200000
I think I overdid it a little bit. When the server has to do a lot of I/O it gets kernel crashes. |
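For anyone comparing their own settings: the two values above throttle md resync speed and are given in KB/s per device. As far as I know the kernel defaults are 1000 (min) and 200000 (max), so only the minimum was actually raised here. Below is a minimal sketch for listing these entries from a sysctl-style file. The helper name `list_raid_limits` is made up for illustration and is not part of any tool mentioned in the thread.

```shell
#!/bin/sh
# list_raid_limits FILE: print the dev.raid.speed_limit_* entries found in a
# sysctl-style file, normalised to "key=value". Hypothetical helper name.
list_raid_limits() {
    # Match only the two md resync throttles (commented-out lines do not
    # match because of the ^ anchor), then strip all whitespace.
    grep -E '^[[:space:]]*dev\.raid\.speed_limit_(min|max)[[:space:]]*=' "$1" |
        sed -e 's/[[:space:]]//g'
}

# Usage (paths as on a stock install, assumed here):
#   list_raid_limits /etc/sysctl.conf
#   cat /proc/sys/dev/raid/speed_limit_min /proc/sys/dev/raid/speed_limit_max
```

The second usage line reads the values the running kernel is actually using, which may differ from the file if it has not been reloaded.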
|
| Hellboy |
Posted: Jan 4 2012, 01:17 PM
|
|
|
SLF Rookie ![]() Group: Members Posts: 23 Member No.: 329 Joined: 22-June 11 |
It happened again. I uploaded 2 nzb files, sabnzbd became active, and then the system crashed again. I never had this problem before with sabnzbd.
What could be the problem? |
|
| Jcink |
Posted: Jan 4 2012, 02:29 PM
|
|
|
SLF IRC Team ![]() ![]() ![]() ![]() Group: Members Posts: 212 Member No.: 15 Joined: 10-April 11 |
I don't think it's related to sabnzbd, but an i/o issue somewhere as you pointed out.
What kind of disks are they? Not that I mean to sound like I'm doubting you or anything, but also - how did you verify that there was nothing wrong with the hard disks? I'm guessing you checked SMART data but unless you ran the manufacturer's tools as well, you can't be sure. Also, is the BIOS up to date? |
|
| Hellboy |
Posted: Jan 4 2012, 02:49 PM
|
|
|
SLF Rookie ![]() Group: Members Posts: 23 Member No.: 329 Joined: 22-June 11 |
I have 4 disks in the server: 2 * 300GB in RAID1 and 2 * 1TB in RAID1, all software RAID, because the hardware RAID is a fake one.
On the 2 * 300GB I have a 512MB partition for /boot and the rest of the disk is LVM, and the 2 * 1TB RAID1 is all for LVM. I have one big volume group. I didn't have smartmontools installed, I forgot. I will install them and have a look. |
|
| Hellboy |
Posted: Jan 4 2012, 08:59 PM
|
|
|
SLF Rookie ![]() Group: Members Posts: 23 Member No.: 329 Joined: 22-June 11 |
smartmontools are installed, but there are no errors.
I also did a test to see if I have faulty memory. I checked the amount of memory, I have 4GB, so I did a dd:
dd if=/dev/urandom bs=4021868 of=/data/memtest count=1050
md5sum /data/memtest; md5sum /data/memtest; md5sum /data/memtest
All the checksums are equal, so it shouldn't be a memory error. I uploaded 4 nzb's and ran iometer, but the server didn't crash. I will see if I can upgrade the BIOS. |
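The write-once, checksum-several-times idea above can be sketched as a small helper. This is only a rough check: repeated reads may be served from the page cache and the test touches only part of RAM, so matching checksums are weak evidence compared with a dedicated tester such as memtest86. The function name `checksums_agree` and the /tmp path are assumptions for illustration.

```shell
#!/bin/sh
# checksums_agree FILE [RUNS]: hash FILE RUNS times (default 3) and succeed
# only if every run yields the same MD5. Hypothetical helper name.
checksums_agree() {
    file=$1
    runs=${2:-3}
    # Return 2 if the file cannot be read at all.
    [ -r "$file" ] || return 2
    first=$(md5sum "$file" | cut -d' ' -f1)
    i=1
    while [ "$i" -lt "$runs" ]; do
        next=$(md5sum "$file" | cut -d' ' -f1)
        # Return 1 on the first checksum that differs from the first run.
        [ "$next" = "$first" ] || return 1
        i=$((i + 1))
    done
    return 0
}

# Usage, with a much smaller file than the roughly 4GB written in the post:
#   dd if=/dev/urandom of=/tmp/memtest.bin bs=1M count=64
#   checksums_agree /tmp/memtest.bin 3 && echo "checksums match"
```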
|
| Hellboy |
Posted: Jan 5 2012, 06:06 PM
|
|
|
SLF Rookie ![]() Group: Members Posts: 23 Member No.: 329 Joined: 22-June 11 |
I also removed vm.swappiness = 0 from /etc/sysctl.conf and reloaded with sysctl -p /etc/sysctl.conf. Did some tests and rebooted the server.
I opened up OpenVPN and uploaded 4 nzb's, and the server is all fine. Still a strange problem. I will update the BIOS and other firmware this weekend. |
|
| helikaon |
Posted: Jan 6 2012, 11:06 AM
|
|
![]() SLF Moderator ![]() ![]() ![]() ![]() ![]() ![]() Group: Moderators Posts: 516 Member No.: 4 Joined: 8-April 11 |
Hi,
this is definitely a hard problem to track down, as we don't know much about your system.
1. As I gather, your problem started after you added the new 2x 1TB disks to SW RAID1, right? So you have some /dev/mdX device and LVM on top of it.
2. Do you have a 64-bit OS?
3. How do you upload the files to your server? I'm not familiar with sabnzbd, so you run OpenVPN and then connect to sabnzbd via some sabnzbd client?
4. Keep tail -f /var/log/messages running all the time; we could catch something useful there which is lost when the server panics and reboots.
5. Try a different kernel?
cheers, |
|
| Hellboy |
Posted: Jan 6 2012, 01:24 PM
|
|
|
SLF Rookie ![]() Group: Members Posts: 23 Member No.: 329 Joined: 22-June 11 |
The hardware is an HP MicroServer N36L.
OS = SL 6.1
Kernel = latest that comes with SL 6.1
Sabnzbd is a Python program with a web interface (the engine is CherryPy) from which you can upload nzb files.
I tried 2 older kernel versions, but got the same result. I looked at all the logs, nothing useful. I did memory tests (memtest86). The server is running stable without my tuning parameters, and I have done a load of load tests (iometer, sabnzbd, iperf). So I will leave it as it is right now. |
|
| helikaon |
Posted: Jan 6 2012, 01:36 PM
|
|
![]() SLF Moderator ![]() ![]() ![]() ![]() ![]() ![]() Group: Moderators Posts: 516 Member No.: 4 Joined: 8-April 11 |
Hi,
if the issue is hampering usability of the server, you could still try the non-SL kernels. In our forum kernel section 'torracat' posted a link to precompiled nonstandard kernels for RHEL/CentOS/SL (vanilla kernels packaged as RPMs), so it should be safe to try them out... Otherwise the next step is probably to configure a kernel dump and ask on the SL devel lists for help ... cheers, |
|
| Jcink |
Posted: Jan 7 2012, 09:05 AM
|
|
|
SLF IRC Team ![]() ![]() ![]() ![]() Group: Members Posts: 212 Member No.: 15 Joined: 10-April 11 |
It's a tough one to track down period, unfortunately.
It seems as though you've done a lot of the things I would have tried already if I was stuck in your situation. Checking the memory out multiple times, looking over SMART data, going back to old kernels, verifying other hardware is in check... the only possible thing left that I can think of to do is the BIOS update. Did you try that yet? If it does happen again with your tuning settings removed, I'd say give it a shot and update it. If not this will definitely have to be taken to the SL mailing list unless we can get a crash dump analyzer in here... |
|
| Hellboy |
Posted: Jan 7 2012, 10:39 AM
|
|
|
SLF Rookie ![]() Group: Members Posts: 23 Member No.: 329 Joined: 22-June 11 |
I will try to update the BIOS asap and will post my findings then. I see there were more issues; some folks with Ubuntu also had kernel panics. I see that there were also some problems with the firmware for the built-in network card.
The server has now been up and running for 2 days under heavy load, and it is stable. |
|
| Hellboy |
Posted: Jan 8 2012, 03:10 PM
|
|
|
SLF Rookie ![]() Group: Members Posts: 23 Member No.: 329 Joined: 22-June 11 |
Today I started up my lab Spacewalk (the open source edition of Red Hat Satellite server). It also has a DHCP server on board. Immediately my NAS got a kernel panic.
My NAS is also running OpenVPN in bridged mode, which means my NIC is in promiscuous mode. Could that be a problem? |
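A quick way to confirm whether an interface really is in promiscuous mode is to look at its flags word in sysfs, where IFF_PROMISC is bit 0x100. A NIC that is a member of a bridge is normally put into promiscuous mode, so the flag on its own does not indicate a fault. The sketch below is only illustrative; the function names and the interface name eth0 are assumptions, not taken from the thread.

```shell
#!/bin/sh
# IFF_PROMISC is bit 0x100 in the Linux interface flags word.
# flags_promisc HEXFLAGS: succeed if the promiscuous bit is set.
flags_promisc() {
    [ $(( $1 & 0x100 )) -ne 0 ]
}

# iface_promisc IFACE: read the flags word from sysfs and test it.
iface_promisc() {
    flags_promisc "$(cat "/sys/class/net/$1/flags")"
}

# Usage (eth0 is an assumed interface name):
#   iface_promisc eth0 && echo "eth0 is promiscuous"
```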
|
| Hellboy |
Posted: Jan 14 2012, 08:51 AM
|
|
|
SLF Rookie ![]() Group: Members Posts: 23 Member No.: 329 Joined: 22-June 11 |
I updated the BIOS and firmware of the onboard network card. Everything is up and running for a week now without problems. I even added services, i am hosting a bunch of websites on the server.
|
|
| helikaon |
Posted: Jan 14 2012, 06:08 PM
|
|||
![]() SLF Moderator ![]() ![]() ![]() ![]() ![]() ![]() Group: Moderators Posts: 516 Member No.: 4 Joined: 8-April 11 |
Yay, gratz on that. So firmware in the end ... it is something to do if things don't work and not to do if they work. cheers, |
|||