My web server was down for most of yesterday, for reasons out of my control.
I shelled in and restarted the Apache web service, only to see one error message: Segmentation fault. The last few lines of Apache’s error_log showed many more errors related to the same problem. It’s one of the most dreaded messages one can receive on a Linux box, aside from the ‘kernel panic’.
A segmentation fault usually means one of two things in the Linux world: a software fault or a hardware fault. So, I set out to determine what the underlying cause was.
# apt-get update
Segmentation Fault
As a system package installation tool, apt-get usually runs without any errors (unless network issues get in the way). Perhaps a corrupt filesystem was the culprit? I created an empty file named forcefsck in the root of the filesystem to force a file system check on the next reboot. However, issuing the reboot command remotely turned out to be a bad idea: the check stopped to wait for my input, and the server sat there until I got home. There were multiple soft disk errors waiting to be corrected. After correcting the file system errors, I ran MemTest86+ and Seagate’s SeaTools to test the RAM and hard drive, respectively. Both came back clean, with no errors. After some more software and hardware testing, the culprit was determined to be the motherboard. Argh! Over 10 GB of data needed to be backed up.
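For anyone who wants to repeat the forcefsck step, here is a minimal sketch of what it amounts to. The request_fsck function name and its optional root argument are my own illustration, not part of any tool; the only real mechanism is the empty /forcefsck flag file, which the boot scripts on Debian and Ubuntu releases of this era look for. Newer systems may ignore it.

```shell
#!/bin/sh
# Hypothetical helper (the name is an assumption, for illustration only).
# It creates the empty "forcefsck" flag file in the given root directory,
# defaulting to / when no argument is passed.
request_fsck() {
    root="${1:-/}"
    # Strip a trailing slash so "/" becomes "" and the path is /forcefsck.
    touch "${root%/}/forcefsck"
}

# Typical use, as root, followed by a reboot:
#   request_fsck
#   shutdown -r now
```

Keep in mind what bit me: if the check finds errors it cannot fix on its own, it may stop and wait for someone at the console, so it is best done when you can physically get to the machine.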
Four hours later (I started at 7 PM and finished at 11 PM), the server was up and running on entirely new hardware (well, except for the hard drive). I decided to use Ubuntu Server 6.06.1 LTS for the server operating system this time around. After a couple of minor obstacles (e.g. compiling and installing eAccelerator 0.9.5-rc1), everything is running smoothly again.