Going on Instinct: A Mysterious Kernel Panic
There’s nothing quite like a Monday that begins with a kernel panic on one of your most heavily used machines. On Friday I applied a set of updates to my CentOS 4.1 install (the pack of updates that apply to a fresh installation), only to come in and find that useradd was eating up 99% of one of the processors. Useradd wouldn’t accept a kill signal either, so I scheduled a reboot. That was the last straw for the machine: it came back up with a kernel panic in drivers/input/serio/i8042.c:988.
Doing my homework, I found that the runaway useradd process is a known bug in both the Red Hat and CentOS bug trackers. There was no mention of a kernel panic on reboot, however. So I backed up my data and set out to do a clean installation. I walked through each of the 104 updates to the system, one at a time, to narrow down the troublemaker. Only when I had installed the upgraded device-mapper (1.01.04-1.0.RHEL4) did I experience the problem. As part of dependency resolution, the machine had also, oddly, installed libselinux.i386, device-mapper.i386 and dmraid.i386.
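Stepping through all 104 updates one by one works, but it is slow. A bisection over the ordered update list is a faster alternative. The sketch below is not what I actually did; it shows the idea under two assumptions: the updates can be applied in a fixed order, and once the bad package is installed the machine stays broken. The names first_bad_update and is_broken are made up for illustration, and is_broken stands in for the manual "install these packages on a clean system and try to boot" step.

```python
from typing import Callable, Optional, Sequence


def first_bad_update(
    updates: Sequence[str],
    is_broken: Callable[[Sequence[str]], bool],
) -> Optional[str]:
    """Return the first update whose installation breaks the system.

    is_broken(prefix) is a hypothetical check: it should report whether a
    clean system with exactly the updates in `prefix` applied fails to boot.
    Returns None if even the full set of updates is fine.
    """
    if not updates or not is_broken(updates):
        return None

    # Invariant: the first `lo` updates are known good,
    # the first `hi` updates are known broken.
    lo, hi = 0, len(updates)
    while hi - lo > 1:
        mid = (lo + hi) // 2
        if is_broken(updates[:mid]):
            hi = mid
        else:
            lo = mid
    # The update that took the system from good to broken.
    return updates[hi - 1]
```

With 104 updates this needs well under ten test installs instead of as many as 104, at the price of assuming that the breakage comes from a single package and does not depend on install order.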
Noticing libselinux in the odd bunch, I wanted to make sure my kernel wasn’t enabling SELinux and somehow tripping over my 3ware RAID controller. The 3ware RAID stood out in my mind because it was the only unusual piece of hardware on the machine, the one thing I would have had to install kernel support for in an earlier version. To make sure SELinux was off, I passed selinux=off to the kernel via GRUB and, lo and behold, the machine booted. I’m still trying to determine precisely why the SELinux policy would have conflicted with this setup, and whether the new enhanced support for audit and its SELinux interactions, also among the updated packages, might have something to do with it.
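After a boot like that, it is worth confirming that the parameter really reached the kernel. On Linux the running kernel's command line is exposed in /proc/cmdline, so a small script can pull the value out. This is a minimal sketch; the helper names kernel_param and read_cmdline are made up for this example, and the sample command line in the comment is hypothetical, not copied from my machine.

```python
from typing import Optional


def kernel_param(cmdline: str, name: str) -> Optional[str]:
    """Look up `name` on a kernel command line.

    Returns the text after `name=`, an empty string for a bare flag
    with no value, or None if the parameter is not present.
    If the parameter appears more than once, the last one is returned.
    """
    value = None
    for token in cmdline.split():
        key, _sep, val = token.partition("=")
        if key == name:
            value = val
    return value


def read_cmdline(path: str = "/proc/cmdline") -> str:
    """Read the command line the running kernel was booted with."""
    with open(path) as f:
        return f.read().strip()


if __name__ == "__main__":
    # Hypothetical example of what the file might contain:
    #   ro root=/dev/sda1 selinux=off
    print("selinux =", kernel_param(read_cmdline(), "selinux"))
```

If this prints None, the option never made it onto the kernel line, and the GRUB entry is the first place to look.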