Linux: Preserving Oops Data Through Resets

Posted by dcparris on Apr 10, 2006 8:39 PM EDT
KernelTrap; By Jeremy

James Courtier asked the Linux Kernel mailing list whether the kernel ring buffer, typically viewed with the dmesg command, could be restored after a reset. He proposed simply writing the ring buffer data redundantly to memory in the hope that not all RAM is erased at boot time, allowing the buffer to be reconstructed. Referring to the method of collecting data from an oops through a serial connection, James explained, "the main advantage of something like this would be for newer motherboards that are around now that don't have a serial port." An existing solution to this problem is to use kexec to boot a special lightweight kernel after a crash and collect a kernel crash dump.
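
As a rough illustration of the redundancy idea, the following userspace C sketch keeps three copies of a small log and rebuilds it with a bitwise majority vote. The article does not say how James intended to reconstruct the buffer, so the voting scheme, the sizes, and every name here (LOG_SIZE, COPIES, redundant_log, save_redundant, recover_majority) are assumptions made for the example, not kernel code.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define LOG_SIZE 64  /* hypothetical log size in bytes */
#define COPIES   3   /* the majority vote below assumes exactly three copies */

/* Hypothetical container standing in for three separate RAM regions. */
struct redundant_log {
    uint8_t copy[COPIES][LOG_SIZE];
};

/* Write the same log data into every copy, zero-padding the remainder. */
static void save_redundant(struct redundant_log *s, const uint8_t *log, size_t len)
{
    if (len > LOG_SIZE)
        len = LOG_SIZE;
    for (int i = 0; i < COPIES; i++) {
        memset(s->copy[i], 0, LOG_SIZE);
        memcpy(s->copy[i], log, len);
    }
}

/*
 * Rebuild the log bit by bit: each output bit takes the value that
 * at least two of the three copies agree on.
 */
static void recover_majority(const struct redundant_log *s, uint8_t *out)
{
    for (size_t i = 0; i < LOG_SIZE; i++) {
        uint8_t a = s->copy[0][i];
        uint8_t b = s->copy[1][i];
        uint8_t c = s->copy[2][i];
        out[i] = (uint8_t)((a & b) | (a & c) | (b & c));
    }
}
```

A scheme like this only helps if a given bit is damaged in at most one of the three copies; if firmware overwrites whole regions at boot, no amount of voting will bring the data back.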

The general consensus in response to James' query was that data written to RAM before a reset will not be available afterward, though exactly how much of the RAM is overwritten was debated. Kexec author Eric Biederman explained, "clearing the memory can be done at full memory bandwidth which can happen in seconds. On systems with ECC you need initialize all of the check bits so some kind of write to memory needs to happen." He then went on to note, "in practice a reset does not clear the memory and only a few bits tend to get flipped." Andi Kleen offered an alternative solution: "define a generic interface that allows drivers to register memory storage handlers. Add a entry into the oops die and panic notifiers that saves the kernel log into these backends." As an example, Andi suggested that video drivers could expose a small portion of video card RAM that could be used to preserve crash data across reboots.
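
Andi's suggestion of registrable storage handlers might look roughly like the following userspace C sketch. It is not an existing kernel interface: oops_backend, register_oops_backend, oops_save_log, and the fake_vram buffer standing in for video card RAM are all hypothetical names chosen for illustration.

```c
#include <stddef.h>
#include <string.h>

#define MAX_BACKENDS 4  /* hypothetical limit on registered backends */

/* A storage handler that a driver would register (hypothetical). */
struct oops_backend {
    const char *name;
    int (*store)(const char *log, size_t len);  /* returns 0 on success */
};

static struct oops_backend *backends[MAX_BACKENDS];
static size_t backend_count;

/* Register a backend; returns 0 on success, -1 if invalid or the table is full. */
static int register_oops_backend(struct oops_backend *b)
{
    if (b == NULL || b->store == NULL || backend_count >= MAX_BACKENDS)
        return -1;
    backends[backend_count++] = b;
    return 0;
}

/*
 * What an oops or panic notifier would call: hand the log to every
 * registered backend and return how many of them stored it.
 */
static size_t oops_save_log(const char *log, size_t len)
{
    size_t saved = 0;
    for (size_t i = 0; i < backend_count; i++) {
        if (backends[i]->store(log, len) == 0)
            saved++;
    }
    return saved;
}

/* Example backend: a plain buffer standing in for a slice of video RAM. */
static char fake_vram[256];

static int vram_store(const char *log, size_t len)
{
    /* If the log is too large, keep only its most recent tail. */
    if (len > sizeof(fake_vram)) {
        log += len - sizeof(fake_vram);
        len = sizeof(fake_vram);
    }
    memcpy(fake_vram, log, len);
    return 0;
}

static struct oops_backend vram_backend = { "fake-vram", vram_store };
```

In this sketch a driver would call register_oops_backend(&vram_backend) once at load time, and the crash path would then call oops_save_log() with the contents of the ring buffer. A real implementation would also need a way to read the data back after the reboot, which is left out here.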

