LinuxConf.Au: Kexec: Soft-Reboot and Crash-Dump Analysis for Linux and Xen

Posted in Conferences, Operating Systems on March 29, 2007



Kexec is a feature that was introduced in Linux kernel 2.6.13. Its primary purpose is to allow soft-reboots on machines where booting through the BIOS is either slow or unreliable. This base functionality has been extended to support crash-dump analysis, that is, a way of obtaining a core file of a crashed kernel. There is also a boot-loader, kboot, which relies on kexec.
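As a rough illustration of the two uses described above, the sketch below builds command lines for the userspace kexec tool from kexec-tools. It is not taken from the presentation. The helper names and the file paths are hypothetical, and the code only constructs the argument lists rather than running them. It assumes the commonly used options: `-l` to load a kernel for a soft-reboot, `-p` to load a capture kernel that runs on panic, `--initrd` and `--append` for the initial ramdisk and kernel command line, and `-e` to jump into the loaded kernel.

```python
from typing import List, Optional


def kexec_load_argv(kernel: str,
                    initrd: Optional[str] = None,
                    cmdline: Optional[str] = None,
                    crash: bool = False) -> List[str]:
    """Build the argument list that stages a kernel with the kexec tool.

    crash=False selects -l (load for a soft-reboot).
    crash=True selects -p (load a capture kernel to run after a panic).
    This helper is hypothetical; it only assembles arguments.
    """
    argv = ["kexec", "-p" if crash else "-l", kernel]
    if initrd is not None:
        argv.append("--initrd=" + initrd)
    if cmdline is not None:
        argv.append("--append=" + cmdline)
    return argv


def kexec_exec_argv() -> List[str]:
    """Build the argument list that boots the previously loaded kernel."""
    return ["kexec", "-e"]


if __name__ == "__main__":
    # The paths and the root device below are placeholders, not real defaults.
    soft = kexec_load_argv("/boot/vmlinuz", "/boot/initrd.img", "root=/dev/sda1")
    crash = kexec_load_argv("/boot/vmlinuz", "/boot/initrd.img",
                            "root=/dev/sda1", crash=True)
    for argv in (soft, kexec_exec_argv(), crash):
        print(" ".join(argv))
```

Actually running these commands requires root privileges. For the crash-dump case, the running kernel also has to be booted with memory reserved for the capture kernel (the `crashkernel=` boot parameter), and the capture kernel then exposes the crashed kernel's memory as `/proc/vmcore`.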

Kexec in its original form, as a way to perform soft-reboots, is arguably only useful on a fairly restricted set of hardware, most likely used by developers or in embedded environments. However, the evolution of its crash-dump analysis functionality, and further work on using it as the basis for a boot-loader, show that it is a technology with a much wider range of applications than first meets the eye.

Crash-dump analysis is a fairly hot topic for many people that I meet during the course of my work, and part of the intention of this presentation is to bring it to a wider audience. While many of the people currently interested in it want to analyze crashes on very large and hopefully very stable systems, I believe that it is also useful for analysis of crashes on smaller systems. At the OSDL Japan Linux Symposium held in Tokyo in June 2005, Andrew Morton commented that he thought it would also be a good tool for users to provide core files to kernel developers, potentially on problems in very green kernels. I think that this highlights the breadth of the audience for kexec.

The first part of this presentation will give an overview of how kexec works to perform soft-reboots. It will take a look at what crash-dump analysis is, why it is important, and how kexec can be used for this purpose. Lastly it will take a brief look at kboot.

The second part of the presentation will discuss the work that has been taking place to port kexec to the Xen hypervisor and Domain 0, focusing on why this is a good solution for crash-dump analysis of those components. It will also provide an overview of the relationship between hardware, the Xen hypervisor and Xen domains for those not familiar with these concepts.


Tags: Debugging, Conferences, OS, Linux, Lectures, LinuxConf.AU