
Ksplice provides updates without reboots

By Jake Edge
July 10, 2009

While Linux systems generally have a good reputation for uptime, a reboot is sometimes unavoidable. Typically, that is because of a kernel update, especially one that fixes a security hole. Companies that have long-running processes, or that require uninterrupted availability, are not particularly fond of this requirement. A new company, Ksplice, Inc., has come up with a way to avoid these reboots by hot-patching a running kernel.

The technique used by the company, unsurprisingly called Ksplice, is free software, which we looked at last November on the Kernel page. (An earlier look, from April 2008, may also be instructive.) The basic idea is that by comparing the original and patched kernels, one can build a kernel module that will patch the new code into the running kernel.

For simple code changes, the process is fairly straightforward. Both the original and the patched kernel are built with a special set of flags to simplify determining which functions have changed as a result of the patch. Those changes are packaged up into the module, and then applied when the module is loaded. Then there is the small matter of ensuring that the kernel is not currently executing any of the functions to be replaced. To do that, Ksplice halts the kernel and examines each thread. If no thread is running the affected code, or has a return address into that code on its stack, the patch is made and the kernel can go on its way. Otherwise, Ksplice delays for a short time and tries again, eventually giving up if it cannot satisfy that condition.
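
The check-and-retry logic described above can be sketched in a few lines of Python. This is only an illustration of the idea, not Ksplice's code: stacks are modeled as plain lists of addresses, and the names stack_is_safe, try_apply, snapshot, and apply_patch are all hypothetical.

```python
from typing import Callable, List, Tuple

AddrRange = Tuple[int, int]  # [start, end) of a function being replaced
Stack = List[int]            # current instruction address plus saved return addresses


def stack_is_safe(stack: Stack, patched: List[AddrRange]) -> bool:
    """True if no address on this stack falls inside a function to be replaced."""
    return not any(start <= addr < end
                   for addr in stack
                   for start, end in patched)


def try_apply(snapshot: Callable[[], List[Stack]],
              patched: List[AddrRange],
              apply_patch: Callable[[], None],
              max_attempts: int = 5,
              wait: Callable[[], None] = lambda: None) -> bool:
    """Apply the patch only when every thread is outside the patched code.

    `snapshot` stands in for halting the machine and reading every thread's
    stack; `wait` stands in for the short delay between attempts.
    """
    for _ in range(max_attempts):
        stacks = snapshot()
        if all(stack_is_safe(s, patched) for s in stacks):
            apply_patch()
            return True
        wait()
    return False  # give up: the condition was never satisfied


# First snapshot: one thread would return into 0x2000-0x2100, so the patch waits.
# Second snapshot: every thread is clear, so the patch is applied.
snapshots = iter([
    [[0x1000, 0x2010], [0x3000]],
    [[0x1000], [0x3000]],
])
applied: List[bool] = []
ok = try_apply(lambda: next(snapshots), [(0x2000, 0x2100)],
               lambda: applied.append(True))

# A thread that never leaves the patched function makes the attempt fail.
stuck = try_apply(lambda: [[0x2050]], [(0x2000, 0x2100)],
                  lambda: applied.append(True), max_attempts=3)
```

In this sketch the first call succeeds on its second attempt, while the second call gives up after three attempts and leaves the code untouched.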

There are several kinds of changes that are much more difficult to handle, particularly data structure changes. For those, someone needs to analyze the changes and write code to handle munging the data structures appropriately. Ksplice has an infrastructure that allows this data structure manipulation to be done while the kernel is halted, but the code itself is, or can be, non-trivial. To a great extent, it is the knowledge of how to do this with Ksplice that the company is offering as a service.
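
To give a feel for the kind of hand-written code involved, here is a small Python sketch of one possible approach: keeping a field that a patch adds in a side table, and filling that table in for every existing object while nothing else is running. It is not Ksplice's interface. The Conn structure, the shadow_limit table, and the migrate function are invented for illustration, and the side-table strategy itself is an assumption rather than something described in the article.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class Conn:
    """Hypothetical pre-patch structure; its layout must not change."""
    conn_id: int
    bytes_sent: int


# Side table holding the field the patch introduces, keyed by object identity.
shadow_limit: Dict[int, int] = {}


def migrate(live: List[Conn], default_limit: int) -> None:
    """Run once, while the system is halted, over every already-existing object."""
    for conn in live:
        shadow_limit.setdefault(conn.conn_id, default_limit)


def over_limit(conn: Conn) -> bool:
    """The patched function: reads the new field from the side table."""
    return conn.bytes_sent > shadow_limit[conn.conn_id]


live_conns = [Conn(1, 10), Conn(2, 500)]
migrate(live_conns, default_limit=100)
flags = [over_limit(c) for c in live_conns]
```

The point of the sketch is the ordering: the migration has to finish for all live objects before any patched function runs, which is why it is done while the kernel is stopped.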

As a test of the technology, the Ksplice developers looked at all of the security problems listed for the kernel over a three-year period (May 2005 to May 2008). Of the 64 Common Vulnerabilities and Exposures (CVE) entries for the kernel that had an impact worse than denial of service, Ksplice was able to patch 56 without any additional code being written. The other eight could be handled with a small amount of code—an average of 17 lines per patch.

As a further demonstration of the Ksplice technique, Ksplice, Inc. is currently offering a free-beer service for Ubuntu 9.04 (Jaunty Jackalope) users. Ksplice Uptrack will allow those users to update their kernels without rebooting them. The Ksplice folks will be tracking the Ubuntu kernel git tree, turning those changes (security and bug fixes) into modules that can be retrieved and applied with the Uptrack client. As described in the FAQ, Uptrack will support the latest release of Ubuntu: 9.04 for now, switching over to 9.10 (Karmic Koala) when that is released.

As noted, Ksplice itself is free software, available under the GPLv2, and the Uptrack client is as well. That leads to a service-oriented free software business model for Ksplice, Inc. While their exact plans are not yet clear, providing similar updates for enterprise kernels (RHEL and SUSE), but charging for those, would seem an obvious next step. Other areas for expansion include other operating systems as well as user-space applications. In an interview, Waseem Daher, co-founder and COO of Ksplice, described the company's goal:

The long term vision is that, at the end of the day, all updates will be hot updates — updates that don't require a reboot or an application restart. This is actually a big problem because if you look at technology used in data centers, no-one has a good solution for software updates, from as low level as your router or SAN, up to your virtualization solution, the operating system, the database, and the critical applications. Right now, all these updates require you either to reboot the system or restart the service.

This is a big pain point for sysadmins because, on the one hand you have to apply the updates so that you can fix important security problems, but on the other if you don't then you're vulnerable. When you do apply them, though, there's downtime and that's lost productivity. There's a real cost associated with the downtime. We want to take the technology that we've developed and use it to make life easier in the data center. That's the broad vision for where we're going with the company, and we're starting with Linux.

That's a rather ambitious vision, but one that seems in keeping with where things are headed. No matter how fast booting gets, it is still a major annoyance, for servers or desktops. Even restarting applications, particularly things like database servers or desktop environments, leads to lost time and productivity. Whether Ksplice, Inc. can expand their offerings to reach that goal is an open question.

One of the problems that Ksplice will face is competition. In the Linux world, that could come from distributors deciding to start making Ksplice modules themselves, and either charging their customers for them, or adding that capability to their subscription-based support offerings. In the proprietary, closed source world, Ksplice will have to work with the vendors of operating systems and applications so that it can access the source code. Those vendors are most certainly going to want a piece of the pie for that access.

There may also be technical hurdles. One botched kernel update that introduced a serious flaw, security or otherwise, could ruin the company's reputation. That, in turn, might make it much harder to convince new customers. Hot-patching is a subtle, difficult problem to solve completely.

On the other hand, Ksplice has an excellent pedigree: it was started by four MIT students and is based on co-founder Jeff Arnold's master's thesis. Ksplice also won MIT's $100,000 entrepreneurship competition, against some stiff competition, one would guess. Arnold's reasons for looking at the problem will resonate with system administrators everywhere: he delayed patching an MIT server to avoid downtime on a busy system, and an attacker took advantage of that window.

It will be interesting to watch both Ksplice and the general idea of hot-patching over the coming years. When Ksplice was first introduced, a Microsoft patent on the technique was noted on linux-kernel, along with protestations of rather old (PDP-11) prior art. How that plays into the fortunes of Ksplice and others who come along will be interesting—potentially sickening—as well.



Ksplice provides updates without reboots

Posted Jul 10, 2009 16:41 UTC (Fri) by jbarnold (guest, #51843) [Link]

As I mentioned on the LKML last year [1], the Microsoft "patent" that was pointed out is actually a rejected patent application, not a patent. It received a "final rejection" from the U.S. patent office in 2006.

You can browse the relevant documents at http://portal.uspto.gov/external/portal/pair (The application number is 10/307,902).

[1] http://lkml.org/lkml/2008/4/29/708

Ksplice provides updates without reboots

Posted Jul 13, 2009 1:34 UTC (Mon) by jhhaller (guest, #56103) [Link]

[I posted a similar comment on Steven Vaughan-Nichols' Computerworld ksplice article]

While the patent office has rejected Microsoft's patent, Microsoft is appealing it to court, so it's not over yet. Also, the reason for the rejection was a prior HP patent application. Presumably, HP could object to ksplice, although they are friendlier to Open Source, and their patent was related to dynamically patching code related to missing hardware instructions, not patching code. But, I think there are many other patching frameworks which are likely to be prior art. IEEE Software, March, 1993 had a survey of patching frameworks dating back to the 1970s.

In my opinion, the main innovation of ksplice isn't the dynamic patching, but the automated discovery of what and where to patch, as well as allowing changes to the number of function arguments. However, until shared libraries and long-lived processes are addressed, I think the business case for a service will be limited.

Ksplice provides updates without reboots

Posted Jul 10, 2009 17:14 UTC (Fri) by mjthayer (guest, #39183) [Link]

If their technique does catch on, I'm sure that the kernel (that is, its developers...) will also find ways to adapt to make runtime patching safer and more easy, similar to what paravirtualised kernels do for virtualisers.

Ksplice provides updates without reboots

Posted Jul 10, 2009 17:27 UTC (Fri) by jspaleta (subscriber, #50639) [Link]

Interesting....
If this works the Uptrack service seems like something that RHN and Landscape customers will be asking to see integrated as a service offering in those management services. Not that Landscape has any customers currently to worry about.

Want to make any bets on whether Ksplice will generate Ubuntu service revenue with Uptrack faster than Canonical can with Landscape?

-jef

Ksplice provides updates without reboots

Posted Jul 10, 2009 18:42 UTC (Fri) by spender (guest, #23067) [Link]

This would be nice if exploitable vulnerabilities were actually labeled as such. Back in the real world however...

If they do actually include all the updates the vendor would normally provide, it follows that they're also playing along with whatever embargoes the vendors have in place. So the shortened vulnerability window in those cases only applies to those who wait to install updates because they don't want to reboot their machine so often. Actually, I wonder how the embargo issue will play out, since much of the reason why distros combine many fixes into one update is purely because of the reboot requirement. If every distro moved to this technology, would embargoes be done away with?

Even though this seems like it would increase the risk of silently fixed vulnerabilities, in general it will improve security for those with the 'patch what they tell me to, and I'm safe' mentality. There are currently far too many people running incredibly outdated kernels simply because rebooting for the handful of vulnerabilities cropping up each week is far too disruptive.

-Brad

Ksplice provides updates without reboots

Posted Jul 10, 2009 19:19 UTC (Fri) by spender (guest, #23067) [Link]

On the other hand, imagining the scenario where all vendors are using this style of updates, it might have the effect of removing the incentive for developers to silently fix vulnerabilities in the kernel. Then if they could just hire some real security experts to properly identify and classify vulnerabilities, Linux would have a real improvement in kernel security.

-Brad

Ksplice provides updates without reboots

Posted Jul 11, 2009 10:13 UTC (Sat) by nix (subscriber, #2304) [Link]

Nice idea, but if they did hire real security experts, I suspect that said
experts would much rather spend time on interesting things such as better
security frameworks rather than the incredibly dull gruntwork of poring
through an ocean looking for sunken turds. And the ocean is always growing
far faster than any plausible population of security experts hireable by
one organization can possibly audit it.

It would be very dispiriting for the poor sods so hired: and the net
effect? Sure, security would go up --- but from the point of view of the
alien beings who work the money levers, they'd be paying money to get back
reports of bad security, upgrade hassle for their customers, and bad PR
whenever MS decides to do one of their fallacious 'count the CVE'
Windows-has-better-security PR pushes, but the number of vulnerabilities
probably wouldn't fall all that far, because new code is still arriving
far faster than it could be audited.

Worse yet: a huge amount of security-dangerous stuff isn't in the kernel
at all, but in higher parts of the stack which talk to the network. I'm
certain you couldn't hire enough security experts to audit Firefox and
everything underneath it, and as long as that remains problematic
attackers will still be able to run arbitrary code with the privileges of
a user. (And, TBH, that's all they really care about. They can keysniff
your browser and send their spam without grabbing root...)

But perhaps I'm being too cynical. At least the common core of the kernel
that everyone runs (mm, fs) could probably be kept somewhat more hole-free
than other parts, as it doesn't change all that fast. But I look at other
operating systems, run by people who *do* hire security experts, and I
look at their security records, largely as lamentable as ours, largely in
userspace, and I wonder if it would really help.

Not cynical enough

Posted Jul 14, 2009 20:19 UTC (Tue) by man_ls (guest, #15091) [Link]

No way! In the real world, spender will get hired as the kernel security expert; he will bravely go over every kernel fix and find the vulnerability lying within, reveal it with great fanfare and excruciating detail, and use the assigned CVE to properly label the upgrade. Inane DoS attacks will be largely a thing of the past. After all this work every kernel version will carry with it some 2870 "OH NOES PLEASE UPGRADE" warning labels, and with it a heightened sense of warm protection for every user. Not to speak about stable releases -- these will come with a few dozen "OH NOES PLEASE PLEASE UPGRADE" warnings.

In a few iterations security will improve so much that Linux will be suitable for end users and the Year of the Linux Desktop will finally arrive.

Ksplice provides updates without reboots

Posted Jul 11, 2009 2:05 UTC (Sat) by bbaetz (subscriber, #42501) [Link]

> Companies that have long-running processes, or those who require
> uninterrupted availability, are not particularly fond of this requirement.

Companies who require uninterrupted availability should have more than one machine (and measure SLAs by service, not by individual systems)

Companies that have long-running processes should be able to checkpoint them.

Of course, that's not what happens in the real world, and it would certainly make updates quicker for companies with hundreds of machines to do the updates, but I'd hate to see this being used as a reason to not have proper redundancy...

Ksplice provides updates without reboots

Posted Jul 11, 2009 13:26 UTC (Sat) by gdt (subscriber, #6284) [Link]

> Companies who require uninterrupted availability should have more than one machine

That's not likely for some applications. For example, telco customers only want to buy one set top box, not one and a hot spare.

Ksplice provides updates without reboots

Posted Jul 11, 2009 15:35 UTC (Sat) by foom (subscriber, #14868) [Link]

> That's not likely for some applications. For example, telco customers only want to buy one set top box, not one and a hot spare.

And telcos don't mind frequently rebooting your set-top-box at 3am. No uninterrupted availability guarantee there... :)

Ksplice provides updates without reboots

Posted Jul 12, 2009 8:50 UTC (Sun) by gdt (subscriber, #6284) [Link]

With respect Foom, some of us are serious about running telephony over fiber to the home. Any scheduled end-to-end outage lasting longer than five minutes is unacceptable due to the obvious safety of life issues. Since we want to field a capable STB, that means we need ways to apply security updates without an initial program load.

We would really like to keep the STB clock rate as low as possible, even if this extends reboot times, in order to get more than four hours full operation and eight hours telephony-only operation from battery. Again, there are obvious SoL reasons for as long a battery run-time as possible.

My concern with ksplice is that it applies only to the kernel. Whereas items like libc are as problematic as the kernel in needing a software restart after updates. A more general approach would be welcome.

Ksplice provides updates without reboots

Posted Jul 12, 2009 9:26 UTC (Sun) by tzafrir (subscriber, #11501) [Link]

With libraries the issue is known and mostly (if not fully) handled already.

Library updates need no such special handling: unless done improperly (by a direct write rather than a rename), the old copy remains available to running processes, whereas the on-disk version (the one used by new processes) is the updated one.

This leaves you with long-running processes to restart at your own free time. But that does not require a complete reboot.

Specifically the update procedure of libc on Debian conditionally restarts several daemons, and will always restart the init process. And you can always use "telinit u" to do the latter manually.
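
The rename-versus-direct-write distinction in this comment can be demonstrated with a short Python sketch. It assumes a POSIX filesystem, where a rename swaps the directory entry and leaves the old file's data reachable through handles that are already open. The file name libdemo.so and the helper replace_by_rename are made up for the example, and an open file handle stands in for a process that has the library mapped.

```python
import os
import tempfile
from typing import Tuple


def replace_by_rename(path: str, new_bytes: bytes) -> None:
    """Install new contents under `path` without overwriting the old file's data."""
    directory = os.path.dirname(os.path.abspath(path))
    fd, tmp = tempfile.mkstemp(dir=directory)  # same directory, so same filesystem
    try:
        with os.fdopen(fd, "wb") as f:
            f.write(new_bytes)
            f.flush()
            os.fsync(f.fileno())
        os.replace(tmp, path)  # atomic rename on POSIX
    except BaseException:
        if os.path.exists(tmp):
            os.unlink(tmp)
        raise


def demo() -> Tuple[bytes, bytes]:
    with tempfile.TemporaryDirectory() as d:
        lib = os.path.join(d, "libdemo.so")
        with open(lib, "wb") as f:
            f.write(b"old code")
        running = open(lib, "rb")  # a long-running process holding the old copy
        try:
            replace_by_rename(lib, b"new code")
            old_view = running.read()  # still the pre-update contents
        finally:
            running.close()
        with open(lib, "rb") as f:     # what a newly started process would see
            new_view = f.read()
        return old_view, new_view


old_view, new_view = demo()
```

Opening the file for writing in place would instead change the bytes underneath the running process, which is the improper case the comment warns about. A real package update would also need to preserve ownership and permissions, which this sketch ignores.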

Ksplice provides updates without reboots

Posted Jul 13, 2009 9:05 UTC (Mon) by dgm (subscriber, #49227) [Link]

There should be a better way to fix long-running processes than restarting them, more in the line of what Ksplice does. As long as the interface has not changed (function parameters and struct layout if we speak of plain C) it should be doable.

Ksplice provides updates without reboots

Posted Jul 14, 2009 6:55 UTC (Tue) by NAR (subscriber, #1313) [Link]

> While their exact plans are not yet clear, providing similar updates for enterprise kernels (RHEL and SUSE), but charging for those, would seem an obvious next step.

But would RedHat or Novell still support these hotpatched kernels? I think they should try to get money from RedHat and Novell directly...

Anyway, I think this whole ksplice thing is just a hack. A really clever hack that has the same effect on heterosexual male hackers as a half-nude Pamela Anderson picture - but still just a hack. I guess anyone really interested in high availability has to use some kind of hardware duplication in order to avoid hardware failures. And if spare-wheel hardware is available, then there are cleaner ways to restart the service without a noticeable outage.

This ksplice thing is just a hack to make a kind of technology (monolithic application written in C) to be used in ways that it wasn't designed to be used. It can be used when there are no resources for a proper solution. However, there are technologies that are designed from the starting point to enable upgrades, etc. without downtime, for example Erlang with its OTP standard library.

Ksplice provides updates without reboots

Posted Jul 14, 2009 9:38 UTC (Tue) by jcm (subscriber, #18262) [Link]

Meanwhile, in the real world, most people don't have the kind of money required to do 100% hardware duplication and even then that might not be easily possible for certain applications. Ksplice is only a hack insomuch as anything is a hack until it's proven itself - Linux was a hack once.

Ksplice provides updates without reboots

Posted Jul 14, 2009 18:47 UTC (Tue) by dlang (guest, #313) [Link]

if businesses need continuous availability they are going to have redundant hardware.

it may be 100% duplicate (1-1 replication), or it may be a n+x style cluster of machines, but in either case they will have the redundant hardware so that they can survive hardware failing.

there are some companies that try to save a little money and make the backup boxes less powerful than the primary boxes, but they almost never do this after they have actually _used_ their backup boxes (people really don't care that you had a system failure and are running on a backup, they just want their stuff to work, and work fast)

Ksplice provides updates without reboots

Posted Jul 14, 2009 19:33 UTC (Tue) by NAR (subscriber, #1313) [Link]

Telephony exchange systems based on Erlang are used on all 5 continents (as far as I know) - that's real enough to me. On the other hand, if the target audience of KSplice Inc. can't afford spare hardware, will they spend money on the services of KSplice Inc?

Also there's the "false sense of security" expression thrown around when a not perfect solution is given to a problem - ksplice could give "false sense of reliability".


Copyright © 2009, Eklektix, Inc.
This article may be redistributed under the terms of the Creative Commons CC BY-SA 4.0 license
Comments and public postings are copyrighted by their creators.
Linux is a registered trademark of Linus Torvalds