Biz & IT —

Unsafe at any clock speed: Linux kernel security needs a rethink

Ars reports from the Linux Security Summit—and finds much work that needs to be done.

The Linux kernel today faces an unprecedented safety crisis. Much like when Ralph Nader famously told the American public that their cars were "unsafe at any speed" back in 1965, numerous security developers told the 2016 Linux Security Summit in Toronto that the operating system needs a total rethink to keep it fit for purpose.

No longer the niche concern of years past, Linux today underpins the server farms that run the cloud, more than a billion Android phones, and not to mention the coming tsunami of grossly insecure devices that will be hitched to the Internet of Things. Today's world runs on Linux, and the security of its kernel is a single point of failure that will affect the safety and well-being of almost every human being on the planet in one way or another.

"Cars were designed to run but not to fail," Kees Cook, head of the Linux Kernel Self Protection Project, and a Google employee working on the future of IoT security, said at the summit. "Very comfortable while you're going down the road, but as soon as you crashed, everybody died."

A crash test between a 1959 Chevrolet Bel Air and 2009 Chevrolet Malibu. The Linux kernel is more like the car on the right.

"That's not acceptable anymore," he added, "and in a similar fashion the Linux kernel needs to deal with attacks in a manner where it actually is expecting them and actually handles gracefully in some fashion the fact that it's being attacked."

Jeffrey Vander Stoep, a software engineer on the Android security team at Google, echoed Cook’s message: "This kind of hearkens back to last year's keynote speech when [Konstantin “Kai” Ryabitsev] compared computer safety with the car industry years ago. We need more and we need better safety features, and with it in mind this may cause inconvenience for developers, we still need them."

For his part, Kai, a senior systems administrator at the Linux Foundation, who was unable to attend this year’s summit, is pleased that this car safety analogy is finding traction.

“We approach security today as though we are still living in the world of the 1990s and 2000s, computers in a data centre managed by knowledgeable people,” he told Ars. But, he pointed out, most computers today—laptops, smartphones, IoT devices—are not managed and secured by IT professionals.

“For the cases where computers are not well protected in the hands of end-users who are not IT professionals, and who do not have any recourse to IT professional help, we need to design systems that proactively protect them,” he said. “We have to change the way we approach this dramatically, the same way the vehicle manufacturers in the 1970s did.”

This is, however, easier said than done.

Killing bug classes, not political dissidents

The clear consensus at the Linux Security Summit was that squashing bugs is a losing strategy. Many deployed devices running Linux will never receive security updates, and patching a security hole in the upstream kernel does nothing to ensure the safety of an IoT device that could be in use for a decade and may forever be ignored by the manufacturer.

Even devices that do receive patches may see long gaps between public bug discovery and a patch being applied. Cook gave the example of an Internet-connected door lock that an end-user might well use for 15 years or more. Such devices are likely to receive sporadic security patches, if at all.

Worse, the average lifetime of a critical security bug in the Linux kernel, from introduction during a code commit to public discovery and having a patch issued, averages three years or more. According to Cook’s analysis, critical and high-severity security bugs in the upstream kernel have lifespans from 3.3 to 6.4 years between commit and discovery.

Red = critical severity bugs; orange = high; blue = medium; and black = low. The X axis is total number of security bugs; the Y axis shows the kernel version. So, the height of the bar shows how long that bug was present.
Red = critical severity bugs; orange = high; blue = medium; and black = low. The X axis is total number of security bugs; the Y axis shows the kernel version. So, the height of the bar shows how long that bug was present.
Kees Cook

"The question I get a lot is 'well isn't this just theoretical?'" he said. "No-one's actually finding these bugs to begin with, so there's no window of opportunity. And that's demonstrably false.”

Nation-state attackers are watching every commit, looking for an opening, he said, and "people are finding these bugs sometimes immediately when they're introduced."

He went on: "This seems to be a big thing that people for some reason just can't accept mentally. You know, like 'well I have no open bugs in my bug tracker, everything's fine.'"

How, then, can the kernel proactively defend itself against bugs that have not yet been reported—or even implemented?

The answer, said Cook, could be a matter of life and death for some people: "If you're a dissident, an activist somewhere in the world, and you're getting spied on, your life is literally at risk because of these bugs. As we move forward, we need devices that can protect themselves."

A closer look at the lifespan of critical- and high-severity security bugs in the upstream Linux kernel. X axis is the number of security bugs; Y axis shows the kernel versions in which each security bug was present.
A closer look at the lifespan of critical- and high-severity security bugs in the upstream Linux kernel. X axis is the number of security bugs; Y axis shows the kernel versions in which each security bug was present.
Kees Cook

Protecting a world in which critical infrastructure runs Linux—not to mention protecting journalists and political dissidents—begins with protecting the kernel. The way to do that is to focus on squashing entire classes of bugs, so that a single undiscovered bug would not be exploitable, even on a future device running an ancient kernel.

Further, since successful attacks today often require chaining multiple exploits together, finding ways to break the exploit chain is a critical goal.

Channel Ars Technica