Proper sleep —

20-year-old Linux workaround is still slowing down AMD systems

A little fix for CPUs that didn't properly sleep had decades-long consequences.

A second-generation Epyc server chip from AMD, one that may have been running 2002-era Linux code slowing it down.
Getty Images

AMD has come a long way since 2002, but the Linux kernel still treats modern Threadrippers like Athlon-era systems—at least in one potentially lag-inducing respect.

AMD engineer Prateek Nayak recently submitted a patch to Linux's processor idle drivers that would "skip dummy wait for processors based on the Zen microarchitecture." When ACPI support was added to the Linux kernel in 2002 (written by Andy Grover, committed by Linus Torvalds), it included a "dummy wait op." The system essentially read data for no purpose other than delaying the next instruction until the chipset had asserted the STPCLK# signal and the CPU could fully stop. This allowed for some power-saving and compatibility during the early days of ACPI implementation, when some chipsets wouldn't move to an idle state as quickly as one would expect.

But today's Zen-based AMD chips don't need this workaround, and, as Nayak writes, it's hurting them, at least in specific workloads on Linux. Analysis with instruction-based sampling (IBS) shows that "a significant amount of time is spent in the dummy op, which incorrectly gets accounted as C-State residency." With the idle time overstated by all this low-effort dummy work, the CPU can be pushed into a deeper, slower C-state, which then makes it take longer to "wake up," especially on jobs that require lots of switching between busy and idle states.
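
To illustrate the accounting problem Nayak describes, here is a toy model in Python. It is not kernel code. The target residencies, the microsecond figures, and the function names are all assumptions chosen for illustration, and the "governor" is a deliberately simplistic stand-in that picks the deepest state whose target residency fits the last accounted idle period.

```python
# Toy model, not kernel code. All numbers and names below are hypothetical.

# (state name, minimum idle time in microseconds that justifies entering it)
IDLE_STATES = [("C1", 2), ("C2", 800)]


def accounted_residency_us(true_idle_us, dummy_wait_us, skip_dummy_wait):
    """Idle time as the accounting sees it in this model."""
    if skip_dummy_wait:
        return true_idle_us
    # The dummy read runs inside the measured window, so it inflates the total.
    return true_idle_us + dummy_wait_us


def pick_state(predicted_idle_us):
    """Pick the deepest state whose target residency fits the prediction."""
    chosen = IDLE_STATES[0][0]
    for name, target_us in IDLE_STATES:
        if predicted_idle_us >= target_us:
            chosen = name
    return chosen


true_idle_us, dummy_wait_us = 300, 600  # assumed values
with_dummy = accounted_residency_us(true_idle_us, dummy_wait_us, skip_dummy_wait=False)
without_dummy = accounted_residency_us(true_idle_us, dummy_wait_us, skip_dummy_wait=True)
print(with_dummy, pick_state(with_dummy))        # 900 C2
print(without_dummy, pick_state(without_dummy))  # 300 C1
```

In this sketch, the same 300 microseconds of real idle time leads to a deeper state only because the dummy wait was counted as sleep.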

Nayak ran tbench tests on a dual-socket Zen 3 system against the baseline Linux kernel, a kernel with the C2 state entirely disabled, and a kernel with the dummy wait operation patched out. His patched version saw a 1,390 percent increase in minimum MB/s throughput and a 51 percent increase in mean MB/s over the baseline kernel, often landing just a little behind the kernel with C2 disabled entirely.
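
For a sense of scale, percentage increases convert to multipliers as sketched below. The function names are illustrative, and the 100 MB/s baseline is a made-up figure, not one of Nayak's measurements.

```python
def multiplier_from_percent_increase(percent):
    """An increase of P percent means the new value is (1 + P/100) times the old."""
    return 1 + percent / 100


def percent_increase(baseline, patched):
    """Percent increase of `patched` over `baseline`."""
    return (patched - baseline) / baseline * 100


print(f"{multiplier_from_percent_increase(1390):.1f}x")  # 14.9x
print(f"{multiplier_from_percent_increase(51):.2f}x")    # 1.51x

# Hypothetical baseline of 100 MB/s: a 1,390 percent increase lands near 1,490 MB/s.
print(f"{percent_increase(100, 1490):.0f} percent")      # 1390 percent
```

In other words, the reported minimum throughput is roughly 15 times the baseline, while the mean is about one and a half times the baseline.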

Intel systems have avoided AMD's legacy curse, as they have used an MWAIT-based system for at least a decade, per the Phoronix blog. Nayak's findings led to an urgent patch submitted by Dave Hansen of Intel. His solution was to limit "dummy wait" to Intel systems, where it would not affect "remotely modern Intel systems," and to add comments to the kernel's idle drivers that spell out what's happening and encourage those reading to "consider moving your system to a more modern idle mechanism."
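
The shape of that vendor check can be sketched in a few lines of Python. This is a simplified model rather than the actual kernel change: the port labels and function names are hypothetical, and only the CPUID vendor strings ("GenuineIntel", "AuthenticAMD") are real identifiers.

```python
# Simplified model of limiting the dummy wait to one CPU vendor.
# Port labels and function names are hypothetical, not kernel identifiers.

INTEL_VENDOR = "GenuineIntel"
AMD_VENDOR = "AuthenticAMD"


def enter_io_port_idle(vendor, read_port):
    """Model an I/O-port-based idle entry, with the dummy wait gated by vendor."""
    read_port("cstate_entry_port")    # the read that requests the idle state
    if vendor == INTEL_VENDOR:
        read_port("dummy_wait_port")  # the useless read kept for old chipsets


def reads_for(vendor):
    """Return the list of port reads performed for a given vendor."""
    reads = []
    enter_io_port_idle(vendor, reads.append)
    return reads


print(reads_for(INTEL_VENDOR))  # ['cstate_entry_port', 'dummy_wait_port']
print(reads_for(AMD_VENDOR))    # ['cstate_entry_port']
```

Under this model, AMD systems skip the extra read entirely, while Intel systems that still rely on I/O-port idle keep the old behavior.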

If an urgent patch removing or limiting "dummy wait" is submitted this week, it could make it into the Linux 6.0 kernel, which Torvalds expects to ship next week.
