
The Linux graphics stack from X to Wayland

Ars looks at the evolution of the Linux graphics stack, from the origins of …

In the early 1980s, MIT computer scientist Bob Scheifler set about laying down the principles for a new windowing system. He decided to call it X because it was an improvement on the W graphical system, which naturally resided on the V operating system. Little did Bob know at the time, but the X Window System that he and his fellow researchers would eventually create would go on to cause a revolution. It became the standard graphical interface of virtually all UNIX-based operating systems because it provided features and concepts far superior to its competition. It took only a few short years for the UNIX community to embrace the X windowing system en masse.

In this article, we'll take a look at the development of the Linux graphics stack, from the initial X client/server system to the modern Wayland effort.

What made X so special, of course, is legendary. X was the first graphical interface to embrace a networked, distributed design. An X server running on one of the time-sharing machines of the day could generate the display for windows belonging to any number of local clients. X also defined a network display protocol so that windows from one machine could be displayed on another, remote machine. In fact, X was always intended to be used in this networked fashion, and the protocol was completely hardware-independent. X clients running on one type of UNIX could send their displays over the wire to a completely different UNIX hardware platform.
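That network transparency reaches all the way down to the client libraries. The short Xlib sketch below is not from the article or the X distribution, and the host name in it is hypothetical; it simply shows that an X client names the display it wants to talk to, and the same code and wire protocol work whether that display is local or on another machine.

```c
/* A minimal Xlib sketch of X's network transparency. "remotehost:0" is a
 * hypothetical display name; ":0" would name the local display instead.
 * Build with: cc demo.c -lX11
 */
#include <X11/Xlib.h>
#include <stdio.h>

int main(void)
{
    Display *dpy = XOpenDisplay("remotehost:0");
    if (!dpy) {
        fprintf(stderr, "cannot connect to X server\n");
        return 1;
    }

    /* Create and show a plain 200x100 window on whichever machine runs the server. */
    Window win = XCreateSimpleWindow(dpy, DefaultRootWindow(dpy),
                                     10, 10, 200, 100, 1,
                                     BlackPixel(dpy, DefaultScreen(dpy)),
                                     WhitePixel(dpy, DefaultScreen(dpy)));
    XMapWindow(dpy, win);
    XFlush(dpy);

    XCloseDisplay(dpy);
    return 0;
}
```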

X also abstracted the look-and-feel away from the server itself. The X protocol defined how pointing devices and window primitives behaved, but left the appearance of the interface up to widget toolkits, window managers, and desktop environments.

As X development proceeded, led by Bob Scheifler under the stewardship of MIT, more vendors became interested. Industry leaders of the time, like DEC, obtained a free license to the source code to make further improvements. Then a curious thing happened. A group of vendors asked MIT if some sort of arrangement could be made to preserve the integrity of the source; they wanted to keep X universally useful to all interested parties. MIT agreed, the MIT X Consortium was soon formed, and the full source tree was released, including the DEC improvements. This release of the X source really was an extraordinary event. The vendor community realized that X had become a valuable commodity, and that it was in the best interests of all to keep any one company from gaining control of it. Perhaps the opening of the X source code is the single most important event to come out of the X story; stewardship of the source has since passed from the MIT X Consortium through a series of successor organizations to today's X.Org Foundation.

One of the senior developers recruited by the Consortium was Keith Packard, who was commissioned to re-implement the core X server in 1988. As we'll see, Packard figured prominently in the development of the Linux graphics stack.

Although X has ruled the UNIX and Linux graphics stacks, the feature-laden and ubiquitous software eventually became a victim of its own success. As Linux took flight throughout the '90s, X began to find use in configurations where a standalone X server and client both resided on one desktop computer; it came bundled this way with pretty much all of the Linux distributions. The network transparency of X is of no use in a single-desktop installation, and this once-vaunted feature added overhead to video drawing.

As PC sales ballooned during this period, the sophistication of dedicated graphics hardware began to creep ahead of the capabilities of X. The development of new and improved graphics hardware was, and continues to be, very aggressive.

The arrival of Translation Table Maps

Around 2004, some Linux developers had become increasingly frustrated with the slow pace of X development. They had at their disposal OpenGL, a 2D and 3D rendering API released in 1992 and derived from work at the now-defunct Silicon Graphics. But after years of attempting to get X to talk 3D to the graphics device, not a single OpenGL call could be made through the X layer.

Then, in 2007, a bright light. Thomas Hellstrom, Eric Anholt, and Dave Airlie had developed a memory management module they called translation table maps (TTM). TTM was designed to move the memory buffers destined for graphics devices back and forth between graphics device memory and system memory. It was greeted with wild applause by the Linux community, because it provided hope that somebody—anybody—was working on the problem of providing an API to properly manage graphical applications' 3D needs. The strategy was to make the memory buffer a first-class object, and to allow applications to allocate and manipulate memory buffers of graphical content. TTM would manage the buffers for all applications on the host and provide synchronization between the GPU and the CPU. This would be accomplished with the use of a "fence." The fence was simply a signal that the GPU had finished operating on a buffer, so that control of the buffer could be handed back to the owning application.
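The fence contract is easy to picture in code. The fragment below is purely illustrative; the names and types are hypothetical and are not the real in-kernel TTM interface. It only sketches the rule described above: the application owns a buffer, but must not touch it again until the driver signals that the GPU has finished with it.

```c
/* Hypothetical sketch of a buffer object plus a fence; not the real TTM API. */
#include <poll.h>
#include <stddef.h>

struct buffer_object {
    void   *cpu_addr;   /* the buffer as the application sees it           */
    size_t  size;
    int     fence_fd;   /* becomes readable once the GPU finishes its work */
};

/* Block until the fence signals, i.e. until the GPU hands the buffer back. */
static int wait_for_gpu(const struct buffer_object *bo)
{
    struct pollfd pfd = { .fd = bo->fence_fd, .events = POLLIN };
    return poll(&pfd, 1, -1);
}
```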

To be fair, TTM was an ambitious attempt to standardize how applications access the GPU; it was meant to be an overall memory manager for all video drivers in the Linux space. In short, TTM tried to provide every operation that any graphics program might possibly need. The unfortunate side effect was a very large amount of code: the TTM API was huge, whereas each individual open source driver needed only a small subset of its calls. A large API means confusion for developers who have to make choices. The loudest complaint was that TTM had performance problems, perhaps related to the fencing mechanism and to inefficient copying of buffer objects. TTM could be many things to many applications, but it couldn't afford to be slow.

Reenter Keith Packard. In 2008, he announced that work was proceeding on an alternative to TTM. By now Keith was working for Intel, and together with Eric Anholt he applied the lessons learned from TTM and wrote a replacement. The new API was to be called GEM (Graphics Execution Manager). Most developers reading this piece can probably guess what happened next, because experienced developers know that the only thing better than getting the chance to solve a big problem by writing a significant chunk of code is doing it twice.

GEM had many improvements over TTM, one of the more significant being a much tighter API; the troublesome fence concept was removed. Keith and Eric put the onus on applications to lock memory buffers outside of the API. That freed GEM to concentrate on managing the memory under the GPU's control and on managing the video device's execution context. The goal was to shift the focus to managing ioctl() calls within the kernel instead of managing memory by moving buffers around. The net effect was that GEM became more of a streaming API than a memory manager.
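What that ioctl-driven style looks like from user space can be sketched in a few lines. The example below assumes an Intel GPU, the i915 GEM ioctls from the kernel's drm/i915_drm.h header, and a device node at /dev/dri/card0; header locations and device paths vary by system, so treat it as a sketch rather than a recipe.

```c
/* A minimal sketch of allocating and releasing a GEM buffer object via ioctl(),
 * assuming the i915 driver and a DRM node at /dev/dri/card0. */
#include <fcntl.h>
#include <stdio.h>
#include <string.h>
#include <sys/ioctl.h>
#include <unistd.h>
#include <drm/drm.h>
#include <drm/i915_drm.h>

int main(void)
{
    int fd = open("/dev/dri/card0", O_RDWR);
    if (fd < 0) { perror("open"); return 1; }

    /* Ask the kernel for a 4 KiB buffer object; the application receives an
     * opaque handle instead of shuffling the memory around itself. */
    struct drm_i915_gem_create create;
    memset(&create, 0, sizeof(create));
    create.size = 4096;
    if (ioctl(fd, DRM_IOCTL_I915_GEM_CREATE, &create) == 0) {
        printf("GEM handle: %u\n", create.handle);

        /* Drop the handle; the kernel frees the storage once nothing uses it. */
        struct drm_gem_close close_req = { .handle = create.handle };
        ioctl(fd, DRM_IOCTL_GEM_CLOSE, &close_req);
    }

    close(fd);
    return 0;
}
```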

GEM allowed applications to share memory buffers, so that the entire contents of the GPU memory space did not have to be reloaded. This is from the original release notes:

"Gem provides simple mechanisms to manage graphics data and control execution flow within the linux [sic] operating system. Using many existing kernel subsystems, it does this with a modest amount of code."

The introduction of GEM in May of 2008 was a promising step forward for the Linux graphics stack. GEM did not try to be all things to all applications. For example, it left the generation of GPU commands to the device-specific driver. Because Keith and Eric were working at Intel, it was only natural for them to write GEM against the open-source Intel driver. The hope was that GEM could be improved to the point where it supported other drivers as well, effectively covering the three biggest manufacturers of GPUs.

However, adoption of GEM by non-Intel device drivers was slow. There is some evidence to suggest that the AMD driver adopted a "GEMified TTM manager," signifying a reluctance to move the code directly into the GEM space. GEM was in danger of becoming a one-horse race.

Both TTM and GEM tried to solve the 3D acceleration problem in the Linux graphics stack by integrating directly with X to reach the device and perform GPU operations. Both attempted to bring order to the crowd of libraries like OpenGL (which depends on X), Qt (depends on X), and GTK+ (also X). The problem was that X stood between all of these libraries and the kernel, and the kernel is the way to the device driver and ultimately to the GPU.

X is the oldest lady at the dance, and she insists on dancing with everyone. X has millions of lines of source, but most of it was written long ago, when there were no GPUs and no specialized transistors to do programmable shading or the rotation and translation of vertices. The hardware had no notion of oversampling and interpolation to reduce aliasing, nor was it capable of producing extremely precise color spaces. The time has come for the old lady to take a chair.
