Xen: finishing the job
Once upon a time, Xen was the hot virtualization story. The Xen developers had a working solution for Linux - using free software - well ahead of anybody else, and Xen looked like the future of virtualization on Linux. Much venture capital chased after that story, and distributors raced to be the first to offer Xen-based virtualization. But, along the way, Xen seemed to get lost. The XenSource developers often showed little interest in getting their code into the mainline, and attempts by others to get that job done ran into no end of obstacles. So Xen stayed out of the mainline for years; the first public Xen release happened in 2003, but the core Xen code was only merged for 2.6.23 in October, 2007.
In the meantime, KVM showed up and grabbed much of the attention. Its path into the mainline was almost blindingly fast, and many kernel developers were less than shy about expressing their preference for the KVM approach. More recently, Red Hat has made things more formal with its announcement of a "virtualization agenda" based on KVM. Meanwhile, lguest showed up as an easy introduction for those who want to play with virtualization code.
The Xen story is a classic example of the reasons behind the "upstream first" policy, which states that code should be merged into the mainline before being shipped to customers. Distributors rushed to ship Xen, then found themselves supporting out-of-tree code which, often, was not well supported by its creators. In particular, published releases of Xen often only supported relatively old kernels, creating lots of work for distributors wanting to ship something more current. Now at least some of those distributors are moving on to other solutions, and high-level kernel developers are questioning whether, at this point, it's worth merging the remaining Xen code at all.
All told, Xen looks to be on its last legs. Or, perhaps, the rumors of Xen's demise have been slightly exaggerated.
The code in the mainline implements the Xen "DomU" concept - an unprivileged domain with no access to the hardware. A full Xen implementation requires more than that, though; there is the user-space hypervisor (which is GPL-licensed) and the kernel-based "Dom0" code. Dom0 is the first domain started by the hypervisor; it is typically run with more privileges than any other Xen guest. The purpose of Dom0 is to carefully hand out privileges to other Xen domains, providing access to hardware, network interfaces, etc. as set by administrative policy. Actual implementations of Xen must include the Dom0 code - currently a large body of out-of-tree kernel code.
Jeremy Fitzhardinge would like to change that situation. So he has posted a core Xen Dom0 patch set with the goal of getting it merged into the 2.6.30 release. Among the review comments was this question from Andrew Morton:
In three years time, will we regret having merged this?
The questions asked by Andrew were, essentially, (1) what code (beyond the current posting) is required to finish the job, and (2) is there really any reason to do that? The answer to the first question was "another 2-3 similarly sized series to get everything so that you can boot dom0 out of the box". Then there are various other bits which may never make it into the mainline. But, says Jeremy, getting the core into the mainline would shrink the out-of-tree patches carried by distributors and generally make life easier for everybody. As for the second question, Jeremy's response makes the case that Xen remains worth supporting in its own right.
Beyond that, Jeremy is arguing that Xen still has a reason to exist. Its design differs significantly from that of KVM in a number of ways; see this message for an excellent description of those differences. As a result, Xen is useful in different situations.
Some of the advantages claimed by Jeremy include:
- Xen's approach to page tables eliminates the need for shadow page
tables or page table nesting in the guests; that, in turn, allows for
significantly better performance for many workloads.
- The Xen hypervisor is lightweight, and can be run standalone; the KVM
hypervisor is, instead, the Linux kernel. It seems that some vendors
(HP and Dell are named) are shipping a Xen hypervisor in the firmware
of many of their systems; that's the code behind the "instant on"
feature, among other things.
- Xen's paravirtualization support allows it to work with hardware which
does not support full virtualization. KVM, instead, needs hardware
support.
- The separation between the hypervisor, Dom0, and DomU makes security validation easier. The separation between domains also allows for wild configurations with each device being driven by a separate domain; one might think of this kind of thing as a sort of heavyweight microkernel architecture.
KVM's advantages, instead, take the form of relative simplicity, ease of use, full access to contemporary kernel features, etc. By Jeremy's reasoning, there is a place for both systems in Linux.
The relative silence at the end of the discussion suggests that Jeremy has
made his case fairly well. Mistakes may have been made in Xen's history,
but it is a project which remains alive, and which has clear reasons to
exist. Your editor predicts that the Dom0 code will find little opposition
at the opening of the 2.6.30 merge window.
Xen: finishing the job
Posted Mar 4, 2009 15:47 UTC (Wed) by sayler (guest, #3164) [Link]
Perhaps I'm not the target audience for Xen -- having used it for a number of research projects -- but it is a royal pain to have to deal with back- or forward-ported Xen Dom0's.
Xen: finishing the job
Posted Mar 4, 2009 16:41 UTC (Wed) by martinfick (subscriber, #4455) [Link]
Xen: finishing the job
Posted Mar 4, 2009 20:05 UTC (Wed) by jmorris42 (guest, #2203) [Link]
> I would run Xen if it were in the mainline..
KVM is basically QEMU with a kernel module to speed it up. KQEMU is a kernel module to speed up QEMU that doesn't depend on hardware virtualization. So is Xen on old hardware enough faster than QEMU+KQEMU to justify keeping around yet another virtualization platform? That is the billion dollar question Xen is hoping they can answer yes to. Because if they can't Citrix is going to feel really dumb after throwing big sacks 'o cash to own Xen.
Xen: finishing the job
Posted Mar 4, 2009 20:10 UTC (Wed) by martinfick (subscriber, #4455) [Link]
Xen: finishing the job
Posted Mar 4, 2009 21:33 UTC (Wed) by drag (guest, #31333) [Link]
Using Qemu, there are several different ways to set up networking. You can use the default 'userspace TCP stack', which provides easy TCP networking (though not the entire TCP/IP feature set). Or you can set up a virtual ethernet switch, connect your virtual ethernet ports to it, and then use iptables to create a NAT firewall that gives that virtual network a gateway to the outside. Or you can combine the virtual ethernet ports with the physical external port and use a bridge to connect them.
Of course, as you can imagine, the default is rather limited. On my Fedora laptop, virt-manager sets up a virtual ethernet switch and then connects that to the external world using a NAT firewall. That works with NetworkManager and dnsmasq, so my virtual machines have access to the network regardless of how my laptop is connected, and can adapt to changing network topologies.
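As an illustration only, the bridge-plus-NAT arrangement described above amounts to something like the following (the interface names virbr0 and eth0 and the 192.168.122.0/24 subnet are invented for the example; virt-manager automates all of this, and the commands need root):

```shell
# Create a bridge to act as the virtual ethernet switch for the guests,
# give it an address, and bring it up.
brctl addbr virbr0
ip addr add 192.168.122.1/24 dev virbr0
ip link set virbr0 up

# Let the host forward packets, and NAT the guests' traffic out the
# physical interface so they have a gateway to the outside world.
echo 1 > /proc/sys/net/ipv4/ip_forward
iptables -t nat -A POSTROUTING -s 192.168.122.0/24 -o eth0 -j MASQUERADE
```

Guests whose tap interfaces are attached to the bridge then reach the outside network through the host's NAT.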
By default Qemu (and modified versions) uses an emulated 100Mbit ethernet card. The fastest emulated ethernet card you can use would be an Intel gigabit adapter.
However, if you want very good performance, you need to use PV network drivers. I saw a 300% performance improvement, more consistent behavior, and reduced CPU load from using those over the emulated NIC devices.
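For comparison, with a qemu of that era the choice between an emulated NIC and a paravirtual one is just a model flag on the command line (a sketch; guest.img and tap0 are example names):

```shell
# Fully emulated Intel gigabit NIC: qemu decodes every register access
# the guest driver makes, then re-encodes the I/O for the host.
qemu-system-x86_64 -m 512 -hda guest.img \
    -net nic,model=e1000 -net tap,ifname=tap0

# Paravirtual virtio NIC: the guest's virtio-net driver hands packets
# to the host directly, skipping the hardware-emulation round trip.
qemu-system-x86_64 -m 512 -hda guest.img \
    -net nic,model=virtio -net tap,ifname=tap0
```

The second form requires a virtio-net driver inside the guest.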
But I guess that PV drivers are only available to people using KVM and not Kqemu/Qemu?
-----------------------------
Now I don't know exactly what Xen uses for networking. But I know that its performance is similar when using full virtualization. I don't know about its paravirtualization mode.
Xen: finishing the job
Posted Mar 4, 2009 22:17 UTC (Wed) by aliguori (subscriber, #30636) [Link]
> But I guess that PV drivers are only available to people using KVM and not Kqemu/Qemu?

PV drivers (a la VirtIO) are now available in upstream QEMU
Xen: finishing the job
Posted Mar 5, 2009 2:55 UTC (Thu) by bluefoxicy (guest, #25366) [Link]
> through a emulated network interface.
The problem is, with KQemu/VMware/Qemu/KVM, you're running through an emulated network interface; whereas with Xen, you are not.
With KVM or Qemu, non-paravirtualized, the network hardware is emulated. The hard disk is emulated too. You make some system calls to write raw Ethernet frames or spin up TCP/IP connections; the kernel plays with some MMIO registers or does PIO IN/OUT instructions, and there's a piece of code (a reverse driver, pretty much) that tracks the state of the "hardware" and determines what exactly you're trying to do. Then it relays your intent to the host OS, which then encodes all of this into... games with MMIO or PIO, through a hardware driver, into real hardware.
With Xen paravirtualization, the hardware isn't emulated. You make some system calls to emit a raw Ethernet frame or open a TCP/IP connection. The kernel calls a Xen function and says, "On this device, emit this to the network." Xen passes this to a hook in the Dom0 OS, which then looks at the virtual device in a map to find the physical device and does all the hardware magic of MMIO/PIO games to actually send it out to the network.
In other words, the kernel and the hypervisor do a hell of a lot less work when you're paravirtualized. Hardware drivers for virtual devices are essentially "Tell the hypervisor I need to write this data to this device," instead of "Do a crazy, complicated rain dance to get this device to perform this function." Even better, the hypervisor doesn't have to interpret this crazy, complicated rain dance; it's handed exactly what you want in simple, easy to read instructions which don't have to be decoded and passed to the kernel and then re-encoded for a different hardware device etc.
This means it's faster.
No it does not.
Posted Mar 6, 2009 7:22 UTC (Fri) by khim (subscriber, #9252) [Link]
Xen may be faster today but this is not an intrinsic advantage.
The story with KVM:
1. Userspace asks the kernel to send the packet.
2. Context switch to kernel.
3. Kernel asks the "hardware" to send the packet.
4. The "reverse driver" asks the outer kernel to send the packet.
5. Context switch to outer kernel.
6. Outer kernel talks to real hardware.
The story with Xen:
1. Userspace asks the kernel to send the packet.
2. Context switch to kernel.
3. Kernel asks Xen to send the packet.
4. Context switch to Xen.
5. Xen asks the outer kernel to send the packet.
6. Context switch to Dom0 kernel.
7. Dom0 kernel talks to real hardware.
Context switches are expensive (equal to a hundred simple operations or so) and Xen uses one additional context switch over KVM. This can easily cancel out the gain from a simpler interface without a "reverse driver". That's why there is a push to create drivers directly for Xen - this way it'll be faster than KVM... if KVM does not use paravirtualization. I fail to see why it cannot use paravirtualization for all devices except the CPU (where it has hardware support and so is fast enough already).
In the end Xen can become a fast specialized OS, but so can KVM - and which way is faster? Drepper's words are still relevant: neither Xen nor VMware has any real advantages which cannot be surmounted by giving KVM more time to catch up, i.e., granting it the same time to develop the features.
And if so, then why should we include an interim solution? It depends on the timeframe: everyone agrees btrfs will do everything ext4 does, yet ext4 was included anyway, because btrfs will not be ready for a few more years. If KVM needs a few more years to catch up then maybe Dom0 support is worth having in the kernel; but if it's only a matter of months, the story will be different...
Xen: finishing the job
Posted Mar 5, 2009 11:57 UTC (Thu) by danpb (subscriber, #4831) [Link]
Xen: finishing the job
Posted Mar 7, 2009 7:29 UTC (Sat) by mab (guest, #314) [Link]
Xen: finishing the job
Posted Mar 8, 2009 8:08 UTC (Sun) by rahulsundaram (subscriber, #21946) [Link]
Xen: finishing the job
Posted Mar 4, 2009 16:50 UTC (Wed) by drag (guest, #31333) [Link]
However I've moved on to using KVM for most everything. Having the ability to simply _have_ a hypervisor by default with no effort, no patching, no rebooting, no 'lifting' my system kernel out of Ring 0, etc etc is a wonderful thing.
And the other thing is that no special or weird configurations are needed. While Fedora with virt-manager provides a nice gui and other tools... for many of my tasks simply being able to launch qemu with screen and serial output to my terminal is quite convenient.
That being said, if people are using Xen and finding it useful, and there are cases where it would be superior, then it would be nice to get support into the kernel.
Xen: finishing the job
Posted Mar 5, 2009 10:06 UTC (Thu) by dw (guest, #12017) [Link]
Hey I heard about this really neat new free OS by communists called DEBIAN LINUX which has all this stuff built in. Sure beats that Slackware nonsense you appear to be running. :)
apt-get install xen-linux-system && grub-install /dev/sda && reboot
Xen: finishing the job
Posted Mar 5, 2009 17:29 UTC (Thu) by drag (guest, #31333) [Link]
It's still not that easy.
With KVM... "modprobe kvm-intel" (or kvm-amd, or whatever). That will work on any recent Linux distribution. The difference is that KVM is already there. Having to install a modified qemu is all I need to do, and that is still quite a bit simpler and less problem-prone than what you pasted there.
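The whole "already there" flow amounts to two commands (a sketch; the module name depends on your CPU, guest.img is an example path, and -enable-kvm assumes the qemu 0.10 series with merged KVM support):

```shell
# Load KVM support for an Intel CPU (use kvm-amd on AMD hardware);
# no host reboot or patched kernel is needed.
modprobe kvm-intel

# Boot a guest with hardware acceleration through the loaded module.
qemu-system-x86_64 -enable-kvm -m 512 -hda guest.img
```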
With my laptop, for example - which I make heavy use of virtualization on for small development and documentation projects - I run Fedora 10 for various reasons (my preferred distribution is Debian, btw). I have an Intel GMA X3100 video card and wifi. For various other reasons I like to have DRI2 enabled, and this requires a very new kernel (along with newer X stuff).
Also I like having good power management stuff. Being able to suspend my laptop and such is very handy as I move around quite a bit.
All of this sort of stuff makes life for a Xen user much much more difficult.
------------------------------
Also, all the benefits of running Xen seem to stem from its paravirtualization features. For what I do, I need full virtualization... Having to muck around with the kernel of the guest systems in addition to the kernel of the host system is just not worth the trouble, and is frequently not even practical.
It is still not the same
Posted Mar 7, 2009 1:19 UTC (Sat) by gwolf (subscriber, #14632) [Link]
And BTW, ACPI is not only a good feature for laptops. I want my servers also to suck less energy when demand drops at night.
Xen: finishing the job
Posted Mar 4, 2009 16:59 UTC (Wed) by mday_ii (guest, #25315) [Link]
When evaluating Xen for security, you must audit the dom0 kernel + userspace right along with the Xen kernel. dom0 is a fully privileged guest - it has access to memory for all other guests. You end up with more kloc to evaluate with Xen than with KVM.
KVM also supports PCI device pass-through (the feature which allows each device to be driven by a separate domain). KVM runs paravirtual Linux kernels using the standard paravirt-ops interface.
Xen's approach to managing guest page tables (paravirtualization and batching of page table updates) will lose its benefit quickly as EPT (nested page table) support moves guest page table management into the processor. Nehalem and Barcelona both support this feature, which in some tests eliminates more than 90% of traps into the hypervisor.
Xen: finishing the job
Posted Mar 5, 2009 10:12 UTC (Thu) by dw (guest, #12017) [Link]
Surely when 'measuring' the security of KVM, one should also take into account the security of Qemu (see for example this paper which is pretty damning).
Xen: finishing the job
Posted Mar 5, 2009 13:56 UTC (Thu) by mday_ii (guest, #25315) [Link]
Xen: finishing the job
Posted Mar 8, 2009 4:33 UTC (Sun) by landley (guest, #6789) [Link]
> QEMU 0.8.2 was the latest version available as of this
> writing, which was used in its default configuration.
That was released July 22, 2006. That's about when the 2.6.17 kernel was released. So you're saying "look at all these bugs an old version of the project had". Keeping in mind that the project only _launched_ in 2003, it shouldn't come as a surprise that back when it was only 3 years old it didn't even have working x86-64 support yet (and even x86 had a very restricted and buggy set of hardware it could emulate), so its development community hadn't started paying attention to security auditing device emulations just yet. They were too busy trying to add enough features to make it usable.
I also note that the first place I saw that paper is when it was linked from the qemu development mailing list shortly after it came out, and that's when the developers went "oh, people are trying to use it for honeypots? Ok, we'd better add bounds checking and such then".
The qemu development community has roughly quadrupled in size since then, guesstimating by list traffic and source control commits...
The current qemu is 0.10.0, released March 4th. Among other new features, it integrates kvm support in the base qemu. Just FYI.
Rob
Xen: finishing the job
Posted Mar 12, 2009 23:57 UTC (Thu) by efexis (guest, #26355) [Link]
> Xen is not a full hypervisor until it loads the first domain - dom0

Yes, but the domUs can't talk directly to the dom0 without going through the hypervisor code, can they? This means that dom0 doesn't have to be provably secure if the hypervisor is. The hypervisor is acting like a firewall between networks, and the smaller and simpler this bit of code is, the easier it is to reach higher levels of certainty that the system is secure.
Xen: finishing the job
Posted Mar 4, 2009 17:38 UTC (Wed) by kev009 (guest, #43906) [Link]
Xen is proven, stable, and it works on a lot of hardware that KVM doesn't. It is used by companies like Linode.com and Slicehost with great success. Having Dom0 support in the latest kernel would be a boon for usage and for access to the latest kernel features and hardware support.
KVM is great, especially for desktops, but ATM Xen is widely used on servers and shows no sign of reduction. Please merge!
Xen: finishing the job
Posted Mar 4, 2009 18:03 UTC (Wed) by amit (subscriber, #1274) [Link]
http://librehosting.com/tech/
http://www.ukcloudhosting.co.uk/content/uk-cloud-hosting-...
Red Hat has revealed its virtualisation strategy based on KVM:
http://www.redhat.com/virtualization-strategy/?intcmp=701...
That goes a long way toward showing the kind of effort that will be put into KVM to make it as stable as possible for deployment in enterprises.
Fedora and Ubuntu already support KVM as the default hypervisor now.
That said, however, it's best to merge the Xen code upstream since having it out of tree while there are users out there doesn't seem like a good idea. One of the reasons distros quickly had to move to kvm as the default is because of the pain in maintaining out-of-tree patches and also maintaining different kernel versions just for Xen dom0 support.
Xen: finishing the job
Posted Mar 4, 2009 18:06 UTC (Wed) by amit (subscriber, #1274) [Link]
Xen: finishing the job
Posted Mar 4, 2009 18:33 UTC (Wed) by bgilbert (subscriber, #4738) [Link]
Proven and stable? You must be using a different Xen than I have. In my experience, getting Xen to work reliably on a given system is an incredible amount of work, when it's possible at all. And it mostly involves blind tinkering, since there's often no indication of how or why things are breaking.
Xen: finishing the job
Posted Mar 4, 2009 20:48 UTC (Wed) by kev009 (guest, #43906) [Link]
I am using this in a production rack across a few 1st-generation Opterons and it is fantastic. As an aside, VMware won't do 64-bit guests on these CPUs, but Xen will.
Xen: finishing the job
Posted Mar 5, 2009 8:27 UTC (Thu) by leighbb (subscriber, #1205) [Link]
In the early days I was using unofficial packages, but since Etch everything worked "out of the box". I have just upgraded to Lenny and so far (touch wood) it remains as robust as ever.
I have a nice 8GB dual-CPU Opteron server running 64-bit dom0 and domU's. This box does not support hardware virtualisation, so from my point of view Xen is the only option to get near-native speed and 64-bit guests.
Xen: finishing the job
Posted Mar 4, 2009 19:39 UTC (Wed) by jmm (subscriber, #34596) [Link]
Xen: finishing the job
Posted Mar 5, 2009 13:56 UTC (Thu) by amit (subscriber, #1274) [Link]
KVM has paravirt drivers for Windows (for network; block is coming soon) and Linux (network, block, mmu, clock,...)
Xen: finishing the job
Posted Mar 4, 2009 20:06 UTC (Wed) by ianwoodstock (guest, #56970) [Link]
It's interesting to see the names on there - two from Citrix (Ian and Jeremy) and then four from Red Hat.
Xen: finishing the job
Posted Mar 4, 2009 20:50 UTC (Wed) by kev009 (guest, #43906) [Link]
Xen: finishing the job
Posted Mar 5, 2009 8:26 UTC (Thu) by kolyshkin (guest, #34342) [Link]
Xen: finishing the job
Posted Mar 5, 2009 3:53 UTC (Thu) by pabs (subscriber, #43278) [Link]
Xen: finishing the job
Posted Mar 5, 2009 13:46 UTC (Thu) by amit (subscriber, #1274) [Link]
Xen: finishing the job
Posted Mar 5, 2009 7:40 UTC (Thu) by bangert (subscriber, #28342) [Link]
the openvz project as well...
Xen: finishing the job
Posted Mar 5, 2009 8:36 UTC (Thu) by kolyshkin (guest, #34342) [Link]
Oh all right, I guess you just said «LXC» or «Linux Containers». So let me explain.
LXC is not something opposing OpenVZ (as in KVM vs. Xen). The OpenVZ team is one of the top contributors to LXC (I am not sure who's number one here - it is either IBM or OpenVZ - and it doesn't really matter to me). What we do is take a feature from the OpenVZ patchset, rewrite it for mainstream, and submit it; after a few rounds of reviewing and improving, it gets merged. The big container building blocks - the PID namespace, the network namespace, and the memory controller - are all there due to the hard work of the OpenVZ guys (with some help from the IBM guys).
Also, currently there's still a lot of work to do for LXC to become more-or-less usable.
So I am sorry but I don't see any correlation here (aside from the «merge early» mantra). If you have a different perspective on this please share it with us.
Xen: finishing the job
Posted Mar 7, 2009 11:18 UTC (Sat) by rwmj (subscriber, #5474) [Link]
> The Xen hypervisor is lightweight, and can be run standalone; the KVM hypervisor is, instead, the Linux kernel.
I'm a bit surprised Jonathan allowed Jeremy's comment above to go unremarked. Sure, the Xen hypervisor can be lightweight, but that's only because it doesn't have any device drivers! Once you add all the device support from Linux, guess what: you've got the Linux kernel. Xen is only really useful once you combine the HV and the Dom0 (i.e. a full Linux kernel). There's not even any security advantage, since the Dom0 gets full access to the hardware.
Xen: finishing the job (security view)
Posted Mar 10, 2009 18:54 UTC (Tue) by huneycutt (guest, #13037) [Link]
Professional perspective: most of my customers are in the DoD or intelligence communities. I'd like to know which approach is going to get the authoritative backing for pursuing government-strength security certification and accreditation. I can see arguments for each: multiple KVM partitions running on top of an EAL 4+ version of Linux (RH or HP configuration), or the single Xen hypervisor (reminiscent of a medium-assurance version of the Separation Kernel Protection Profile) controlling a single semi-trusted partition (dom0) and a bunch of untrusted partitions (domU).
Historically, the evaluators much prefer simplicity in the products being evaluated. The more trivial, the more likely to be certified. In this view, testing Xen for SKPP-type controls and granting it a level of trustworthiness in controlling both dom0 and domU domains would seem more likely. Combining an SKPP-type evaluation (for KVM itself) on top of an LSPP/RBAC/etc EAL 4+ evaluation is probably asking too much of anyone - even if the protection profiles used for the RH/HP certifications were still valid.
Unfortunately, I don't see anyone jumping up and down right now to pay for sponsoring any more NIAP testing, given the state of the Common Criteria, the pace of virtualization evolution, and the extremely dynamic nature of the certification and accreditation processes themselves. I know that NSA is pressing ahead with the HAP program, but I'd prefer to see the de facto standard solutions come from a purely open source effort, if only to make world-wide secure information sharing achievable. I'd love to hear from anyone with additional information on this topic.
Also unfortunately, when intellectual properties such as Xen and KVM are acquired by major players in the commercial side of the business (which Citrix and RH undoubtedly qualify as), it is human nature to begin to question the motives of their actions (such as belligerently stonewalling kernel incorporation, or providing roadmaps which lean strongly toward a recently acquired product without documenting the technical justification for the new stance). Trust is very nearly impossible to regain once it has been compromised. I'd strongly recommend to both Citrix and Red Hat that they continue to work together for the good of the overall user community, advance both of their products as openly as possible, and let the users make the decisions about which approach is best based on their experiences. Otherwise, the open source community is taking a big step toward behaving just like the other folks out there.
Xen: finishing the job (security view)
Posted Jun 3, 2009 15:39 UTC (Wed) by ceplm (subscriber, #41334) [Link]