Running applications on the Xen Hypervisor
There are a number of motivations for running applications directly on top of the Xen hypervisor without resorting to a full general-purpose OS. For example, one might want to maximally isolate applications with minimal overhead. Leaving the OS out of the picture decreases overhead, since, for example, the inter-application protection normally offered by virtual memory is already handled by the Xen hypervisor. At the same time, problems arise: applications expect and use many services normally provided by the OS, such as files, sockets, event notification and so forth. We were able to set up a production-quality environment for running applications as Xen DomUs in a few weeks by reusing hundreds of thousands of lines of unmodified driver and infrastructure code from NetBSD. While that may sound like a lot of driver code for running single applications, keep in mind that it includes, for example, file systems, the TCP/IP stack, stdio, system calls and so forth -- the innocent-looking open() alone accepts over 20 flags which must be properly handled. The remainder of this post looks at the effort in more detail.
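To make the open() remark concrete, here is a minimal sketch of the kind of call an ordinary application makes and expects to behave correctly. It uses only standard POSIX flags; the helper name write_once is hypothetical and chosen for illustration, and nothing in it is specific to Xen or rump kernels.

```c
#include <sys/types.h>
#include <fcntl.h>
#include <unistd.h>

/*
 * Hypothetical helper (not from the original post): create a file
 * exclusively, write a message to it, and close it.
 * Returns 0 on success, -1 on any failure, including the case where
 * the file already exists.
 */
int
write_once(const char *path, const char *msg, size_t len)
{
	ssize_t n;
	int fd, rv;

	/*
	 * O_CREAT | O_EXCL: fail instead of reusing an existing file.
	 * O_APPEND: every write goes to the end of the file.
	 * Each flag changes what the file system code underneath must do.
	 */
	fd = open(path, O_WRONLY | O_CREAT | O_EXCL | O_APPEND, 0600);
	if (fd == -1)
		return -1;

	n = write(fd, msg, len);
	rv = (n == (ssize_t)len) ? 0 : -1;

	if (close(fd) == -1)
		rv = -1;
	return rv;
}
```

Calling write_once() twice with the same path should succeed the first time and fail the second, because O_EXCL refuses to open a file that already exists. Getting details like that right for every flag combination is exactly the work that comes for free when the file system code is reused rather than rewritten.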
I have been on a path to maximize the reuse potential of the NetBSD kernel with a technology called rump kernels. Things started out with running unmodified drivers in userspace on NetBSD, but the ability to host rump kernels has since spread to the userspace of other operating systems, web browsers (when compiled to JavaScript), and even the Linux kernel. Running rump kernels directly on the Xen hypervisor has been suggested by a number of people over the years. It poses a different sort of challenge since, as opposed to the environments mentioned previously, the Xen hypervisor is a "bare metal" type of environment: the guest is in charge of everything, starting from bootstrap and page table management. The conveniently reusable Xen Mini-OS was employed for interfacing with the bare metal environment, and the necessary rump kernel hypercalls were built upon that. Once those hypercalls were implemented, the environment for running unmodified NetBSD kernel drivers (e.g. TCP/IP) and system call handlers (e.g. socket()) directly on top of the Xen hypervisor was available (note: link points to the initial revision).
Shortly after I had published the above, I was contacted by Justin Cormack, with whom I had collaborated earlier on his ljsyscall project, which provides system call interfaces to Lua programs. He wanted to run the LuaJIT interpreter and his ljsyscall implementation directly on top of Xen. However, in addition to system calls, which were already handled by the rump kernel, the LuaJIT interpreter uses interfaces from libc, so we added the NetBSD libc to the mix. While the libc sources are currently hosted on GitHub, we plan to integrate the changes into the upstream NetBSD sources as soon as things settle (see the repo for instructions on how to produce a diff and verify that the changes really are tiny). The same repository hosts the math library libm, but it is there just for convenience, so that these early builds deliver everything from a single checkout. It has been verified that you can alternatively use libm from a standard NetBSD binary distribution, as you presumably could with any other user-level library. The resulting architecture is depicted below.
The API and ABI we provide are the same as those of a regular NetBSD installation. Apart from some limitations, such as the absence of fork() -- would it duplicate the DomU? -- objects compiled for a regular NetBSD installation can be linked into the DomU image and booted to run directly as standalone applications on top of Xen. As proofs of concept, I created a demo where a Xen DomU configures TCP/IP networking, mounts a file system image, and runs an httpd daemon to serve the contents, and Justin's demo runs the LuaJIT interpreter and executes the self-test suite for ljsyscall. Though there is solid support for running applications, not all of the work is done. In particular, the build framework needs to be more flexible, and everyone who has a use case for this technology is welcome to test out their application and contribute ideas and code for improving the framework.
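As an illustration of what "the same API" means in practice, the following is a minimal sketch of ordinary BSD sockets code of the sort a network daemon contains. It is not taken from the httpd demo; the function name listen_loopback is hypothetical, and the choice of the loopback address with a kernel-assigned port is an assumption made to keep the example self-contained. The point is only that code like this contains nothing Xen-specific.

```c
#include <sys/types.h>
#include <sys/socket.h>
#include <netinet/in.h>
#include <arpa/inet.h>
#include <string.h>
#include <unistd.h>

/*
 * Hypothetical helper (not from the original demo): open a TCP
 * listening socket on 127.0.0.1 and let the kernel pick a free port.
 * On success, returns the socket descriptor and stores the chosen
 * port in *port_out. Returns -1 on failure.
 */
int
listen_loopback(unsigned short *port_out)
{
	struct sockaddr_in sin;
	socklen_t slen = sizeof(sin);
	int s;

	s = socket(AF_INET, SOCK_STREAM, 0);
	if (s == -1)
		return -1;

	memset(&sin, 0, sizeof(sin));
	sin.sin_family = AF_INET;
	sin.sin_addr.s_addr = htonl(INADDR_LOOPBACK);
	sin.sin_port = htons(0);	/* 0 = kernel assigns a free port */

	if (bind(s, (struct sockaddr *)&sin, sizeof(sin)) == -1 ||
	    listen(s, 8) == -1 ||
	    getsockname(s, (struct sockaddr *)&sin, &slen) == -1) {
		close(s);
		return -1;
	}

	*port_out = ntohs(sin.sin_port);
	return s;
}
```

A caller would pass the returned descriptor to accept() in a loop, as any daemon would. Every call here (socket(), bind(), listen(), getsockname()) is a standard system call, which is why an object file containing such code needs no changes to be linked into a different environment that offers the same interfaces.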
In conclusion, we have shown that it is straightforward to reuse both kernel and library code from an existing real-world operating system in creating an application environment which can run on top of a bare-metal type cloud platform. By being able to use on the order of 99.9% of the code -- that's 1,000 lines written per 1,000,000 used unmodified -- from an existing, real-world proven source, the task was quick to pull off, the result is robust, and the offered application interfaces are complete. Some might call our work "yet another $fookernel", but we call it a working result and challenge everyone to evaluate it for themselves.