[buildd-tools-devel] Bug#604268: QEMU linux-user support
Loïc Minier
lool at dooz.org
Mon Jan 24 00:20:34 UTC 2011
On Sun, Jan 23, 2011, Roger Leigh wrote:
> This should be sufficiently portable for Linux usage. I am a little
> concerned that it might be fragile though. How does the binfmt-misc
> code know which file to pick? Can't we use the same mechanism and
> avoid this?
So binfmt-support has an update-binfmts which registers
/usr/share/binfmt-support/* stuff into /var/lib/binfmts/*; these end up
enabled in the running kernel (see /proc/sys/fs/binfmt_misc) via
update-binfmts --enable which is run by /etc/init.d/binfmt-support.
While we could try parsing binfmt-support's format or binfmt_misc's
format, I think the best thing here would be to check with the
binfmt-support maintainer. I wonder whether binfmt-support is
Debian/Ubuntu specific. If it is, then poking the kernel format or
trying to run a binary might be best. If it's also used on other
distros, perhaps we can get some command which tells us what the
interpreter is for a specific binary. If that makes sense to you, I
can poke Colin about it, and perhaps open a bug report for the
binfmt-support changes.
(ptrace() might also allow finding out which interpreter is run, but
that seems fragile too.)
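
To make the "poking the kernel format" option concrete, here is a rough Python sketch, not anything schroot or binfmt-support ships. It assumes the usual layout of the per-format files under /proc/sys/fs/binfmt_misc ("enabled", "interpreter ...", "offset ...", "magic ...", "mask ..." lines); the function names are made up for illustration. It only handles magic-based entries, and it walks the directory in sorted order, which may not be the order the kernel itself tries the formats in.

```python
from pathlib import Path


def parse_binfmt_entry(text):
    """Parse the text of one /proc/sys/fs/binfmt_misc/<name> file."""
    entry = {"enabled": False, "interpreter": None, "offset": 0,
             "magic": None, "mask": None}
    for line in text.splitlines():
        line = line.strip()
        if line == "enabled":
            entry["enabled"] = True
        elif line.startswith("interpreter "):
            entry["interpreter"] = line.split(" ", 1)[1]
        elif line.startswith("offset "):
            entry["offset"] = int(line.split()[1])
        elif line.startswith("magic "):
            entry["magic"] = bytes.fromhex(line.split()[1])
        elif line.startswith("mask "):
            entry["mask"] = bytes.fromhex(line.split()[1])
    return entry


def matches(entry, header):
    """Return True if the first bytes of a file match a magic-based entry."""
    magic = entry["magic"]
    if not entry["enabled"] or magic is None:
        # Disabled, or an extension-based entry: not handled in this sketch.
        return False
    # No mask line means every bit of the magic is significant.
    mask = entry["mask"] or b"\xff" * len(magic)
    start = entry["offset"]
    chunk = header[start:start + len(magic)]
    if len(chunk) < len(magic):
        return False
    return all((c & m) == (g & m) for c, g, m in zip(chunk, magic, mask))


def find_interpreter(binary, binfmt_dir="/proc/sys/fs/binfmt_misc"):
    """Return the interpreter path registered for `binary`, or None."""
    with open(binary, "rb") as fh:
        header = fh.read(256)
    for path in sorted(Path(binfmt_dir).iterdir()):
        # "register" and "status" are control files, not format entries.
        if path.name in ("register", "status"):
            continue
        try:
            entry = parse_binfmt_entry(path.read_text(errors="replace"))
        except OSError:
            continue
        if matches(entry, header):
            return entry["interpreter"]
    return None
```

Something like find_interpreter("/path/to/chroot/bin/ls") would then give the path that needs to exist inside the chroot, assuming binfmt_misc is mounted on the host.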
> > > If you're unhappy with any of the names used, that's also trivial to
> > > change if you like. (I'm fairly rubbish at naming things!)
> >
> > That's probably as good as what I would think of; qemu syscall
> > emulation is usually named "CPU transparency" because it's basically a
> > mapping of a flow of CPU instruction to another one, with syscall
> > translation. This is different from qemu machine emulation which
> > emulates hardware; this is sometimes called simulation. Upstream, the
> > syscall emulation stuff is called "qemu-user emulation".
>
> Would it be better to use "emulation=qemu-user" rather than just
> "qemu"? It would allow addition of "qemu-system" at a later point,
> and also makes the distinction between the two.
I'm happy with either; I don't know of any other implementation of this
feature than qemu, and I can't think of any other name that we would
use than "emulation". So emulation=qemu is fine! You could also name
it emulation=binfmt-misc if we manage to get the information without
special casing qemu ;-)
qemu-system-*: ah, I hope you don't mind if I share some thoughts
here:
* qemu-system-* can be managed much like a remote system over TCP/IP,
with some commands to set up and tear down; I've seen software like
Hudson deal with this by having two ways of controlling slaves: a) install
a piece of software on the slave which connects to the master and
gets orders (intrusive but clean), or b) have the master connect to the
slave over SSH and send commands / receive responses over that link.
I think having simple pre-/post-commands would be fine to deal with
such use.
* There are some specificities that you could exploit to tune:
- qemu-system-* often allows you to interact with consoles of the vm;
I think that's what qemubuilder and "rootstock" use to interact
with the guest. That's nifty, but very specific to this use case.
- if your guest has virtio support, you have more efficient network
and block devices, can share memory, but you can also share
filesystems! This might be very efficient (more than NFS or scp),
but again, very specific.
* The complexity is really in getting a good qemu-system-foo
environment working and up-to-date, including kernel, rootfs, cmdline
opts, serial line setup etc. This is a problem which can be solved
separately, much like creating a chroot is a separate problem from
using it. In fact, even for qemu-system-arm, each machine might have
a different boot mechanism: this board only supports booting from an
SD image, this other only from flash, that one does not have
networking, etc.
I found libvirt to be a really nice abstraction to control vms of
various types; it worked fine for kvm, virtualbox and qemu based vms
for me, and has a quite complete stack with some language bindings,
higher level software, UIs etc. libvirt allows defining additional
types of vms, and I'm convinced we could define new types of vms to
start qemu-system-arm with this arm kernel and these command-line flags
etc.
> > Ah it's actually qemu-kvm-extras-static in Ubuntu, not qemu-kvm-extras;
> > I can see I'm at the origin of this bug
> > > +# Depends: file, qemu-user-static | qemu-kvm-extras
> > Should be | qemu-kvm-extras-static
> If the package is not in Debian, maybe this one should be patched
> in when put in Ubuntu?
It's not too good though, because it means a manual merge every time
schroot is uploaded, which adds work and delays. I could propose a
dpkg-vendor based test to generate a ${qemu-user-suggests} or similar
to use in control, but I think you'll find that uglier in your package
than the above control snippet ;-)
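
For illustration only, the dpkg-vendor based test could look roughly like the Python sketch below. It assumes a substvar named qemu-user-suggests (the name floated above) and uses `dpkg-vendor --query Vendor` to find the vendor; the helper names are hypothetical, and in a real package this would more likely be a couple of lines in debian/rules.

```python
import subprocess


def qemu_user_suggests(vendor):
    """Build the substvars line for the given vendor name."""
    # Assumption: only Ubuntu needs the qemu-kvm-extras-static alternative.
    if vendor.strip().lower() == "ubuntu":
        alternatives = ["qemu-user-static", "qemu-kvm-extras-static"]
    else:
        alternatives = ["qemu-user-static"]
    return "qemu-user-suggests=" + " | ".join(alternatives)


def current_vendor():
    """Ask dpkg-vendor which vendor the build is running on."""
    result = subprocess.run(["dpkg-vendor", "--query", "Vendor"],
                            capture_output=True, text=True, check=True)
    return result.stdout.strip()
```

The resulting line would be appended to the package's substvars file during the build, with ${qemu-user-suggests} referenced from the control file.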
(NB: ideally, Ubuntu would use the same package layout as Debian; it's
quite complex here and involves many packages and a long story; I'm
happy to share the details)
> If the path wasn't hardcoded, we could
> have put it into /usr/local or even some non-standard name or location
> if we could adjust the interpreter. It's a shame it's looking inside
> the chroot, rather than the main root, but that's probably the only
> sane thing to do now a system can have multiple namespaces.
Yeah, it's a complex security problem: it's a kernel service, so
looking up things in the PATH already sounds scary. The interpreter
also ends up in userspace memory, so I would be scared if it were
allowed to load stuff from outside the chroot (although that seems useful).
> I think that clearly documenting this limitation is the most
> pragmatic approach here. I think the chance of installing and
> using the same qemu-$arch-static binary inside the chroot is small,
> but using a diversion will be a good improvement.
Makes complete sense.
--
Loïc Minier