[kernel] r11745 - in dists/trunk/linux-2.6/debian/patches: features series

Maximilian Attems maks at alioth.debian.org
Wed Jul 2 21:33:27 UTC 2008


Author: maks
Date: Wed Jul  2 21:33:25 2008
New Revision: 11745

Log:
xen: merge 2.6.27 patch material

the x86/xen branch adds:
* Save/restore/migration
* Further pvfb enhancements

xen patch not yet enabled; subject to testing.

Added:
   dists/trunk/linux-2.6/debian/patches/features/xen-x86.patch
Modified:
   dists/trunk/linux-2.6/debian/patches/series/1~experimental.1-extra

Added: dists/trunk/linux-2.6/debian/patches/features/xen-x86.patch
==============================================================================
--- (empty file)
+++ dists/trunk/linux-2.6/debian/patches/features/xen-x86.patch	Wed Jul  2 21:33:25 2008
@@ -0,0 +1,3842 @@
+commit 400d34944c4ad82a817c06e570bc93b1114aa596
+Author: Jeremy Fitzhardinge <jeremy at goop.org>
+Date:   Mon Jun 16 04:30:03 2008 -0700
+
+    xen: add mechanism to extend existing multicalls
+    
+    Some Xen hypercalls accept an array of operations to work on.  In
+    general this is because it's more efficient for the hypercall to do the
+    work all at once rather than as separate hypercalls (even batched as a
+    multicall).
+    
+    This patch adds a mechanism (xen_mc_extend_args()) to allocate more
+    argument space to the last-issued multicall, in order to extend its
+    argument list.
+    
+    The user of this mechanism is xen/mmu.c, which uses it to extend the
+    args array of mmu_update.  This is particularly valuable when doing
+    the update for a large mprotect, which goes via
+    ptep_modify_prot_commit(), but it also manages to batch updates to
+    pgd/pmds as well.
+    
+    Signed-off-by: Jeremy Fitzhardinge <jeremy.fitzhardinge at citrix.com>
+    Acked-by: Linus Torvalds <torvalds at linux-foundation.org>
+    Acked-by: Hugh Dickins <hugh at veritas.com>
+    Signed-off-by: Ingo Molnar <mingo at elte.hu>
+
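+    For reference, the usage pattern this enables is visible in the
+    extend_mmu_update() helper in the mmu.c hunk further down in this
+    patch; roughly (a sketch with comments added for illustration, not
+    the exact upstream code):
+
+      static void extend_mmu_update(const struct mmu_update *update)
+      {
+              struct multicall_space mcs;
+              struct mmu_update *u;
+
+              /* Try to grow the argument space of the last-issued
+                 mmu_update multicall by sizeof(*u) bytes. */
+              mcs = xen_mc_extend_args(__HYPERVISOR_mmu_update, sizeof(*u));
+
+              if (mcs.mc != NULL)
+                      mcs.mc->args[1]++;      /* one more op in the batch */
+              else {
+                      /* No compatible multicall to extend: start one. */
+                      mcs = __xen_mc_entry(sizeof(*u));
+                      MULTI_mmu_update(mcs.mc, mcs.args, 1, NULL, DOMID_SELF);
+              }
+
+              u = mcs.args;
+              *u = *update;                   /* append our arguments */
+      }
+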
+commit e57778a1e30470c9f5b79e370511b9af29b59c48
+Author: Jeremy Fitzhardinge <jeremy at goop.org>
+Date:   Mon Jun 16 04:30:02 2008 -0700
+
+    xen: implement ptep_modify_prot_start/commit
+    
+    Xen has a pte update function which will update a pte while preserving
+    its accessed and dirty bits.  This means that ptep_modify_prot_start() can be
+    implemented as a simple read of the pte value.  The hardware may
+    update the pte in the meantime, but ptep_modify_prot_commit() updates it while
+    preserving any changes that may have happened in the meantime.
+    
+    The updates in ptep_modify_prot_commit() are batched if we're currently in lazy
+    mmu mode.
+    
+    The mmu_update hypercall can take a batch of updates to perform, but
+    this code doesn't make particular use of that feature, in favour of
+    using generic multicall batching to get them all into the hypervisor.
+    
+    The net effect of this is that each mprotect pte update turns from two
+    expensive trap-and-emulate faults into the hypervisor into a single
+    hypercall whose cost is amortized in a batched multicall.
+    
+    Signed-off-by: Jeremy Fitzhardinge <jeremy.fitzhardinge at citrix.com>
+    Acked-by: Linus Torvalds <torvalds at linux-foundation.org>
+    Acked-by: Hugh Dickins <hugh at veritas.com>
+    Signed-off-by: Ingo Molnar <mingo at elte.hu>
+
+commit 08b882c627aeeeb3cfd3c4354f0d360d7949549d
+Author: Jeremy Fitzhardinge <jeremy at goop.org>
+Date:   Mon Jun 16 04:30:01 2008 -0700
+
+    paravirt: add hooks for ptep_modify_prot_start/commit
+    
+    This patch adds paravirt-ops hooks in pv_mmu_ops for ptep_modify_prot_start and
+    ptep_modify_prot_commit.  This allows the hypervisor-specific backends to
+    implement these in some more efficient way.
+    
+    Signed-off-by: Jeremy Fitzhardinge <jeremy.fitzhardinge at citrix.com>
+    Acked-by: Linus Torvalds <torvalds at linux-foundation.org>
+    Acked-by: Hugh Dickins <hugh at veritas.com>
+    Signed-off-by: Ingo Molnar <mingo at elte.hu>
+
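+    As an illustration, a hypervisor backend overrides these hooks by
+    patching pv_mmu_ops, as the Xen code in the enlighten.c hunk later
+    in this patch does (sketch only):
+
+      if (xen_feature(XENFEAT_mmu_pt_update_preserve_ad)) {
+              /* Xen can preserve accessed/dirty bits across updates, so
+                 install its cheaper start/commit implementations. */
+              pv_mmu_ops.ptep_modify_prot_start  = xen_ptep_modify_prot_start;
+              pv_mmu_ops.ptep_modify_prot_commit = xen_ptep_modify_prot_commit;
+      }
+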
+commit 1ea0704e0da65b2b46f9142ff1391163aac24060
+Author: Jeremy Fitzhardinge <jeremy at goop.org>
+Date:   Mon Jun 16 04:30:00 2008 -0700
+
+    mm: add a ptep_modify_prot transaction abstraction
+    
+    This patch adds an API for doing read-modify-write updates to a pte's
+    protection bits which may race against hardware updates to the pte.
+    After reading the pte, the hardware may asynchronously set the accessed
+    or dirty bits on a pte, which would be lost when writing back the
+    modified pte value.
+    
+    The existing technique to handle this race is to use
+    ptep_get_and_clear() to atomically fetch the old pte value and clear it
+    in memory.  This has the effect of marking the pte as non-present,
+    which will prevent the hardware from updating its state.  When the new
+    value is written back, the pte will be present again, and the hardware
+    can resume updating the access/dirty flags.
+    
+    When running in a virtualized environment, pagetable updates are
+    relatively expensive, since they generally involve some trap into the
+    hypervisor.  To mitigate the cost of these updates, we tend to batch
+    them.
+    
+    However, because of the atomic nature of ptep_get_and_clear(), it is
+    inherently non-batchable.  This new interface allows batching by
+    giving the underlying implementation enough information to open a
+    transaction between the read and write phases:
+    
+    ptep_modify_prot_start() returns the current pte value, and puts the
+      pte entry into a state where either the hardware will not update the
+      pte, or if it does, the updates will be preserved on commit.
+    
+    ptep_modify_prot_commit() writes back the updated pte, making sure that
+      any hardware updates made since ptep_modify_prot_start() are
+      preserved.
+    
+    ptep_modify_prot_start() and _commit() must be exactly paired, and
+    used while holding the appropriate pte lock.  They do not protect
+    against other software updates of the pte in any way.
+    
+    The current implementations of ptep_modify_prot_start and _commit are
+    functionally unchanged from before: _start() uses ptep_get_and_clear()
+    to fetch the pte and zero the entry, preventing any hardware updates.
+    _commit() simply writes the new pte value back, knowing that the
+    hardware has not updated the pte in the meantime.
+    
+    The only current user of this interface is mprotect.
+    
+    Signed-off-by: Jeremy Fitzhardinge <jeremy.fitzhardinge at citrix.com>
+    Acked-by: Linus Torvalds <torvalds at linux-foundation.org>
+    Acked-by: Hugh Dickins <hugh at veritas.com>
+    Signed-off-by: Ingo Molnar <mingo at elte.hu>
+
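+    A minimal sketch of the generic fallback described above (an
+    illustration based on this commit message, not a verbatim copy of
+    the header code):
+
+      static inline pte_t __ptep_modify_prot_start(struct mm_struct *mm,
+                                                   unsigned long addr,
+                                                   pte_t *ptep)
+      {
+              /* Clearing the pte marks it non-present, so the hardware
+                 cannot set accessed/dirty bits while we compute the new
+                 value. */
+              return ptep_get_and_clear(mm, addr, ptep);
+      }
+
+      static inline void __ptep_modify_prot_commit(struct mm_struct *mm,
+                                                   unsigned long addr,
+                                                   pte_t *ptep, pte_t pte)
+      {
+              /* The entry was clear in the meantime, so a plain write
+                 cannot lose any hardware updates. */
+              set_pte_at(mm, addr, ptep, pte);
+      }
+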
+commit d02859ecb321c8c0f74cb9bbe3f51a59e58822b0
+Merge: a987b16... 543cf4c...
+Author: Ingo Molnar <mingo at elte.hu>
+Date:   Wed Jun 25 12:16:51 2008 +0200
+
+    Merge commit 'v2.6.26-rc8' into x86/xen
+    
+    Conflicts:
+    
+    	arch/x86/xen/enlighten.c
+    	arch/x86/xen/mmu.c
+    
+    Signed-off-by: Ingo Molnar <mingo at elte.hu>
+
+commit a987b16cc6123af2c9414032701bab5f73c54c89
+Author: Jeremy Fitzhardinge <jeremy at goop.org>
+Date:   Mon Jun 16 15:01:56 2008 -0700
+
+    xen: don't drop NX bit
+    
+    Because NX is now enforced properly, we must put the hypercall page
+    into the .text segment so that it is executable.
+    
+    Signed-off-by: Jeremy Fitzhardinge <jeremy.fitzhardinge at citrix.com>
+    Cc: Stable Kernel <stable at kernel.org>
+    Cc: the arch/x86 maintainers <x86 at kernel.org>
+    Signed-off-by: Ingo Molnar <mingo at elte.hu>
+
+commit eb179e443deb0a5c81a62b4c157124a4b7ff1813
+Author: Jeremy Fitzhardinge <jeremy at goop.org>
+Date:   Mon Jun 16 15:01:53 2008 -0700
+
+    xen: mask unwanted pte bits in __supported_pte_mask
+    
+    [ Stable: this isn't a bugfix in itself, but it's a prerequisite
+      for "xen: don't drop NX bit" ]
+    
+    Signed-off-by: Jeremy Fitzhardinge <jeremy.fitzhardinge at citrix.com>
+    Cc: Stable Kernel <stable at kernel.org>
+    Cc: the arch/x86 maintainers <x86 at kernel.org>
+    Signed-off-by: Ingo Molnar <mingo at elte.hu>
+
+commit 6673cf63e5d973db5145d1f48b354efcb9fe2a13
+Author: Isaku Yamahata <yamahata at valinux.co.jp>
+Date:   Mon Jun 16 14:58:13 2008 -0700
+
+    xen: Use wmb instead of rmb in xen_evtchn_do_upcall().
+    
+    This patch is ported from changeset 534:77db69c38249 of linux-2.6.18-xen.hg.
+    Use wmb instead of rmb to enforce ordering between
+    evtchn_upcall_pending and evtchn_pending_sel stores
+    in xen_evtchn_do_upcall().
+    
+    Cc: Samuel Thibault <samuel.thibault at eu.citrix.com>
+    Signed-off-by: Isaku Yamahata <yamahata at valinux.co.jp>
+    Cc: Nick Piggin <nickpiggin at yahoo.com.au>
+    Cc: the arch/x86 maintainers <x86 at kernel.org>
+    Signed-off-by: Ingo Molnar <mingo at elte.hu>
+
+commit 688d22e23ab1caacb2c36c615854294b58f2ea47
+Merge: 7e0edc1... 0665190...
+Author: Ingo Molnar <mingo at elte.hu>
+Date:   Mon Jun 16 11:21:27 2008 +0200
+
+    Merge branch 'linus' into x86/xen
+
+commit 7e0edc1bc343231029084761ebf59e522902eb49
+Author: Jeremy Fitzhardinge <jeremy at goop.org>
+Date:   Sat May 31 01:33:04 2008 +0100
+
+    xen: add new Xen elfnote types and use them appropriately
+    
+    Define recently added XEN_ELFNOTEs, and use them appropriately.
+    Most significantly, this enables domain checkpointing (xm save -c).
+    
+    Signed-off-by: Jeremy Fitzhardinge <jeremy.fitzhardinge at citrix.com>
+    Signed-off-by: Ingo Molnar <mingo at elte.hu>
+
+commit d07af1f0e3a3e378074fc36322dd7b0e72d9a3e2
+Author: Jeremy Fitzhardinge <jeremy at goop.org>
+Date:   Sat May 31 01:33:03 2008 +0100
+
+    xen: resume timers on all vcpus
+    
+    On resume, the vcpu timer modes will not be restored.  The timer
+    infrastructure doesn't do this for us, since it assumes the cpus
+    are offline.  We can just poke the other vcpus into the right mode
+    directly though.
+    
+    Signed-off-by: Jeremy Fitzhardinge <jeremy.fitzhardinge at citrix.com>
+    Signed-off-by: Ingo Molnar <mingo at elte.hu>
+
+commit 9c7a794209f8a91f47697c3be20597eb60531e6d
+Author: Jeremy Fitzhardinge <jeremy at goop.org>
+Date:   Sat May 31 01:33:02 2008 +0100
+
+    xen: restore vcpu_info mapping
+    
+    If we're using vcpu_info mapping, then make sure it's restored on all
+    processors before releasing them from stop_machine.
+    
+    The only complication is that if this fails, we can't continue because
+    we've already made assumptions that the mapping is available (baked in
+    calls to the _direct versions of the functions, for example).
+    
+    Fortunately this can only happen with a 32-bit hypervisor, which may
+    possibly run out of mapping space.  On a 64-bit hypervisor, this is a
+    non-issue.
+    
+    Signed-off-by: Jeremy Fitzhardinge <jeremy.fitzhardinge at citrix.com>
+    Signed-off-by: Ingo Molnar <mingo at elte.hu>
+
+commit e2426cf85f8db5891fb5831323d2d0c176c4dadc
+Author: Jeremy Fitzhardinge <jeremy at goop.org>
+Date:   Sat May 31 01:24:27 2008 +0100
+
+    xen: avoid hypercalls when updating unpinned pud/pmd
+    
+    When operating on an unpinned pagetable (ie, one under construction or
+    destruction), it isn't necessary to use a hypercall to update a
+    pud/pmd entry.  Jan Beulich observed that a similar optimisation
+    avoided many thousands of hypercalls while doing a kernel build.
+    
+    One tricky part is that early in the kernel boot there's no page
+    structure, so we can't check to see if the page is pinned.  In that
+    case, we just always use the hypercall.
+    
+    Signed-off-by: Jeremy Fitzhardinge <jeremy.fitzhardinge at citrix.com>
+    Cc: Jan Beulich <jbeulich at novell.com>
+    Signed-off-by: Ingo Molnar <mingo at elte.hu>
+
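+    The fast path has the same shape as xen_set_pmd() in the mmu.c hunk
+    below (sketch, with comments added for illustration):
+
+      void xen_set_pmd(pmd_t *ptr, pmd_t val)
+      {
+              /* Unpinned pagetable: Xen isn't validating it yet, so a
+                 plain store is fine and saves a hypercall. */
+              if (!page_pinned(ptr)) {
+                      *ptr = val;
+                      return;
+              }
+
+              /* Pinned (live) pagetable: go via the hypervisor. */
+              xen_set_pmd_hyper(ptr, val);
+      }
+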
+commit 15ce60056b24a65b65e28de973a9fd8ac0750a2f
+Author: Ingo Molnar <mingo at elte.hu>
+Date:   Mon Jun 2 13:20:11 2008 +0200
+
+    xen: export get_phys_to_machine
+    
+    -tip testing found the following xen-console symbols trouble:
+    
+      ERROR: "get_phys_to_machine" [drivers/video/xen-fbfront.ko] undefined!
+      ERROR: "get_phys_to_machine" [drivers/net/xen-netfront.ko] undefined!
+      ERROR: "get_phys_to_machine" [drivers/input/xen-kbdfront.ko] undefined!
+    
+    with:
+    
+      http://redhat.com/~mingo/misc/config-Mon_Jun__2_12_25_13_CEST_2008.bad
+
+commit c78277288e3d561d55fb48bc0fe8d6e2cf4d0880
+Author: Jeremy Fitzhardinge <jeremy at goop.org>
+Date:   Thu May 29 09:02:19 2008 +0100
+
+    CONFIG_PM_SLEEP fix: xen: fix compilation when CONFIG_PM_SLEEP is disabled
+    
+    Xen save/restore depends on CONFIG_PM_SLEEP being set for device_power_up/down.
+    
+    Signed-off-by: Jeremy Fitzhardinge <jeremy.fitzhardinge at citrix.com>
+    Acked-by: Randy Dunlap <randy.dunlap at oracle.com>
+    Signed-off-by: Ingo Molnar <mingo at elte.hu>
+
+commit 0261ac5f2f43a1906cfacfb19d62ed643d162cbe
+Author: Ingo Molnar <mingo at elte.hu>
+Date:   Thu May 29 09:31:50 2008 +0200
+
+    xen: fix "xen: implement save/restore"
+    
+    -tip testing found the following build breakage:
+    
+      drivers/built-in.o: In function `xen_suspend':
+      manage.c:(.text+0x4390f): undefined reference to `xen_console_resume'
+    
+    with this config:
+    
+      http://redhat.com/~mingo/misc/config-Thu_May_29_09_23_16_CEST_2008.bad
+    
+    i have bisected it down to:
+    
+    |  commit 0e91398f2a5d4eb6b07df8115917d0d1cf3e9b58
+    |  Author: Jeremy Fitzhardinge <jeremy at goop.org>
+    |  Date:   Mon May 26 23:31:27 2008 +0100
+    |
+    |      xen: implement save/restore
+    
+    the problem is that drivers/xen/manage.c is built unconditionally if
+    CONFIG_XEN is enabled and its xen_suspend() path makes use of
+    xen_console_resume(), but drivers/char/hvc_xen.c, where
+    xen_console_resume() is implemented, is only built if
+    CONFIG_HVC_XEN=y as well.
+    
+    i have solved this by providing a NOP implementation for xen_console_resume()
+    in the !CONFIG_HVC_XEN case.
+    
+    Signed-off-by: Ingo Molnar <mingo at elte.hu>
+
+commit b20aeccd6ad42ccb6be1b3d1d32618ddd2b31bf0
+Author: Ingo Molnar <mingo at elte.hu>
+Date:   Wed May 28 14:24:38 2008 +0200
+
+    xen: fix early bootup crash on native hardware
+    
+    -tip tree auto-testing found the following early bootup hang:
+    
+    -------------->
+    get_memcfg_from_srat: assigning address to rsdp
+    RSD PTR  v0 [Nvidia]
+    BUG: Int 14: CR2 ffd00040
+         EDI 8092fbfe  ESI ffd00040  EBP 80b0aee8  ESP 80b0aed0
+         EBX 000f76f0  EDX 0000000e  ECX 00000003  EAX ffd00040
+         err 00000000  EIP 802c055a   CS 00000060  flg 00010006
+    Stack: ffd00040 80bc78d0 80b0af6c 80b1dbfe 8093d8ba 00000008 80b42810 80b4ddb4
+           80b42842 00000000 80b0af1c 801079c8 808e724e 00000000 80b42871 802c0531
+           00000100 00000000 0003fff0 80b0af40 80129999 00040100 00040100 00000000
+    Pid: 0, comm: swapper Not tainted 2.6.26-rc4-sched-devel.git #570
+     [<802c055a>] ? strncmp+0x11/0x25
+     [<80b1dbfe>] ? get_memcfg_from_srat+0xb4/0x568
+     [<801079c8>] ? mcount_call+0x5/0x9
+     [<802c0531>] ? strcmp+0xa/0x22
+     [<80129999>] ? printk+0x38/0x3a
+     [<80129999>] ? printk+0x38/0x3a
+     [<8011b122>] ? memory_present+0x66/0x6f
+     [<80b216b4>] ? setup_memory+0x13/0x40c
+     [<80b16b47>] ? propagate_e820_map+0x80/0x97
+     [<80b1622a>] ? setup_arch+0x248/0x477
+     [<80129999>] ? printk+0x38/0x3a
+     [<80b11759>] ? start_kernel+0x6e/0x2eb
+     [<80b110fc>] ? i386_start_kernel+0xeb/0xf2
+     =======================
+    <------
+    
+    with this config:
+    
+       http://redhat.com/~mingo/misc/config-Wed_May_28_01_33_33_CEST_2008.bad
+    
+    The thing is, the crash makes little sense at first sight. We crash on a
+    benign-looking printk. The code around it got changed in -tip but
+    checking those topic branches individually did not reproduce the bug.
+    
+    Bisection led to this commit:
+    
+    |   d5edbc1f75420935b1ec7e65df10c8f81cea82de is first bad commit
+    |   commit d5edbc1f75420935b1ec7e65df10c8f81cea82de
+    |   Author: Jeremy Fitzhardinge <jeremy at goop.org>
+    |   Date:   Mon May 26 23:31:22 2008 +0100
+    |
+    |   xen: add p2m mfn_list_list
+    
+    Which is somewhat surprising, as on native hardware Xen client side
+    should have little to no side-effects.
+    
+    After some head scratching, it turns out the following happened:
+    randconfig enabled the following Xen options:
+    
+      CONFIG_XEN=y
+      CONFIG_XEN_MAX_DOMAIN_MEMORY=8
+      # CONFIG_XEN_BLKDEV_FRONTEND is not set
+      # CONFIG_XEN_NETDEV_FRONTEND is not set
+      CONFIG_HVC_XEN=y
+      # CONFIG_XEN_BALLOON is not set
+    
+    which activated this piece of code in arch/x86/xen/mmu.c:
+    
+    > @@ -69,6 +69,13 @@
+    >  	__attribute__((section(".data.page_aligned"))) =
+    >  		{ [ 0 ... TOP_ENTRIES - 1] = &p2m_missing[0] };
+    >
+    > +/* Arrays of p2m arrays expressed in mfns used for save/restore */
+    > +static unsigned long p2m_top_mfn[TOP_ENTRIES]
+    > +	__attribute__((section(".bss.page_aligned")));
+    > +
+    > +static unsigned long p2m_top_mfn_list[TOP_ENTRIES / P2M_ENTRIES_PER_PAGE]
+    > +	__attribute__((section(".bss.page_aligned")));
+    
+    The problem is, you must only put variables into .bss.page_aligned that
+    have a _size_ that is _exactly_ page aligned. In this case the size of
+    p2m_top_mfn_list is not page aligned:
+    
+     80b8d000 b p2m_top_mfn
+     80b8f000 b p2m_top_mfn_list
+     80b8f008 b softirq_stack
+     80b97008 b hardirq_stack
+     80b9f008 b bm_pte
+    
+    So all subsequent variables get unaligned which, depending on luck,
+    breaks the kernel in various funny ways. In this case what killed the
+    kernel first was the misaligned bootmap pte page, resulting in that
+    creative crash above.
+    
+    Anyway, this was a fun bug to track down :-)
+    
+    I think the moral is that .bss.page_aligned is a dangerous construct in
+    its current form, and the symptoms of breakage are very non-trivial, so
+    i think we need build-time checks to make sure all symbols in
+    .bss.page_aligned are truly page aligned.
+    
+    The Xen fix below gets the kernel booting again.
+    
+    Signed-off-by: Ingo Molnar <mingo at elte.hu>
+
+commit 359cdd3f866b6219a6729e313faf2221397f3278
+Author: Jeremy Fitzhardinge <jeremy at goop.org>
+Date:   Mon May 26 23:31:28 2008 +0100
+
+    xen: maintain clock offset over save/restore
+    
+    Hook into the device model to make sure that timekeeping's resume handler
+    is called.  This deals with our clocksource's non-monotonicity over the
+    save/restore.  Explicitly call clock_has_changed() to make sure that
+    all the timers get retriggered properly.
+    
+    Signed-off-by: Jeremy Fitzhardinge <jeremy.fitzhardinge at citrix.com>
+    Signed-off-by: Thomas Gleixner <tglx at linutronix.de>
+
+commit 0e91398f2a5d4eb6b07df8115917d0d1cf3e9b58
+Author: Jeremy Fitzhardinge <jeremy at goop.org>
+Date:   Mon May 26 23:31:27 2008 +0100
+
+    xen: implement save/restore
+    
+    This patch implements Xen save/restore and migration.
+    
+    Saving is triggered via xenbus, which is polled in
+    drivers/xen/manage.c.  When a suspend request comes in, the kernel
+    prepares itself for saving by:
+    
+    1 - Freeze all processes.  This is primarily to prevent any
+        partially-completed pagetable updates from confusing the suspend
+        process.  If CONFIG_PREEMPT isn't defined, then this isn't necessary.
+    
+    2 - Suspend xenbus and other devices
+    
+    3 - Stop_machine, to make sure all the other vcpus are quiescent.  The
+        Xen tools require the domain to run its save off vcpu0.
+    
+    4 - Within the stop_machine state, it pins any unpinned pgds (under
+        construction or destruction), canonicalizes various other
+        pieces of state (mostly converting mfns to pfns), and finally
+    
+    5 - Suspend the domain
+    
+    Restore reverses the steps used to save the domain, ending when all
+    the frozen processes are thawed.
+    
+    Signed-off-by: Jeremy Fitzhardinge <jeremy.fitzhardinge at citrix.com>
+    Signed-off-by: Thomas Gleixner <tglx at linutronix.de>
+
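+    In outline, the suspend path in drivers/xen/manage.c follows the
+    steps above roughly like this (hypothetical sketch; the helper names
+    here are illustrative, not the exact upstream code):
+
+      static void do_suspend(void)
+      {
+              /* 1: freeze userspace so no pagetable updates are in
+                    flight while state is canonicalized. */
+              freeze_processes();
+
+              /* 2: quiesce xenbus and the other frontend devices. */
+              suspend_devices();
+
+              /* 3+4+5: on vcpu0, with all other vcpus stopped, pin any
+                 unpinned pgds, convert mfns to pfns, then suspend. */
+              stop_machine_run(xen_suspend, NULL, 0);
+
+              /* Restore: undo the above in reverse order. */
+              resume_devices();
+              thaw_processes();
+      }
+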
+commit 7d88d32a4670af583c896e5ecd3929b78538ca62
+Author: Jeremy Fitzhardinge <jeremy at goop.org>
+Date:   Mon May 26 23:31:26 2008 +0100
+
+    xenbus: rebind irq on restore
+    
+    When restoring, rebind the existing xenbus irq to the new xenbus event
+    channel.  (It turns out in practice that this is always the same, and
+    is never updated on restore.  That's a bug, but Xeno-linux has been
+    like this for a long time, so it can't really be fixed.)
+    
+    Signed-off-by: Jeremy Fitzhardinge <jeremy.fitzhardinge at citrix.com>
+    Signed-off-by: Thomas Gleixner <tglx at linutronix.de>
+
+commit 6b9b732d0e396a3f1a95977162a8624aafce38a1
+Author: Jeremy Fitzhardinge <jeremy at goop.org>
+Date:   Mon May 26 23:31:25 2008 +0100
+
+    xen-console: add save/restore
+    
+    Add code to:
+    
+     1. Deal with the console page being canonicalized.  During save, the
+        console's mfn in the start_info structure is canonicalized to a pfn.
+        In order to deal with that, we always use a copy of the pfn and
+        indirect off that all the time.  However, we fall back to using the
+        mfn if the pfn hasn't been initialized yet.
+    
+     2. Restore the console event channel, and rebind it to the existing irq.
+    
+    Signed-off-by: Jeremy Fitzhardinge <jeremy.fitzhardinge at citrix.com>
+    Signed-off-by: Thomas Gleixner <tglx at linutronix.de>
+
+commit 0f2287ad7c61f10b2a22a06e2a66cdbbbfc44ad0
+Author: Jeremy Fitzhardinge <jeremy at goop.org>
+Date:   Mon May 26 23:31:24 2008 +0100
+
+    xen: fix unbind_from_irq()
+    
+    Rearrange the tests in unbind_from_irq() so that we can still unbind
+    an irq even if the underlying event channel is bad.  This allows a
+    device driver to shuffle its irqs on save/restore before the
+    underlying event channels have been fixed up.
+    
+    Signed-off-by: Jeremy Fitzhardinge <jeremy.fitzhardinge at citrix.com>
+    Signed-off-by: Thomas Gleixner <tglx at linutronix.de>
+
+commit eb1e305f4ef201e549ffd475b7dcbcd4ec36d7dc
+Author: Jeremy Fitzhardinge <jeremy at goop.org>
+Date:   Mon May 26 23:31:23 2008 +0100
+
+    xen: add rebind_evtchn_irq
+    
+    Add rebind_evtchn_irq(), which will rebind a device driver's existing
+    irq to a new event channel on restore.  Since the new event channel
+    will be masked and bound to vcpu0, we update the state accordingly and
+    unmask the irq once everything is set up.
+    
+    Signed-off-by: Jeremy Fitzhardinge <jeremy.fitzhardinge at citrix.com>
+    Signed-off-by: Thomas Gleixner <tglx at linutronix.de>
+
+commit d5edbc1f75420935b1ec7e65df10c8f81cea82de
+Author: Jeremy Fitzhardinge <jeremy at goop.org>
+Date:   Mon May 26 23:31:22 2008 +0100
+
+    xen: add p2m mfn_list_list
+    
+    When saving a domain, the Xen tools need to remap all our mfns to
+    portable pfns.  In order to remap our p2m table, it needs to know
+    where all its pages are, so maintain the references to the p2m table
+    for it to use.
+    
+    Signed-off-by: Jeremy Fitzhardinge <jeremy.fitzhardinge at citrix.com>
+    Signed-off-by: Thomas Gleixner <tglx at linutronix.de>
+
+commit a0d695c821544947342a2d372ec4108bc813b979
+Author: Jeremy Fitzhardinge <jeremy at goop.org>
+Date:   Mon May 26 23:31:21 2008 +0100
+
+    xen: make dummy_shared_info non-static
+    
+    Rename dummy_shared_info to xen_dummy_shared_info and make it
+    non-static, in anticipation of users outside of enlighten.c
+    
+    Signed-off-by: Jeremy Fitzhardinge <jeremy.fitzhardinge at citrix.com>
+    Signed-off-by: Thomas Gleixner <tglx at linutronix.de>
+
+commit cf0923ea295ba08ae656ef04164a43cb6553ba99
+Author: Jeremy Fitzhardinge <jeremy at goop.org>
+Date:   Mon May 26 23:31:20 2008 +0100
+
+    xen: efficiently support a holey p2m table
+    
+    When using sparsemem and memory hotplug, the kernel's pseudo-physical
+    address space can be discontiguous.  Previously this was dealt with by
+    having the upper parts of the radix tree stubbed off.  Unfortunately,
+    this is incompatible with save/restore, which requires a complete p2m
+    table.
+    
+    The solution is to have a special distinguished all-invalid p2m leaf
+    page, which we can point all the hole areas at.  This allows the tools
+    to see a complete p2m table, but it only costs a page for all memory
+    holes.
+    
+    It also simplifies the code since it removes a few special cases.
+    
+    Signed-off-by: Jeremy Fitzhardinge <jeremy.fitzhardinge at citrix.com>
+    Signed-off-by: Thomas Gleixner <tglx at linutronix.de>
+
+commit 8006ec3e911f93d702e1d4a4e387e244ab434924
+Author: Jeremy Fitzhardinge <jeremy at goop.org>
+Date:   Mon May 26 23:31:19 2008 +0100
+
+    xen: add configurable max domain size
+    
+    Add a config option to set the max size of a Xen domain.  This is used
+    to scale the size of the physical-to-machine array; it ends up using
+    around 1 page/GByte, so there's no reason to be very restrictive.
+    
+    For a 32-bit guest, the default value of 8GB is probably sufficient;
+    there's not much point in giving a 32-bit machine much more memory
+    than that.
+    
+    Signed-off-by: Jeremy Fitzhardinge <jeremy.fitzhardinge at citrix.com>
+    Signed-off-by: Thomas Gleixner <tglx at linutronix.de>
+
+commit d451bb7aa852627bdf7be7937dc3d9d9f261b235
+Author: Jeremy Fitzhardinge <jeremy at goop.org>
+Date:   Mon May 26 23:31:18 2008 +0100
+
+    xen: make phys_to_machine structure dynamic
+    
+    We now support the use of memory hotplug, so the physical to machine
+    page mapping structure must be dynamic.  This is implemented as a
+    two-level radix tree structure, which allows us to efficiently
+    incrementally allocate memory for the p2m table as new pages are
+    added.
+    
+    Signed-off-by: Jeremy Fitzhardinge <jeremy.fitzhardinge at citrix.com>
+    Signed-off-by: Thomas Gleixner <tglx at linutronix.de>
+
+commit 955d6f1778da5a9795f2dfb07f760006f194609a
+Author: Adrian Bunk <bunk at kernel.org>
+Date:   Mon May 26 23:31:17 2008 +0100
+
+    xen: drivers/xen/balloon.c: make a function static
+    
+    Make the needlessly global balloon_set_new_target() static.
+    
+    Signed-off-by: Adrian Bunk <bunk at kernel.org>
+    Acked-by: Chris Wright <chrisw at sous-sol.org>
+    Signed-off-by: Andrew Morton <akpm at linux-foundation.org>
+    Signed-off-by: Thomas Gleixner <tglx at linutronix.de>
+
+commit 38bb5ab4179572f4d24d3ca7188172a31ca51a69
+Author: Jeremy Fitzhardinge <jeremy at goop.org>
+Date:   Mon May 26 23:31:16 2008 +0100
+
+    xen: count resched interrupts properly
+    
+    Make sure resched interrupts appear in /proc/interrupts in the proper
+    place.
+    
+    Signed-off-by: Jeremy Fitzhardinge <jeremy.fitzhardinge at citrix.com>
+    Signed-off-by: Thomas Gleixner <tglx at linutronix.de>
+
+commit bfdab126cfa6fe3c2ddb8b6007a38202b510b6c1
+Author: Isaku Yamahata <yamahata at valinux.co.jp>
+Date:   Mon May 26 23:31:15 2008 +0100
+
+    xen: add missing definitions in include/xen/interface/memory.h which ia64/xen needs
+    
+    Add the xen handle related definitions for xen memory which ia64/xen needs.
+    Pointer arguments for ia64/xen hypercalls are passed as pseudo-physical
+    addresses (guest physical addresses), so guest kernel virtual addresses
+    must be converted into pseudo-physical addresses.
+    The xen guest handle represents such arguments.
+    
+    Signed-off-by: Isaku Yamahata <yamahata at valinux.co.jp>
+    Signed-off-by: Jeremy Fitzhardinge <jeremy.fitzhardinge at citrix.com>
+    Signed-off-by: Thomas Gleixner <tglx at linutronix.de>
+
+commit a90971ebddc81330f59203dee9803512aa4e2ef6
+Author: Isaku Yamahata <yamahata at valinux.co.jp>
+Date:   Mon May 26 23:31:14 2008 +0100
+
+    xen: compilation fix to balloon driver for ia64 support
+    
+    Fix a compilation error of the balloon driver on ia64.
+    The extent_start member is a pointer argument. On x86, pointer arguments
+    for xen hypercalls are passed as virtual addresses.
+    On ia64 and ppc, on the other hand, pointer arguments are passed as
+    pseudo-physical addresses (guest physical addresses).
+    So they must be passed as handles and converted right before issuing the
+    hypercall.
+    
+      CC      drivers/xen/balloon.o
+    linux-2.6-x86/drivers/xen/balloon.c: In function 'increase_reservation':
+    linux-2.6-x86/drivers/xen/balloon.c:228: error: incompatible types in assignment
+    linux-2.6-x86/drivers/xen/balloon.c: In function 'decrease_reservation':
+    linux-2.6-x86/drivers/xen/balloon.c:324: error: incompatible types in assignment
+    linux-2.6-x86/drivers/xen/balloon.c: In function 'dealloc_pte_fn':
+    linux-2.6-x86/drivers/xen/balloon.c:486: error: incompatible types in assignment
+    linux-2.6-x86/drivers/xen/balloon.c: In function 'alloc_empty_pages_and_pagevec':
+    linux-2.6-x86/drivers/xen/balloon.c:522: error: incompatible types in assignment
+    make[2]: *** [drivers/xen/balloon.o] Error 1
+    
+    Signed-off-by: Isaku Yamahata <yamahata at valinux.co.jp>
+    Signed-off-by: Jeremy Fitzhardinge <jeremy.fitzhardinge at citrix.com>
+    Signed-off-by: Thomas Gleixner <tglx at linutronix.de>
+
+commit ec9b2065d4d3b797604c09a569083dd9ff951b1b
+Author: Isaku Yamahata <yamahata at valinux.co.jp>
+Date:   Mon May 26 23:31:13 2008 +0100
+
+    xen: Move manage.c to drivers/xen for ia64/xen support
+    
+    Move arch/x86/xen/manage.c under drivers/xen/ to share code
+    between x86 and ia64.
+    ia64/xen also uses manage.c.
+    
+    Signed-off-by: Isaku Yamahata <yamahata at valinux.co.jp>
+    Signed-off-by: Jeremy Fitzhardinge <jeremy.fitzhardinge at citrix.com>
+    Signed-off-by: Thomas Gleixner <tglx at linutronix.de>
+
+commit 83abc70a4c6e306f4c1672e25884322f797e4fcb
+Author: Jeremy Fitzhardinge <jeremy at goop.org>
+Date:   Mon May 26 23:31:12 2008 +0100
+
+    xen: make earlyprintk=xen work again
+    
+    For some perverse reason, if you call add_preferred_console() it prevents
+    setup_early_printk() from successfully enabling the boot console -
+    unless you make it a preferred console too...
+    
+    Also, make xenboot console output distinct from normal console output,
+    since it gets repeated when the console handover happens, and the
+    duplicated output is confusing without disambiguation.
+    
+    Signed-off-by: Jeremy Fitzhardinge <jeremy.fitzhardinge at citrix.com>
+    Signed-off-by: Thomas Gleixner <tglx at linutronix.de>
+    Cc: Markus Armbruster <armbru at redhat.com>
+    Cc: Gerd Hoffmann <kraxel at redhat.com>
+
+commit e4dcff1f6e7582f76c2c9990b1d9111bbc8e26ef
+Author: Markus Armbruster <armbru at redhat.com>
+Date:   Mon May 26 23:31:11 2008 +0100
+
+    xen pvfb: Dynamic mode support (screen resizing)
+    
+    The pvfb backend indicates dynamic mode support by creating node
+    feature_resize with a non-zero value in its xenstore directory.
+    xen-fbfront sends a resize notification event on mode change.  Fully
+    backwards compatible both ways.
+    
+    Framebuffer size and initial resolution can be controlled through
+    kernel parameter xen_fbfront.video.  The backend enforces a separate
+    size limit, which it advertises in node videoram in its xenstore
+    directory.
+    
+    xen-kbdfront gets the maximum screen resolution from nodes width and
+    height in the backend's xenstore directory instead of hardcoding it.
+    
+    Additional goodie: support for larger framebuffers (512M on a 64-bit
+    system with 4K pages).
+    
+    Changing the number of bits per pixel dynamically is not supported
+    yet.
+    
+    Ported from
+    http://xenbits.xensource.com/linux-2.6.18-xen.hg?rev/92f7b3144f41
+    http://xenbits.xensource.com/linux-2.6.18-xen.hg?rev/bfc040135633
+    
+    Signed-off-by: Pat Campbell <plc at novell.com>
+    Signed-off-by: Markus Armbruster <armbru at redhat.com>
+    Signed-off-by: Jeremy Fitzhardinge <jeremy.fitzhardinge at citrix.com>
+    Signed-off-by: Thomas Gleixner <tglx at linutronix.de>
+
+commit f4ad1ebd7a0fae2782ef9f76c0b94b536742c3e8
+Author: Markus Armbruster <armbru at redhat.com>
+Date:   Mon May 26 23:31:10 2008 +0100
+
+    xen pvfb: Zero unused bytes in events sent to backend
+    
+    This isn't a security flaw (the backend can see all our memory
+    anyway).  But it's the right thing to do all the same.
+    
+    Signed-off-by: Markus Armbruster <armbru at redhat.com>
+    Signed-off-by: Jeremy Fitzhardinge <jeremy.fitzhardinge at citrix.com>
+    Signed-off-by: Thomas Gleixner <tglx at linutronix.de>
+
+commit 1e892c959da42278e60b21f5ecfd6fba0efff313
+Author: Markus Armbruster <armbru at redhat.com>
+Date:   Mon May 26 23:31:09 2008 +0100
+
+    xen pvfb: Module aliases to support module autoloading
+    
+    These are mostly for completeness and consistency with the other
+    frontends, as PVFB is typically compiled in rather than a module.
+    
+    Derived from
+    http://xenbits.xensource.com/linux-2.6.18-xen.hg?rev/5e294e29a43e
+    
+    While there, add module descriptions.
+    
+    Signed-off-by: Markus Armbruster <armbru at redhat.com>
+    Signed-off-by: Jeremy Fitzhardinge <jeremy.fitzhardinge at citrix.com>
+    Signed-off-by: Thomas Gleixner <tglx at linutronix.de>
+
+commit 6ba0e7b36c7cc1745b3cbeda244d14edae3ad058
+Author: Markus Armbruster <armbru at redhat.com>
+Date:   Mon May 26 23:31:08 2008 +0100
+
+    xen pvfb: Pointer z-axis (mouse wheel) support
+    
+    Add z-axis motion to pointer events.  Backward compatible, because
+    there's space for the z-axis in union xenkbd_in_event, and old
+    backends zero it.
+    
+    Derived from
+    http://xenbits.xensource.com/linux-2.6.18-xen.hg?rev/57dfe0098000
+    http://xenbits.xensource.com/linux-2.6.18-xen.hg?rev/1edfea26a2a9
+    http://xenbits.xensource.com/linux-2.6.18-xen.hg?rev/c3ff0b26f664
+    
+    Signed-off-by: Pat Campbell <plc at novell.com>
+    Signed-off-by: Markus Armbruster <armbru at redhat.com>
+    Signed-off-by: Jeremy Fitzhardinge <jeremy.fitzhardinge at citrix.com>
+    Signed-off-by: Thomas Gleixner <tglx at linutronix.de>
+
+commit 9e124fe16ff24746d6de5a2ad685266d7bce0e08
+Author: Markus Armbruster <armbru at redhat.com>
+Date:   Mon May 26 23:31:07 2008 +0100
+
+    xen: Enable console tty by default in domU if it's not a dummy
+    
+    Without console= arguments on the kernel command line, the first
+    console to register becomes enabled and the preferred console (the one
+    behind /dev/console).  This is normally tty (assuming
+    CONFIG_VT_CONSOLE is enabled, which it commonly is).
+    
+    This is okay as long as tty is a useful console.  But unless we have the
+    PV framebuffer, and it is enabled for this domain, tty0 in domU is
+    merely a dummy.  In that case, we want the preferred console to be the
+    Xen console hvc0, and we want it without having to fiddle with the
+    kernel command line.  Commit b8c2d3dfbc117dff26058fbac316b8acfc2cb5f7
+    did that for us.
+    
+    Since we now have the PV framebuffer, we want to enable and prefer tty
+    again, but only when PVFB is enabled.  But even then we still want to
+    enable the Xen console as well.
+    
+    Problem: when tty registers, we can't yet know whether the PVFB is
+    enabled.  By the time we can know (xenstore is up), the console setup
+    game is over.
+    
+    Solution: enable console tty by default, but keep hvc as the preferred
+    console.  Change the preferred console to tty when PVFB probes
+    successfully, unless we've been given console kernel parameters.
+    
+    Signed-off-by: Markus Armbruster <armbru at redhat.com>
+    Signed-off-by: Jeremy Fitzhardinge <jeremy.fitzhardinge at citrix.com>
+    Signed-off-by: Thomas Gleixner <tglx at linutronix.de>
+
+commit a15af1c9ea2750a9ff01e51615c45950bad8221b
+Author: Jeremy Fitzhardinge <jeremy at goop.org>
+Date:   Mon May 26 23:31:06 2008 +0100
+
+    x86/paravirt: add pte_flags to just get pte flags
+    
+    Add pte_flags() to extract the flags from a pte.  This is a special
+    case of pte_val() which is only guaranteed to return the pte's flags
+    correctly; the page number may be corrupted or missing.
+    
+    The intent is to allow paravirt implementations to return pte flags
+    without having to do any translation of the page number (most notably,
+    Xen).
+    
+    Signed-off-by: Jeremy Fitzhardinge <jeremy.fitzhardinge at citrix.com>
+    Signed-off-by: Thomas Gleixner <tglx at linutronix.de>
+
+commit 239d1fc04ed0b58d638096b12a7f6d50269d30c9
+Author: Jeremy Fitzhardinge <jeremy at goop.org>
+Date:   Mon May 26 23:31:05 2008 +0100
+
+    xen: don't worry about preempt during xen_irq_enable()
+    
+    When enabling interrupts, we don't need to worry about preemption,
+    because we either enter with interrupts disabled - so no preemption -
+    or the caller is confused and is re-enabling interrupts on some
+    indeterminate processor.
+    
+    Signed-off-by: Jeremy Fitzhardinge <jeremy.fitzhardinge at citrix.com>
+    Signed-off-by: Thomas Gleixner <tglx at linutronix.de>
+
+commit 2956a3511c8c5dccb1d4739ead17c7c3c23a24b7
+Author: Jeremy Fitzhardinge <jeremy at goop.org>
+Date:   Mon May 26 23:31:04 2008 +0100
+
+    xen: allow some cr4 updates
+    
+    The guest can legitimately change things like cr4.OSFXSR and
+    OSXMMEXCPT, so let it.
+    
+    Signed-off-by: Jeremy Fitzhardinge <jeremy.fitzhardinge at citrix.com>
+    Signed-off-by: Thomas Gleixner <tglx at linutronix.de>
+
+commit 349c709f42453707f74bece0d9d35ee5b3842893
+Author: Jeremy Fitzhardinge <jeremy at goop.org>
+Date:   Mon May 26 23:31:02 2008 +0100
+
+    xen: use new sched_op
+    
+    Use the new sched_op hypercall, mainly because xenner doesn't support
+    the old one.
+    
+    Signed-off-by: Jeremy Fitzhardinge <jeremy.fitzhardinge at citrix.com>
+    Signed-off-by: Thomas Gleixner <tglx at linutronix.de>
+
+commit 7b1333aa4cb546ddeb9c05098a53d9a777623a05
+Author: Jeremy Fitzhardinge <jeremy at goop.org>
+Date:   Mon May 26 23:31:01 2008 +0100
+
+    xen: use hypercall rather than clts
+    
+    Xen will trap and emulate clts, but it's better to use a hypercall.
+    Also, xenner doesn't handle clts.
+    
+    Signed-off-by: Jeremy Fitzhardinge <jeremy.fitzhardinge at citrix.com>
+    Signed-off-by: Thomas Gleixner <tglx at linutronix.de>
+
+commit 0922abdc3982ae54cbe1b24ac5aa91a260eca1bb
+Author: Jeremy Fitzhardinge <jeremy at goop.org>
+Date:   Mon May 26 23:31:00 2008 +0100
+
+    xen: make early console also write to debug console
+    
+    When using "earlyprintk=xen", also write the console output to the raw
+    debug console.  This will appear on dom0's console if the hypervisor
+    has been compiled to allow it.
+    
+    Signed-off-by: Jeremy Fitzhardinge <jeremy.fitzhardinge at citrix.com>
+    Signed-off-by: Thomas Gleixner <tglx at linutronix.de>
+
+commit 0acf10d8fbd52926217d3933d196b33fe2468f18
+Author: Jeremy Fitzhardinge <jeremy at goop.org>
+Date:   Mon May 26 23:30:59 2008 +0100
+
+    xen: add raw console write functions for debug
+    
+    Add a couple of functions which can write directly to the Xen console
+    for debugging.  This output ends up on the host's dom0 console
+    (assuming it allows the domain to write there).
+    
+    Signed-off-by: Jeremy Fitzhardinge <jeremy.fitzhardinge at citrix.com>
+    Signed-off-by: Thomas Gleixner <tglx at linutronix.de>
+
+commit 3843fc2575e3389f4f0ad0420a720240a5746a5d
+Author: Jeremy Fitzhardinge <jeremy at goop.org>
+Date:   Fri May 9 12:05:57 2008 +0100
+
+    xen: remove support for non-PAE 32-bit
+    
+    Non-PAE operation has been deprecated in Xen for a while, and is
+    rarely tested or used.  xen-unstable has now officially dropped
+    non-PAE support.  Since Xen/pvops' non-PAE support has also been
+    broken for a while, we may as well completely drop it altogether.
+    
+    Signed-off-by: Jeremy Fitzhardinge <jeremy.fitzhardinge at citrix.com>
+    Signed-off-by: Ingo Molnar <mingo at elte.hu>
+    Signed-off-by: Thomas Gleixner <tglx at linutronix.de>
+diff --git a/arch/x86/kernel/paravirt.c b/arch/x86/kernel/paravirt.c
+index 74f0c5e..f1ab0f7 100644
+--- a/arch/x86/kernel/paravirt.c
++++ b/arch/x86/kernel/paravirt.c
+@@ -380,6 +380,9 @@ struct pv_mmu_ops pv_mmu_ops = {
+ 	.pte_update = paravirt_nop,
+ 	.pte_update_defer = paravirt_nop,
+ 
++	.ptep_modify_prot_start = __ptep_modify_prot_start,
++	.ptep_modify_prot_commit = __ptep_modify_prot_commit,
++
+ #ifdef CONFIG_HIGHPTE
+ 	.kmap_atomic_pte = kmap_atomic,
+ #endif
+@@ -403,6 +406,7 @@ struct pv_mmu_ops pv_mmu_ops = {
+ #endif /* PAGETABLE_LEVELS >= 3 */
+ 
+ 	.pte_val = native_pte_val,
++	.pte_flags = native_pte_val,
+ 	.pgd_val = native_pgd_val,
+ 
+ 	.make_pte = native_make_pte,
+diff --git a/arch/x86/xen/Kconfig b/arch/x86/xen/Kconfig
+index 6c388e5..c2cc995 100644
+--- a/arch/x86/xen/Kconfig
++++ b/arch/x86/xen/Kconfig
+@@ -20,3 +20,13 @@ config XEN
+ 	select SYS_HYPERVISOR
+ 	help
+ 	  This is the /proc/xen interface used by Xen's libxc.
++
++config XEN_MAX_DOMAIN_MEMORY
++       int "Maximum allowed size of a domain in gigabytes"
++       default 8
++       depends on XEN
++       help
++         The pseudo-physical to machine address array is sized
++         according to the maximum possible memory size of a Xen
++         domain.  This array uses 1 page per gigabyte, so there's no
++         need to be too stingy here.
+\ No newline at end of file
+diff --git a/arch/x86/xen/Makefile b/arch/x86/xen/Makefile
+index 3d8df98..2ba2d16 100644
+--- a/arch/x86/xen/Makefile
++++ b/arch/x86/xen/Makefile
+@@ -1,4 +1,4 @@
+ obj-y		:= enlighten.o setup.o multicalls.o mmu.o \
+-			time.o manage.o xen-asm.o grant-table.o
++			time.o xen-asm.o grant-table.o suspend.o
+ 
+ obj-$(CONFIG_SMP)	+= smp.o
+diff --git a/arch/x86/xen/enlighten.c b/arch/x86/xen/enlighten.c
+index f09c1c6..bd74229 100644
+--- a/arch/x86/xen/enlighten.c
++++ b/arch/x86/xen/enlighten.c
+@@ -75,13 +75,13 @@ DEFINE_PER_CPU(unsigned long, xen_current_cr3);	 /* actual vcpu cr3 */
+ struct start_info *xen_start_info;
+ EXPORT_SYMBOL_GPL(xen_start_info);
+ 
+-static /* __initdata */ struct shared_info dummy_shared_info;
++struct shared_info xen_dummy_shared_info;
+ 
+ /*
+  * Point at some empty memory to start with. We map the real shared_info
+  * page as soon as fixmap is up and running.
+  */
+-struct shared_info *HYPERVISOR_shared_info = (void *)&dummy_shared_info;
++struct shared_info *HYPERVISOR_shared_info = (void *)&xen_dummy_shared_info;
+ 
+ /*
+  * Flag to determine whether vcpu info placement is available on all
+@@ -98,13 +98,13 @@ struct shared_info *HYPERVISOR_shared_info = (void *)&dummy_shared_info;
+  */
+ static int have_vcpu_info_placement = 1;
+ 
+-static void __init xen_vcpu_setup(int cpu)
++static void xen_vcpu_setup(int cpu)
+ {
+ 	struct vcpu_register_vcpu_info info;
+ 	int err;
+ 	struct vcpu_info *vcpup;
+ 
+-	BUG_ON(HYPERVISOR_shared_info == &dummy_shared_info);
++	BUG_ON(HYPERVISOR_shared_info == &xen_dummy_shared_info);
+ 	per_cpu(xen_vcpu, cpu) = &HYPERVISOR_shared_info->vcpu_info[cpu];
+ 
+ 	if (!have_vcpu_info_placement)
+@@ -136,11 +136,41 @@ static void __init xen_vcpu_setup(int cpu)
+ 	}
+ }
+ 
++/*
++ * On restore, set the vcpu placement up again.
++ * If it fails, then we're in a bad state, since
++ * we can't back out from using it...
++ */
++void xen_vcpu_restore(void)
++{
++	if (have_vcpu_info_placement) {
++		int cpu;
++
++		for_each_online_cpu(cpu) {
++			bool other_cpu = (cpu != smp_processor_id());
++
++			if (other_cpu &&
++			    HYPERVISOR_vcpu_op(VCPUOP_down, cpu, NULL))
++				BUG();
++
++			xen_vcpu_setup(cpu);
++
++			if (other_cpu &&
++			    HYPERVISOR_vcpu_op(VCPUOP_up, cpu, NULL))
++				BUG();
++		}
++
++		BUG_ON(!have_vcpu_info_placement);
++	}
++}
++
+ static void __init xen_banner(void)
+ {
+ 	printk(KERN_INFO "Booting paravirtualized kernel on %s\n",
+ 	       pv_info.name);
+-	printk(KERN_INFO "Hypervisor signature: %s\n", xen_start_info->magic);
++	printk(KERN_INFO "Hypervisor signature: %s%s\n",
++	       xen_start_info->magic,
++	       xen_feature(XENFEAT_mmu_pt_update_preserve_ad) ? " (preserve-AD)" : "");
+ }
+ 
+ static void xen_cpuid(unsigned int *ax, unsigned int *bx,
+@@ -235,13 +265,13 @@ static void xen_irq_enable(void)
+ {
+ 	struct vcpu_info *vcpu;
+ 
+-	/* There's a one instruction preempt window here.  We need to
+-	   make sure we're don't switch CPUs between getting the vcpu
+-	   pointer and updating the mask. */
+-	preempt_disable();
++	/* We don't need to worry about being preempted here, since
++	   either a) interrupts are disabled, so no preemption, or b)
++	   the caller is confused and is trying to re-enable interrupts
++	   on an indeterminate processor. */
++
+ 	vcpu = x86_read_percpu(xen_vcpu);
+ 	vcpu->evtchn_upcall_mask = 0;
+-	preempt_enable_no_resched();
+ 
+ 	/* Doesn't matter if we get preempted here, because any
+ 	   pending event will get dealt with anyway. */
+@@ -254,7 +284,7 @@ static void xen_irq_enable(void)
+ static void xen_safe_halt(void)
+ {
+ 	/* Blocking includes an implicit local_irq_enable(). */
+-	if (HYPERVISOR_sched_op(SCHEDOP_block, 0) != 0)
++	if (HYPERVISOR_sched_op(SCHEDOP_block, NULL) != 0)
+ 		BUG();
+ }
+ 
+@@ -607,6 +637,30 @@ static void xen_flush_tlb_others(const cpumask_t *cpus, struct mm_struct *mm,
+ 	xen_mc_issue(PARAVIRT_LAZY_MMU);
+ }
+ 
++static void xen_clts(void)
++{
++	struct multicall_space mcs;
++
++	mcs = xen_mc_entry(0);
++
++	MULTI_fpu_taskswitch(mcs.mc, 0);
++
++	xen_mc_issue(PARAVIRT_LAZY_CPU);
++}
++
++static void xen_write_cr0(unsigned long cr0)
++{
++	struct multicall_space mcs;
++
++	/* Only pay attention to cr0.TS; everything else is
++	   ignored. */
++	mcs = xen_mc_entry(0);
++
++	MULTI_fpu_taskswitch(mcs.mc, (cr0 & X86_CR0_TS) != 0);
++
++	xen_mc_issue(PARAVIRT_LAZY_CPU);
++}
++
+ static void xen_write_cr2(unsigned long cr2)
+ {
+ 	x86_read_percpu(xen_vcpu)->arch.cr2 = cr2;
+@@ -624,8 +678,10 @@ static unsigned long xen_read_cr2_direct(void)
+ 
+ static void xen_write_cr4(unsigned long cr4)
+ {
+-	/* Just ignore cr4 changes; Xen doesn't allow us to do
+-	   anything anyway. */
++	cr4 &= ~X86_CR4_PGE;
++	cr4 &= ~X86_CR4_PSE;
++
++	native_write_cr4(cr4);
+ }
+ 
+ static unsigned long xen_read_cr3(void)
+@@ -831,7 +887,7 @@ static __init void xen_pagetable_setup_start(pgd_t *base)
+ 			  PFN_DOWN(__pa(xen_start_info->pt_base)));
+ }
+ 
+-static __init void setup_shared_info(void)
++void xen_setup_shared_info(void)
+ {
+ 	if (!xen_feature(XENFEAT_auto_translated_physmap)) {
+ 		unsigned long addr = fix_to_virt(FIX_PARAVIRT_BOOTMAP);
+@@ -854,6 +910,8 @@ static __init void setup_shared_info(void)
+ 	/* In UP this is as good a place as any to set up shared info */
+ 	xen_setup_vcpu_info_placement();
+ #endif
++
++	xen_setup_mfn_list_list();
+ }
+ 
+ static __init void xen_pagetable_setup_done(pgd_t *base)
+@@ -866,15 +924,23 @@ static __init void xen_pagetable_setup_done(pgd_t *base)
+ 	pv_mmu_ops.release_pmd = xen_release_pmd;
+ 	pv_mmu_ops.set_pte = xen_set_pte;
+ 
+-	setup_shared_info();
++	xen_setup_shared_info();
+ 
+ 	/* Actually pin the pagetable down, but we can't set PG_pinned
+ 	   yet because the page structures don't exist yet. */
+ 	pin_pagetable_pfn(MMUEXT_PIN_L3_TABLE, PFN_DOWN(__pa(base)));
+ }
+ 
++static __init void xen_post_allocator_init(void)
++{
++	pv_mmu_ops.set_pmd = xen_set_pmd;
++	pv_mmu_ops.set_pud = xen_set_pud;
++
++	xen_mark_init_mm_pinned();
++}
++
+ /* This is called once we have the cpu_possible_map */
+-void __init xen_setup_vcpu_info_placement(void)
++void xen_setup_vcpu_info_placement(void)
+ {
+ 	int cpu;
+ 
+@@ -960,7 +1026,7 @@ static const struct pv_init_ops xen_init_ops __initdata = {
+ 	.banner = xen_banner,
+ 	.memory_setup = xen_memory_setup,
+ 	.arch_setup = xen_arch_setup,
+-	.post_allocator_init = xen_mark_init_mm_pinned,
++	.post_allocator_init = xen_post_allocator_init,
+ };
+ 
+ static const struct pv_time_ops xen_time_ops __initdata = {
+@@ -978,10 +1044,10 @@ static const struct pv_cpu_ops xen_cpu_ops __initdata = {
+ 	.set_debugreg = xen_set_debugreg,
+ 	.get_debugreg = xen_get_debugreg,
+ 
+-	.clts = native_clts,
++	.clts = xen_clts,
+ 
+ 	.read_cr0 = native_read_cr0,
+-	.write_cr0 = native_write_cr0,
++	.write_cr0 = xen_write_cr0,
+ 
+ 	.read_cr4 = native_read_cr4,
+ 	.read_cr4_safe = native_read_cr4_safe,
+@@ -1072,9 +1138,13 @@ static const struct pv_mmu_ops xen_mmu_ops __initdata = {
+ 
+ 	.set_pte = NULL,	/* see xen_pagetable_setup_* */
+ 	.set_pte_at = xen_set_pte_at,
+-	.set_pmd = xen_set_pmd,
++	.set_pmd = xen_set_pmd_hyper,
++
++	.ptep_modify_prot_start = __ptep_modify_prot_start,
++	.ptep_modify_prot_commit = __ptep_modify_prot_commit,
+ 
+ 	.pte_val = xen_pte_val,
++	.pte_flags = native_pte_val,
+ 	.pgd_val = xen_pgd_val,
+ 
+ 	.make_pte = xen_make_pte,
+@@ -1082,7 +1152,7 @@ static const struct pv_mmu_ops xen_mmu_ops __initdata = {
+ 
+ 	.set_pte_atomic = xen_set_pte_atomic,
+ 	.set_pte_present = xen_set_pte_at,
+-	.set_pud = xen_set_pud,
++	.set_pud = xen_set_pud_hyper,
+ 	.pte_clear = xen_pte_clear,
+ 	.pmd_clear = xen_pmd_clear,
+ 
+@@ -1114,11 +1184,13 @@ static const struct smp_ops xen_smp_ops __initdata = {
+ 
+ static void xen_reboot(int reason)
+ {
++	struct sched_shutdown r = { .reason = reason };
++
+ #ifdef CONFIG_SMP
+ 	smp_send_stop();
+ #endif
+ 
+-	if (HYPERVISOR_sched_op(SCHEDOP_shutdown, reason))
++	if (HYPERVISOR_sched_op(SCHEDOP_shutdown, &r))
+ 		BUG();
+ }
+ 
+@@ -1173,6 +1245,8 @@ asmlinkage void __init xen_start_kernel(void)
+ 
+ 	BUG_ON(memcmp(xen_start_info->magic, "xen-3", 5) != 0);
+ 
++	xen_setup_features();
++
+ 	/* Install Xen paravirt ops */
+ 	pv_info = xen_info;
+ 	pv_init_ops = xen_init_ops;
+@@ -1182,17 +1256,20 @@ asmlinkage void __init xen_start_kernel(void)
+ 	pv_apic_ops = xen_apic_ops;
+ 	pv_mmu_ops = xen_mmu_ops;
+ 
++	if (xen_feature(XENFEAT_mmu_pt_update_preserve_ad)) {
++		pv_mmu_ops.ptep_modify_prot_start = xen_ptep_modify_prot_start;
++		pv_mmu_ops.ptep_modify_prot_commit = xen_ptep_modify_prot_commit;
++	}
++
+ 	machine_ops = xen_machine_ops;
+ 
+ #ifdef CONFIG_SMP
+ 	smp_ops = xen_smp_ops;
+ #endif
+ 
+-	xen_setup_features();
+-
+ 	/* Get mfn list */
+ 	if (!xen_feature(XENFEAT_auto_translated_physmap))
+-		phys_to_machine_mapping = (unsigned long *)xen_start_info->mfn_list;
++		xen_build_dynamic_phys_to_machine();
+ 
+ 	pgd = (pgd_t *)xen_start_info->pt_base;
+ 
+@@ -1232,8 +1309,11 @@ asmlinkage void __init xen_start_kernel(void)
+ 		? __pa(xen_start_info->mod_start) : 0;
+ 	boot_params.hdr.ramdisk_size = xen_start_info->mod_len;
+ 
+-	if (!is_initial_xendomain())
++	if (!is_initial_xendomain()) {
++		add_preferred_console("xenboot", 0, NULL);
++		add_preferred_console("tty", 0, NULL);
+ 		add_preferred_console("hvc", 0, NULL);
++	}
+ 
+ 	/* Start the world */
+ 	start_kernel();
+diff --git a/arch/x86/xen/manage.c b/arch/x86/xen/manage.c
+deleted file mode 100644
+index aa7af9e..0000000
+--- a/arch/x86/xen/manage.c
++++ /dev/null
+@@ -1,143 +0,0 @@
+-/*
+- * Handle extern requests for shutdown, reboot and sysrq
+- */
+-#include <linux/kernel.h>
+-#include <linux/err.h>
+-#include <linux/reboot.h>
+-#include <linux/sysrq.h>
+-
+-#include <xen/xenbus.h>
+-
+-#define SHUTDOWN_INVALID  -1
+-#define SHUTDOWN_POWEROFF  0
+-#define SHUTDOWN_SUSPEND   2
+-/* Code 3 is SHUTDOWN_CRASH, which we don't use because the domain can only
+- * report a crash, not be instructed to crash!
+- * HALT is the same as POWEROFF, as far as we're concerned.  The tools use
+- * the distinction when we return the reason code to them.
+- */
+-#define SHUTDOWN_HALT      4
+-
+-/* Ignore multiple shutdown requests. */
+-static int shutting_down = SHUTDOWN_INVALID;
+-
+-static void shutdown_handler(struct xenbus_watch *watch,
+-			     const char **vec, unsigned int len)
+-{
+-	char *str;
+-	struct xenbus_transaction xbt;
+-	int err;
+-
+-	if (shutting_down != SHUTDOWN_INVALID)
+-		return;
+-
+- again:
+-	err = xenbus_transaction_start(&xbt);
+-	if (err)
+-		return;
+-
+-	str = (char *)xenbus_read(xbt, "control", "shutdown", NULL);
+-	/* Ignore read errors and empty reads. */
+-	if (XENBUS_IS_ERR_READ(str)) {
+-		xenbus_transaction_end(xbt, 1);
+-		return;
+-	}
+-
+-	xenbus_write(xbt, "control", "shutdown", "");
+-
+-	err = xenbus_transaction_end(xbt, 0);
+-	if (err == -EAGAIN) {
+-		kfree(str);
+-		goto again;
+-	}
+-
+-	if (strcmp(str, "poweroff") == 0 ||
+-	    strcmp(str, "halt") == 0)
+-		orderly_poweroff(false);
+-	else if (strcmp(str, "reboot") == 0)
+-		ctrl_alt_del();
+-	else {
+-		printk(KERN_INFO "Ignoring shutdown request: %s\n", str);
+-		shutting_down = SHUTDOWN_INVALID;
+-	}
+-
+-	kfree(str);
+-}
+-
+-static void sysrq_handler(struct xenbus_watch *watch, const char **vec,
+-			  unsigned int len)
+-{
+-	char sysrq_key = '\0';
+-	struct xenbus_transaction xbt;
+-	int err;
+-
+- again:
+-	err = xenbus_transaction_start(&xbt);
+-	if (err)
+-		return;
+-	if (!xenbus_scanf(xbt, "control", "sysrq", "%c", &sysrq_key)) {
+-		printk(KERN_ERR "Unable to read sysrq code in "
+-		       "control/sysrq\n");
+-		xenbus_transaction_end(xbt, 1);
+-		return;
+-	}
+-
+-	if (sysrq_key != '\0')
+-		xenbus_printf(xbt, "control", "sysrq", "%c", '\0');
+-
+-	err = xenbus_transaction_end(xbt, 0);
+-	if (err == -EAGAIN)
+-		goto again;
+-
+-	if (sysrq_key != '\0')
+-		handle_sysrq(sysrq_key, NULL);
+-}
+-
+-static struct xenbus_watch shutdown_watch = {
+-	.node = "control/shutdown",
+-	.callback = shutdown_handler
+-};
+-
+-static struct xenbus_watch sysrq_watch = {
+-	.node = "control/sysrq",
+-	.callback = sysrq_handler
+-};
+-
+-static int setup_shutdown_watcher(void)
+-{
+-	int err;
+-
+-	err = register_xenbus_watch(&shutdown_watch);
+-	if (err) {
+-		printk(KERN_ERR "Failed to set shutdown watcher\n");
+-		return err;
+-	}
+-
+-	err = register_xenbus_watch(&sysrq_watch);
+-	if (err) {
+-		printk(KERN_ERR "Failed to set sysrq watcher\n");
+-		return err;
+-	}
+-
+-	return 0;
+-}
+-
+-static int shutdown_event(struct notifier_block *notifier,
+-			  unsigned long event,
+-			  void *data)
+-{
+-	setup_shutdown_watcher();
+-	return NOTIFY_DONE;
+-}
+-
+-static int __init setup_shutdown_event(void)
+-{
+-	static struct notifier_block xenstore_notifier = {
+-		.notifier_call = shutdown_event
+-	};
+-	register_xenstore_notifier(&xenstore_notifier);
+-
+-	return 0;
+-}
+-
+-subsys_initcall(setup_shutdown_event);
+diff --git a/arch/x86/xen/mmu.c b/arch/x86/xen/mmu.c
+index df40bf7..f6b8225 100644
+--- a/arch/x86/xen/mmu.c
++++ b/arch/x86/xen/mmu.c
+@@ -56,6 +56,131 @@
+ #include "multicalls.h"
+ #include "mmu.h"
+ 
++#define P2M_ENTRIES_PER_PAGE	(PAGE_SIZE / sizeof(unsigned long))
++#define TOP_ENTRIES		(MAX_DOMAIN_PAGES / P2M_ENTRIES_PER_PAGE)
++
++/* Placeholder for holes in the address space */
++static unsigned long p2m_missing[P2M_ENTRIES_PER_PAGE]
++	__attribute__((section(".data.page_aligned"))) =
++		{ [ 0 ... P2M_ENTRIES_PER_PAGE-1 ] = ~0UL };
++
++ /* Array of pointers to pages containing p2m entries */
++static unsigned long *p2m_top[TOP_ENTRIES]
++	__attribute__((section(".data.page_aligned"))) =
++		{ [ 0 ... TOP_ENTRIES - 1] = &p2m_missing[0] };
++
++/* Arrays of p2m arrays expressed in mfns used for save/restore */
++static unsigned long p2m_top_mfn[TOP_ENTRIES]
++	__attribute__((section(".bss.page_aligned")));
++
++static unsigned long p2m_top_mfn_list[
++			PAGE_ALIGN(TOP_ENTRIES / P2M_ENTRIES_PER_PAGE)]
++	__attribute__((section(".bss.page_aligned")));
++
++static inline unsigned p2m_top_index(unsigned long pfn)
++{
++	BUG_ON(pfn >= MAX_DOMAIN_PAGES);
++	return pfn / P2M_ENTRIES_PER_PAGE;
++}
++
++static inline unsigned p2m_index(unsigned long pfn)
++{
++	return pfn % P2M_ENTRIES_PER_PAGE;
++}
++
++/* Build the parallel p2m_top_mfn structures */
++void xen_setup_mfn_list_list(void)
++{
++	unsigned pfn, idx;
++
++	for(pfn = 0; pfn < MAX_DOMAIN_PAGES; pfn += P2M_ENTRIES_PER_PAGE) {
++		unsigned topidx = p2m_top_index(pfn);
++
++		p2m_top_mfn[topidx] = virt_to_mfn(p2m_top[topidx]);
++	}
++
++	for(idx = 0; idx < ARRAY_SIZE(p2m_top_mfn_list); idx++) {
++		unsigned topidx = idx * P2M_ENTRIES_PER_PAGE;
++		p2m_top_mfn_list[idx] = virt_to_mfn(&p2m_top_mfn[topidx]);
++	}
++
++	BUG_ON(HYPERVISOR_shared_info == &xen_dummy_shared_info);
++
++	HYPERVISOR_shared_info->arch.pfn_to_mfn_frame_list_list =
++		virt_to_mfn(p2m_top_mfn_list);
++	HYPERVISOR_shared_info->arch.max_pfn = xen_start_info->nr_pages;
++}
++
++/* Set up p2m_top to point to the domain-builder provided p2m pages */
++void __init xen_build_dynamic_phys_to_machine(void)
++{
++	unsigned long *mfn_list = (unsigned long *)xen_start_info->mfn_list;
++	unsigned long max_pfn = min(MAX_DOMAIN_PAGES, xen_start_info->nr_pages);
++	unsigned pfn;
++
++	for(pfn = 0; pfn < max_pfn; pfn += P2M_ENTRIES_PER_PAGE) {
++		unsigned topidx = p2m_top_index(pfn);
++
++		p2m_top[topidx] = &mfn_list[pfn];
++	}
++}
++
++unsigned long get_phys_to_machine(unsigned long pfn)
++{
++	unsigned topidx, idx;
++
++	if (unlikely(pfn >= MAX_DOMAIN_PAGES))
++		return INVALID_P2M_ENTRY;
++
++	topidx = p2m_top_index(pfn);
++	idx = p2m_index(pfn);
++	return p2m_top[topidx][idx];
++}
++EXPORT_SYMBOL_GPL(get_phys_to_machine);
++
++static void alloc_p2m(unsigned long **pp, unsigned long *mfnp)
++{
++	unsigned long *p;
++	unsigned i;
++
++	p = (void *)__get_free_page(GFP_KERNEL | __GFP_NOFAIL);
++	BUG_ON(p == NULL);
++
++	for(i = 0; i < P2M_ENTRIES_PER_PAGE; i++)
++		p[i] = INVALID_P2M_ENTRY;
++
++	if (cmpxchg(pp, p2m_missing, p) != p2m_missing)
++		free_page((unsigned long)p);
++	else
++		*mfnp = virt_to_mfn(p);
++}
++
++void set_phys_to_machine(unsigned long pfn, unsigned long mfn)
++{
++	unsigned topidx, idx;
++
++	if (unlikely(xen_feature(XENFEAT_auto_translated_physmap))) {
++		BUG_ON(pfn != mfn && mfn != INVALID_P2M_ENTRY);
++		return;
++	}
++
++	if (unlikely(pfn >= MAX_DOMAIN_PAGES)) {
++		BUG_ON(mfn != INVALID_P2M_ENTRY);
++		return;
++	}
++
++	topidx = p2m_top_index(pfn);
++	if (p2m_top[topidx] == p2m_missing) {
++		/* no need to allocate a page to store an invalid entry */
++		if (mfn == INVALID_P2M_ENTRY)
++			return;
++		alloc_p2m(&p2m_top[topidx], &p2m_top_mfn[topidx]);
++	}
++
++	idx = p2m_index(pfn);
++	p2m_top[topidx][idx] = mfn;
++}
++
+ xmaddr_t arbitrary_virt_to_machine(unsigned long address)
+ {
+ 	unsigned int level;
+@@ -98,24 +223,60 @@ void make_lowmem_page_readwrite(void *vaddr)
+ }
+ 
+ 
+-void xen_set_pmd(pmd_t *ptr, pmd_t val)
++static bool page_pinned(void *ptr)
++{
++	struct page *page = virt_to_page(ptr);
++
++	return PagePinned(page);
++}
++
++static void extend_mmu_update(const struct mmu_update *update)
+ {
+ 	struct multicall_space mcs;
+ 	struct mmu_update *u;
+ 
+-	preempt_disable();
++	mcs = xen_mc_extend_args(__HYPERVISOR_mmu_update, sizeof(*u));
++
++	if (mcs.mc != NULL)
++		mcs.mc->args[1]++;
++	else {
++		mcs = __xen_mc_entry(sizeof(*u));
++		MULTI_mmu_update(mcs.mc, mcs.args, 1, NULL, DOMID_SELF);
++	}
+ 
+-	mcs = xen_mc_entry(sizeof(*u));
+ 	u = mcs.args;
+-	u->ptr = virt_to_machine(ptr).maddr;
+-	u->val = pmd_val_ma(val);
+-	MULTI_mmu_update(mcs.mc, u, 1, NULL, DOMID_SELF);
++	*u = *update;
++}
++
++void xen_set_pmd_hyper(pmd_t *ptr, pmd_t val)
++{
++	struct mmu_update u;
++
++	preempt_disable();
++
++	xen_mc_batch();
++
++	u.ptr = virt_to_machine(ptr).maddr;
++	u.val = pmd_val_ma(val);
++	extend_mmu_update(&u);
+ 
+ 	xen_mc_issue(PARAVIRT_LAZY_MMU);
+ 
+ 	preempt_enable();
+ }
+ 
++void xen_set_pmd(pmd_t *ptr, pmd_t val)
++{
++	/* If page is not pinned, we can just update the entry
++	   directly */
++	if (!page_pinned(ptr)) {
++		*ptr = val;
++		return;
++	}
++
++	xen_set_pmd_hyper(ptr, val);
++}
++
+ /*
+  * Associate a virtual page frame with a given physical page frame
+  * and protection flags for that frame.
+@@ -179,6 +340,26 @@ out:
+ 		preempt_enable();
+ }
+ 
++pte_t xen_ptep_modify_prot_start(struct mm_struct *mm, unsigned long addr, pte_t *ptep)
++{
++	/* Just return the pte as-is.  We preserve the bits on commit */
++	return *ptep;
++}
++
++void xen_ptep_modify_prot_commit(struct mm_struct *mm, unsigned long addr,
++				 pte_t *ptep, pte_t pte)
++{
++	struct mmu_update u;
++
++	xen_mc_batch();
++
++	u.ptr = virt_to_machine(ptep).maddr | MMU_PT_UPDATE_PRESERVE_AD;
++	u.val = pte_val_ma(pte);
++	extend_mmu_update(&u);
++
++	xen_mc_issue(PARAVIRT_LAZY_MMU);
++}
++
+ /* Assume pteval_t is equivalent to all the other *val_t types. */
+ static pteval_t pte_mfn_to_pfn(pteval_t val)
+ {
+@@ -229,24 +410,35 @@ pmdval_t xen_pmd_val(pmd_t pmd)
+ 	return pte_mfn_to_pfn(pmd.pmd);
+ }
+ 
+-void xen_set_pud(pud_t *ptr, pud_t val)
++void xen_set_pud_hyper(pud_t *ptr, pud_t val)
+ {
+-	struct multicall_space mcs;
+-	struct mmu_update *u;
++	struct mmu_update u;
+ 
+ 	preempt_disable();
+ 
+-	mcs = xen_mc_entry(sizeof(*u));
+-	u = mcs.args;
+-	u->ptr = virt_to_machine(ptr).maddr;
+-	u->val = pud_val_ma(val);
+-	MULTI_mmu_update(mcs.mc, u, 1, NULL, DOMID_SELF);
++	xen_mc_batch();
++
++	u.ptr = virt_to_machine(ptr).maddr;
++	u.val = pud_val_ma(val);
++	extend_mmu_update(&u);
+ 
+ 	xen_mc_issue(PARAVIRT_LAZY_MMU);
+ 
+ 	preempt_enable();
+ }
+ 
++void xen_set_pud(pud_t *ptr, pud_t val)
++{
++	/* If page is not pinned, we can just update the entry
++	   directly */
++	if (!page_pinned(ptr)) {
++		*ptr = val;
++		return;
++	}
++
++	xen_set_pud_hyper(ptr, val);
++}
++
+ void xen_set_pte(pte_t *ptep, pte_t pte)
+ {
+ 	ptep->pte_high = pte.pte_high;
+@@ -268,7 +460,7 @@ void xen_pte_clear(struct mm_struct *mm, unsigned long addr, pte_t *ptep)
+ 
+ void xen_pmd_clear(pmd_t *pmdp)
+ {
+-	xen_set_pmd(pmdp, __pmd(0));
++	set_pmd(pmdp, __pmd(0));
+ }
+ 
+ pmd_t xen_make_pmd(pmdval_t pmd)
+@@ -441,6 +633,29 @@ void xen_pgd_pin(pgd_t *pgd)
+ 	xen_mc_issue(0);
+ }
+ 
++/*
++ * On save, we need to pin all pagetables to make sure they get their
++ * mfns turned into pfns.  Search the list for any unpinned pgds and pin
++ * them (unpinned pgds are not currently in use, probably because the
++ * process is under construction or destruction).
++ */
++void xen_mm_pin_all(void)
++{
++	unsigned long flags;
++	struct page *page;
++
++	spin_lock_irqsave(&pgd_lock, flags);
++
++	list_for_each_entry(page, &pgd_list, lru) {
++		if (!PagePinned(page)) {
++			xen_pgd_pin((pgd_t *)page_address(page));
++			SetPageSavePinned(page);
++		}
++	}
++
++	spin_unlock_irqrestore(&pgd_lock, flags);
++}
++
+ /* The init_mm pagetable is really pinned as soon as its created, but
+    that's before we have page structures to store the bits.  So do all
+    the book-keeping now. */
+@@ -498,6 +713,29 @@ static void xen_pgd_unpin(pgd_t *pgd)
+ 	xen_mc_issue(0);
+ }
+ 
++/*
++ * On resume, undo any pinning done at save, so that the rest of the
++ * kernel doesn't see any unexpected pinned pagetables.
++ */
++void xen_mm_unpin_all(void)
++{
++	unsigned long flags;
++	struct page *page;
++
++	spin_lock_irqsave(&pgd_lock, flags);
++
++	list_for_each_entry(page, &pgd_list, lru) {
++		if (PageSavePinned(page)) {
++			BUG_ON(!PagePinned(page));
++			printk("unpinning pinned %p\n", page_address(page));
++			xen_pgd_unpin((pgd_t *)page_address(page));
++			ClearPageSavePinned(page);
++		}
++	}
++
++	spin_unlock_irqrestore(&pgd_lock, flags);
++}
++
+ void xen_activate_mm(struct mm_struct *prev, struct mm_struct *next)
+ {
+ 	spin_lock(&next->page_table_lock);
+@@ -591,7 +829,7 @@ void xen_exit_mmap(struct mm_struct *mm)
+ 	spin_lock(&mm->page_table_lock);
+ 
+ 	/* pgd may not be pinned in the error exit path of execve */
+-	if (PagePinned(virt_to_page(mm->pgd)))
++	if (page_pinned(mm->pgd))
+ 		xen_pgd_unpin(mm->pgd);
+ 
+ 	spin_unlock(&mm->page_table_lock);
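The p2m accessors added to arch/x86/xen/mmu.c above treat the phys-to-machine table as a two-level structure: p2m_top_index() picks a page of entries and p2m_index() picks the slot inside it, while holes all point at the shared p2m_missing page so lookups return INVALID_P2M_ENTRY instead of faulting. A minimal user-space sketch of that index split, assuming 4 KiB pages and an 8-byte unsigned long (512 entries per p2m page; a 32-bit build would have 1024):

#include <stdio.h>

#define PAGE_SIZE            4096UL
#define P2M_ENTRIES_PER_PAGE (PAGE_SIZE / sizeof(unsigned long))

int main(void)
{
        unsigned long pfn = 123456;                        /* illustrative pfn */
        unsigned long topidx = pfn / P2M_ENTRIES_PER_PAGE; /* which p2m page */
        unsigned long idx    = pfn % P2M_ENTRIES_PER_PAGE; /* slot within it */

        printf("pfn %lu -> p2m_top[%lu][%lu]\n", pfn, topidx, idx);
        return 0;
}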
+diff --git a/arch/x86/xen/mmu.h b/arch/x86/xen/mmu.h
+index 5fe961c..297bf9f 100644
+--- a/arch/x86/xen/mmu.h
++++ b/arch/x86/xen/mmu.h
+@@ -25,10 +25,6 @@ enum pt_level {
+ 
+ void set_pte_mfn(unsigned long vaddr, unsigned long pfn, pgprot_t flags);
+ 
+-void xen_set_pte(pte_t *ptep, pte_t pteval);
+-void xen_set_pte_at(struct mm_struct *mm, unsigned long addr,
+-		    pte_t *ptep, pte_t pteval);
+-void xen_set_pmd(pmd_t *pmdp, pmd_t pmdval);
+ 
+ void xen_activate_mm(struct mm_struct *prev, struct mm_struct *next);
+ void xen_dup_mmap(struct mm_struct *oldmm, struct mm_struct *mm);
+@@ -45,11 +41,19 @@ pte_t xen_make_pte(pteval_t);
+ pmd_t xen_make_pmd(pmdval_t);
+ pgd_t xen_make_pgd(pgdval_t);
+ 
++void xen_set_pte(pte_t *ptep, pte_t pteval);
+ void xen_set_pte_at(struct mm_struct *mm, unsigned long addr,
+ 		    pte_t *ptep, pte_t pteval);
+ void xen_set_pte_atomic(pte_t *ptep, pte_t pte);
++void xen_set_pmd(pmd_t *pmdp, pmd_t pmdval);
+ void xen_set_pud(pud_t *ptr, pud_t val);
++void xen_set_pmd_hyper(pmd_t *pmdp, pmd_t pmdval);
++void xen_set_pud_hyper(pud_t *ptr, pud_t val);
+ void xen_pte_clear(struct mm_struct *mm, unsigned long addr, pte_t *ptep);
+ void xen_pmd_clear(pmd_t *pmdp);
+ 
++pte_t xen_ptep_modify_prot_start(struct mm_struct *mm, unsigned long addr, pte_t *ptep);
++void  xen_ptep_modify_prot_commit(struct mm_struct *mm, unsigned long addr,
++				  pte_t *ptep, pte_t pte);
++
+ #endif	/* _XEN_MMU_H */
+diff --git a/arch/x86/xen/multicalls.c b/arch/x86/xen/multicalls.c
+index 5791eb2..3c63c4d 100644
+--- a/arch/x86/xen/multicalls.c
++++ b/arch/x86/xen/multicalls.c
+@@ -29,14 +29,14 @@
+ #define MC_DEBUG	1
+ 
+ #define MC_BATCH	32
+-#define MC_ARGS		(MC_BATCH * 16 / sizeof(u64))
++#define MC_ARGS		(MC_BATCH * 16)
+ 
+ struct mc_buffer {
+ 	struct multicall_entry entries[MC_BATCH];
+ #if MC_DEBUG
+ 	struct multicall_entry debug[MC_BATCH];
+ #endif
+-	u64 args[MC_ARGS];
++	unsigned char args[MC_ARGS];
+ 	struct callback {
+ 		void (*fn)(void *);
+ 		void *data;
+@@ -107,20 +107,48 @@ struct multicall_space __xen_mc_entry(size_t args)
+ {
+ 	struct mc_buffer *b = &__get_cpu_var(mc_buffer);
+ 	struct multicall_space ret;
+-	unsigned argspace = (args + sizeof(u64) - 1) / sizeof(u64);
++	unsigned argidx = roundup(b->argidx, sizeof(u64));
+ 
+ 	BUG_ON(preemptible());
+-	BUG_ON(argspace > MC_ARGS);
++	BUG_ON(b->argidx > MC_ARGS);
+ 
+ 	if (b->mcidx == MC_BATCH ||
+-	    (b->argidx + argspace) > MC_ARGS)
++	    (argidx + args) > MC_ARGS) {
+ 		xen_mc_flush();
++		argidx = roundup(b->argidx, sizeof(u64));
++	}
+ 
+ 	ret.mc = &b->entries[b->mcidx];
+ 	b->mcidx++;
++	ret.args = &b->args[argidx];
++	b->argidx = argidx + args;
++
++	BUG_ON(b->argidx > MC_ARGS);
++	return ret;
++}
++
++struct multicall_space xen_mc_extend_args(unsigned long op, size_t size)
++{
++	struct mc_buffer *b = &__get_cpu_var(mc_buffer);
++	struct multicall_space ret = { NULL, NULL };
++
++	BUG_ON(preemptible());
++	BUG_ON(b->argidx > MC_ARGS);
++
++	if (b->mcidx == 0)
++		return ret;
++
++	if (b->entries[b->mcidx - 1].op != op)
++		return ret;
++
++	if ((b->argidx + size) > MC_ARGS)
++		return ret;
++
++	ret.mc = &b->entries[b->mcidx - 1];
+ 	ret.args = &b->args[b->argidx];
+-	b->argidx += argspace;
++	b->argidx += size;
+ 
++	BUG_ON(b->argidx > MC_ARGS);
+ 	return ret;
+ }
+ 
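Switching the argument buffer from a u64 array to a byte array lets multicall entries carry argument blocks of any size while each block still starts on a u64 boundary. A small, runnable model of the allocation arithmetic now used by __xen_mc_entry() (buffer and request sizes are illustrative); the real code flushes the pending batch and retries instead of failing:

#include <stdio.h>
#include <stdint.h>
#include <stddef.h>

#define MC_ARGS (32 * 16)   /* MC_BATCH * 16 bytes, as in the patch */

static size_t argidx;       /* bytes already handed out */

/* Reserve 'size' bytes at the next u64-aligned offset, mirroring
 * roundup(b->argidx, sizeof(u64)) in __xen_mc_entry().  Returns the
 * offset, or (size_t)-1 on overflow. */
static size_t alloc_args(size_t size)
{
        size_t aligned = (argidx + sizeof(uint64_t) - 1) & ~(sizeof(uint64_t) - 1);

        if (aligned + size > MC_ARGS)
                return (size_t)-1;
        argidx = aligned + size;
        return aligned;
}

int main(void)
{
        printf("12-byte block at offset %zu\n", alloc_args(12)); /* 0 */
        printf(" 8-byte block at offset %zu\n", alloc_args(8));  /* 16, not 12 */
        printf("argidx is now %zu\n", argidx);                   /* 24 */
        return 0;
}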
+diff --git a/arch/x86/xen/multicalls.h b/arch/x86/xen/multicalls.h
+index 8bae996..8589382 100644
+--- a/arch/x86/xen/multicalls.h
++++ b/arch/x86/xen/multicalls.h
+@@ -45,4 +45,16 @@ static inline void xen_mc_issue(unsigned mode)
+ /* Set up a callback to be called when the current batch is flushed */
+ void xen_mc_callback(void (*fn)(void *), void *data);
+ 
++/*
++ * Try to extend the arguments of the previous multicall command.  The
++ * previous command's op must match.  If it does, then it attempts to
++ * extend the argument space allocated to the multicall entry by
++ * arg_size bytes.
++ *
++ * The returned multicall_space will return with mc pointing to the
++ * command on success, or NULL on failure, and args pointing to the
++ * newly allocated space.
++ */
++struct multicall_space xen_mc_extend_args(unsigned long op, size_t arg_size);
++
+ #endif /* _XEN_MULTICALLS_H */
+diff --git a/arch/x86/xen/setup.c b/arch/x86/xen/setup.c
+index 82517e4..4884478 100644
+--- a/arch/x86/xen/setup.c
++++ b/arch/x86/xen/setup.c
+@@ -16,6 +16,7 @@
+ #include <asm/xen/hypervisor.h>
+ #include <asm/xen/hypercall.h>
+ 
++#include <xen/page.h>
+ #include <xen/interface/callback.h>
+ #include <xen/interface/physdev.h>
+ #include <xen/features.h>
+@@ -27,8 +28,6 @@
+ extern const char xen_hypervisor_callback[];
+ extern const char xen_failsafe_callback[];
+ 
+-unsigned long *phys_to_machine_mapping;
+-EXPORT_SYMBOL(phys_to_machine_mapping);
+ 
+ /**
+  * machine_specific_memory_setup - Hook for machine specific memory setup.
+@@ -38,6 +37,8 @@ char * __init xen_memory_setup(void)
+ {
+ 	unsigned long max_pfn = xen_start_info->nr_pages;
+ 
++	max_pfn = min(MAX_DOMAIN_PAGES, max_pfn);
++
+ 	e820.nr_map = 0;
+ 	add_memory_region(0, LOWMEMSIZE(), E820_RAM);
+ 	add_memory_region(HIGH_MEMORY, PFN_PHYS(max_pfn)-HIGH_MEMORY, E820_RAM);
+diff --git a/arch/x86/xen/smp.c b/arch/x86/xen/smp.c
+index 94e6900..d2e3c20 100644
+--- a/arch/x86/xen/smp.c
++++ b/arch/x86/xen/smp.c
+@@ -35,7 +35,7 @@
+ #include "xen-ops.h"
+ #include "mmu.h"
+ 
+-static cpumask_t xen_cpu_initialized_map;
++cpumask_t xen_cpu_initialized_map;
+ static DEFINE_PER_CPU(int, resched_irq) = -1;
+ static DEFINE_PER_CPU(int, callfunc_irq) = -1;
+ static DEFINE_PER_CPU(int, debug_irq) = -1;
+@@ -65,6 +65,12 @@ static struct call_data_struct *call_data;
+  */
+ static irqreturn_t xen_reschedule_interrupt(int irq, void *dev_id)
+ {
++#ifdef CONFIG_X86_32
++	__get_cpu_var(irq_stat).irq_resched_count++;
++#else
++	add_pda(irq_resched_count, 1);
++#endif
++
+ 	return IRQ_HANDLED;
+ }
+ 
+diff --git a/arch/x86/xen/suspend.c b/arch/x86/xen/suspend.c
+new file mode 100644
+index 0000000..251669a
+--- /dev/null
++++ b/arch/x86/xen/suspend.c
+@@ -0,0 +1,45 @@
++#include <linux/types.h>
++
++#include <xen/interface/xen.h>
++#include <xen/grant_table.h>
++#include <xen/events.h>
++
++#include <asm/xen/hypercall.h>
++#include <asm/xen/page.h>
++
++#include "xen-ops.h"
++#include "mmu.h"
++
++void xen_pre_suspend(void)
++{
++	xen_start_info->store_mfn = mfn_to_pfn(xen_start_info->store_mfn);
++	xen_start_info->console.domU.mfn =
++		mfn_to_pfn(xen_start_info->console.domU.mfn);
++
++	BUG_ON(!irqs_disabled());
++
++	HYPERVISOR_shared_info = &xen_dummy_shared_info;
++	if (HYPERVISOR_update_va_mapping(fix_to_virt(FIX_PARAVIRT_BOOTMAP),
++					 __pte_ma(0), 0))
++		BUG();
++}
++
++void xen_post_suspend(int suspend_cancelled)
++{
++	xen_setup_shared_info();
++
++	if (suspend_cancelled) {
++		xen_start_info->store_mfn =
++			pfn_to_mfn(xen_start_info->store_mfn);
++		xen_start_info->console.domU.mfn =
++			pfn_to_mfn(xen_start_info->console.domU.mfn);
++	} else {
++#ifdef CONFIG_SMP
++		xen_cpu_initialized_map = cpu_online_map;
++#endif
++		xen_vcpu_restore();
++		xen_timer_resume();
++	}
++
++}
++
+diff --git a/arch/x86/xen/time.c b/arch/x86/xen/time.c
+index 41e2175..64f0038 100644
+--- a/arch/x86/xen/time.c
++++ b/arch/x86/xen/time.c
+@@ -459,6 +459,19 @@ void xen_setup_cpu_clockevents(void)
+ 	clockevents_register_device(&__get_cpu_var(xen_clock_events));
+ }
+ 
++void xen_timer_resume(void)
++{
++	int cpu;
++
++	if (xen_clockevent != &xen_vcpuop_clockevent)
++		return;
++
++	for_each_online_cpu(cpu) {
++		if (HYPERVISOR_vcpu_op(VCPUOP_stop_periodic_timer, cpu, NULL))
++			BUG();
++	}
++}
++
+ __init void xen_time_init(void)
+ {
+ 	int cpu = smp_processor_id();
+diff --git a/arch/x86/xen/xen-head.S b/arch/x86/xen/xen-head.S
+index 6ec3b4f..7c0cf63 100644
+--- a/arch/x86/xen/xen-head.S
++++ b/arch/x86/xen/xen-head.S
+@@ -7,6 +7,7 @@
+ #include <linux/init.h>
+ #include <asm/boot.h>
+ #include <xen/interface/elfnote.h>
++#include <asm/xen/interface.h>
+ 
+ 	__INIT
+ ENTRY(startup_xen)
+@@ -32,5 +33,9 @@ ENTRY(hypercall_page)
+ 	ELFNOTE(Xen, XEN_ELFNOTE_FEATURES,       .asciz "!writable_page_tables|pae_pgdir_above_4gb")
+ 	ELFNOTE(Xen, XEN_ELFNOTE_PAE_MODE,       .asciz "yes")
+ 	ELFNOTE(Xen, XEN_ELFNOTE_LOADER,         .asciz "generic")
++	ELFNOTE(Xen, XEN_ELFNOTE_L1_MFN_VALID,
++		.quad _PAGE_PRESENT; .quad _PAGE_PRESENT)
++	ELFNOTE(Xen, XEN_ELFNOTE_SUSPEND_CANCEL, .long 1)
++	ELFNOTE(Xen, XEN_ELFNOTE_HV_START_LOW,   .long __HYPERVISOR_VIRT_START)
+ 
+ #endif /*CONFIG_XEN */
+diff --git a/arch/x86/xen/xen-ops.h b/arch/x86/xen/xen-ops.h
+index f1063ae..9a05559 100644
+--- a/arch/x86/xen/xen-ops.h
++++ b/arch/x86/xen/xen-ops.h
+@@ -9,18 +9,26 @@
+ extern const char xen_hypervisor_callback[];
+ extern const char xen_failsafe_callback[];
+ 
++struct trap_info;
+ void xen_copy_trap_info(struct trap_info *traps);
+ 
+ DECLARE_PER_CPU(unsigned long, xen_cr3);
+ DECLARE_PER_CPU(unsigned long, xen_current_cr3);
+ 
+ extern struct start_info *xen_start_info;
++extern struct shared_info xen_dummy_shared_info;
+ extern struct shared_info *HYPERVISOR_shared_info;
+ 
++void xen_setup_mfn_list_list(void);
++void xen_setup_shared_info(void);
++
+ char * __init xen_memory_setup(void);
+ void __init xen_arch_setup(void);
+ void __init xen_init_IRQ(void);
+ void xen_enable_sysenter(void);
++void xen_vcpu_restore(void);
++
++void __init xen_build_dynamic_phys_to_machine(void);
+ 
+ void xen_setup_timer(int cpu);
+ void xen_setup_cpu_clockevents(void);
+@@ -29,6 +37,7 @@ void __init xen_time_init(void);
+ unsigned long xen_get_wallclock(void);
+ int xen_set_wallclock(unsigned long time);
+ unsigned long long xen_sched_clock(void);
++void xen_timer_resume(void);
+ 
+ irqreturn_t xen_debug_interrupt(int irq, void *dev_id);
+ 
+@@ -54,6 +63,8 @@ int xen_smp_call_function_single(int cpu, void (*func) (void *info), void *info,
+ int xen_smp_call_function_mask(cpumask_t mask, void (*func)(void *),
+ 			       void *info, int wait);
+ 
++extern cpumask_t xen_cpu_initialized_map;
++
+ 
+ /* Declare an asm function, along with symbols needed to make it
+    inlineable */
+diff --git a/drivers/char/hvc_xen.c b/drivers/char/hvc_xen.c
+index dd68f85..db2ae42 100644
+--- a/drivers/char/hvc_xen.c
++++ b/drivers/char/hvc_xen.c
+@@ -39,9 +39,14 @@ static int xencons_irq;
+ 
+ /* ------------------------------------------------------------------ */
+ 
++static unsigned long console_pfn = ~0ul;
++
+ static inline struct xencons_interface *xencons_interface(void)
+ {
+-	return mfn_to_virt(xen_start_info->console.domU.mfn);
++	if (console_pfn == ~0ul)
++		return mfn_to_virt(xen_start_info->console.domU.mfn);
++	else
++		return __va(console_pfn << PAGE_SHIFT);
+ }
+ 
+ static inline void notify_daemon(void)
+@@ -101,20 +106,32 @@ static int __init xen_init(void)
+ {
+ 	struct hvc_struct *hp;
+ 
+-	if (!is_running_on_xen())
+-		return 0;
++	if (!is_running_on_xen() ||
++	    is_initial_xendomain() ||
++	    !xen_start_info->console.domU.evtchn)
++		return -ENODEV;
+ 
+ 	xencons_irq = bind_evtchn_to_irq(xen_start_info->console.domU.evtchn);
+ 	if (xencons_irq < 0)
+-		xencons_irq = 0 /* NO_IRQ */;
++		xencons_irq = 0; /* NO_IRQ */
++
+ 	hp = hvc_alloc(HVC_COOKIE, xencons_irq, &hvc_ops, 256);
+ 	if (IS_ERR(hp))
+ 		return PTR_ERR(hp);
+ 
+ 	hvc = hp;
++
++	console_pfn = mfn_to_pfn(xen_start_info->console.domU.mfn);
++
+ 	return 0;
+ }
+ 
++void xen_console_resume(void)
++{
++	if (xencons_irq)
++		rebind_evtchn_irq(xen_start_info->console.domU.evtchn, xencons_irq);
++}
++
+ static void __exit xen_fini(void)
+ {
+ 	if (hvc)
+@@ -134,12 +151,28 @@ module_init(xen_init);
+ module_exit(xen_fini);
+ console_initcall(xen_cons_init);
+ 
++static void raw_console_write(const char *str, int len)
++{
++	while(len > 0) {
++		int rc = HYPERVISOR_console_io(CONSOLEIO_write, len, (char *)str);
++		if (rc <= 0)
++			break;
++
++		str += rc;
++		len -= rc;
++	}
++}
++
++#ifdef CONFIG_EARLY_PRINTK
+ static void xenboot_write_console(struct console *console, const char *string,
+ 				  unsigned len)
+ {
+ 	unsigned int linelen, off = 0;
+ 	const char *pos;
+ 
++	raw_console_write(string, len);
++
++	write_console(0, "(early) ", 8);
+ 	while (off < len && NULL != (pos = strchr(string+off, '\n'))) {
+ 		linelen = pos-string+off;
+ 		if (off + linelen > len)
+@@ -155,5 +188,23 @@ static void xenboot_write_console(struct console *console, const char *string,
+ struct console xenboot_console = {
+ 	.name		= "xenboot",
+ 	.write		= xenboot_write_console,
+-	.flags		= CON_PRINTBUFFER | CON_BOOT,
++	.flags		= CON_PRINTBUFFER | CON_BOOT | CON_ANYTIME,
+ };
++#endif	/* CONFIG_EARLY_PRINTK */
++
++void xen_raw_console_write(const char *str)
++{
++	raw_console_write(str, strlen(str));
++}
++
++void xen_raw_printk(const char *fmt, ...)
++{
++	static char buf[512];
++	va_list ap;
++
++	va_start(ap, fmt);
++	vsnprintf(buf, sizeof(buf), fmt, ap);
++	va_end(ap);
++
++	xen_raw_console_write(buf);
++}
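raw_console_write() above is the standard write-until-done loop for an interface that may consume only part of the buffer per call. The same pattern as a self-contained user-space analogue, with write(2) on stdout standing in for HYPERVISOR_console_io():

#include <string.h>
#include <unistd.h>

static void write_all(const char *str, size_t len)
{
        while (len > 0) {
                ssize_t rc = write(STDOUT_FILENO, str, len);
                if (rc <= 0)            /* error or nothing accepted: give up */
                        break;
                str += rc;              /* advance past what was consumed */
                len -= rc;
        }
}

int main(void)
{
        const char *msg = "hello from the write-all loop\n";
        write_all(msg, strlen(msg));
        return 0;
}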
+diff --git a/drivers/input/xen-kbdfront.c b/drivers/input/xen-kbdfront.c
+index 0f47f46..9ce3b3b 100644
+--- a/drivers/input/xen-kbdfront.c
++++ b/drivers/input/xen-kbdfront.c
+@@ -66,6 +66,9 @@ static irqreturn_t input_handler(int rq, void *dev_id)
+ 		case XENKBD_TYPE_MOTION:
+ 			input_report_rel(dev, REL_X, event->motion.rel_x);
+ 			input_report_rel(dev, REL_Y, event->motion.rel_y);
++			if (event->motion.rel_z)
++				input_report_rel(dev, REL_WHEEL,
++						 -event->motion.rel_z);
+ 			break;
+ 		case XENKBD_TYPE_KEY:
+ 			dev = NULL;
+@@ -84,6 +87,9 @@ static irqreturn_t input_handler(int rq, void *dev_id)
+ 		case XENKBD_TYPE_POS:
+ 			input_report_abs(dev, ABS_X, event->pos.abs_x);
+ 			input_report_abs(dev, ABS_Y, event->pos.abs_y);
++			if (event->pos.rel_z)
++				input_report_rel(dev, REL_WHEEL,
++						 -event->pos.rel_z);
+ 			break;
+ 		}
+ 		if (dev)
+@@ -152,7 +158,7 @@ static int __devinit xenkbd_probe(struct xenbus_device *dev,
+ 	ptr->evbit[0] = BIT(EV_KEY) | BIT(EV_REL) | BIT(EV_ABS);
+ 	for (i = BTN_LEFT; i <= BTN_TASK; i++)
+ 		set_bit(i, ptr->keybit);
+-	ptr->relbit[0] = BIT(REL_X) | BIT(REL_Y);
++	ptr->relbit[0] = BIT(REL_X) | BIT(REL_Y) | BIT(REL_WHEEL);
+ 	input_set_abs_params(ptr, ABS_X, 0, XENFB_WIDTH, 0, 0);
+ 	input_set_abs_params(ptr, ABS_Y, 0, XENFB_HEIGHT, 0, 0);
+ 
+@@ -294,6 +300,16 @@ InitWait:
+ 		 */
+ 		if (dev->state != XenbusStateConnected)
+ 			goto InitWait; /* no InitWait seen yet, fudge it */
++
++		/* Set input abs params to match backend screen res */
++		if (xenbus_scanf(XBT_NIL, info->xbdev->otherend,
++				 "width", "%d", &val) > 0)
++			input_set_abs_params(info->ptr, ABS_X, 0, val, 0, 0);
++
++		if (xenbus_scanf(XBT_NIL, info->xbdev->otherend,
++				 "height", "%d", &val) > 0)
++			input_set_abs_params(info->ptr, ABS_Y, 0, val, 0, 0);
++
+ 		break;
+ 
+ 	case XenbusStateClosing:
+@@ -337,4 +353,6 @@ static void __exit xenkbd_cleanup(void)
+ module_init(xenkbd_init);
+ module_exit(xenkbd_cleanup);
+ 
++MODULE_DESCRIPTION("Xen virtual keyboard/pointer device frontend");
+ MODULE_LICENSE("GPL");
++MODULE_ALIAS("xen:vkbd");
+diff --git a/drivers/lguest/lg.h b/drivers/lguest/lg.h
+index 005bd04..5faefea 100644
+--- a/drivers/lguest/lg.h
++++ b/drivers/lguest/lg.h
+@@ -136,7 +136,6 @@ int run_guest(struct lg_cpu *cpu, unsigned long __user *user);
+  * first step in the migration to the kernel types.  pte_pfn is already defined
+  * in the kernel. */
+ #define pgd_flags(x)	(pgd_val(x) & ~PAGE_MASK)
+-#define pte_flags(x)	(pte_val(x) & ~PAGE_MASK)
+ #define pgd_pfn(x)	(pgd_val(x) >> PAGE_SHIFT)
+ 
+ /* interrupts_and_traps.c: */
+diff --git a/drivers/video/xen-fbfront.c b/drivers/video/xen-fbfront.c
+index 619a6f8..47ed39b 100644
+--- a/drivers/video/xen-fbfront.c
++++ b/drivers/video/xen-fbfront.c
+@@ -18,6 +18,7 @@
+  * frame buffer.
+  */
+ 
++#include <linux/console.h>
+ #include <linux/kernel.h>
+ #include <linux/errno.h>
+ #include <linux/fb.h>
+@@ -42,37 +43,68 @@ struct xenfb_info {
+ 	struct xenfb_page	*page;
+ 	unsigned long 		*mfns;
+ 	int			update_wanted; /* XENFB_TYPE_UPDATE wanted */
++	int			feature_resize; /* XENFB_TYPE_RESIZE ok */
++	struct xenfb_resize	resize;		/* protected by resize_lock */
++	int			resize_dpy;	/* ditto */
++	spinlock_t		resize_lock;
+ 
+ 	struct xenbus_device	*xbdev;
+ };
+ 
+-static u32 xenfb_mem_len = XENFB_WIDTH * XENFB_HEIGHT * XENFB_DEPTH / 8;
++#define XENFB_DEFAULT_FB_LEN (XENFB_WIDTH * XENFB_HEIGHT * XENFB_DEPTH / 8)
+ 
++enum { KPARAM_MEM, KPARAM_WIDTH, KPARAM_HEIGHT, KPARAM_CNT };
++static int video[KPARAM_CNT] = { 2, XENFB_WIDTH, XENFB_HEIGHT };
++module_param_array(video, int, NULL, 0);
++MODULE_PARM_DESC(video,
++	"Video memory size in MB, width, height in pixels (default 2,800,600)");
++
++static void xenfb_make_preferred_console(void);
+ static int xenfb_remove(struct xenbus_device *);
+-static void xenfb_init_shared_page(struct xenfb_info *);
++static void xenfb_init_shared_page(struct xenfb_info *, struct fb_info *);
+ static int xenfb_connect_backend(struct xenbus_device *, struct xenfb_info *);
+ static void xenfb_disconnect_backend(struct xenfb_info *);
+ 
++static void xenfb_send_event(struct xenfb_info *info,
++			     union xenfb_out_event *event)
++{
++	u32 prod;
++
++	prod = info->page->out_prod;
++	/* caller ensures !xenfb_queue_full() */
++	mb();			/* ensure ring space available */
++	XENFB_OUT_RING_REF(info->page, prod) = *event;
++	wmb();			/* ensure ring contents visible */
++	info->page->out_prod = prod + 1;
++
++	notify_remote_via_irq(info->irq);
++}
++
+ static void xenfb_do_update(struct xenfb_info *info,
+ 			    int x, int y, int w, int h)
+ {
+ 	union xenfb_out_event event;
+-	u32 prod;
+ 
++	memset(&event, 0, sizeof(event));
+ 	event.type = XENFB_TYPE_UPDATE;
+ 	event.update.x = x;
+ 	event.update.y = y;
+ 	event.update.width = w;
+ 	event.update.height = h;
+ 
+-	prod = info->page->out_prod;
+ 	/* caller ensures !xenfb_queue_full() */
+-	mb();			/* ensure ring space available */
+-	XENFB_OUT_RING_REF(info->page, prod) = event;
+-	wmb();			/* ensure ring contents visible */
+-	info->page->out_prod = prod + 1;
++	xenfb_send_event(info, &event);
++}
+ 
+-	notify_remote_via_irq(info->irq);
++static void xenfb_do_resize(struct xenfb_info *info)
++{
++	union xenfb_out_event event;
++
++	memset(&event, 0, sizeof(event));
++	event.resize = info->resize;
++
++	/* caller ensures !xenfb_queue_full() */
++	xenfb_send_event(info, &event);
+ }
+ 
+ static int xenfb_queue_full(struct xenfb_info *info)
+@@ -84,12 +116,28 @@ static int xenfb_queue_full(struct xenfb_info *info)
+ 	return prod - cons == XENFB_OUT_RING_LEN;
+ }
+ 
++static void xenfb_handle_resize_dpy(struct xenfb_info *info)
++{
++	unsigned long flags;
++
++	spin_lock_irqsave(&info->resize_lock, flags);
++	if (info->resize_dpy) {
++		if (!xenfb_queue_full(info)) {
++			info->resize_dpy = 0;
++			xenfb_do_resize(info);
++		}
++	}
++	spin_unlock_irqrestore(&info->resize_lock, flags);
++}
++
+ static void xenfb_refresh(struct xenfb_info *info,
+ 			  int x1, int y1, int w, int h)
+ {
+ 	unsigned long flags;
+-	int y2 = y1 + h - 1;
+ 	int x2 = x1 + w - 1;
++	int y2 = y1 + h - 1;
++
++	xenfb_handle_resize_dpy(info);
+ 
+ 	if (!info->update_wanted)
+ 		return;
+@@ -222,6 +270,57 @@ static ssize_t xenfb_write(struct fb_info *p, const char __user *buf,
+ 	return res;
+ }
+ 
++static int
++xenfb_check_var(struct fb_var_screeninfo *var, struct fb_info *info)
++{
++	struct xenfb_info *xenfb_info;
++	int required_mem_len;
++
++	xenfb_info = info->par;
++
++	if (!xenfb_info->feature_resize) {
++		if (var->xres == video[KPARAM_WIDTH] &&
++		    var->yres == video[KPARAM_HEIGHT] &&
++		    var->bits_per_pixel == xenfb_info->page->depth) {
++			return 0;
++		}
++		return -EINVAL;
++	}
++
++	/* Can't resize past initial width and height */
++	if (var->xres > video[KPARAM_WIDTH] || var->yres > video[KPARAM_HEIGHT])
++		return -EINVAL;
++
++	required_mem_len = var->xres * var->yres * xenfb_info->page->depth / 8;
++	if (var->bits_per_pixel == xenfb_info->page->depth &&
++	    var->xres <= info->fix.line_length / (XENFB_DEPTH / 8) &&
++	    required_mem_len <= info->fix.smem_len) {
++		var->xres_virtual = var->xres;
++		var->yres_virtual = var->yres;
++		return 0;
++	}
++	return -EINVAL;
++}
++
++static int xenfb_set_par(struct fb_info *info)
++{
++	struct xenfb_info *xenfb_info;
++	unsigned long flags;
++
++	xenfb_info = info->par;
++
++	spin_lock_irqsave(&xenfb_info->resize_lock, flags);
++	xenfb_info->resize.type = XENFB_TYPE_RESIZE;
++	xenfb_info->resize.width = info->var.xres;
++	xenfb_info->resize.height = info->var.yres;
++	xenfb_info->resize.stride = info->fix.line_length;
++	xenfb_info->resize.depth = info->var.bits_per_pixel;
++	xenfb_info->resize.offset = 0;
++	xenfb_info->resize_dpy = 1;
++	spin_unlock_irqrestore(&xenfb_info->resize_lock, flags);
++	return 0;
++}
++
+ static struct fb_ops xenfb_fb_ops = {
+ 	.owner		= THIS_MODULE,
+ 	.fb_read	= fb_sys_read,
+@@ -230,6 +329,8 @@ static struct fb_ops xenfb_fb_ops = {
+ 	.fb_fillrect	= xenfb_fillrect,
+ 	.fb_copyarea	= xenfb_copyarea,
+ 	.fb_imageblit	= xenfb_imageblit,
++	.fb_check_var	= xenfb_check_var,
++	.fb_set_par     = xenfb_set_par,
+ };
+ 
+ static irqreturn_t xenfb_event_handler(int rq, void *dev_id)
+@@ -258,6 +359,8 @@ static int __devinit xenfb_probe(struct xenbus_device *dev,
+ {
+ 	struct xenfb_info *info;
+ 	struct fb_info *fb_info;
++	int fb_size;
++	int val;
+ 	int ret;
+ 
+ 	info = kzalloc(sizeof(*info), GFP_KERNEL);
+@@ -265,18 +368,35 @@ static int __devinit xenfb_probe(struct xenbus_device *dev,
+ 		xenbus_dev_fatal(dev, -ENOMEM, "allocating info structure");
+ 		return -ENOMEM;
+ 	}
++
++	/* Limit kernel param videoram amount to what is in xenstore */
++	if (xenbus_scanf(XBT_NIL, dev->otherend, "videoram", "%d", &val) == 1) {
++		if (val < video[KPARAM_MEM])
++			video[KPARAM_MEM] = val;
++	}
++
++	/* If requested res does not fit in available memory, use default */
++	fb_size = video[KPARAM_MEM] * 1024 * 1024;
++	if (video[KPARAM_WIDTH] * video[KPARAM_HEIGHT] * XENFB_DEPTH / 8
++	    > fb_size) {
++		video[KPARAM_WIDTH] = XENFB_WIDTH;
++		video[KPARAM_HEIGHT] = XENFB_HEIGHT;
++		fb_size = XENFB_DEFAULT_FB_LEN;
++	}
++
+ 	dev->dev.driver_data = info;
+ 	info->xbdev = dev;
+ 	info->irq = -1;
+ 	info->x1 = info->y1 = INT_MAX;
+ 	spin_lock_init(&info->dirty_lock);
++	spin_lock_init(&info->resize_lock);
+ 
+-	info->fb = vmalloc(xenfb_mem_len);
++	info->fb = vmalloc(fb_size);
+ 	if (info->fb == NULL)
+ 		goto error_nomem;
+-	memset(info->fb, 0, xenfb_mem_len);
++	memset(info->fb, 0, fb_size);
+ 
+-	info->nr_pages = (xenfb_mem_len + PAGE_SIZE - 1) >> PAGE_SHIFT;
++	info->nr_pages = (fb_size + PAGE_SIZE - 1) >> PAGE_SHIFT;
+ 
+ 	info->mfns = vmalloc(sizeof(unsigned long) * info->nr_pages);
+ 	if (!info->mfns)
+@@ -287,8 +407,6 @@ static int __devinit xenfb_probe(struct xenbus_device *dev,
+ 	if (!info->page)
+ 		goto error_nomem;
+ 
+-	xenfb_init_shared_page(info);
+-
+ 	/* abusing framebuffer_alloc() to allocate pseudo_palette */
+ 	fb_info = framebuffer_alloc(sizeof(u32) * 256, NULL);
+ 	if (fb_info == NULL)
+@@ -301,9 +419,9 @@ static int __devinit xenfb_probe(struct xenbus_device *dev,
+ 	fb_info->screen_base = info->fb;
+ 
+ 	fb_info->fbops = &xenfb_fb_ops;
+-	fb_info->var.xres_virtual = fb_info->var.xres = info->page->width;
+-	fb_info->var.yres_virtual = fb_info->var.yres = info->page->height;
+-	fb_info->var.bits_per_pixel = info->page->depth;
++	fb_info->var.xres_virtual = fb_info->var.xres = video[KPARAM_WIDTH];
++	fb_info->var.yres_virtual = fb_info->var.yres = video[KPARAM_HEIGHT];
++	fb_info->var.bits_per_pixel = XENFB_DEPTH;
+ 
+ 	fb_info->var.red = (struct fb_bitfield){16, 8, 0};
+ 	fb_info->var.green = (struct fb_bitfield){8, 8, 0};
+@@ -315,9 +433,9 @@ static int __devinit xenfb_probe(struct xenbus_device *dev,
+ 	fb_info->var.vmode = FB_VMODE_NONINTERLACED;
+ 
+ 	fb_info->fix.visual = FB_VISUAL_TRUECOLOR;
+-	fb_info->fix.line_length = info->page->line_length;
++	fb_info->fix.line_length = fb_info->var.xres * XENFB_DEPTH / 8;
+ 	fb_info->fix.smem_start = 0;
+-	fb_info->fix.smem_len = xenfb_mem_len;
++	fb_info->fix.smem_len = fb_size;
+ 	strcpy(fb_info->fix.id, "xen");
+ 	fb_info->fix.type = FB_TYPE_PACKED_PIXELS;
+ 	fb_info->fix.accel = FB_ACCEL_NONE;
+@@ -334,6 +452,8 @@ static int __devinit xenfb_probe(struct xenbus_device *dev,
+ 	fb_info->fbdefio = &xenfb_defio;
+ 	fb_deferred_io_init(fb_info);
+ 
++	xenfb_init_shared_page(info, fb_info);
++
+ 	ret = register_framebuffer(fb_info);
+ 	if (ret) {
+ 		fb_deferred_io_cleanup(fb_info);
+@@ -348,6 +468,7 @@ static int __devinit xenfb_probe(struct xenbus_device *dev,
+ 	if (ret < 0)
+ 		goto error;
+ 
++	xenfb_make_preferred_console();
+ 	return 0;
+ 
+  error_nomem:
+@@ -358,12 +479,34 @@ static int __devinit xenfb_probe(struct xenbus_device *dev,
+ 	return ret;
+ }
+ 
++static __devinit void
++xenfb_make_preferred_console(void)
++{
++	struct console *c;
++
++	if (console_set_on_cmdline)
++		return;
++
++	acquire_console_sem();
++	for (c = console_drivers; c; c = c->next) {
++		if (!strcmp(c->name, "tty") && c->index == 0)
++			break;
++	}
++	release_console_sem();
++	if (c) {
++		unregister_console(c);
++		c->flags |= CON_CONSDEV;
++		c->flags &= ~CON_PRINTBUFFER; /* don't print again */
++		register_console(c);
++	}
++}
++
+ static int xenfb_resume(struct xenbus_device *dev)
+ {
+ 	struct xenfb_info *info = dev->dev.driver_data;
+ 
+ 	xenfb_disconnect_backend(info);
+-	xenfb_init_shared_page(info);
++	xenfb_init_shared_page(info, info->fb_info);
+ 	return xenfb_connect_backend(dev, info);
+ }
+ 
+@@ -391,20 +534,23 @@ static unsigned long vmalloc_to_mfn(void *address)
+ 	return pfn_to_mfn(vmalloc_to_pfn(address));
+ }
+ 
+-static void xenfb_init_shared_page(struct xenfb_info *info)
++static void xenfb_init_shared_page(struct xenfb_info *info,
++				   struct fb_info *fb_info)
+ {
+ 	int i;
++	int epd = PAGE_SIZE / sizeof(info->mfns[0]);
+ 
+ 	for (i = 0; i < info->nr_pages; i++)
+ 		info->mfns[i] = vmalloc_to_mfn(info->fb + i * PAGE_SIZE);
+ 
+-	info->page->pd[0] = vmalloc_to_mfn(info->mfns);
+-	info->page->pd[1] = 0;
+-	info->page->width = XENFB_WIDTH;
+-	info->page->height = XENFB_HEIGHT;
+-	info->page->depth = XENFB_DEPTH;
+-	info->page->line_length = (info->page->depth / 8) * info->page->width;
+-	info->page->mem_length = xenfb_mem_len;
++	for (i = 0; i * epd < info->nr_pages; i++)
++		info->page->pd[i] = vmalloc_to_mfn(&info->mfns[i * epd]);
++
++	info->page->width = fb_info->var.xres;
++	info->page->height = fb_info->var.yres;
++	info->page->depth = fb_info->var.bits_per_pixel;
++	info->page->line_length = fb_info->fix.line_length;
++	info->page->mem_length = fb_info->fix.smem_len;
+ 	info->page->in_cons = info->page->in_prod = 0;
+ 	info->page->out_cons = info->page->out_prod = 0;
+ }
+@@ -504,6 +650,11 @@ InitWait:
+ 			val = 0;
+ 		if (val)
+ 			info->update_wanted = 1;
++
++		if (xenbus_scanf(XBT_NIL, dev->otherend,
++				 "feature-resize", "%d", &val) < 0)
++			val = 0;
++		info->feature_resize = val;
+ 		break;
+ 
+ 	case XenbusStateClosing:
+@@ -547,4 +698,6 @@ static void __exit xenfb_cleanup(void)
+ module_init(xenfb_init);
+ module_exit(xenfb_cleanup);
+ 
++MODULE_DESCRIPTION("Xen virtual framebuffer device frontend");
+ MODULE_LICENSE("GPL");
++MODULE_ALIAS("xen:vfb");
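The new video= module parameter is validated with simple arithmetic: the requested width x height at XENFB_DEPTH bits per pixel must fit in the configured video memory, otherwise xenfb_probe() falls back to the 800x600 defaults. A quick runnable model of that check, using the driver's default values:

#include <stdio.h>

int main(void)
{
        unsigned long mem_mb = 2, width = 800, height = 600, depth = 32;
        unsigned long fb_size = mem_mb * 1024 * 1024;
        unsigned long needed  = width * height * depth / 8;   /* 1920000 bytes */

        printf("need %lu bytes, have %lu: %s\n", needed, fb_size,
               needed <= fb_size ? "fits" : "falls back to defaults");
        return 0;
}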
+diff --git a/drivers/xen/Makefile b/drivers/xen/Makefile
+index 37af04f..363286c 100644
+--- a/drivers/xen/Makefile
++++ b/drivers/xen/Makefile
+@@ -1,4 +1,4 @@
+-obj-y	+= grant-table.o features.o events.o
++obj-y	+= grant-table.o features.o events.o manage.o
+ obj-y	+= xenbus/
+ obj-$(CONFIG_XEN_XENCOMM)	+= xencomm.o
+ obj-$(CONFIG_XEN_BALLOON)	+= balloon.o
+diff --git a/drivers/xen/balloon.c b/drivers/xen/balloon.c
+index ab25ba6..591bc29 100644
+--- a/drivers/xen/balloon.c
++++ b/drivers/xen/balloon.c
+@@ -225,7 +225,7 @@ static int increase_reservation(unsigned long nr_pages)
+ 		page = balloon_next_page(page);
+ 	}
+ 
+-	reservation.extent_start = (unsigned long)frame_list;
++	set_xen_guest_handle(reservation.extent_start, frame_list);
+ 	reservation.nr_extents   = nr_pages;
+ 	rc = HYPERVISOR_memory_op(
+ 		XENMEM_populate_physmap, &reservation);
+@@ -321,7 +321,7 @@ static int decrease_reservation(unsigned long nr_pages)
+ 		balloon_append(pfn_to_page(pfn));
+ 	}
+ 
+-	reservation.extent_start = (unsigned long)frame_list;
++	set_xen_guest_handle(reservation.extent_start, frame_list);
+ 	reservation.nr_extents   = nr_pages;
+ 	ret = HYPERVISOR_memory_op(XENMEM_decrease_reservation, &reservation);
+ 	BUG_ON(ret != nr_pages);
+@@ -368,7 +368,7 @@ static void balloon_process(struct work_struct *work)
+ }
+ 
+ /* Resets the Xen limit, sets new target, and kicks off processing. */
+-void balloon_set_new_target(unsigned long target)
++static void balloon_set_new_target(unsigned long target)
+ {
+ 	/* No need for lock. Not read-modify-write updates. */
+ 	balloon_stats.hard_limit   = ~0UL;
+@@ -483,7 +483,7 @@ static int dealloc_pte_fn(
+ 		.extent_order = 0,
+ 		.domid        = DOMID_SELF
+ 	};
+-	reservation.extent_start = (unsigned long)&mfn;
++	set_xen_guest_handle(reservation.extent_start, &mfn);
+ 	set_pte_at(&init_mm, addr, pte, __pte_ma(0ull));
+ 	set_phys_to_machine(__pa(addr) >> PAGE_SHIFT, INVALID_P2M_ENTRY);
+ 	ret = HYPERVISOR_memory_op(XENMEM_decrease_reservation, &reservation);
+@@ -519,7 +519,7 @@ static struct page **alloc_empty_pages_and_pagevec(int nr_pages)
+ 				.extent_order = 0,
+ 				.domid        = DOMID_SELF
+ 			};
+-			reservation.extent_start = (unsigned long)&gmfn;
++			set_xen_guest_handle(reservation.extent_start, &gmfn);
+ 			ret = HYPERVISOR_memory_op(XENMEM_decrease_reservation,
+ 						   &reservation);
+ 			if (ret == 1)
+diff --git a/drivers/xen/events.c b/drivers/xen/events.c
+index 76e5b73..332dd63 100644
+--- a/drivers/xen/events.c
++++ b/drivers/xen/events.c
+@@ -355,7 +355,7 @@ static void unbind_from_irq(unsigned int irq)
+ 
+ 	spin_lock(&irq_mapping_update_lock);
+ 
+-	if (VALID_EVTCHN(evtchn) && (--irq_bindcount[irq] == 0)) {
++	if ((--irq_bindcount[irq] == 0) && VALID_EVTCHN(evtchn)) {
+ 		close.port = evtchn;
+ 		if (HYPERVISOR_event_channel_op(EVTCHNOP_close, &close) != 0)
+ 			BUG();
+@@ -375,7 +375,7 @@ static void unbind_from_irq(unsigned int irq)
+ 		evtchn_to_irq[evtchn] = -1;
+ 		irq_info[irq] = IRQ_UNBOUND;
+ 
+-		dynamic_irq_init(irq);
++		dynamic_irq_cleanup(irq);
+ 	}
+ 
+ 	spin_unlock(&irq_mapping_update_lock);
+@@ -557,6 +557,33 @@ out:
+ 	put_cpu();
+ }
+ 
++/* Rebind a new event channel to an existing irq. */
++void rebind_evtchn_irq(int evtchn, int irq)
++{
++	/* Make sure the irq is masked, since the new event channel
++	   will also be masked. */
++	disable_irq(irq);
++
++	spin_lock(&irq_mapping_update_lock);
++
++	/* After resume the irq<->evtchn mappings are all cleared out */
++	BUG_ON(evtchn_to_irq[evtchn] != -1);
++	/* Expect irq to have been bound before,
++	   so the bindcount should be non-0 */
++	BUG_ON(irq_bindcount[irq] == 0);
++
++	evtchn_to_irq[evtchn] = irq;
++	irq_info[irq] = mk_irq_info(IRQT_EVTCHN, 0, evtchn);
++
++	spin_unlock(&irq_mapping_update_lock);
++
++	/* new event channels are always bound to cpu 0 */
++	irq_set_affinity(irq, cpumask_of_cpu(0));
++
++	/* Unmask the event channel. */
++	enable_irq(irq);
++}
++
+ /* Rebind an evtchn so that it gets delivered to a specific cpu */
+ static void rebind_irq_to_cpu(unsigned irq, unsigned tcpu)
+ {
+@@ -647,6 +674,89 @@ static int retrigger_dynirq(unsigned int irq)
+ 	return ret;
+ }
+ 
++static void restore_cpu_virqs(unsigned int cpu)
++{
++	struct evtchn_bind_virq bind_virq;
++	int virq, irq, evtchn;
++
++	for (virq = 0; virq < NR_VIRQS; virq++) {
++		if ((irq = per_cpu(virq_to_irq, cpu)[virq]) == -1)
++			continue;
++
++		BUG_ON(irq_info[irq].type != IRQT_VIRQ);
++		BUG_ON(irq_info[irq].index != virq);
++
++		/* Get a new binding from Xen. */
++		bind_virq.virq = virq;
++		bind_virq.vcpu = cpu;
++		if (HYPERVISOR_event_channel_op(EVTCHNOP_bind_virq,
++						&bind_virq) != 0)
++			BUG();
++		evtchn = bind_virq.port;
++
++		/* Record the new mapping. */
++		evtchn_to_irq[evtchn] = irq;
++		irq_info[irq] = mk_irq_info(IRQT_VIRQ, virq, evtchn);
++		bind_evtchn_to_cpu(evtchn, cpu);
++
++		/* Ready for use. */
++		unmask_evtchn(evtchn);
++	}
++}
++
++static void restore_cpu_ipis(unsigned int cpu)
++{
++	struct evtchn_bind_ipi bind_ipi;
++	int ipi, irq, evtchn;
++
++	for (ipi = 0; ipi < XEN_NR_IPIS; ipi++) {
++		if ((irq = per_cpu(ipi_to_irq, cpu)[ipi]) == -1)
++			continue;
++
++		BUG_ON(irq_info[irq].type != IRQT_IPI);
++		BUG_ON(irq_info[irq].index != ipi);
++
++		/* Get a new binding from Xen. */
++		bind_ipi.vcpu = cpu;
++		if (HYPERVISOR_event_channel_op(EVTCHNOP_bind_ipi,
++						&bind_ipi) != 0)
++			BUG();
++		evtchn = bind_ipi.port;
++
++		/* Record the new mapping. */
++		evtchn_to_irq[evtchn] = irq;
++		irq_info[irq] = mk_irq_info(IRQT_IPI, ipi, evtchn);
++		bind_evtchn_to_cpu(evtchn, cpu);
++
++		/* Ready for use. */
++		unmask_evtchn(evtchn);
++
++	}
++}
++
++void xen_irq_resume(void)
++{
++	unsigned int cpu, irq, evtchn;
++
++	init_evtchn_cpu_bindings();
++
++	/* New event-channel space is not 'live' yet. */
++	for (evtchn = 0; evtchn < NR_EVENT_CHANNELS; evtchn++)
++		mask_evtchn(evtchn);
++
++	/* No IRQ <-> event-channel mappings. */
++	for (irq = 0; irq < NR_IRQS; irq++)
++		irq_info[irq].evtchn = 0; /* zap event-channel binding */
++
++	for (evtchn = 0; evtchn < NR_EVENT_CHANNELS; evtchn++)
++		evtchn_to_irq[evtchn] = -1;
++
++	for_each_possible_cpu(cpu) {
++		restore_cpu_virqs(cpu);
++		restore_cpu_ipis(cpu);
++	}
++}
++
+ static struct irq_chip xen_dynamic_chip __read_mostly = {
+ 	.name		= "xen-dyn",
+ 	.mask		= disable_dynirq,
+diff --git a/drivers/xen/grant-table.c b/drivers/xen/grant-table.c
+index 52b6b41..e9e1116 100644
+--- a/drivers/xen/grant-table.c
++++ b/drivers/xen/grant-table.c
+@@ -471,14 +471,14 @@ static int gnttab_map(unsigned int start_idx, unsigned int end_idx)
+ 	return 0;
+ }
+ 
+-static int gnttab_resume(void)
++int gnttab_resume(void)
+ {
+ 	if (max_nr_grant_frames() < nr_grant_frames)
+ 		return -ENOSYS;
+ 	return gnttab_map(0, nr_grant_frames - 1);
+ }
+ 
+-static int gnttab_suspend(void)
++int gnttab_suspend(void)
+ {
+ 	arch_gnttab_unmap_shared(shared, nr_grant_frames);
+ 	return 0;
+diff --git a/drivers/xen/manage.c b/drivers/xen/manage.c
+new file mode 100644
+index 0000000..5b546e3
+--- /dev/null
++++ b/drivers/xen/manage.c
+@@ -0,0 +1,252 @@
++/*
++ * Handle external requests for shutdown, reboot and sysrq
++ */
++#include <linux/kernel.h>
++#include <linux/err.h>
++#include <linux/reboot.h>
++#include <linux/sysrq.h>
++#include <linux/stop_machine.h>
++#include <linux/freezer.h>
++
++#include <xen/xenbus.h>
++#include <xen/grant_table.h>
++#include <xen/events.h>
++#include <xen/hvc-console.h>
++#include <xen/xen-ops.h>
++
++#include <asm/xen/hypercall.h>
++#include <asm/xen/page.h>
++
++enum shutdown_state {
++	SHUTDOWN_INVALID = -1,
++	SHUTDOWN_POWEROFF = 0,
++	SHUTDOWN_SUSPEND = 2,
++	/* Code 3 is SHUTDOWN_CRASH, which we don't use because the domain can only
++	   report a crash, not be instructed to crash!
++	   HALT is the same as POWEROFF, as far as we're concerned.  The tools use
++	   the distinction when we return the reason code to them.  */
++	 SHUTDOWN_HALT = 4,
++};
++
++/* Ignore multiple shutdown requests. */
++static enum shutdown_state shutting_down = SHUTDOWN_INVALID;
++
++#ifdef CONFIG_PM_SLEEP
++static int xen_suspend(void *data)
++{
++	int *cancelled = data;
++	int err;
++
++	BUG_ON(!irqs_disabled());
++
++	load_cr3(swapper_pg_dir);
++
++	err = device_power_down(PMSG_SUSPEND);
++	if (err) {
++		printk(KERN_ERR "xen_suspend: device_power_down failed: %d\n",
++		       err);
++		return err;
++	}
++
++	xen_mm_pin_all();
++	gnttab_suspend();
++	xen_pre_suspend();
++
++	/*
++	 * This hypercall returns 1 if suspend was cancelled
++	 * or the domain was merely checkpointed, and 0 if it
++	 * is resuming in a new domain.
++	 */
++	*cancelled = HYPERVISOR_suspend(virt_to_mfn(xen_start_info));
++
++	xen_post_suspend(*cancelled);
++	gnttab_resume();
++	xen_mm_unpin_all();
++
++	device_power_up();
++
++	if (!*cancelled) {
++		xen_irq_resume();
++		xen_console_resume();
++	}
++
++	return 0;
++}
++
++static void do_suspend(void)
++{
++	int err;
++	int cancelled = 1;
++
++	shutting_down = SHUTDOWN_SUSPEND;
++
++#ifdef CONFIG_PREEMPT
++	/* If the kernel is preemptible, we need to freeze all the processes
++	   to prevent them from being in the middle of a pagetable update
++	   during suspend. */
++	err = freeze_processes();
++	if (err) {
++		printk(KERN_ERR "xen suspend: freeze failed %d\n", err);
++		return;
++	}
++#endif
++
++	err = device_suspend(PMSG_SUSPEND);
++	if (err) {
++		printk(KERN_ERR "xen suspend: device_suspend %d\n", err);
++		goto out;
++	}
++
++	printk("suspending xenbus...\n");
++	/* XXX use normal device tree? */
++	xenbus_suspend();
++
++	err = stop_machine_run(xen_suspend, &cancelled, 0);
++	if (err) {
++		printk(KERN_ERR "failed to start xen_suspend: %d\n", err);
++		goto out;
++	}
++
++	if (!cancelled)
++		xenbus_resume();
++	else
++		xenbus_suspend_cancel();
++
++	device_resume();
++
++	/* Make sure timer events get retriggered on all CPUs */
++	clock_was_set();
++out:
++#ifdef CONFIG_PREEMPT
++	thaw_processes();
++#endif
++	shutting_down = SHUTDOWN_INVALID;
++}
++#endif	/* CONFIG_PM_SLEEP */
++
++static void shutdown_handler(struct xenbus_watch *watch,
++			     const char **vec, unsigned int len)
++{
++	char *str;
++	struct xenbus_transaction xbt;
++	int err;
++
++	if (shutting_down != SHUTDOWN_INVALID)
++		return;
++
++ again:
++	err = xenbus_transaction_start(&xbt);
++	if (err)
++		return;
++
++	str = (char *)xenbus_read(xbt, "control", "shutdown", NULL);
++	/* Ignore read errors and empty reads. */
++	if (XENBUS_IS_ERR_READ(str)) {
++		xenbus_transaction_end(xbt, 1);
++		return;
++	}
++
++	xenbus_write(xbt, "control", "shutdown", "");
++
++	err = xenbus_transaction_end(xbt, 0);
++	if (err == -EAGAIN) {
++		kfree(str);
++		goto again;
++	}
++
++	if (strcmp(str, "poweroff") == 0 ||
++	    strcmp(str, "halt") == 0) {
++		shutting_down = SHUTDOWN_POWEROFF;
++		orderly_poweroff(false);
++	} else if (strcmp(str, "reboot") == 0) {
++		shutting_down = SHUTDOWN_POWEROFF; /* ? */
++		ctrl_alt_del();
++#ifdef CONFIG_PM_SLEEP
++	} else if (strcmp(str, "suspend") == 0) {
++		do_suspend();
++#endif
++	} else {
++		printk(KERN_INFO "Ignoring shutdown request: %s\n", str);
++		shutting_down = SHUTDOWN_INVALID;
++	}
++
++	kfree(str);
++}
++
++static void sysrq_handler(struct xenbus_watch *watch, const char **vec,
++			  unsigned int len)
++{
++	char sysrq_key = '\0';
++	struct xenbus_transaction xbt;
++	int err;
++
++ again:
++	err = xenbus_transaction_start(&xbt);
++	if (err)
++		return;
++	if (!xenbus_scanf(xbt, "control", "sysrq", "%c", &sysrq_key)) {
++		printk(KERN_ERR "Unable to read sysrq code in "
++		       "control/sysrq\n");
++		xenbus_transaction_end(xbt, 1);
++		return;
++	}
++
++	if (sysrq_key != '\0')
++		xenbus_printf(xbt, "control", "sysrq", "%c", '\0');
++
++	err = xenbus_transaction_end(xbt, 0);
++	if (err == -EAGAIN)
++		goto again;
++
++	if (sysrq_key != '\0')
++		handle_sysrq(sysrq_key, NULL);
++}
++
++static struct xenbus_watch shutdown_watch = {
++	.node = "control/shutdown",
++	.callback = shutdown_handler
++};
++
++static struct xenbus_watch sysrq_watch = {
++	.node = "control/sysrq",
++	.callback = sysrq_handler
++};
++
++static int setup_shutdown_watcher(void)
++{
++	int err;
++
++	err = register_xenbus_watch(&shutdown_watch);
++	if (err) {
++		printk(KERN_ERR "Failed to set shutdown watcher\n");
++		return err;
++	}
++
++	err = register_xenbus_watch(&sysrq_watch);
++	if (err) {
++		printk(KERN_ERR "Failed to set sysrq watcher\n");
++		return err;
++	}
++
++	return 0;
++}
++
++static int shutdown_event(struct notifier_block *notifier,
++			  unsigned long event,
++			  void *data)
++{
++	setup_shutdown_watcher();
++	return NOTIFY_DONE;
++}
++
++static int __init setup_shutdown_event(void)
++{
++	static struct notifier_block xenstore_notifier = {
++		.notifier_call = shutdown_event
++	};
++	register_xenstore_notifier(&xenstore_notifier);
++
++	return 0;
++}
++
++subsys_initcall(setup_shutdown_event);
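Both watch handlers above follow the same optimistic xenbus transaction pattern: start a transaction, read and clear the control node, try to commit, and retry from the top when the commit returns -EAGAIN. A stripped-down, runnable model of just the retry skeleton (the commit() stub is a stand-in for xenbus_transaction_end() and fails once on purpose):

#include <stdio.h>
#include <errno.h>

static int attempts;

/* Stand-in for xenbus_transaction_end(); reports a conflict the first time. */
static int commit(void)
{
        return ++attempts < 2 ? -EAGAIN : 0;
}

int main(void)
{
again:
        /* start the transaction, read and update the node here ... */
        if (commit() == -EAGAIN)
                goto again;             /* conflict: redo the whole read/modify */

        printf("committed after %d attempt(s)\n", attempts);
        return 0;
}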
+diff --git a/drivers/xen/xenbus/xenbus_comms.c b/drivers/xen/xenbus/xenbus_comms.c
+index 6efbe3f..090c61e 100644
+--- a/drivers/xen/xenbus/xenbus_comms.c
++++ b/drivers/xen/xenbus/xenbus_comms.c
+@@ -203,7 +203,6 @@ int xb_read(void *data, unsigned len)
+ int xb_init_comms(void)
+ {
+ 	struct xenstore_domain_interface *intf = xen_store_interface;
+-	int err;
+ 
+ 	if (intf->req_prod != intf->req_cons)
+ 		printk(KERN_ERR "XENBUS request ring is not quiescent "
+@@ -216,18 +215,20 @@ int xb_init_comms(void)
+ 		intf->rsp_cons = intf->rsp_prod;
+ 	}
+ 
+-	if (xenbus_irq)
+-		unbind_from_irqhandler(xenbus_irq, &xb_waitq);
++	if (xenbus_irq) {
++		/* Already have an irq; assume we're resuming */
++		rebind_evtchn_irq(xen_store_evtchn, xenbus_irq);
++	} else {
++		int err;
++		err = bind_evtchn_to_irqhandler(xen_store_evtchn, wake_waiting,
++						0, "xenbus", &xb_waitq);
++		if (err <= 0) {
++			printk(KERN_ERR "XENBUS request irq failed %i\n", err);
++			return err;
++		}
+ 
+-	err = bind_evtchn_to_irqhandler(
+-		xen_store_evtchn, wake_waiting,
+-		0, "xenbus", &xb_waitq);
+-	if (err <= 0) {
+-		printk(KERN_ERR "XENBUS request irq failed %i\n", err);
+-		return err;
++		xenbus_irq = err;
+ 	}
+ 
+-	xenbus_irq = err;
+-
+ 	return 0;
+ }
+diff --git a/include/asm-generic/pgtable.h b/include/asm-generic/pgtable.h
+index 44ef329..4fce3db 100644
+--- a/include/asm-generic/pgtable.h
++++ b/include/asm-generic/pgtable.h
+@@ -197,6 +197,63 @@ static inline int pmd_none_or_clear_bad(pmd_t *pmd)
+ }
+ #endif /* CONFIG_MMU */
+ 
++static inline pte_t __ptep_modify_prot_start(struct mm_struct *mm,
++					     unsigned long addr,
++					     pte_t *ptep)
++{
++	/*
++	 * Get the current pte state, but zero it out to make it
++	 * non-present, preventing the hardware from asynchronously
++	 * updating it.
++	 */
++	return ptep_get_and_clear(mm, addr, ptep);
++}
++
++static inline void __ptep_modify_prot_commit(struct mm_struct *mm,
++					     unsigned long addr,
++					     pte_t *ptep, pte_t pte)
++{
++	/*
++	 * The pte is non-present, so there's no hardware state to
++	 * preserve.
++	 */
++	set_pte_at(mm, addr, ptep, pte);
++}
++
++#ifndef __HAVE_ARCH_PTEP_MODIFY_PROT_TRANSACTION
++/*
++ * Start a pte protection read-modify-write transaction, which
++ * protects against asynchronous hardware modifications to the pte.
++ * The intention is not to prevent the hardware from making pte
++ * updates, but to prevent any updates it may make from being lost.
++ *
++ * This does not protect against other software modifications of the
++ * pte; the appropriate pte lock must be held over the transaction.
++ *
++ * Note that this interface is intended to be batchable, meaning that
++ * ptep_modify_prot_commit may not actually update the pte, but merely
++ * queue the update to be done at some later time.  The update must be
++ * actually committed before the pte lock is released, however.
++ */
++static inline pte_t ptep_modify_prot_start(struct mm_struct *mm,
++					   unsigned long addr,
++					   pte_t *ptep)
++{
++	return __ptep_modify_prot_start(mm, addr, ptep);
++}
++
++/*
++ * Commit an update to a pte, leaving any hardware-controlled bits in
++ * the PTE unmodified.
++ */
++static inline void ptep_modify_prot_commit(struct mm_struct *mm,
++					   unsigned long addr,
++					   pte_t *ptep, pte_t pte)
++{
++	__ptep_modify_prot_commit(mm, addr, ptep, pte);
++}
++#endif /* __HAVE_ARCH_PTEP_MODIFY_PROT_TRANSACTION */
++
+ /*
+  * A facility to provide lazy MMU batching.  This allows PTE updates and
+  * page invalidations to be delayed until a call to leave lazy MMU mode
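The comment above spells out how the new hooks are meant to be called; the main caller in this series is mprotect's change_pte_range(). A schematic version of that caller, shown only to illustrate the start/modify/commit shape (the function name is illustrative, the pte lock is assumed to be held, and the fragment is not meant to build outside a kernel tree):

static void change_one_pte(struct mm_struct *mm, unsigned long addr,
                           pte_t *ptep, pgprot_t newprot)
{
        pte_t ptent;

        ptent = ptep_modify_prot_start(mm, addr, ptep);   /* snapshot; the generic
                                                             version also clears it */
        ptent = pte_modify(ptent, newprot);               /* change protections only */
        ptep_modify_prot_commit(mm, addr, ptep, ptent);   /* hardware A/D bits kept */
}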
+diff --git a/include/asm-x86/page.h b/include/asm-x86/page.h
+index dc936dd..a1e2b94 100644
+--- a/include/asm-x86/page.h
++++ b/include/asm-x86/page.h
+@@ -160,6 +160,7 @@ static inline pteval_t native_pte_val(pte_t pte)
+ #endif
+ 
+ #define pte_val(x)	native_pte_val(x)
++#define pte_flags(x)	native_pte_val(x)
+ #define __pte(x)	native_make_pte(x)
+ 
+ #endif	/* CONFIG_PARAVIRT */
+diff --git a/include/asm-x86/paravirt.h b/include/asm-x86/paravirt.h
+index 0f13b94..e9ada31 100644
+--- a/include/asm-x86/paravirt.h
++++ b/include/asm-x86/paravirt.h
+@@ -238,7 +238,13 @@ struct pv_mmu_ops {
+ 	void (*pte_update_defer)(struct mm_struct *mm,
+ 				 unsigned long addr, pte_t *ptep);
+ 
++	pte_t (*ptep_modify_prot_start)(struct mm_struct *mm, unsigned long addr,
++					pte_t *ptep);
++	void (*ptep_modify_prot_commit)(struct mm_struct *mm, unsigned long addr,
++					pte_t *ptep, pte_t pte);
++
+ 	pteval_t (*pte_val)(pte_t);
++	pteval_t (*pte_flags)(pte_t);
+ 	pte_t (*make_pte)(pteval_t pte);
+ 
+ 	pgdval_t (*pgd_val)(pgd_t);
+@@ -996,6 +1002,20 @@ static inline pteval_t pte_val(pte_t pte)
+ 	return ret;
+ }
+ 
++static inline pteval_t pte_flags(pte_t pte)
++{
++	pteval_t ret;
++
++	if (sizeof(pteval_t) > sizeof(long))
++		ret = PVOP_CALL2(pteval_t, pv_mmu_ops.pte_flags,
++				 pte.pte, (u64)pte.pte >> 32);
++	else
++		ret = PVOP_CALL1(pteval_t, pv_mmu_ops.pte_flags,
++				 pte.pte);
++
++	return ret;
++}
++
+ static inline pgd_t __pgd(pgdval_t val)
+ {
+ 	pgdval_t ret;
+@@ -1024,6 +1044,29 @@ static inline pgdval_t pgd_val(pgd_t pgd)
+ 	return ret;
+ }
+ 
++#define  __HAVE_ARCH_PTEP_MODIFY_PROT_TRANSACTION
++static inline pte_t ptep_modify_prot_start(struct mm_struct *mm, unsigned long addr,
++					   pte_t *ptep)
++{
++	pteval_t ret;
++
++	ret = PVOP_CALL3(pteval_t, pv_mmu_ops.ptep_modify_prot_start,
++			 mm, addr, ptep);
++
++	return (pte_t) { .pte = ret };
++}
++
++static inline void ptep_modify_prot_commit(struct mm_struct *mm, unsigned long addr,
++					   pte_t *ptep, pte_t pte)
++{
++	if (sizeof(pteval_t) > sizeof(long))
++		/* 5 arg words */
++		pv_mmu_ops.ptep_modify_prot_commit(mm, addr, ptep, pte);
++	else
++		PVOP_VCALL4(pv_mmu_ops.ptep_modify_prot_commit,
++			    mm, addr, ptep, pte.pte);
++}
++
+ static inline void set_pte(pte_t *ptep, pte_t pte)
+ {
+ 	if (sizeof(pteval_t) > sizeof(long))
+diff --git a/include/asm-x86/pgtable.h b/include/asm-x86/pgtable.h
+index 97c271b..47a852c 100644
+--- a/include/asm-x86/pgtable.h
++++ b/include/asm-x86/pgtable.h
+@@ -164,37 +164,37 @@ extern struct list_head pgd_list;
+  */
+ static inline int pte_dirty(pte_t pte)
+ {
+-	return pte_val(pte) & _PAGE_DIRTY;
++	return pte_flags(pte) & _PAGE_DIRTY;
+ }
+ 
+ static inline int pte_young(pte_t pte)
+ {
+-	return pte_val(pte) & _PAGE_ACCESSED;
++	return pte_flags(pte) & _PAGE_ACCESSED;
+ }
+ 
+ static inline int pte_write(pte_t pte)
+ {
+-	return pte_val(pte) & _PAGE_RW;
++	return pte_flags(pte) & _PAGE_RW;
+ }
+ 
+ static inline int pte_file(pte_t pte)
+ {
+-	return pte_val(pte) & _PAGE_FILE;
++	return pte_flags(pte) & _PAGE_FILE;
+ }
+ 
+ static inline int pte_huge(pte_t pte)
+ {
+-	return pte_val(pte) & _PAGE_PSE;
++	return pte_flags(pte) & _PAGE_PSE;
+ }
+ 
+ static inline int pte_global(pte_t pte)
+ {
+-	return pte_val(pte) & _PAGE_GLOBAL;
++	return pte_flags(pte) & _PAGE_GLOBAL;
+ }
+ 
+ static inline int pte_exec(pte_t pte)
+ {
+-	return !(pte_val(pte) & _PAGE_NX);
++	return !(pte_flags(pte) & _PAGE_NX);
+ }
+ 
+ static inline int pte_special(pte_t pte)
+@@ -305,7 +305,7 @@ static inline pgprot_t pgprot_modify(pgprot_t oldprot, pgprot_t newprot)
+ 	return __pgprot(preservebits | addbits);
+ }
+ 
+-#define pte_pgprot(x) __pgprot(pte_val(x) & ~PTE_MASK)
++#define pte_pgprot(x) __pgprot(pte_flags(x) & ~PTE_MASK)
+ 
+ #define canon_pgprot(p) __pgprot(pgprot_val(p) & __supported_pte_mask)
+ 
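These conversions work because testing a pte's status bits never needs the frame number, which is the only part pte_val() has to translate under Xen (mfn back to pfn). A toy user-space split of a 64-bit entry into frame and flag fields, assuming for illustration 4 KiB pages and therefore 12 low flag bits:

#include <stdio.h>
#include <stdint.h>

int main(void)
{
        uint64_t pte   = (0x1234ULL << 12) | 0x063;   /* frame 0x1234, flags 0x63 */
        uint64_t flags = pte & 0xfffULL;              /* cheap: no translation needed */
        uint64_t frame = pte >> 12;                   /* the part Xen must translate */

        printf("frame %#llx flags %#llx\n",
               (unsigned long long)frame, (unsigned long long)flags);
        return 0;
}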
+diff --git a/include/asm-x86/xen/hypercall.h b/include/asm-x86/xen/hypercall.h
+index c2ccd99..2a4f9b4 100644
+--- a/include/asm-x86/xen/hypercall.h
++++ b/include/asm-x86/xen/hypercall.h
+@@ -176,9 +176,9 @@ HYPERVISOR_fpu_taskswitch(int set)
+ }
+ 
+ static inline int
+-HYPERVISOR_sched_op(int cmd, unsigned long arg)
++HYPERVISOR_sched_op(int cmd, void *arg)
+ {
+-	return _hypercall2(int, sched_op, cmd, arg);
++	return _hypercall2(int, sched_op_new, cmd, arg);
+ }
+ 
+ static inline long
+@@ -315,6 +315,13 @@ HYPERVISOR_nmi_op(unsigned long op, unsigned long arg)
+ }
+ 
+ static inline void
++MULTI_fpu_taskswitch(struct multicall_entry *mcl, int set)
++{
++	mcl->op = __HYPERVISOR_fpu_taskswitch;
++	mcl->args[0] = set;
++}
++
++static inline void
+ MULTI_update_va_mapping(struct multicall_entry *mcl, unsigned long va,
+ 			pte_t new_val, unsigned long flags)
+ {
+diff --git a/include/asm-x86/xen/page.h b/include/asm-x86/xen/page.h
+index e11f240..377c045 100644
+--- a/include/asm-x86/xen/page.h
++++ b/include/asm-x86/xen/page.h
+@@ -26,15 +26,20 @@ typedef struct xpaddr {
+ #define FOREIGN_FRAME_BIT	(1UL<<31)
+ #define FOREIGN_FRAME(m)	((m) | FOREIGN_FRAME_BIT)
+ 
+-extern unsigned long *phys_to_machine_mapping;
++/* Maximum amount of memory we can handle in a domain in pages */
++#define MAX_DOMAIN_PAGES						\
++    ((unsigned long)((u64)CONFIG_XEN_MAX_DOMAIN_MEMORY * 1024 * 1024 * 1024 / PAGE_SIZE))
++
++
++extern unsigned long get_phys_to_machine(unsigned long pfn);
++extern void set_phys_to_machine(unsigned long pfn, unsigned long mfn);
+ 
+ static inline unsigned long pfn_to_mfn(unsigned long pfn)
+ {
+ 	if (xen_feature(XENFEAT_auto_translated_physmap))
+ 		return pfn;
+ 
+-	return phys_to_machine_mapping[(unsigned int)(pfn)] &
+-		~FOREIGN_FRAME_BIT;
++	return get_phys_to_machine(pfn) & ~FOREIGN_FRAME_BIT;
+ }
+ 
+ static inline int phys_to_machine_mapping_valid(unsigned long pfn)
+@@ -42,7 +47,7 @@ static inline int phys_to_machine_mapping_valid(unsigned long pfn)
+ 	if (xen_feature(XENFEAT_auto_translated_physmap))
+ 		return 1;
+ 
+-	return (phys_to_machine_mapping[pfn] != INVALID_P2M_ENTRY);
++	return get_phys_to_machine(pfn) != INVALID_P2M_ENTRY;
+ }
+ 
+ static inline unsigned long mfn_to_pfn(unsigned long mfn)
+@@ -106,20 +111,12 @@ static inline unsigned long mfn_to_local_pfn(unsigned long mfn)
+ 	unsigned long pfn = mfn_to_pfn(mfn);
+ 	if ((pfn < max_mapnr)
+ 	    && !xen_feature(XENFEAT_auto_translated_physmap)
+-	    && (phys_to_machine_mapping[pfn] != mfn))
++	    && (get_phys_to_machine(pfn) != mfn))
+ 		return max_mapnr; /* force !pfn_valid() */
++	/* XXX fixme; not true with sparsemem */
+ 	return pfn;
+ }
+ 
+-static inline void set_phys_to_machine(unsigned long pfn, unsigned long mfn)
+-{
+-	if (xen_feature(XENFEAT_auto_translated_physmap)) {
+-		BUG_ON(pfn != mfn && mfn != INVALID_P2M_ENTRY);
+-		return;
+-	}
+-	phys_to_machine_mapping[pfn] = mfn;
+-}
+-
+ /* VIRT <-> MACHINE conversion */
+ #define virt_to_machine(v)	(phys_to_machine(XPADDR(__pa(v))))
+ #define virt_to_mfn(v)		(pfn_to_mfn(PFN_DOWN(__pa(v))))
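Replacing the exported flat phys_to_machine_mapping[] array with get_phys_to_machine()/set_phys_to_machine() hides the layout of the p2m table behind accessors, so it can be sized up to MAX_DOMAIN_PAGES and report absent entries as INVALID_P2M_ENTRY instead of requiring one huge contiguous array. A rough standalone sketch of one possible chunked layout behind such accessors (an assumption for illustration, not the patch's actual implementation):

    #include <stdio.h>
    #include <stdlib.h>

    #define EX_INVALID_P2M_ENTRY   (~0UL)
    #define EX_ENTRIES_PER_CHUNK   1024UL
    #define EX_NR_CHUNKS           16

    /* NULL chunk pointer == that range of pfns is not populated */
    static unsigned long *ex_p2m_top[EX_NR_CHUNKS];

    static unsigned long ex_get_phys_to_machine(unsigned long pfn)
    {
            unsigned long *chunk = ex_p2m_top[pfn / EX_ENTRIES_PER_CHUNK];

            if (chunk == NULL)
                    return EX_INVALID_P2M_ENTRY;
            return chunk[pfn % EX_ENTRIES_PER_CHUNK];
    }

    static void ex_set_phys_to_machine(unsigned long pfn, unsigned long mfn)
    {
            unsigned long *chunk = ex_p2m_top[pfn / EX_ENTRIES_PER_CHUNK];

            if (chunk == NULL) {
                    /* a real implementation would fill new chunks with
                     * INVALID_P2M_ENTRY and handle allocation failure */
                    chunk = calloc(EX_ENTRIES_PER_CHUNK, sizeof(*chunk));
                    ex_p2m_top[pfn / EX_ENTRIES_PER_CHUNK] = chunk;
            }
            chunk[pfn % EX_ENTRIES_PER_CHUNK] = mfn;
    }

    int main(void)
    {
            ex_set_phys_to_machine(5, 4242);
            printf("pfn 5 -> mfn %lu, pfn 9999 -> %#lx\n",
                   ex_get_phys_to_machine(5), ex_get_phys_to_machine(9999));
            return 0;
    }
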
+diff --git a/include/linux/console.h b/include/linux/console.h
+index a4f27fb..248e6e3 100644
+--- a/include/linux/console.h
++++ b/include/linux/console.h
+@@ -108,6 +108,8 @@ struct console {
+ 	struct	 console *next;
+ };
+ 
++extern int console_set_on_cmdline;
++
+ extern int add_preferred_console(char *name, int idx, char *options);
+ extern int update_console_cmdline(char *name, int idx, char *name_new, int idx_new, char *options);
+ extern void register_console(struct console *);
+diff --git a/include/linux/page-flags.h b/include/linux/page-flags.h
+index f31debf..0d2a4e7 100644
+--- a/include/linux/page-flags.h
++++ b/include/linux/page-flags.h
+@@ -157,6 +157,7 @@ PAGEFLAG(Active, active) __CLEARPAGEFLAG(Active, active)
+ __PAGEFLAG(Slab, slab)
+ PAGEFLAG(Checked, owner_priv_1)		/* Used by some filesystems */
+ PAGEFLAG(Pinned, owner_priv_1) TESTSCFLAG(Pinned, owner_priv_1) /* Xen */
++PAGEFLAG(SavePinned, dirty);					/* Xen */
+ PAGEFLAG(Reserved, reserved) __CLEARPAGEFLAG(Reserved, reserved)
+ PAGEFLAG(Private, private) __CLEARPAGEFLAG(Private, private)
+ 	__SETPAGEFLAG(Private, private)
+diff --git a/include/xen/events.h b/include/xen/events.h
+index acd8e06..67c4436 100644
+--- a/include/xen/events.h
++++ b/include/xen/events.h
+@@ -32,6 +32,7 @@ void unbind_from_irqhandler(unsigned int irq, void *dev_id);
+ 
+ void xen_send_IPI_one(unsigned int cpu, enum ipi_vector vector);
+ int resend_irq_on_evtchn(unsigned int irq);
++void rebind_evtchn_irq(int evtchn, int irq);
+ 
+ static inline void notify_remote_via_evtchn(int port)
+ {
+@@ -40,4 +41,7 @@ static inline void notify_remote_via_evtchn(int port)
+ }
+ 
+ extern void notify_remote_via_irq(int irq);
++
++extern void xen_irq_resume(void);
++
+ #endif	/* _XEN_EVENTS_H */
+diff --git a/include/xen/grant_table.h b/include/xen/grant_table.h
+index 4662048..a40f1cd 100644
+--- a/include/xen/grant_table.h
++++ b/include/xen/grant_table.h
+@@ -51,6 +51,9 @@ struct gnttab_free_callback {
+ 	u16 count;
+ };
+ 
++int gnttab_suspend(void);
++int gnttab_resume(void);
++
+ int gnttab_grant_foreign_access(domid_t domid, unsigned long frame,
+ 				int readonly);
+ 
+diff --git a/include/xen/hvc-console.h b/include/xen/hvc-console.h
+index 21c0ecf..98b79bc 100644
+--- a/include/xen/hvc-console.h
++++ b/include/xen/hvc-console.h
+@@ -3,4 +3,13 @@
+ 
+ extern struct console xenboot_console;
+ 
++#ifdef CONFIG_HVC_XEN
++void xen_console_resume(void);
++#else
++static inline void xen_console_resume(void) { }
++#endif
++
++void xen_raw_console_write(const char *str);
++void xen_raw_printk(const char *fmt, ...);
++
+ #endif	/* XEN_HVC_CONSOLE_H */
+diff --git a/include/xen/interface/elfnote.h b/include/xen/interface/elfnote.h
+index a64d3df..7a8262c 100644
+--- a/include/xen/interface/elfnote.h
++++ b/include/xen/interface/elfnote.h
+@@ -120,6 +120,26 @@
+  */
+ #define XEN_ELFNOTE_BSD_SYMTAB    11
+ 
++/*
++ * The lowest address the hypervisor hole can begin at (numeric).
++ *
++ * This must not be set higher than HYPERVISOR_VIRT_START. Its presence
++ * also indicates to the hypervisor that the kernel can deal with the
++ * hole starting at a higher address.
++ */
++#define XEN_ELFNOTE_HV_START_LOW  12
++
++/*
++ * List of maddr_t-sized mask/value pairs describing how to recognize
++ * (non-present) L1 page table entries carrying valid MFNs (numeric).
++ */
++#define XEN_ELFNOTE_L1_MFN_VALID  13
++
++/*
++ * Whether or not the guest supports cooperative suspend cancellation.
++ */
++#define XEN_ELFNOTE_SUSPEND_CANCEL 14
++
+ #endif /* __XEN_PUBLIC_ELFNOTE_H__ */
+ 
+ /*
+diff --git a/include/xen/interface/features.h b/include/xen/interface/features.h
+index d73228d..f51b641 100644
+--- a/include/xen/interface/features.h
++++ b/include/xen/interface/features.h
+@@ -38,6 +38,9 @@
+  */
+ #define XENFEAT_pae_pgdir_above_4gb        4
+ 
++/* x86: Does this Xen host support the MMU_PT_UPDATE_PRESERVE_AD hypercall? */
++#define XENFEAT_mmu_pt_update_preserve_ad  5
++
+ #define XENFEAT_NR_SUBMAPS 1
+ 
+ #endif /* __XEN_PUBLIC_FEATURES_H__ */
+diff --git a/include/xen/interface/io/fbif.h b/include/xen/interface/io/fbif.h
+index 5a934dd..974a51e 100644
+--- a/include/xen/interface/io/fbif.h
++++ b/include/xen/interface/io/fbif.h
+@@ -49,11 +49,27 @@ struct xenfb_update {
+ 	int32_t height;		/* rect height */
+ };
+ 
++/*
++ * Framebuffer resize notification event
++ * Capable backend sets feature-resize in xenstore.
++ */
++#define XENFB_TYPE_RESIZE 3
++
++struct xenfb_resize {
++	uint8_t type;		/* XENFB_TYPE_RESIZE */
++	int32_t width;		/* width in pixels */
++	int32_t height;		/* height in pixels */
++	int32_t stride;		/* stride in bytes */
++	int32_t depth;		/* depth in bits */
++	int32_t offset;		/* start offset within framebuffer */
++};
++
+ #define XENFB_OUT_EVENT_SIZE 40
+ 
+ union xenfb_out_event {
+ 	uint8_t type;
+ 	struct xenfb_update update;
++	struct xenfb_resize resize;
+ 	char pad[XENFB_OUT_EVENT_SIZE];
+ };
+ 
+@@ -105,15 +121,18 @@ struct xenfb_page {
+ 	 * Each directory page holds PAGE_SIZE / sizeof(*pd)
+ 	 * framebuffer pages, and can thus map up to PAGE_SIZE *
+ 	 * PAGE_SIZE / sizeof(*pd) bytes.  With PAGE_SIZE == 4096 and
+-	 * sizeof(unsigned long) == 4, that's 4 Megs.  Two directory
+-	 * pages should be enough for a while.
++	 * sizeof(unsigned long) == 4/8, that's 4 Megs 32 bit and 2
++	 * Megs 64 bit.  256 directories give enough room for a 512
++	 * Meg framebuffer with a max resolution of 12,800x10,240.
++	 * Should be enough for a while with room leftover for
++	 * expansion.
+ 	 */
+-	unsigned long pd[2];
++	unsigned long pd[256];
+ };
+ 
+ /*
+- * Wart: xenkbd needs to know resolution.  Put it here until a better
+- * solution is found, but don't leak it to the backend.
++ * Wart: xenkbd needs to know default resolution.  Put it here until a
++ * better solution is found, but don't leak it to the backend.
+  */
+ #ifdef __KERNEL__
+ #define XENFB_WIDTH 800
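As a quick check of the numbers in the enlarged pd[] comment above: with PAGE_SIZE == 4096 and 8-byte entries (64-bit), each directory page holds 512 pointers to framebuffer pages, i.e. maps 2 MiB, so pd[256] covers 256 x 2 MiB = 512 MiB; with 4-byte entries (32-bit) each directory page maps 4 MiB, for 1 GiB total.
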
+diff --git a/include/xen/interface/io/kbdif.h b/include/xen/interface/io/kbdif.h
+index fb97f42..8066c78 100644
+--- a/include/xen/interface/io/kbdif.h
++++ b/include/xen/interface/io/kbdif.h
+@@ -49,6 +49,7 @@ struct xenkbd_motion {
+ 	uint8_t type;		/* XENKBD_TYPE_MOTION */
+ 	int32_t rel_x;		/* relative X motion */
+ 	int32_t rel_y;		/* relative Y motion */
++	int32_t rel_z;		/* relative Z motion (wheel) */
+ };
+ 
+ struct xenkbd_key {
+@@ -61,6 +62,7 @@ struct xenkbd_position {
+ 	uint8_t type;		/* XENKBD_TYPE_POS */
+ 	int32_t abs_x;		/* absolute X position (in FB pixels) */
+ 	int32_t abs_y;		/* absolute Y position (in FB pixels) */
++	int32_t rel_z;		/* relative Z motion (wheel) */
+ };
+ 
+ #define XENKBD_IN_EVENT_SIZE 40
+diff --git a/include/xen/interface/memory.h b/include/xen/interface/memory.h
+index da76846..af36ead 100644
+--- a/include/xen/interface/memory.h
++++ b/include/xen/interface/memory.h
+@@ -29,7 +29,7 @@ struct xen_memory_reservation {
+      *   OUT: GMFN bases of extents that were allocated
+      *   (NB. This command also updates the mach_to_phys translation table)
+      */
+-    ulong extent_start;
++    GUEST_HANDLE(ulong) extent_start;
+ 
+     /* Number of extents, and size/alignment of each (2^extent_order pages). */
+     unsigned long  nr_extents;
+@@ -50,6 +50,7 @@ struct xen_memory_reservation {
+     domid_t        domid;
+ 
+ };
++DEFINE_GUEST_HANDLE_STRUCT(xen_memory_reservation);
+ 
+ /*
+  * Returns the maximum machine frame number of mapped RAM in this system.
+@@ -85,7 +86,7 @@ struct xen_machphys_mfn_list {
+      * any large discontiguities in the machine address space, 2MB gaps in
+      * the machphys table will be represented by an MFN base of zero.
+      */
+-    ulong extent_start;
++    GUEST_HANDLE(ulong) extent_start;
+ 
+     /*
+      * Number of extents written to the above array. This will be smaller
+@@ -93,6 +94,7 @@ struct xen_machphys_mfn_list {
+      */
+     unsigned int nr_extents;
+ };
++DEFINE_GUEST_HANDLE_STRUCT(xen_machphys_mfn_list);
+ 
+ /*
+  * Sets the GPFN at which a particular page appears in the specified guest's
+@@ -115,6 +117,7 @@ struct xen_add_to_physmap {
+     /* GPFN where the source mapping page should appear. */
+     unsigned long gpfn;
+ };
++DEFINE_GUEST_HANDLE_STRUCT(xen_add_to_physmap);
+ 
+ /*
+  * Translates a list of domain-specific GPFNs into MFNs. Returns a -ve error
+@@ -129,13 +132,14 @@ struct xen_translate_gpfn_list {
+     unsigned long nr_gpfns;
+ 
+     /* List of GPFNs to translate. */
+-    ulong gpfn_list;
++    GUEST_HANDLE(ulong) gpfn_list;
+ 
+     /*
+      * Output list to contain MFN translations. May be the same as the input
+      * list (in which case each input GPFN is overwritten with the output MFN).
+      */
+-    ulong mfn_list;
++    GUEST_HANDLE(ulong) mfn_list;
+ };
++DEFINE_GUEST_HANDLE_STRUCT(xen_translate_gpfn_list);
+ 
+ #endif /* __XEN_PUBLIC_MEMORY_H__ */
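Switching extent_start, gpfn_list and mfn_list from bare ulong to GUEST_HANDLE(ulong), plus the new DEFINE_GUEST_HANDLE_STRUCT() lines, lets these structures be passed through the guest-handle accessors on both 32- and 64-bit builds. A typical caller then wraps its array pointer before issuing the memory op, roughly as below (an illustrative fragment, not code from this hunk; frame_list and nr_frames are assumed locals):

    /* illustrative only */
    struct xen_memory_reservation reservation = {
            .nr_extents   = nr_frames,
            .extent_order = 0,
            .domid        = DOMID_SELF,
    };
    int rc;

    set_xen_guest_handle(reservation.extent_start, frame_list);
    rc = HYPERVISOR_memory_op(XENMEM_populate_physmap, &reservation);
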
+diff --git a/include/xen/interface/xen.h b/include/xen/interface/xen.h
+index 819a033..2befa3e 100644
+--- a/include/xen/interface/xen.h
++++ b/include/xen/interface/xen.h
+@@ -114,9 +114,14 @@
+  * ptr[:2]  -- Machine address within the frame whose mapping to modify.
+  *             The frame must belong to the FD, if one is specified.
+  * val      -- Value to write into the mapping entry.
++ *
++ * ptr[1:0] == MMU_PT_UPDATE_PRESERVE_AD:
++ * As MMU_NORMAL_PT_UPDATE above, but A/D bits currently in the PTE are ORed
++ * with those in @val.
+  */
+-#define MMU_NORMAL_PT_UPDATE     0 /* checked '*ptr = val'. ptr is MA.       */
+-#define MMU_MACHPHYS_UPDATE      1 /* ptr = MA of frame to modify entry for  */
++#define MMU_NORMAL_PT_UPDATE      0 /* checked '*ptr = val'. ptr is MA.       */
++#define MMU_MACHPHYS_UPDATE       1 /* ptr = MA of frame to modify entry for  */
++#define MMU_PT_UPDATE_PRESERVE_AD 2 /* atomically: *ptr = val | (*ptr&(A|D)) */
+ 
+ /*
+  * MMU EXTENDED OPERATIONS
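The new MMU_PT_UPDATE_PRESERVE_AD command is exactly what its comment says: the hypervisor writes the new value but ORs back whatever Accessed/Dirty bits are currently set in the PTE, atomically. A standalone, non-atomic model of that merge, with illustrative bit positions:

    #include <stdint.h>
    #include <stdio.h>

    #define EX_PAGE_ACCESSED (1ULL << 5)   /* illustrative A/D positions */
    #define EX_PAGE_DIRTY    (1ULL << 6)

    /* models *ptr = val | (*ptr & (A|D)); the real update is done
     * atomically inside the hypervisor */
    static uint64_t ex_pt_update_preserve_ad(uint64_t cur, uint64_t val)
    {
            return val | (cur & (EX_PAGE_ACCESSED | EX_PAGE_DIRTY));
    }

    int main(void)
    {
            uint64_t cur = 0x1000ULL | EX_PAGE_DIRTY;  /* hardware set D meanwhile */
            uint64_t val = 0x1000ULL;                  /* new value, D clear */

            printf("dirty preserved: %d\n",
                   !!(ex_pt_update_preserve_ad(cur, val) & EX_PAGE_DIRTY));
            return 0;
    }
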
+diff --git a/include/xen/xen-ops.h b/include/xen/xen-ops.h
+index 10ddfe0..a706d6a 100644
+--- a/include/xen/xen-ops.h
++++ b/include/xen/xen-ops.h
+@@ -5,4 +5,10 @@
+ 
+ DECLARE_PER_CPU(struct vcpu_info *, xen_vcpu);
+ 
++void xen_pre_suspend(void);
++void xen_post_suspend(int suspend_cancelled);
++
++void xen_mm_pin_all(void);
++void xen_mm_unpin_all(void);
++
+ #endif /* INCLUDE_XEN_OPS_H */
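Taken together, the additions to xen/events.h, xen/grant_table.h, xen/hvc-console.h and xen-ops.h are the hooks a save/restore path needs. One plausible ordering is sketched below purely as an assumption; the driver code that actually calls these is not part of this hunk:

    /*
     * assumed suspend/resume sequence, illustrative only:
     *
     *   xen_mm_pin_all();              pagetables made safe to save
     *   gnttab_suspend();
     *   xen_pre_suspend();
     *
     *   cancelled = <suspend hypercall>;
     *
     *   xen_post_suspend(cancelled);
     *   gnttab_resume();
     *   xen_irq_resume();              event channels rebound to irqs
     *   xen_console_resume();
     *   xen_mm_unpin_all();
     */
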
+diff --git a/kernel/printk.c b/kernel/printk.c
+index 8fb01c3..028ed75 100644
+--- a/kernel/printk.c
++++ b/kernel/printk.c
+@@ -121,6 +121,8 @@ struct console_cmdline
+ static struct console_cmdline console_cmdline[MAX_CMDLINECONSOLES];
+ static int selected_console = -1;
+ static int preferred_console = -1;
++int console_set_on_cmdline;
++EXPORT_SYMBOL(console_set_on_cmdline);
+ 
+ /* Flag: console code may call schedule() */
+ static int console_may_schedule;
+@@ -890,6 +892,7 @@ static int __init console_setup(char *str)
+ 	*s = 0;
+ 
+ 	__add_preferred_console(buf, idx, options, brl_options);
++	console_set_on_cmdline = 1;
+ 	return 1;
+ }
+ __setup("console=", console_setup);
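The exported console_set_on_cmdline flag simply records that console_setup() saw an explicit console= option. A likely consumer pattern, shown here only as an assumption since the caller is not part of this hunk, is to auto-select a guest console only when the user did not pick one:

    /* assumed caller, e.g. guest console setup code (illustrative) */
    static void ex_setup_guest_console(void)
    {
            if (!console_set_on_cmdline)
                    add_preferred_console("hvc", 0, NULL);
    }
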
+diff --git a/mm/mprotect.c b/mm/mprotect.c
+index a5bf31c..acfe7c8 100644
+--- a/mm/mprotect.c
++++ b/mm/mprotect.c
+@@ -47,19 +47,17 @@ static void change_pte_range(struct mm_struct *mm, pmd_t *pmd,
+ 		if (pte_present(oldpte)) {
+ 			pte_t ptent;
+ 
+-			/* Avoid an SMP race with hardware updated dirty/clean
+-			 * bits by wiping the pte and then setting the new pte
+-			 * into place.
+-			 */
+-			ptent = ptep_get_and_clear(mm, addr, pte);
++			ptent = ptep_modify_prot_start(mm, addr, pte);
+ 			ptent = pte_modify(ptent, newprot);
++
+ 			/*
+ 			 * Avoid taking write faults for pages we know to be
+ 			 * dirty.
+ 			 */
+ 			if (dirty_accountable && pte_dirty(ptent))
+ 				ptent = pte_mkwrite(ptent);
+-			set_pte_at(mm, addr, pte, ptent);
++
++			ptep_modify_prot_commit(mm, addr, pte, ptent);
+ #ifdef CONFIG_MIGRATION
+ 		} else if (!pte_file(oldpte)) {
+ 			swp_entry_t entry = pte_to_swp_entry(oldpte);
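The mprotect.c hunk replaces the wipe-then-set sequence (ptep_get_and_clear() followed by set_pte_at(), whose removed comment explains the race with hardware dirty/clean updates) with a ptep_modify_prot_start()/ptep_modify_prot_commit() pair, giving a paravirtualized backend the chance to perform the same transaction as a single A/D-preserving, batchable update. On plain hardware the pair presumably reduces to the old behaviour, roughly:

    /* sketch of an assumed generic fallback, not a quote of the tree */
    static inline pte_t ex_modify_prot_start(struct mm_struct *mm,
                                             unsigned long addr, pte_t *ptep)
    {
            return ptep_get_and_clear(mm, addr, ptep);
    }

    static inline void ex_modify_prot_commit(struct mm_struct *mm,
                                             unsigned long addr,
                                             pte_t *ptep, pte_t pte)
    {
            set_pte_at(mm, addr, ptep, pte);
    }
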

Modified: dists/trunk/linux-2.6/debian/patches/series/1~experimental.1-extra
==============================================================================
--- dists/trunk/linux-2.6/debian/patches/series/1~experimental.1-extra	(original)
+++ dists/trunk/linux-2.6/debian/patches/series/1~experimental.1-extra	Wed Jul  2 21:33:25 2008
@@ -7,3 +7,4 @@
 + features/all/xen/xenctrl-privcmd.patch featureset=xen
 + features/all/xen/xenctrl-xenbus.patch featureset=xen
 + features/all/xen/xenctrl-sys-hypervisor.patch featureset=xen
+#+ features/xen-x86.patch featureset=xen


