[linux] 01/01: [amd64] Implement Kernel Page Table Isolation (KPTI, aka KAISER) (CVE-2017-5754)

debian-kernel at lists.debian.org
Thu Jan 4 05:03:35 UTC 2018


This is an automated email from the git hooks/post-receive script.

benh pushed a commit to branch wheezy-security
in repository linux.

commit 3590e69755b1c471b3a4d0879f5eda8c390608f1
Author: Ben Hutchings <ben at decadent.org.uk>
Date:   Thu Jan 4 05:03:15 2018 +0000

    [amd64] Implement Kernel Page Table Isolation (KPTI, aka KAISER) (CVE-2017-5754)
---
 debian/changelog                                   |    7 +
 ...dd-nokaiser-boot-option-using-alternative.patch |  610 +++++++
 ...iser-alloc_ldt_struct-use-get_zeroed_page.patch |   28 +
 ...sm-tlbflush.h-handle-nopge-at-lower-level.patch |   79 +
 .../all/kpti/kaiser-disabled-on-xen-pv.patch       |   48 +
 ...er_flush_tlb_on_return_to_user-check-pcid.patch |   83 +
 .../all/kpti/kaiser-kernel-address-isolation.patch | 1909 ++++++++++++++++++++
 ...ternative-instead-of-x86_cr3_pcid_noflush.patch |  108 ++
 .../kpti/kaiser-user_map-__kprobes_text-too.patch  |   26 +
 .../kpti/kpti-rename-to-page_table_isolation.patch |  275 +++
 .../bugfix/all/kpti/kpti-report-when-enabled.patch |   44 +
 ...t-sched-core-fix-mmu_context.h-assumption.patch |   37 +
 ...h_mm_irqs_off-and-use-it-in-the-scheduler.patch |   73 +
 ...ask_exit-shouldn-t-use-switch_mm_irqs_off.patch |   41 +
 .../x86-alternatives-add-instruction-padding.patch |  348 ++++
 .../x86-alternatives-cleanup-dprintk-macro.patch   |  108 ++
 .../x86-alternatives-make-jmps-more-robust.patch   |  257 +++
 ...ternatives-use-optimized-nops-for-padding.patch |   50 +
 ...mdline-parsing-for-options-with-arguments.patch |  174 ++
 ...-carve-out-early-cmdline-parsing-function.patch |  131 ++
 ...command-line-parsing-when-matching-at-end.patch |  119 ++
 ...nd-line-parsing-when-partial-word-matches.patch |  100 +
 ...oot-pass-in-size-to-early-cmdline-parsing.patch |   59 +
 ...-boot-simplify-early-command-line-parsing.patch |   51 +
 ...-features-from-intel-document-319433-012a.patch |   30 +
 .../x86-kaiser-check-boottime-cmdline-params.patch |  118 ++
 .../x86-kaiser-move-feature-detection-up.patch     |   77 +
 .../all/kpti/x86-kaiser-reenable-paravirt.patch    |   25 +
 ...-and-simplify-x86_feature_kaiser-handling.patch |   94 +
 ...-64-fix-reboot-interaction-with-cr4.pcide.patch |   41 +
 ...noinvpcid-boot-option-to-turn-off-invpcid.patch |   72 +
 .../all/kpti/x86-mm-add-invpcid-helpers.patch      |   91 +
 ...d-the-nopcid-boot-option-to-turn-off-pcid.patch |   72 +
 ...86-mm-build-arch-x86-mm-tlb.c-even-on-smp.patch |   63 +
 .../x86-mm-disable-pcid-on-32-bit-kernels.patch    |   63 +
 ...-mm-enable-cr4.pcide-on-supported-systems.patch |  135 ++
 .../kpti/x86-mm-fix-invpcid-asm-constraint.patch   |   66 +
 ...available-use-it-to-flush-global-mappings.patch |   54 +
 .../kpti/x86-mm-kaiser-re-enable-vsyscalls.patch   |  132 ++
 ...h.h-code-always-use-the-formerly-smp-code.patch |  232 +++
 ...-mm-sched-core-turn-off-irqs-in-switch_mm.patch |   64 +
 .../x86-mm-sched-core-uninline-switch_mm.patch     |  190 ++
 .../x86-paravirt-dont-patch-flush_tlb_single.patch |   65 +
 ...-x86-vdso-Use-seqcount-instead-of-seqlock.patch |   20 +-
 ...osix-timers-thread-posix-cpu-timers-on-rt.patch |   30 +-
 debian/patches/series                              |   42 +
 46 files changed, 6508 insertions(+), 33 deletions(-)

diff --git a/debian/changelog b/debian/changelog
index 7949c4e..2187c14 100644
--- a/debian/changelog
+++ b/debian/changelog
@@ -1,3 +1,10 @@
+linux (3.2.96-3) UNRELEASED; urgency=high
+
+  * [amd64] Implement Kernel Page Table Isolation (KPTI, aka KAISER)
+    (CVE-2017-5754)
+
+ -- Ben Hutchings <ben at decadent.org.uk>  Thu, 04 Jan 2018 05:01:25 +0000
+
 linux (3.2.96-2) wheezy-security; urgency=high
 
   * [!x86] Adjust "mmap: Add an exception to the stack gap for Hotspot JVM
diff --git a/debian/patches/bugfix/all/kpti/kaiser-add-nokaiser-boot-option-using-alternative.patch b/debian/patches/bugfix/all/kpti/kaiser-add-nokaiser-boot-option-using-alternative.patch
new file mode 100644
index 0000000..30bc8c3
--- /dev/null
+++ b/debian/patches/bugfix/all/kpti/kaiser-add-nokaiser-boot-option-using-alternative.patch
@@ -0,0 +1,610 @@
+From: Hugh Dickins <hughd at google.com>
+Date: Sun, 24 Sep 2017 16:59:49 -0700
+Subject: kaiser: add "nokaiser" boot option, using ALTERNATIVE
+
+Added "nokaiser" boot option: an early param like "noinvpcid".
+Most places now check int kaiser_enabled (#defined to 0 when not
+CONFIG_KAISER) instead of #ifdef CONFIG_KAISER; but entry_64.S
+and entry_64_compat.S use the ALTERNATIVE technique, which
+patches in the preferred instructions at runtime.  That technique
+is tied to x86 cpu features, so X86_FEATURE_KAISER is fabricated
+("" in its comment so that "kaiser" is not magicked into /proc/cpuinfo).
+
+Prior to "nokaiser", Kaiser #defined _PAGE_GLOBAL to 0: revert that,
+but be careful with both _PAGE_GLOBAL and CR4.PGE: set them when
+nokaiser, just as when !CONFIG_KAISER, but set neither when kaiser
+is enabled - neither matters on its own, but it's hard to be sure that
+_PAGE_GLOBAL won't get set in some obscure corner, or that something
+won't add PGE into CR4.
+By omitting _PAGE_GLOBAL from __supported_pte_mask when kaiser_enabled,
+all page table setup which uses pte_pfn() masks it out of the ptes.
+
+It's slightly shameful that the same declaration versus definition of
+kaiser_enabled appears in not one, not two, but in three header files
+(asm/kaiser.h, asm/pgtable.h, asm/tlbflush.h).  I felt safer that way,
+than with #including any of those in any of the others; and did not
+feel it worth an asm/kaiser_enabled.h - kernel/cpu/common.c includes
+them all, so we shall hear about it if they get out of sync.
+
+Cleanups while in the area: removed the silly #ifdef CONFIG_KAISER
+from kaiser.c; removed the unused native_get_normal_pgd(); removed
+the spurious reg clutter from SWITCH_*_CR3 macro stubs; corrected some
+comments.  But more interestingly, set CR4.PSE in secondary_startup_64:
+the manual is clear that it does not matter whether it's 0 or 1 when
+4-level-pts are enabled, but I was distracted to find cr4 different on
+BSP and auxiliaries - BSP alone was adding PSE, in init_memory_mapping().
+
+(cherry picked from Change-Id: I8e5bec716944444359cbd19f6729311eff943e9a)
+
+Signed-off-by: Hugh Dickins <hughd at google.com>
+Signed-off-by: Ben Hutchings <ben at decadent.org.uk>
+---
+ Documentation/kernel-parameters.txt  |  2 ++
+ arch/x86/ia32/ia32entry.S            |  2 ++
+ arch/x86/include/asm/cpufeature.h    |  3 +++
+ arch/x86/include/asm/kaiser.h        | 27 ++++++++++++++++++++-------
+ arch/x86/include/asm/pgtable.h       | 19 +++++++++++++------
+ arch/x86/include/asm/pgtable_64.h    | 13 ++++---------
+ arch/x86/include/asm/pgtable_types.h |  4 ----
+ arch/x86/include/asm/tlbflush.h      | 35 +++++++++++++++++++++--------------
+ arch/x86/kernel/cpu/common.c         | 30 +++++++++++++++++++++++++++++-
+ arch/x86/kernel/entry_64.S           |  6 +++++-
+ arch/x86/kernel/espfix_64.c          |  3 ++-
+ arch/x86/kernel/head_64.S            |  4 ++--
+ arch/x86/mm/init.c                   |  2 +-
+ arch/x86/mm/init_64.c                | 10 ++++++++++
+ arch/x86/mm/kaiser.c                 | 26 ++++++++++++++++++++++----
+ arch/x86/mm/pgtable.c                |  8 ++------
+ arch/x86/mm/tlb.c                    |  4 +---
+ 17 files changed, 139 insertions(+), 59 deletions(-)
+
+--- a/Documentation/kernel-parameters.txt
++++ b/Documentation/kernel-parameters.txt
+@@ -1803,6 +1803,8 @@ bytes respectively. Such letter suffixes
+ 
+ 	nojitter	[IA-64] Disables jitter checking for ITC timers.
+ 
++	nokaiser	[X86-64] Disable KAISER isolation of kernel from user.
++
+ 	no-kvmclock	[X86,KVM] Disable paravirtualized KVM clock driver
+ 
+ 	no-kvmapf	[X86,KVM] Disable paravirtualized asynchronous page
+--- a/arch/x86/ia32/ia32entry.S
++++ b/arch/x86/ia32/ia32entry.S
+@@ -13,6 +13,8 @@
+ #include <asm/thread_info.h>	
+ #include <asm/segment.h>
+ #include <asm/pgtable_types.h>
++#include <asm/alternative-asm.h>
++#include <asm/cpufeature.h>
+ #include <asm/kaiser.h>
+ #include <asm/irqflags.h>
+ #include <linux/linkage.h>
+--- a/arch/x86/include/asm/cpufeature.h
++++ b/arch/x86/include/asm/cpufeature.h
+@@ -179,6 +179,9 @@
+ #define X86_FEATURE_HW_PSTATE	(7*32+ 8) /* AMD HW-PState */
+ #define X86_FEATURE_INVPCID_SINGLE (7*32+ 9) /* Effectively INVPCID && CR4.PCIDE=1 */
+ 
++/* Because the ALTERNATIVE scheme is for members of the X86_FEATURE club... */
++#define X86_FEATURE_KAISER	( 7*32+31) /* "" CONFIG_KAISER w/o nokaiser */
++
+ /* Virtualization flags: Linux defined, word 8 */
+ #define X86_FEATURE_TPR_SHADOW  (8*32+ 0) /* Intel TPR Shadow */
+ #define X86_FEATURE_VNMI        (8*32+ 1) /* Intel Virtual NMI */
+--- a/arch/x86/include/asm/kaiser.h
++++ b/arch/x86/include/asm/kaiser.h
+@@ -46,28 +46,33 @@ movq \reg, %cr3
+ .endm
+ 
+ .macro SWITCH_KERNEL_CR3
+-pushq %rax
++ALTERNATIVE "jmp 8f", "pushq %rax", X86_FEATURE_KAISER
+ _SWITCH_TO_KERNEL_CR3 %rax
+ popq %rax
++8:
+ .endm
+ 
+ .macro SWITCH_USER_CR3
+-pushq %rax
++ALTERNATIVE "jmp 8f", "pushq %rax", X86_FEATURE_KAISER
+ _SWITCH_TO_USER_CR3 %rax %al
+ popq %rax
++8:
+ .endm
+ 
+ .macro SWITCH_KERNEL_CR3_NO_STACK
+-movq %rax, PER_CPU_VAR(unsafe_stack_register_backup)
++ALTERNATIVE "jmp 8f", \
++	__stringify(movq %rax, PER_CPU_VAR(unsafe_stack_register_backup)), \
++	X86_FEATURE_KAISER
+ _SWITCH_TO_KERNEL_CR3 %rax
+ movq PER_CPU_VAR(unsafe_stack_register_backup), %rax
++8:
+ .endm
+ 
+ #else /* CONFIG_KAISER */
+ 
+-.macro SWITCH_KERNEL_CR3 reg
++.macro SWITCH_KERNEL_CR3
+ .endm
+-.macro SWITCH_USER_CR3 reg regb
++.macro SWITCH_USER_CR3
+ .endm
+ .macro SWITCH_KERNEL_CR3_NO_STACK
+ .endm
+@@ -90,6 +95,16 @@ DECLARE_PER_CPU(unsigned long, x86_cr3_p
+ 
+ extern char __per_cpu_user_mapped_start[], __per_cpu_user_mapped_end[];
+ 
++extern int kaiser_enabled;
++#else
++#define kaiser_enabled	0
++#endif /* CONFIG_KAISER */
++
++/*
++ * Kaiser function prototypes are needed even when CONFIG_KAISER is not set,
++ * so as to build with tests on kaiser_enabled instead of #ifdefs.
++ */
++
+ /**
+  *  kaiser_add_mapping - map a virtual memory part to the shadow (user) mapping
+  *  @addr: the start address of the range
+@@ -119,8 +134,6 @@ extern void kaiser_remove_mapping(unsign
+  */
+ extern void kaiser_init(void);
+ 
+-#endif /* CONFIG_KAISER */
+-
+ #endif /* __ASSEMBLY */
+ 
+ #endif /* _ASM_X86_KAISER_H */
+--- a/arch/x86/include/asm/pgtable.h
++++ b/arch/x86/include/asm/pgtable.h
+@@ -17,6 +17,11 @@
+ #ifndef __ASSEMBLY__
+ 
+ #include <asm/x86_init.h>
++#ifdef CONFIG_KAISER
++extern int kaiser_enabled;
++#else
++#define kaiser_enabled 0
++#endif
+ 
+ /*
+  * ZERO_PAGE is a global shared page that is always zero: used
+@@ -577,7 +582,7 @@ static inline int pgd_bad(pgd_t pgd)
+ 	 * page table by accident; it will fault on the first
+ 	 * instruction it tries to run.  See native_set_pgd().
+ 	 */
+-	if (IS_ENABLED(CONFIG_KAISER))
++	if (kaiser_enabled)
+ 		ignore_flags |= _PAGE_NX;
+ 
+ 	return (pgd_flags(pgd) & ~ignore_flags) != _KERNPG_TABLE;
+@@ -780,12 +785,14 @@ static inline void pmdp_set_wrprotect(st
+  */
+ static inline void clone_pgd_range(pgd_t *dst, pgd_t *src, int count)
+ {
+-       memcpy(dst, src, count * sizeof(pgd_t));
++	memcpy(dst, src, count * sizeof(pgd_t));
+ #ifdef CONFIG_KAISER
+-	/* Clone the shadow pgd part as well */
+-	memcpy(native_get_shadow_pgd(dst),
+-	       native_get_shadow_pgd(src),
+-	       count * sizeof(pgd_t));
++	if (kaiser_enabled) {
++		/* Clone the shadow pgd part as well */
++		memcpy(native_get_shadow_pgd(dst),
++			native_get_shadow_pgd(src),
++			count * sizeof(pgd_t));
++	}
+ #endif
+ }
+ 
+--- a/arch/x86/include/asm/pgtable_64.h
++++ b/arch/x86/include/asm/pgtable_64.h
+@@ -110,13 +110,12 @@ extern pgd_t kaiser_set_shadow_pgd(pgd_t
+ 
+ static inline pgd_t *native_get_shadow_pgd(pgd_t *pgdp)
+ {
++#ifdef CONFIG_DEBUG_VM
++	/* linux/mmdebug.h may not have been included at this point */
++	BUG_ON(!kaiser_enabled);
++#endif
+ 	return (pgd_t *)((unsigned long)pgdp | (unsigned long)PAGE_SIZE);
+ }
+-
+-static inline pgd_t *native_get_normal_pgd(pgd_t *pgdp)
+-{
+-	return (pgd_t *)((unsigned long)pgdp & ~(unsigned long)PAGE_SIZE);
+-}
+ #else
+ static inline pgd_t kaiser_set_shadow_pgd(pgd_t *pgdp, pgd_t pgd)
+ {
+@@ -126,10 +125,6 @@ static inline pgd_t *native_get_shadow_p
+ {
+ 	return NULL;
+ }
+-static inline pgd_t *native_get_normal_pgd(pgd_t *pgdp)
+-{
+-	return pgdp;
+-}
+ #endif /* CONFIG_KAISER */
+ 
+ static inline void native_set_pgd(pgd_t *pgdp, pgd_t pgd)
+--- a/arch/x86/include/asm/pgtable_types.h
++++ b/arch/x86/include/asm/pgtable_types.h
+@@ -39,11 +39,7 @@
+ #define _PAGE_ACCESSED	(_AT(pteval_t, 1) << _PAGE_BIT_ACCESSED)
+ #define _PAGE_DIRTY	(_AT(pteval_t, 1) << _PAGE_BIT_DIRTY)
+ #define _PAGE_PSE	(_AT(pteval_t, 1) << _PAGE_BIT_PSE)
+-#ifdef CONFIG_KAISER
+-#define _PAGE_GLOBAL	(_AT(pteval_t, 0))
+-#else
+ #define _PAGE_GLOBAL	(_AT(pteval_t, 1) << _PAGE_BIT_GLOBAL)
+-#endif
+ #define _PAGE_UNUSED1	(_AT(pteval_t, 1) << _PAGE_BIT_UNUSED1)
+ #define _PAGE_IOMAP	(_AT(pteval_t, 1) << _PAGE_BIT_IOMAP)
+ #define _PAGE_PAT	(_AT(pteval_t, 1) << _PAGE_BIT_PAT)
+--- a/arch/x86/include/asm/tlbflush.h
++++ b/arch/x86/include/asm/tlbflush.h
+@@ -69,9 +69,11 @@ static inline void invpcid_flush_all_non
+  * to avoid the need for asm/kaiser.h in unexpected places.
+  */
+ #ifdef CONFIG_KAISER
++extern int kaiser_enabled;
+ extern void kaiser_setup_pcid(void);
+ extern void kaiser_flush_tlb_on_return_to_user(void);
+ #else
++#define kaiser_enabled 0
+ static inline void kaiser_setup_pcid(void)
+ {
+ }
+@@ -96,7 +98,7 @@ static inline void __native_flush_tlb(vo
+ 	 * back:
+ 	 */
+ 	preempt_disable();
+-	if (this_cpu_has(X86_FEATURE_PCID))
++	if (kaiser_enabled && this_cpu_has(X86_FEATURE_PCID))
+ 		kaiser_flush_tlb_on_return_to_user();
+ 	native_write_cr3(native_read_cr3());
+ 	preempt_enable();
+@@ -104,13 +106,15 @@ static inline void __native_flush_tlb(vo
+ 
+ static inline void __native_flush_tlb_global(void)
+ {
+-#ifdef CONFIG_KAISER
+-	/* Globals are not used at all */
+-	__native_flush_tlb();
+-#else
+ 	unsigned long flags;
+ 	unsigned long cr4;
+ 
++	if (kaiser_enabled) {
++		/* Globals are not used at all */
++		__native_flush_tlb();
++		return;
++	}
++
+ 	if (this_cpu_has(X86_FEATURE_INVPCID)) {
+ 		/*
+ 		 * Using INVPCID is considerably faster than a pair of writes
+@@ -130,13 +134,16 @@ static inline void __native_flush_tlb_gl
+ 	raw_local_irq_save(flags);
+ 
+ 	cr4 = native_read_cr4();
+-	/* clear PGE */
+-	native_write_cr4(cr4 & ~X86_CR4_PGE);
+-	/* write old PGE again and flush TLBs */
+-	native_write_cr4(cr4);
++	if (cr4 & X86_CR4_PGE) {
++		/* clear PGE and flush TLB of all entries */
++		native_write_cr4(cr4 & ~X86_CR4_PGE);
++		/* restore PGE as it was before */
++		native_write_cr4(cr4);
++	} else {
++		native_write_cr3(native_read_cr3());
++	}
+ 
+ 	raw_local_irq_restore(flags);
+-#endif
+ }
+ 
+ static inline void __native_flush_tlb_single(unsigned long addr)
+@@ -151,7 +158,7 @@ static inline void __native_flush_tlb_si
+ 	 */
+ 
+ 	if (!this_cpu_has(X86_FEATURE_INVPCID_SINGLE)) {
+-		if (this_cpu_has(X86_FEATURE_PCID))
++		if (kaiser_enabled && this_cpu_has(X86_FEATURE_PCID))
+ 			kaiser_flush_tlb_on_return_to_user();
+ 		asm volatile("invlpg (%0)" ::"r" (addr) : "memory");
+ 		return;
+@@ -166,9 +173,9 @@ static inline void __native_flush_tlb_si
+ 	 * Make sure to do only a single invpcid when KAISER is
+ 	 * disabled and we have only a single ASID.
+ 	 */
+-	if (X86_CR3_PCID_ASID_KERN != X86_CR3_PCID_ASID_USER)
+-		invpcid_flush_one(X86_CR3_PCID_ASID_KERN, addr);
+-	invpcid_flush_one(X86_CR3_PCID_ASID_USER, addr);
++	if (kaiser_enabled)
++		invpcid_flush_one(X86_CR3_PCID_ASID_USER, addr);
++	invpcid_flush_one(X86_CR3_PCID_ASID_KERN, addr);
+ }
+ 
+ static inline void __flush_tlb_all(void)
+--- a/arch/x86/kernel/cpu/common.c
++++ b/arch/x86/kernel/cpu/common.c
+@@ -171,6 +171,20 @@ static int __init x86_pcid_setup(char *s
+ 	return 1;
+ }
+ __setup("nopcid", x86_pcid_setup);
++
++static int __init x86_nokaiser_setup(char *s)
++{
++	/* nokaiser doesn't accept parameters */
++	if (s)
++		return -EINVAL;
++#ifdef CONFIG_KAISER
++	kaiser_enabled = 0;
++	setup_clear_cpu_cap(X86_FEATURE_KAISER);
++	pr_info("nokaiser: KAISER feature disabled\n");
++#endif
++	return 0;
++}
++early_param("nokaiser", x86_nokaiser_setup);
+ #endif
+ 
+ static int __init x86_noinvpcid_setup(char *s)
+@@ -313,7 +327,8 @@ static __cpuinit void setup_smep(struct
+ static void setup_pcid(struct cpuinfo_x86 *c)
+ {
+ 	if (cpu_has(c, X86_FEATURE_PCID)) {
+-		if (cpu_has(c, X86_FEATURE_PGE) && IS_ENABLED(CONFIG_X86_64)) {
++		if (IS_ENABLED(CONFIG_X86_64) &&
++                    (cpu_has(c, X86_FEATURE_PGE) || kaiser_enabled)) {
+ 			/*
+ 			 * Regardless of whether PCID is enumerated, the
+ 			 * SDM says that it can't be enabled in 32-bit mode.
+@@ -680,6 +695,10 @@ void __cpuinit get_cpu_cap(struct cpuinf
+ 		c->x86_power = cpuid_edx(0x80000007);
+ 
+ 	init_scattered_cpuid_features(c);
++#ifdef CONFIG_KAISER
++	if (kaiser_enabled)
++		set_cpu_cap(c, X86_FEATURE_KAISER);
++#endif
+ }
+ 
+ static void __cpuinit identify_cpu_without_cpuid(struct cpuinfo_x86 *c)
+@@ -1229,6 +1248,15 @@ void __cpuinit cpu_init(void)
+ 	int cpu;
+ 	int i;
+ 
++	if (!kaiser_enabled) {
++		/*
++		 * secondary_startup_64() deferred setting PGE in cr4:
++		 * init_memory_mapping() sets it on the boot cpu,
++		 * but it needs to be set on each secondary cpu.
++		 */
++		set_in_cr4(X86_CR4_PGE);
++	}
++
+ 	cpu = stack_smp_processor_id();
+ 	t = &per_cpu(init_tss, cpu);
+ 	oist = &per_cpu(orig_ist, cpu);
+--- a/arch/x86/kernel/entry_64.S
++++ b/arch/x86/kernel/entry_64.S
+@@ -56,6 +56,8 @@
+ #include <asm/ftrace.h>
+ #include <asm/percpu.h>
+ #include <asm/pgtable_types.h>
++#include <asm/alternative-asm.h>
++#include <asm/cpufeature.h>
+ #include <asm/kaiser.h>
+ 
+ /* Avoid __ASSEMBLER__'ifying <linux/audit.h> just for this.  */
+@@ -404,7 +406,7 @@ ENTRY(save_paranoid)
+ 	 * unconditionally, but we need to find out whether the reverse
+ 	 * should be done on return (conveyed to paranoid_exit in %ebx).
+ 	 */
+-	movq	%cr3, %rax
++	ALTERNATIVE "jmp 2f", "movq %cr3, %rax", X86_FEATURE_KAISER
+ 	testl	$KAISER_SHADOW_PGD_OFFSET, %eax
+ 	jz	2f
+ 	orl	$2, %ebx
+@@ -1428,6 +1430,7 @@ paranoid_kernel:
+ 	movq	%r12, %rbx		/* restore after paranoid_userspace */
+ 	TRACE_IRQS_IRETQ 0
+ #ifdef CONFIG_KAISER
++	/* No ALTERNATIVE for X86_FEATURE_KAISER: save_paranoid sets %ebx */
+ 	testl	$2, %ebx		/* SWITCH_USER_CR3 needed? */
+ 	jz	paranoid_exit_no_switch
+ 	SWITCH_USER_CR3
+@@ -1597,6 +1600,7 @@ ENTRY(nmi)
+ nmi_kernel:
+ 	movq	%r12, %rbx		/* restore after nmi_userspace */
+ #ifdef CONFIG_KAISER
++	/* No ALTERNATIVE for X86_FEATURE_KAISER: save_paranoid sets %ebx */
+ 	testl	$2, %ebx		/* SWITCH_USER_CR3 needed? */
+ 	jz	nmi_exit_no_switch
+ 	SWITCH_USER_CR3
+--- a/arch/x86/kernel/espfix_64.c
++++ b/arch/x86/kernel/espfix_64.c
+@@ -135,9 +135,10 @@ void __init init_espfix_bsp(void)
+ 	 * area to ensure it is mapped into the shadow user page
+ 	 * tables.
+ 	 */
+-	if (IS_ENABLED(CONFIG_KAISER))
++	if (kaiser_enabled) {
+ 		set_pgd(native_get_shadow_pgd(pgd_p),
+ 			__pgd(_KERNPG_TABLE | __pa((pud_t *)espfix_pud_page)));
++	}
+ 
+ 	/* Randomize the locations */
+ 	init_espfix_random();
+--- a/arch/x86/kernel/head_64.S
++++ b/arch/x86/kernel/head_64.S
+@@ -166,8 +166,8 @@ ENTRY(secondary_startup_64)
+ 	/* Sanitize CPU configuration */
+ 	call verify_cpu
+ 
+-	/* Enable PAE mode and PGE */
+-	movl	$(X86_CR4_PAE | X86_CR4_PGE), %eax
++	/* Enable PAE and PSE, but defer PGE until kaiser_enabled is decided */
++	movl	$(X86_CR4_PAE | X86_CR4_PSE), %eax
+ 	movq	%rax, %cr4
+ 
+ 	/* Setup early boot stage 4 level pagetables. */
+--- a/arch/x86/mm/init.c
++++ b/arch/x86/mm/init.c
+@@ -161,7 +161,7 @@ unsigned long __init_refok init_memory_m
+ 		set_in_cr4(X86_CR4_PSE);
+ 
+ 	/* Enable PGE if available */
+-	if (cpu_has_pge) {
++	if (cpu_has_pge && !kaiser_enabled) {
+ 		set_in_cr4(X86_CR4_PGE);
+ 		__supported_pte_mask |= _PAGE_GLOBAL;
+ 	}
+--- a/arch/x86/mm/init_64.c
++++ b/arch/x86/mm/init_64.c
+@@ -312,6 +312,16 @@ void __init cleanup_highmap(void)
+ 			continue;
+ 		if (vaddr < (unsigned long) _text || vaddr > end)
+ 			set_pmd(pmd, __pmd(0));
++		else if (kaiser_enabled) {
++			/*
++			 * level2_kernel_pgt is initialized with _PAGE_GLOBAL:
++			 * clear that now.  This is not important, so long as
++			 * CR4.PGE remains clear, but it removes an anomaly.
++			 * Physical mapping setup below avoids _PAGE_GLOBAL
++			 * by use of massage_pgprot() inside pfn_pte() etc.
++			 */
++			set_pmd(pmd, pmd_clear_flags(*pmd, _PAGE_GLOBAL));
++		}
+ 	}
+ }
+ 
+--- a/arch/x86/mm/kaiser.c
++++ b/arch/x86/mm/kaiser.c
+@@ -21,7 +21,9 @@ extern struct mm_struct init_mm;
+ #include <asm/pgalloc.h>
+ #include <asm/desc.h>
+ 
+-#ifdef CONFIG_KAISER
++int kaiser_enabled __read_mostly = 1;
++EXPORT_SYMBOL(kaiser_enabled);	/* for inlined TLB flush functions */
++
+ DEFINE_PER_CPU_USER_MAPPED(unsigned long, unsafe_stack_register_backup);
+ 
+ /*
+@@ -165,8 +167,8 @@ static pte_t *kaiser_pagetable_walk(unsi
+ 	return pte_offset_kernel(pmd, address);
+ }
+ 
+-int kaiser_add_user_map(const void *__start_addr, unsigned long size,
+-			unsigned long flags)
++static int kaiser_add_user_map(const void *__start_addr, unsigned long size,
++			       unsigned long flags)
+ {
+ 	int ret = 0;
+ 	pte_t *pte;
+@@ -175,6 +177,15 @@ int kaiser_add_user_map(const void *__st
+ 	unsigned long end_addr = PAGE_ALIGN(start_addr + size);
+ 	unsigned long target_address;
+ 
++	/*
++	 * It is convenient for callers to pass in __PAGE_KERNEL etc,
++	 * and there is no actual harm from setting _PAGE_GLOBAL, so
++	 * long as CR4.PGE is not set.  But it is nonetheless troubling
++	 * to see Kaiser itself setting _PAGE_GLOBAL (now that "nokaiser"
++	 * requires that not to be #defined to 0): so mask it off here.
++	 */
++	flags &= ~_PAGE_GLOBAL;
++
+ 	if (flags & _PAGE_USER)
+ 		BUG_ON(address < FIXADDR_START || end_addr >= FIXADDR_TOP);
+ 
+@@ -264,6 +275,8 @@ void __init kaiser_init(void)
+ {
+ 	int cpu;
+ 
++	if (!kaiser_enabled)
++		return;
+ 	kaiser_init_all_pgds();
+ 
+ 	for_each_possible_cpu(cpu) {
+@@ -303,6 +316,8 @@ void __init kaiser_init(void)
+ /* Add a mapping to the shadow mapping, and synchronize the mappings */
+ int kaiser_add_mapping(unsigned long addr, unsigned long size, unsigned long flags)
+ {
++	if (!kaiser_enabled)
++		return 0;
+ 	return kaiser_add_user_map((const void *)addr, size, flags);
+ }
+ 
+@@ -312,6 +327,8 @@ void kaiser_remove_mapping(unsigned long
+ 	unsigned long addr;
+ 	pte_t *pte;
+ 
++	if (!kaiser_enabled)
++		return;
+ 	for (addr = start; addr < end; addr += PAGE_SIZE) {
+ 		pte = kaiser_pagetable_walk(addr);
+ 		if (pte)
+@@ -333,6 +350,8 @@ static inline bool is_userspace_pgd(pgd_
+ 
+ pgd_t kaiser_set_shadow_pgd(pgd_t *pgdp, pgd_t pgd)
+ {
++	if (!kaiser_enabled)
++		return pgd;
+ 	/*
+ 	 * Do we need to also populate the shadow pgd?  Check _PAGE_USER to
+ 	 * skip cases like kexec and EFI which make temporary low mappings.
+@@ -389,4 +408,3 @@ void kaiser_flush_tlb_on_return_to_user(
+ 			X86_CR3_PCID_USER_FLUSH | KAISER_SHADOW_PGD_OFFSET);
+ }
+ EXPORT_SYMBOL(kaiser_flush_tlb_on_return_to_user);
+-#endif /* CONFIG_KAISER */
+--- a/arch/x86/mm/pgtable.c
++++ b/arch/x86/mm/pgtable.c
+@@ -253,16 +253,12 @@ static void pgd_prepopulate_pmd(struct m
+ 	}
+ }
+ 
+-#ifdef CONFIG_KAISER
+ /*
+- * Instead of one pmd, we aquire two pmds.  Being order-1, it is
++ * Instead of one pgd, Kaiser acquires two pgds.  Being order-1, it is
+  * both 8k in size and 8k-aligned.  That lets us just flip bit 12
+  * in a pointer to swap between the two 4k halves.
+  */
+-#define PGD_ALLOCATION_ORDER 1
+-#else
+-#define PGD_ALLOCATION_ORDER 0
+-#endif
++#define PGD_ALLOCATION_ORDER	kaiser_enabled
+ 
+ static inline pgd_t *_pgd_alloc(void)
+ {
+--- a/arch/x86/mm/tlb.c
++++ b/arch/x86/mm/tlb.c
+@@ -21,8 +21,7 @@ static void load_new_mm_cr3(pgd_t *pgdir
+ {
+ 	unsigned long new_mm_cr3 = __pa(pgdir);
+ 
+-#ifdef CONFIG_KAISER
+-	if (this_cpu_has(X86_FEATURE_PCID)) {
++	if (kaiser_enabled && this_cpu_has(X86_FEATURE_PCID)) {
+ 		/*
+ 		 * We reuse the same PCID for different tasks, so we must
+ 		 * flush all the entries for the PCID out when we change tasks.
+@@ -39,7 +38,6 @@ static void load_new_mm_cr3(pgd_t *pgdir
+ 		new_mm_cr3 |= X86_CR3_PCID_KERN_FLUSH;
+ 		kaiser_flush_tlb_on_return_to_user();
+ 	}
+-#endif /* CONFIG_KAISER */
+ 
+ 	/*
+ 	 * Caution: many callers of this function expect
diff --git a/debian/patches/bugfix/all/kpti/kaiser-alloc_ldt_struct-use-get_zeroed_page.patch b/debian/patches/bugfix/all/kpti/kaiser-alloc_ldt_struct-use-get_zeroed_page.patch
new file mode 100644
index 0000000..460d4c3
--- /dev/null
+++ b/debian/patches/bugfix/all/kpti/kaiser-alloc_ldt_struct-use-get_zeroed_page.patch
@@ -0,0 +1,28 @@
+From: Hugh Dickins <hughd at google.com>
+Date: Sun, 17 Dec 2017 19:53:01 -0800
+Subject: kaiser: alloc_ldt_struct() use get_zeroed_page()
+
+Change the 3.2.96 and 3.18.72 alloc_ldt_struct() to allocate its entries
+with get_zeroed_page(), as 4.3 onwards does since f454b4788613 ("x86/ldt:
+Fix small LDT allocation for Xen").  This then matches the free_page()
+I had misported in __free_ldt_struct(), and fixes the
+"BUG: Bad page state in process ldt_gdt_32 ... flags: 0x80(slab)"
+reported by Kees Cook and Jiri Kosina, and analysed by Jiri.
+
+Signed-off-by: Hugh Dickins <hughd at google.com>
+Signed-off-by: Ben Hutchings <ben at decadent.org.uk>
+---
+ arch/x86/kernel/ldt.c | 2 +-
+ 1 file changed, 1 insertion(+), 1 deletion(-)
+
+--- a/arch/x86/kernel/ldt.c
++++ b/arch/x86/kernel/ldt.c
+@@ -70,7 +70,7 @@ static struct ldt_struct *alloc_ldt_stru
+ 	if (alloc_size > PAGE_SIZE)
+ 		new_ldt->entries = vzalloc(alloc_size);
+ 	else
+-		new_ldt->entries = kzalloc(PAGE_SIZE, GFP_KERNEL);
++		new_ldt->entries = (void *)get_zeroed_page(GFP_KERNEL);
+ 
+ 	if (!new_ldt->entries) {
+ 		kfree(new_ldt);
diff --git a/debian/patches/bugfix/all/kpti/kaiser-asm-tlbflush.h-handle-nopge-at-lower-level.patch b/debian/patches/bugfix/all/kpti/kaiser-asm-tlbflush.h-handle-nopge-at-lower-level.patch
new file mode 100644
index 0000000..fb60fce
--- /dev/null
+++ b/debian/patches/bugfix/all/kpti/kaiser-asm-tlbflush.h-handle-nopge-at-lower-level.patch
@@ -0,0 +1,79 @@
+From: Hugh Dickins <hughd at google.com>
+Date: Sat, 4 Nov 2017 18:23:24 -0700
+Subject: kaiser: asm/tlbflush.h handle noPGE at lower level
+
+I found asm/tlbflush.h too twisty, and think it safer not to avoid
+__native_flush_tlb_global_irq_disabled() in the kaiser_enabled case,
+but instead let it handle kaiser_enabled along with cr3: it can just
+use __native_flush_tlb() for that, no harm in re-disabling preemption.
+
+(This is not the same change as Kirill and Dave have suggested for
+upstream, flipping PGE in cr4: that's neat, but needs a cpu_has_pge
+check; cr3 is enough for kaiser, and thought to be cheaper than cr4.)
+
+Also delete the X86_FEATURE_INVPCID invpcid_flush_all_nonglobals()
+preference from __native_flush_tlb(): unlike the invpcid_flush_all()
+preference in __native_flush_tlb_global(), it's not seen in upstream
+4.14, and was recently reported to be surprisingly slow.
+
+(cherry picked from Change-Id: I0da819a797ff46bca6590040b6480178dff6ba1e)
+
+Signed-off-by: Hugh Dickins <hughd at google.com>
+Signed-off-by: Ben Hutchings <ben at decadent.org.uk>
+---
+ arch/x86/include/asm/tlbflush.h | 23 +++--------------------
+ 1 file changed, 3 insertions(+), 20 deletions(-)
+
+--- a/arch/x86/include/asm/tlbflush.h
++++ b/arch/x86/include/asm/tlbflush.h
+@@ -84,14 +84,6 @@ static inline void kaiser_flush_tlb_on_r
+ 
+ static inline void __native_flush_tlb(void)
+ {
+-	if (this_cpu_has(X86_FEATURE_INVPCID)) {
+-		/*
+-		 * Note, this works with CR4.PCIDE=0 or 1.
+-		 */
+-		invpcid_flush_all_nonglobals();
+-		return;
+-	}
+-
+ 	/*
+ 	 * If current->mm == NULL then we borrow a mm which may change during a
+ 	 * task switch and therefore we must not be preempted while we write CR3
+@@ -109,12 +101,6 @@ static inline void __native_flush_tlb_gl
+ 	unsigned long flags;
+ 	unsigned long cr4;
+ 
+-	if (kaiser_enabled) {
+-		/* Globals are not used at all */
+-		__native_flush_tlb();
+-		return;
+-	}
+-
+ 	if (this_cpu_has(X86_FEATURE_INVPCID)) {
+ 		/*
+ 		 * Using INVPCID is considerably faster than a pair of writes
+@@ -140,7 +126,8 @@ static inline void __native_flush_tlb_gl
+ 		/* restore PGE as it was before */
+ 		native_write_cr4(cr4);
+ 	} else {
+-		native_write_cr3(native_read_cr3());
++		/* do it with cr3, letting kaiser flush user PCID */
++		__native_flush_tlb();
+ 	}
+ 
+ 	raw_local_irq_restore(flags);
+@@ -180,11 +167,7 @@ static inline void __native_flush_tlb_si
+ 
+ static inline void __flush_tlb_all(void)
+ {
+-	if (cpu_has_pge)
+-		__flush_tlb_global();
+-	else
+-		__flush_tlb();
+-
++	__flush_tlb_global();
+ 	/*
+ 	 * Note: if we somehow had PCID but not PGE, then this wouldn't work --
+ 	 * we'd end up flushing kernel translations for the current ASID but
diff --git a/debian/patches/bugfix/all/kpti/kaiser-disabled-on-xen-pv.patch b/debian/patches/bugfix/all/kpti/kaiser-disabled-on-xen-pv.patch
new file mode 100644
index 0000000..50676f9
--- /dev/null
+++ b/debian/patches/bugfix/all/kpti/kaiser-disabled-on-xen-pv.patch
@@ -0,0 +1,48 @@
+From: Jiri Kosina <jkosina at suse.cz>
+Date: Tue, 2 Jan 2018 14:19:49 +0100
+Subject: kaiser: disabled on Xen PV
+
+Kaiser cannot be used with a paravirtualized MMU (namely, one that
+mediates reading and writing CR3).  The CR3 switch from and to the
+user-space PGD would require mapping the whole XEN_PV machinery into
+both sets of page tables.
+
+More importantly, enabling KAISER on Xen PV doesn't make much sense,
+as PV guests already use distinct %cr3 values for kernel and user mode.
+
+Signed-off-by: Jiri Kosina <jkosina at suse.cz>
+Signed-off-by: Greg Kroah-Hartman <gregkh at linuxfoundation.org>
+[bwh: Backported to 3.2: use xen_pv_domain()]
+Signed-off-by: Ben Hutchings <ben at decadent.org.uk>
+---
+ arch/x86/mm/kaiser.c | 6 ++++++
+ 1 file changed, 6 insertions(+)
+
+--- a/arch/x86/mm/kaiser.c
++++ b/arch/x86/mm/kaiser.c
+@@ -11,6 +11,7 @@
+ #include <linux/module.h>
+ #include <linux/uaccess.h>
+ #include <linux/ftrace.h>
++#include <xen/xen.h>
+ 
+ extern struct mm_struct init_mm;
+ 
+@@ -270,6 +271,9 @@ void __init kaiser_check_boottime_disabl
+ 	char arg[5];
+ 	int ret;
+ 
++	if (xen_pv_domain())
++		goto silent_disable;
++
+ 	ret = cmdline_find_option(boot_command_line, "pti", arg, sizeof(arg));
+ 	if (ret > 0) {
+ 		if (!strncmp(arg, "on", 2))
+@@ -294,6 +298,8 @@ enable:
+ 
+ disable:
+ 	pr_info("Kernel/User page tables isolation: disabled\n");
++
++silent_disable:
+ 	kaiser_enabled = 0;
+ 	setup_clear_cpu_cap(X86_FEATURE_KAISER);
+ }
diff --git a/debian/patches/bugfix/all/kpti/kaiser-kaiser_flush_tlb_on_return_to_user-check-pcid.patch b/debian/patches/bugfix/all/kpti/kaiser-kaiser_flush_tlb_on_return_to_user-check-pcid.patch
new file mode 100644
index 0000000..779f49c
--- /dev/null
+++ b/debian/patches/bugfix/all/kpti/kaiser-kaiser_flush_tlb_on_return_to_user-check-pcid.patch
@@ -0,0 +1,83 @@
+From: Hugh Dickins <hughd at google.com>
+Date: Sat, 4 Nov 2017 18:43:06 -0700
+Subject: kaiser: kaiser_flush_tlb_on_return_to_user() check PCID
+
+Let kaiser_flush_tlb_on_return_to_user() do the X86_FEATURE_PCID
+check, instead of each caller doing it inline first: nobody needs
+to optimize for the noPCID case, it's clearer this way, and better
+suits later changes.  Replace those no-op X86_CR3_PCID_KERN_FLUSH lines
+by a BUILD_BUG_ON() in load_new_mm_cr3(), in case something changes.
+
+(cherry picked from Change-Id: I9b528ed9d7c1ae4a3b4738c2894ee1740b6fb0b9)
+
+Signed-off-by: Hugh Dickins <hughd at google.com>
+Signed-off-by: Ben Hutchings <ben at decadent.org.uk>
+---
+ arch/x86/include/asm/tlbflush.h | 4 ++--
+ arch/x86/mm/kaiser.c            | 6 +++---
+ arch/x86/mm/tlb.c               | 8 ++++----
+ 3 files changed, 9 insertions(+), 9 deletions(-)
+
+--- a/arch/x86/include/asm/tlbflush.h
++++ b/arch/x86/include/asm/tlbflush.h
+@@ -90,7 +90,7 @@ static inline void __native_flush_tlb(vo
+ 	 * back:
+ 	 */
+ 	preempt_disable();
+-	if (kaiser_enabled && this_cpu_has(X86_FEATURE_PCID))
++	if (kaiser_enabled)
+ 		kaiser_flush_tlb_on_return_to_user();
+ 	native_write_cr3(native_read_cr3());
+ 	preempt_enable();
+@@ -145,7 +145,7 @@ static inline void __native_flush_tlb_si
+ 	 */
+ 
+ 	if (!this_cpu_has(X86_FEATURE_INVPCID_SINGLE)) {
+-		if (kaiser_enabled && this_cpu_has(X86_FEATURE_PCID))
++		if (kaiser_enabled)
+ 			kaiser_flush_tlb_on_return_to_user();
+ 		asm volatile("invlpg (%0)" ::"r" (addr) : "memory");
+ 		return;
+--- a/arch/x86/mm/kaiser.c
++++ b/arch/x86/mm/kaiser.c
+@@ -429,12 +429,12 @@ void kaiser_setup_pcid(void)
+ 
+ /*
+  * Make a note that this cpu will need to flush USER tlb on return to user.
+- * Caller checks whether this_cpu_has(X86_FEATURE_PCID) before calling:
+- * if cpu does not, then the NOFLUSH bit will never have been set.
++ * If cpu does not have PCID, then the NOFLUSH bit will never have been set.
+  */
+ void kaiser_flush_tlb_on_return_to_user(void)
+ {
+-	this_cpu_write(x86_cr3_pcid_user,
++	if (this_cpu_has(X86_FEATURE_PCID))
++		this_cpu_write(x86_cr3_pcid_user,
+ 			X86_CR3_PCID_USER_FLUSH | KAISER_SHADOW_PGD_OFFSET);
+ }
+ EXPORT_SYMBOL(kaiser_flush_tlb_on_return_to_user);
+--- a/arch/x86/mm/tlb.c
++++ b/arch/x86/mm/tlb.c
+@@ -21,7 +21,7 @@ static void load_new_mm_cr3(pgd_t *pgdir
+ {
+ 	unsigned long new_mm_cr3 = __pa(pgdir);
+ 
+-	if (kaiser_enabled && this_cpu_has(X86_FEATURE_PCID)) {
++	if (kaiser_enabled) {
+ 		/*
+ 		 * We reuse the same PCID for different tasks, so we must
+ 		 * flush all the entries for the PCID out when we change tasks.
+@@ -32,10 +32,10 @@ static void load_new_mm_cr3(pgd_t *pgdir
+ 		 * do it here, but can only be used if X86_FEATURE_INVPCID is
+ 		 * available - and many machines support pcid without invpcid.
+ 		 *
+-		 * The line below is a no-op: X86_CR3_PCID_KERN_FLUSH is now 0;
+-		 * but keep that line in there in case something changes.
++		 * If X86_CR3_PCID_KERN_FLUSH actually added something, then it
++		 * would be needed in the write_cr3() below - if PCIDs enabled.
+ 		 */
+-		new_mm_cr3 |= X86_CR3_PCID_KERN_FLUSH;
++		BUILD_BUG_ON(X86_CR3_PCID_KERN_FLUSH);
+ 		kaiser_flush_tlb_on_return_to_user();
+ 	}
+ 
diff --git a/debian/patches/bugfix/all/kpti/kaiser-kernel-address-isolation.patch b/debian/patches/bugfix/all/kpti/kaiser-kernel-address-isolation.patch
new file mode 100644
index 0000000..bf7b417
--- /dev/null
+++ b/debian/patches/bugfix/all/kpti/kaiser-kernel-address-isolation.patch
@@ -0,0 +1,1909 @@
+From: Hugh Dickins <hughd at google.com>
+Date: Mon, 11 Dec 2017 17:59:50 -0800
+Subject: KAISER: Kernel Address Isolation
+
+This patch introduces our implementation of KAISER (Kernel Address
+Isolation to have Side-channels Efficiently Removed), a kernel isolation
+technique to close hardware side channels on kernel address information.
+
+More information about the original patch can be found at:
+https://github.com/IAIK/KAISER
+http://marc.info/?l=linux-kernel&m=149390087310405&w=2
+
+Daniel Gruss <daniel.gruss at iaik.tugraz.at>
+Richard Fellner <richard.fellner at student.tugraz.at>
+Michael Schwarz <michael.schwarz at iaik.tugraz.at>
+<clementine.maurice at iaik.tugraz.at>
+<moritz.lipp at iaik.tugraz.at>
+
+That original was then developed further by
+Dave Hansen <dave.hansen at intel.com>
+Hugh Dickins <hughd at google.com>
+then others after this snapshot.
+
+This combined patch for 3.2.96 was derived from hughd's patches below
+for 3.18.72, in 2017-12-04's kaiser-3.18.72.tar; except for the last,
+which was sent in 2017-12-09's nokaiser-3.18.72.tar.  They have been
+combined in order to minimize the effort of rebasing: most of the
+patches in the 3.18.72 series were small fixes and cleanups and
+enhancements to three large patches.  About the only new work in this
+backport is a simple reimplementation of kaiser_remove_mapping(),
+since mm/pageattr.c changed a lot between 3.2 and 3.18 and the
+Kaiser mods there never seemed necessary.
+
+KAISER: Kernel Address Isolation
+kaiser: merged update
+kaiser: do not set _PAGE_NX on pgd_none
+kaiser: stack map PAGE_SIZE at THREAD_SIZE-PAGE_SIZE
+kaiser: fix build and FIXME in alloc_ldt_struct()
+kaiser: KAISER depends on SMP
+kaiser: fix regs to do_nmi() ifndef CONFIG_KAISER
+kaiser: fix perf crashes
+kaiser: ENOMEM if kaiser_pagetable_walk() NULL
+kaiser: tidied up asm/kaiser.h somewhat
+kaiser: tidied up kaiser_add/remove_mapping slightly
+kaiser: kaiser_remove_mapping() move along the pgd
+kaiser: align addition to x86/mm/Makefile
+kaiser: cleanups while trying for gold link
+kaiser: name that 0x1000 KAISER_SHADOW_PGD_OFFSET
+kaiser: delete KAISER_REAL_SWITCH option
+kaiser: vmstat show NR_KAISERTABLE as nr_overhead
+kaiser: enhanced by kernel and user PCIDs
+kaiser: load_new_mm_cr3() let SWITCH_USER_CR3 flush user
+kaiser: PCID 0 for kernel and 128 for user
+kaiser: x86_cr3_pcid_noflush and x86_cr3_pcid_user
+kaiser: paranoid_entry pass cr3 need to paranoid_exit
+kaiser: _pgd_alloc() without __GFP_REPEAT to avoid stalls
+kaiser: fix unlikely error in alloc_ldt_struct()
+kaiser: drop is_atomic arg to kaiser_pagetable_walk()
+
+Signed-off-by: Hugh Dickins <hughd at google.com>
+[bwh:
+ - Fixed the #undef in arch/x86/boot/compressed/misc.h
+ - Add missing #include in arch/x86/mm/kaiser.c]
+Signed-off-by: Ben Hutchings <ben at decadent.org.uk>
+[bwh: For wheezy, renumber X86_FEATURE_INVPCID_SINGLE to avoid collision with
+ X86_FEATURE_HW_PSTATE]
+---
+ arch/x86/boot/compressed/misc.h           |   1 +
+ arch/x86/ia32/ia32entry.S                 |   7 +
+ arch/x86/include/asm/cpufeature.h         |   1 +
+ arch/x86/include/asm/desc.h               |   2 +-
+ arch/x86/include/asm/hw_irq.h             |   2 +-
+ arch/x86/include/asm/kaiser.h             | 126 ++++++++++
+ arch/x86/include/asm/pgtable.h            |  18 +-
+ arch/x86/include/asm/pgtable_64.h         |  29 ++-
+ arch/x86/include/asm/pgtable_types.h      |  33 ++-
+ arch/x86/include/asm/processor-flags.h    |   2 +
+ arch/x86/include/asm/processor.h          |   2 +-
+ arch/x86/include/asm/tlbflush.h           |  64 ++++-
+ arch/x86/kernel/cpu/common.c              |  18 +-
+ arch/x86/kernel/cpu/perf_event_intel_ds.c |  54 ++++-
+ arch/x86/kernel/entry_64.S                | 117 +++++++--
+ arch/x86/kernel/espfix_64.c               |   9 +
+ arch/x86/kernel/head_64.S                 |  25 +-
+ arch/x86/kernel/init_task.c               |   2 +-
+ arch/x86/kernel/irqinit.c                 |   2 +-
+ arch/x86/kernel/ldt.c                     |  25 +-
+ arch/x86/kernel/process_64.c              |   2 +-
+ arch/x86/mm/Makefile                      |   1 +
+ arch/x86/mm/kaiser.c                      | 382 ++++++++++++++++++++++++++++++
+ arch/x86/mm/pgtable.c                     |  31 ++-
+ arch/x86/mm/tlb.c                         |  48 +++-
+ include/asm-generic/vmlinux.lds.h         |   7 +
+ include/linux/kaiser.h                    |  52 ++++
+ include/linux/mmzone.h                    |   3 +-
+ include/linux/percpu-defs.h               |  32 ++-
+ init/main.c                               |   2 +
+ kernel/fork.c                             |   6 +
+ mm/vmstat.c                               |   1 +
+ security/Kconfig                          |  10 +
+ 33 files changed, 1049 insertions(+), 67 deletions(-)
+ create mode 100644 arch/x86/include/asm/kaiser.h
+ create mode 100644 arch/x86/mm/kaiser.c
+ create mode 100644 include/linux/kaiser.h
+
+--- a/arch/x86/boot/compressed/misc.h
++++ b/arch/x86/boot/compressed/misc.h
+@@ -7,6 +7,7 @@
+  * we just keep it from happening
+  */
+ #undef CONFIG_PARAVIRT
++#undef CONFIG_KAISER
+ #ifdef CONFIG_X86_32
+ #define _ASM_X86_DESC_H 1
+ #endif
+--- a/arch/x86/ia32/ia32entry.S
++++ b/arch/x86/ia32/ia32entry.S
+@@ -12,6 +12,8 @@
+ #include <asm/ia32_unistd.h>	
+ #include <asm/thread_info.h>	
+ #include <asm/segment.h>
++#include <asm/pgtable_types.h>
++#include <asm/kaiser.h>
+ #include <asm/irqflags.h>
+ #include <linux/linkage.h>
+ 
+@@ -120,6 +122,7 @@ ENTRY(ia32_sysenter_target)
+ 	CFI_DEF_CFA	rsp,0
+ 	CFI_REGISTER	rsp,rbp
+ 	SWAPGS_UNSAFE_STACK
++	SWITCH_KERNEL_CR3_NO_STACK
+ 	movq	PER_CPU_VAR(kernel_stack), %rsp
+ 	addq	$(KERNEL_STACK_OFFSET),%rsp
+ 	/*
+@@ -183,6 +186,7 @@ sysexit_from_sys_call:
+ 	popq_cfi %rcx				/* User %esp */
+ 	CFI_REGISTER rsp,rcx
+ 	TRACE_IRQS_ON
++	SWITCH_USER_CR3
+ 	ENABLE_INTERRUPTS_SYSEXIT32
+ 
+ #ifdef CONFIG_AUDITSYSCALL
+@@ -281,6 +285,7 @@ ENTRY(ia32_cstar_target)
+ 	CFI_REGISTER	rip,rcx
+ 	/*CFI_REGISTER	rflags,r11*/
+ 	SWAPGS_UNSAFE_STACK
++	SWITCH_KERNEL_CR3_NO_STACK
+ 	movl	%esp,%r8d
+ 	CFI_REGISTER	rsp,r8
+ 	movq	PER_CPU_VAR(kernel_stack),%rsp
+@@ -337,6 +342,7 @@ sysretl_from_sys_call:
+ 	xorq	%r9,%r9
+ 	xorq	%r8,%r8
+ 	TRACE_IRQS_ON
++	SWITCH_USER_CR3
+ 	movl RSP-ARGOFFSET(%rsp),%esp
+ 	CFI_RESTORE rsp
+ 	USERGS_SYSRET32
+@@ -409,6 +415,7 @@ ENTRY(ia32_syscall)
+ 	CFI_REL_OFFSET	rip,RIP-RIP
+ 	PARAVIRT_ADJUST_EXCEPTION_FRAME
+ 	SWAPGS
++	SWITCH_KERNEL_CR3_NO_STACK
+ 	/*
+ 	 * No need to follow this irqs on/off section: the syscall
+ 	 * disabled irqs and here we enable it straight after entry:
+--- a/arch/x86/include/asm/cpufeature.h
++++ b/arch/x86/include/asm/cpufeature.h
+@@ -177,6 +177,7 @@
+ #define X86_FEATURE_PTS		(7*32+ 6) /* Intel Package Thermal Status */
+ #define X86_FEATURE_DTHERM	(7*32+ 7) /* Digital Thermal Sensor */
+ #define X86_FEATURE_HW_PSTATE	(7*32+ 8) /* AMD HW-PState */
++#define X86_FEATURE_INVPCID_SINGLE (7*32+ 9) /* Effectively INVPCID && CR4.PCIDE=1 */
+ 
+ /* Virtualization flags: Linux defined, word 8 */
+ #define X86_FEATURE_TPR_SHADOW  (8*32+ 0) /* Intel TPR Shadow */
+--- a/arch/x86/include/asm/desc.h
++++ b/arch/x86/include/asm/desc.h
+@@ -40,7 +40,7 @@ struct gdt_page {
+ 	struct desc_struct gdt[GDT_ENTRIES];
+ } __attribute__((aligned(PAGE_SIZE)));
+ 
+-DECLARE_PER_CPU_PAGE_ALIGNED(struct gdt_page, gdt_page);
++DECLARE_PER_CPU_PAGE_ALIGNED_USER_MAPPED(struct gdt_page, gdt_page);
+ 
+ static inline struct desc_struct *get_cpu_gdt_table(unsigned int cpu)
+ {
+--- a/arch/x86/include/asm/hw_irq.h
++++ b/arch/x86/include/asm/hw_irq.h
+@@ -164,7 +164,7 @@ extern asmlinkage void smp_invalidate_in
+ extern void (*__initconst interrupt[NR_VECTORS-FIRST_EXTERNAL_VECTOR])(void);
+ 
+ typedef int vector_irq_t[NR_VECTORS];
+-DECLARE_PER_CPU(vector_irq_t, vector_irq);
++DECLARE_PER_CPU_USER_MAPPED(vector_irq_t, vector_irq);
+ extern void setup_vector_irq(int cpu);
+ 
+ #ifdef CONFIG_X86_IO_APIC
+--- /dev/null
++++ b/arch/x86/include/asm/kaiser.h
+@@ -0,0 +1,126 @@
++#ifndef _ASM_X86_KAISER_H
++#define _ASM_X86_KAISER_H
++
++#include <asm/processor-flags.h> /* For PCID constants */
++
++/*
++ * This file includes the definitions for the KAISER feature.
+ * KAISER is a countermeasure against x86_64 side-channel attacks on
+ * kernel virtual memory.  It keeps a shadow pgd for every process: the
+ * shadow pgd maps only a minimal set of kernel pages, but includes the
+ * whole of user memory. On entry to the kernel (context switch, interrupt)
+ * the pgd is switched to the normal one; when the system returns to user
+ * mode, the shadow pgd is loaded again. Kernel addresses are thus unmapped
+ * in user mode, so user code cannot probe the kernel address space.
++ *
++ * A minimalistic kernel mapping holds the parts needed to be mapped in user
++ * mode, such as the entry/exit functions of the user space, or the stacks.
++ */
++
++#define KAISER_SHADOW_PGD_OFFSET 0x1000
++
++#ifdef __ASSEMBLY__
++#ifdef CONFIG_KAISER
++
++.macro _SWITCH_TO_KERNEL_CR3 reg
++movq %cr3, \reg
++andq $(~(X86_CR3_PCID_ASID_MASK | KAISER_SHADOW_PGD_OFFSET)), \reg
++orq  x86_cr3_pcid_noflush, \reg
++movq \reg, %cr3
++.endm
++
++.macro _SWITCH_TO_USER_CR3 reg regb
++/*
++ * regb must be the low byte portion of reg: because we have arranged
++ * for the low byte of the user PCID to serve as the high byte of NOFLUSH
++ * (0x80 for each when PCID is enabled, or 0x00 when PCID and NOFLUSH are
++ * not enabled): so that the one register can update both memory and cr3.
++ */
++movq %cr3, \reg
++orq  PER_CPU_VAR(x86_cr3_pcid_user), \reg
++js   9f
++/* FLUSH this time, reset to NOFLUSH for next time (if PCID enabled) */
++movb \regb, PER_CPU_VAR(x86_cr3_pcid_user+7)
++9:
++movq \reg, %cr3
++.endm
++
++.macro SWITCH_KERNEL_CR3
++pushq %rax
++_SWITCH_TO_KERNEL_CR3 %rax
++popq %rax
++.endm
++
++.macro SWITCH_USER_CR3
++pushq %rax
++_SWITCH_TO_USER_CR3 %rax %al
++popq %rax
++.endm
++
++.macro SWITCH_KERNEL_CR3_NO_STACK
++movq %rax, PER_CPU_VAR(unsafe_stack_register_backup)
++_SWITCH_TO_KERNEL_CR3 %rax
++movq PER_CPU_VAR(unsafe_stack_register_backup), %rax
++.endm
++
++#else /* CONFIG_KAISER */
++
++.macro SWITCH_KERNEL_CR3 reg
++.endm
++.macro SWITCH_USER_CR3 reg regb
++.endm
++.macro SWITCH_KERNEL_CR3_NO_STACK
++.endm
++
++#endif /* CONFIG_KAISER */
++
++#else /* __ASSEMBLY__ */
++
++#ifdef CONFIG_KAISER
++/*
++ * Upon kernel/user mode switch, it may happen that the address
++ * space has to be switched before the registers have been
++ * stored.  To change the address space, another register is
++ * needed.  A register therefore has to be stored/restored.
++*/
++DECLARE_PER_CPU_USER_MAPPED(unsigned long, unsafe_stack_register_backup);
++
++extern unsigned long x86_cr3_pcid_noflush;
++DECLARE_PER_CPU(unsigned long, x86_cr3_pcid_user);
++
++extern char __per_cpu_user_mapped_start[], __per_cpu_user_mapped_end[];
++
++/**
++ *  kaiser_add_mapping - map a virtual memory part to the shadow (user) mapping
++ *  @addr: the start address of the range
++ *  @size: the size of the range
++ *  @flags: The mapping flags of the pages
++ *
++ *  The mapping is done on a global scope, so no bigger
++ *  synchronization has to be done.  the pages have to be
++ *  manually unmapped again when they are not needed any longer.
++ */
++extern int kaiser_add_mapping(unsigned long addr, unsigned long size, unsigned long flags);
++
++/**
++ *  kaiser_remove_mapping - unmap a virtual memory part of the shadow mapping
++ *  @addr: the start address of the range
++ *  @size: the size of the range
++ */
++extern void kaiser_remove_mapping(unsigned long start, unsigned long size);
++
++/**
++ *  kaiser_init - Initialize the shadow mapping
++ *
++ *  Most parts of the shadow mapping can be mapped upon boot
++ *  time.  Only per-process things like the thread stacks
++ *  or a new LDT have to be mapped at runtime.  These boot-
++ *  time mappings are permanent and never unmapped.
++ */
++extern void kaiser_init(void);
++
++#endif /* CONFIG_KAISER */
++
++#endif /* __ASSEMBLY */
++
++#endif /* _ASM_X86_KAISER_H */
+--- a/arch/x86/include/asm/pgtable.h
++++ b/arch/x86/include/asm/pgtable.h
+@@ -570,7 +570,17 @@ static inline pud_t *pud_offset(pgd_t *p
+ 
+ static inline int pgd_bad(pgd_t pgd)
+ {
+-	return (pgd_flags(pgd) & ~_PAGE_USER) != _KERNPG_TABLE;
++	pgdval_t ignore_flags = _PAGE_USER;
++	/*
++	 * We set NX on KAISER pgds that map userspace memory so
++	 * that userspace can not meaningfully use the kernel
++	 * page table by accident; it will fault on the first
++	 * instruction it tries to run.  See native_set_pgd().
++	 */
++	if (IS_ENABLED(CONFIG_KAISER))
++		ignore_flags |= _PAGE_NX;
++
++	return (pgd_flags(pgd) & ~ignore_flags) != _KERNPG_TABLE;
+ }
+ 
+ static inline int pgd_none(pgd_t pgd)
+@@ -771,6 +781,12 @@ static inline void pmdp_set_wrprotect(st
+ static inline void clone_pgd_range(pgd_t *dst, pgd_t *src, int count)
+ {
+        memcpy(dst, src, count * sizeof(pgd_t));
++#ifdef CONFIG_KAISER
++	/* Clone the shadow pgd part as well */
++	memcpy(native_get_shadow_pgd(dst),
++	       native_get_shadow_pgd(src),
++	       count * sizeof(pgd_t));
++#endif
+ }
+ 
+ 
+--- a/arch/x86/include/asm/pgtable_64.h
++++ b/arch/x86/include/asm/pgtable_64.h
+@@ -105,9 +105,36 @@ static inline void native_pud_clear(pud_
+ 	native_set_pud(pud, native_make_pud(0));
+ }
+ 
++#ifdef CONFIG_KAISER
++extern pgd_t kaiser_set_shadow_pgd(pgd_t *pgdp, pgd_t pgd);
++
++static inline pgd_t *native_get_shadow_pgd(pgd_t *pgdp)
++{
++	return (pgd_t *)((unsigned long)pgdp | (unsigned long)PAGE_SIZE);
++}
++
++static inline pgd_t *native_get_normal_pgd(pgd_t *pgdp)
++{
++	return (pgd_t *)((unsigned long)pgdp & ~(unsigned long)PAGE_SIZE);
++}
++#else
++static inline pgd_t kaiser_set_shadow_pgd(pgd_t *pgdp, pgd_t pgd)
++{
++	return pgd;
++}
++static inline pgd_t *native_get_shadow_pgd(pgd_t *pgdp)
++{
++	return NULL;
++}
++static inline pgd_t *native_get_normal_pgd(pgd_t *pgdp)
++{
++	return pgdp;
++}
++#endif /* CONFIG_KAISER */
++
+ static inline void native_set_pgd(pgd_t *pgdp, pgd_t pgd)
+ {
+-	*pgdp = pgd;
++	*pgdp = kaiser_set_shadow_pgd(pgdp, pgd);
+ }
+ 
+ static inline void native_pgd_clear(pgd_t *pgd)
+--- a/arch/x86/include/asm/pgtable_types.h
++++ b/arch/x86/include/asm/pgtable_types.h
+@@ -39,7 +39,11 @@
+ #define _PAGE_ACCESSED	(_AT(pteval_t, 1) << _PAGE_BIT_ACCESSED)
+ #define _PAGE_DIRTY	(_AT(pteval_t, 1) << _PAGE_BIT_DIRTY)
+ #define _PAGE_PSE	(_AT(pteval_t, 1) << _PAGE_BIT_PSE)
++#ifdef CONFIG_KAISER
++#define _PAGE_GLOBAL	(_AT(pteval_t, 0))
++#else
+ #define _PAGE_GLOBAL	(_AT(pteval_t, 1) << _PAGE_BIT_GLOBAL)
++#endif
+ #define _PAGE_UNUSED1	(_AT(pteval_t, 1) << _PAGE_BIT_UNUSED1)
+ #define _PAGE_IOMAP	(_AT(pteval_t, 1) << _PAGE_BIT_IOMAP)
+ #define _PAGE_PAT	(_AT(pteval_t, 1) << _PAGE_BIT_PAT)
+@@ -62,7 +66,7 @@
+ #endif
+ 
+ #define _PAGE_FILE	(_AT(pteval_t, 1) << _PAGE_BIT_FILE)
+-#define _PAGE_PROTNONE	(_AT(pteval_t, 1) << _PAGE_BIT_PROTNONE)
++#define _PAGE_PROTNONE  (_AT(pteval_t, 1) << _PAGE_BIT_PROTNONE)
+ 
+ #define _PAGE_TABLE	(_PAGE_PRESENT | _PAGE_RW | _PAGE_USER |	\
+ 			 _PAGE_ACCESSED | _PAGE_DIRTY)
+@@ -74,6 +78,33 @@
+ 			 _PAGE_SPECIAL | _PAGE_ACCESSED | _PAGE_DIRTY)
+ #define _HPAGE_CHG_MASK (_PAGE_CHG_MASK | _PAGE_PSE)
+ 
++/* The ASID is the lower 12 bits of CR3 */
++#define X86_CR3_PCID_ASID_MASK  (_AC((1<<12)-1,UL))
++
++/* Mask for all the PCID-related bits in CR3: */
++#define X86_CR3_PCID_MASK       (X86_CR3_PCID_NOFLUSH | X86_CR3_PCID_ASID_MASK)
++#define X86_CR3_PCID_ASID_KERN  (_AC(0x0,UL))
++
++#if defined(CONFIG_KAISER) && defined(CONFIG_X86_64)
++/* Let X86_CR3_PCID_ASID_USER be usable for the X86_CR3_PCID_NOFLUSH bit */
++#define X86_CR3_PCID_ASID_USER	(_AC(0x80,UL))
++
++#define X86_CR3_PCID_KERN_FLUSH		(X86_CR3_PCID_ASID_KERN)
++#define X86_CR3_PCID_USER_FLUSH		(X86_CR3_PCID_ASID_USER)
++#define X86_CR3_PCID_KERN_NOFLUSH	(X86_CR3_PCID_NOFLUSH | X86_CR3_PCID_ASID_KERN)
++#define X86_CR3_PCID_USER_NOFLUSH	(X86_CR3_PCID_NOFLUSH | X86_CR3_PCID_ASID_USER)
++#else
++#define X86_CR3_PCID_ASID_USER  (_AC(0x0,UL))
++/*
++ * PCIDs are unsupported on 32-bit and none of these bits can be
++ * set in CR3:
++ */
++#define X86_CR3_PCID_KERN_FLUSH		(0)
++#define X86_CR3_PCID_USER_FLUSH		(0)
++#define X86_CR3_PCID_KERN_NOFLUSH	(0)
++#define X86_CR3_PCID_USER_NOFLUSH	(0)
++#endif
++
+ #define _PAGE_CACHE_MASK	(_PAGE_PCD | _PAGE_PWT)
+ #define _PAGE_CACHE_WB		(0)
+ #define _PAGE_CACHE_WC		(_PAGE_PWT)
+--- a/arch/x86/include/asm/processor-flags.h
++++ b/arch/x86/include/asm/processor-flags.h
+@@ -43,6 +43,8 @@
+  */
+ #define X86_CR3_PWT	0x00000008 /* Page Write Through */
+ #define X86_CR3_PCD	0x00000010 /* Page Cache Disable */
++#define X86_CR3_PCID_NOFLUSH_BIT 63 /* Preserve old PCID */
++#define X86_CR3_PCID_NOFLUSH (_AC(1,ULL) << X86_CR3_PCID_NOFLUSH_BIT)
+ 
+ /*
+  * Intel CPU features in CR4
+--- a/arch/x86/include/asm/processor.h
++++ b/arch/x86/include/asm/processor.h
+@@ -266,7 +266,7 @@ struct tss_struct {
+ 
+ } ____cacheline_aligned;
+ 
+-DECLARE_PER_CPU_SHARED_ALIGNED(struct tss_struct, init_tss);
++DECLARE_PER_CPU_SHARED_ALIGNED_USER_MAPPED(struct tss_struct, init_tss);
+ 
+ /*
+  * Save the original ist values for checking stack pointers during debugging
+--- a/arch/x86/include/asm/tlbflush.h
++++ b/arch/x86/include/asm/tlbflush.h
+@@ -64,27 +64,59 @@ static inline void invpcid_flush_all_non
+ #define __flush_tlb_single(addr) __native_flush_tlb_single(addr)
+ #endif
+ 
++/*
++ * Declare a couple of kaiser interfaces here for convenience,
++ * to avoid the need for asm/kaiser.h in unexpected places.
++ */
++#ifdef CONFIG_KAISER
++extern void kaiser_setup_pcid(void);
++extern void kaiser_flush_tlb_on_return_to_user(void);
++#else
++static inline void kaiser_setup_pcid(void)
++{
++}
++static inline void kaiser_flush_tlb_on_return_to_user(void)
++{
++}
++#endif
++
+ static inline void __native_flush_tlb(void)
+ {
++	if (this_cpu_has(X86_FEATURE_INVPCID)) {
++		/*
++		 * Note, this works with CR4.PCIDE=0 or 1.
++		 */
++		invpcid_flush_all_nonglobals();
++		return;
++	}
++
+ 	/*
+ 	 * If current->mm == NULL then we borrow a mm which may change during a
+ 	 * task switch and therefore we must not be preempted while we write CR3
+ 	 * back:
+ 	 */
+ 	preempt_disable();
++	if (this_cpu_has(X86_FEATURE_PCID))
++		kaiser_flush_tlb_on_return_to_user();
+ 	native_write_cr3(native_read_cr3());
+ 	preempt_enable();
+ }
+ 
+ static inline void __native_flush_tlb_global(void)
+ {
++#ifdef CONFIG_KAISER
++	/* Globals are not used at all */
++	__native_flush_tlb();
++#else
+ 	unsigned long flags;
+ 	unsigned long cr4;
+ 
+-	if (static_cpu_has(X86_FEATURE_INVPCID)) {
++	if (this_cpu_has(X86_FEATURE_INVPCID)) {
+ 		/*
+ 		 * Using INVPCID is considerably faster than a pair of writes
+ 		 * to CR4 sandwiched inside an IRQ flag save/restore.
++		 *
++	 	 * Note, this works with CR4.PCIDE=0 or 1.
+ 		 */
+ 		invpcid_flush_all();
+ 		return;
+@@ -104,11 +136,39 @@ static inline void __native_flush_tlb_gl
+ 	native_write_cr4(cr4);
+ 
+ 	raw_local_irq_restore(flags);
++#endif
+ }
+ 
+ static inline void __native_flush_tlb_single(unsigned long addr)
+ {
+-	asm volatile("invlpg (%0)" ::"r" (addr) : "memory");
++	/*
++	 * SIMICS #GP's if you run INVPCID with type 2/3
++	 * and X86_CR4_PCIDE clear.  Shame!
++	 *
++	 * The ASIDs used below are hard-coded.  But, we must not
++	 * call invpcid(type=1/2) before CR4.PCIDE=1.  Just call
++	 * invlpg in the case we are called early.
++	 */
++
++	if (!this_cpu_has(X86_FEATURE_INVPCID_SINGLE)) {
++		if (this_cpu_has(X86_FEATURE_PCID))
++			kaiser_flush_tlb_on_return_to_user();
++		asm volatile("invlpg (%0)" ::"r" (addr) : "memory");
++		return;
++	}
++	/* Flush the address out of both PCIDs. */
++	/*
++	 * An optimization here might be to determine addresses
++	 * that are only kernel-mapped and only flush the kernel
++	 * ASID.  But, userspace flushes are probably much more
++	 * important performance-wise.
++	 *
++	 * Make sure to do only a single invpcid when KAISER is
++	 * disabled and we have only a single ASID.
++	 */
++	if (X86_CR3_PCID_ASID_KERN != X86_CR3_PCID_ASID_USER)
++		invpcid_flush_one(X86_CR3_PCID_ASID_KERN, addr);
++	invpcid_flush_one(X86_CR3_PCID_ASID_USER, addr);
+ }
+ 
+ static inline void __flush_tlb_all(void)
+--- a/arch/x86/kernel/cpu/common.c
++++ b/arch/x86/kernel/cpu/common.c
+@@ -84,7 +84,7 @@ static const struct cpu_dev __cpuinitcon
+ 
+ static const struct cpu_dev *this_cpu __cpuinitdata = &default_cpu;
+ 
+-DEFINE_PER_CPU_PAGE_ALIGNED(struct gdt_page, gdt_page) = { .gdt = {
++DEFINE_PER_CPU_PAGE_ALIGNED_USER_MAPPED(struct gdt_page, gdt_page) = { .gdt = {
+ #ifdef CONFIG_X86_64
+ 	/*
+ 	 * We need valid kernel segments for data and code in long mode too
+@@ -319,6 +319,19 @@ static void setup_pcid(struct cpuinfo_x8
+ 			 * SDM says that it can't be enabled in 32-bit mode.
+ 			 */
+ 			set_in_cr4(X86_CR4_PCIDE);
++			/*
++			 * INVPCID has two "groups" of types:
++			 * 1/2: Invalidate an individual address
++			 * 3/4: Invalidate all contexts
++			 *
++			 * 1/2 take a PCID, but 3/4 do not.  So, 3/4
++			 * ignore the PCID argument in the descriptor.
++			 * But, we have to be careful not to call 1/2
++			 * with an actual non-zero PCID in them before
++			 * we do the above set_in_cr4().
++			 */
++			if (cpu_has(c, X86_FEATURE_INVPCID))
++				set_cpu_cap(c, X86_FEATURE_INVPCID_SINGLE);
+ 		} else {
+ 			/*
+ 			 * flush_tlb_all(), as currently implemented, won't
+@@ -331,6 +344,7 @@ static void setup_pcid(struct cpuinfo_x8
+ 			clear_cpu_cap(c, X86_FEATURE_PCID);
+ 		}
+ 	}
++	kaiser_setup_pcid();
+ }
+ 
+ /*
+@@ -1115,7 +1129,7 @@ static const unsigned int exception_stac
+ 	  [DEBUG_STACK - 1]			= DEBUG_STKSZ
+ };
+ 
+-static DEFINE_PER_CPU_PAGE_ALIGNED(char, exception_stacks
++DEFINE_PER_CPU_PAGE_ALIGNED_USER_MAPPED(char, exception_stacks
+ 	[(N_EXCEPTION_STACKS - 1) * EXCEPTION_STKSZ + DEBUG_STKSZ]);
+ 
+ /* May not be marked __init: used by software suspend */
+--- a/arch/x86/kernel/cpu/perf_event_intel_ds.c
++++ b/arch/x86/kernel/cpu/perf_event_intel_ds.c
+@@ -2,10 +2,14 @@
+ #include <linux/types.h>
+ #include <linux/slab.h>
+ 
++#include <asm/kaiser.h>
+ #include <asm/perf_event.h>
+ 
+ #include "perf_event.h"
+ 
++static
++DEFINE_PER_CPU_SHARED_ALIGNED_USER_MAPPED(struct debug_store, cpu_debug_store);
++
+ /* The size of a BTS record in bytes: */
+ #define BTS_RECORD_SIZE		24
+ 
+@@ -60,6 +64,39 @@ void fini_debug_store_on_cpu(int cpu)
+ 	wrmsr_on_cpu(cpu, MSR_IA32_DS_AREA, 0, 0);
+ }
+ 
++static void *dsalloc(size_t size, gfp_t flags, int node)
++{
++#ifdef CONFIG_KAISER
++	unsigned int order = get_order(size);
++	struct page *page;
++	unsigned long addr;
++
++	page = alloc_pages_node(node, flags | __GFP_ZERO, order);
++	if (!page)
++		return NULL;
++	addr = (unsigned long)page_address(page);
++	if (kaiser_add_mapping(addr, size, __PAGE_KERNEL) < 0) {
++		__free_pages(page, order);
++		addr = 0;
++	}
++	return (void *)addr;
++#else
++	return kmalloc_node(size, flags | __GFP_ZERO, node);
++#endif
++}
++
++static void dsfree(const void *buffer, size_t size)
++{
++#ifdef CONFIG_KAISER
++	if (!buffer)
++		return;
++	kaiser_remove_mapping((unsigned long)buffer, size);
++	free_pages((unsigned long)buffer, get_order(size));
++#else
++	kfree(buffer);
++#endif
++}
++
+ static int alloc_pebs_buffer(int cpu)
+ {
+ 	struct debug_store *ds = per_cpu(cpu_hw_events, cpu).ds;
+@@ -70,7 +107,7 @@ static int alloc_pebs_buffer(int cpu)
+ 	if (!x86_pmu.pebs)
+ 		return 0;
+ 
+-	buffer = kmalloc_node(PEBS_BUFFER_SIZE, GFP_KERNEL | __GFP_ZERO, node);
++	buffer = dsalloc(PEBS_BUFFER_SIZE, GFP_KERNEL, node);
+ 	if (unlikely(!buffer))
+ 		return -ENOMEM;
+ 
+@@ -94,7 +131,7 @@ static void release_pebs_buffer(int cpu)
+ 	if (!ds || !x86_pmu.pebs)
+ 		return;
+ 
+-	kfree((void *)(unsigned long)ds->pebs_buffer_base);
++	dsfree((void *)(unsigned long)ds->pebs_buffer_base, PEBS_BUFFER_SIZE);
+ 	ds->pebs_buffer_base = 0;
+ }
+ 
+@@ -108,7 +145,7 @@ static int alloc_bts_buffer(int cpu)
+ 	if (!x86_pmu.bts)
+ 		return 0;
+ 
+-	buffer = kmalloc_node(BTS_BUFFER_SIZE, GFP_KERNEL | __GFP_ZERO, node);
++	buffer = dsalloc(BTS_BUFFER_SIZE, GFP_KERNEL, node);
+ 	if (unlikely(!buffer))
+ 		return -ENOMEM;
+ 
+@@ -132,19 +169,15 @@ static void release_bts_buffer(int cpu)
+ 	if (!ds || !x86_pmu.bts)
+ 		return;
+ 
+-	kfree((void *)(unsigned long)ds->bts_buffer_base);
++	dsfree((void *)(unsigned long)ds->bts_buffer_base, BTS_BUFFER_SIZE);
+ 	ds->bts_buffer_base = 0;
+ }
+ 
+ static int alloc_ds_buffer(int cpu)
+ {
+-	int node = cpu_to_node(cpu);
+-	struct debug_store *ds;
+-
+-	ds = kmalloc_node(sizeof(*ds), GFP_KERNEL | __GFP_ZERO, node);
+-	if (unlikely(!ds))
+-		return -ENOMEM;
++	struct debug_store *ds = per_cpu_ptr(&cpu_debug_store, cpu);
+ 
++	memset(ds, 0, sizeof(*ds));
+ 	per_cpu(cpu_hw_events, cpu).ds = ds;
+ 
+ 	return 0;
+@@ -158,7 +191,6 @@ static void release_ds_buffer(int cpu)
+ 		return;
+ 
+ 	per_cpu(cpu_hw_events, cpu).ds = NULL;
+-	kfree(ds);
+ }
+ 
+ void release_ds_buffers(void)
+--- a/arch/x86/kernel/entry_64.S
++++ b/arch/x86/kernel/entry_64.S
+@@ -56,6 +56,7 @@
+ #include <asm/ftrace.h>
+ #include <asm/percpu.h>
+ #include <asm/pgtable_types.h>
++#include <asm/kaiser.h>
+ 
+ /* Avoid __ASSEMBLER__'ifying <linux/audit.h> just for this.  */
+ #include <linux/elf-em.h>
+@@ -323,6 +324,7 @@ ENDPROC(native_usergs_sysret64)
+ 	testl $3, CS(%rdi)
+ 	je 1f
+ 	SWAPGS
++	SWITCH_KERNEL_CR3
+ 	/*
+ 	 * irq_count is used to check if a CPU is already on an interrupt stack
+ 	 * or not. While this is essentially redundant with preempt_count it is
+@@ -362,6 +364,12 @@ END(save_rest)
+ 
+ /* save complete stack frame */
+ 	.pushsection .kprobes.text, "ax"
++/*
++ * Return: ebx=0: needs swapgs but not SWITCH_USER_CR3 in paranoid_exit
++ *         ebx=1: needs neither swapgs nor SWITCH_USER_CR3 in paranoid_exit
++ *         ebx=2: needs both swapgs and SWITCH_USER_CR3 in paranoid_exit
++ *         ebx=3: needs SWITCH_USER_CR3 but not swapgs in paranoid_exit
++ */
+ ENTRY(save_paranoid)
+ 	XCPT_FRAME 1 RDI+8
+ 	cld
+@@ -387,7 +395,25 @@ ENTRY(save_paranoid)
+ 	js 1f	/* negative -> in kernel */
+ 	SWAPGS
+ 	xorl %ebx,%ebx
+-1:	ret
++1:
++#ifdef CONFIG_KAISER
++	/*
++	 * We might have come in between a swapgs and a SWITCH_KERNEL_CR3
++	 * on entry, or between a SWITCH_USER_CR3 and a swapgs on exit.
++	 * Do a conditional SWITCH_KERNEL_CR3: this could safely be done
++	 * unconditionally, but we need to find out whether the reverse
++	 * should be done on return (conveyed to paranoid_exit in %ebx).
++	 */
++	movq	%cr3, %rax
++	testl	$KAISER_SHADOW_PGD_OFFSET, %eax
++	jz	2f
++	orl	$2, %ebx
++	andq	$(~(X86_CR3_PCID_ASID_MASK | KAISER_SHADOW_PGD_OFFSET)), %rax
++	orq	x86_cr3_pcid_noflush, %rax
++	movq	%rax, %cr3
++2:
++#endif
++	ret
+ 	CFI_ENDPROC
+ END(save_paranoid)
+ 	.popsection
+@@ -464,6 +490,7 @@ ENTRY(system_call)
+ 	CFI_REGISTER	rip,rcx
+ 	/*CFI_REGISTER	rflags,r11*/
+ 	SWAPGS_UNSAFE_STACK
++	SWITCH_KERNEL_CR3_NO_STACK
+ 	/*
+ 	 * A hypervisor implementation might want to use a label
+ 	 * after the swapgs, so that it can do the swapgs
+@@ -515,6 +542,14 @@ sysret_check:
+ 	CFI_REGISTER	rip,rcx
+ 	RESTORE_ARGS 1,-ARG_SKIP,0
+ 	/*CFI_REGISTER	rflags,r11*/
++	/*
++	 * This opens a window where we have a user CR3, but are
++	 * running in the kernel.  This makes using the CS
++	 * register useless for telling whether or not we need to
++	 * switch CR3 in NMIs.  Normal interrupts are OK because
++	 * they are off here.
++	 */
++	SWITCH_USER_CR3
+ 	movq	PER_CPU_VAR(old_rsp), %rsp
+ 	USERGS_SYSRET64
+ 
+@@ -851,6 +886,14 @@ retint_swapgs:		/* return to user-space
+ 	 */
+ 	DISABLE_INTERRUPTS(CLBR_ANY)
+ 	TRACE_IRQS_IRETQ
++	/*
++	 * This opens a window where we have a user CR3, but are
++	 * running in the kernel.  This makes using the CS
++	 * register useless for telling whether or not we need to
++	 * switch CR3 in NMIs.  Normal interrupts are OK because
++	 * they are off here.
++	 */
++	SWITCH_USER_CR3
+ 	SWAPGS
+ 	jmp restore_args
+ 
+@@ -891,6 +934,7 @@ native_irq_return_ldt:
+ 	pushq_cfi %rax
+ 	pushq_cfi %rdi
+ 	SWAPGS
++	SWITCH_KERNEL_CR3
+ 	movq PER_CPU_VAR(espfix_waddr),%rdi
+ 	movq %rax,(0*8)(%rdi)	/* RAX */
+ 	movq (2*8)(%rsp),%rax	/* RIP */
+@@ -906,6 +950,7 @@ native_irq_return_ldt:
+ 	andl $0xffff0000,%eax
+ 	popq_cfi %rdi
+ 	orq PER_CPU_VAR(espfix_stack),%rax
++	SWITCH_USER_CR3
+ 	SWAPGS
+ 	movq %rax,%rsp
+ 	popq_cfi %rax
+@@ -1366,30 +1411,40 @@ paranoidzeroentry machine_check *machine
+ 	 * is fundamentally NMI-unsafe. (we cannot change the soft and
+ 	 * hard flags at once, atomically)
+ 	 */
+-
+-	/* ebx:	no swapgs flag */
++/*
++ * On entry: ebx=0: needs swapgs but not SWITCH_USER_CR3
++ *           ebx=1: needs neither swapgs nor SWITCH_USER_CR3
++ *           ebx=2: needs both swapgs and SWITCH_USER_CR3
++ *           ebx=3: needs SWITCH_USER_CR3 but not swapgs
++ */
+ ENTRY(paranoid_exit)
+ 	DEFAULT_FRAME
+ 	DISABLE_INTERRUPTS(CLBR_NONE)
+ 	TRACE_IRQS_OFF
+-	testl %ebx,%ebx				/* swapgs needed? */
+-	jnz paranoid_restore
+-	testl $3,CS(%rsp)
+-	jnz   paranoid_userspace
+-paranoid_swapgs:
++	movq	%rbx, %r12		/* paranoid_userspace uses %ebx */
++	testl	$3, CS(%rsp)
++	jnz	paranoid_userspace
++paranoid_kernel:
++	movq	%r12, %rbx		/* restore after paranoid_userspace */
+ 	TRACE_IRQS_IRETQ 0
++#ifdef CONFIG_KAISER
++	testl	$2, %ebx		/* SWITCH_USER_CR3 needed? */
++	jz	paranoid_exit_no_switch
++	SWITCH_USER_CR3
++paranoid_exit_no_switch:
++#endif
++	testl	$1, %ebx		/* swapgs needed? */
++	jnz	paranoid_exit_no_swapgs
+ 	SWAPGS_UNSAFE_STACK
++paranoid_exit_no_swapgs:
+ 	RESTORE_ALL 8
+-	jmp irq_return
+-paranoid_restore:
+-	TRACE_IRQS_IRETQ 0
+-	RESTORE_ALL 8
+-	jmp irq_return
++	jmp	irq_return
++
+ paranoid_userspace:
+ 	GET_THREAD_INFO(%rcx)
+ 	movl TI_flags(%rcx),%ebx
+ 	andl $_TIF_WORK_MASK,%ebx
+-	jz paranoid_swapgs
++	jz paranoid_kernel
+ 	movq %rsp,%rdi			/* &pt_regs */
+ 	call sync_regs
+ 	movq %rax,%rsp			/* switch stack for scheduling */
+@@ -1438,6 +1493,13 @@ ENTRY(error_entry)
+ 	movq_cfi r13, R13+8
+ 	movq_cfi r14, R14+8
+ 	movq_cfi r15, R15+8
++	/*
++	 * error_entry() always returns with a kernel gsbase and
++	 * CR3.  We must also have a kernel CR3/gsbase before
++	 * calling TRACE_IRQS_*.  Just unconditionally switch to
++	 * the kernel CR3 here.
++	 */
++	SWITCH_KERNEL_CR3
+ 	xorl %ebx,%ebx
+ 	testl $3,CS+8(%rsp)
+ 	je error_kernelspace
+@@ -1527,22 +1589,31 @@ ENTRY(nmi)
+ 	call do_nmi
+ #ifdef CONFIG_TRACE_IRQFLAGS
+ 	/* paranoidexit; without TRACE_IRQS_OFF */
+-	/* ebx:	no swapgs flag */
++	/* ebx:	no-swapgs and kaiser-switch-cr3 flag */
+ 	DISABLE_INTERRUPTS(CLBR_NONE)
+-	testl %ebx,%ebx				/* swapgs needed? */
+-	jnz nmi_restore
+-	testl $3,CS(%rsp)
+-	jnz nmi_userspace
+-nmi_swapgs:
++	movq	%rbx, %r12		/* nmi_userspace uses %ebx */
++	testl	$3, CS(%rsp)
++	jnz	nmi_userspace
++nmi_kernel:
++	movq	%r12, %rbx		/* restore after nmi_userspace */
++#ifdef CONFIG_KAISER
++	testl	$2, %ebx		/* SWITCH_USER_CR3 needed? */
++	jz	nmi_exit_no_switch
++	SWITCH_USER_CR3
++nmi_exit_no_switch:
++#endif
++	testl	$1, %ebx		/* swapgs needed? */
++	jnz	nmi_exit_no_swapgs
+ 	SWAPGS_UNSAFE_STACK
+-nmi_restore:
++nmi_exit_no_swapgs:
+ 	RESTORE_ALL 8
+-	jmp irq_return
++	jmp	irq_return
++
+ nmi_userspace:
+ 	GET_THREAD_INFO(%rcx)
+ 	movl TI_flags(%rcx),%ebx
+ 	andl $_TIF_WORK_MASK,%ebx
+-	jz nmi_swapgs
++	jz nmi_kernel
+ 	movq %rsp,%rdi			/* &pt_regs */
+ 	call sync_regs
+ 	movq %rax,%rsp			/* switch stack for scheduling */
+--- a/arch/x86/kernel/espfix_64.c
++++ b/arch/x86/kernel/espfix_64.c
+@@ -41,6 +41,7 @@
+ #include <asm/pgalloc.h>
+ #include <asm/setup.h>
+ #include <asm/espfix.h>
++#include <asm/kaiser.h>
+ 
+ /*
+  * Note: we only need 6*8 = 48 bytes for the espfix stack, but round
+@@ -129,6 +130,14 @@ void __init init_espfix_bsp(void)
+ 	/* Install the espfix pud into the kernel page directory */
+ 	pgd_p = &init_level4_pgt[pgd_index(ESPFIX_BASE_ADDR)];
+ 	pgd_populate(&init_mm, pgd_p, (pud_t *)espfix_pud_page);
++	/*
++	 * Just copy the top-level PGD that is mapping the espfix
++	 * area to ensure it is mapped into the shadow user page
++	 * tables.
++	 */
++	if (IS_ENABLED(CONFIG_KAISER))
++		set_pgd(native_get_shadow_pgd(pgd_p),
++			__pgd(_KERNPG_TABLE | __pa((pud_t *)espfix_pud_page)));
+ 
+ 	/* Randomize the locations */
+ 	init_espfix_random();
+--- a/arch/x86/kernel/head_64.S
++++ b/arch/x86/kernel/head_64.S
+@@ -338,6 +338,27 @@ early_idt_ripmsg:
+ 	.balign	PAGE_SIZE; \
+ ENTRY(name)
+ 
++#ifdef CONFIG_KAISER
++/*
++ * Each PGD needs to be 8k long and 8k aligned.  We do not
++ * ever go out to userspace with these, so we do not
++ * strictly *need* the second page, but this allows us to
++ * have a single set_pgd() implementation that does not
++ * need to worry about whether it has 4k or 8k to work
++ * with.
++ *
++ * This ensures PGDs are 8k long:
++ */
++#define KAISER_USER_PGD_FILL	512
++/* This ensures they are 8k-aligned: */
++#define NEXT_PGD_PAGE(name) \
++	.balign 2 * PAGE_SIZE; \
++GLOBAL(name)
++#else
++#define NEXT_PGD_PAGE(name) NEXT_PAGE(name)
++#define KAISER_USER_PGD_FILL	0
++#endif
++
+ /* Automate the creation of 1 to 1 mapping pmd entries */
+ #define PMDS(START, PERM, COUNT)			\
+ 	i = 0 ;						\
+@@ -353,13 +374,14 @@ ENTRY(name)
+ 	 * 0xffffffff80000000 to physical address 0x000000. (always using
+ 	 * 2Mbyte large pages provided by PAE mode)
+ 	 */
+-NEXT_PAGE(init_level4_pgt)
++NEXT_PGD_PAGE(init_level4_pgt)
+ 	.quad	level3_ident_pgt - __START_KERNEL_map + _KERNPG_TABLE
+ 	.org	init_level4_pgt + L4_PAGE_OFFSET*8, 0
+ 	.quad	level3_ident_pgt - __START_KERNEL_map + _KERNPG_TABLE
+ 	.org	init_level4_pgt + L4_START_KERNEL*8, 0
+ 	/* (2^48-(2*1024*1024*1024))/(2^39) = 511 */
+ 	.quad	level3_kernel_pgt - __START_KERNEL_map + _PAGE_TABLE
++	.fill	KAISER_USER_PGD_FILL,8,0
+ 
+ NEXT_PAGE(level3_ident_pgt)
+ 	.quad	level2_ident_pgt - __START_KERNEL_map + _KERNPG_TABLE
+@@ -385,6 +407,7 @@ NEXT_PAGE(level2_ident_pgt)
+ 	 * Don't set NX because code runs from these pages.
+ 	 */
+ 	PMDS(0, __PAGE_KERNEL_IDENT_LARGE_EXEC, PTRS_PER_PMD)
++	.fill	KAISER_USER_PGD_FILL,8,0
+ 
+ NEXT_PAGE(level2_kernel_pgt)
+ 	/*
+--- a/arch/x86/kernel/init_task.c
++++ b/arch/x86/kernel/init_task.c
+@@ -38,5 +38,5 @@ EXPORT_SYMBOL(init_task);
+  * section. Since TSS's are completely CPU-local, we want them
+  * on exact cacheline boundaries, to eliminate cacheline ping-pong.
+  */
+-DEFINE_PER_CPU_SHARED_ALIGNED(struct tss_struct, init_tss) = INIT_TSS;
++DEFINE_PER_CPU_SHARED_ALIGNED_USER_MAPPED(struct tss_struct, init_tss) = INIT_TSS;
+ 
+--- a/arch/x86/kernel/irqinit.c
++++ b/arch/x86/kernel/irqinit.c
+@@ -85,7 +85,7 @@ static struct irqaction irq2 = {
+ 	.flags = IRQF_NO_THREAD,
+ };
+ 
+-DEFINE_PER_CPU(vector_irq_t, vector_irq) = {
++DEFINE_PER_CPU_USER_MAPPED(vector_irq_t, vector_irq) = {
+ 	[0 ... NR_VECTORS - 1] = -1,
+ };
+ 
+--- a/arch/x86/kernel/ldt.c
++++ b/arch/x86/kernel/ldt.c
+@@ -15,6 +15,7 @@
+ #include <linux/slab.h>
+ #include <linux/vmalloc.h>
+ #include <linux/uaccess.h>
++#include <linux/kaiser.h>
+ 
+ #include <asm/system.h>
+ #include <asm/ldt.h>
+@@ -34,11 +35,21 @@ static void flush_ldt(void *current_mm)
+ 	set_ldt(pc->ldt->entries, pc->ldt->size);
+ }
+ 
++static void __free_ldt_struct(struct ldt_struct *ldt)
++{
++	if (ldt->size * LDT_ENTRY_SIZE > PAGE_SIZE)
++		vfree(ldt->entries);
++	else
++		free_page((unsigned long)ldt->entries);
++	kfree(ldt);
++}
++
+ /* The caller must call finalize_ldt_struct on the result. LDT starts zeroed. */
+ static struct ldt_struct *alloc_ldt_struct(int size)
+ {
+ 	struct ldt_struct *new_ldt;
+ 	int alloc_size;
++	int ret;
+ 
+ 	if (size > LDT_ENTRIES)
+ 		return NULL;
+@@ -66,7 +77,13 @@ static struct ldt_struct *alloc_ldt_stru
+ 		return NULL;
+ 	}
+ 
++	ret = kaiser_add_mapping((unsigned long)new_ldt->entries, alloc_size,
++				 __PAGE_KERNEL);
+ 	new_ldt->size = size;
++	if (ret) {
++		__free_ldt_struct(new_ldt);
++		return NULL;
++	}
+ 	return new_ldt;
+ }
+ 
+@@ -97,12 +114,10 @@ static void free_ldt_struct(struct ldt_s
+ 	if (likely(!ldt))
+ 		return;
+ 
++	kaiser_remove_mapping((unsigned long)ldt->entries,
++			      ldt->size * LDT_ENTRY_SIZE);
+ 	paravirt_free_ldt(ldt->entries, ldt->size);
+-	if (ldt->size * LDT_ENTRY_SIZE > PAGE_SIZE)
+-		vfree(ldt->entries);
+-	else
+-		kfree(ldt->entries);
+-	kfree(ldt);
++	__free_ldt_struct(ldt);
+ }
+ 
+ /*
+--- a/arch/x86/kernel/process_64.c
++++ b/arch/x86/kernel/process_64.c
+@@ -57,7 +57,7 @@
+ 
+ asmlinkage extern void ret_from_fork(void);
+ 
+-DEFINE_PER_CPU(unsigned long, old_rsp);
++DEFINE_PER_CPU_USER_MAPPED(unsigned long, old_rsp);
+ static DEFINE_PER_CPU(unsigned char, is_idle);
+ 
+ static ATOMIC_NOTIFIER_HEAD(idle_notifier);
+--- a/arch/x86/mm/Makefile
++++ b/arch/x86/mm/Makefile
+@@ -29,3 +29,4 @@ obj-$(CONFIG_NUMA_EMU)		+= numa_emulatio
+ obj-$(CONFIG_HAVE_MEMBLOCK)		+= memblock.o
+ 
+ obj-$(CONFIG_MEMTEST)		+= memtest.o
++obj-$(CONFIG_KAISER)		+= kaiser.o
+--- /dev/null
++++ b/arch/x86/mm/kaiser.c
+@@ -0,0 +1,382 @@
++#include <linux/bug.h>
++#include <linux/kernel.h>
++#include <linux/errno.h>
++#include <linux/string.h>
++#include <linux/types.h>
++#include <linux/bug.h>
++#include <linux/init.h>
++#include <linux/interrupt.h>
++#include <linux/spinlock.h>
++#include <linux/mm.h>
++#include <linux/module.h>
++#include <linux/uaccess.h>
++#include <linux/ftrace.h>
++
++extern struct mm_struct init_mm;
++
++#include <asm/kaiser.h>
++#include <asm/tlbflush.h>	/* to verify its kaiser declarations */
++#include <asm/pgtable.h>
++#include <asm/pgalloc.h>
++#include <asm/desc.h>
++
++#ifdef CONFIG_KAISER
++DEFINE_PER_CPU_USER_MAPPED(unsigned long, unsafe_stack_register_backup);
++
++/*
++ * These can have bit 63 set, so we can not just use a plain "or"
++ * instruction to get their value or'd into CR3.  It would take
++ * another register.  So, we use a memory reference to these instead.
++ *
++ * This is also handy because systems that do not support PCIDs
++ * just end up or'ing a 0 into their CR3, which does no harm.
++ */
++unsigned long x86_cr3_pcid_noflush __read_mostly;
++DEFINE_PER_CPU(unsigned long, x86_cr3_pcid_user);
++
++/*
++ * At runtime, the only things we map are some things for CPU
++ * hotplug, and stacks for new processes.  No two CPUs will ever
++ * be populating the same addresses, so we only need to ensure
++ * that we protect between two CPUs trying to allocate and
++ * populate the same page table page.
++ *
++ * Only take this lock when doing a set_p[4um]d(), but it is not
++ * needed for doing a set_pte().  We assume that only the *owner*
++ * of a given allocation will be doing this for _their_
++ * allocation.
++ *
++ * This ensures that once a system has been running for a while
++ * and there have been stacks all over and these page tables
++ * are fully populated, there will be no further acquisitions of
++ * this lock.
++ */
++static DEFINE_SPINLOCK(shadow_table_allocation_lock);
++
++/*
++ * Returns -1 on error.
++ */
++static inline unsigned long get_pa_from_mapping(unsigned long vaddr)
++{
++	pgd_t *pgd;
++	pud_t *pud;
++	pmd_t *pmd;
++	pte_t *pte;
++
++	pgd = pgd_offset_k(vaddr);
++	/*
++	 * We made all the kernel PGDs present in kaiser_init().
++	 * We expect them to stay that way.
++	 */
++	BUG_ON(pgd_none(*pgd));
++	/*
++	 * PGDs are either 512GB or 128TB on all x86_64
++	 * configurations.  We don't handle these.
++	 */
++	BUG_ON(pgd_large(*pgd));
++
++	pud = pud_offset(pgd, vaddr);
++	if (pud_none(*pud)) {
++		WARN_ON_ONCE(1);
++		return -1;
++	}
++
++	if (pud_large(*pud))
++		return (pud_pfn(*pud) << PAGE_SHIFT) | (vaddr & ~PUD_PAGE_MASK);
++
++	pmd = pmd_offset(pud, vaddr);
++	if (pmd_none(*pmd)) {
++		WARN_ON_ONCE(1);
++		return -1;
++	}
++
++	if (pmd_large(*pmd))
++		return (pmd_pfn(*pmd) << PAGE_SHIFT) | (vaddr & ~PMD_PAGE_MASK);
++
++	pte = pte_offset_kernel(pmd, vaddr);
++	if (pte_none(*pte)) {
++		WARN_ON_ONCE(1);
++		return -1;
++	}
++
++	return (pte_pfn(*pte) << PAGE_SHIFT) | (vaddr & ~PAGE_MASK);
++}
++
++/*
++ * This is a relatively normal page table walk, except that it
++ * also tries to allocate page tables pages along the way.
++ *
++ * Returns a pointer to a PTE on success, or NULL on failure.
++ */
++static pte_t *kaiser_pagetable_walk(unsigned long address)
++{
++	pmd_t *pmd;
++	pud_t *pud;
++	pgd_t *pgd = native_get_shadow_pgd(pgd_offset_k(address));
++	gfp_t gfp = (GFP_KERNEL | __GFP_NOTRACK | __GFP_ZERO);
++
++	if (pgd_none(*pgd)) {
++		WARN_ONCE(1, "All shadow pgds should have been populated");
++		return NULL;
++	}
++	BUILD_BUG_ON(pgd_large(*pgd) != 0);
++
++	pud = pud_offset(pgd, address);
++	/* The shadow page tables do not use large mappings: */
++	if (pud_large(*pud)) {
++		WARN_ON(1);
++		return NULL;
++	}
++	if (pud_none(*pud)) {
++		unsigned long new_pmd_page = __get_free_page(gfp);
++		if (!new_pmd_page)
++			return NULL;
++		spin_lock(&shadow_table_allocation_lock);
++		if (pud_none(*pud)) {
++			set_pud(pud, __pud(_KERNPG_TABLE | __pa(new_pmd_page)));
++			__inc_zone_page_state(virt_to_page((void *)
++						new_pmd_page), NR_KAISERTABLE);
++		} else
++			free_page(new_pmd_page);
++		spin_unlock(&shadow_table_allocation_lock);
++	}
++
++	pmd = pmd_offset(pud, address);
++	/* The shadow page tables do not use large mappings: */
++	if (pmd_large(*pmd)) {
++		WARN_ON(1);
++		return NULL;
++	}
++	if (pmd_none(*pmd)) {
++		unsigned long new_pte_page = __get_free_page(gfp);
++		if (!new_pte_page)
++			return NULL;
++		spin_lock(&shadow_table_allocation_lock);
++		if (pmd_none(*pmd)) {
++			set_pmd(pmd, __pmd(_KERNPG_TABLE | __pa(new_pte_page)));
++			__inc_zone_page_state(virt_to_page((void *)
++						new_pte_page), NR_KAISERTABLE);
++		} else
++			free_page(new_pte_page);
++		spin_unlock(&shadow_table_allocation_lock);
++	}
++
++	return pte_offset_kernel(pmd, address);
++}
++
++int kaiser_add_user_map(const void *__start_addr, unsigned long size,
++			unsigned long flags)
++{
++	int ret = 0;
++	pte_t *pte;
++	unsigned long start_addr = (unsigned long )__start_addr;
++	unsigned long address = start_addr & PAGE_MASK;
++	unsigned long end_addr = PAGE_ALIGN(start_addr + size);
++	unsigned long target_address;
++
++	for (; address < end_addr; address += PAGE_SIZE) {
++		target_address = get_pa_from_mapping(address);
++		if (target_address == -1) {
++			ret = -EIO;
++			break;
++		}
++		pte = kaiser_pagetable_walk(address);
++		if (!pte) {
++			ret = -ENOMEM;
++			break;
++		}
++		if (pte_none(*pte)) {
++			set_pte(pte, __pte(flags | target_address));
++		} else {
++			pte_t tmp;
++			set_pte(&tmp, __pte(flags | target_address));
++			WARN_ON_ONCE(!pte_same(*pte, tmp));
++		}
++	}
++	return ret;
++}
++
++static int kaiser_add_user_map_ptrs(const void *start, const void *end, unsigned long flags)
++{
++	unsigned long size = end - start;
++
++	return kaiser_add_user_map(start, size, flags);
++}
++
++/*
++ * Ensure that the top level of the (shadow) page tables are
++ * entirely populated.  This ensures that all processes that get
++ * forked have the same entries.  This way, we do not have to
++ * ever go set up new entries in older processes.
++ *
++ * Note: we never free these, so there are no updates to them
++ * after this.
++ */
++static void __init kaiser_init_all_pgds(void)
++{
++	pgd_t *pgd;
++	int i = 0;
++
++	pgd = native_get_shadow_pgd(pgd_offset_k((unsigned long )0));
++	for (i = PTRS_PER_PGD / 2; i < PTRS_PER_PGD; i++) {
++		pgd_t new_pgd;
++		pud_t *pud = pud_alloc_one(&init_mm,
++					   PAGE_OFFSET + i * PGDIR_SIZE);
++		if (!pud) {
++			WARN_ON(1);
++			break;
++		}
++		inc_zone_page_state(virt_to_page(pud), NR_KAISERTABLE);
++		new_pgd = __pgd(_KERNPG_TABLE |__pa(pud));
++		/*
++		 * Make sure not to stomp on some other pgd entry.
++		 */
++		if (!pgd_none(pgd[i])) {
++			WARN_ON(1);
++			continue;
++		}
++		set_pgd(pgd + i, new_pgd);
++	}
++}
++
++#define kaiser_add_user_map_early(start, size, flags) do {	\
++	int __ret = kaiser_add_user_map(start, size, flags);	\
++	WARN_ON(__ret);						\
++} while (0)
++
++#define kaiser_add_user_map_ptrs_early(start, end, flags) do {		\
++	int __ret = kaiser_add_user_map_ptrs(start, end, flags);	\
++	WARN_ON(__ret);							\
++} while (0)
++
++/*
++ * If anything in here fails, we will likely die on one of the
++ * first kernel->user transitions and init will die.  But, we
++ * will have most of the kernel up by then and should be able to
++ * get a clean warning out of it.  If we BUG_ON() here, we run
++ * the risk of crashing before we have good console output.
++ */
++void __init kaiser_init(void)
++{
++	int cpu;
++
++	kaiser_init_all_pgds();
++
++	for_each_possible_cpu(cpu) {
++		void *percpu_vaddr = __per_cpu_user_mapped_start +
++				     per_cpu_offset(cpu);
++		unsigned long percpu_sz = __per_cpu_user_mapped_end -
++					  __per_cpu_user_mapped_start;
++		kaiser_add_user_map_early(percpu_vaddr, percpu_sz,
++					  __PAGE_KERNEL);
++	}
++
++	/*
++	 * Map the entry/exit text section, which is needed at
++	 * switches from user to and from kernel.
++	 */
++	kaiser_add_user_map_ptrs_early(__entry_text_start, __entry_text_end,
++				       __PAGE_KERNEL_RX);
++#ifdef CONFIG_FUNCTION_GRAPH_TRACER
++	kaiser_add_user_map_ptrs_early(__irqentry_text_start,
++				       __irqentry_text_end,
++				       __PAGE_KERNEL_RX);
++#endif
++	kaiser_add_user_map_early((void *)idt_descr.address,
++				  sizeof(gate_desc) * NR_VECTORS,
++				  __PAGE_KERNEL_RO);
++	kaiser_add_user_map_early(&x86_cr3_pcid_noflush,
++				  sizeof(x86_cr3_pcid_noflush),
++				  __PAGE_KERNEL);
++}
++
++/* Add a mapping to the shadow mapping, and synchronize the mappings */
++int kaiser_add_mapping(unsigned long addr, unsigned long size, unsigned long flags)
++{
++	return kaiser_add_user_map((const void *)addr, size, flags);
++}
++
++void kaiser_remove_mapping(unsigned long start, unsigned long size)
++{
++	unsigned long end = start + size;
++	unsigned long addr;
++	pte_t *pte;
++
++	for (addr = start; addr < end; addr += PAGE_SIZE) {
++		pte = kaiser_pagetable_walk(addr);
++		if (pte)
++			set_pte(pte, __pte(0));
++	}
++}
++
++/*
++ * Page table pages are page-aligned.  The lower half of the top
++ * level is used for userspace and the top half for the kernel.
++ * This returns true for user pages that need to get copied into
++ * both the user and kernel copies of the page tables, and false
++ * for kernel pages that should only be in the kernel copy.
++ */
++static inline bool is_userspace_pgd(pgd_t *pgdp)
++{
++	return ((unsigned long)pgdp % PAGE_SIZE) < (PAGE_SIZE / 2);
++}
++
++pgd_t kaiser_set_shadow_pgd(pgd_t *pgdp, pgd_t pgd)
++{
++	/*
++	 * Do we need to also populate the shadow pgd?  Check _PAGE_USER to
++	 * skip cases like kexec and EFI which make temporary low mappings.
++	 */
++	if (pgd.pgd & _PAGE_USER) {
++		if (is_userspace_pgd(pgdp)) {
++			native_get_shadow_pgd(pgdp)->pgd = pgd.pgd;
++			/*
++			 * Even if the entry is *mapping* userspace, ensure
++			 * that userspace can not use it.  This way, if we
++			 * get out to userspace running on the kernel CR3,
++			 * userspace will crash instead of running.
++			 */
++			pgd.pgd |= _PAGE_NX;
++		}
++	} else if (!pgd.pgd) {
++		/*
++		 * pgd_clear() cannot check _PAGE_USER, and is even used to
++		 * clear corrupted pgd entries: so just rely on cases like
++		 * kexec and EFI never to be using pgd_clear().
++		 */
++		if (!WARN_ON_ONCE((unsigned long)pgdp & PAGE_SIZE) &&
++		    is_userspace_pgd(pgdp))
++			native_get_shadow_pgd(pgdp)->pgd = pgd.pgd;
++	}
++	return pgd;
++}
++
++void kaiser_setup_pcid(void)
++{
++	unsigned long kern_cr3 = 0;
++	unsigned long user_cr3 = KAISER_SHADOW_PGD_OFFSET;
++
++	if (this_cpu_has(X86_FEATURE_PCID)) {
++		kern_cr3 |= X86_CR3_PCID_KERN_NOFLUSH;
++		user_cr3 |= X86_CR3_PCID_USER_NOFLUSH;
++	}
++	/*
++	 * These variables are used by the entry/exit
++	 * code to change PCID and pgd and TLB flushing.
++	 */
++	x86_cr3_pcid_noflush = kern_cr3;
++	this_cpu_write(x86_cr3_pcid_user, user_cr3);
++}
++
++/*
++ * Make a note that this cpu will need to flush USER tlb on return to user.
++ * Caller checks whether this_cpu_has(X86_FEATURE_PCID) before calling:
++ * if cpu does not, then the NOFLUSH bit will never have been set.
++ */
++void kaiser_flush_tlb_on_return_to_user(void)
++{
++	this_cpu_write(x86_cr3_pcid_user,
++			X86_CR3_PCID_USER_FLUSH | KAISER_SHADOW_PGD_OFFSET);
++}
++EXPORT_SYMBOL(kaiser_flush_tlb_on_return_to_user);
++#endif /* CONFIG_KAISER */
+--- a/arch/x86/mm/pgtable.c
++++ b/arch/x86/mm/pgtable.c
+@@ -5,7 +5,7 @@
+ #include <asm/tlb.h>
+ #include <asm/fixmap.h>
+ 
+-#define PGALLOC_GFP GFP_KERNEL | __GFP_NOTRACK | __GFP_REPEAT | __GFP_ZERO
++#define PGALLOC_GFP (GFP_KERNEL | __GFP_NOTRACK | __GFP_REPEAT | __GFP_ZERO)
+ 
+ #ifdef CONFIG_HIGHPTE
+ #define PGALLOC_USER_GFP __GFP_HIGHMEM
+@@ -253,12 +253,35 @@ static void pgd_prepopulate_pmd(struct m
+ 	}
+ }
+ 
++#ifdef CONFIG_KAISER
++/*
++ * Instead of one pgd, we acquire two pgds.  Being order-1, it is
++ * both 8k in size and 8k-aligned.  That lets us just flip bit 12
++ * in a pointer to swap between the two 4k halves.
++ */
++#define PGD_ALLOCATION_ORDER 1
++#else
++#define PGD_ALLOCATION_ORDER 0
++#endif
++
++static inline pgd_t *_pgd_alloc(void)
++{
++	/* No __GFP_REPEAT: to avoid page allocation stalls in order-1 case */
++	return (pgd_t *)__get_free_pages(PGALLOC_GFP & ~__GFP_REPEAT,
++					 PGD_ALLOCATION_ORDER);
++}
++
++static inline void _pgd_free(pgd_t *pgd)
++{
++	free_pages((unsigned long)pgd, PGD_ALLOCATION_ORDER);
++}
++
+ pgd_t *pgd_alloc(struct mm_struct *mm)
+ {
+ 	pgd_t *pgd;
+ 	pmd_t *pmds[PREALLOCATED_PMDS];
+ 
+-	pgd = (pgd_t *)__get_free_page(PGALLOC_GFP);
++	pgd = _pgd_alloc();
+ 
+ 	if (pgd == NULL)
+ 		goto out;
+@@ -288,7 +311,7 @@ pgd_t *pgd_alloc(struct mm_struct *mm)
+ out_free_pmds:
+ 	free_pmds(pmds);
+ out_free_pgd:
+-	free_page((unsigned long)pgd);
++	_pgd_free(pgd);
+ out:
+ 	return NULL;
+ }
+@@ -298,7 +321,7 @@ void pgd_free(struct mm_struct *mm, pgd_
+ 	pgd_mop_up_pmds(mm, pgd);
+ 	pgd_dtor(pgd);
+ 	paravirt_pgd_free(mm, pgd);
+-	free_page((unsigned long)pgd);
++	_pgd_free(pgd);
+ }
+ 
+ int ptep_set_access_flags(struct vm_area_struct *vma,
+--- a/arch/x86/mm/tlb.c
++++ b/arch/x86/mm/tlb.c
+@@ -12,10 +12,43 @@
+ #include <asm/cache.h>
+ #include <asm/apic.h>
+ #include <asm/uv/uv.h>
++#include <asm/kaiser.h>
+ 
+ DEFINE_PER_CPU_SHARED_ALIGNED(struct tlb_state, cpu_tlbstate)
+ 			= { &init_mm, 0, };
+ 
++static void load_new_mm_cr3(pgd_t *pgdir)
++{
++	unsigned long new_mm_cr3 = __pa(pgdir);
++
++#ifdef CONFIG_KAISER
++	if (this_cpu_has(X86_FEATURE_PCID)) {
++		/*
++		 * We reuse the same PCID for different tasks, so we must
++		 * flush all the entries for the PCID out when we change tasks.
++		 * Flush KERN below, flush USER when returning to userspace in
++		 * kaiser's SWITCH_USER_CR3 (_SWITCH_TO_USER_CR3) macro.
++		 *
++		 * invpcid_flush_single_context(X86_CR3_PCID_ASID_USER) could
++		 * do it here, but can only be used if X86_FEATURE_INVPCID is
++		 * available - and many machines support pcid without invpcid.
++		 *
++		 * The line below is a no-op: X86_CR3_PCID_KERN_FLUSH is now 0;
++		 * but keep that line in there in case something changes.
++		 */
++		new_mm_cr3 |= X86_CR3_PCID_KERN_FLUSH;
++		kaiser_flush_tlb_on_return_to_user();
++	}
++#endif /* CONFIG_KAISER */
++
++	/*
++	 * Caution: many callers of this function expect
++	 * that load_new_mm_cr3() is serializing and orders TLB
++	 * fills with respect to the mm_cpumask writes.
++	 */
++	write_cr3(new_mm_cr3);
++}
++
+ /*
+  *	TLB flushing, formerly SMP-only
+  *		c/o Linus Torvalds.
+@@ -65,7 +98,7 @@ void leave_mm(int cpu)
+ 		BUG();
+ 	cpumask_clear_cpu(cpu,
+ 			  mm_cpumask(percpu_read(cpu_tlbstate.active_mm)));
+-	load_cr3(swapper_pg_dir);
++	load_new_mm_cr3(swapper_pg_dir);
+ }
+ EXPORT_SYMBOL_GPL(leave_mm);
+ 
+@@ -113,11 +146,10 @@ void switch_mm_irqs_off(struct mm_struct
+ 		 * from next->pgd.  TLB fills are special and can happen
+ 		 * due to instruction fetches or for no reason at all,
+ 		 * and neither LOCK nor MFENCE orders them.
+-		 * Fortunately, load_cr3() is serializing and gives the
+-		 * ordering guarantee we need.
+-		 *
++		 * Fortunately, load_new_mm_cr3() is serializing
++		 * and gives the ordering guarantee we need.
+ 		 */
+-		load_cr3(next->pgd);
++		load_new_mm_cr3(next->pgd);
+ 
+ 		/* stop flush ipis for the previous mm */
+ 		cpumask_clear_cpu(cpu, mm_cpumask(prev));
+@@ -136,10 +168,10 @@ void switch_mm_irqs_off(struct mm_struct
+ 			 * tlb flush IPI delivery. We must reload CR3
+ 			 * to make sure to use no freed page tables.
+ 			 *
+-			 * As above, load_cr3() is serializing and orders TLB
+-			 * fills with respect to the mm_cpumask write.
++			 * As above, load_new_mm_cr3() is serializing and orders
++			 * TLB fills with respect to the mm_cpumask write.
+ 			 */
+-			load_cr3(next->pgd);
++			load_new_mm_cr3(next->pgd);
+ 			load_mm_ldt(next);
+ 		}
+ 	}
+--- a/include/asm-generic/vmlinux.lds.h
++++ b/include/asm-generic/vmlinux.lds.h
+@@ -692,7 +692,14 @@
+  */
+ #define PERCPU_INPUT(cacheline)						\
+ 	VMLINUX_SYMBOL(__per_cpu_start) = .;				\
++	VMLINUX_SYMBOL(__per_cpu_user_mapped_start) = .;		\
+ 	*(.data..percpu..first)						\
++	. = ALIGN(cacheline);						\
++	*(.data..percpu..user_mapped)					\
++	*(.data..percpu..user_mapped..shared_aligned)			\
++	. = ALIGN(PAGE_SIZE);						\
++	*(.data..percpu..user_mapped..page_aligned)			\
++	VMLINUX_SYMBOL(__per_cpu_user_mapped_end) = .;			\
+ 	. = ALIGN(PAGE_SIZE);						\
+ 	*(.data..percpu..page_aligned)					\
+ 	. = ALIGN(cacheline);						\
+--- /dev/null
++++ b/include/linux/kaiser.h
+@@ -0,0 +1,52 @@
++#ifndef _LINUX_KAISER_H
++#define _LINUX_KAISER_H
++
++#ifdef CONFIG_KAISER
++#include <asm/kaiser.h>
++
++static inline int kaiser_map_thread_stack(void *stack)
++{
++	/*
++	 * Map that page of kernel stack on which we enter from user context.
++	 */
++	return kaiser_add_mapping((unsigned long)stack +
++			THREAD_SIZE - PAGE_SIZE, PAGE_SIZE, __PAGE_KERNEL);
++}
++
++static inline void kaiser_unmap_thread_stack(void *stack)
++{
++	/*
++	 * Note: may be called even when kaiser_map_thread_stack() failed.
++	 */
++	kaiser_remove_mapping((unsigned long)stack +
++			THREAD_SIZE - PAGE_SIZE, PAGE_SIZE);
++}
++#else
++
++/*
++ * These stubs are used whenever CONFIG_KAISER is off, which
++ * includes architectures that support KAISER, but have it disabled.
++ */
++
++static inline void kaiser_init(void)
++{
++}
++static inline int kaiser_add_mapping(unsigned long addr,
++				     unsigned long size, unsigned long flags)
++{
++	return 0;
++}
++static inline void kaiser_remove_mapping(unsigned long start,
++					 unsigned long size)
++{
++}
++static inline int kaiser_map_thread_stack(void *stack)
++{
++	return 0;
++}
++static inline void kaiser_unmap_thread_stack(void *stack)
++{
++}
++
++#endif /* !CONFIG_KAISER */
++#endif /* _LINUX_KAISER_H */
+--- a/include/linux/mmzone.h
++++ b/include/linux/mmzone.h
+@@ -95,8 +95,9 @@ enum zone_stat_item {
+ 	NR_SLAB_RECLAIMABLE,
+ 	NR_SLAB_UNRECLAIMABLE,
+ 	NR_PAGETABLE,		/* used for pagetables */
+-	NR_KERNEL_STACK,
+ 	/* Second 128 byte cacheline */
++	NR_KERNEL_STACK,
++	NR_KAISERTABLE,
+ 	NR_UNSTABLE_NFS,	/* NFS unstable pages */
+ 	NR_BOUNCE,
+ 	NR_VMSCAN_WRITE,
+--- a/include/linux/percpu-defs.h
++++ b/include/linux/percpu-defs.h
+@@ -28,6 +28,12 @@
+ 	(void)__vpp_verify;						\
+ } while (0)
+ 
++#ifdef CONFIG_KAISER
++#define USER_MAPPED_SECTION "..user_mapped"
++#else
++#define USER_MAPPED_SECTION ""
++#endif
++
+ /*
+  * s390 and alpha modules require percpu variables to be defined as
+  * weak to force the compiler to generate GOT based external
+@@ -90,6 +96,12 @@
+ #define DEFINE_PER_CPU(type, name)					\
+ 	DEFINE_PER_CPU_SECTION(type, name, "")
+ 
++#define DECLARE_PER_CPU_USER_MAPPED(type, name)				\
++	DECLARE_PER_CPU_SECTION(type, name, USER_MAPPED_SECTION)
++
++#define DEFINE_PER_CPU_USER_MAPPED(type, name)				\
++	DEFINE_PER_CPU_SECTION(type, name, USER_MAPPED_SECTION)
++
+ /*
+  * Declaration/definition used for per-CPU variables that must come first in
+  * the set of variables.
+@@ -119,6 +131,14 @@
+ 	DEFINE_PER_CPU_SECTION(type, name, PER_CPU_SHARED_ALIGNED_SECTION) \
+ 	____cacheline_aligned_in_smp
+ 
++#define DECLARE_PER_CPU_SHARED_ALIGNED_USER_MAPPED(type, name)		\
++	DECLARE_PER_CPU_SECTION(type, name, USER_MAPPED_SECTION PER_CPU_SHARED_ALIGNED_SECTION) \
++	____cacheline_aligned_in_smp
++
++#define DEFINE_PER_CPU_SHARED_ALIGNED_USER_MAPPED(type, name)		\
++	DEFINE_PER_CPU_SECTION(type, name, USER_MAPPED_SECTION PER_CPU_SHARED_ALIGNED_SECTION) \
++	____cacheline_aligned_in_smp
++
+ #define DECLARE_PER_CPU_ALIGNED(type, name)				\
+ 	DECLARE_PER_CPU_SECTION(type, name, PER_CPU_ALIGNED_SECTION)	\
+ 	____cacheline_aligned
+@@ -137,11 +157,21 @@
+ #define DEFINE_PER_CPU_PAGE_ALIGNED(type, name)				\
+ 	DEFINE_PER_CPU_SECTION(type, name, "..page_aligned")		\
+ 	__aligned(PAGE_SIZE)
++/*
++ * Declaration/definition used for per-CPU variables that must be page aligned and need to be mapped in user mode.
++ */
++#define DECLARE_PER_CPU_PAGE_ALIGNED_USER_MAPPED(type, name)		\
++	DECLARE_PER_CPU_SECTION(type, name, USER_MAPPED_SECTION"..page_aligned") \
++	__aligned(PAGE_SIZE)
++
++#define DEFINE_PER_CPU_PAGE_ALIGNED_USER_MAPPED(type, name)		\
++	DEFINE_PER_CPU_SECTION(type, name, USER_MAPPED_SECTION"..page_aligned") \
++	__aligned(PAGE_SIZE)
+ 
+ /*
+  * Declaration/definition used for per-CPU variables that must be read mostly.
+  */
+-#define DECLARE_PER_CPU_READ_MOSTLY(type, name)			\
++#define DECLARE_PER_CPU_READ_MOSTLY(type, name)				\
+ 	DECLARE_PER_CPU_SECTION(type, name, "..readmostly")
+ 
+ #define DEFINE_PER_CPU_READ_MOSTLY(type, name)				\
+--- a/init/main.c
++++ b/init/main.c
+@@ -69,6 +69,7 @@
+ #include <linux/slab.h>
+ #include <linux/perf_event.h>
+ #include <linux/random.h>
++#include <linux/kaiser.h>
+ 
+ #include <asm/io.h>
+ #include <asm/bugs.h>
+@@ -463,6 +464,7 @@ static void __init mm_init(void)
+ 	percpu_init_late();
+ 	pgtable_cache_init();
+ 	vmalloc_init();
++	kaiser_init();
+ }
+ 
+ asmlinkage void __init start_kernel(void)
+--- a/kernel/fork.c
++++ b/kernel/fork.c
+@@ -55,6 +55,7 @@
+ #include <linux/tsacct_kern.h>
+ #include <linux/cn_proc.h>
+ #include <linux/freezer.h>
++#include <linux/kaiser.h>
+ #include <linux/delayacct.h>
+ #include <linux/taskstats_kern.h>
+ #include <linux/random.h>
+@@ -133,6 +134,7 @@ static struct thread_info *alloc_thread_
+ 
+ static inline void free_thread_info(struct thread_info *ti)
+ {
++	kaiser_unmap_thread_stack(ti);
+ 	free_pages((unsigned long)ti, THREAD_SIZE_ORDER);
+ }
+ #endif
+@@ -275,6 +277,10 @@ static struct task_struct *dup_task_stru
+ 
+ 	tsk->stack = ti;
+ 
++	err = kaiser_map_thread_stack(tsk->stack);
++	if (err)
++		goto out;
++
+ 	setup_thread_stack(tsk, orig);
+ 	clear_user_return_notifier(tsk);
+ 	clear_tsk_need_resched(tsk);
+--- a/mm/vmstat.c
++++ b/mm/vmstat.c
+@@ -699,6 +699,7 @@ const char * const vmstat_text[] = {
+ 	"nr_slab_unreclaimable",
+ 	"nr_page_table_pages",
+ 	"nr_kernel_stack",
++	"nr_overhead",
+ 	"nr_unstable",
+ 	"nr_bounce",
+ 	"nr_vmscan_write",
+--- a/security/Kconfig
++++ b/security/Kconfig
+@@ -105,6 +105,16 @@ config SECURITY
+ 
+ 	  If you are unsure how to answer this question, answer N.
+ 
++config KAISER
++	bool "Remove the kernel mapping in user mode"
++	default y
++	depends on X86_64 && SMP && !PARAVIRT
++	help
++	  This enforces a strict kernel and user space isolation, in order
++	  to close hardware side channels on kernel address information.
++
++	  If you are unsure how to answer this question, answer Y.
++
+ config SECURITYFS
+ 	bool "Enable the securityfs filesystem"
+ 	help
diff --git a/debian/patches/bugfix/all/kpti/kaiser-use-alternative-instead-of-x86_cr3_pcid_noflush.patch b/debian/patches/bugfix/all/kpti/kaiser-use-alternative-instead-of-x86_cr3_pcid_noflush.patch
new file mode 100644
index 0000000..2ae1936
--- /dev/null
+++ b/debian/patches/bugfix/all/kpti/kaiser-use-alternative-instead-of-x86_cr3_pcid_noflush.patch
@@ -0,0 +1,108 @@
+From: Hugh Dickins <hughd at google.com>
+Date: Tue, 3 Oct 2017 20:49:04 -0700
+Subject: kaiser: use ALTERNATIVE instead of x86_cr3_pcid_noflush
+
+Now that we're playing the ALTERNATIVE game, use that more efficient
+method, instead of user-mapping an extra page and reading an extra
+cacheline each time for x86_cr3_pcid_noflush.
+
+Neel has found that __stringify(bts $X86_CR3_PCID_NOFLUSH_BIT, %rax)
+is a working substitute for the "bts $63, %rax" in these ALTERNATIVEs;
+but the line with $63 in it looks clearer, so let's stick with that.
+
+Worried about what happens with an ALTERNATIVE between the jump and
+jump label in another ALTERNATIVE?  I was, but have checked the
+combinations in SWITCH_KERNEL_CR3_NO_STACK at entry_SYSCALL_64,
+and it does a good job.
+
+(cherry picked from Change-Id: I46d06167615aa8d628eed9972125ab2faca93f05)
+
+Signed-off-by: Hugh Dickins <hughd at google.com>
+Signed-off-by: Ben Hutchings <ben at decadent.org.uk>
+---
+ arch/x86/include/asm/kaiser.h |  6 +++---
+ arch/x86/kernel/entry_64.S    |  3 ++-
+ arch/x86/mm/kaiser.c          | 10 +---------
+ 3 files changed, 6 insertions(+), 13 deletions(-)
+
+--- a/arch/x86/include/asm/kaiser.h
++++ b/arch/x86/include/asm/kaiser.h
+@@ -25,7 +25,8 @@
+ .macro _SWITCH_TO_KERNEL_CR3 reg
+ movq %cr3, \reg
+ andq $(~(X86_CR3_PCID_ASID_MASK | KAISER_SHADOW_PGD_OFFSET)), \reg
+-orq  x86_cr3_pcid_noflush, \reg
++/* If PCID enabled, set X86_CR3_PCID_NOFLUSH_BIT */
++ALTERNATIVE "", "bts $63, \reg", X86_FEATURE_PCID
+ movq \reg, %cr3
+ .endm
+ 
+@@ -39,7 +40,7 @@ movq \reg, %cr3
+ movq %cr3, \reg
+ orq  PER_CPU_VAR(x86_cr3_pcid_user), \reg
+ js   9f
+-/* FLUSH this time, reset to NOFLUSH for next time (if PCID enabled) */
++/* If PCID enabled, FLUSH this time, reset to NOFLUSH for next time */
+ movb \regb, PER_CPU_VAR(x86_cr3_pcid_user+7)
+ 9:
+ movq \reg, %cr3
+@@ -90,7 +91,6 @@ movq PER_CPU_VAR(unsafe_stack_register_b
+ */
+ DECLARE_PER_CPU_USER_MAPPED(unsigned long, unsafe_stack_register_backup);
+ 
+-extern unsigned long x86_cr3_pcid_noflush;
+ DECLARE_PER_CPU(unsigned long, x86_cr3_pcid_user);
+ 
+ extern char __per_cpu_user_mapped_start[], __per_cpu_user_mapped_end[];
+--- a/arch/x86/kernel/entry_64.S
++++ b/arch/x86/kernel/entry_64.S
+@@ -411,7 +411,8 @@ ENTRY(save_paranoid)
+ 	jz	2f
+ 	orl	$2, %ebx
+ 	andq	$(~(X86_CR3_PCID_ASID_MASK | KAISER_SHADOW_PGD_OFFSET)), %rax
+-	orq	x86_cr3_pcid_noflush, %rax
++	/* If PCID enabled, set X86_CR3_PCID_NOFLUSH_BIT */
++	ALTERNATIVE "", "bts $63, %rax", X86_FEATURE_PCID
+ 	movq	%rax, %cr3
+ 2:
+ #endif
+--- a/arch/x86/mm/kaiser.c
++++ b/arch/x86/mm/kaiser.c
+@@ -35,7 +35,6 @@ DEFINE_PER_CPU_USER_MAPPED(unsigned long
+  * This is also handy because systems that do not support PCIDs
+  * just end up or'ing a 0 into their CR3, which does no harm.
+  */
+-unsigned long x86_cr3_pcid_noflush __read_mostly;
+ DEFINE_PER_CPU(unsigned long, x86_cr3_pcid_user);
+ 
+ /*
+@@ -346,9 +345,6 @@ void __init kaiser_init(void)
+ 				  __PAGE_KERNEL_VVAR);
+ 	kaiser_add_user_map_early((void *)VSYSCALL_START, PAGE_SIZE,
+ 				  vsyscall_pgprot);
+-	kaiser_add_user_map_early(&x86_cr3_pcid_noflush,
+-				  sizeof(x86_cr3_pcid_noflush),
+-				  __PAGE_KERNEL);
+ }
+ 
+ /* Add a mapping to the shadow mapping, and synchronize the mappings */
+@@ -420,18 +416,14 @@ pgd_t kaiser_set_shadow_pgd(pgd_t *pgdp,
+ 
+ void kaiser_setup_pcid(void)
+ {
+-	unsigned long kern_cr3 = 0;
+ 	unsigned long user_cr3 = KAISER_SHADOW_PGD_OFFSET;
+ 
+-	if (this_cpu_has(X86_FEATURE_PCID)) {
+-		kern_cr3 |= X86_CR3_PCID_KERN_NOFLUSH;
++	if (this_cpu_has(X86_FEATURE_PCID))
+ 		user_cr3 |= X86_CR3_PCID_USER_NOFLUSH;
+-	}
+ 	/*
+ 	 * These variables are used by the entry/exit
+ 	 * code to change PCID and pgd and TLB flushing.
+ 	 */
+-	x86_cr3_pcid_noflush = kern_cr3;
+ 	this_cpu_write(x86_cr3_pcid_user, user_cr3);
+ }
+ 
diff --git a/debian/patches/bugfix/all/kpti/kaiser-user_map-__kprobes_text-too.patch b/debian/patches/bugfix/all/kpti/kaiser-user_map-__kprobes_text-too.patch
new file mode 100644
index 0000000..3ba91bb
--- /dev/null
+++ b/debian/patches/bugfix/all/kpti/kaiser-user_map-__kprobes_text-too.patch
@@ -0,0 +1,26 @@
+From: Hugh Dickins <hughd at google.com>
+Date: Sun, 17 Dec 2017 19:29:01 -0800
+Subject: kaiser: user_map __kprobes_text too
+
+In 3.2 (and earlier, and up to 3.15) Kaiser needs to user_map the
+__kprobes_text as well as the __entry_text: entry_64.S places some
+vital functions there, so without this you very soon triple-fault.
+Many thanks to Jiri Kosina for pointing me in this direction.
+
+Signed-off-by: Hugh Dickins <hughd at google.com>
+Signed-off-by: Ben Hutchings <ben at decadent.org.uk>
+---
+ arch/x86/mm/kaiser.c | 2 ++
+ 1 file changed, 2 insertions(+)
+
+--- a/arch/x86/mm/kaiser.c
++++ b/arch/x86/mm/kaiser.c
+@@ -281,6 +281,8 @@ void __init kaiser_init(void)
+ 	 */
+ 	kaiser_add_user_map_ptrs_early(__entry_text_start, __entry_text_end,
+ 				       __PAGE_KERNEL_RX);
++	kaiser_add_user_map_ptrs_early(__kprobes_text_start, __kprobes_text_end,
++				       __PAGE_KERNEL_RX);
+ #ifdef CONFIG_FUNCTION_GRAPH_TRACER
+ 	kaiser_add_user_map_ptrs_early(__irqentry_text_start,
+ 				       __irqentry_text_end,
diff --git a/debian/patches/bugfix/all/kpti/kpti-rename-to-page_table_isolation.patch b/debian/patches/bugfix/all/kpti/kpti-rename-to-page_table_isolation.patch
new file mode 100644
index 0000000..23dde2e
--- /dev/null
+++ b/debian/patches/bugfix/all/kpti/kpti-rename-to-page_table_isolation.patch
@@ -0,0 +1,275 @@
+From: Kees Cook <keescook at chromium.org>
+Date: Thu, 4 Jan 2018 01:14:24 +0000
+Subject: KPTI: Rename to PAGE_TABLE_ISOLATION
+
+This renames CONFIG_KAISER to CONFIG_PAGE_TABLE_ISOLATION.
+
+Signed-off-by: Kees Cook <keescook at chromium.org>
+Signed-off-by: Greg Kroah-Hartman <gregkh at linuxfoundation.org>
+[bwh: Backported to 3.2]
+Signed-off-by: Ben Hutchings <ben at decadent.org.uk>
+---
+ arch/x86/boot/compressed/misc.h           |  2 +-
+ arch/x86/include/asm/cpufeature.h         |  2 +-
+ arch/x86/include/asm/kaiser.h             | 12 ++++++------
+ arch/x86/include/asm/pgtable.h            |  4 ++--
+ arch/x86/include/asm/pgtable_64.h         |  4 ++--
+ arch/x86/include/asm/pgtable_types.h      |  2 +-
+ arch/x86/include/asm/tlbflush.h           |  2 +-
+ arch/x86/kernel/cpu/perf_event_intel_ds.c |  4 ++--
+ arch/x86/kernel/entry_64.S                |  6 +++---
+ arch/x86/kernel/head_64.S                 |  2 +-
+ arch/x86/mm/Makefile                      |  2 +-
+ include/linux/kaiser.h                    |  6 +++---
+ include/linux/percpu-defs.h               |  2 +-
+ security/Kconfig                          |  2 +-
+ 14 files changed, 26 insertions(+), 26 deletions(-)
+
+--- a/arch/x86/boot/compressed/misc.h
++++ b/arch/x86/boot/compressed/misc.h
+@@ -7,7 +7,7 @@
+  * we just keep it from happening
+  */
+ #undef CONFIG_PARAVIRT
+-#undef CONFIG_KAISER
++#undef CONFIG_PAGE_TABLE_ISOLATION
+ #ifdef CONFIG_X86_32
+ #define _ASM_X86_DESC_H 1
+ #endif
+--- a/arch/x86/include/asm/cpufeature.h
++++ b/arch/x86/include/asm/cpufeature.h
+@@ -180,7 +180,7 @@
+ #define X86_FEATURE_INVPCID_SINGLE (7*32+ 9) /* Effectively INVPCID && CR4.PCIDE=1 */
+ 
+ /* Because the ALTERNATIVE scheme is for members of the X86_FEATURE club... */
+-#define X86_FEATURE_KAISER	( 7*32+31) /* "" CONFIG_KAISER w/o nokaiser */
++#define X86_FEATURE_KAISER	( 7*32+31) /* "" CONFIG_PAGE_TABLE_ISOLATION w/o nokaiser */
+ 
+ /* Virtualization flags: Linux defined, word 8 */
+ #define X86_FEATURE_TPR_SHADOW  (8*32+ 0) /* Intel TPR Shadow */
+--- a/arch/x86/include/asm/kaiser.h
++++ b/arch/x86/include/asm/kaiser.h
+@@ -20,7 +20,7 @@
+ #define KAISER_SHADOW_PGD_OFFSET 0x1000
+ 
+ #ifdef __ASSEMBLY__
+-#ifdef CONFIG_KAISER
++#ifdef CONFIG_PAGE_TABLE_ISOLATION
+ 
+ .macro _SWITCH_TO_KERNEL_CR3 reg
+ movq %cr3, \reg
+@@ -69,7 +69,7 @@ movq PER_CPU_VAR(unsafe_stack_register_b
+ 8:
+ .endm
+ 
+-#else /* CONFIG_KAISER */
++#else /* CONFIG_PAGE_TABLE_ISOLATION */
+ 
+ .macro SWITCH_KERNEL_CR3
+ .endm
+@@ -78,11 +78,11 @@ movq PER_CPU_VAR(unsafe_stack_register_b
+ .macro SWITCH_KERNEL_CR3_NO_STACK
+ .endm
+ 
+-#endif /* CONFIG_KAISER */
++#endif /* CONFIG_PAGE_TABLE_ISOLATION */
+ 
+ #else /* __ASSEMBLY__ */
+ 
+-#ifdef CONFIG_KAISER
++#ifdef CONFIG_PAGE_TABLE_ISOLATION
+ /*
+  * Upon kernel/user mode switch, it may happen that the address
+  * space has to be switched before the registers have been
+@@ -100,10 +100,10 @@ extern void __init kaiser_check_boottime
+ #else
+ #define kaiser_enabled	0
+ static inline void __init kaiser_check_boottime_disable(void) {}
+-#endif /* CONFIG_KAISER */
++#endif /* CONFIG_PAGE_TABLE_ISOLATION */
+ 
+ /*
+- * Kaiser function prototypes are needed even when CONFIG_KAISER is not set,
++ * Kaiser function prototypes are needed even when CONFIG_PAGE_TABLE_ISOLATION is not set,
+  * so as to build with tests on kaiser_enabled instead of #ifdefs.
+  */
+ 
+--- a/arch/x86/include/asm/pgtable.h
++++ b/arch/x86/include/asm/pgtable.h
+@@ -17,7 +17,7 @@
+ #ifndef __ASSEMBLY__
+ 
+ #include <asm/x86_init.h>
+-#ifdef CONFIG_KAISER
++#ifdef CONFIG_PAGE_TABLE_ISOLATION
+ extern int kaiser_enabled;
+ #else
+ #define kaiser_enabled 0
+@@ -786,7 +786,7 @@ static inline void pmdp_set_wrprotect(st
+ static inline void clone_pgd_range(pgd_t *dst, pgd_t *src, int count)
+ {
+ 	memcpy(dst, src, count * sizeof(pgd_t));
+-#ifdef CONFIG_KAISER
++#ifdef CONFIG_PAGE_TABLE_ISOLATION
+ 	if (kaiser_enabled) {
+ 		/* Clone the shadow pgd part as well */
+ 		memcpy(native_get_shadow_pgd(dst),
+--- a/arch/x86/include/asm/pgtable_64.h
++++ b/arch/x86/include/asm/pgtable_64.h
+@@ -105,7 +105,7 @@ static inline void native_pud_clear(pud_
+ 	native_set_pud(pud, native_make_pud(0));
+ }
+ 
+-#ifdef CONFIG_KAISER
++#ifdef CONFIG_PAGE_TABLE_ISOLATION
+ extern pgd_t kaiser_set_shadow_pgd(pgd_t *pgdp, pgd_t pgd);
+ 
+ static inline pgd_t *native_get_shadow_pgd(pgd_t *pgdp)
+@@ -125,7 +125,7 @@ static inline pgd_t *native_get_shadow_p
+ {
+ 	return NULL;
+ }
+-#endif /* CONFIG_KAISER */
++#endif /* CONFIG_PAGE_TABLE_ISOLATION */
+ 
+ static inline void native_set_pgd(pgd_t *pgdp, pgd_t pgd)
+ {
+--- a/arch/x86/include/asm/pgtable_types.h
++++ b/arch/x86/include/asm/pgtable_types.h
+@@ -81,7 +81,7 @@
+ #define X86_CR3_PCID_MASK       (X86_CR3_PCID_NOFLUSH | X86_CR3_PCID_ASID_MASK)
+ #define X86_CR3_PCID_ASID_KERN  (_AC(0x0,UL))
+ 
+-#if defined(CONFIG_KAISER) && defined(CONFIG_X86_64)
++#if defined(CONFIG_PAGE_TABLE_ISOLATION) && defined(CONFIG_X86_64)
+ /* Let X86_CR3_PCID_ASID_USER be usable for the X86_CR3_PCID_NOFLUSH bit */
+ #define X86_CR3_PCID_ASID_USER	(_AC(0x80,UL))
+ 
+--- a/arch/x86/include/asm/tlbflush.h
++++ b/arch/x86/include/asm/tlbflush.h
+@@ -68,7 +68,7 @@ static inline void invpcid_flush_all_non
+  * Declare a couple of kaiser interfaces here for convenience,
+  * to avoid the need for asm/kaiser.h in unexpected places.
+  */
+-#ifdef CONFIG_KAISER
++#ifdef CONFIG_PAGE_TABLE_ISOLATION
+ extern int kaiser_enabled;
+ extern void kaiser_setup_pcid(void);
+ extern void kaiser_flush_tlb_on_return_to_user(void);
+--- a/arch/x86/kernel/cpu/perf_event_intel_ds.c
++++ b/arch/x86/kernel/cpu/perf_event_intel_ds.c
+@@ -66,7 +66,7 @@ void fini_debug_store_on_cpu(int cpu)
+ 
+ static void *dsalloc(size_t size, gfp_t flags, int node)
+ {
+-#ifdef CONFIG_KAISER
++#ifdef CONFIG_PAGE_TABLE_ISOLATION
+ 	unsigned int order = get_order(size);
+ 	struct page *page;
+ 	unsigned long addr;
+@@ -87,7 +87,7 @@ static void *dsalloc(size_t size, gfp_t
+ 
+ static void dsfree(const void *buffer, size_t size)
+ {
+-#ifdef CONFIG_KAISER
++#ifdef CONFIG_PAGE_TABLE_ISOLATION
+ 	if (!buffer)
+ 		return;
+ 	kaiser_remove_mapping((unsigned long)buffer, size);
+--- a/arch/x86/kernel/entry_64.S
++++ b/arch/x86/kernel/entry_64.S
+@@ -398,7 +398,7 @@ ENTRY(save_paranoid)
+ 	SWAPGS
+ 	xorl %ebx,%ebx
+ 1:
+-#ifdef CONFIG_KAISER
++#ifdef CONFIG_PAGE_TABLE_ISOLATION
+ 	/*
+ 	 * We might have come in between a swapgs and a SWITCH_KERNEL_CR3
+ 	 * on entry, or between a SWITCH_USER_CR3 and a swapgs on exit.
+@@ -1430,7 +1430,7 @@ ENTRY(paranoid_exit)
+ paranoid_kernel:
+ 	movq	%r12, %rbx		/* restore after paranoid_userspace */
+ 	TRACE_IRQS_IRETQ 0
+-#ifdef CONFIG_KAISER
++#ifdef CONFIG_PAGE_TABLE_ISOLATION
+ 	/* No ALTERNATIVE for X86_FEATURE_KAISER: save_paranoid sets %ebx */
+ 	testl	$2, %ebx		/* SWITCH_USER_CR3 needed? */
+ 	jz	paranoid_exit_no_switch
+@@ -1600,7 +1600,7 @@ ENTRY(nmi)
+ 	jnz	nmi_userspace
+ nmi_kernel:
+ 	movq	%r12, %rbx		/* restore after nmi_userspace */
+-#ifdef CONFIG_KAISER
++#ifdef CONFIG_PAGE_TABLE_ISOLATION
+ 	/* No ALTERNATIVE for X86_FEATURE_KAISER: save_paranoid sets %ebx */
+ 	testl	$2, %ebx		/* SWITCH_USER_CR3 needed? */
+ 	jz	nmi_exit_no_switch
+--- a/arch/x86/kernel/head_64.S
++++ b/arch/x86/kernel/head_64.S
+@@ -338,7 +338,7 @@ early_idt_ripmsg:
+ 	.balign	PAGE_SIZE; \
+ ENTRY(name)
+ 
+-#ifdef CONFIG_KAISER
++#ifdef CONFIG_PAGE_TABLE_ISOLATION
+ /*
+  * Each PGD needs to be 8k long and 8k aligned.  We do not
+  * ever go out to userspace with these, so we do not
+--- a/arch/x86/mm/Makefile
++++ b/arch/x86/mm/Makefile
+@@ -29,4 +29,4 @@ obj-$(CONFIG_NUMA_EMU)		+= numa_emulatio
+ obj-$(CONFIG_HAVE_MEMBLOCK)		+= memblock.o
+ 
+ obj-$(CONFIG_MEMTEST)		+= memtest.o
+-obj-$(CONFIG_KAISER)		+= kaiser.o
++obj-$(CONFIG_PAGE_TABLE_ISOLATION)		+= kaiser.o
+--- a/include/linux/kaiser.h
++++ b/include/linux/kaiser.h
+@@ -1,7 +1,7 @@
+ #ifndef _LINUX_KAISER_H
+ #define _LINUX_KAISER_H
+ 
+-#ifdef CONFIG_KAISER
++#ifdef CONFIG_PAGE_TABLE_ISOLATION
+ #include <asm/kaiser.h>
+ 
+ static inline int kaiser_map_thread_stack(void *stack)
+@@ -24,7 +24,7 @@ static inline void kaiser_unmap_thread_s
+ #else
+ 
+ /*
+- * These stubs are used whenever CONFIG_KAISER is off, which
++ * These stubs are used whenever CONFIG_PAGE_TABLE_ISOLATION is off, which
+  * includes architectures that support KAISER, but have it disabled.
+  */
+ 
+@@ -48,5 +48,5 @@ static inline void kaiser_unmap_thread_s
+ {
+ }
+ 
+-#endif /* !CONFIG_KAISER */
++#endif /* !CONFIG_PAGE_TABLE_ISOLATION */
+ #endif /* _LINUX_KAISER_H */
+--- a/include/linux/percpu-defs.h
++++ b/include/linux/percpu-defs.h
+@@ -28,7 +28,7 @@
+ 	(void)__vpp_verify;						\
+ } while (0)
+ 
+-#ifdef CONFIG_KAISER
++#ifdef CONFIG_PAGE_TABLE_ISOLATION
+ #define USER_MAPPED_SECTION "..user_mapped"
+ #else
+ #define USER_MAPPED_SECTION ""
+--- a/security/Kconfig
++++ b/security/Kconfig
+@@ -105,7 +105,7 @@ config SECURITY
+ 
+ 	  If you are unsure how to answer this question, answer N.
+ 
+-config KAISER
++config PAGE_TABLE_ISOLATION
+ 	bool "Remove the kernel mapping in user mode"
+ 	default y
+ 	depends on X86_64 && SMP
diff --git a/debian/patches/bugfix/all/kpti/kpti-report-when-enabled.patch b/debian/patches/bugfix/all/kpti/kpti-report-when-enabled.patch
new file mode 100644
index 0000000..31773ce
--- /dev/null
+++ b/debian/patches/bugfix/all/kpti/kpti-report-when-enabled.patch
@@ -0,0 +1,44 @@
+From: Kees Cook <keescook at chromium.org>
+Date: Wed, 3 Jan 2018 10:18:01 -0800
+Subject: KPTI: Report when enabled
+
+Make sure dmesg reports when KPTI is enabled.
+
+Signed-off-by: Kees Cook <keescook at chromium.org>
+Signed-off-by: Greg Kroah-Hartman <gregkh at linuxfoundation.org>
+[bwh: Backported to 3.2: adjust context]
+Signed-off-by: Ben Hutchings <ben at decadent.org.uk>
+---
+ arch/x86/mm/kaiser.c | 7 ++++++-
+ 1 file changed, 6 insertions(+), 1 deletion(-)
+
+--- a/arch/x86/mm/kaiser.c
++++ b/arch/x86/mm/kaiser.c
+@@ -13,6 +13,9 @@
+ #include <linux/ftrace.h>
+ #include <xen/xen.h>
+ 
++#undef pr_fmt
++#define pr_fmt(fmt)     "Kernel/User page tables isolation: " fmt
++
+ extern struct mm_struct init_mm;
+ 
+ #include <asm/kaiser.h>
+@@ -297,7 +300,7 @@ enable:
+ 	return;
+ 
+ disable:
+-	pr_info("Kernel/User page tables isolation: disabled\n");
++	pr_info("disabled\n");
+ 
+ silent_disable:
+ 	kaiser_enabled = 0;
+@@ -349,6 +352,8 @@ void __init kaiser_init(void)
+ 				  __PAGE_KERNEL_VVAR);
+ 	kaiser_add_user_map_early((void *)VSYSCALL_START, PAGE_SIZE,
+ 				  vsyscall_pgprot);
++
++	pr_info("enabled\n");
+ }
+ 
+ /* Add a mapping to the shadow mapping, and synchronize the mappings */
diff --git a/debian/patches/bugfix/all/kpti/mm-mmu_context-sched-core-fix-mmu_context.h-assumption.patch b/debian/patches/bugfix/all/kpti/mm-mmu_context-sched-core-fix-mmu_context.h-assumption.patch
new file mode 100644
index 0000000..7be8dec
--- /dev/null
+++ b/debian/patches/bugfix/all/kpti/mm-mmu_context-sched-core-fix-mmu_context.h-assumption.patch
@@ -0,0 +1,37 @@
+From: Ingo Molnar <mingo at kernel.org>
+Date: Thu, 28 Apr 2016 11:39:12 +0200
+Subject: mm/mmu_context, sched/core: Fix mmu_context.h assumption
+
+commit 8efd755ac2fe262d4c8d5c9bbe054bb67dae93da upstream.
+
+Some architectures (such as Alpha) rely on include/linux/sched.h definitions
+in their mmu_context.h files.
+
+So include sched.h before mmu_context.h.
+
+Cc: Andy Lutomirski <luto at kernel.org>
+Cc: Borislav Petkov <bp at alien8.de>
+Cc: Linus Torvalds <torvalds at linux-foundation.org>
+Cc: linux-kernel at vger.kernel.org
+Cc: Peter Zijlstra <peterz at infradead.org>
+Cc: Thomas Gleixner <tglx at linutronix.de>
+Signed-off-by: Ingo Molnar <mingo at kernel.org>
+Cc: Hugh Dickins <hughd at google.com>
+Signed-off-by: Ben Hutchings <ben at decadent.org.uk>
+---
+ mm/mmu_context.c | 2 +-
+ 1 file changed, 1 insertion(+), 1 deletion(-)
+
+--- a/mm/mmu_context.c
++++ b/mm/mmu_context.c
+@@ -4,9 +4,9 @@
+  */
+ 
+ #include <linux/mm.h>
++#include <linux/sched.h>
+ #include <linux/mmu_context.h>
+ #include <linux/export.h>
+-#include <linux/sched.h>
+ 
+ #include <asm/mmu_context.h>
+ 
diff --git a/debian/patches/bugfix/all/kpti/sched-core-add-switch_mm_irqs_off-and-use-it-in-the-scheduler.patch b/debian/patches/bugfix/all/kpti/sched-core-add-switch_mm_irqs_off-and-use-it-in-the-scheduler.patch
new file mode 100644
index 0000000..fc1a345
--- /dev/null
+++ b/debian/patches/bugfix/all/kpti/sched-core-add-switch_mm_irqs_off-and-use-it-in-the-scheduler.patch
@@ -0,0 +1,73 @@
+From: Andy Lutomirski <luto at kernel.org>
+Date: Tue, 26 Apr 2016 09:39:06 -0700
+Subject: sched/core: Add switch_mm_irqs_off() and use it in the scheduler
+
+commit f98db6013c557c216da5038d9c52045be55cd039 upstream.
+
+By default, this is the same thing as switch_mm().
+
+x86 will override it as an optimization.
+
+Signed-off-by: Andy Lutomirski <luto at kernel.org>
+Reviewed-by: Borislav Petkov <bp at suse.de>
+Cc: Borislav Petkov <bp at alien8.de>
+Cc: Linus Torvalds <torvalds at linux-foundation.org>
+Cc: Peter Zijlstra <peterz at infradead.org>
+Cc: Thomas Gleixner <tglx at linutronix.de>
+Link: http://lkml.kernel.org/r/df401df47bdd6be3e389c6f1e3f5310d70e81b2c.1461688545.git.luto@kernel.org
+Signed-off-by: Ingo Molnar <mingo at kernel.org>
+Signed-off-by: Hugh Dickins <hughd at google.com>
+Signed-off-by: Ben Hutchings <ben at decadent.org.uk>
+---
+ include/linux/mmu_context.h | 7 +++++++
+ kernel/sched.c              | 6 +++---
+ 2 files changed, 10 insertions(+), 3 deletions(-)
+
+--- a/include/linux/mmu_context.h
++++ b/include/linux/mmu_context.h
+@@ -1,9 +1,16 @@
+ #ifndef _LINUX_MMU_CONTEXT_H
+ #define _LINUX_MMU_CONTEXT_H
+ 
++#include <asm/mmu_context.h>
++
+ struct mm_struct;
+ 
+ void use_mm(struct mm_struct *mm);
+ void unuse_mm(struct mm_struct *mm);
+ 
++/* Architectures that care about IRQ state in switch_mm can override this. */
++#ifndef switch_mm_irqs_off
++# define switch_mm_irqs_off switch_mm
++#endif
++
+ #endif
+--- a/kernel/sched.c
++++ b/kernel/sched.c
+@@ -32,7 +32,7 @@
+ #include <linux/init.h>
+ #include <linux/uaccess.h>
+ #include <linux/highmem.h>
+-#include <asm/mmu_context.h>
++#include <linux/mmu_context.h>
+ #include <linux/interrupt.h>
+ #include <linux/capability.h>
+ #include <linux/completion.h>
+@@ -3331,7 +3331,7 @@ context_switch(struct rq *rq, struct tas
+ 		atomic_inc(&oldmm->mm_count);
+ 		enter_lazy_tlb(oldmm, next);
+ 	} else
+-		switch_mm(oldmm, mm, next);
++		switch_mm_irqs_off(oldmm, mm, next);
+ 
+ 	if (!prev->mm) {
+ 		prev->active_mm = NULL;
+@@ -6553,7 +6553,7 @@ void idle_task_exit(void)
+ 	BUG_ON(cpu_online(smp_processor_id()));
+ 
+ 	if (mm != &init_mm)
+-		switch_mm(mm, &init_mm, current);
++		switch_mm_irqs_off(mm, &init_mm, current);
+ 	mmdrop(mm);
+ }
+ 
diff --git a/debian/patches/bugfix/all/kpti/sched-core-idle_task_exit-shouldn-t-use-switch_mm_irqs_off.patch b/debian/patches/bugfix/all/kpti/sched-core-idle_task_exit-shouldn-t-use-switch_mm_irqs_off.patch
new file mode 100644
index 0000000..5564a7e
--- /dev/null
+++ b/debian/patches/bugfix/all/kpti/sched-core-idle_task_exit-shouldn-t-use-switch_mm_irqs_off.patch
@@ -0,0 +1,41 @@
+From: Andy Lutomirski <luto at kernel.org>
+Date: Fri, 9 Jun 2017 11:49:15 -0700
+Subject: sched/core: Idle_task_exit() shouldn't use switch_mm_irqs_off()
+
+commit 252d2a4117bc181b287eeddf848863788da733ae upstream.
+
+idle_task_exit() can be called with IRQs on x86 on and therefore
+should use switch_mm(), not switch_mm_irqs_off().
+
+This doesn't seem to cause any problems right now, but it will
+confuse my upcoming TLB flush changes.  Nonetheless, I think it
+should be backported because it's trivial.  There won't be any
+meaningful performance impact because idle_task_exit() is only
+used when offlining a CPU.
+
+Signed-off-by: Andy Lutomirski <luto at kernel.org>
+Cc: Borislav Petkov <bp at suse.de>
+Cc: Linus Torvalds <torvalds at linux-foundation.org>
+Cc: Peter Zijlstra <peterz at infradead.org>
+Cc: Thomas Gleixner <tglx at linutronix.de>
+Cc: stable at vger.kernel.org
+Fixes: f98db6013c55 ("sched/core: Add switch_mm_irqs_off() and use it in the scheduler")
+Link: http://lkml.kernel.org/r/ca3d1a9fa93a0b49f5a8ff729eda3640fb6abdf9.1497034141.git.luto@kernel.org
+Signed-off-by: Ingo Molnar <mingo at kernel.org>
+Signed-off-by: Hugh Dickins <hughd at google.com>
+Signed-off-by: Ben Hutchings <ben at decadent.org.uk>
+---
+ kernel/sched.c | 2 +-
+ 1 file changed, 1 insertion(+), 1 deletion(-)
+
+--- a/kernel/sched.c
++++ b/kernel/sched.c
+@@ -6553,7 +6553,7 @@ void idle_task_exit(void)
+ 	BUG_ON(cpu_online(smp_processor_id()));
+ 
+ 	if (mm != &init_mm)
+-		switch_mm_irqs_off(mm, &init_mm, current);
++		switch_mm(mm, &init_mm, current);
+ 	mmdrop(mm);
+ }
+ 
diff --git a/debian/patches/bugfix/all/kpti/x86-alternatives-add-instruction-padding.patch b/debian/patches/bugfix/all/kpti/x86-alternatives-add-instruction-padding.patch
new file mode 100644
index 0000000..ee31878
--- /dev/null
+++ b/debian/patches/bugfix/all/kpti/x86-alternatives-add-instruction-padding.patch
@@ -0,0 +1,348 @@
+From: Borislav Petkov <bp at suse.de>
+Date: Sat, 27 Dec 2014 10:41:52 +0100
+Subject: x86/alternatives: Add instruction padding
+
+commit 4332195c5615bf748624094ce4ff6797e475024d upstream.
+
+Up until now we have always paid attention to make sure the length of
+the new instruction replacing the old one is at least less or equal to
+the length of the old instruction. If the new instruction is longer, at
+the time it replaces the old instruction it will overwrite the beginning
+of the next instruction in the kernel image and cause your pants to
+catch fire.
+
+So instead of having to pay attention, teach the alternatives framework
+to pad shorter old instructions with NOPs at buildtime - but only in the
+case when
+
+  len(old instruction(s)) < len(new instruction(s))
+
+and add nothing in the >= case. (In that case we do add_nops() when
+patching).
+
+This way the alternatives user shouldn't have to care about instruction
+sizes and simply use the macros.
+
+Add asm ALTERNATIVE* flavor macros too, while at it.
+
+Also, we need to save the pad length in a separate struct alt_instr
+member for NOP optimization and the way to do that reliably is to carry
+the pad length instead of trying to detect whether we're looking at
+single-byte NOPs or at pathological instruction offsets like e9 90 90 90
+90, for example, which is a valid instruction.
+
+Thanks to Michael Matz for the great help with toolchain questions.
+
+Signed-off-by: Borislav Petkov <bp at suse.de>
+Signed-off-by: Hugh Dickins <hughd at google.com>
+Signed-off-by: Ben Hutchings <ben at decadent.org.uk>
+---
+ arch/x86/include/asm/alternative-asm.h | 43 +++++++++++++++++-
+ arch/x86/include/asm/alternative.h     | 80 ++++++++++++++++++++++++++--------
+ arch/x86/include/asm/cpufeature.h      |  4 +-
+ arch/x86/kernel/alternative.c          |  6 +--
+ arch/x86/kernel/entry_32.S             |  2 +-
+ arch/x86/lib/clear_page_64.S           |  4 +-
+ arch/x86/lib/copy_page_64.S            |  2 +-
+ arch/x86/lib/copy_user_64.S            |  4 +-
+ arch/x86/lib/memcpy_64.S               |  8 ++--
+ arch/x86/lib/memmove_64.S              |  2 +-
+ arch/x86/lib/memset_64.S               |  8 ++--
+ 11 files changed, 126 insertions(+), 37 deletions(-)
+
+--- a/arch/x86/include/asm/alternative-asm.h
++++ b/arch/x86/include/asm/alternative-asm.h
+@@ -15,12 +15,53 @@
+ 	.endm
+ #endif
+ 
+-.macro altinstruction_entry orig alt feature orig_len alt_len
++.macro altinstruction_entry orig alt feature orig_len alt_len pad_len
+ 	.long \orig - .
+ 	.long \alt - .
+ 	.word \feature
+ 	.byte \orig_len
+ 	.byte \alt_len
++	.byte \pad_len
++.endm
++
++.macro ALTERNATIVE oldinstr, newinstr, feature
++140:
++	\oldinstr
++141:
++	.skip -(((144f-143f)-(141b-140b)) > 0) * ((144f-143f)-(141b-140b)),0x90
++142:
++
++	.pushsection .altinstructions,"a"
++	altinstruction_entry 140b,143f,\feature,142b-140b,144f-143f,142b-141b
++	.popsection
++
++	.pushsection .altinstr_replacement,"ax"
++143:
++	\newinstr
++144:
++	.popsection
++.endm
++
++.macro ALTERNATIVE_2 oldinstr, newinstr1, feature1, newinstr2, feature2
++140:
++	\oldinstr
++141:
++	.skip -(((144f-143f)-(141b-140b)) > 0) * ((144f-143f)-(141b-140b)),0x90
++	.skip -(((145f-144f)-(144f-143f)-(141b-140b)) > 0) * ((145f-144f)-(144f-143f)-(141b-140b)),0x90
++142:
++
++	.pushsection .altinstructions,"a"
++	altinstruction_entry 140b,143f,\feature1,142b-140b,144f-143f,142b-141b
++	altinstruction_entry 140b,144f,\feature2,142b-140b,145f-144f,142b-141b
++	.popsection
++
++	.pushsection .altinstr_replacement,"ax"
++143:
++	\newinstr1
++144:
++	\newinstr2
++145:
++	.popsection
+ .endm
+ 
+ #endif  /*  __ASSEMBLY__  */
+--- a/arch/x86/include/asm/alternative.h
++++ b/arch/x86/include/asm/alternative.h
+@@ -47,8 +47,9 @@ struct alt_instr {
+ 	s32 repl_offset;	/* offset to replacement instruction */
+ 	u16 cpuid;		/* cpuid bit set for replacement */
+ 	u8  instrlen;		/* length of original instruction */
+-	u8  replacementlen;	/* length of new instruction, <= instrlen */
+-};
++	u8  replacementlen;	/* length of new instruction */
++	u8  padlen;		/* length of build-time padding */
++} __packed;
+ 
+ extern void alternative_instructions(void);
+ extern void apply_alternatives(struct alt_instr *start, struct alt_instr *end);
+@@ -75,23 +76,65 @@ static inline int alternatives_text_rese
+ }
+ #endif	/* CONFIG_SMP */
+ 
++#define b_replacement(num)	"664"#num
++#define e_replacement(num)	"665"#num
++
++#define alt_end_marker		"663"
++#define alt_slen		"662b-661b"
++#define alt_pad_len		alt_end_marker"b-662b"
++#define alt_total_slen		alt_end_marker"b-661b"
++#define alt_rlen(num)		e_replacement(num)"f-"b_replacement(num)"f"
++
++#define __OLDINSTR(oldinstr, num)					\
++	"661:\n\t" oldinstr "\n662:\n"					\
++	".skip -(((" alt_rlen(num) ")-(" alt_slen ")) > 0) * "		\
++		"((" alt_rlen(num) ")-(" alt_slen ")),0x90\n"
++
++#define OLDINSTR(oldinstr, num)						\
++	__OLDINSTR(oldinstr, num)					\
++	alt_end_marker ":\n"
++
++/*
++ * Pad the second replacement alternative with additional NOPs if it is
++ * additionally longer than the first replacement alternative.
++ */
++#define OLDINSTR_2(oldinstr, num1, num2)					\
++	__OLDINSTR(oldinstr, num1)						\
++	".skip -(((" alt_rlen(num2) ")-(" alt_rlen(num1) ")-(662b-661b)) > 0) * " \
++		"((" alt_rlen(num2) ")-(" alt_rlen(num1) ")-(662b-661b)),0x90\n"  \
++	alt_end_marker ":\n"
++
++#define ALTINSTR_ENTRY(feature, num)					      \
++	" .long 661b - .\n"				/* label           */ \
++	" .long " b_replacement(num)"f - .\n"		/* new instruction */ \
++	" .word " __stringify(feature) "\n"		/* feature bit     */ \
++	" .byte " alt_total_slen "\n"			/* source len      */ \
++	" .byte " alt_rlen(num) "\n"			/* replacement len */ \
++	" .byte " alt_pad_len "\n"			/* pad len */
++
++#define ALTINSTR_REPLACEMENT(newinstr, feature, num)	/* replacement */     \
++	b_replacement(num)":\n\t" newinstr "\n" e_replacement(num) ":\n\t"
++
+ /* alternative assembly primitive: */
+ #define ALTERNATIVE(oldinstr, newinstr, feature)			\
+-									\
+-      "661:\n\t" oldinstr "\n662:\n"					\
+-      ".section .altinstructions,\"a\"\n"				\
+-      "	 .long 661b - .\n"			/* label           */	\
+-      "	 .long 663f - .\n"			/* new instruction */	\
+-      "	 .word " __stringify(feature) "\n"	/* feature bit     */	\
+-      "	 .byte 662b-661b\n"			/* sourcelen       */	\
+-      "	 .byte 664f-663f\n"			/* replacementlen  */	\
+-      ".previous\n"							\
+      ".section .discard,\"aw\",@progbits\n"				\
+-      "	 .byte 0xff + (664f-663f) - (662b-661b)\n" /* rlen <= slen */	\
+-      ".previous\n"							\
+-      ".section .altinstr_replacement, \"ax\"\n"			\
+-      "663:\n\t" newinstr "\n664:\n"		/* replacement     */	\
+-      ".previous"
++	OLDINSTR(oldinstr, 1)						\
++	".pushsection .altinstructions,\"a\"\n"				\
++	ALTINSTR_ENTRY(feature, 1)					\
++	".popsection\n"							\
++	".pushsection .altinstr_replacement, \"ax\"\n"			\
++	ALTINSTR_REPLACEMENT(newinstr, feature, 1)			\
++	".popsection"
++
++#define ALTERNATIVE_2(oldinstr, newinstr1, feature1, newinstr2, feature2)\
++	OLDINSTR_2(oldinstr, 1, 2)					\
++	".pushsection .altinstructions,\"a\"\n"				\
++	ALTINSTR_ENTRY(feature1, 1)					\
++	ALTINSTR_ENTRY(feature2, 2)					\
++	".popsection\n"							\
++	".pushsection .altinstr_replacement, \"ax\"\n"			\
++	ALTINSTR_REPLACEMENT(newinstr1, feature1, 1)			\
++	ALTINSTR_REPLACEMENT(newinstr2, feature2, 2)			\
++	".popsection"
+ 
+ /*
+  * This must be included *after* the definition of ALTERNATIVE due to
+@@ -114,6 +157,9 @@ static inline int alternatives_text_rese
+ #define alternative(oldinstr, newinstr, feature)			\
+ 	asm volatile (ALTERNATIVE(oldinstr, newinstr, feature) : : : "memory")
+ 
++#define alternative_2(oldinstr, newinstr1, feature1, newinstr2, feature2) \
++	asm volatile(ALTERNATIVE_2(oldinstr, newinstr1, feature1, newinstr2, feature2) ::: "memory")
++
+ /*
+  * Alternative inline assembly with input.
+  *
+--- a/arch/x86/include/asm/cpufeature.h
++++ b/arch/x86/include/asm/cpufeature.h
+@@ -339,7 +339,7 @@ extern const char * const x86_power_flag
+  */
+ static __always_inline __pure bool __static_cpu_has(u16 bit)
+ {
+-#if __GNUC__ > 4 || __GNUC_MINOR__ >= 5
++#ifdef CC_HAVE_ASM_GOTO
+ 		asm_volatile_goto("1: jmp %l[t_no]\n"
+ 			 "2:\n"
+ 			 ".section .altinstructions,\"a\"\n"
+@@ -348,6 +348,7 @@ static __always_inline __pure bool __sta
+ 			 " .word %P0\n"		/* feature bit */
+ 			 " .byte 2b - 1b\n"	/* source len */
+ 			 " .byte 0\n"		/* replacement len */
++			 " .byte 0\n"		/* pad len */
+ 			 ".previous\n"
+ 			 /* skipping size check since replacement size = 0 */
+ 			 : : "i" (bit) : : t_no);
+@@ -365,6 +366,7 @@ static __always_inline __pure bool __sta
+ 			     " .word %P1\n"		/* feature bit */
+ 			     " .byte 2b - 1b\n"		/* source len */
+ 			     " .byte 4f - 3f\n"		/* replacement len */
++			     " .byte 0\n"		/* pad len */
+ 			     ".previous\n"
+			     ".section .discard,\"aw\",@progbits\n"
+ 			     " .byte 0xff + (4f-3f) - (2b-1b)\n" /* size check */
+--- a/arch/x86/kernel/alternative.c
++++ b/arch/x86/kernel/alternative.c
+@@ -281,7 +281,6 @@ void __init_or_module apply_alternatives
+ 	for (a = start; a < end; a++) {
+ 		instr = (u8 *)&a->instr_offset + a->instr_offset;
+ 		replacement = (u8 *)&a->repl_offset + a->repl_offset;
+-		BUG_ON(a->replacementlen > a->instrlen);
+ 		BUG_ON(a->instrlen > sizeof(insnbuf));
+ 		BUG_ON(a->cpuid >= NCAPINTS*32);
+ 		if (!boot_cpu_has(a->cpuid))
+@@ -301,8 +300,9 @@ void __init_or_module apply_alternatives
+ 			DPRINTK("Fix CALL offset: 0x%x", *(s32 *)(insnbuf + 1));
+ 		}
+ 
+-		add_nops(insnbuf + a->replacementlen,
+-			 a->instrlen - a->replacementlen);
++		if (a->instrlen > a->replacementlen)
++			add_nops(insnbuf + a->replacementlen,
++				 a->instrlen - a->replacementlen);
+ 
+ 		text_poke_early(instr, insnbuf, a->instrlen);
+ 	}
+--- a/arch/x86/kernel/entry_32.S
++++ b/arch/x86/kernel/entry_32.S
+@@ -887,7 +887,7 @@ ENTRY(simd_coprocessor_error)
+ 661:	pushl_cfi $do_general_protection
+ 662:
+ .section .altinstructions,"a"
+-	altinstruction_entry 661b, 663f, X86_FEATURE_XMM, 662b-661b, 664f-663f
++	altinstruction_entry 661b, 663f, X86_FEATURE_XMM, 662b-661b, 664f-663f, 0
+ .previous
+ .section .altinstr_replacement,"ax"
+ 663:	pushl $do_simd_coprocessor_error
+--- a/arch/x86/lib/clear_page_64.S
++++ b/arch/x86/lib/clear_page_64.S
+@@ -67,7 +67,7 @@ ENDPROC(clear_page)
+ 	.previous
+ 	.section .altinstructions,"a"
+ 	altinstruction_entry clear_page,1b,X86_FEATURE_REP_GOOD,\
+-			     .Lclear_page_end-clear_page, 2b-1b
++			     .Lclear_page_end-clear_page, 2b-1b, 0
+ 	altinstruction_entry clear_page,2b,X86_FEATURE_ERMS,   \
+-			     .Lclear_page_end-clear_page,3b-2b
++			     .Lclear_page_end-clear_page,3b-2b, 0
+ 	.previous
+--- a/arch/x86/lib/copy_page_64.S
++++ b/arch/x86/lib/copy_page_64.S
+@@ -112,5 +112,5 @@ ENDPROC(copy_page)
+ 	.previous
+ 	.section .altinstructions,"a"
+ 	altinstruction_entry copy_page, 1b, X86_FEATURE_REP_GOOD,	\
+-		.Lcopy_page_end-copy_page, 2b-1b
++		.Lcopy_page_end-copy_page, 2b-1b, 0
+ 	.previous
+--- a/arch/x86/lib/copy_user_64.S
++++ b/arch/x86/lib/copy_user_64.S
+@@ -37,8 +37,8 @@
+ 	.previous
+ 
+ 	.section .altinstructions,"a"
+-	altinstruction_entry 0b,2b,\feature1,5,5
+-	altinstruction_entry 0b,3b,\feature2,5,5
++	altinstruction_entry 0b,2b,\feature1,5,5,0
++	altinstruction_entry 0b,3b,\feature2,5,5,0
+ 	.previous
+ 	.endm
+ 
+--- a/arch/x86/lib/memcpy_64.S
++++ b/arch/x86/lib/memcpy_64.S
+@@ -203,8 +203,8 @@ ENDPROC(__memcpy)
+ 	 * only outcome...
+ 	 */
+ 	.section .altinstructions, "a"
+-	altinstruction_entry memcpy,.Lmemcpy_c,X86_FEATURE_REP_GOOD,\
+-			     .Lmemcpy_e-.Lmemcpy_c,.Lmemcpy_e-.Lmemcpy_c
+-	altinstruction_entry memcpy,.Lmemcpy_c_e,X86_FEATURE_ERMS, \
+-			     .Lmemcpy_e_e-.Lmemcpy_c_e,.Lmemcpy_e_e-.Lmemcpy_c_e
++	altinstruction_entry __memcpy,.Lmemcpy_c,X86_FEATURE_REP_GOOD,\
++			     .Lmemcpy_e-.Lmemcpy_c,.Lmemcpy_e-.Lmemcpy_c,0
++	altinstruction_entry __memcpy,.Lmemcpy_c_e,X86_FEATURE_ERMS, \
++			     .Lmemcpy_e_e-.Lmemcpy_c_e,.Lmemcpy_e_e-.Lmemcpy_c_e,0
+ 	.previous
+--- a/arch/x86/lib/memmove_64.S
++++ b/arch/x86/lib/memmove_64.S
+@@ -218,6 +218,6 @@ ENTRY(memmove)
+ 	altinstruction_entry .Lmemmove_begin_forward,		\
+ 		.Lmemmove_begin_forward_efs,X86_FEATURE_ERMS,	\
+ 		.Lmemmove_end_forward-.Lmemmove_begin_forward,	\
+-		.Lmemmove_end_forward_efs-.Lmemmove_begin_forward_efs
++		.Lmemmove_end_forward_efs-.Lmemmove_begin_forward_efs,0
+ 	.previous
+ ENDPROC(memmove)
+--- a/arch/x86/lib/memset_64.S
++++ b/arch/x86/lib/memset_64.S
+@@ -150,8 +150,8 @@ ENDPROC(__memset)
+          * feature to implement the right patch order.
+ 	 */
+ 	.section .altinstructions,"a"
+-	altinstruction_entry memset,.Lmemset_c,X86_FEATURE_REP_GOOD,\
+-			     .Lfinal-memset,.Lmemset_e-.Lmemset_c
+-	altinstruction_entry memset,.Lmemset_c_e,X86_FEATURE_ERMS, \
+-			     .Lfinal-memset,.Lmemset_e_e-.Lmemset_c_e
++	altinstruction_entry __memset,.Lmemset_c,X86_FEATURE_REP_GOOD,\
++			     .Lfinal-__memset,.Lmemset_e-.Lmemset_c,0
++	altinstruction_entry __memset,.Lmemset_c_e,X86_FEATURE_ERMS, \
++			     .Lfinal-__memset,.Lmemset_e_e-.Lmemset_c_e,0
+ 	.previous
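For readers unfamiliar with the alternatives mechanism this patch extends: each
altinstruction_entry now carries an explicit pad-len byte, and apply_alternatives()
only NOP-pads when the original instruction is longer than the replacement (the
old BUG_ON forbidding longer replacements is dropped). A minimal userspace sketch
of that copy-and-pad step -- the struct and function names here are ours, and the
flat byte model stands in for real instruction addresses:

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical flat model of one struct alt_instr entry after this
 * patch: lengths only, no real addresses or CPU feature bits. */
struct alt_entry {
	uint8_t instrlen;       /* length of the original instruction */
	uint8_t replacementlen; /* length of the replacement */
};

/* Mirrors the patched copy step in apply_alternatives(): copy the
 * replacement, then append 1-byte NOPs (0x90) only when the original
 * instruction is longer than the replacement. */
static int patch_site(const struct alt_entry *a, const uint8_t *repl,
		      uint8_t *insnbuf)
{
	memcpy(insnbuf, repl, a->replacementlen);
	if (a->instrlen > a->replacementlen)
		memset(insnbuf + a->replacementlen, 0x90,
		       a->instrlen - a->replacementlen);
	return a->instrlen;
}
```

In the real kernel the NOP fill comes from add_nops() and the bytes are poked in
with text_poke_early(); this sketch only shows the length bookkeeping.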
diff --git a/debian/patches/bugfix/all/kpti/x86-alternatives-cleanup-dprintk-macro.patch b/debian/patches/bugfix/all/kpti/x86-alternatives-cleanup-dprintk-macro.patch
new file mode 100644
index 0000000..1d27ad6
--- /dev/null
+++ b/debian/patches/bugfix/all/kpti/x86-alternatives-cleanup-dprintk-macro.patch
@@ -0,0 +1,108 @@
+From: Borislav Petkov <bp at suse.de>
+Date: Tue, 30 Dec 2014 20:27:09 +0100
+Subject: x86/alternatives: Cleanup DPRINTK macro
+
+commit db477a3386dee183130916d6bbf21f5828b0b2e2 upstream.
+
+Make it pass __func__ implicitly. Also, dump info about each replacement
+we're doing. Fix up comments and style while at it.
+
+Signed-off-by: Borislav Petkov <bp at suse.de>
+Signed-off-by: Hugh Dickins <hughd at google.com>
+[bwh: Update one more use of DPRINTK() that was removed upstream]
+Signed-off-by: Ben Hutchings <ben at decadent.org.uk>
+---
+ arch/x86/kernel/alternative.c | 42 +++++++++++++++++++++++++++---------------
+ 1 file changed, 27 insertions(+), 15 deletions(-)
+
+--- a/arch/x86/kernel/alternative.c
++++ b/arch/x86/kernel/alternative.c
+@@ -63,8 +63,11 @@ static int __init setup_noreplace_paravi
+ __setup("noreplace-paravirt", setup_noreplace_paravirt);
+ #endif
+ 
+-#define DPRINTK(fmt, args...) if (debug_alternative) \
+-	printk(KERN_DEBUG fmt, args)
++#define DPRINTK(fmt, args...)						\
++do {									\
++	if (debug_alternative)						\
++		printk(KERN_DEBUG "%s: " fmt "\n", __func__, ##args);	\
++} while (0)
+ 
+ /*
+  * Each GENERIC_NOPX is of X bytes, and defined as an array of bytes
+@@ -251,12 +254,13 @@ extern struct alt_instr __alt_instructio
+ extern s32 __smp_locks[], __smp_locks_end[];
+ void *text_poke_early(void *addr, const void *opcode, size_t len);
+ 
+-/* Replace instructions with better alternatives for this CPU type.
+-   This runs before SMP is initialized to avoid SMP problems with
+-   self modifying code. This implies that asymmetric systems where
+-   APs have less capabilities than the boot processor are not handled.
+-   Tough. Make sure you disable such features by hand. */
+-
++/*
++ * Replace instructions with better alternatives for this CPU type. This runs
++ * before SMP is initialized to avoid SMP problems with self modifying code.
++ * This implies that asymmetric systems where APs have less capabilities than
++ * the boot processor are not handled. Tough. Make sure you disable such
++ * features by hand.
++ */
+ void __init_or_module apply_alternatives(struct alt_instr *start,
+ 					 struct alt_instr *end)
+ {
+@@ -264,10 +268,10 @@ void __init_or_module apply_alternatives
+ 	u8 *instr, *replacement;
+ 	u8 insnbuf[MAX_PATCH_LEN];
+ 
+-	DPRINTK("%s: alt table %p -> %p\n", __func__, start, end);
++	DPRINTK("alt table %p -> %p", start, end);
+ 	/*
+ 	 * The scan order should be from start to end. A later scanned
+-	 * alternative code can overwrite a previous scanned alternative code.
++	 * alternative code can overwrite previously scanned alternative code.
+ 	 * Some kernel functions (e.g. memcpy, memset, etc) use this order to
+ 	 * patch code.
+ 	 *
+@@ -283,11 +287,19 @@ void __init_or_module apply_alternatives
+ 		if (!boot_cpu_has(a->cpuid))
+ 			continue;
+ 
++		DPRINTK("feat: %d*32+%d, old: (%p, len: %d), repl: (%p, len: %d)",
++			a->cpuid >> 5,
++			a->cpuid & 0x1f,
++			instr, a->instrlen,
++			replacement, a->replacementlen);
++
+ 		memcpy(insnbuf, replacement, a->replacementlen);
+ 
+ 		/* 0xe8 is a relative jump; fix the offset. */
+-		if (*insnbuf == 0xe8 && a->replacementlen == 5)
+-		    *(s32 *)(insnbuf + 1) += replacement - instr;
++		if (*insnbuf == 0xe8 && a->replacementlen == 5) {
++			*(s32 *)(insnbuf + 1) += replacement - instr;
++			DPRINTK("Fix CALL offset: 0x%x", *(s32 *)(insnbuf + 1));
++		}
+ 
+ 		add_nops(insnbuf + a->replacementlen,
+ 			 a->instrlen - a->replacementlen);
+@@ -383,8 +395,8 @@ void __init_or_module alternatives_smp_m
+ 	smp->locks_end	= locks_end;
+ 	smp->text	= text;
+ 	smp->text_end	= text_end;
+-	DPRINTK("%s: locks %p -> %p, text %p -> %p, name %s\n",
+-		__func__, smp->locks, smp->locks_end,
++	DPRINTK("locks %p -> %p, text %p -> %p, name %s\n",
++		smp->locks, smp->locks_end,
+ 		smp->text, smp->text_end, smp->name);
+ 
+ 	mutex_lock(&smp_alt);
+@@ -408,7 +420,7 @@ void __init_or_module alternatives_smp_m
+ 			continue;
+ 		list_del(&item->next);
+ 		mutex_unlock(&smp_alt);
+-		DPRINTK("%s: %s\n", __func__, item->name);
++		DPRINTK("%s\n", item->name);
+ 		kfree(item);
+ 		return;
+ 	}
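The macro change above is the classic do { } while (0) wrapper combined with
GCC's ##args extension. A standalone sketch of the same shape -- here logging
into a buffer so the effect is observable; debug_alternative and the buffer are
our stand-ins, not kernel symbols:

```c
#include <stdio.h>

static int debug_alternative = 1;
static char dprintk_buf[128];

/* Same shape as the cleaned-up DPRINTK: do { } while (0) makes the
 * macro expand to a single statement (safe in an un-braced if/else),
 * __func__ is passed implicitly, and GCC's ##args swallows the
 * trailing comma when the macro is used with no arguments. */
#define DPRINTK(fmt, args...)						\
do {									\
	if (debug_alternative)						\
		snprintf(dprintk_buf, sizeof(dprintk_buf),		\
			 "%s: " fmt, __func__, ##args);			\
} while (0)
```

This is why the callers in the patch can drop their explicit "%s:", __func__
arguments and trailing "\n".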
diff --git a/debian/patches/bugfix/all/kpti/x86-alternatives-make-jmps-more-robust.patch b/debian/patches/bugfix/all/kpti/x86-alternatives-make-jmps-more-robust.patch
new file mode 100644
index 0000000..5aa1b79
--- /dev/null
+++ b/debian/patches/bugfix/all/kpti/x86-alternatives-make-jmps-more-robust.patch
@@ -0,0 +1,257 @@
+From: Borislav Petkov <bp at suse.de>
+Date: Mon, 5 Jan 2015 13:48:41 +0100
+Subject: x86/alternatives: Make JMPs more robust
+
+commit 48c7a2509f9e237d8465399d9cdfe487d3212a23 upstream.
+
+Up until now we had to pay attention to relative JMPs in alternatives
+about how their relative offset gets computed so that the jump target
+is still correct. Or, as is the case for near CALLs (opcode e8), we
+still have to go and readjust the offset at patching time.
+
+What is more, the static_cpu_has_safe() facility had to forcefully
+generate 5-byte JMPs since we couldn't rely on the compiler to generate
+properly sized ones so we had to force the longest ones. Worse than
+that, sometimes it would generate a replacement JMP which is longer than
+the original one, thus overwriting the beginning of the next instruction
+at patching time.
+
+So, in order to alleviate all that and make using JMPs more
+straight-forward we go and pad the original instruction in an
+alternative block with NOPs at build time, should the replacement(s) be
+longer. This way, alternatives users shouldn't pay special attention
+so that original and replacement instruction sizes are fine but the
+assembler would simply add padding where needed and not do anything
+otherwise.
+
+As a second aspect, we go and recompute JMPs at patching time so that we
+can try to make 5-byte JMPs into two-byte ones if possible. If not, we
+still have to recompute the offsets as the replacement JMP gets put far
+away in the .altinstr_replacement section leading to a wrong offset if
+copied verbatim.
+
+For example, on a locally generated kernel image
+
+  old insn VA: 0xffffffff810014bd, CPU feat: X86_FEATURE_ALWAYS, size: 2
+  __switch_to:
+   ffffffff810014bd:      eb 21                   jmp ffffffff810014e0
+  repl insn: size: 5
+  ffffffff81d0b23c:       e9 b1 62 2f ff          jmpq ffffffff810014f2
+
+gets corrected to a 2-byte JMP:
+
+  apply_alternatives: feat: 3*32+21, old: (ffffffff810014bd, len: 2), repl: (ffffffff81d0b23c, len: 5)
+  alt_insn: e9 b1 62 2f ff
+  recompute_jumps: next_rip: ffffffff81d0b241, tgt_rip: ffffffff810014f2, new_displ: 0x00000033, ret len: 2
+  converted to: eb 33 90 90 90
+
+and a 5-byte JMP:
+
+  old insn VA: 0xffffffff81001516, CPU feat: X86_FEATURE_ALWAYS, size: 2
+  __switch_to:
+   ffffffff81001516:      eb 30                   jmp ffffffff81001548
+  repl insn: size: 5
+   ffffffff81d0b241:      e9 10 63 2f ff          jmpq ffffffff81001556
+
+gets shortened into a two-byte one:
+
+  apply_alternatives: feat: 3*32+21, old: (ffffffff81001516, len: 2), repl: (ffffffff81d0b241, len: 5)
+  alt_insn: e9 10 63 2f ff
+  recompute_jumps: next_rip: ffffffff81d0b246, tgt_rip: ffffffff81001556, new_displ: 0x0000003e, ret len: 2
+  converted to: eb 3e 90 90 90
+
+... and so on.
+
+This leads to a net win of around
+
+40ish replacements * 3 bytes savings =~ 120 bytes of I$
+
+on an AMD guest which means some savings of precious instruction cache
+bandwidth. The padding to the shorter 2-byte JMPs are single-byte NOPs
+which on smart microarchitectures means discarding NOPs at decode time
+and thus freeing up execution bandwidth.
+
+Signed-off-by: Borislav Petkov <bp at suse.de>
+Signed-off-by: Hugh Dickins <hughd at google.com>
+Signed-off-by: Ben Hutchings <ben at decadent.org.uk>
+---
+ arch/x86/kernel/alternative.c | 103 ++++++++++++++++++++++++++++++++++++++++--
+ arch/x86/lib/copy_user_64.S   |  11 ++---
+ 2 files changed, 103 insertions(+), 11 deletions(-)
+
+--- a/arch/x86/kernel/alternative.c
++++ b/arch/x86/kernel/alternative.c
+@@ -69,6 +69,21 @@ do {									\
+ 		printk(KERN_DEBUG "%s: " fmt "\n", __func__, ##args);	\
+ } while (0)
+ 
++#define DUMP_BYTES(buf, len, fmt, args...)				\
++do {									\
++	if (unlikely(debug_alternative)) {				\
++		int j;							\
++									\
++		if (!(len))						\
++			break;						\
++									\
++		printk(KERN_DEBUG fmt, ##args);				\
++		for (j = 0; j < (len) - 1; j++)				\
++			printk(KERN_CONT "%02hhx ", buf[j]);		\
++		printk(KERN_CONT "%02hhx\n", buf[j]);			\
++	}								\
++} while (0)
++
+ /*
+  * Each GENERIC_NOPX is of X bytes, and defined as an array of bytes
+  * that correspond to that nop. Getting from one nop to the next, we
+@@ -255,6 +270,71 @@ extern s32 __smp_locks[], __smp_locks_en
+ void *text_poke_early(void *addr, const void *opcode, size_t len);
+ 
+ /*
++ * Are we looking at a near JMP with a 1 or 4-byte displacement.
++ */
++static inline bool is_jmp(const u8 opcode)
++{
++	return opcode == 0xeb || opcode == 0xe9;
++}
++
++static void __init_or_module
++recompute_jump(struct alt_instr *a, u8 *orig_insn, u8 *repl_insn, u8 *insnbuf)
++{
++	u8 *next_rip, *tgt_rip;
++	s32 n_dspl, o_dspl;
++	int repl_len;
++
++	if (a->replacementlen != 5)
++		return;
++
++	o_dspl = *(s32 *)(insnbuf + 1);
++
++	/* next_rip of the replacement JMP */
++	next_rip = repl_insn + a->replacementlen;
++	/* target rip of the replacement JMP */
++	tgt_rip  = next_rip + o_dspl;
++	n_dspl = tgt_rip - orig_insn;
++
++	DPRINTK("target RIP: %p, new_displ: 0x%x", tgt_rip, n_dspl);
++
++	if (tgt_rip - orig_insn >= 0) {
++		if (n_dspl - 2 <= 127)
++			goto two_byte_jmp;
++		else
++			goto five_byte_jmp;
++	/* negative offset */
++	} else {
++		if (((n_dspl - 2) & 0xff) == (n_dspl - 2))
++			goto two_byte_jmp;
++		else
++			goto five_byte_jmp;
++	}
++
++two_byte_jmp:
++	n_dspl -= 2;
++
++	insnbuf[0] = 0xeb;
++	insnbuf[1] = (s8)n_dspl;
++	add_nops(insnbuf + 2, 3);
++
++	repl_len = 2;
++	goto done;
++
++five_byte_jmp:
++	n_dspl -= 5;
++
++	insnbuf[0] = 0xe9;
++	*(s32 *)&insnbuf[1] = n_dspl;
++
++	repl_len = 5;
++
++done:
++
++	DPRINTK("final displ: 0x%08x, JMP 0x%lx",
++		n_dspl, (unsigned long)orig_insn + n_dspl + repl_len);
++}
++
++/*
+  * Replace instructions with better alternatives for this CPU type. This runs
+  * before SMP is initialized to avoid SMP problems with self modifying code.
+  * This implies that asymmetric systems where APs have less capabilities than
+@@ -279,6 +359,8 @@ void __init_or_module apply_alternatives
+ 	 * order.
+ 	 */
+ 	for (a = start; a < end; a++) {
++		int insnbuf_sz = 0;
++
+ 		instr = (u8 *)&a->instr_offset + a->instr_offset;
+ 		replacement = (u8 *)&a->repl_offset + a->repl_offset;
+ 		BUG_ON(a->instrlen > sizeof(insnbuf));
+@@ -292,24 +374,35 @@ void __init_or_module apply_alternatives
+ 			instr, a->instrlen,
+ 			replacement, a->replacementlen);
+ 
++		DUMP_BYTES(instr, a->instrlen, "%p: old_insn: ", instr);
++		DUMP_BYTES(replacement, a->replacementlen, "%p: rpl_insn: ", replacement);
++
+ 		memcpy(insnbuf, replacement, a->replacementlen);
++		insnbuf_sz = a->replacementlen;
+ 
+ 		/* 0xe8 is a relative jump; fix the offset. */
+ 		if (*insnbuf == 0xe8 && a->replacementlen == 5) {
+ 			*(s32 *)(insnbuf + 1) += replacement - instr;
+-			DPRINTK("Fix CALL offset: 0x%x", *(s32 *)(insnbuf + 1));
++			DPRINTK("Fix CALL offset: 0x%x, CALL 0x%lx",
++				*(s32 *)(insnbuf + 1),
++				(unsigned long)instr + *(s32 *)(insnbuf + 1) + 5);
+ 		}
+ 
+-		if (a->instrlen > a->replacementlen)
++		if (a->replacementlen && is_jmp(replacement[0]))
++			recompute_jump(a, instr, replacement, insnbuf);
++
++		if (a->instrlen > a->replacementlen) {
+ 			add_nops(insnbuf + a->replacementlen,
+ 				 a->instrlen - a->replacementlen);
++			insnbuf_sz += a->instrlen - a->replacementlen;
++		}
++		DUMP_BYTES(insnbuf, insnbuf_sz, "%p: final_insn: ", instr);
+ 
+-		text_poke_early(instr, insnbuf, a->instrlen);
++		text_poke_early(instr, insnbuf, insnbuf_sz);
+ 	}
+ }
+ 
+ #ifdef CONFIG_SMP
+-
+ static void alternatives_smp_lock(const s32 *start, const s32 *end,
+ 				  u8 *text, u8 *text_end)
+ {
+@@ -495,7 +588,7 @@ int alternatives_text_reserved(void *sta
+ 
+ 	return 0;
+ }
+-#endif
++#endif /* CONFIG_SMP */
+ 
+ #ifdef CONFIG_PARAVIRT
+ void __init_or_module apply_paravirt(struct paravirt_patch_site *start,
+--- a/arch/x86/lib/copy_user_64.S
++++ b/arch/x86/lib/copy_user_64.S
+@@ -26,14 +26,13 @@
+  */
+ 	.macro ALTERNATIVE_JUMP feature1,feature2,orig,alt1,alt2
+ 0:
+-	.byte 0xe9	/* 32bit jump */
+-	.long \orig-1f	/* by default jump to orig */
++	jmp \orig
+ 1:
+ 	.section .altinstr_replacement,"ax"
+-2:	.byte 0xe9			/* near jump with 32bit immediate */
+-	.long \alt1-1b /* offset */   /* or alternatively to alt1 */
+-3:	.byte 0xe9			/* near jump with 32bit immediate */
+-	.long \alt2-1b /* offset */   /* or alternatively to alt2 */
++2:
++	jmp \alt1
++3:
++	jmp \alt2
+ 	.previous
+ 
+ 	.section .altinstructions,"a"
diff --git a/debian/patches/bugfix/all/kpti/x86-alternatives-use-optimized-nops-for-padding.patch b/debian/patches/bugfix/all/kpti/x86-alternatives-use-optimized-nops-for-padding.patch
new file mode 100644
index 0000000..cbcdd11
--- /dev/null
+++ b/debian/patches/bugfix/all/kpti/x86-alternatives-use-optimized-nops-for-padding.patch
@@ -0,0 +1,50 @@
+From: Borislav Petkov <bp at suse.de>
+Date: Sat, 10 Jan 2015 20:34:07 +0100
+Subject: x86/alternatives: Use optimized NOPs for padding
+
+commit 4fd4b6e5537cec5b56db0b22546dd439ebb26830 upstream.
+
+Alternatives allow now for an empty old instruction. In this case we go
+and pad the space with NOPs at assembly time. However, there are the
+optimal, longer NOPs which should be used. Do that at patching time by
+adding alt_instr.padlen-sized NOPs at the old instruction address.
+
+Cc: Andy Lutomirski <luto at amacapital.net>
+Signed-off-by: Borislav Petkov <bp at suse.de>
+Signed-off-by: Hugh Dickins <hughd at google.com>
+Signed-off-by: Ben Hutchings <ben at decadent.org.uk>
+---
+ arch/x86/kernel/alternative.c | 14 +++++++++++++-
+ 1 file changed, 13 insertions(+), 1 deletion(-)
+
+--- a/arch/x86/kernel/alternative.c
++++ b/arch/x86/kernel/alternative.c
+@@ -334,6 +334,14 @@ done:
+ 		n_dspl, (unsigned long)orig_insn + n_dspl + repl_len);
+ }
+ 
++static void __init_or_module optimize_nops(struct alt_instr *a, u8 *instr)
++{
++	add_nops(instr + (a->instrlen - a->padlen), a->padlen);
++
++	DUMP_BYTES(instr, a->instrlen, "%p: [%d:%d) optimized NOPs: ",
++		   instr, a->instrlen - a->padlen, a->padlen);
++}
++
+ /*
+  * Replace instructions with better alternatives for this CPU type. This runs
+  * before SMP is initialized to avoid SMP problems with self modifying code.
+@@ -365,8 +373,12 @@ void __init_or_module apply_alternatives
+ 		replacement = (u8 *)&a->repl_offset + a->repl_offset;
+ 		BUG_ON(a->instrlen > sizeof(insnbuf));
+ 		BUG_ON(a->cpuid >= NCAPINTS*32);
+-		if (!boot_cpu_has(a->cpuid))
++		if (!boot_cpu_has(a->cpuid)) {
++			if (a->padlen > 1)
++				optimize_nops(a, instr);
++
+ 			continue;
++		}
+ 
+ 		DPRINTK("feat: %d*32+%d, old: (%p, len: %d), repl: (%p, len: %d)",
+ 			a->cpuid >> 5,
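Note that optimize_nops() runs only on the negative path (feature absent) and
touches just the padlen tail of the live instruction. A trivial model of the
range it rewrites -- plain 0x90 stands in here for the optimal multi-byte NOP
forms the kernel actually emits via add_nops():

```c
#include <stdint.h>
#include <string.h>

/* Model of optimize_nops(): rewrite only the last padlen bytes of an
 * instrlen-byte instruction, leaving the leading bytes untouched. */
static void optimize_nops_sim(uint8_t *instr, int instrlen, int padlen)
{
	memset(instr + (instrlen - padlen), 0x90, padlen);
}
```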
diff --git a/debian/patches/bugfix/all/kpti/x86-boot-add-early-cmdline-parsing-for-options-with-arguments.patch b/debian/patches/bugfix/all/kpti/x86-boot-add-early-cmdline-parsing-for-options-with-arguments.patch
new file mode 100644
index 0000000..bb0c4c1
--- /dev/null
+++ b/debian/patches/bugfix/all/kpti/x86-boot-add-early-cmdline-parsing-for-options-with-arguments.patch
@@ -0,0 +1,174 @@
+From: Tom Lendacky <thomas.lendacky at amd.com>
+Date: Mon, 17 Jul 2017 16:10:33 -0500
+Subject: x86/boot: Add early cmdline parsing for options with arguments
+MIME-Version: 1.0
+Content-Type: text/plain; charset=UTF-8
+Content-Transfer-Encoding: 8bit
+
+commit e505371dd83963caae1a37ead9524e8d997341be upstream.
+
+Add a cmdline_find_option() function to look for cmdline options that
+take arguments. The argument is returned in a supplied buffer and the
+argument length (regardless of whether it fits in the supplied buffer)
+is returned, with -1 indicating not found.
+
+Signed-off-by: Tom Lendacky <thomas.lendacky at amd.com>
+Reviewed-by: Thomas Gleixner <tglx at linutronix.de>
+Cc: Alexander Potapenko <glider at google.com>
+Cc: Andrey Ryabinin <aryabinin at virtuozzo.com>
+Cc: Andy Lutomirski <luto at kernel.org>
+Cc: Arnd Bergmann <arnd at arndb.de>
+Cc: Borislav Petkov <bp at alien8.de>
+Cc: Brijesh Singh <brijesh.singh at amd.com>
+Cc: Dave Young <dyoung at redhat.com>
+Cc: Dmitry Vyukov <dvyukov at google.com>
+Cc: Jonathan Corbet <corbet at lwn.net>
+Cc: Konrad Rzeszutek Wilk <konrad.wilk at oracle.com>
+Cc: Larry Woodman <lwoodman at redhat.com>
+Cc: Linus Torvalds <torvalds at linux-foundation.org>
+Cc: Matt Fleming <matt at codeblueprint.co.uk>
+Cc: Michael S. Tsirkin <mst at redhat.com>
+Cc: Paolo Bonzini <pbonzini at redhat.com>
+Cc: Peter Zijlstra <peterz at infradead.org>
+Cc: Radim Krčmář <rkrcmar at redhat.com>
+Cc: Rik van Riel <riel at redhat.com>
+Cc: Toshimitsu Kani <toshi.kani at hpe.com>
+Cc: kasan-dev at googlegroups.com
+Cc: kvm at vger.kernel.org
+Cc: linux-arch at vger.kernel.org
+Cc: linux-doc at vger.kernel.org
+Cc: linux-efi at vger.kernel.org
+Cc: linux-mm at kvack.org
+Link: http://lkml.kernel.org/r/36b5f97492a9745dce27682305f990fc20e5cf8a.1500319216.git.thomas.lendacky@amd.com
+Signed-off-by: Ingo Molnar <mingo at kernel.org>
+Signed-off-by: Ben Hutchings <ben at decadent.org.uk>
+---
+ arch/x86/include/asm/cmdline.h |   2 +
+ arch/x86/lib/cmdline.c         | 105 +++++++++++++++++++++++++++++++++++++++++
+ 2 files changed, 107 insertions(+)
+
+--- a/arch/x86/include/asm/cmdline.h
++++ b/arch/x86/include/asm/cmdline.h
+@@ -2,5 +2,7 @@
+ #define _ASM_X86_CMDLINE_H
+ 
+ int cmdline_find_option_bool(const char *cmdline_ptr, const char *option);
++int cmdline_find_option(const char *cmdline_ptr, const char *option,
++			char *buffer, int bufsize);
+ 
+ #endif /* _ASM_X86_CMDLINE_H */
+--- a/arch/x86/lib/cmdline.c
++++ b/arch/x86/lib/cmdline.c
+@@ -104,7 +104,112 @@ __cmdline_find_option_bool(const char *c
+ 	return 0;	/* Buffer overrun */
+ }
+ 
++/*
++ * Find a non-boolean option (i.e. option=argument). In accordance with
++ * standard Linux practice, if this option is repeated, this returns the
++ * last instance on the command line.
++ *
++ * @cmdline: the cmdline string
++ * @max_cmdline_size: the maximum size of cmdline
++ * @option: option string to look for
++ * @buffer: memory buffer to return the option argument
++ * @bufsize: size of the supplied memory buffer
++ *
++ * Returns the length of the argument (regardless of if it was
++ * truncated to fit in the buffer), or -1 on not found.
++ */
++static int
++__cmdline_find_option(const char *cmdline, int max_cmdline_size,
++		      const char *option, char *buffer, int bufsize)
++{
++	char c;
++	int pos = 0, len = -1;
++	const char *opptr = NULL;
++	char *bufptr = buffer;
++	enum {
++		st_wordstart = 0,	/* Start of word/after whitespace */
++		st_wordcmp,	/* Comparing this word */
++		st_wordskip,	/* Miscompare, skip */
++		st_bufcpy,	/* Copying this to buffer */
++	} state = st_wordstart;
++
++	if (!cmdline)
++		return -1;      /* No command line */
++
++	/*
++	 * This 'pos' check ensures we do not overrun
++	 * a non-NULL-terminated 'cmdline'
++	 */
++	while (pos++ < max_cmdline_size) {
++		c = *(char *)cmdline++;
++		if (!c)
++			break;
++
++		switch (state) {
++		case st_wordstart:
++			if (myisspace(c))
++				break;
++
++			state = st_wordcmp;
++			opptr = option;
++			/* fall through */
++
++		case st_wordcmp:
++			if ((c == '=') && !*opptr) {
++				/*
++				 * We matched all the way to the end of the
++				 * option we were looking for, prepare to
++				 * copy the argument.
++				 */
++				len = 0;
++				bufptr = buffer;
++				state = st_bufcpy;
++				break;
++			} else if (c == *opptr++) {
++				/*
++				 * We are currently matching, so continue
++				 * to the next character on the cmdline.
++				 */
++				break;
++			}
++			state = st_wordskip;
++			/* fall through */
++
++		case st_wordskip:
++			if (myisspace(c))
++				state = st_wordstart;
++			break;
++
++		case st_bufcpy:
++			if (myisspace(c)) {
++				state = st_wordstart;
++			} else {
++				/*
++				 * Increment len, but don't overrun the
++				 * supplied buffer and leave room for the
++				 * NULL terminator.
++				 */
++				if (++len < bufsize)
++					*bufptr++ = c;
++			}
++			break;
++		}
++	}
++
++	if (bufsize)
++		*bufptr = '\0';
++
++	return len;
++}
++
+ int cmdline_find_option_bool(const char *cmdline, const char *option)
+ {
+ 	return __cmdline_find_option_bool(cmdline, COMMAND_LINE_SIZE, option);
+ }
++
++int cmdline_find_option(const char *cmdline, const char *option, char *buffer,
++			int bufsize)
++{
++	return __cmdline_find_option(cmdline, COMMAND_LINE_SIZE, option,
++				     buffer, bufsize);
++}
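The state machine added above is compact enough to model in userspace. This
sketch (our names; only ' ' treated as whitespace and no COMMAND_LINE_SIZE cap)
shows the intended semantics: the argument of the last occurrence wins, and the
returned length reflects the full argument even when the buffer truncates it:

```c
/* Miniature of __cmdline_find_option(): scan for "option=argument",
 * copy the argument into buffer, return its full length or -1. */
static int find_option(const char *cmdline, const char *option,
		       char *buffer, int bufsize)
{
	int len = -1, bufidx = 0;
	const char *opptr = 0;
	enum { WORDSTART, WORDCMP, WORDSKIP, BUFCPY } state = WORDSTART;
	char c;

	while ((c = *cmdline++)) {
		switch (state) {
		case WORDSTART:
			if (c == ' ')
				break;
			state = WORDCMP;
			opptr = option;
			/* fall through */
		case WORDCMP:
			if (c == '=' && !*opptr) {
				/* full option matched: copy argument,
				 * resetting so a later instance wins */
				len = 0;
				bufidx = 0;
				state = BUFCPY;
			} else if (c != *opptr++) {
				state = WORDSKIP;
			}
			break;
		case WORDSKIP:
			if (c == ' ')
				state = WORDSTART;
			break;
		case BUFCPY:
			if (c == ' ')
				state = WORDSTART;
			else if (++len < bufsize) /* leave room for NUL */
				buffer[bufidx++] = c;
			break;
		}
	}
	if (bufsize)
		buffer[bufidx] = '\0';
	return len;
}
```

KPTI uses exactly this to read pti=on/off/auto before the normal parameter
machinery is up.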
diff --git a/debian/patches/bugfix/all/kpti/x86-boot-carve-out-early-cmdline-parsing-function.patch b/debian/patches/bugfix/all/kpti/x86-boot-carve-out-early-cmdline-parsing-function.patch
new file mode 100644
index 0000000..e79631b
--- /dev/null
+++ b/debian/patches/bugfix/all/kpti/x86-boot-carve-out-early-cmdline-parsing-function.patch
@@ -0,0 +1,131 @@
+From: Borislav Petkov <bp at suse.de>
+Date: Mon, 19 May 2014 20:59:16 +0200
+Subject: x86, boot: Carve out early cmdline parsing function
+
+commit 1b1ded57a4f2f4420b4de7c395d1b841d8b3c41a upstream.
+
+Carve out early cmdline parsing function into .../lib/cmdline.c so it
+can be used by early code in the kernel proper as well.
+
+Adapted from arch/x86/boot/cmdline.c.
+
+Signed-off-by: Borislav Petkov <bp at suse.de>
+Link: http://lkml.kernel.org/r/1400525957-11525-2-git-send-email-bp@alien8.de
+Signed-off-by: H. Peter Anvin <hpa at zytor.com>
+[bwh: Backported to 3.2: adjust context]
+Signed-off-by: Ben Hutchings <ben at decadent.org.uk>
+---
+ arch/x86/include/asm/cmdline.h |  6 +++
+ arch/x86/lib/Makefile          |  2 +-
+ arch/x86/lib/cmdline.c         | 84 ++++++++++++++++++++++++++++++++++++++++++
+ 3 files changed, 91 insertions(+), 1 deletion(-)
+ create mode 100644 arch/x86/include/asm/cmdline.h
+ create mode 100644 arch/x86/lib/cmdline.c
+
+--- /dev/null
++++ b/arch/x86/include/asm/cmdline.h
+@@ -0,0 +1,6 @@
++#ifndef _ASM_X86_CMDLINE_H
++#define _ASM_X86_CMDLINE_H
++
++int cmdline_find_option_bool(const char *cmdline_ptr, const char *option);
++
++#endif /* _ASM_X86_CMDLINE_H */
+--- a/arch/x86/lib/Makefile
++++ b/arch/x86/lib/Makefile
+@@ -16,7 +16,7 @@ clean-files := inat-tables.c
+ 
+ obj-$(CONFIG_SMP) += msr-smp.o cache-smp.o
+ 
+-lib-y := delay.o
++lib-y := delay.o cmdline.o
+ lib-y += thunk_$(BITS).o
+ lib-y += usercopy_$(BITS).o usercopy.o getuser.o putuser.o
+ lib-y += memcpy_$(BITS).o
+--- /dev/null
++++ b/arch/x86/lib/cmdline.c
+@@ -0,0 +1,84 @@
++/*
++ * This file is part of the Linux kernel, and is made available under
++ * the terms of the GNU General Public License version 2.
++ *
++ * Misc librarized functions for cmdline poking.
++ */
++#include <linux/kernel.h>
++#include <linux/string.h>
++#include <linux/ctype.h>
++#include <asm/setup.h>
++
++static inline int myisspace(u8 c)
++{
++	return c <= ' ';	/* Close enough approximation */
++}
++
++/**
++ * Find a boolean option (like quiet,noapic,nosmp....)
++ *
++ * @cmdline: the cmdline string
++ * @option: option string to look for
++ *
++ * Returns the position of that @option (starts counting with 1)
++ * or 0 on not found.
++ */
++int cmdline_find_option_bool(const char *cmdline, const char *option)
++{
++	char c;
++	int len, pos = 0, wstart = 0;
++	const char *opptr = NULL;
++	enum {
++		st_wordstart = 0,	/* Start of word/after whitespace */
++		st_wordcmp,	/* Comparing this word */
++		st_wordskip,	/* Miscompare, skip */
++	} state = st_wordstart;
++
++	if (!cmdline)
++		return -1;      /* No command line */
++
++	len = min_t(int, strlen(cmdline), COMMAND_LINE_SIZE);
++	if (!len)
++		return 0;
++
++	while (len--) {
++		c = *(char *)cmdline++;
++		pos++;
++
++		switch (state) {
++		case st_wordstart:
++			if (!c)
++				return 0;
++			else if (myisspace(c))
++				break;
++
++			state = st_wordcmp;
++			opptr = option;
++			wstart = pos;
++			/* fall through */
++
++		case st_wordcmp:
++			if (!*opptr)
++				if (!c || myisspace(c))
++					return wstart;
++				else
++					state = st_wordskip;
++			else if (!c)
++				return 0;
++			else if (c != *opptr++)
++				state = st_wordskip;
++			else if (!len)		/* last word and is matching */
++				return wstart;
++			break;
++
++		case st_wordskip:
++			if (!c)
++				return 0;
++			else if (myisspace(c))
++				state = st_wordstart;
++			break;
++		}
++	}
++
++	return 0;	/* Buffer overrun */
++}
diff --git a/debian/patches/bugfix/all/kpti/x86-boot-fix-early-command-line-parsing-when-matching-at-end.patch b/debian/patches/bugfix/all/kpti/x86-boot-fix-early-command-line-parsing-when-matching-at-end.patch
new file mode 100644
index 0000000..04b2d17
--- /dev/null
+++ b/debian/patches/bugfix/all/kpti/x86-boot-fix-early-command-line-parsing-when-matching-at-end.patch
@@ -0,0 +1,119 @@
+From: Dave Hansen <dave.hansen at linux.intel.com>
+Date: Tue, 22 Dec 2015 14:52:38 -0800
+Subject: x86/boot: Fix early command-line parsing when matching at end
+
+commit 02afeaae9843733a39cd9b11053748b2d1dc5ae7 upstream.
+
+The x86 early command line parsing in cmdline_find_option_bool() is
+buggy. If it matches a specified 'option' all the way to the end of the
+command-line, it will consider it a match.
+
+For instance,
+
+  cmdline = "foo";
+  cmdline_find_option_bool(cmdline, "fool");
+
+will return 1. This is particularly annoying since we have actual FPU
+options like "noxsave" and "noxsaves". So, command-line "foo bar noxsave"
+will match *BOTH* a "noxsave" and "noxsaves". (This turns out not to be
+an actual problem because "noxsave" implies "noxsaves", but it's still
+confusing.)
+
+To fix this, we simplify the code and stop tracking 'len'. 'len'
+was trying to indicate either the NULL terminator *OR* the end of a
+non-NULL-terminated command line at 'COMMAND_LINE_SIZE'. But, each of the
+three states is *already* checking 'cmdline' for a NULL terminator.
+
+We _only_ need to check if we have overrun 'COMMAND_LINE_SIZE', and that
+we can do without keeping 'len' around.
+
+Also add some comments to clarify what is going on.
+
+Signed-off-by: Dave Hansen <dave.hansen at linux.intel.com>
+Signed-off-by: Borislav Petkov <bp at suse.de>
+Cc: Andy Lutomirski <luto at amacapital.net>
+Cc: Borislav Petkov <bp at alien8.de>
+Cc: Brian Gerst <brgerst at gmail.com>
+Cc: Denys Vlasenko <dvlasenk at redhat.com>
+Cc: H. Peter Anvin <hpa at zytor.com>
+Cc: Linus Torvalds <torvalds at linux-foundation.org>
+Cc: Peter Zijlstra <peterz at infradead.org>
+Cc: Thomas Gleixner <tglx at linutronix.de>
+Cc: fenghua.yu at intel.com
+Cc: yu-cheng.yu at intel.com
+Link: http://lkml.kernel.org/r/20151222225238.9AEB560C@viggo.jf.intel.com
+Signed-off-by: Ingo Molnar <mingo at kernel.org>
+Signed-off-by: Ben Hutchings <ben at decadent.org.uk>
+---
+ arch/x86/lib/cmdline.c | 34 ++++++++++++++++++++++++----------
+ 1 file changed, 24 insertions(+), 10 deletions(-)
+
+--- a/arch/x86/lib/cmdline.c
++++ b/arch/x86/lib/cmdline.c
+@@ -21,12 +21,14 @@ static inline int myisspace(u8 c)
+  * @option: option string to look for
+  *
+  * Returns the position of that @option (starts counting with 1)
+- * or 0 on not found.
++ * or 0 on not found.  @option will only be found if it is found
++ * as an entire word in @cmdline.  For instance, if @option="car"
++ * then a cmdline which contains "cart" will not match.
+  */
+ int cmdline_find_option_bool(const char *cmdline, const char *option)
+ {
+ 	char c;
+-	int len, pos = 0, wstart = 0;
++	int pos = 0, wstart = 0;
+ 	const char *opptr = NULL;
+ 	enum {
+ 		st_wordstart = 0,	/* Start of word/after whitespace */
+@@ -37,11 +39,14 @@ int cmdline_find_option_bool(const char
+ 	if (!cmdline)
+ 		return -1;      /* No command line */
+ 
+-	len = min_t(int, strlen(cmdline), COMMAND_LINE_SIZE);
+-	if (!len)
++	if (!strlen(cmdline))
+ 		return 0;
+ 
+-	while (len--) {
++	/*
++	 * This 'pos' check ensures we do not overrun
++	 * a non-NULL-terminated 'cmdline'
++	 */
++	while (pos < COMMAND_LINE_SIZE) {
+ 		c = *(char *)cmdline++;
+ 		pos++;
+ 
+@@ -58,17 +63,26 @@ int cmdline_find_option_bool(const char
+ 			/* fall through */
+ 
+ 		case st_wordcmp:
+-			if (!*opptr)
++			if (!*opptr) {
++				/*
++				 * We matched all the way to the end of the
++				 * option we were looking for.  If the
++				 * command-line has a space _or_ ends, then
++				 * we matched!
++				 */
+ 				if (!c || myisspace(c))
+ 					return wstart;
+ 				else
+ 					state = st_wordskip;
+-			else if (!c)
++			} else if (!c) {
++				/*
++				 * Hit the NULL terminator on the end of
++				 * cmdline.
++				 */
+ 				return 0;
+-			else if (c != *opptr++)
++			} else if (c != *opptr++) {
+ 				state = st_wordskip;
+-			else if (!len)		/* last word and is matching */
+-				return wstart;
++			}
+ 			break;
+ 
+ 		case st_wordskip:
diff --git a/debian/patches/bugfix/all/kpti/x86-boot-fix-early-command-line-parsing-when-partial-word-matches.patch b/debian/patches/bugfix/all/kpti/x86-boot-fix-early-command-line-parsing-when-partial-word-matches.patch
new file mode 100644
index 0000000..2b7a9af
--- /dev/null
+++ b/debian/patches/bugfix/all/kpti/x86-boot-fix-early-command-line-parsing-when-partial-word-matches.patch
@@ -0,0 +1,100 @@
+From: Dave Hansen <dave.hansen at linux.intel.com>
+Date: Tue, 22 Dec 2015 14:52:39 -0800
+Subject: x86/boot: Fix early command-line parsing when partial word matches
+
+commit abcdc1c694fa4055323cbec1cde4c2cb6b68398c upstream.
+
+cmdline_find_option_bool() keeps track of position in two strings:
+
+ 1. the command-line
+ 2. the option we are searching for in the command-line
+
+We plow through each character in the command-line one at a time, always
+moving forward. We move forward in the option ('opptr') when we match
+characters in 'cmdline'. We reset the 'opptr' only when we go into the
+'st_wordstart' state.
+
+But, if we fail to match an option because we see a space
+(state=st_wordcmp, *opptr='\0',c=' '), we set state='st_wordskip' and
+'break', moving to the next character. But, that move to the next
+character is the one *after* the ' '. This means that we will miss a
+'st_wordstart' state.
+
+For instance, if we have
+
+  cmdline = "foo fool";
+
+and are searching for "fool", we have:
+
+	  "fool"
+  opptr = ----^
+
+           "foo fool"
+   c = --------^
+
+We see that 'l' != ' ', set state=st_wordskip, break, and then move 'c', so:
+
+          "foo fool"
+  c = ---------^
+
+and are still in state=st_wordskip. We will stay in wordskip until we
+have skipped "fool", thus missing the option we were looking for. This
+*only* happens when you have a partially-matching word followed by a
+matching one.
+
+To fix this, we always fall *into* the 'st_wordskip' state when we set
+it.
+
+Signed-off-by: Dave Hansen <dave.hansen at linux.intel.com>
+Signed-off-by: Borislav Petkov <bp at suse.de>
+Cc: Andy Lutomirski <luto at amacapital.net>
+Cc: Borislav Petkov <bp at alien8.de>
+Cc: Brian Gerst <brgerst at gmail.com>
+Cc: Denys Vlasenko <dvlasenk at redhat.com>
+Cc: H. Peter Anvin <hpa at zytor.com>
+Cc: Linus Torvalds <torvalds at linux-foundation.org>
+Cc: Peter Zijlstra <peterz at infradead.org>
+Cc: Thomas Gleixner <tglx at linutronix.de>
+Cc: fenghua.yu at intel.com
+Cc: yu-cheng.yu at intel.com
+Link: http://lkml.kernel.org/r/20151222225239.8E1DCA58@viggo.jf.intel.com
+Signed-off-by: Ingo Molnar <mingo at kernel.org>
+Signed-off-by: Ben Hutchings <ben at decadent.org.uk>
+---
+ arch/x86/lib/cmdline.c | 18 +++++++++++++-----
+ 1 file changed, 13 insertions(+), 5 deletions(-)
+
+--- a/arch/x86/lib/cmdline.c
++++ b/arch/x86/lib/cmdline.c
+@@ -72,18 +72,26 @@ int cmdline_find_option_bool(const char
+ 				 */
+ 				if (!c || myisspace(c))
+ 					return wstart;
+-				else
+-					state = st_wordskip;
++				/*
++				 * We hit the end of the option, but _not_
++				 * the end of a word on the cmdline.  Not
++				 * a match.
++				 */
+ 			} else if (!c) {
+ 				/*
+ 				 * Hit the NULL terminator on the end of
+ 				 * cmdline.
+ 				 */
+ 				return 0;
+-			} else if (c != *opptr++) {
+-				state = st_wordskip;
++			} else if (c == *opptr++) {
++				/*
++				 * We are currently matching, so continue
++				 * to the next character on the cmdline.
++				 */
++				break;
+ 			}
+-			break;
++			state = st_wordskip;
++			/* fall through */
+ 
+ 		case st_wordskip:
+ 			if (!c)
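The fixed state machine in the patch above can be sketched as a stand-alone userspace re-implementation. This is illustration only, not the kernel code: `find_option_bool()` and its `max` parameter are hypothetical stand-ins for `cmdline_find_option_bool()` and `COMMAND_LINE_SIZE`, but the word-boundary logic (including the fall-through that re-examines the character which broke the match) mirrors the patched version.

```c
#include <stddef.h>

/* Same loose whitespace test as the kernel's myisspace(). */
static int myisspace(char c)
{
	return c <= ' ';
}

/* Returns the 1-based position of @option as a whole word in @cmdline,
 * 0 if absent, -1 if @cmdline is NULL.  @max bounds a possibly
 * non-NUL-terminated buffer. */
int find_option_bool(const char *cmdline, int max, const char *option)
{
	const char *opptr = NULL;
	int pos = 0, wstart = 0;
	enum { st_wordstart, st_wordcmp, st_wordskip } state = st_wordstart;
	char c;

	if (!cmdline)
		return -1;

	while (pos < max) {
		c = *cmdline++;
		pos++;

		switch (state) {
		case st_wordstart:
			if (!c)
				return 0;
			else if (myisspace(c))
				break;
			state = st_wordcmp;
			opptr = option;
			wstart = pos;
			/* fall through */
		case st_wordcmp:
			if (!*opptr) {
				if (!c || myisspace(c))
					return wstart;	/* whole-word match */
			} else if (!c) {
				return 0;	/* terminator mid-compare */
			} else if (c == *opptr++) {
				break;		/* still matching */
			}
			state = st_wordskip;
			/*
			 * Fall through so the character that broke the match
			 * is re-examined: if it is a space, the *next*
			 * character correctly starts a new word.  This is
			 * exactly the bug the patch above fixes.
			 */
			/* fall through */
		case st_wordskip:
			if (!c)
				return 0;
			else if (myisspace(c))
				state = st_wordstart;
			break;
		}
	}
	return 0;	/* ran off the end of the buffer */
}
```

With the fix, `find_option_bool("foo fool", 256, "fool")` returns 5 (the start of the second word), while `find_option_bool("cart", 256, "car")` still correctly returns 0.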
diff --git a/debian/patches/bugfix/all/kpti/x86-boot-pass-in-size-to-early-cmdline-parsing.patch b/debian/patches/bugfix/all/kpti/x86-boot-pass-in-size-to-early-cmdline-parsing.patch
new file mode 100644
index 0000000..be397a8
--- /dev/null
+++ b/debian/patches/bugfix/all/kpti/x86-boot-pass-in-size-to-early-cmdline-parsing.patch
@@ -0,0 +1,59 @@
+From: Dave Hansen <dave.hansen at linux.intel.com>
+Date: Tue, 22 Dec 2015 14:52:43 -0800
+Subject: x86/boot: Pass in size to early cmdline parsing
+
+commit 8c0517759a1a100a8b83134cf3c7f254774aaeba upstream.
+
+We will use this in a few patches to implement tests for early parsing.
+
+Signed-off-by: Dave Hansen <dave.hansen at linux.intel.com>
+[ Aligned args properly. ]
+Signed-off-by: Borislav Petkov <bp at suse.de>
+Cc: Andy Lutomirski <luto at amacapital.net>
+Cc: Borislav Petkov <bp at alien8.de>
+Cc: Brian Gerst <brgerst at gmail.com>
+Cc: Denys Vlasenko <dvlasenk at redhat.com>
+Cc: H. Peter Anvin <hpa at zytor.com>
+Cc: Linus Torvalds <torvalds at linux-foundation.org>
+Cc: Peter Zijlstra <peterz at infradead.org>
+Cc: Thomas Gleixner <tglx at linutronix.de>
+Cc: fenghua.yu at intel.com
+Cc: yu-cheng.yu at intel.com
+Link: http://lkml.kernel.org/r/20151222225243.5CC47EB6@viggo.jf.intel.com
+Signed-off-by: Ingo Molnar <mingo at kernel.org>
+Signed-off-by: Ben Hutchings <ben at decadent.org.uk>
+---
+ arch/x86/lib/cmdline.c | 11 +++++++++--
+ 1 file changed, 9 insertions(+), 2 deletions(-)
+
+--- a/arch/x86/lib/cmdline.c
++++ b/arch/x86/lib/cmdline.c
+@@ -25,7 +25,9 @@ static inline int myisspace(u8 c)
+  * as an entire word in @cmdline.  For instance, if @option="car"
+  * then a cmdline which contains "cart" will not match.
+  */
+-int cmdline_find_option_bool(const char *cmdline, const char *option)
++static int
++__cmdline_find_option_bool(const char *cmdline, int max_cmdline_size,
++			   const char *option)
+ {
+ 	char c;
+ 	int pos = 0, wstart = 0;
+@@ -43,7 +45,7 @@ int cmdline_find_option_bool(const char
+ 	 * This 'pos' check ensures we do not overrun
+ 	 * a non-NULL-terminated 'cmdline'
+ 	 */
+-	while (pos < COMMAND_LINE_SIZE) {
++	while (pos < max_cmdline_size) {
+ 		c = *(char *)cmdline++;
+ 		pos++;
+ 
+@@ -101,3 +103,8 @@ int cmdline_find_option_bool(const char
+ 
+ 	return 0;	/* Buffer overrun */
+ }
++
++int cmdline_find_option_bool(const char *cmdline, const char *option)
++{
++	return __cmdline_find_option_bool(cmdline, COMMAND_LINE_SIZE, option);
++}
diff --git a/debian/patches/bugfix/all/kpti/x86-boot-simplify-early-command-line-parsing.patch b/debian/patches/bugfix/all/kpti/x86-boot-simplify-early-command-line-parsing.patch
new file mode 100644
index 0000000..1fe339c
--- /dev/null
+++ b/debian/patches/bugfix/all/kpti/x86-boot-simplify-early-command-line-parsing.patch
@@ -0,0 +1,51 @@
+From: Dave Hansen <dave.hansen at linux.intel.com>
+Date: Tue, 22 Dec 2015 14:52:41 -0800
+Subject: x86/boot: Simplify early command line parsing
+
+commit 4de07ea481361b08fe13735004dafae862482d38 upstream.
+
+__cmdline_find_option_bool() tries to account for both NULL-terminated
+and non-NULL-terminated strings. It keeps 'pos' to look for the end of
+the buffer and also looks for '!c' in a bunch of places to look for NULL
+termination.
+
+But, it also calls strlen(). You can't call strlen on a
+non-NULL-terminated string.
+
+If !strlen(cmdline), then cmdline[0]=='\0'. In that case, we will go into
+the while() loop, set c='\0', hit st_wordstart, notice !c, and will
+immediately return 0.
+
+So, remove the strlen().  It is unnecessary and unsafe.
+
+Signed-off-by: Dave Hansen <dave.hansen at linux.intel.com>
+Signed-off-by: Borislav Petkov <bp at suse.de>
+Cc: Andy Lutomirski <luto at amacapital.net>
+Cc: Borislav Petkov <bp at alien8.de>
+Cc: Brian Gerst <brgerst at gmail.com>
+Cc: Denys Vlasenko <dvlasenk at redhat.com>
+Cc: H. Peter Anvin <hpa at zytor.com>
+Cc: Linus Torvalds <torvalds at linux-foundation.org>
+Cc: Peter Zijlstra <peterz at infradead.org>
+Cc: Thomas Gleixner <tglx at linutronix.de>
+Cc: fenghua.yu at intel.com
+Cc: yu-cheng.yu at intel.com
+Link: http://lkml.kernel.org/r/20151222225241.15365E43@viggo.jf.intel.com
+Signed-off-by: Ingo Molnar <mingo at kernel.org>
+Signed-off-by: Ben Hutchings <ben at decadent.org.uk>
+---
+ arch/x86/lib/cmdline.c | 3 ---
+ 1 file changed, 3 deletions(-)
+
+--- a/arch/x86/lib/cmdline.c
++++ b/arch/x86/lib/cmdline.c
+@@ -39,9 +39,6 @@ int cmdline_find_option_bool(const char
+ 	if (!cmdline)
+ 		return -1;      /* No command line */
+ 
+-	if (!strlen(cmdline))
+-		return 0;
+-
+ 	/*
+ 	 * This 'pos' check ensures we do not overrun
+ 	 * a non-NULL-terminated 'cmdline'
diff --git a/debian/patches/bugfix/all/kpti/x86-cpufeature-add-cpu-features-from-intel-document-319433-012a.patch b/debian/patches/bugfix/all/kpti/x86-cpufeature-add-cpu-features-from-intel-document-319433-012a.patch
new file mode 100644
index 0000000..036aca3
--- /dev/null
+++ b/debian/patches/bugfix/all/kpti/x86-cpufeature-add-cpu-features-from-intel-document-319433-012a.patch
@@ -0,0 +1,30 @@
+From: "H. Peter Anvin" <hpa at linux.intel.com>
+Date: Tue, 21 Feb 2012 17:25:50 -0800
+Subject: x86, cpufeature: Add CPU features from Intel document 319433-012A
+
+commit 513c4ec6e4759aa33c90af0658b82eb4d2027871 upstream.
+
+Add CPU features from the Intel Architecture Instruction Set Extensions
+Programming Reference version 012A (Feb 2012), document number 319433-012A.
+
+Signed-off-by: H. Peter Anvin <hpa at linux.intel.com>
+Signed-off-by: Hugh Dickins <hughd at google.com>
+Signed-off-by: Ben Hutchings <ben at decadent.org.uk>
+---
+ arch/x86/include/asm/cpufeature.h | 3 +++
+ 1 file changed, 3 insertions(+)
+
+--- a/arch/x86/include/asm/cpufeature.h
++++ b/arch/x86/include/asm/cpufeature.h
+@@ -199,8 +199,11 @@
+ 
+ /* Intel-defined CPU features, CPUID level 0x00000007:0 (ebx), word 9 */
+ #define X86_FEATURE_FSGSBASE	(9*32+ 0) /* {RD/WR}{FS/GS}BASE instructions*/
++#define X86_FEATURE_HLE		(9*32+ 4) /* Hardware Lock Elision */
+ #define X86_FEATURE_SMEP	(9*32+ 7) /* Supervisor Mode Execution Protection */
+ #define X86_FEATURE_ERMS	(9*32+ 9) /* Enhanced REP MOVSB/STOSB */
++#define X86_FEATURE_INVPCID	(9*32+10) /* Invalidate Processor Context ID */
++#define X86_FEATURE_RTM		(9*32+11) /* Restricted Transactional Memory */
+ 
+ #if defined(__KERNEL__) && !defined(__ASSEMBLY__)
+ 
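Each `X86_FEATURE_*` constant above packs a (word, bit) pair: word 9 is the EBX output of CPUID leaf 0x00000007:0, so e.g. `9*32+10` decodes to bit 10 of that word. A minimal sketch of the decoding, with hypothetical helper names (the kernel's `cpu_has()`/`boot_cpu_has()` machinery is analogous but not identical):

```c
#include <stdint.h>

#define X86_FEATURE_HLE     (9*32 +  4)	/* Hardware Lock Elision */
#define X86_FEATURE_INVPCID (9*32 + 10)	/* Invalidate Processor Context ID */
#define X86_FEATURE_RTM     (9*32 + 11)	/* Restricted Transactional Memory */

/* Which 32-bit capability word, and which bit inside it. */
static int feature_word(int feature) { return feature / 32; }
static int feature_bit(int feature)  { return feature % 32; }

/* caps[] is the flattened capability bitmap, one 32-bit word per leaf. */
static int has_feature(const uint32_t *caps, int feature)
{
	return (caps[feature_word(feature)] >> feature_bit(feature)) & 1;
}

/* Demo bitmap: pretend CPUID.7:EBX reported INVPCID but not RTM. */
static const uint32_t demo_caps[10] = { [9] = 1u << 10 };
```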
diff --git a/debian/patches/bugfix/all/kpti/x86-kaiser-check-boottime-cmdline-params.patch b/debian/patches/bugfix/all/kpti/x86-kaiser-check-boottime-cmdline-params.patch
new file mode 100644
index 0000000..5bb5ac8
--- /dev/null
+++ b/debian/patches/bugfix/all/kpti/x86-kaiser-check-boottime-cmdline-params.patch
@@ -0,0 +1,118 @@
+From: Borislav Petkov <bp at suse.de>
+Date: Tue, 2 Jan 2018 14:19:48 +0100
+Subject: x86/kaiser: Check boottime cmdline params
+
+AMD (and possibly other vendors) are not affected by the leak
+KAISER is protecting against.
+
+Keep the "nopti" for traditional reasons and add pti=<on|off|auto>
+like upstream.
+
+Signed-off-by: Borislav Petkov <bp at suse.de>
+Signed-off-by: Greg Kroah-Hartman <gregkh at linuxfoundation.org>
+[bwh: Drop the exclusion of AMD, which does not exist upstream.]
+Signed-off-by: Ben Hutchings <ben at decadent.org.uk>
+---
+ Documentation/kernel-parameters.txt |  6 ++++
+ arch/x86/mm/kaiser.c                | 56 +++++++++++++++++++++++++------------
+ 2 files changed, 44 insertions(+), 18 deletions(-)
+
+--- a/Documentation/kernel-parameters.txt
++++ b/Documentation/kernel-parameters.txt
+@@ -2251,6 +2251,12 @@ bytes respectively. Such letter suffixes
+ 	pt.		[PARIDE]
+ 			See Documentation/blockdev/paride.txt.
+ 
++	pti=		[X86_64]
++			Control KAISER user/kernel address space isolation:
++			on - enable
++			off - disable
++			auto - default setting
++
+ 	pty.legacy_count=
+ 			[KNL] Number of legacy pty's. Overwrites compiled-in
+ 			default number.
+--- a/arch/x86/mm/kaiser.c
++++ b/arch/x86/mm/kaiser.c
+@@ -20,6 +20,7 @@ extern struct mm_struct init_mm;
+ #include <asm/pgtable.h>
+ #include <asm/pgalloc.h>
+ #include <asm/desc.h>
++#include <asm/cmdline.h>
+ 
+ int kaiser_enabled __read_mostly = 1;
+ EXPORT_SYMBOL(kaiser_enabled);	/* for inlined TLB flush functions */
+@@ -264,6 +265,40 @@ static void __init kaiser_init_all_pgds(
+ 	WARN_ON(__ret);							\
+ } while (0)
+ 
++void __init kaiser_check_boottime_disable(void)
++{
++	bool enable = true;
++	char arg[5];
++	int ret;
++
++	ret = cmdline_find_option(boot_command_line, "pti", arg, sizeof(arg));
++	if (ret > 0) {
++		if (!strncmp(arg, "on", 2))
++			goto enable;
++
++		if (!strncmp(arg, "off", 3))
++			goto disable;
++
++		if (!strncmp(arg, "auto", 4))
++			goto skip;
++	}
++
++	if (cmdline_find_option_bool(boot_command_line, "nopti"))
++		goto disable;
++
++skip:
++enable:
++	if (enable)
++		setup_force_cpu_cap(X86_FEATURE_KAISER);
++
++	return;
++
++disable:
++	pr_info("Kernel/User page tables isolation: disabled\n");
++	kaiser_enabled = 0;
++	setup_clear_cpu_cap(X86_FEATURE_KAISER);
++}
++
+ /*
+  * If anything in here fails, we will likely die on one of the
+  * first kernel->user transitions and init will die.  But, we
+@@ -275,12 +310,10 @@ void __init kaiser_init(void)
+ {
+ 	int cpu;
+ 
+-	if (!kaiser_enabled) {
+-		setup_clear_cpu_cap(X86_FEATURE_KAISER);
+-		return;
+-	}
++	kaiser_check_boottime_disable();
+ 
+-	setup_force_cpu_cap(X86_FEATURE_KAISER);
++	if (!kaiser_enabled)
++		return;
+ 
+ 	kaiser_init_all_pgds();
+ 
+@@ -413,16 +446,3 @@ void kaiser_flush_tlb_on_return_to_user(
+ 			X86_CR3_PCID_USER_FLUSH | KAISER_SHADOW_PGD_OFFSET);
+ }
+ EXPORT_SYMBOL(kaiser_flush_tlb_on_return_to_user);
+-
+-static int __init x86_nokaiser_setup(char *s)
+-{
+-	/* nopti doesn't accept parameters */
+-	if (s)
+-		return -EINVAL;
+-
+-	kaiser_enabled = 0;
+-	pr_info("Kernel/User page tables isolation: disabled\n");
+-
+-	return 0;
+-}
+-early_param("nopti", x86_nokaiser_setup);
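The decision logic in `kaiser_check_boottime_disable()` above reduces to a small precedence scheme: a recognized `pti=` value wins outright, an unrecognized or absent `pti=` falls back to the legacy `nopti` check, and `pti=auto` leaves KAISER enabled in this backport (which drops the upstream AMD exclusion). A hedged stand-alone restatement — `pti_enabled()` and its arguments are illustrative, not kernel API:

```c
#include <string.h>

/*
 * 'arg' is the value of pti= (NULL if the option is absent);
 * 'has_nopti' says whether "nopti" appeared on the command line.
 * Returns nonzero if KAISER should stay enabled.
 */
int pti_enabled(const char *arg, int has_nopti)
{
	if (arg) {
		if (!strncmp(arg, "on", 2))
			return 1;	/* pti=on: force-enable, nopti ignored */
		if (!strncmp(arg, "off", 3))
			return 0;	/* pti=off: force-disable */
		if (!strncmp(arg, "auto", 4))
			return 1;	/* auto: this backport defaults to on */
		/* unrecognized value: fall through to the nopti check */
	}
	return has_nopti ? 0 : 1;
}
```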
diff --git a/debian/patches/bugfix/all/kpti/x86-kaiser-move-feature-detection-up.patch b/debian/patches/bugfix/all/kpti/x86-kaiser-move-feature-detection-up.patch
new file mode 100644
index 0000000..346679b
--- /dev/null
+++ b/debian/patches/bugfix/all/kpti/x86-kaiser-move-feature-detection-up.patch
@@ -0,0 +1,77 @@
+From: Borislav Petkov <bp at suse.de>
+Date: Mon, 25 Dec 2017 13:57:16 +0100
+Subject: x86/kaiser: Move feature detection up
+
+... before the first use of kaiser_enabled as otherwise funky
+things happen:
+
+  about to get started...
+  (XEN) d0v0 Unhandled page fault fault/trap [#14, ec=0000]
+  (XEN) Pagetable walk from ffff88022a449090:
+  (XEN)  L4[0x110] = 0000000229e0e067 0000000000001e0e
+  (XEN)  L3[0x008] = 0000000000000000 ffffffffffffffff
+  (XEN) domain_crash_sync called from entry.S: fault at ffff82d08033fd08
+  entry.o#create_bounce_frame+0x135/0x14d
+  (XEN) Domain 0 (vcpu#0) crashed on cpu#0:
+  (XEN) ----[ Xen-4.9.1_02-3.21  x86_64  debug=n   Not tainted ]----
+  (XEN) CPU:    0
+  (XEN) RIP:    e033:[<ffffffff81007460>]
+  (XEN) RFLAGS: 0000000000000286   EM: 1   CONTEXT: pv guest (d0v0)
+
+Signed-off-by: Borislav Petkov <bp at suse.de>
+Signed-off-by: Greg Kroah-Hartman <gregkh at linuxfoundation.org>
+[bwh: Backported to 3.2: adjust context]
+Signed-off-by: Ben Hutchings <ben at decadent.org.uk>
+---
+ arch/x86/include/asm/kaiser.h | 2 ++
+ arch/x86/kernel/setup.c       | 7 +++++++
+ arch/x86/mm/kaiser.c          | 2 --
+ 3 files changed, 9 insertions(+), 2 deletions(-)
+
+--- a/arch/x86/include/asm/kaiser.h
++++ b/arch/x86/include/asm/kaiser.h
+@@ -96,8 +96,10 @@ DECLARE_PER_CPU(unsigned long, x86_cr3_p
+ extern char __per_cpu_user_mapped_start[], __per_cpu_user_mapped_end[];
+ 
+ extern int kaiser_enabled;
++extern void __init kaiser_check_boottime_disable(void);
+ #else
+ #define kaiser_enabled	0
++static inline void __init kaiser_check_boottime_disable(void) {}
+ #endif /* CONFIG_KAISER */
+ 
+ /*
+--- a/arch/x86/kernel/setup.c
++++ b/arch/x86/kernel/setup.c
+@@ -114,6 +114,7 @@
+ #include <asm/mce.h>
+ #include <asm/alternative.h>
+ #include <asm/prom.h>
++#include <asm/kaiser.h>
+ 
+ /*
+  * end_pfn only includes RAM, while max_pfn_mapped includes all e820 entries.
+@@ -921,6 +922,12 @@ void __init setup_arch(char **cmdline_p)
+ 	 */
+ 	init_hypervisor_platform();
+ 
++	/*
++	 * This needs to happen right after XENPV is set on xen and
++	 * kaiser_enabled is checked below in cleanup_highmap().
++	 */
++	kaiser_check_boottime_disable();
++
+ 	x86_init.resources.probe_roms();
+ 
+ 	/* after parse_early_param, so could debug it */
+--- a/arch/x86/mm/kaiser.c
++++ b/arch/x86/mm/kaiser.c
+@@ -315,8 +315,6 @@ void __init kaiser_init(void)
+ {
+ 	int cpu;
+ 
+-	kaiser_check_boottime_disable();
+-
+ 	if (!kaiser_enabled)
+ 		return;
+ 
diff --git a/debian/patches/bugfix/all/kpti/x86-kaiser-reenable-paravirt.patch b/debian/patches/bugfix/all/kpti/x86-kaiser-reenable-paravirt.patch
new file mode 100644
index 0000000..5c8c1fc
--- /dev/null
+++ b/debian/patches/bugfix/all/kpti/x86-kaiser-reenable-paravirt.patch
@@ -0,0 +1,25 @@
+From: Borislav Petkov <bp at suse.de>
+Date: Tue, 2 Jan 2018 14:19:49 +0100
+Subject: x86/kaiser: Reenable PARAVIRT
+
+Now that the required bits have been addressed, reenable
+PARAVIRT.
+
+Signed-off-by: Borislav Petkov <bp at suse.de>
+Signed-off-by: Greg Kroah-Hartman <gregkh at linuxfoundation.org>
+Signed-off-by: Ben Hutchings <ben at decadent.org.uk>
+---
+ security/Kconfig | 2 +-
+ 1 file changed, 1 insertion(+), 1 deletion(-)
+
+--- a/security/Kconfig
++++ b/security/Kconfig
+@@ -108,7 +108,7 @@ config SECURITY
+ config KAISER
+ 	bool "Remove the kernel mapping in user mode"
+ 	default y
+-	depends on X86_64 && SMP && !PARAVIRT
++	depends on X86_64 && SMP
+ 	help
+ 	  This enforces a strict kernel and user space isolation, in order
+ 	  to close hardware side channels on kernel address information.
diff --git a/debian/patches/bugfix/all/kpti/x86-kaiser-rename-and-simplify-x86_feature_kaiser-handling.patch b/debian/patches/bugfix/all/kpti/x86-kaiser-rename-and-simplify-x86_feature_kaiser-handling.patch
new file mode 100644
index 0000000..2cb1684
--- /dev/null
+++ b/debian/patches/bugfix/all/kpti/x86-kaiser-rename-and-simplify-x86_feature_kaiser-handling.patch
@@ -0,0 +1,94 @@
+From: Borislav Petkov <bp at suse.de>
+Date: Tue, 2 Jan 2018 14:19:48 +0100
+Subject: x86/kaiser: Rename and simplify X86_FEATURE_KAISER handling
+
+Concentrate it in arch/x86/mm/kaiser.c and use the upstream string "nopti".
+
+Signed-off-by: Borislav Petkov <bp at suse.de>
+Signed-off-by: Greg Kroah-Hartman <gregkh at linuxfoundation.org>
+Signed-off-by: Ben Hutchings <ben at decadent.org.uk>
+---
+ Documentation/kernel-parameters.txt |  2 +-
+ arch/x86/kernel/cpu/common.c        | 18 ------------------
+ arch/x86/mm/kaiser.c                | 20 +++++++++++++++++++-
+ 3 files changed, 20 insertions(+), 20 deletions(-)
+
+--- a/Documentation/kernel-parameters.txt
++++ b/Documentation/kernel-parameters.txt
+@@ -1803,7 +1803,7 @@ bytes respectively. Such letter suffixes
+ 
+ 	nojitter	[IA-64] Disables jitter checking for ITC timers.
+ 
+-	nokaiser	[X86-64] Disable KAISER isolation of kernel from user.
++	nopti		[X86-64] Disable KAISER isolation of kernel from user.
+ 
+ 	no-kvmclock	[X86,KVM] Disable paravirtualized KVM clock driver
+ 
+--- a/arch/x86/kernel/cpu/common.c
++++ b/arch/x86/kernel/cpu/common.c
+@@ -171,20 +171,6 @@ static int __init x86_pcid_setup(char *s
+ 	return 1;
+ }
+ __setup("nopcid", x86_pcid_setup);
+-
+-static int __init x86_nokaiser_setup(char *s)
+-{
+-	/* nokaiser doesn't accept parameters */
+-	if (s)
+-		return -EINVAL;
+-#ifdef CONFIG_KAISER
+-	kaiser_enabled = 0;
+-	setup_clear_cpu_cap(X86_FEATURE_KAISER);
+-	pr_info("nokaiser: KAISER feature disabled\n");
+-#endif
+-	return 0;
+-}
+-early_param("nokaiser", x86_nokaiser_setup);
+ #endif
+ 
+ static int __init x86_noinvpcid_setup(char *s)
+@@ -695,10 +681,6 @@ void __cpuinit get_cpu_cap(struct cpuinf
+ 		c->x86_power = cpuid_edx(0x80000007);
+ 
+ 	init_scattered_cpuid_features(c);
+-#ifdef CONFIG_KAISER
+-	if (kaiser_enabled)
+-		set_cpu_cap(c, X86_FEATURE_KAISER);
+-#endif
+ }
+ 
+ static void __cpuinit identify_cpu_without_cpuid(struct cpuinfo_x86 *c)
+--- a/arch/x86/mm/kaiser.c
++++ b/arch/x86/mm/kaiser.c
+@@ -275,8 +275,13 @@ void __init kaiser_init(void)
+ {
+ 	int cpu;
+ 
+-	if (!kaiser_enabled)
++	if (!kaiser_enabled) {
++		setup_clear_cpu_cap(X86_FEATURE_KAISER);
+ 		return;
++	}
++
++	setup_force_cpu_cap(X86_FEATURE_KAISER);
++
+ 	kaiser_init_all_pgds();
+ 
+ 	for_each_possible_cpu(cpu) {
+@@ -408,3 +413,16 @@ void kaiser_flush_tlb_on_return_to_user(
+ 			X86_CR3_PCID_USER_FLUSH | KAISER_SHADOW_PGD_OFFSET);
+ }
+ EXPORT_SYMBOL(kaiser_flush_tlb_on_return_to_user);
++
++static int __init x86_nokaiser_setup(char *s)
++{
++	/* nopti doesn't accept parameters */
++	if (s)
++		return -EINVAL;
++
++	kaiser_enabled = 0;
++	pr_info("Kernel/User page tables isolation: disabled\n");
++
++	return 0;
++}
++early_param("nopti", x86_nokaiser_setup);
diff --git a/debian/patches/bugfix/all/kpti/x86-mm-64-fix-reboot-interaction-with-cr4.pcide.patch b/debian/patches/bugfix/all/kpti/x86-mm-64-fix-reboot-interaction-with-cr4.pcide.patch
new file mode 100644
index 0000000..c81f90d
--- /dev/null
+++ b/debian/patches/bugfix/all/kpti/x86-mm-64-fix-reboot-interaction-with-cr4.pcide.patch
@@ -0,0 +1,41 @@
+From: Andy Lutomirski <luto at kernel.org>
+Date: Sun, 8 Oct 2017 21:53:05 -0700
+Subject: x86/mm/64: Fix reboot interaction with CR4.PCIDE
+
+commit 924c6b900cfdf376b07bccfd80e62b21914f8a5a upstream.
+
+Trying to reboot via real mode fails with PCID on: long mode cannot
+be exited while CR4.PCIDE is set.  (No, I have no idea why, but the
+SDM and actual CPUs are in agreement here.)  The result is a GPF and
+a hang instead of a reboot.
+
+I didn't catch this in testing because neither my computer nor my VM
+reboots this way.  I can trigger it with reboot=bios, though.
+
+Fixes: 660da7c9228f ("x86/mm: Enable CR4.PCIDE on supported systems")
+Reported-and-tested-by: Steven Rostedt (VMware) <rostedt at goodmis.org>
+Signed-off-by: Andy Lutomirski <luto at kernel.org>
+Signed-off-by: Thomas Gleixner <tglx at linutronix.de>
+Cc: Borislav Petkov <bp at alien8.de>
+Link: https://lkml.kernel.org/r/f1e7d965998018450a7a70c2823873686a8b21c0.1507524746.git.luto@kernel.org
+Signed-off-by: Hugh Dickins <hughd at google.com>
+Signed-off-by: Ben Hutchings <ben at decadent.org.uk>
+---
+ arch/x86/kernel/reboot.c | 6 ++++++
+ 1 file changed, 6 insertions(+)
+
+--- a/arch/x86/kernel/reboot.c
++++ b/arch/x86/kernel/reboot.c
+@@ -357,6 +357,12 @@ void machine_real_restart(unsigned int t
+ 	lowmem_gdt[1] =
+ 		GDT_ENTRY(0x009b, restart_pa, 0xffff);
+ 
++#ifdef CONFIG_X86_64
++	/* Exiting long mode will fail if CR4.PCIDE is set. */
++	if (static_cpu_has(X86_FEATURE_PCID))
++		clear_in_cr4(X86_CR4_PCIDE);
++#endif
++
+ 	/* Jump to the identity-mapped low memory code */
+ 	restart_lowmem(type);
+ }
diff --git a/debian/patches/bugfix/all/kpti/x86-mm-add-a-noinvpcid-boot-option-to-turn-off-invpcid.patch b/debian/patches/bugfix/all/kpti/x86-mm-add-a-noinvpcid-boot-option-to-turn-off-invpcid.patch
new file mode 100644
index 0000000..395e93f
--- /dev/null
+++ b/debian/patches/bugfix/all/kpti/x86-mm-add-a-noinvpcid-boot-option-to-turn-off-invpcid.patch
@@ -0,0 +1,72 @@
+From: Andy Lutomirski <luto at kernel.org>
+Date: Fri, 29 Jan 2016 11:42:58 -0800
+Subject: x86/mm: Add a 'noinvpcid' boot option to turn off INVPCID
+
+commit d12a72b844a49d4162f24cefdab30bed3f86730e upstream.
+
+This adds a chicken bit to turn off INVPCID in case something goes
+wrong.  It's an early_param() because we do TLB flushes before we
+parse __setup() parameters.
+
+Signed-off-by: Andy Lutomirski <luto at kernel.org>
+Reviewed-by: Borislav Petkov <bp at suse.de>
+Cc: Andrew Morton <akpm at linux-foundation.org>
+Cc: Andrey Ryabinin <aryabinin at virtuozzo.com>
+Cc: Andy Lutomirski <luto at amacapital.net>
+Cc: Borislav Petkov <bp at alien8.de>
+Cc: Brian Gerst <brgerst at gmail.com>
+Cc: Dave Hansen <dave.hansen at linux.intel.com>
+Cc: Denys Vlasenko <dvlasenk at redhat.com>
+Cc: H. Peter Anvin <hpa at zytor.com>
+Cc: Linus Torvalds <torvalds at linux-foundation.org>
+Cc: Luis R. Rodriguez <mcgrof at suse.com>
+Cc: Oleg Nesterov <oleg at redhat.com>
+Cc: Peter Zijlstra <peterz at infradead.org>
+Cc: Thomas Gleixner <tglx at linutronix.de>
+Cc: Toshi Kani <toshi.kani at hp.com>
+Cc: linux-mm at kvack.org
+Link: http://lkml.kernel.org/r/f586317ed1bc2b87aee652267e515b90051af385.1454096309.git.luto@kernel.org
+Signed-off-by: Ingo Molnar <mingo at kernel.org>
+Cc: Hugh Dickins <hughd at google.com>
+Signed-off-by: Ben Hutchings <ben at decadent.org.uk>
+---
+ Documentation/kernel-parameters.txt |  2 ++
+ arch/x86/kernel/cpu/common.c        | 16 ++++++++++++++++
+ 2 files changed, 18 insertions(+)
+
+--- a/Documentation/kernel-parameters.txt
++++ b/Documentation/kernel-parameters.txt
+@@ -1799,6 +1799,8 @@ bytes respectively. Such letter suffixes
+ 
+ 	nointroute	[IA-64]
+ 
++	noinvpcid	[X86] Disable the INVPCID cpu feature.
++
+ 	nojitter	[IA-64] Disables jitter checking for ITC timers.
+ 
+ 	no-kvmclock	[X86,KVM] Disable paravirtualized KVM clock driver
+--- a/arch/x86/kernel/cpu/common.c
++++ b/arch/x86/kernel/cpu/common.c
+@@ -155,6 +155,22 @@ static int __init x86_xsaveopt_setup(cha
+ }
+ __setup("noxsaveopt", x86_xsaveopt_setup);
+ 
++static int __init x86_noinvpcid_setup(char *s)
++{
++	/* noinvpcid doesn't accept parameters */
++	if (s)
++		return -EINVAL;
++
++	/* do not emit a message if the feature is not present */
++	if (!boot_cpu_has(X86_FEATURE_INVPCID))
++		return 0;
++
++	setup_clear_cpu_cap(X86_FEATURE_INVPCID);
++	pr_info("noinvpcid: INVPCID feature disabled\n");
++	return 0;
++}
++early_param("noinvpcid", x86_noinvpcid_setup);
++
+ #ifdef CONFIG_X86_32
+ static int cachesize_override __cpuinitdata = -1;
+ static int disable_x86_serial_nr __cpuinitdata = 1;
diff --git a/debian/patches/bugfix/all/kpti/x86-mm-add-invpcid-helpers.patch b/debian/patches/bugfix/all/kpti/x86-mm-add-invpcid-helpers.patch
new file mode 100644
index 0000000..0cf17ce
--- /dev/null
+++ b/debian/patches/bugfix/all/kpti/x86-mm-add-invpcid-helpers.patch
@@ -0,0 +1,91 @@
+From: Andy Lutomirski <luto at kernel.org>
+Date: Fri, 29 Jan 2016 11:42:57 -0800
+Subject: x86/mm: Add INVPCID helpers
+
+commit 060a402a1ddb551455ee410de2eadd3349f2801b upstream.
+
+This adds helpers for each of the four currently-specified INVPCID
+modes.
+
+Signed-off-by: Andy Lutomirski <luto at kernel.org>
+Reviewed-by: Borislav Petkov <bp at suse.de>
+Cc: Andrew Morton <akpm at linux-foundation.org>
+Cc: Andrey Ryabinin <aryabinin at virtuozzo.com>
+Cc: Andy Lutomirski <luto at amacapital.net>
+Cc: Borislav Petkov <bp at alien8.de>
+Cc: Brian Gerst <brgerst at gmail.com>
+Cc: Dave Hansen <dave.hansen at linux.intel.com>
+Cc: Denys Vlasenko <dvlasenk at redhat.com>
+Cc: H. Peter Anvin <hpa at zytor.com>
+Cc: Linus Torvalds <torvalds at linux-foundation.org>
+Cc: Luis R. Rodriguez <mcgrof at suse.com>
+Cc: Oleg Nesterov <oleg at redhat.com>
+Cc: Peter Zijlstra <peterz at infradead.org>
+Cc: Thomas Gleixner <tglx at linutronix.de>
+Cc: Toshi Kani <toshi.kani at hp.com>
+Cc: linux-mm at kvack.org
+Link: http://lkml.kernel.org/r/8a62b23ad686888cee01da134c91409e22064db9.1454096309.git.luto@kernel.org
+Signed-off-by: Ingo Molnar <mingo at kernel.org>
+Cc: Hugh Dickins <hughd at google.com>
+Signed-off-by: Ben Hutchings <ben at decadent.org.uk>
+---
+ arch/x86/include/asm/tlbflush.h | 48 +++++++++++++++++++++++++++++++++++++++++
+ 1 file changed, 48 insertions(+)
+
+--- a/arch/x86/include/asm/tlbflush.h
++++ b/arch/x86/include/asm/tlbflush.h
+@@ -7,6 +7,54 @@
+ #include <asm/processor.h>
+ #include <asm/system.h>
+ 
++static inline void __invpcid(unsigned long pcid, unsigned long addr,
++			     unsigned long type)
++{
++	u64 desc[2] = { pcid, addr };
++
++	/*
++	 * The memory clobber is because the whole point is to invalidate
++	 * stale TLB entries and, especially if we're flushing global
++	 * mappings, we don't want the compiler to reorder any subsequent
++	 * memory accesses before the TLB flush.
++	 *
++	 * The hex opcode is invpcid (%ecx), %eax in 32-bit mode and
++	 * invpcid (%rcx), %rax in long mode.
++	 */
++	asm volatile (".byte 0x66, 0x0f, 0x38, 0x82, 0x01"
++		      : : "m" (desc), "a" (type), "c" (desc) : "memory");
++}
++
++#define INVPCID_TYPE_INDIV_ADDR		0
++#define INVPCID_TYPE_SINGLE_CTXT	1
++#define INVPCID_TYPE_ALL_INCL_GLOBAL	2
++#define INVPCID_TYPE_ALL_NON_GLOBAL	3
++
++/* Flush all mappings for a given pcid and addr, not including globals. */
++static inline void invpcid_flush_one(unsigned long pcid,
++				     unsigned long addr)
++{
++	__invpcid(pcid, addr, INVPCID_TYPE_INDIV_ADDR);
++}
++
++/* Flush all mappings for a given PCID, not including globals. */
++static inline void invpcid_flush_single_context(unsigned long pcid)
++{
++	__invpcid(pcid, 0, INVPCID_TYPE_SINGLE_CTXT);
++}
++
++/* Flush all mappings, including globals, for all PCIDs. */
++static inline void invpcid_flush_all(void)
++{
++	__invpcid(0, 0, INVPCID_TYPE_ALL_INCL_GLOBAL);
++}
++
++/* Flush all mappings for all PCIDs except globals. */
++static inline void invpcid_flush_all_nonglobals(void)
++{
++	__invpcid(0, 0, INVPCID_TYPE_ALL_NON_GLOBAL);
++}
++
+ #ifdef CONFIG_PARAVIRT
+ #include <asm/paravirt.h>
+ #else
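The hard-coded opcode in `__invpcid()` above reads a 128-bit memory operand: the PCID in bits 0-11 of the first 64-bit word, and a linear address in the second. The instruction itself requires ring 0, so this sketch only builds the descriptor; the struct and helper names are illustrative, not the kernel's (which passes a plain `u64 desc[2]`):

```c
#include <stdint.h>

#define INVPCID_TYPE_INDIV_ADDR		0
#define INVPCID_TYPE_SINGLE_CTXT	1
#define INVPCID_TYPE_ALL_INCL_GLOBAL	2
#define INVPCID_TYPE_ALL_NON_GLOBAL	3

/* The 16-byte operand the invpcid instruction reads from memory. */
struct invpcid_desc {
	uint64_t pcid;	/* only bits 0-11 are architecturally meaningful */
	uint64_t addr;	/* used only by the individual-address type */
};

static struct invpcid_desc make_invpcid_desc(uint64_t pcid, uint64_t addr)
{
	/* Masking to 12 bits is a choice of this sketch, for clarity. */
	struct invpcid_desc d = { pcid & 0xfff, addr };
	return d;
}
```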
diff --git a/debian/patches/bugfix/all/kpti/x86-mm-add-the-nopcid-boot-option-to-turn-off-pcid.patch b/debian/patches/bugfix/all/kpti/x86-mm-add-the-nopcid-boot-option-to-turn-off-pcid.patch
new file mode 100644
index 0000000..ce3f800
--- /dev/null
+++ b/debian/patches/bugfix/all/kpti/x86-mm-add-the-nopcid-boot-option-to-turn-off-pcid.patch
@@ -0,0 +1,72 @@
+From: Andy Lutomirski <luto at kernel.org>
+Date: Thu, 29 Jun 2017 08:53:20 -0700
+Subject: x86/mm: Add the 'nopcid' boot option to turn off PCID
+
+commit 0790c9aad84901ca1bdc14746175549c8b5da215 upstream.
+
+The parameter is only present on x86_64 systems to save a few bytes,
+as PCID is always disabled on x86_32.
+
+Signed-off-by: Andy Lutomirski <luto at kernel.org>
+Reviewed-by: Nadav Amit <nadav.amit at gmail.com>
+Reviewed-by: Borislav Petkov <bp at suse.de>
+Reviewed-by: Thomas Gleixner <tglx at linutronix.de>
+Cc: Andrew Morton <akpm at linux-foundation.org>
+Cc: Arjan van de Ven <arjan at linux.intel.com>
+Cc: Borislav Petkov <bp at alien8.de>
+Cc: Dave Hansen <dave.hansen at intel.com>
+Cc: Linus Torvalds <torvalds at linux-foundation.org>
+Cc: Mel Gorman <mgorman at suse.de>
+Cc: Peter Zijlstra <peterz at infradead.org>
+Cc: Rik van Riel <riel at redhat.com>
+Cc: linux-mm at kvack.org
+Link: http://lkml.kernel.org/r/8bbb2e65bcd249a5f18bfb8128b4689f08ac2b60.1498751203.git.luto@kernel.org
+Signed-off-by: Ingo Molnar <mingo at kernel.org>
+[Hugh Dickins: Backported to 3.2:
+ - Documentation/admin-guide/kernel-parameters.txt (not in this tree)
+ - Documentation/kernel-parameters.txt (patched instead of that)]
+Signed-off-by: Hugh Dickins <hughd at google.com>
+Signed-off-by: Ben Hutchings <ben at decadent.org.uk>
+---
+ Documentation/kernel-parameters.txt |  2 ++
+ arch/x86/kernel/cpu/common.c        | 18 ++++++++++++++++++
+ 2 files changed, 20 insertions(+)
+
+--- a/Documentation/kernel-parameters.txt
++++ b/Documentation/kernel-parameters.txt
+@@ -1829,6 +1829,8 @@ bytes respectively. Such letter suffixes
+ 	nopat		[X86] Disable PAT (page attribute table extension of
+ 			pagetables) support.
+ 
++	nopcid		[X86-64] Disable the PCID cpu feature.
++
+ 	norandmaps	Don't use address space randomization.  Equivalent to
+ 			echo 0 > /proc/sys/kernel/randomize_va_space
+ 
+--- a/arch/x86/kernel/cpu/common.c
++++ b/arch/x86/kernel/cpu/common.c
+@@ -155,6 +155,24 @@ static int __init x86_xsaveopt_setup(cha
+ }
+ __setup("noxsaveopt", x86_xsaveopt_setup);
+ 
++#ifdef CONFIG_X86_64
++static int __init x86_pcid_setup(char *s)
++{
++	/* require an exact match without trailing characters */
++	if (strlen(s))
++		return 0;
++
++	/* do not emit a message if the feature is not present */
++	if (!boot_cpu_has(X86_FEATURE_PCID))
++		return 1;
++
++	setup_clear_cpu_cap(X86_FEATURE_PCID);
++	pr_info("nopcid: PCID feature disabled\n");
++	return 1;
++}
++__setup("nopcid", x86_pcid_setup);
++#endif
++
+ static int __init x86_noinvpcid_setup(char *s)
+ {
+ 	/* noinvpcid doesn't accept parameters */
diff --git a/debian/patches/bugfix/all/kpti/x86-mm-build-arch-x86-mm-tlb.c-even-on-smp.patch b/debian/patches/bugfix/all/kpti/x86-mm-build-arch-x86-mm-tlb.c-even-on-smp.patch
new file mode 100644
index 0000000..21fd320
--- /dev/null
+++ b/debian/patches/bugfix/all/kpti/x86-mm-build-arch-x86-mm-tlb.c-even-on-smp.patch
@@ -0,0 +1,63 @@
+From: Andy Lutomirski <luto at kernel.org>
+Date: Tue, 26 Apr 2016 09:39:07 -0700
+Subject: x86/mm: Build arch/x86/mm/tlb.c even on !SMP
+
+commit e1074888c326038340a1ada9129d679e661f2ea6 upstream.
+
+Currently all of the functions that live in tlb.c are inlined on
+!SMP builds.  One can debate whether this is a good idea (in many
+respects the code in tlb.c is better than the inlined UP code).
+
+Regardless, I want to add code that needs to be built on UP and SMP
+kernels and relates to tlb flushing, so arrange for tlb.c to be
+compiled unconditionally.
+
+Signed-off-by: Andy Lutomirski <luto at kernel.org>
+Reviewed-by: Borislav Petkov <bp at suse.de>
+Cc: Borislav Petkov <bp at alien8.de>
+Cc: Linus Torvalds <torvalds at linux-foundation.org>
+Cc: Peter Zijlstra <peterz at infradead.org>
+Cc: Thomas Gleixner <tglx at linutronix.de>
+Link: http://lkml.kernel.org/r/f0d778f0d828fc46e5d1946bca80f0aaf9abf032.1461688545.git.luto@kernel.org
+Signed-off-by: Ingo Molnar <mingo at kernel.org>
+Signed-off-by: Hugh Dickins <hughd at google.com>
+Signed-off-by: Ben Hutchings <ben at decadent.org.uk>
+---
+ arch/x86/mm/Makefile | 3 +--
+ arch/x86/mm/tlb.c    | 4 ++++
+ 2 files changed, 5 insertions(+), 2 deletions(-)
+
+--- a/arch/x86/mm/Makefile
++++ b/arch/x86/mm/Makefile
+@@ -1,5 +1,5 @@
+ obj-y	:=  init.o init_$(BITS).o fault.o ioremap.o extable.o pageattr.o mmap.o \
+-	    pat.o pgtable.o physaddr.o gup.o setup_nx.o
++	    pat.o pgtable.o physaddr.o gup.o setup_nx.o tlb.o
+ 
+ # Make sure __phys_addr has no stackprotector
+ nostackp := $(call cc-option, -fno-stack-protector)
+@@ -7,7 +7,6 @@ CFLAGS_physaddr.o		:= $(nostackp)
+ CFLAGS_setup_nx.o		:= $(nostackp)
+ 
+ obj-$(CONFIG_X86_PAT)		+= pat_rbtree.o
+-obj-$(CONFIG_SMP)		+= tlb.o
+ 
+ obj-$(CONFIG_X86_32)		+= pgtable_32.o iomap_32.o
+ 
+--- a/arch/x86/mm/tlb.c
++++ b/arch/x86/mm/tlb.c
+@@ -38,6 +38,8 @@ DEFINE_PER_CPU_SHARED_ALIGNED(struct tlb
+  *	fixed, at the cost of triggering multiple IPIs in some cases.
+  */
+ 
++#ifdef CONFIG_SMP
++
+ union smp_flush_state {
+ 	struct {
+ 		struct mm_struct *flush_mm;
+@@ -350,3 +352,5 @@ void flush_tlb_all(void)
+ {
+ 	on_each_cpu(do_flush_tlb_all, NULL, 1);
+ }
++
++#endif /* CONFIG_SMP */
diff --git a/debian/patches/bugfix/all/kpti/x86-mm-disable-pcid-on-32-bit-kernels.patch b/debian/patches/bugfix/all/kpti/x86-mm-disable-pcid-on-32-bit-kernels.patch
new file mode 100644
index 0000000..dcd4a62
--- /dev/null
+++ b/debian/patches/bugfix/all/kpti/x86-mm-disable-pcid-on-32-bit-kernels.patch
@@ -0,0 +1,63 @@
+From: Andy Lutomirski <luto at kernel.org>
+Date: Thu, 29 Jun 2017 08:53:19 -0700
+Subject: x86/mm: Disable PCID on 32-bit kernels
+
+commit cba4671af7550e008f7a7835f06df0763825bf3e upstream.
+
+32-bit kernels on new hardware will see PCID in CPUID, but PCID can
+only be used in 64-bit mode.  Rather than making all PCID code
+conditional, just disable the feature on 32-bit builds.
+
+Signed-off-by: Andy Lutomirski <luto at kernel.org>
+Reviewed-by: Nadav Amit <nadav.amit at gmail.com>
+Reviewed-by: Borislav Petkov <bp at suse.de>
+Reviewed-by: Thomas Gleixner <tglx at linutronix.de>
+Cc: Andrew Morton <akpm at linux-foundation.org>
+Cc: Arjan van de Ven <arjan at linux.intel.com>
+Cc: Borislav Petkov <bp at alien8.de>
+Cc: Dave Hansen <dave.hansen at intel.com>
+Cc: Linus Torvalds <torvalds at linux-foundation.org>
+Cc: Mel Gorman <mgorman at suse.de>
+Cc: Peter Zijlstra <peterz at infradead.org>
+Cc: Rik van Riel <riel at redhat.com>
+Cc: linux-mm at kvack.org
+Link: http://lkml.kernel.org/r/2e391769192a4d31b808410c383c6bf0734bc6ea.1498751203.git.luto@kernel.org
+Signed-off-by: Ingo Molnar <mingo at kernel.org>
+Signed-off-by: Hugh Dickins <hughd at google.com>
+Signed-off-by: Ben Hutchings <ben at decadent.org.uk>
+---
+ arch/x86/kernel/cpu/bugs.c   | 8 ++++++++
+ arch/x86/kernel/cpu/common.c | 5 +++++
+ 2 files changed, 13 insertions(+)
+
+--- a/arch/x86/kernel/cpu/bugs.c
++++ b/arch/x86/kernel/cpu/bugs.c
+@@ -159,6 +159,14 @@ static void __init check_config(void)
+ 
+ void __init check_bugs(void)
+ {
++#ifdef CONFIG_X86_32
++	/*
++	 * Regardless of whether PCID is enumerated, the SDM says
++	 * that it can't be enabled in 32-bit mode.
++	 */
++	setup_clear_cpu_cap(X86_FEATURE_PCID);
++#endif
++
+ 	identify_boot_cpu();
+ #ifndef CONFIG_SMP
+ 	printk(KERN_INFO "CPU: ");
+--- a/arch/x86/kernel/cpu/common.c
++++ b/arch/x86/kernel/cpu/common.c
+@@ -934,6 +934,11 @@ void __cpuinit identify_secondary_cpu(st
+ 	BUG_ON(c == &boot_cpu_data);
+ 	identify_cpu(c);
+ #ifdef CONFIG_X86_32
++	/*
++	 * Regardless of whether PCID is enumerated, the SDM says
++	 * that it can't be enabled in 32-bit mode.
++	 */
++	clear_cpu_cap(c, X86_FEATURE_PCID);
+ 	enable_sep_cpu();
+ #endif
+ 	mtrr_ap_init();
diff --git a/debian/patches/bugfix/all/kpti/x86-mm-enable-cr4.pcide-on-supported-systems.patch b/debian/patches/bugfix/all/kpti/x86-mm-enable-cr4.pcide-on-supported-systems.patch
new file mode 100644
index 0000000..40d31f9
--- /dev/null
+++ b/debian/patches/bugfix/all/kpti/x86-mm-enable-cr4.pcide-on-supported-systems.patch
@@ -0,0 +1,135 @@
+From: Andy Lutomirski <luto at kernel.org>
+Date: Thu, 29 Jun 2017 08:53:21 -0700
+Subject: x86/mm: Enable CR4.PCIDE on supported systems
+
+commit 660da7c9228f685b2ebe664f9fd69aaddcc420b5 upstream.
+
+We can use PCID if the CPU has PCID and PGE and we're not on Xen.
+
+By itself, this has no effect. A followup patch will start using PCID.
+
+Signed-off-by: Andy Lutomirski <luto at kernel.org>
+Reviewed-by: Nadav Amit <nadav.amit at gmail.com>
+Reviewed-by: Boris Ostrovsky <boris.ostrovsky at oracle.com>
+Reviewed-by: Thomas Gleixner <tglx at linutronix.de>
+Cc: Andrew Morton <akpm at linux-foundation.org>
+Cc: Arjan van de Ven <arjan at linux.intel.com>
+Cc: Borislav Petkov <bp at alien8.de>
+Cc: Dave Hansen <dave.hansen at intel.com>
+Cc: Juergen Gross <jgross at suse.com>
+Cc: Linus Torvalds <torvalds at linux-foundation.org>
+Cc: Mel Gorman <mgorman at suse.de>
+Cc: Peter Zijlstra <peterz at infradead.org>
+Cc: Rik van Riel <riel at redhat.com>
+Cc: linux-mm at kvack.org
+Link: http://lkml.kernel.org/r/6327ecd907b32f79d5aa0d466f04503bbec5df88.1498751203.git.luto@kernel.org
+Signed-off-by: Ingo Molnar <mingo at kernel.org>
+[Hugh Dickins: Backported to 3.2:
+ - arch/x86/xen/enlighten_pv.c (not in this tree)
+ - arch/x86/xen/enlighten.c (patched instead of that)]
+Signed-off-by: Hugh Dickins <hughd at google.com>
+[Borislav Petkov: Fix bad backport to disable PCID on Xen]
+Signed-off-by: Ben Hutchings <ben at decadent.org.uk>
+---
+ arch/x86/include/asm/processor-flags.h |  1 +
+ arch/x86/include/asm/tlbflush.h        |  8 ++++++++
+ arch/x86/kernel/cpu/common.c           | 31 ++++++++++++++++++++++++++-----
+ arch/x86/xen/enlighten.c               |  6 ++++++
+ 4 files changed, 41 insertions(+), 5 deletions(-)
+
+--- a/arch/x86/include/asm/processor-flags.h
++++ b/arch/x86/include/asm/processor-flags.h
+@@ -60,6 +60,7 @@
+ #define X86_CR4_OSXMMEXCPT 0x00000400 /* enable unmasked SSE exceptions */
+ #define X86_CR4_VMXE	0x00002000 /* enable VMX virtualization */
+ #define X86_CR4_RDWRGSFS 0x00010000 /* enable RDWRGSFS support */
++#define X86_CR4_PCIDE	0x00020000 /* enable PCID support */
+ #define X86_CR4_OSXSAVE 0x00040000 /* enable xsave and xrestore */
+ #define X86_CR4_SMEP	0x00100000 /* enable SMEP support */
+ 
+--- a/arch/x86/include/asm/tlbflush.h
++++ b/arch/x86/include/asm/tlbflush.h
+@@ -117,6 +117,14 @@ static inline void __flush_tlb_all(void)
+ 		__flush_tlb_global();
+ 	else
+ 		__flush_tlb();
++
++	/*
++	 * Note: if we somehow had PCID but not PGE, then this wouldn't work --
++	 * we'd end up flushing kernel translations for the current ASID but
++	 * we might fail to flush kernel translations for other cached ASIDs.
++	 *
++	 * To avoid this issue, we force PCID off if PGE is off.
++	 */
+ }
+ 
+ static inline void __flush_tlb_one(unsigned long addr)
+--- a/arch/x86/kernel/cpu/common.c
++++ b/arch/x86/kernel/cpu/common.c
+@@ -310,6 +310,29 @@ static __cpuinit void setup_smep(struct
+ 	}
+ }
+ 
++static void setup_pcid(struct cpuinfo_x86 *c)
++{
++	if (cpu_has(c, X86_FEATURE_PCID)) {
++		if (cpu_has(c, X86_FEATURE_PGE) && IS_ENABLED(CONFIG_X86_64)) {
++			/*
++			 * Regardless of whether PCID is enumerated, the
++			 * SDM says that it can't be enabled in 32-bit mode.
++			 */
++			set_in_cr4(X86_CR4_PCIDE);
++		} else {
++			/*
++			 * flush_tlb_all(), as currently implemented, won't
++			 * work if PCID is on but PGE is not.  Since that
++			 * combination doesn't exist on real hardware, there's
++			 * no reason to try to fully support it, but it's
++			 * polite to avoid corrupting data if we're on
++			 * an improperly configured VM.
++			 */
++			clear_cpu_cap(c, X86_FEATURE_PCID);
++		}
++	}
++}
++
+ /*
+  * Some CPU features depend on higher CPUID levels, which may not always
+  * be available due to CPUID level capping or broken virtualization
+@@ -867,6 +890,9 @@ static void __cpuinit identify_cpu(struc
+ 	/* Disable the PN if appropriate */
+ 	squash_the_stupid_serial_number(c);
+ 
++	/* Set up PCID */
++	setup_pcid(c);
++
+ 	/*
+ 	 * The vendor-specific functions might have changed features.
+ 	 * Now we do "generic changes."
+@@ -952,11 +978,6 @@ void __cpuinit identify_secondary_cpu(st
+ 	BUG_ON(c == &boot_cpu_data);
+ 	identify_cpu(c);
+ #ifdef CONFIG_X86_32
+-	/*
+-	 * Regardless of whether PCID is enumerated, the SDM says
+-	 * that it can't be enabled in 32-bit mode.
+-	 */
+-	clear_cpu_cap(c, X86_FEATURE_PCID);
+ 	enable_sep_cpu();
+ #endif
+ 	mtrr_ap_init();
+--- a/arch/x86/xen/enlighten.c
++++ b/arch/x86/xen/enlighten.c
+@@ -270,6 +270,12 @@ static void __init xen_init_cpuid_mask(v
+ 		  (1 << X86_FEATURE_MTRR) |  /* disable MTRR */
+ 		  (1 << X86_FEATURE_ACC));   /* thermal monitoring */
+ 
++	/*
++	 * Xen PV would need some work to support PCID: CR3 handling as well
++	 * as xen_flush_tlb_others() would need updating.
++	 */
++	cpuid_leaf1_ecx_mask &= ~(1 << (X86_FEATURE_PCID % 32));  /* disable PCID */
++
+ 	if (!xen_initial_domain())
+ 		cpuid_leaf1_edx_mask &=
+ 			~((1 << X86_FEATURE_APIC) |  /* disable local APIC */
diff --git a/debian/patches/bugfix/all/kpti/x86-mm-fix-invpcid-asm-constraint.patch b/debian/patches/bugfix/all/kpti/x86-mm-fix-invpcid-asm-constraint.patch
new file mode 100644
index 0000000..34c0abc
--- /dev/null
+++ b/debian/patches/bugfix/all/kpti/x86-mm-fix-invpcid-asm-constraint.patch
@@ -0,0 +1,66 @@
+From: Borislav Petkov <bp at suse.de>
+Date: Wed, 10 Feb 2016 15:51:16 +0100
+Subject: x86/mm: Fix INVPCID asm constraint
+MIME-Version: 1.0
+Content-Type: text/plain; charset=UTF-8
+Content-Transfer-Encoding: 8bit
+
+commit e2c7698cd61f11d4077fdb28148b2d31b82ac848 upstream.
+
+So we want to specify the dependency on both @pcid and @addr so that the
+compiler doesn't reorder accesses to them *before* the TLB flush. But
+for that to work, we need to express this properly in the inline asm and
+deref the whole desc array, not the pointer to it. See clwb() for an
+example.
+
+This fixes the build error on 32-bit:
+
+  arch/x86/include/asm/tlbflush.h: In function ‘__invpcid’:
+  arch/x86/include/asm/tlbflush.h:26:18: error: memory input 0 is not directly addressable
+
+which gcc4.7 caught but 5.x didn't. Which is strange. :-\
+
+Signed-off-by: Borislav Petkov <bp at suse.de>
+Cc: Andrew Morton <akpm at linux-foundation.org>
+Cc: Andrey Ryabinin <aryabinin at virtuozzo.com>
+Cc: Andy Lutomirski <luto at amacapital.net>
+Cc: Borislav Petkov <bp at alien8.de>
+Cc: Brian Gerst <brgerst at gmail.com>
+Cc: Dave Hansen <dave.hansen at linux.intel.com>
+Cc: Denys Vlasenko <dvlasenk at redhat.com>
+Cc: H. Peter Anvin <hpa at zytor.com>
+Cc: Linus Torvalds <torvalds at linux-foundation.org>
+Cc: Luis R. Rodriguez <mcgrof at suse.com>
+Cc: Michael Matz <matz at suse.de>
+Cc: Oleg Nesterov <oleg at redhat.com>
+Cc: Peter Zijlstra <peterz at infradead.org>
+Cc: Thomas Gleixner <tglx at linutronix.de>
+Cc: Toshi Kani <toshi.kani at hp.com>
+Cc: linux-mm at kvack.org
+Signed-off-by: Ingo Molnar <mingo at kernel.org>
+Cc: Hugh Dickins <hughd at google.com>
+Signed-off-by: Ben Hutchings <ben at decadent.org.uk>
+---
+ arch/x86/include/asm/tlbflush.h | 4 ++--
+ 1 file changed, 2 insertions(+), 2 deletions(-)
+
+--- a/arch/x86/include/asm/tlbflush.h
++++ b/arch/x86/include/asm/tlbflush.h
+@@ -10,7 +10,7 @@
+ static inline void __invpcid(unsigned long pcid, unsigned long addr,
+ 			     unsigned long type)
+ {
+-	u64 desc[2] = { pcid, addr };
++	struct { u64 d[2]; } desc = { { pcid, addr } };
+ 
+ 	/*
+ 	 * The memory clobber is because the whole point is to invalidate
+@@ -22,7 +22,7 @@ static inline void __invpcid(unsigned lo
+ 	 * invpcid (%rcx), %rax in long mode.
+ 	 */
+ 	asm volatile (".byte 0x66, 0x0f, 0x38, 0x82, 0x01"
+-		      : : "m" (desc), "a" (type), "c" (desc) : "memory");
++		      : : "m" (desc), "a" (type), "c" (&desc) : "memory");
+ }
+ 
+ #define INVPCID_TYPE_INDIV_ADDR		0
diff --git a/debian/patches/bugfix/all/kpti/x86-mm-if-invpcid-is-available-use-it-to-flush-global-mappings.patch b/debian/patches/bugfix/all/kpti/x86-mm-if-invpcid-is-available-use-it-to-flush-global-mappings.patch
new file mode 100644
index 0000000..b43213b
--- /dev/null
+++ b/debian/patches/bugfix/all/kpti/x86-mm-if-invpcid-is-available-use-it-to-flush-global-mappings.patch
@@ -0,0 +1,54 @@
+From: Andy Lutomirski <luto at kernel.org>
+Date: Fri, 29 Jan 2016 11:42:59 -0800
+Subject: x86/mm: If INVPCID is available, use it to flush global mappings
+
+commit d8bced79af1db6734f66b42064cc773cada2ce99 upstream.
+
+On my Skylake laptop, INVPCID function 2 (flush absolutely
+everything) takes about 376ns, whereas saving flags, twiddling
+CR4.PGE to flush global mappings, and restoring flags takes about
+539ns.
+
+Signed-off-by: Andy Lutomirski <luto at kernel.org>
+Reviewed-by: Borislav Petkov <bp at suse.de>
+Cc: Andrew Morton <akpm at linux-foundation.org>
+Cc: Andrey Ryabinin <aryabinin at virtuozzo.com>
+Cc: Andy Lutomirski <luto at amacapital.net>
+Cc: Borislav Petkov <bp at alien8.de>
+Cc: Brian Gerst <brgerst at gmail.com>
+Cc: Dave Hansen <dave.hansen at linux.intel.com>
+Cc: Denys Vlasenko <dvlasenk at redhat.com>
+Cc: H. Peter Anvin <hpa at zytor.com>
+Cc: Linus Torvalds <torvalds at linux-foundation.org>
+Cc: Luis R. Rodriguez <mcgrof at suse.com>
+Cc: Oleg Nesterov <oleg at redhat.com>
+Cc: Peter Zijlstra <peterz at infradead.org>
+Cc: Thomas Gleixner <tglx at linutronix.de>
+Cc: Toshi Kani <toshi.kani at hp.com>
+Cc: linux-mm at kvack.org
+Link: http://lkml.kernel.org/r/ed0ef62581c0ea9c99b9bf6df726015e96d44743.1454096309.git.luto@kernel.org
+Signed-off-by: Ingo Molnar <mingo at kernel.org>
+Cc: Hugh Dickins <hughd at google.com>
+Signed-off-by: Ben Hutchings <ben at decadent.org.uk>
+---
+ arch/x86/include/asm/tlbflush.h | 9 +++++++++
+ 1 file changed, 9 insertions(+)
+
+--- a/arch/x86/include/asm/tlbflush.h
++++ b/arch/x86/include/asm/tlbflush.h
+@@ -80,6 +80,15 @@ static inline void __native_flush_tlb_gl
+ 	unsigned long flags;
+ 	unsigned long cr4;
+ 
++	if (static_cpu_has(X86_FEATURE_INVPCID)) {
++		/*
++		 * Using INVPCID is considerably faster than a pair of writes
++		 * to CR4 sandwiched inside an IRQ flag save/restore.
++		 */
++		invpcid_flush_all();
++		return;
++	}
++
+ 	/*
+ 	 * Read-modify-write to CR4 - protect it from preemption and
+ 	 * from interrupts. (Use the raw variant because this code can
diff --git a/debian/patches/bugfix/all/kpti/x86-mm-kaiser-re-enable-vsyscalls.patch b/debian/patches/bugfix/all/kpti/x86-mm-kaiser-re-enable-vsyscalls.patch
new file mode 100644
index 0000000..45a09d6
--- /dev/null
+++ b/debian/patches/bugfix/all/kpti/x86-mm-kaiser-re-enable-vsyscalls.patch
@@ -0,0 +1,132 @@
+From: Andrea Arcangeli <aarcange at redhat.com>
+Date: Tue, 5 Dec 2017 21:15:07 +0100
+Subject: x86/mm/kaiser: re-enable vsyscalls
+
+To avoid breaking the kernel ABI.
+
+Signed-off-by: Andrea Arcangeli <aarcange at redhat.com>
+[Hugh Dickins: Backported to 3.2:
+ - Leave out the PVCLOCK_FIXMAP user mapping, which does not apply to
+   this tree
+ - For safety added vsyscall_pgprot, and a BUG_ON if _PAGE_USER
+   outside of FIXMAP.]
+Signed-off-by: Hugh Dickins <hughd at google.com>
+Signed-off-by: Ben Hutchings <ben at decadent.org.uk>
+---
+ arch/x86/include/asm/vsyscall.h |  1 +
+ arch/x86/kernel/hpet.c          |  3 +++
+ arch/x86/kernel/vsyscall_64.c   |  7 ++++---
+ arch/x86/mm/kaiser.c            | 14 +++++++++++---
+ 4 files changed, 19 insertions(+), 6 deletions(-)
+
+--- a/arch/x86/include/asm/vsyscall.h
++++ b/arch/x86/include/asm/vsyscall.h
+@@ -22,6 +22,7 @@ enum vsyscall_num {
+ /* kernel space (writeable) */
+ extern int vgetcpu_mode;
+ extern struct timezone sys_tz;
++extern unsigned long vsyscall_pgprot;
+ 
+ #include <asm/vvar.h>
+ 
+--- a/arch/x86/kernel/hpet.c
++++ b/arch/x86/kernel/hpet.c
+@@ -12,6 +12,7 @@
+ #include <linux/cpu.h>
+ #include <linux/pm.h>
+ #include <linux/io.h>
++#include <linux/kaiser.h>
+ 
+ #include <asm/fixmap.h>
+ #include <asm/hpet.h>
+@@ -74,6 +75,8 @@ static inline void hpet_set_mapping(void
+ 	hpet_virt_address = ioremap_nocache(hpet_address, HPET_MMAP_SIZE);
+ #ifdef CONFIG_X86_64
+ 	__set_fixmap(VSYSCALL_HPET, hpet_address, PAGE_KERNEL_VVAR_NOCACHE);
++	kaiser_add_mapping(__fix_to_virt(VSYSCALL_HPET), PAGE_SIZE,
++			   __PAGE_KERNEL_VVAR_NOCACHE);
+ #endif
+ }
+ 
+--- a/arch/x86/kernel/vsyscall_64.c
++++ b/arch/x86/kernel/vsyscall_64.c
+@@ -58,6 +58,7 @@ DEFINE_VVAR(struct vsyscall_gtod_data, v
+ };
+ 
+ static enum { EMULATE, NATIVE, NONE } vsyscall_mode = NATIVE;
++unsigned long vsyscall_pgprot = __PAGE_KERNEL_VSYSCALL;
+ 
+ static int __init vsyscall_setup(char *str)
+ {
+@@ -274,10 +275,10 @@ void __init map_vsyscall(void)
+ 	extern char __vvar_page;
+ 	unsigned long physaddr_vvar_page = __pa_symbol(&__vvar_page);
+ 
++	if (vsyscall_mode != NATIVE)
++		vsyscall_pgprot = __PAGE_KERNEL_VVAR;
+ 	__set_fixmap(VSYSCALL_FIRST_PAGE, physaddr_vsyscall,
+-		     vsyscall_mode == NATIVE
+-		     ? PAGE_KERNEL_VSYSCALL
+-		     : PAGE_KERNEL_VVAR);
++		     __pgprot(vsyscall_pgprot));
+ 	BUILD_BUG_ON((unsigned long)__fix_to_virt(VSYSCALL_FIRST_PAGE) !=
+ 		     (unsigned long)VSYSCALL_START);
+ 
+--- a/arch/x86/mm/kaiser.c
++++ b/arch/x86/mm/kaiser.c
+@@ -16,6 +16,7 @@ extern struct mm_struct init_mm;
+ 
+ #include <asm/kaiser.h>
+ #include <asm/tlbflush.h>	/* to verify its kaiser declarations */
++#include <asm/vsyscall.h>
+ #include <asm/pgtable.h>
+ #include <asm/pgalloc.h>
+ #include <asm/desc.h>
+@@ -133,7 +134,7 @@ static pte_t *kaiser_pagetable_walk(unsi
+ 			return NULL;
+ 		spin_lock(&shadow_table_allocation_lock);
+ 		if (pud_none(*pud)) {
+-			set_pud(pud, __pud(_KERNPG_TABLE | __pa(new_pmd_page)));
++			set_pud(pud, __pud(_PAGE_TABLE | __pa(new_pmd_page)));
+ 			__inc_zone_page_state(virt_to_page((void *)
+ 						new_pmd_page), NR_KAISERTABLE);
+ 		} else
+@@ -153,7 +154,7 @@ static pte_t *kaiser_pagetable_walk(unsi
+ 			return NULL;
+ 		spin_lock(&shadow_table_allocation_lock);
+ 		if (pmd_none(*pmd)) {
+-			set_pmd(pmd, __pmd(_KERNPG_TABLE | __pa(new_pte_page)));
++			set_pmd(pmd, __pmd(_PAGE_TABLE | __pa(new_pte_page)));
+ 			__inc_zone_page_state(virt_to_page((void *)
+ 						new_pte_page), NR_KAISERTABLE);
+ 		} else
+@@ -174,6 +175,9 @@ int kaiser_add_user_map(const void *__st
+ 	unsigned long end_addr = PAGE_ALIGN(start_addr + size);
+ 	unsigned long target_address;
+ 
++	if (flags & _PAGE_USER)
++		BUG_ON(address < FIXADDR_START || end_addr >= FIXADDR_TOP);
++
+ 	for (; address < end_addr; address += PAGE_SIZE) {
+ 		target_address = get_pa_from_mapping(address);
+ 		if (target_address == -1) {
+@@ -227,7 +231,7 @@ static void __init kaiser_init_all_pgds(
+ 			break;
+ 		}
+ 		inc_zone_page_state(virt_to_page(pud), NR_KAISERTABLE);
+-		new_pgd = __pgd(_KERNPG_TABLE |__pa(pud));
++		new_pgd = __pgd(_PAGE_TABLE |__pa(pud));
+ 		/*
+ 		 * Make sure not to stomp on some other pgd entry.
+ 		 */
+@@ -285,6 +289,10 @@ void __init kaiser_init(void)
+ 	kaiser_add_user_map_early((void *)idt_descr.address,
+ 				  sizeof(gate_desc) * NR_VECTORS,
+ 				  __PAGE_KERNEL_RO);
++	kaiser_add_user_map_early((void *)VVAR_ADDRESS, PAGE_SIZE,
++				  __PAGE_KERNEL_VVAR);
++	kaiser_add_user_map_early((void *)VSYSCALL_START, PAGE_SIZE,
++				  vsyscall_pgprot);
+ 	kaiser_add_user_map_early(&x86_cr3_pcid_noflush,
+ 				  sizeof(x86_cr3_pcid_noflush),
+ 				  __PAGE_KERNEL);
diff --git a/debian/patches/bugfix/all/kpti/x86-mm-remove-the-up-asm-tlbflush.h-code-always-use-the-formerly-smp-code.patch b/debian/patches/bugfix/all/kpti/x86-mm-remove-the-up-asm-tlbflush.h-code-always-use-the-formerly-smp-code.patch
new file mode 100644
index 0000000..5effe33
--- /dev/null
+++ b/debian/patches/bugfix/all/kpti/x86-mm-remove-the-up-asm-tlbflush.h-code-always-use-the-formerly-smp-code.patch
@@ -0,0 +1,232 @@
+From: Andy Lutomirski <luto at kernel.org>
+Date: Sun, 28 May 2017 10:00:14 -0700
+Subject: x86/mm: Remove the UP asm/tlbflush.h code, always use the (formerly) SMP code
+
+commit ce4a4e565f5264909a18c733b864c3f74467f69e upstream.
+
+The UP asm/tlbflush.h generates somewhat nicer code than the SMP version.
+Aside from that, it's fallen quite a bit behind the SMP code:
+
+ - flush_tlb_mm_range() didn't flush individual pages if the range
+   was small.
+
+ - The lazy TLB code was much weaker.  This usually wouldn't matter,
+   but, if a kernel thread flushed its lazy "active_mm" more than
+   once (due to reclaim or similar), it wouldn't be unlazied and
+   would instead pointlessly flush repeatedly.
+
+ - Tracepoints were missing.
+
+Aside from that, simply having the UP code around was a maintenance
+burden, since it means that any change to the TLB flush code had to
+make sure not to break it.
+
+Simplify everything by deleting the UP code.
+
+Signed-off-by: Andy Lutomirski <luto at kernel.org>
+Cc: Andrew Morton <akpm at linux-foundation.org>
+Cc: Arjan van de Ven <arjan at linux.intel.com>
+Cc: Borislav Petkov <bpetkov at suse.de>
+Cc: Dave Hansen <dave.hansen at intel.com>
+Cc: Linus Torvalds <torvalds at linux-foundation.org>
+Cc: Mel Gorman <mgorman at suse.de>
+Cc: Michal Hocko <mhocko at suse.com>
+Cc: Nadav Amit <nadav.amit at gmail.com>
+Cc: Nadav Amit <namit at vmware.com>
+Cc: Peter Zijlstra <peterz at infradead.org>
+Cc: Rik van Riel <riel at redhat.com>
+Cc: Thomas Gleixner <tglx at linutronix.de>
+Cc: linux-mm at kvack.org
+Signed-off-by: Ingo Molnar <mingo at kernel.org>
+[Hugh Dickins: Backported to 3.2]
+Signed-off-by: Hugh Dickins <hughd at google.com>
+Signed-off-by: Ben Hutchings <ben at decadent.org.uk>
+---
+ arch/x86/include/asm/hardirq.h     |  2 +-
+ arch/x86/include/asm/mmu.h         |  6 -----
+ arch/x86/include/asm/mmu_context.h |  2 --
+ arch/x86/include/asm/tlbflush.h    | 47 +-------------------------------------
+ arch/x86/mm/tlb.c                  | 17 ++------------
+ 5 files changed, 4 insertions(+), 70 deletions(-)
+
+--- a/arch/x86/include/asm/hardirq.h
++++ b/arch/x86/include/asm/hardirq.h
+@@ -18,8 +18,8 @@ typedef struct {
+ #ifdef CONFIG_SMP
+ 	unsigned int irq_resched_count;
+ 	unsigned int irq_call_count;
+-	unsigned int irq_tlb_count;
+ #endif
++	unsigned int irq_tlb_count;
+ #ifdef CONFIG_X86_THERMAL_VECTOR
+ 	unsigned int irq_thermal_count;
+ #endif
+--- a/arch/x86/include/asm/mmu.h
++++ b/arch/x86/include/asm/mmu.h
+@@ -20,12 +20,6 @@ typedef struct {
+ 	void *vdso;
+ } mm_context_t;
+ 
+-#ifdef CONFIG_SMP
+ void leave_mm(int cpu);
+-#else
+-static inline void leave_mm(int cpu)
+-{
+-}
+-#endif
+ 
+ #endif /* _ASM_X86_MMU_H */
+--- a/arch/x86/include/asm/mmu_context.h
++++ b/arch/x86/include/asm/mmu_context.h
+@@ -69,10 +69,8 @@ void destroy_context(struct mm_struct *m
+ 
+ static inline void enter_lazy_tlb(struct mm_struct *mm, struct task_struct *tsk)
+ {
+-#ifdef CONFIG_SMP
+ 	if (percpu_read(cpu_tlbstate.state) == TLBSTATE_OK)
+ 		percpu_write(cpu_tlbstate.state, TLBSTATE_LAZY);
+-#endif
+ }
+ 
+ extern void switch_mm(struct mm_struct *prev, struct mm_struct *next,
+--- a/arch/x86/include/asm/tlbflush.h
++++ b/arch/x86/include/asm/tlbflush.h
+@@ -6,6 +6,7 @@
+ 
+ #include <asm/processor.h>
+ #include <asm/system.h>
++#include <asm/smp.h>
+ 
+ static inline void __invpcid(unsigned long pcid, unsigned long addr,
+ 			     unsigned long type)
+@@ -145,52 +146,8 @@ static inline void __flush_tlb_one(unsig
+  *
+  * ..but the i386 has somewhat limited tlb flushing capabilities,
+  * and page-granular flushes are available only on i486 and up.
+- *
+- * x86-64 can only flush individual pages or full VMs. For a range flush
+- * we always do the full VM. Might be worth trying if for a small
+- * range a few INVLPGs in a row are a win.
+  */
+ 
+-#ifndef CONFIG_SMP
+-
+-#define flush_tlb() __flush_tlb()
+-#define flush_tlb_all() __flush_tlb_all()
+-#define local_flush_tlb() __flush_tlb()
+-
+-static inline void flush_tlb_mm(struct mm_struct *mm)
+-{
+-	if (mm == current->active_mm)
+-		__flush_tlb();
+-}
+-
+-static inline void flush_tlb_page(struct vm_area_struct *vma,
+-				  unsigned long addr)
+-{
+-	if (vma->vm_mm == current->active_mm)
+-		__flush_tlb_one(addr);
+-}
+-
+-static inline void flush_tlb_range(struct vm_area_struct *vma,
+-				   unsigned long start, unsigned long end)
+-{
+-	if (vma->vm_mm == current->active_mm)
+-		__flush_tlb();
+-}
+-
+-static inline void native_flush_tlb_others(const struct cpumask *cpumask,
+-					   struct mm_struct *mm,
+-					   unsigned long va)
+-{
+-}
+-
+-static inline void reset_lazy_tlbstate(void)
+-{
+-}
+-
+-#else  /* SMP */
+-
+-#include <asm/smp.h>
+-
+ #define local_flush_tlb() __flush_tlb()
+ 
+ extern void flush_tlb_all(void);
+@@ -224,8 +181,6 @@ static inline void reset_lazy_tlbstate(v
+ 	percpu_write(cpu_tlbstate.active_mm, &init_mm);
+ }
+ 
+-#endif	/* SMP */
+-
+ #ifndef CONFIG_PARAVIRT
+ #define flush_tlb_others(mask, mm, va)	native_flush_tlb_others(mask, mm, va)
+ #endif
+--- a/arch/x86/mm/tlb.c
++++ b/arch/x86/mm/tlb.c
+@@ -17,7 +17,7 @@ DEFINE_PER_CPU_SHARED_ALIGNED(struct tlb
+ 			= { &init_mm, 0, };
+ 
+ /*
+- *	Smarter SMP flushing macros.
++ *	TLB flushing, formerly SMP-only
+  *		c/o Linus Torvalds.
+  *
+  *	These mean you can really definitely utterly forget about
+@@ -38,8 +38,6 @@ DEFINE_PER_CPU_SHARED_ALIGNED(struct tlb
+  *	fixed, at the cost of triggering multiple IPIs in some cases.
+  */
+ 
+-#ifdef CONFIG_SMP
+-
+ union smp_flush_state {
+ 	struct {
+ 		struct mm_struct *flush_mm;
+@@ -71,8 +69,6 @@ void leave_mm(int cpu)
+ }
+ EXPORT_SYMBOL_GPL(leave_mm);
+ 
+-#endif /* CONFIG_SMP */
+-
+ void switch_mm(struct mm_struct *prev, struct mm_struct *next,
+ 	       struct task_struct *tsk)
+ {
+@@ -89,10 +85,8 @@ void switch_mm_irqs_off(struct mm_struct
+ 	unsigned cpu = smp_processor_id();
+ 
+ 	if (likely(prev != next)) {
+-#ifdef CONFIG_SMP
+ 		percpu_write(cpu_tlbstate.state, TLBSTATE_OK);
+ 		percpu_write(cpu_tlbstate.active_mm, next);
+-#endif
+ 		cpumask_set_cpu(cpu, mm_cpumask(next));
+ 
+ 		/*
+@@ -133,9 +127,7 @@ void switch_mm_irqs_off(struct mm_struct
+ 		 */
+ 		if (unlikely(prev->context.ldt != next->context.ldt))
+ 			load_mm_ldt(next);
+-	}
+-#ifdef CONFIG_SMP
+-	else {
++	} else {
+ 		percpu_write(cpu_tlbstate.state, TLBSTATE_OK);
+ 		BUG_ON(percpu_read(cpu_tlbstate.active_mm) != next);
+ 
+@@ -151,11 +143,8 @@ void switch_mm_irqs_off(struct mm_struct
+ 			load_mm_ldt(next);
+ 		}
+ 	}
+-#endif
+ }
+ 
+-#ifdef CONFIG_SMP
+-
+ /*
+  *
+  * The flush IPI assumes that a thread switch happens in this order:
+@@ -437,5 +426,3 @@ void flush_tlb_all(void)
+ {
+ 	on_each_cpu(do_flush_tlb_all, NULL, 1);
+ }
+-
+-#endif /* CONFIG_SMP */
diff --git a/debian/patches/bugfix/all/kpti/x86-mm-sched-core-turn-off-irqs-in-switch_mm.patch b/debian/patches/bugfix/all/kpti/x86-mm-sched-core-turn-off-irqs-in-switch_mm.patch
new file mode 100644
index 0000000..43af903
--- /dev/null
+++ b/debian/patches/bugfix/all/kpti/x86-mm-sched-core-turn-off-irqs-in-switch_mm.patch
@@ -0,0 +1,64 @@
+From: Andy Lutomirski <luto at kernel.org>
+Date: Tue, 26 Apr 2016 09:39:09 -0700
+Subject: x86/mm, sched/core: Turn off IRQs in switch_mm()
+
+commit 078194f8e9fe3cf54c8fd8bded48a1db5bd8eb8a upstream.
+
+Potential races between switch_mm() and TLB-flush or LDT-flush IPIs
+could be very messy.  AFAICT the code is currently okay, whether by
+accident or by careful design, but enabling PCID will make it
+considerably more complicated and will no longer be obviously safe.
+
+Fix it with a big hammer: run switch_mm() with IRQs off.
+
+To avoid a performance hit in the scheduler, we take advantage of
+our knowledge that the scheduler already has IRQs disabled when it
+calls switch_mm().
+
+Signed-off-by: Andy Lutomirski <luto at kernel.org>
+Reviewed-by: Borislav Petkov <bp at suse.de>
+Cc: Borislav Petkov <bp at alien8.de>
+Cc: Linus Torvalds <torvalds at linux-foundation.org>
+Cc: Peter Zijlstra <peterz at infradead.org>
+Cc: Thomas Gleixner <tglx at linutronix.de>
+Link: http://lkml.kernel.org/r/f19baf759693c9dcae64bbff76189db77cb13398.1461688545.git.luto@kernel.org
+Signed-off-by: Ingo Molnar <mingo at kernel.org>
+Cc: Hugh Dickins <hughd at google.com>
+Signed-off-by: Ben Hutchings <ben at decadent.org.uk>
+---
+ arch/x86/include/asm/mmu_context.h |  4 ++++
+ arch/x86/mm/tlb.c                  | 10 ++++++++++
+ 2 files changed, 14 insertions(+)
+
+--- a/arch/x86/include/asm/mmu_context.h
++++ b/arch/x86/include/asm/mmu_context.h
+@@ -78,6 +78,10 @@ static inline void enter_lazy_tlb(struct
+ extern void switch_mm(struct mm_struct *prev, struct mm_struct *next,
+ 		      struct task_struct *tsk);
+ 
++extern void switch_mm_irqs_off(struct mm_struct *prev, struct mm_struct *next,
++			       struct task_struct *tsk);
++#define switch_mm_irqs_off switch_mm_irqs_off
++
+ #define activate_mm(prev, next)			\
+ do {						\
+ 	paravirt_activate_mm((prev), (next));	\
+--- a/arch/x86/mm/tlb.c
++++ b/arch/x86/mm/tlb.c
+@@ -76,6 +76,16 @@ EXPORT_SYMBOL_GPL(leave_mm);
+ void switch_mm(struct mm_struct *prev, struct mm_struct *next,
+ 	       struct task_struct *tsk)
+ {
++	unsigned long flags;
++
++	local_irq_save(flags);
++	switch_mm_irqs_off(prev, next, tsk);
++	local_irq_restore(flags);
++}
++
++void switch_mm_irqs_off(struct mm_struct *prev, struct mm_struct *next,
++			struct task_struct *tsk)
++{
+ 	unsigned cpu = smp_processor_id();
+ 
+ 	if (likely(prev != next)) {
diff --git a/debian/patches/bugfix/all/kpti/x86-mm-sched-core-uninline-switch_mm.patch b/debian/patches/bugfix/all/kpti/x86-mm-sched-core-uninline-switch_mm.patch
new file mode 100644
index 0000000..ef89943
--- /dev/null
+++ b/debian/patches/bugfix/all/kpti/x86-mm-sched-core-uninline-switch_mm.patch
@@ -0,0 +1,190 @@
+From: Andy Lutomirski <luto at kernel.org>
+Date: Tue, 26 Apr 2016 09:39:08 -0700
+Subject: x86/mm, sched/core: Uninline switch_mm()
+
+commit 69c0319aabba45bcf33178916a2f06967b4adede upstream.
+
+It's fairly large and it has quite a few callers.  This may also
+help untangle some headers down the road.
+
+Signed-off-by: Andy Lutomirski <luto at kernel.org>
+Reviewed-by: Borislav Petkov <bp at suse.de>
+Cc: Borislav Petkov <bp at alien8.de>
+Cc: Linus Torvalds <torvalds at linux-foundation.org>
+Cc: Peter Zijlstra <peterz at infradead.org>
+Cc: Thomas Gleixner <tglx at linutronix.de>
+Link: http://lkml.kernel.org/r/54f3367803e7f80b2be62c8a21879aa74b1a5f57.1461688545.git.luto@kernel.org
+Signed-off-by: Ingo Molnar <mingo at kernel.org>
+[Hugh Dickins: Backported to 3.2]
+Signed-off-by: Hugh Dickins <hughd at google.com>
+Signed-off-by: Ben Hutchings <ben at decadent.org.uk>
+---
+ arch/x86/include/asm/mmu_context.h | 72 +-----------------------------------
+ arch/x86/mm/tlb.c                  | 75 ++++++++++++++++++++++++++++++++++++++
+ 2 files changed, 77 insertions(+), 70 deletions(-)
+
+--- a/arch/x86/include/asm/mmu_context.h
++++ b/arch/x86/include/asm/mmu_context.h
+@@ -75,76 +75,8 @@ static inline void enter_lazy_tlb(struct
+ #endif
+ }
+ 
+-static inline void switch_mm(struct mm_struct *prev, struct mm_struct *next,
+-			     struct task_struct *tsk)
+-{
+-	unsigned cpu = smp_processor_id();
+-
+-	if (likely(prev != next)) {
+-#ifdef CONFIG_SMP
+-		percpu_write(cpu_tlbstate.state, TLBSTATE_OK);
+-		percpu_write(cpu_tlbstate.active_mm, next);
+-#endif
+-		cpumask_set_cpu(cpu, mm_cpumask(next));
+-
+-		/*
+-		 * Re-load page tables.
+-		 *
+-		 * This logic has an ordering constraint:
+-		 *
+-		 *  CPU 0: Write to a PTE for 'next'
+-		 *  CPU 0: load bit 1 in mm_cpumask.  if nonzero, send IPI.
+-		 *  CPU 1: set bit 1 in next's mm_cpumask
+-		 *  CPU 1: load from the PTE that CPU 0 writes (implicit)
+-		 *
+-		 * We need to prevent an outcome in which CPU 1 observes
+-		 * the new PTE value and CPU 0 observes bit 1 clear in
+-		 * mm_cpumask.  (If that occurs, then the IPI will never
+-		 * be sent, and CPU 0's TLB will contain a stale entry.)
+-		 *
+-		 * The bad outcome can occur if either CPU's load is
+-		 * reordered before that CPU's store, so both CPUs must
+-		 * execute full barriers to prevent this from happening.
+-		 *
+-		 * Thus, switch_mm needs a full barrier between the
+-		 * store to mm_cpumask and any operation that could load
+-		 * from next->pgd.  TLB fills are special and can happen
+-		 * due to instruction fetches or for no reason at all,
+-		 * and neither LOCK nor MFENCE orders them.
+-		 * Fortunately, load_cr3() is serializing and gives the
+-		 * ordering guarantee we need.
+-		 *
+-		 */
+-		load_cr3(next->pgd);
+-
+-		/* stop flush ipis for the previous mm */
+-		cpumask_clear_cpu(cpu, mm_cpumask(prev));
+-
+-		/*
+-		 * load the LDT, if the LDT is different:
+-		 */
+-		if (unlikely(prev->context.ldt != next->context.ldt))
+-			load_mm_ldt(next);
+-	}
+-#ifdef CONFIG_SMP
+-	else {
+-		percpu_write(cpu_tlbstate.state, TLBSTATE_OK);
+-		BUG_ON(percpu_read(cpu_tlbstate.active_mm) != next);
+-
+-		if (!cpumask_test_and_set_cpu(cpu, mm_cpumask(next))) {
+-			/* We were in lazy tlb mode and leave_mm disabled
+-			 * tlb flush IPI delivery. We must reload CR3
+-			 * to make sure to use no freed page tables.
+-			 *
+-			 * As above, load_cr3() is serializing and orders TLB
+-			 * fills with respect to the mm_cpumask write.
+-			 */
+-			load_cr3(next->pgd);
+-			load_mm_ldt(next);
+-		}
+-	}
+-#endif
+-}
++extern void switch_mm(struct mm_struct *prev, struct mm_struct *next,
++		      struct task_struct *tsk);
+ 
+ #define activate_mm(prev, next)			\
+ do {						\
+--- a/arch/x86/mm/tlb.c
++++ b/arch/x86/mm/tlb.c
+@@ -71,6 +71,81 @@ void leave_mm(int cpu)
+ }
+ EXPORT_SYMBOL_GPL(leave_mm);
+ 
++#endif /* CONFIG_SMP */
++
++void switch_mm(struct mm_struct *prev, struct mm_struct *next,
++	       struct task_struct *tsk)
++{
++	unsigned cpu = smp_processor_id();
++
++	if (likely(prev != next)) {
++#ifdef CONFIG_SMP
++		percpu_write(cpu_tlbstate.state, TLBSTATE_OK);
++		percpu_write(cpu_tlbstate.active_mm, next);
++#endif
++		cpumask_set_cpu(cpu, mm_cpumask(next));
++
++		/*
++		 * Re-load page tables.
++		 *
++		 * This logic has an ordering constraint:
++		 *
++		 *  CPU 0: Write to a PTE for 'next'
++		 *  CPU 0: load bit 1 in mm_cpumask.  if nonzero, send IPI.
++		 *  CPU 1: set bit 1 in next's mm_cpumask
++		 *  CPU 1: load from the PTE that CPU 0 writes (implicit)
++		 *
++		 * We need to prevent an outcome in which CPU 1 observes
++		 * the new PTE value and CPU 0 observes bit 1 clear in
++		 * mm_cpumask.  (If that occurs, then the IPI will never
++		 * be sent, and CPU 0's TLB will contain a stale entry.)
++		 *
++		 * The bad outcome can occur if either CPU's load is
++		 * reordered before that CPU's store, so both CPUs must
++		 * execute full barriers to prevent this from happening.
++		 *
++		 * Thus, switch_mm needs a full barrier between the
++		 * store to mm_cpumask and any operation that could load
++		 * from next->pgd.  TLB fills are special and can happen
++		 * due to instruction fetches or for no reason at all,
++		 * and neither LOCK nor MFENCE orders them.
++		 * Fortunately, load_cr3() is serializing and gives the
++		 * ordering guarantee we need.
++		 *
++		 */
++		load_cr3(next->pgd);
++
++		/* stop flush ipis for the previous mm */
++		cpumask_clear_cpu(cpu, mm_cpumask(prev));
++
++		/*
++		 * load the LDT, if the LDT is different:
++		 */
++		if (unlikely(prev->context.ldt != next->context.ldt))
++			load_mm_ldt(next);
++	}
++#ifdef CONFIG_SMP
++	else {
++		percpu_write(cpu_tlbstate.state, TLBSTATE_OK);
++		BUG_ON(percpu_read(cpu_tlbstate.active_mm) != next);
++
++		if (!cpumask_test_and_set_cpu(cpu, mm_cpumask(next))) {
++			/* We were in lazy tlb mode and leave_mm disabled
++			 * tlb flush IPI delivery. We must reload CR3
++			 * to make sure to use no freed page tables.
++			 *
++			 * As above, load_cr3() is serializing and orders TLB
++			 * fills with respect to the mm_cpumask write.
++			 */
++			load_cr3(next->pgd);
++			load_mm_ldt(next);
++		}
++	}
++#endif
++}
++
++#ifdef CONFIG_SMP
++
+ /*
+  *
+  * The flush IPI assumes that a thread switch happens in this order:
diff --git a/debian/patches/bugfix/all/kpti/x86-paravirt-dont-patch-flush_tlb_single.patch b/debian/patches/bugfix/all/kpti/x86-paravirt-dont-patch-flush_tlb_single.patch
new file mode 100644
index 0000000..6455d31
--- /dev/null
+++ b/debian/patches/bugfix/all/kpti/x86-paravirt-dont-patch-flush_tlb_single.patch
@@ -0,0 +1,65 @@
+From: Thomas Gleixner <tglx at linutronix.de>
+Date: Mon, 4 Dec 2017 15:07:30 +0100
+Subject: x86/paravirt: Dont patch flush_tlb_single
+
+commit a035795499ca1c2bd1928808d1a156eda1420383 upstream.
+
+native_flush_tlb_single() will be changed with the upcoming
+PAGE_TABLE_ISOLATION feature. This requires to have more code in
+there than INVLPG.
+
+Remove the paravirt patching for it.
+
+Signed-off-by: Thomas Gleixner <tglx at linutronix.de>
+Reviewed-by: Josh Poimboeuf <jpoimboe at redhat.com>
+Reviewed-by: Juergen Gross <jgross at suse.com>
+Acked-by: Peter Zijlstra <peterz at infradead.org>
+Cc: Andy Lutomirski <luto at kernel.org>
+Cc: Boris Ostrovsky <boris.ostrovsky at oracle.com>
+Cc: Borislav Petkov <bp at alien8.de>
+Cc: Borislav Petkov <bpetkov at suse.de>
+Cc: Brian Gerst <brgerst at gmail.com>
+Cc: Dave Hansen <dave.hansen at intel.com>
+Cc: Dave Hansen <dave.hansen at linux.intel.com>
+Cc: David Laight <David.Laight at aculab.com>
+Cc: Denys Vlasenko <dvlasenk at redhat.com>
+Cc: Eduardo Valentin <eduval at amazon.com>
+Cc: Greg KH <gregkh at linuxfoundation.org>
+Cc: H. Peter Anvin <hpa at zytor.com>
+Cc: Linus Torvalds <torvalds at linux-foundation.org>
+Cc: Rik van Riel <riel at redhat.com>
+Cc: Will Deacon <will.deacon at arm.com>
+Cc: aliguori at amazon.com
+Cc: daniel.gruss at iaik.tugraz.at
+Cc: hughd at google.com
+Cc: keescook at google.com
+Cc: linux-mm at kvack.org
+Cc: michael.schwarz at iaik.tugraz.at
+Cc: moritz.lipp at iaik.tugraz.at
+Cc: richard.fellner at student.tugraz.at
+Link: https://lkml.kernel.org/r/20171204150606.828111617@linutronix.de
+Signed-off-by: Ingo Molnar <mingo at kernel.org>
+[bwh: Backported to 3.2: adjust context]
+Signed-off-by: Ben Hutchings <ben at decadent.org.uk>
+---
+ arch/x86/kernel/paravirt_patch_64.c | 2 --
+ 1 file changed, 2 deletions(-)
+
+--- a/arch/x86/kernel/paravirt_patch_64.c
++++ b/arch/x86/kernel/paravirt_patch_64.c
+@@ -9,7 +9,6 @@ DEF_NATIVE(pv_irq_ops, save_fl, "pushfq;
+ DEF_NATIVE(pv_mmu_ops, read_cr2, "movq %cr2, %rax");
+ DEF_NATIVE(pv_mmu_ops, read_cr3, "movq %cr3, %rax");
+ DEF_NATIVE(pv_mmu_ops, write_cr3, "movq %rdi, %cr3");
+-DEF_NATIVE(pv_mmu_ops, flush_tlb_single, "invlpg (%rdi)");
+ DEF_NATIVE(pv_cpu_ops, clts, "clts");
+ DEF_NATIVE(pv_cpu_ops, wbinvd, "wbinvd");
+ 
+@@ -57,7 +56,6 @@ unsigned native_patch(u8 type, u16 clobb
+ 		PATCH_SITE(pv_mmu_ops, read_cr3);
+ 		PATCH_SITE(pv_mmu_ops, write_cr3);
+ 		PATCH_SITE(pv_cpu_ops, clts);
+-		PATCH_SITE(pv_mmu_ops, flush_tlb_single);
+ 		PATCH_SITE(pv_cpu_ops, wbinvd);
+ 
+ 	patch_site:
diff --git a/debian/patches/features/all/rt/0018-x86-vdso-Use-seqcount-instead-of-seqlock.patch b/debian/patches/features/all/rt/0018-x86-vdso-Use-seqcount-instead-of-seqlock.patch
index 10d4207..66eea6b 100644
--- a/debian/patches/features/all/rt/0018-x86-vdso-Use-seqcount-instead-of-seqlock.patch
+++ b/debian/patches/features/all/rt/0018-x86-vdso-Use-seqcount-instead-of-seqlock.patch
@@ -13,8 +13,6 @@ Signed-off-by: Thomas Gleixner <tglx at linutronix.de>
  arch/x86/vdso/vclock_gettime.c | 16 ++++++++--------
  3 files changed, 12 insertions(+), 17 deletions(-)
 
-diff --git a/arch/x86/include/asm/vgtod.h b/arch/x86/include/asm/vgtod.h
-index 815285bcaceb..1f007178c813 100644
 --- a/arch/x86/include/asm/vgtod.h
 +++ b/arch/x86/include/asm/vgtod.h
 @@ -5,7 +5,7 @@
@@ -26,8 +24,6 @@ index 815285bcaceb..1f007178c813 100644
  
  	/* open coded 'struct timespec' */
  	time_t		wall_time_sec;
-diff --git a/arch/x86/kernel/vsyscall_64.c b/arch/x86/kernel/vsyscall_64.c
-index f04adbd6f6f4..50392ee9a626 100644
 --- a/arch/x86/kernel/vsyscall_64.c
 +++ b/arch/x86/kernel/vsyscall_64.c
 @@ -52,10 +52,7 @@
@@ -41,8 +37,8 @@ index f04adbd6f6f4..50392ee9a626 100644
 +DEFINE_VVAR(struct vsyscall_gtod_data, vsyscall_gtod_data);
  
  static enum { EMULATE, NATIVE, NONE } vsyscall_mode = NATIVE;
- 
-@@ -86,9 +83,7 @@ void update_vsyscall_tz(void)
+ unsigned long vsyscall_pgprot = __PAGE_KERNEL_VSYSCALL;
+@@ -87,9 +84,7 @@ void update_vsyscall_tz(void)
  void update_vsyscall(struct timespec *wall_time, struct timespec *wtm,
  			struct clocksource *clock, u32 mult)
  {
@@ -53,7 +49,7 @@ index f04adbd6f6f4..50392ee9a626 100644
  
  	/* copy vsyscall data */
  	vsyscall_gtod_data.clock.vclock_mode	= clock->archdata.vclock_mode;
-@@ -101,7 +96,7 @@ void update_vsyscall(struct timespec *wall_time, struct timespec *wtm,
+@@ -102,7 +97,7 @@ void update_vsyscall(struct timespec *wa
  	vsyscall_gtod_data.wall_to_monotonic	= *wtm;
  	vsyscall_gtod_data.wall_time_coarse	= __current_kernel_time();
  
@@ -62,11 +58,9 @@ index f04adbd6f6f4..50392ee9a626 100644
  }
  
  static void warn_bad_vsyscall(const char *level, struct pt_regs *regs,
-diff --git a/arch/x86/vdso/vclock_gettime.c b/arch/x86/vdso/vclock_gettime.c
-index 6bc0e723b6e8..d8511fb90c64 100644
 --- a/arch/x86/vdso/vclock_gettime.c
 +++ b/arch/x86/vdso/vclock_gettime.c
-@@ -86,11 +86,11 @@ notrace static noinline int do_realtime(struct timespec *ts)
+@@ -86,11 +86,11 @@ notrace static noinline int do_realtime(
  {
  	unsigned long seq, ns;
  	do {
@@ -80,7 +74,7 @@ index 6bc0e723b6e8..d8511fb90c64 100644
  	timespec_add_ns(ts, ns);
  	return 0;
  }
-@@ -99,12 +99,12 @@ notrace static noinline int do_monotonic(struct timespec *ts)
+@@ -99,12 +99,12 @@ notrace static noinline int do_monotonic
  {
  	unsigned long seq, ns, secs;
  	do {
@@ -95,7 +89,7 @@ index 6bc0e723b6e8..d8511fb90c64 100644
  
  	/* wall_time_nsec, vgetns(), and wall_to_monotonic.tv_nsec
  	 * are all guaranteed to be nonnegative.
-@@ -123,10 +123,10 @@ notrace static noinline int do_realtime_coarse(struct timespec *ts)
+@@ -123,10 +123,10 @@ notrace static noinline int do_realtime_
  {
  	unsigned long seq;
  	do {
@@ -108,7 +102,7 @@ index 6bc0e723b6e8..d8511fb90c64 100644
  	return 0;
  }
  
-@@ -134,12 +134,12 @@ notrace static noinline int do_monotonic_coarse(struct timespec *ts)
+@@ -134,12 +134,12 @@ notrace static noinline int do_monotonic
  {
  	unsigned long seq, ns, secs;
  	do {
diff --git a/debian/patches/features/all/rt/0138-posix-timers-thread-posix-cpu-timers-on-rt.patch b/debian/patches/features/all/rt/0138-posix-timers-thread-posix-cpu-timers-on-rt.patch
index 7a93f0b..8dc116b 100644
--- a/debian/patches/features/all/rt/0138-posix-timers-thread-posix-cpu-timers-on-rt.patch
+++ b/debian/patches/features/all/rt/0138-posix-timers-thread-posix-cpu-timers-on-rt.patch
@@ -18,11 +18,9 @@ Signed-off-by: Thomas Gleixner <tglx at linutronix.de>
  kernel/posix-cpu-timers.c | 182 ++++++++++++++++++++++++++++++++++++++++++++--
  5 files changed, 190 insertions(+), 6 deletions(-)
 
-diff --git a/include/linux/init_task.h b/include/linux/init_task.h
-index cdde2b379c8d..3202e80e5796 100644
 --- a/include/linux/init_task.h
 +++ b/include/linux/init_task.h
-@@ -142,6 +142,12 @@ extern struct task_group root_task_group;
+@@ -142,6 +142,12 @@ extern struct task_group root_task_group
  # define INIT_PERF_EVENTS(tsk)
  #endif
  
@@ -35,7 +33,7 @@ index cdde2b379c8d..3202e80e5796 100644
  #define INIT_TASK_COMM "swapper"
  
  /*
-@@ -197,6 +203,7 @@ extern struct task_group root_task_group;
+@@ -197,6 +203,7 @@ extern struct task_group root_task_group
  	.cpu_timers	= INIT_CPU_TIMERS(tsk.cpu_timers),		\
  	.pi_lock	= __RAW_SPIN_LOCK_UNLOCKED(tsk.pi_lock),	\
  	.timer_slack_ns = 50000, /* 50 usec default slack */		\
@@ -43,8 +41,6 @@ index cdde2b379c8d..3202e80e5796 100644
  	.pids = {							\
  		[PIDTYPE_PID]  = INIT_PID_LINK(PIDTYPE_PID),		\
  		[PIDTYPE_PGID] = INIT_PID_LINK(PIDTYPE_PGID),		\
-diff --git a/include/linux/sched.h b/include/linux/sched.h
-index 2106741cb592..e0f4e8910a8f 100644
 --- a/include/linux/sched.h
 +++ b/include/linux/sched.h
 @@ -1368,6 +1368,9 @@ struct task_struct {
@@ -57,23 +53,19 @@ index 2106741cb592..e0f4e8910a8f 100644
  
  /* process credentials */
  	const struct cred __rcu *real_cred; /* objective and real subjective task
-diff --git a/init/main.c b/init/main.c
-index 85b1b3dfa7fe..7bf452a4743b 100644
 --- a/init/main.c
 +++ b/init/main.c
-@@ -69,6 +69,7 @@
- #include <linux/slab.h>
+@@ -70,6 +70,7 @@
  #include <linux/perf_event.h>
  #include <linux/random.h>
+ #include <linux/kaiser.h>
 +#include <linux/posix-timers.h>
  
  #include <asm/io.h>
  #include <asm/bugs.h>
-diff --git a/kernel/fork.c b/kernel/fork.c
-index ed468a34b333..84115e74de17 100644
 --- a/kernel/fork.c
 +++ b/kernel/fork.c
-@@ -1031,6 +1031,9 @@ void mm_init_owner(struct mm_struct *mm, struct task_struct *p)
+@@ -1037,6 +1037,9 @@ void mm_init_owner(struct mm_struct *mm,
   */
  static void posix_cpu_timers_init(struct task_struct *tsk)
  {
@@ -83,11 +75,9 @@ index ed468a34b333..84115e74de17 100644
  	tsk->cputime_expires.prof_exp = cputime_zero;
  	tsk->cputime_expires.virt_exp = cputime_zero;
  	tsk->cputime_expires.sched_exp = 0;
-diff --git a/kernel/posix-cpu-timers.c b/kernel/posix-cpu-timers.c
-index 962c291224d7..cff175794fd9 100644
 --- a/kernel/posix-cpu-timers.c
 +++ b/kernel/posix-cpu-timers.c
-@@ -701,7 +701,7 @@ static int posix_cpu_timer_set(struct k_itimer *timer, int flags,
+@@ -701,7 +701,7 @@ static int posix_cpu_timer_set(struct k_
  	/*
  	 * Disarm any old timer after extracting its expiry time.
  	 */
@@ -96,7 +86,7 @@ index 962c291224d7..cff175794fd9 100644
  
  	ret = 0;
  	old_incr = timer->it.cpu.incr;
-@@ -1223,7 +1223,7 @@ void posix_cpu_timer_schedule(struct k_itimer *timer)
+@@ -1223,7 +1223,7 @@ void posix_cpu_timer_schedule(struct k_i
  	/*
  	 * Now re-arm for the new expiry time.
  	 */
@@ -105,7 +95,7 @@ index 962c291224d7..cff175794fd9 100644
  	arm_timer(timer);
  	spin_unlock(&p->sighand->siglock);
  
-@@ -1290,10 +1290,11 @@ static inline int fastpath_timer_check(struct task_struct *tsk)
+@@ -1290,10 +1290,11 @@ static inline int fastpath_timer_check(s
  	sig = tsk->signal;
  	if (sig->cputimer.running) {
  		struct task_cputime group_sample;
@@ -119,7 +109,7 @@ index 962c291224d7..cff175794fd9 100644
  
  		if (task_cputime_expired(&group_sample, &sig->cputime_expires))
  			return 1;
-@@ -1307,13 +1308,13 @@ static inline int fastpath_timer_check(struct task_struct *tsk)
+@@ -1307,13 +1308,13 @@ static inline int fastpath_timer_check(s
   * already updated our counts.  We need to check if any timers fire now.
   * Interrupts are disabled.
   */
@@ -135,7 +125,7 @@ index 962c291224d7..cff175794fd9 100644
  
  	/*
  	 * The fast path checks that there are no expired thread or thread
-@@ -1371,6 +1372,175 @@ void run_posix_cpu_timers(struct task_struct *tsk)
+@@ -1371,6 +1372,175 @@ void run_posix_cpu_timers(struct task_st
  	}
  }
  
diff --git a/debian/patches/series b/debian/patches/series
index 71ee934..ca070e4 100644
--- a/debian/patches/series
+++ b/debian/patches/series
@@ -1120,6 +1120,48 @@ bugfix/all/bluetooth-bnep-bnep_add_connection-should-verify-tha.patch
 bugfix/all/xfrm-fix-crash-in-xfrm_msg_getsa-netlink-handler.patch
 bugfix/all/ipsec-fix-aborted-xfrm-policy-dump-crash.patch
 bugfix/x86/kvm-vmx-remove-i-o-port-0x80-bypass-on-intel-hosts.patch
+bugfix/all/kpti/x86-cpufeature-add-cpu-features-from-intel-document-319433-012a.patch
+bugfix/all/kpti/x86-mm-add-invpcid-helpers.patch
+bugfix/all/kpti/x86-mm-fix-invpcid-asm-constraint.patch
+bugfix/all/kpti/x86-mm-add-a-noinvpcid-boot-option-to-turn-off-invpcid.patch
+bugfix/all/kpti/x86-mm-if-invpcid-is-available-use-it-to-flush-global-mappings.patch
+bugfix/all/kpti/mm-mmu_context-sched-core-fix-mmu_context.h-assumption.patch
+bugfix/all/kpti/sched-core-add-switch_mm_irqs_off-and-use-it-in-the-scheduler.patch
+bugfix/all/kpti/x86-mm-build-arch-x86-mm-tlb.c-even-on-smp.patch
+bugfix/all/kpti/x86-mm-sched-core-uninline-switch_mm.patch
+bugfix/all/kpti/x86-mm-sched-core-turn-off-irqs-in-switch_mm.patch
+bugfix/all/kpti/sched-core-idle_task_exit-shouldn-t-use-switch_mm_irqs_off.patch
+bugfix/all/kpti/x86-mm-remove-the-up-asm-tlbflush.h-code-always-use-the-formerly-smp-code.patch
+bugfix/all/kpti/x86-mm-disable-pcid-on-32-bit-kernels.patch
+bugfix/all/kpti/x86-mm-add-the-nopcid-boot-option-to-turn-off-pcid.patch
+bugfix/all/kpti/x86-mm-enable-cr4.pcide-on-supported-systems.patch
+bugfix/all/kpti/x86-mm-64-fix-reboot-interaction-with-cr4.pcide.patch
+bugfix/all/kpti/kaiser-kernel-address-isolation.patch
+bugfix/all/kpti/x86-mm-kaiser-re-enable-vsyscalls.patch
+bugfix/all/kpti/kaiser-user_map-__kprobes_text-too.patch
+bugfix/all/kpti/kaiser-alloc_ldt_struct-use-get_zeroed_page.patch
+bugfix/all/kpti/x86-alternatives-cleanup-dprintk-macro.patch
+bugfix/all/kpti/x86-alternatives-add-instruction-padding.patch
+bugfix/all/kpti/x86-alternatives-make-jmps-more-robust.patch
+bugfix/all/kpti/x86-alternatives-use-optimized-nops-for-padding.patch
+bugfix/all/kpti/kaiser-add-nokaiser-boot-option-using-alternative.patch
+bugfix/all/kpti/x86-boot-carve-out-early-cmdline-parsing-function.patch
+bugfix/all/kpti/x86-boot-fix-early-command-line-parsing-when-matching-at-end.patch
+bugfix/all/kpti/x86-boot-fix-early-command-line-parsing-when-partial-word-matches.patch
+bugfix/all/kpti/x86-boot-simplify-early-command-line-parsing.patch
+bugfix/all/kpti/x86-boot-pass-in-size-to-early-cmdline-parsing.patch
+bugfix/all/kpti/x86-boot-add-early-cmdline-parsing-for-options-with-arguments.patch
+bugfix/all/kpti/x86-kaiser-rename-and-simplify-x86_feature_kaiser-handling.patch
+bugfix/all/kpti/x86-kaiser-check-boottime-cmdline-params.patch
+bugfix/all/kpti/kaiser-use-alternative-instead-of-x86_cr3_pcid_noflush.patch
+bugfix/all/kpti/kaiser-asm-tlbflush.h-handle-nopge-at-lower-level.patch
+bugfix/all/kpti/kaiser-kaiser_flush_tlb_on_return_to_user-check-pcid.patch
+bugfix/all/kpti/x86-paravirt-dont-patch-flush_tlb_single.patch
+bugfix/all/kpti/x86-kaiser-reenable-paravirt.patch
+bugfix/all/kpti/kaiser-disabled-on-xen-pv.patch
+bugfix/all/kpti/x86-kaiser-move-feature-detection-up.patch
+bugfix/all/kpti/kpti-rename-to-page_table_isolation.patch
+bugfix/all/kpti/kpti-report-when-enabled.patch
 
 # ABI maintenance
 debian/perf-hide-abi-change-in-3.2.30.patch

-- 
Alioth's /usr/local/bin/git-commit-notice on /srv/git.debian.org/git/kernel/linux.git
