[linux] 01/01: x86/mm: Add barriers and document switch_mm()-vs-flush synchronization (CVE-2016-2069)
debian-kernel at lists.debian.org
debian-kernel at lists.debian.org
Fri Jan 29 03:43:57 UTC 2016
This is an automated email from the git hooks/post-receive script.
benh pushed a commit to branch jessie-security
in repository linux.
commit 66e39a497de3449bdc7a2ffd4ee4416f61afbac3
Author: Ben Hutchings <ben at decadent.org.uk>
Date: Fri Jan 29 03:27:03 2016 +0000
x86/mm: Add barriers and document switch_mm()-vs-flush synchronization (CVE-2016-2069)
Plus a follow-up fix to the comments.
---
debian/changelog | 3 +
...barriers-and-document-switch_mm-vs-flush-.patch | 157 +++++++++++++++++++++
...x86-mm-Improve-switch_mm-barrier-comments.patch | 62 ++++++++
debian/patches/series | 2 +
4 files changed, 224 insertions(+)
diff --git a/debian/changelog b/debian/changelog
index 7d2f45e..1e48edf 100644
--- a/debian/changelog
+++ b/debian/changelog
@@ -5,6 +5,9 @@ linux (3.16.7-ckt20-1+deb8u4) UNRELEASED; urgency=medium
(Closes: #812207)
- tiny, extract a new func xino_fwrite_wkq()
- XINO handles EINTR from the dying process
+ * [x86] mm: Add barriers and document switch_mm()-vs-flush synchronization
+ (CVE-2016-2069)
+ * [x86] mm: Improve switch_mm() barrier comments
-- Ben Hutchings <ben at decadent.org.uk> Sat, 23 Jan 2016 22:53:58 +0000
diff --git a/debian/patches/bugfix/x86/x86-mm-Add-barriers-and-document-switch_mm-vs-flush-.patch b/debian/patches/bugfix/x86/x86-mm-Add-barriers-and-document-switch_mm-vs-flush-.patch
new file mode 100644
index 0000000..5306caa
--- /dev/null
+++ b/debian/patches/bugfix/x86/x86-mm-Add-barriers-and-document-switch_mm-vs-flush-.patch
@@ -0,0 +1,157 @@
+From: Andy Lutomirski <luto at kernel.org>
+Date: Wed, 6 Jan 2016 12:21:01 -0800
+Subject: x86/mm: Add barriers and document switch_mm()-vs-flush
+ synchronization
+Origin: https://git.kernel.org/linus/71b3c126e61177eb693423f2e18a1914205b165e
+
+When switch_mm() activates a new PGD, it also sets a bit that
+tells other CPUs that the PGD is in use so that TLB flush IPIs
+will be sent. In order for that to work correctly, the bit
+needs to be visible prior to loading the PGD and therefore
+starting to fill the local TLB.
+
+Document all the barriers that make this work correctly and add
+a couple that were missing.
+
+Signed-off-by: Andy Lutomirski <luto at kernel.org>
+Cc: Andrew Morton <akpm at linux-foundation.org>
+Cc: Andy Lutomirski <luto at amacapital.net>
+Cc: Borislav Petkov <bp at alien8.de>
+Cc: Brian Gerst <brgerst at gmail.com>
+Cc: Dave Hansen <dave.hansen at linux.intel.com>
+Cc: Denys Vlasenko <dvlasenk at redhat.com>
+Cc: H. Peter Anvin <hpa at zytor.com>
+Cc: Linus Torvalds <torvalds at linux-foundation.org>
+Cc: Peter Zijlstra <peterz at infradead.org>
+Cc: Rik van Riel <riel at redhat.com>
+Cc: Thomas Gleixner <tglx at linutronix.de>
+Cc: linux-mm at kvack.org
+Cc: stable at vger.kernel.org
+Signed-off-by: Ingo Molnar <mingo at kernel.org>
+[bwh: Backported to 3.16: adjust context]
+---
+ arch/x86/include/asm/mmu_context.h | 33 ++++++++++++++++++++++++++++++++-
+ arch/x86/mm/tlb.c | 29 ++++++++++++++++++++++++++---
+ 2 files changed, 58 insertions(+), 4 deletions(-)
+
+--- a/arch/x86/include/asm/mmu_context.h
++++ b/arch/x86/include/asm/mmu_context.h
+@@ -86,9 +86,35 @@ static inline void switch_mm(struct mm_s
+ #endif
+ cpumask_set_cpu(cpu, mm_cpumask(next));
+
+- /* Re-load page tables */
++ /*
++ * Re-load page tables.
++ *
++ * This logic has an ordering constraint:
++ *
++ * CPU 0: Write to a PTE for 'next'
++ * CPU 0: load bit 1 in mm_cpumask. if nonzero, send IPI.
++ * CPU 1: set bit 1 in next's mm_cpumask
++ * CPU 1: load from the PTE that CPU 0 writes (implicit)
++ *
++ * We need to prevent an outcome in which CPU 1 observes
++ * the new PTE value and CPU 0 observes bit 1 clear in
++ * mm_cpumask. (If that occurs, then the IPI will never
++ * be sent, and CPU 0's TLB will contain a stale entry.)
++ *
++ * The bad outcome can occur if either CPU's load is
++ * reordered before that CPU's store, so both CPUs much
++ * execute full barriers to prevent this from happening.
++ *
++ * Thus, switch_mm needs a full barrier between the
++ * store to mm_cpumask and any operation that could load
++ * from next->pgd. This barrier synchronizes with
++ * remote TLB flushers. Fortunately, load_cr3 is
++ * serializing and thus acts as a full barrier.
++ *
++ */
+ load_cr3(next->pgd);
+
++
+ /* Stop flush ipis for the previous mm */
+ cpumask_clear_cpu(cpu, mm_cpumask(prev));
+
+@@ -109,10 +135,15 @@ static inline void switch_mm(struct mm_s
+ * schedule, protecting us from simultaneous changes.
+ */
+ cpumask_set_cpu(cpu, mm_cpumask(next));
++
+ /*
+ * We were in lazy tlb mode and leave_mm disabled
+ * tlb flush IPI delivery. We must reload CR3
+ * to make sure to use no freed page tables.
++ *
++ * As above, this is a barrier that forces
++ * TLB repopulation to be ordered after the
++ * store to mm_cpumask.
+ */
+ load_cr3(next->pgd);
+ load_mm_ldt(next);
+--- a/arch/x86/mm/tlb.c
++++ b/arch/x86/mm/tlb.c
+@@ -152,7 +152,10 @@ void flush_tlb_current_task(void)
+ preempt_disable();
+
+ count_vm_tlb_event(NR_TLB_LOCAL_FLUSH_ALL);
++
++ /* This is an implicit full barrier that synchronizes with switch_mm. */
+ local_flush_tlb();
++
+ if (cpumask_any_but(mm_cpumask(mm), smp_processor_id()) < nr_cpu_ids)
+ flush_tlb_others(mm_cpumask(mm), mm, 0UL, TLB_FLUSH_ALL);
+ preempt_enable();
+@@ -166,11 +169,19 @@ void flush_tlb_mm_range(struct mm_struct
+ unsigned long nr_base_pages;
+
+ preempt_disable();
+- if (current->active_mm != mm)
++ if (current->active_mm != mm) {
++ /* Synchronize with switch_mm. */
++ smp_mb();
++
+ goto flush_all;
++ }
+
+ if (!current->mm) {
+ leave_mm(smp_processor_id());
++
++ /* Synchronize with switch_mm. */
++ smp_mb();
++
+ goto flush_all;
+ }
+
+@@ -191,6 +202,10 @@ void flush_tlb_mm_range(struct mm_struct
+ act_entries = mm->total_vm > act_entries ? act_entries : mm->total_vm;
+ nr_base_pages = (end - start) >> PAGE_SHIFT;
+
++ /*
++ * Both branches below are implicit full barriers (MOV to CR or
++ * INVLPG) that synchronize with switch_mm.
++ */
+ /* tlb_flushall_shift is on balance point, details in commit log */
+ if (nr_base_pages > act_entries) {
+ count_vm_tlb_event(NR_TLB_LOCAL_FLUSH_ALL);
+@@ -222,10 +237,18 @@ void flush_tlb_page(struct vm_area_struc
+ preempt_disable();
+
+ if (current->active_mm == mm) {
+- if (current->mm)
++ if (current->mm) {
++ /*
++ * Implicit full barrier (INVLPG) that synchronizes
++ * with switch_mm.
++ */
+ __flush_tlb_one(start);
+- else
++ } else {
+ leave_mm(smp_processor_id());
++
++ /* Synchronize with switch_mm. */
++ smp_mb();
++ }
+ }
+
+ if (cpumask_any_but(mm_cpumask(mm), smp_processor_id()) < nr_cpu_ids)
diff --git a/debian/patches/bugfix/x86/x86-mm-Improve-switch_mm-barrier-comments.patch b/debian/patches/bugfix/x86/x86-mm-Improve-switch_mm-barrier-comments.patch
new file mode 100644
index 0000000..d43baeb
--- /dev/null
+++ b/debian/patches/bugfix/x86/x86-mm-Improve-switch_mm-barrier-comments.patch
@@ -0,0 +1,62 @@
+From: Andy Lutomirski <luto at kernel.org>
+Date: Tue, 12 Jan 2016 12:47:40 -0800
+Subject: x86/mm: Improve switch_mm() barrier comments
+Origin: https://git.kernel.org/linus/4eaffdd5a5fe6ff9f95e1ab4de1ac904d5e0fa8b
+
+My previous comments were still a bit confusing and there was a
+typo. Fix it up.
+
+Reported-by: Peter Zijlstra <peterz at infradead.org>
+Signed-off-by: Andy Lutomirski <luto at kernel.org>
+Cc: Andy Lutomirski <luto at amacapital.net>
+Cc: Borislav Petkov <bp at alien8.de>
+Cc: Brian Gerst <brgerst at gmail.com>
+Cc: Dave Hansen <dave.hansen at linux.intel.com>
+Cc: Denys Vlasenko <dvlasenk at redhat.com>
+Cc: H. Peter Anvin <hpa at zytor.com>
+Cc: Linus Torvalds <torvalds at linux-foundation.org>
+Cc: Rik van Riel <riel at redhat.com>
+Cc: Thomas Gleixner <tglx at linutronix.de>
+Cc: stable at vger.kernel.org
+Fixes: 71b3c126e611 ("x86/mm: Add barriers and document switch_mm()-vs-flush synchronization")
+Link: http://lkml.kernel.org/r/0a0b43cdcdd241c5faaaecfbcc91a155ddedc9a1.1452631609.git.luto@kernel.org
+Signed-off-by: Ingo Molnar <mingo at kernel.org>
+---
+ arch/x86/include/asm/mmu_context.h | 15 ++++++++-------
+ 1 file changed, 8 insertions(+), 7 deletions(-)
+
+--- a/arch/x86/include/asm/mmu_context.h
++++ b/arch/x86/include/asm/mmu_context.h
+@@ -102,14 +102,16 @@ static inline void switch_mm(struct mm_s
+ * be sent, and CPU 0's TLB will contain a stale entry.)
+ *
+ * The bad outcome can occur if either CPU's load is
+- * reordered before that CPU's store, so both CPUs much
++ * reordered before that CPU's store, so both CPUs must
+ * execute full barriers to prevent this from happening.
+ *
+ * Thus, switch_mm needs a full barrier between the
+ * store to mm_cpumask and any operation that could load
+- * from next->pgd. This barrier synchronizes with
+- * remote TLB flushers. Fortunately, load_cr3 is
+- * serializing and thus acts as a full barrier.
++ * from next->pgd. TLB fills are special and can happen
++ * due to instruction fetches or for no reason at all,
++ * and neither LOCK nor MFENCE orders them.
++ * Fortunately, load_cr3() is serializing and gives the
++ * ordering guarantee we need.
+ *
+ */
+ load_cr3(next->pgd);
+@@ -141,9 +143,8 @@ static inline void switch_mm(struct mm_s
+ * tlb flush IPI delivery. We must reload CR3
+ * to make sure to use no freed page tables.
+ *
+- * As above, this is a barrier that forces
+- * TLB repopulation to be ordered after the
+- * store to mm_cpumask.
++ * As above, load_cr3() is serializing and orders TLB
++ * fills with respect to the mm_cpumask write.
+ */
+ load_cr3(next->pgd);
+ load_mm_ldt(next);
diff --git a/debian/patches/series b/debian/patches/series
index 1a0bc9d..821182a 100644
--- a/debian/patches/series
+++ b/debian/patches/series
@@ -679,3 +679,5 @@ bugfix/all/KEYS-Fix-keyring-ref-leak-in-join_session_keyring.patch
bugfix/all/fuse-break-infinite-loop-in-fuse_fill_write_pages.patch
bugfix/all/aufs-tiny-extract-a-new-func-xino_fwrite_wkq.patch
bugfix/all/aufs-for-4.3-xino-handles-eintr-from-the-dying-proce.patch
+bugfix/x86/x86-mm-Add-barriers-and-document-switch_mm-vs-flush-.patch
+bugfix/x86/x86-mm-Improve-switch_mm-barrier-comments.patch
--
Alioth's /usr/local/bin/git-commit-notice on /srv/git.debian.org/git/kernel/linux.git
More information about the Kernel-svn-changes
mailing list