[kernel] r19082 - in dists/trunk/linux-2.6: . debian debian/config debian/installer debian/installer/amd64/modules/amd64 debian/installer/i386/modules/i386 debian/installer/modules debian/patches/bugfix/all debian/patches/bugfix/x86 debian/patches/features/all/be2net debian/patches/features/all/codel debian/patches/series
Ben Hutchings
benh at alioth.debian.org
Mon Jun 4 20:36:01 UTC 2012
Author: benh
Date: Mon Jun 4 20:35:59 2012
New Revision: 19082
Log:
Merge changes from sid up to 3.2.19-1
Added:
dists/trunk/linux-2.6/debian/installer/amd64/modules/amd64/hyperv-modules
- copied unchanged from r19054, dists/sid/linux-2.6/debian/installer/amd64/modules/amd64/hyperv-modules
dists/trunk/linux-2.6/debian/installer/i386/modules/i386/hyperv-modules
- copied unchanged from r19054, dists/sid/linux-2.6/debian/installer/i386/modules/i386/hyperv-modules
dists/trunk/linux-2.6/debian/installer/modules/hyperv-modules
- copied unchanged from r19054, dists/sid/linux-2.6/debian/installer/modules/hyperv-modules
dists/trunk/linux-2.6/debian/patches/bugfix/all/fix-scsi_wait_scan.patch
- copied unchanged from r19054, dists/sid/linux-2.6/debian/patches/bugfix/all/fix-scsi_wait_scan.patch
dists/trunk/linux-2.6/debian/patches/bugfix/all/hugetlb-fix-resv_map-leak-in-error-path.patch
- copied, changed from r19054, dists/sid/linux-2.6/debian/patches/bugfix/all/hugetlb-fix-resv_map-leak-in-error-path.patch
dists/trunk/linux-2.6/debian/patches/bugfix/all/mm-fix-vma_resv_map-null-pointer.patch
- copied unchanged from r19054, dists/sid/linux-2.6/debian/patches/bugfix/all/mm-fix-vma_resv_map-null-pointer.patch
dists/trunk/linux-2.6/debian/patches/bugfix/x86/mm-pmd_read_atomic-fix-32bit-pae-pmd-walk-vs-pmd_populate-smp-race.patch
- copied unchanged from r19054, dists/sid/linux-2.6/debian/patches/bugfix/x86/mm-pmd_read_atomic-fix-32bit-pae-pmd-walk-vs-pmd_populate-smp-race.patch
dists/trunk/linux-2.6/debian/patches/features/all/be2net/
- copied from r19054, dists/sid/linux-2.6/debian/patches/features/all/be2net/
dists/trunk/linux-2.6/debian/patches/features/all/codel/
- copied from r19054, dists/sid/linux-2.6/debian/patches/features/all/codel/
Modified:
dists/trunk/linux-2.6/ (props changed)
dists/trunk/linux-2.6/debian/changelog
dists/trunk/linux-2.6/debian/config/config
dists/trunk/linux-2.6/debian/installer/modules/input-modules
dists/trunk/linux-2.6/debian/installer/package-list
dists/trunk/linux-2.6/debian/patches/series/base
Modified: dists/trunk/linux-2.6/debian/changelog
==============================================================================
--- dists/trunk/linux-2.6/debian/changelog Mon Jun 4 20:11:45 2012 (r19081)
+++ dists/trunk/linux-2.6/debian/changelog Mon Jun 4 20:35:59 2012 (r19082)
@@ -89,6 +89,43 @@
-- Ben Hutchings <ben at decadent.org.uk> Sun, 04 Mar 2012 20:27:42 +0000
+linux-2.6 (3.2.19-1) unstable; urgency=low
+
+ * New upstream stable update:
+ http://www.kernel.org/pub/linux/kernel/v3.x/ChangeLog-3.2.19
+ - hpsa: Fix problem with MSA2xxx devices (Closes: #661057)
+ - IB/core: Fix mismatch between locked and pinned pages
+ - iommu: Fix off by one in dmar_get_fault_reason()
+ - vfs: make AIO use the proper rw_verify_area() area helpers
+ - HID: logitech: read all 32 bits of report type bitfield (Closes: #671292)
+ - USB: Remove races in devio.c
+ - ext{3,4}: Fix error handling on inode bitmap corruption
+ - uvcvideo: Fix ENUMINPUT handling
+ - dl2k: Clean up rio_ioctl (CVE-2012-2313)
+ - [x86] MCE: Fix vm86 handling for 32bit mce handler
+ - [x86] mce: Fix check for processor context when machine check was taken.
+ - ethtool: Null-terminate filename passed to ethtool_ops::flash_device
+ - NFSv4: Fix buffer overflows in ACL support (CVE-2012-2375)
+ + Avoid reading past buffer when calling GETACL
+ + Avoid beyond bounds copy while caching ACL
+
+ [ Ben Hutchings ]
+ * be2net: Backport most changes up to Linux 3.5-rc1, thanks to
+ Sarveshwar Bandi (Closes: #673391)
+ - Add support for Skyhawk cards
+ * net/sched: Add codel and fq_codel from Linux 3.5-rc1
+ * [x86] udeb: Add hyperv-modules containing Hyper-V paravirtualised drivers
+ * [x86] ata_piix: defer disks to the Hyper-V drivers by default
+ * [x86] drm/i915: Disable FBC on SandyBridge (Closes: #675022)
+ * AppArmor: compatibility patch for v5 interface (Closes: #661151)
+ * hugepages: fix use after free bug in "quota" handling (CVE-2012-2133)
+ * [x86] mm: pmd_read_atomic: fix 32bit PAE pmd walk vs pmd_populate SMP race
+ condition (CVE-2012-2373)
+ * hugetlb: fix resv_map leak in error path (CVE-2012-2390)
+ * [SCSI] fix scsi_wait_scan (Closes: #647436)
+
+ -- Ben Hutchings <ben at decadent.org.uk> Fri, 01 Jun 2012 13:15:48 +0100
+
linux-2.6 (3.2.18-1) unstable; urgency=low
* New upstream stable update:
Modified: dists/trunk/linux-2.6/debian/config/config
==============================================================================
--- dists/trunk/linux-2.6/debian/config/config Mon Jun 4 20:11:45 2012 (r19081)
+++ dists/trunk/linux-2.6/debian/config/config Mon Jun 4 20:35:59 2012 (r19082)
@@ -4597,6 +4597,8 @@
CONFIG_NET_SCH_MQPRIO=m
CONFIG_NET_SCH_CHOKE=m
CONFIG_NET_SCH_QFQ=m
+CONFIG_NET_SCH_CODEL=m
+CONFIG_NET_SCH_FQ_CODEL=m
CONFIG_NET_SCH_INGRESS=m
CONFIG_NET_SCH_PLUG=m
CONFIG_NET_CLS_BASIC=m
@@ -4722,6 +4724,7 @@
##
CONFIG_SECURITY_APPARMOR=y
CONFIG_SECURITY_APPARMOR_BOOTPARAM_VALUE=1
+CONFIG_SECURITY_APPARMOR_COMPAT_24=y
##
## file: security/integrity/ima/Kconfig
Copied: dists/trunk/linux-2.6/debian/installer/amd64/modules/amd64/hyperv-modules (from r19054, dists/sid/linux-2.6/debian/installer/amd64/modules/amd64/hyperv-modules)
==============================================================================
--- /dev/null 00:00:00 1970 (empty, because file is newly added)
+++ dists/trunk/linux-2.6/debian/installer/amd64/modules/amd64/hyperv-modules Mon Jun 4 20:35:59 2012 (r19082, copy of r19054, dists/sid/linux-2.6/debian/installer/amd64/modules/amd64/hyperv-modules)
@@ -0,0 +1 @@
+#include <hyperv-modules>
Copied: dists/trunk/linux-2.6/debian/installer/i386/modules/i386/hyperv-modules (from r19054, dists/sid/linux-2.6/debian/installer/i386/modules/i386/hyperv-modules)
==============================================================================
--- /dev/null 00:00:00 1970 (empty, because file is newly added)
+++ dists/trunk/linux-2.6/debian/installer/i386/modules/i386/hyperv-modules Mon Jun 4 20:35:59 2012 (r19082, copy of r19054, dists/sid/linux-2.6/debian/installer/i386/modules/i386/hyperv-modules)
@@ -0,0 +1 @@
+#include <hyperv-modules>
Copied: dists/trunk/linux-2.6/debian/installer/modules/hyperv-modules (from r19054, dists/sid/linux-2.6/debian/installer/modules/hyperv-modules)
==============================================================================
--- /dev/null 00:00:00 1970 (empty, because file is newly added)
+++ dists/trunk/linux-2.6/debian/installer/modules/hyperv-modules Mon Jun 4 20:35:59 2012 (r19082, copy of r19054, dists/sid/linux-2.6/debian/installer/modules/hyperv-modules)
@@ -0,0 +1,6 @@
+# All Hyper-V paravirtual drivers
+hid-hyperv
+hv_netvsc
+hv_storvsc
+hv_utils
+hv_vmbus
Modified: dists/trunk/linux-2.6/debian/installer/modules/input-modules
==============================================================================
--- dists/trunk/linux-2.6/debian/installer/modules/input-modules Mon Jun 4 20:11:45 2012 (r19081)
+++ dists/trunk/linux-2.6/debian/installer/modules/input-modules Mon Jun 4 20:35:59 2012 (r19082)
@@ -1,3 +1,4 @@
+hid
usbhid
hid-apple ?
hid-belkin ?
Modified: dists/trunk/linux-2.6/debian/installer/package-list
==============================================================================
--- dists/trunk/linux-2.6/debian/installer/package-list Mon Jun 4 20:11:45 2012 (r19081)
+++ dists/trunk/linux-2.6/debian/installer/package-list Mon Jun 4 20:35:59 2012 (r19082)
@@ -478,3 +478,9 @@
Priority: extra
Description: LED modules
This package contains LED modules.
+
+Package: hyperv-modules
+Depends: kernel-image, input-modules, scsi-core-modules
+Priority: extra
+Description: Hyper-V modules
+ This package contains Hyper-V paravirtualised drivers for the kernel.
Copied: dists/trunk/linux-2.6/debian/patches/bugfix/all/fix-scsi_wait_scan.patch (from r19054, dists/sid/linux-2.6/debian/patches/bugfix/all/fix-scsi_wait_scan.patch)
==============================================================================
--- /dev/null 00:00:00 1970 (empty, because file is newly added)
+++ dists/trunk/linux-2.6/debian/patches/bugfix/all/fix-scsi_wait_scan.patch Mon Jun 4 20:35:59 2012 (r19082, copy of r19054, dists/sid/linux-2.6/debian/patches/bugfix/all/fix-scsi_wait_scan.patch)
@@ -0,0 +1,40 @@
+From: James Bottomley <jbottomley at parallels.com>
+Date: Wed, 30 May 2012 09:45:39 +0000
+Subject: [SCSI] fix scsi_wait_scan
+
+commit 1ff2f40305772b159a91c19590ee159d3a504afc upstream.
+
+Commit c751085943362143f84346d274e0011419c84202
+Author: Rafael J. Wysocki <rjw at sisk.pl>
+Date: Sun Apr 12 20:06:56 2009 +0200
+
+ PM/Hibernate: Wait for SCSI devices scan to complete during resume
+
+Broke the scsi_wait_scan module in 2.6.30. Apparently debian still uses it so
+fix it and backport to stable before removing it in 3.6.
+
+The breakage is caused because the function template in
+include/scsi/scsi_scan.h is defined to be a nop unless SCSI is built in.
+That means that in the modular case (which is every distro), the
+scsi_wait_scan module does a simple async_synchronize_full() instead of
+waiting for scans.
+
+Signed-off-by: James Bottomley <JBottomley at Parallels.com>
+Signed-off-by: Ben Hutchings <ben at decadent.org.uk>
+---
+ drivers/scsi/scsi_wait_scan.c | 2 +-
+ 1 file changed, 1 insertion(+), 1 deletion(-)
+
+diff --git a/drivers/scsi/scsi_wait_scan.c b/drivers/scsi/scsi_wait_scan.c
+index 74708fc..ae78148 100644
+--- a/drivers/scsi/scsi_wait_scan.c
++++ b/drivers/scsi/scsi_wait_scan.c
+@@ -12,7 +12,7 @@
+
+ #include <linux/module.h>
+ #include <linux/device.h>
+-#include <scsi/scsi_scan.h>
++#include "scsi_priv.h"
+
+ static int __init wait_scan_init(void)
+ {
Copied and modified: dists/trunk/linux-2.6/debian/patches/bugfix/all/hugetlb-fix-resv_map-leak-in-error-path.patch (from r19054, dists/sid/linux-2.6/debian/patches/bugfix/all/hugetlb-fix-resv_map-leak-in-error-path.patch)
==============================================================================
--- dists/sid/linux-2.6/debian/patches/bugfix/all/hugetlb-fix-resv_map-leak-in-error-path.patch Fri Jun 1 15:11:07 2012 (r19054, copy source)
+++ dists/trunk/linux-2.6/debian/patches/bugfix/all/hugetlb-fix-resv_map-leak-in-error-path.patch Mon Jun 4 20:35:59 2012 (r19082)
@@ -1,5 +1,5 @@
From: Dave Hansen <dave at linux.vnet.ibm.com>
-Date: Fri, 18 May 2012 11:46:30 -0700
+Date: Tue, 29 May 2012 15:06:46 -0700
Subject: hugetlb: fix resv_map leak in error path
commit c50ac050811d6485616a193eb0f37bfbd191cc89 upstream.
@@ -19,16 +19,26 @@
http://marc.info/?l=linux-mm&m=133728900729735
+This patch applies to 3.4 and later. A version for earlier kernels is at
+https://lkml.org/lkml/2012/5/22/418.
+
Signed-off-by: Dave Hansen <dave at linux.vnet.ibm.com>
-[Christoph Lameter: I have rediffed the patch against 2.6.32 and 3.2.0.]
-Signed-off-by: Ben Hutchings <ben at decadent.org.uk>
+Acked-by: Mel Gorman <mel at csn.ul.ie>
+Acked-by: KOSAKI Motohiro <kosaki.motohiro at jp.fujitsu.com>
+Reported-by: Christoph Lameter <cl at linux.com>
+Tested-by: Christoph Lameter <cl at linux.com>
+Cc: Andrea Arcangeli <aarcange at redhat.com>
+Signed-off-by: Andrew Morton <akpm at linux-foundation.org>
+Signed-off-by: Linus Torvalds <torvalds at linux-foundation.org>
---
mm/hugetlb.c | 28 ++++++++++++++++++++++------
1 file changed, 22 insertions(+), 6 deletions(-)
+diff --git a/mm/hugetlb.c b/mm/hugetlb.c
+index 41a647d..285a81e 100644
--- a/mm/hugetlb.c
+++ b/mm/hugetlb.c
-@@ -2068,6 +2068,15 @@
+@@ -2157,6 +2157,15 @@ static void hugetlb_vm_op_open(struct vm_area_struct *vma)
kref_get(&reservations->refs);
}
@@ -44,7 +54,7 @@
static void hugetlb_vm_op_close(struct vm_area_struct *vma)
{
struct hstate *h = hstate_vma(vma);
-@@ -2083,7 +2092,7 @@
+@@ -2173,7 +2182,7 @@ static void hugetlb_vm_op_close(struct vm_area_struct *vma)
reserve = (end - start) -
region_count(&reservations->regions, start, end);
@@ -53,7 +63,7 @@
if (reserve) {
hugetlb_acct_memory(h, -reserve);
-@@ -2884,12 +2893,16 @@
+@@ -2991,12 +3000,16 @@ int hugetlb_reserve_pages(struct inode *inode,
set_vma_resv_flags(vma, HPAGE_RESV_OWNER);
}
@@ -64,26 +74,26 @@
+ goto out_err;
+ }
- /* There must be enough filesystem quota for the mapping */
-- if (hugetlb_get_quota(inode->i_mapping, chg))
+ /* There must be enough pages in the subpool for the mapping */
+- if (hugepage_subpool_get_pages(spool, chg))
- return -ENOSPC;
-+ if (hugetlb_get_quota(inode->i_mapping, chg)) {
++ if (hugepage_subpool_get_pages(spool, chg)) {
+ ret = -ENOSPC;
+ goto out_err;
+ }
/*
* Check enough hugepages are available for the reservation.
-@@ -2898,7 +2911,7 @@
+@@ -3005,7 +3018,7 @@ int hugetlb_reserve_pages(struct inode *inode,
ret = hugetlb_acct_memory(h, chg);
if (ret < 0) {
- hugetlb_put_quota(inode->i_mapping, chg);
+ hugepage_subpool_put_pages(spool, chg);
- return ret;
+ goto out_err;
}
/*
-@@ -2915,6 +2928,9 @@
+@@ -3022,6 +3035,9 @@ int hugetlb_reserve_pages(struct inode *inode,
if (!vma || vma->vm_flags & VM_MAYSHARE)
region_add(&inode->i_mapping->private_list, from, to);
return 0;
Copied: dists/trunk/linux-2.6/debian/patches/bugfix/all/mm-fix-vma_resv_map-null-pointer.patch (from r19054, dists/sid/linux-2.6/debian/patches/bugfix/all/mm-fix-vma_resv_map-null-pointer.patch)
==============================================================================
--- /dev/null 00:00:00 1970 (empty, because file is newly added)
+++ dists/trunk/linux-2.6/debian/patches/bugfix/all/mm-fix-vma_resv_map-null-pointer.patch Mon Jun 4 20:35:59 2012 (r19082, copy of r19054, dists/sid/linux-2.6/debian/patches/bugfix/all/mm-fix-vma_resv_map-null-pointer.patch)
@@ -0,0 +1,66 @@
+From: Dave Hansen <dave at linux.vnet.ibm.com>
+Date: Wed, 30 May 2012 07:51:07 -0700
+Subject: mm: fix vma_resv_map() NULL pointer
+
+commit 4523e1458566a0e8ecfaff90f380dd23acc44d27 upstream.
+
+hugetlb_reserve_pages() can be used for either normal file-backed
+hugetlbfs mappings, or MAP_HUGETLB. In the MAP_HUGETLB, semi-anonymous
+mode, there is not a VMA around. The new call to resv_map_put() assumed
+that there was, and resulted in a NULL pointer dereference:
+
+ BUG: unable to handle kernel NULL pointer dereference at 0000000000000030
+ IP: vma_resv_map+0x9/0x30
+ PGD 141453067 PUD 1421e1067 PMD 0
+ Oops: 0000 [#1] PREEMPT SMP
+ ...
+ Pid: 14006, comm: trinity-child6 Not tainted 3.4.0+ #36
+ RIP: vma_resv_map+0x9/0x30
+ ...
+ Process trinity-child6 (pid: 14006, threadinfo ffff8801414e0000, task ffff8801414f26b0)
+ Call Trace:
+ resv_map_put+0xe/0x40
+ hugetlb_reserve_pages+0xa6/0x1d0
+ hugetlb_file_setup+0x102/0x2c0
+ newseg+0x115/0x360
+ ipcget+0x1ce/0x310
+ sys_shmget+0x5a/0x60
+ system_call_fastpath+0x16/0x1b
+
+This was reported by Dave Jones, but was reproducible with the
+libhugetlbfs test cases, so shame on me for not running them in the
+first place.
+
+With this, the oops is gone, and the output of libhugetlbfs's
+run_tests.py is identical to plain 3.4 again.
+
+[ Marked for stable, since this was introduced by commit c50ac050811d
+ ("hugetlb: fix resv_map leak in error path") which was also marked for
+ stable ]
+
+Reported-by: Dave Jones <davej at redhat.com>
+Cc: Mel Gorman <mel at csn.ul.ie>
+Cc: KOSAKI Motohiro <kosaki.motohiro at jp.fujitsu.com>
+Cc: Christoph Lameter <cl at linux.com>
+Cc: Andrea Arcangeli <aarcange at redhat.com>
+Cc: Andrew Morton <akpm at linux-foundation.org>
+Signed-off-by: Linus Torvalds <torvalds at linux-foundation.org>
+Signed-off-by: Ben Hutchings <ben at decadent.org.uk>
+---
+ mm/hugetlb.c | 3 ++-
+ 1 file changed, 2 insertions(+), 1 deletion(-)
+
+diff --git a/mm/hugetlb.c b/mm/hugetlb.c
+index 285a81e..e198831 100644
+--- a/mm/hugetlb.c
++++ b/mm/hugetlb.c
+@@ -3036,7 +3036,8 @@ int hugetlb_reserve_pages(struct inode *inode,
+ region_add(&inode->i_mapping->private_list, from, to);
+ return 0;
+ out_err:
+- resv_map_put(vma);
++ if (vma)
++ resv_map_put(vma);
+ return ret;
+ }
+
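For readers skimming the patch above, the shape of the fix can be sketched in user space. This is only an illustration of the guarded error path, not the kernel code: `struct fake_vma`, `fake_resv_map_put()` and `fake_reserve()` are hypothetical stand-ins, and the failure condition (a negative `chg` returned as the error) is an assumption modelled on the hunk quoted earlier.

```c
#include <stddef.h>

/* Hypothetical stand-in for a VMA that holds one reservation-map reference. */
struct fake_vma {
    int resv_refs;
};

/* Hypothetical stand-in for resv_map_put(): drops the reference.
 * Like the kernel helper, it dereferences its argument unconditionally. */
static void fake_resv_map_put(struct fake_vma *vma)
{
    vma->resv_refs--;
}

/* Sketch of the error path after the fix. In the MAP_HUGETLB / shmget
 * case there is no VMA, so vma may be NULL and must be checked before
 * the reference is dropped. */
static int fake_reserve(struct fake_vma *vma, long chg)
{
    int ret;

    if (chg < 0) {          /* assumed failure condition for this sketch */
        ret = (int)chg;
        goto out_err;
    }
    return 0;

out_err:
    if (vma)                /* the guard added by the patch */
        fake_resv_map_put(vma);
    return ret;
}
```

Calling `fake_reserve(NULL, -12)` returns the error without touching a VMA, while a failing call with a real `struct fake_vma` still drops its reference, which is the behaviour the earlier leak fix intended.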
Copied: dists/trunk/linux-2.6/debian/patches/bugfix/x86/mm-pmd_read_atomic-fix-32bit-pae-pmd-walk-vs-pmd_populate-smp-race.patch (from r19054, dists/sid/linux-2.6/debian/patches/bugfix/x86/mm-pmd_read_atomic-fix-32bit-pae-pmd-walk-vs-pmd_populate-smp-race.patch)
==============================================================================
--- /dev/null 00:00:00 1970 (empty, because file is newly added)
+++ dists/trunk/linux-2.6/debian/patches/bugfix/x86/mm-pmd_read_atomic-fix-32bit-pae-pmd-walk-vs-pmd_populate-smp-race.patch Mon Jun 4 20:35:59 2012 (r19082, copy of r19054, dists/sid/linux-2.6/debian/patches/bugfix/x86/mm-pmd_read_atomic-fix-32bit-pae-pmd-walk-vs-pmd_populate-smp-race.patch)
@@ -0,0 +1,214 @@
+From: Andrea Arcangeli <aarcange at redhat.com>
+Date: Tue, 29 May 2012 15:06:49 -0700
+Subject: mm: pmd_read_atomic: fix 32bit PAE pmd walk vs pmd_populate SMP race
+ condition
+
+commit 26c191788f18129af0eb32a358cdaea0c7479626 upstream.
+
+When holding the mmap_sem for reading, pmd_offset_map_lock should only
+run on a pmd_t that has been read atomically from the pmdp pointer,
+otherwise we may read only half of it leading to this crash.
+
+PID: 11679 TASK: f06e8000 CPU: 3 COMMAND: "do_race_2_panic"
+ #0 [f06a9dd8] crash_kexec at c049b5ec
+ #1 [f06a9e2c] oops_end at c083d1c2
+ #2 [f06a9e40] no_context at c0433ded
+ #3 [f06a9e64] bad_area_nosemaphore at c043401a
+ #4 [f06a9e6c] __do_page_fault at c0434493
+ #5 [f06a9eec] do_page_fault at c083eb45
+ #6 [f06a9f04] error_code (via page_fault) at c083c5d5
+ EAX: 01fb470c EBX: fff35000 ECX: 00000003 EDX: 00000100 EBP:
+ 00000000
+ DS: 007b ESI: 9e201000 ES: 007b EDI: 01fb4700 GS: 00e0
+ CS: 0060 EIP: c083bc14 ERR: ffffffff EFLAGS: 00010246
+ #7 [f06a9f38] _spin_lock at c083bc14
+ #8 [f06a9f44] sys_mincore at c0507b7d
+ #9 [f06a9fb0] system_call at c083becd
+ start len
+ EAX: ffffffda EBX: 9e200000 ECX: 00001000 EDX: 6228537f
+ DS: 007b ESI: 00000000 ES: 007b EDI: 003d0f00
+ SS: 007b ESP: 62285354 EBP: 62285388 GS: 0033
+ CS: 0073 EIP: 00291416 ERR: 000000da EFLAGS: 00000286
+
+This should be a longstanding bug affecting x86 32bit PAE without THP.
+Only archs with 64bit large pmd_t and 32bit unsigned long should be
+affected.
+
+With THP enabled the barrier() in pmd_none_or_trans_huge_or_clear_bad()
+would partly hide the bug when the pmd transitions from none to stable,
+by forcing a re-read of the *pmd in pmd_offset_map_lock, but when THP is
+enabled a new set of problems arises from the fact that the pmd could then
+transition freely in any of the none, pmd_trans_huge or pmd_trans_stable states.
+So making the barrier in pmd_none_or_trans_huge_or_clear_bad()
+unconditional isn't a good idea and it would be a flaky solution.
+
+This should be fully fixed by introducing a pmd_read_atomic that reads
+the pmd in order with THP disabled, or by reading the pmd atomically
+with cmpxchg8b with THP enabled.
+
+Luckily this new race condition only triggers in the places that must
+already be covered by pmd_none_or_trans_huge_or_clear_bad() so the fix
+is localized there but this bug is not related to THP.
+
+NOTE: this can trigger on x86 32bit systems with PAE enabled with more
+than 4G of ram, otherwise the high part of the pmd will never risk to be
+truncated because it would be zero at all times, in turn so hiding the
+SMP race.
+
+This bug was discovered and fully debugged by Ulrich, quote:
+
+----
+[..]
+pmd_none_or_trans_huge_or_clear_bad() loads the content of edx and
+eax.
+
+ 496 static inline int pmd_none_or_trans_huge_or_clear_bad(pmd_t
+ *pmd)
+ 497 {
+ 498 /* depend on compiler for an atomic pmd read */
+ 499 pmd_t pmdval = *pmd;
+
+ // edi = pmd pointer
+0xc0507a74 <sys_mincore+548>: mov 0x8(%esp),%edi
+...
+ // edx = PTE page table high address
+0xc0507a84 <sys_mincore+564>: mov 0x4(%edi),%edx
+...
+ // eax = PTE page table low address
+0xc0507a8e <sys_mincore+574>: mov (%edi),%eax
+
+[..]
+
+Please note that the PMD is not read atomically. These are two "mov"
+instructions where the high order bits of the PMD entry are fetched
+first. Hence, the above machine code is prone to the following race.
+
+- The PMD entry {high|low} is 0x0000000000000000.
+ The "mov" at 0xc0507a84 loads 0x00000000 into edx.
+
+- A page fault (on another CPU) sneaks in between the two "mov"
+ instructions and instantiates the PMD.
+
+- The PMD entry {high|low} is now 0x00000003fda38067.
+ The "mov" at 0xc0507a8e loads 0xfda38067 into eax.
+----
+
+Reported-by: Ulrich Obergfell <uobergfe at redhat.com>
+Signed-off-by: Andrea Arcangeli <aarcange at redhat.com>
+Cc: Mel Gorman <mgorman at suse.de>
+Cc: Hugh Dickins <hughd at google.com>
+Cc: Larry Woodman <lwoodman at redhat.com>
+Cc: Petr Matousek <pmatouse at redhat.com>
+Cc: Rik van Riel <riel at redhat.com>
+Signed-off-by: Andrew Morton <akpm at linux-foundation.org>
+Signed-off-by: Linus Torvalds <torvalds at linux-foundation.org>
+Signed-off-by: Ben Hutchings <ben at decadent.org.uk>
+---
+ arch/x86/include/asm/pgtable-3level.h | 50 +++++++++++++++++++++++++++++++++
+ include/asm-generic/pgtable.h | 22 +++++++++++++--
+ 2 files changed, 70 insertions(+), 2 deletions(-)
+
+diff --git a/arch/x86/include/asm/pgtable-3level.h b/arch/x86/include/asm/pgtable-3level.h
+index effff47..43876f1 100644
+--- a/arch/x86/include/asm/pgtable-3level.h
++++ b/arch/x86/include/asm/pgtable-3level.h
+@@ -31,6 +31,56 @@ static inline void native_set_pte(pte_t *ptep, pte_t pte)
+ ptep->pte_low = pte.pte_low;
+ }
+
++#define pmd_read_atomic pmd_read_atomic
++/*
++ * pte_offset_map_lock on 32bit PAE kernels was reading the pmd_t with
++ * a "*pmdp" dereference done by gcc. Problem is, in certain places
++ * where pte_offset_map_lock is called, concurrent page faults are
++ * allowed, if the mmap_sem is hold for reading. An example is mincore
++ * vs page faults vs MADV_DONTNEED. On the page fault side
++ * pmd_populate rightfully does a set_64bit, but if we're reading the
++ * pmd_t with a "*pmdp" on the mincore side, a SMP race can happen
++ * because gcc will not read the 64bit of the pmd atomically. To fix
++ * this all places running pmd_offset_map_lock() while holding the
++ * mmap_sem in read mode, shall read the pmdp pointer using this
++ * function to know if the pmd is null nor not, and in turn to know if
++ * they can run pmd_offset_map_lock or pmd_trans_huge or other pmd
++ * operations.
++ *
++ * Without THP if the mmap_sem is hold for reading, the
++ * pmd can only transition from null to not null while pmd_read_atomic runs.
++ * So there's no need of literally reading it atomically.
++ *
++ * With THP if the mmap_sem is hold for reading, the pmd can become
++ * THP or null or point to a pte (and in turn become "stable") at any
++ * time under pmd_read_atomic, so it's mandatory to read it atomically
++ * with cmpxchg8b.
++ */
++#ifndef CONFIG_TRANSPARENT_HUGEPAGE
++static inline pmd_t pmd_read_atomic(pmd_t *pmdp)
++{
++ pmdval_t ret;
++ u32 *tmp = (u32 *)pmdp;
++
++ ret = (pmdval_t) (*tmp);
++ if (ret) {
++ /*
++ * If the low part is null, we must not read the high part
++ * or we can end up with a partial pmd.
++ */
++ smp_rmb();
++ ret |= ((pmdval_t)*(tmp + 1)) << 32;
++ }
++
++ return (pmd_t) { ret };
++}
++#else /* CONFIG_TRANSPARENT_HUGEPAGE */
++static inline pmd_t pmd_read_atomic(pmd_t *pmdp)
++{
++ return (pmd_t) { atomic64_read((atomic64_t *)pmdp) };
++}
++#endif /* CONFIG_TRANSPARENT_HUGEPAGE */
++
+ static inline void native_set_pte_atomic(pte_t *ptep, pte_t pte)
+ {
+ set_64bit((unsigned long long *)(ptep), native_pte_val(pte));
+diff --git a/include/asm-generic/pgtable.h b/include/asm-generic/pgtable.h
+index e2768f1..6f2b45a 100644
+--- a/include/asm-generic/pgtable.h
++++ b/include/asm-generic/pgtable.h
+@@ -445,6 +445,18 @@ static inline int pmd_write(pmd_t pmd)
+ #endif /* __HAVE_ARCH_PMD_WRITE */
+ #endif /* CONFIG_TRANSPARENT_HUGEPAGE */
+
++#ifndef pmd_read_atomic
++static inline pmd_t pmd_read_atomic(pmd_t *pmdp)
++{
++ /*
++ * Depend on compiler for an atomic pmd read. NOTE: this is
++ * only going to work, if the pmdval_t isn't larger than
++ * an unsigned long.
++ */
++ return *pmdp;
++}
++#endif
++
+ /*
+ * This function is meant to be used by sites walking pagetables with
+ * the mmap_sem hold in read mode to protect against MADV_DONTNEED and
+@@ -458,11 +470,17 @@ static inline int pmd_write(pmd_t pmd)
+ * undefined so behaving like if the pmd was none is safe (because it
+ * can return none anyway). The compiler level barrier() is critically
+ * important to compute the two checks atomically on the same pmdval.
++ *
++ * For 32bit kernels with a 64bit large pmd_t this automatically takes
++ * care of reading the pmd atomically to avoid SMP race conditions
++ * against pmd_populate() when the mmap_sem is hold for reading by the
++ * caller (a special atomic read not done by "gcc" as in the generic
++ * version above, is also needed when THP is disabled because the page
++ * fault can populate the pmd from under us).
+ */
+ static inline int pmd_none_or_trans_huge_or_clear_bad(pmd_t *pmd)
+ {
+- /* depend on compiler for an atomic pmd read */
+- pmd_t pmdval = *pmd;
++ pmd_t pmdval = pmd_read_atomic(pmd);
+ /*
+ * The barrier will stabilize the pmdval in a register or on
+ * the stack so that it will stop changing under the code.
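The race described in the commit message above can be replayed deterministically in user space. The sketch below is not kernel code: `struct fake_pmd`, `populate()` and the two reader functions are hypothetical names, the "other CPU" is simulated by a callback invoked between the two 32-bit loads, and `smp_rmb()` is omitted because the simulation is single-threaded. The values are the ones quoted in the commit message.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for a 64-bit PAE pmd entry seen as two 32-bit halves. */
struct fake_pmd {
    uint32_t low;
    uint32_t high;
};

/* Simulated "other CPU": runs between the reader's two loads. */
typedef void (*racer_fn)(struct fake_pmd *);

/* Simulates pmd_populate() installing {high|low} = 0x00000003fda38067. */
static void populate(struct fake_pmd *p)
{
    p->high = 0x00000003u;
    p->low = 0xfda38067u;
}

/* High half first, as in the disassembly quoted in the commit message. */
static uint64_t read_high_first(struct fake_pmd *p, racer_fn racer)
{
    uint32_t high = p->high;
    if (racer)
        racer(p);               /* page fault on another CPU sneaks in */
    uint32_t low = p->low;
    return ((uint64_t)high << 32) | low;
}

/* Low half first, and the high half only if the low half was non-zero:
 * the ordering used by the non-THP pmd_read_atomic() in the patch. */
static uint64_t read_low_first(struct fake_pmd *p, racer_fn racer)
{
    uint64_t ret = p->low;
    if (racer)
        racer(p);
    if (ret)
        ret |= (uint64_t)p->high << 32;
    return ret;
}
```

Starting from an all-zero entry, `read_high_first(&pmd, populate)` yields the torn value 0x00000000fda38067, whereas `read_low_first()` on a fresh zero entry yields 0 (a consistent "none" snapshot) and a later uncontended read yields the full 0x00000003fda38067. This relies on the property stated in the patch comment that, without THP, the pmd can only go from null to non-null while mmap_sem is held for reading.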
Modified: dists/trunk/linux-2.6/debian/patches/series/base
==============================================================================
--- dists/trunk/linux-2.6/debian/patches/series/base Mon Jun 4 20:11:45 2012 (r19081)
+++ dists/trunk/linux-2.6/debian/patches/series/base Mon Jun 4 20:35:59 2012 (r19082)
@@ -60,3 +60,36 @@
+ features/all/wacom/0026-Input-wacom-return-proper-error-if-usb_get_extra_des.patch
+ bugfix/all/acpi-battery-only-refresh-the-sysfs-files-when-pertinent.patch
+
+# Update be2net driver to 3.5ish
++ features/all/be2net/0043-be2net-fix-ethtool-get-settings.patch
++ features/all/be2net/0044-be2net-Fix-VLAN-multicast-packet-reception.patch
++ features/all/be2net/0045-be2net-Fix-FW-download-in-Lancer.patch
++ features/all/be2net/0046-be2net-Fix-ethtool-self-test-for-Lancer.patch
++ features/all/be2net/0047-be2net-Fix-traffic-stall-INTx-mode.patch
++ features/all/be2net/0048-be2net-Fix-Lancer-statistics.patch
++ features/all/be2net/0049-be2net-Fix-wrong-status-getting-returned-for-MCC-com.patch
++ features/all/be2net/0050-be2net-Fix-FW-download-for-BE.patch
++ features/all/be2net/0051-be2net-Ignore-status-of-some-ioctls-during-driver-lo.patch
++ features/all/be2net/0052-be2net-fix-speed-displayed-by-ethtool-on-certain-SKU.patch
++ features/all/be2net/0053-be2net-update-the-driver-version.patch
++ features/all/be2net/0054-be2net-Fix-to-not-set-link-speed-for-disabled-functi.patch
++ features/all/be2net/0055-be2net-Fix-to-apply-duplex-value-as-unknown-when-lin.patch
++ features/all/be2net/0056-be2net-Record-receive-queue-index-in-skb-to-aid-RPS.patch
++ features/all/be2net/0057-be2net-Fix-EEH-error-reset-before-a-flash-dump-compl.patch
++ features/all/be2net/0058-be2net-avoid-disabling-sriov-while-VFs-are-assigned.patch
+
+# Add CoDel from 3.5, and prerequisites
++ features/all/codel/0001-codel-Controlled-Delay-AQM.patch
++ features/all/codel/0002-codel-use-Newton-method-instead-of-sqrt-and-divides.patch
++ features/all/codel/0003-fq_codel-Fair-Queue-Codel-AQM.patch
++ features/all/codel/0004-net-codel-Add-missing-include-linux-prefetch.h.patch
++ features/all/codel/0005-net-codel-fix-build-errors.patch
++ features/all/codel/0006-codel-use-u16-field-instead-of-31bits-for-rec_inv_sq.patch
++ features/all/codel/0007-fq_codel-should-use-qdisc-backlog-as-threshold.patch
+
++ bugfix/x86/mm-pmd_read_atomic-fix-32bit-pae-pmd-walk-vs-pmd_populate-smp-race.patch
++ bugfix/all/hugetlb-fix-resv_map-leak-in-error-path.patch
++ bugfix/all/mm-fix-vma_resv_map-null-pointer.patch
+
++ bugfix/all/fix-scsi_wait_scan.patch