[kernel] r19082 - in dists/trunk/linux-2.6: . debian debian/config debian/installer debian/installer/amd64/modules/amd64 debian/installer/i386/modules/i386 debian/installer/modules debian/patches/bugfix/all debian/patches/bugfix/x86 debian/patches/features/all/be2net debian/patches/features/all/codel debian/patches/series

Ben Hutchings benh at alioth.debian.org
Mon Jun 4 20:36:01 UTC 2012


Author: benh
Date: Mon Jun  4 20:35:59 2012
New Revision: 19082

Log:
Merge changes from sid up to 3.2.19-1

Added:
   dists/trunk/linux-2.6/debian/installer/amd64/modules/amd64/hyperv-modules
      - copied unchanged from r19054, dists/sid/linux-2.6/debian/installer/amd64/modules/amd64/hyperv-modules
   dists/trunk/linux-2.6/debian/installer/i386/modules/i386/hyperv-modules
      - copied unchanged from r19054, dists/sid/linux-2.6/debian/installer/i386/modules/i386/hyperv-modules
   dists/trunk/linux-2.6/debian/installer/modules/hyperv-modules
      - copied unchanged from r19054, dists/sid/linux-2.6/debian/installer/modules/hyperv-modules
   dists/trunk/linux-2.6/debian/patches/bugfix/all/fix-scsi_wait_scan.patch
      - copied unchanged from r19054, dists/sid/linux-2.6/debian/patches/bugfix/all/fix-scsi_wait_scan.patch
   dists/trunk/linux-2.6/debian/patches/bugfix/all/hugetlb-fix-resv_map-leak-in-error-path.patch
      - copied, changed from r19054, dists/sid/linux-2.6/debian/patches/bugfix/all/hugetlb-fix-resv_map-leak-in-error-path.patch
   dists/trunk/linux-2.6/debian/patches/bugfix/all/mm-fix-vma_resv_map-null-pointer.patch
      - copied unchanged from r19054, dists/sid/linux-2.6/debian/patches/bugfix/all/mm-fix-vma_resv_map-null-pointer.patch
   dists/trunk/linux-2.6/debian/patches/bugfix/x86/mm-pmd_read_atomic-fix-32bit-pae-pmd-walk-vs-pmd_populate-smp-race.patch
      - copied unchanged from r19054, dists/sid/linux-2.6/debian/patches/bugfix/x86/mm-pmd_read_atomic-fix-32bit-pae-pmd-walk-vs-pmd_populate-smp-race.patch
   dists/trunk/linux-2.6/debian/patches/features/all/be2net/
      - copied from r19054, dists/sid/linux-2.6/debian/patches/features/all/be2net/
   dists/trunk/linux-2.6/debian/patches/features/all/codel/
      - copied from r19054, dists/sid/linux-2.6/debian/patches/features/all/codel/
Modified:
   dists/trunk/linux-2.6/   (props changed)
   dists/trunk/linux-2.6/debian/changelog
   dists/trunk/linux-2.6/debian/config/config
   dists/trunk/linux-2.6/debian/installer/modules/input-modules
   dists/trunk/linux-2.6/debian/installer/package-list
   dists/trunk/linux-2.6/debian/patches/series/base

Modified: dists/trunk/linux-2.6/debian/changelog
==============================================================================
--- dists/trunk/linux-2.6/debian/changelog	Mon Jun  4 20:11:45 2012	(r19081)
+++ dists/trunk/linux-2.6/debian/changelog	Mon Jun  4 20:35:59 2012	(r19082)
@@ -89,6 +89,43 @@
 
  -- Ben Hutchings <ben at decadent.org.uk>  Sun, 04 Mar 2012 20:27:42 +0000
 
+linux-2.6 (3.2.19-1) unstable; urgency=low
+
+  * New upstream stable update:
+    http://www.kernel.org/pub/linux/kernel/v3.x/ChangeLog-3.2.19
+    - hpsa: Fix problem with MSA2xxx devices (Closes: #661057)
+    - IB/core: Fix mismatch between locked and pinned pages
+    - iommu: Fix off by one in dmar_get_fault_reason()
+    - vfs: make AIO use the proper rw_verify_area() area helpers
+    - HID: logitech: read all 32 bits of report type bitfield (Closes: #671292)
+    - USB: Remove races in devio.c
+    - ext{3,4}: Fix error handling on inode bitmap corruption
+    - uvcvideo: Fix ENUMINPUT handling
+    - dl2k: Clean up rio_ioctl (CVE-2012-2313)
+    - [x86] MCE: Fix vm86 handling for 32bit mce handler
+    - [x86] mce: Fix check for processor context when machine check was taken.
+    - ethtool: Null-terminate filename passed to ethtool_ops::flash_device
+    - NFSv4: Fix buffer overflows in ACL support (CVE-2012-2375)
+      + Avoid reading past buffer when calling GETACL
+      + Avoid beyond bounds copy while caching ACL
+
+  [ Ben Hutchings ]
+  * be2net: Backport most changes up to Linux 3.5-rc1, thanks to
+    Sarveshwar Bandi (Closes: #673391)
+    - Add support for Skyhawk cards
+  * net/sched: Add codel and fq_codel from Linux 3.5-rc1
+  * [x86] udeb: Add hyperv-modules containing Hyper-V paravirtualised drivers
+  * [x86] ata_piix: defer disks to the Hyper-V drivers by default
+  * [x86] drm/i915:: Disable FBC on SandyBridge (Closes: #675022)
+  * AppArmor: compatibility patch for v5 interface (Closes: #661151)
+  * hugepages: fix use after free bug in "quota" handling (CVE-2012-2133)
+  * [x86] mm: pmd_read_atomic: fix 32bit PAE pmd walk vs pmd_populate SMP race
+    condition (CVE-2012-2373)
+  * hugetlb: fix resv_map leak in error path (CVE-2012-2390)
+  * [SCSI] fix scsi_wait_scan (Closes: #647436)
+
+ -- Ben Hutchings <ben at decadent.org.uk>  Fri, 01 Jun 2012 13:15:48 +0100
+
 linux-2.6 (3.2.18-1) unstable; urgency=low
 
   * New upstream stable update:

Modified: dists/trunk/linux-2.6/debian/config/config
==============================================================================
--- dists/trunk/linux-2.6/debian/config/config	Mon Jun  4 20:11:45 2012	(r19081)
+++ dists/trunk/linux-2.6/debian/config/config	Mon Jun  4 20:35:59 2012	(r19082)
@@ -4597,6 +4597,8 @@
 CONFIG_NET_SCH_MQPRIO=m
 CONFIG_NET_SCH_CHOKE=m
 CONFIG_NET_SCH_QFQ=m
+CONFIG_NET_SCH_CODEL=m
+CONFIG_NET_SCH_FQ_CODEL=m
 CONFIG_NET_SCH_INGRESS=m
 CONFIG_NET_SCH_PLUG=m
 CONFIG_NET_CLS_BASIC=m
@@ -4722,6 +4724,7 @@
 ##
 CONFIG_SECURITY_APPARMOR=y
 CONFIG_SECURITY_APPARMOR_BOOTPARAM_VALUE=1
+CONFIG_SECURITY_APPARMOR_COMPAT_24=y
 
 ##
 ## file: security/integrity/ima/Kconfig

Copied: dists/trunk/linux-2.6/debian/installer/amd64/modules/amd64/hyperv-modules (from r19054, dists/sid/linux-2.6/debian/installer/amd64/modules/amd64/hyperv-modules)
==============================================================================
--- /dev/null	00:00:00 1970	(empty, because file is newly added)
+++ dists/trunk/linux-2.6/debian/installer/amd64/modules/amd64/hyperv-modules	Mon Jun  4 20:35:59 2012	(r19082, copy of r19054, dists/sid/linux-2.6/debian/installer/amd64/modules/amd64/hyperv-modules)
@@ -0,0 +1 @@
+#include <hyperv-modules>

Copied: dists/trunk/linux-2.6/debian/installer/i386/modules/i386/hyperv-modules (from r19054, dists/sid/linux-2.6/debian/installer/i386/modules/i386/hyperv-modules)
==============================================================================
--- /dev/null	00:00:00 1970	(empty, because file is newly added)
+++ dists/trunk/linux-2.6/debian/installer/i386/modules/i386/hyperv-modules	Mon Jun  4 20:35:59 2012	(r19082, copy of r19054, dists/sid/linux-2.6/debian/installer/i386/modules/i386/hyperv-modules)
@@ -0,0 +1 @@
+#include <hyperv-modules>

Copied: dists/trunk/linux-2.6/debian/installer/modules/hyperv-modules (from r19054, dists/sid/linux-2.6/debian/installer/modules/hyperv-modules)
==============================================================================
--- /dev/null	00:00:00 1970	(empty, because file is newly added)
+++ dists/trunk/linux-2.6/debian/installer/modules/hyperv-modules	Mon Jun  4 20:35:59 2012	(r19082, copy of r19054, dists/sid/linux-2.6/debian/installer/modules/hyperv-modules)
@@ -0,0 +1,6 @@
+# All Hyper-V paravirtual drivers
+hid-hyperv
+hv_netvsc
+hv_storvsc
+hv_utils
+hv_vmbus

Modified: dists/trunk/linux-2.6/debian/installer/modules/input-modules
==============================================================================
--- dists/trunk/linux-2.6/debian/installer/modules/input-modules	Mon Jun  4 20:11:45 2012	(r19081)
+++ dists/trunk/linux-2.6/debian/installer/modules/input-modules	Mon Jun  4 20:35:59 2012	(r19082)
@@ -1,3 +1,4 @@
+hid
 usbhid
 hid-apple ?
 hid-belkin ?

Modified: dists/trunk/linux-2.6/debian/installer/package-list
==============================================================================
--- dists/trunk/linux-2.6/debian/installer/package-list	Mon Jun  4 20:11:45 2012	(r19081)
+++ dists/trunk/linux-2.6/debian/installer/package-list	Mon Jun  4 20:35:59 2012	(r19082)
@@ -478,3 +478,9 @@
 Priority: extra
 Description: LED modules
  This package contains LED modules.
+
+Package: hyperv-modules
+Depends: kernel-image, input-modules, scsi-core-modules
+Priority: extra
+Description: Hyper-V modules
+ This package contains Hyper-V paravirtualised drivers for the kernel.

Copied: dists/trunk/linux-2.6/debian/patches/bugfix/all/fix-scsi_wait_scan.patch (from r19054, dists/sid/linux-2.6/debian/patches/bugfix/all/fix-scsi_wait_scan.patch)
==============================================================================
--- /dev/null	00:00:00 1970	(empty, because file is newly added)
+++ dists/trunk/linux-2.6/debian/patches/bugfix/all/fix-scsi_wait_scan.patch	Mon Jun  4 20:35:59 2012	(r19082, copy of r19054, dists/sid/linux-2.6/debian/patches/bugfix/all/fix-scsi_wait_scan.patch)
@@ -0,0 +1,40 @@
+From: James Bottomley <jbottomley at parallels.com>
+Date: Wed, 30 May 2012 09:45:39 +0000
+Subject: [SCSI] fix scsi_wait_scan
+
+commit 1ff2f40305772b159a91c19590ee159d3a504afc upstream.
+
+Commit  c751085943362143f84346d274e0011419c84202
+Author: Rafael J. Wysocki <rjw at sisk.pl>
+Date:   Sun Apr 12 20:06:56 2009 +0200
+
+    PM/Hibernate: Wait for SCSI devices scan to complete during resume
+
+Broke the scsi_wait_scan module in 2.6.30.  Apparently debian still uses it so
+fix it and backport to stable before removing it in 3.6.
+
+The breakage is caused because the function template in
+include/scsi/scsi_scan.h is defined to be a nop unless SCSI is built in.
+That means that in the modular case (which is every distro), the
+scsi_wait_scan module does a simple async_synchronize_full() instead of
+waiting for scans.
+
+Signed-off-by: James Bottomley <JBottomley at Parallels.com>
+Signed-off-by: Ben Hutchings <ben at decadent.org.uk>
+---
+ drivers/scsi/scsi_wait_scan.c |    2 +-
+ 1 file changed, 1 insertion(+), 1 deletion(-)
+
+diff --git a/drivers/scsi/scsi_wait_scan.c b/drivers/scsi/scsi_wait_scan.c
+index 74708fc..ae78148 100644
+--- a/drivers/scsi/scsi_wait_scan.c
++++ b/drivers/scsi/scsi_wait_scan.c
+@@ -12,7 +12,7 @@
+ 
+ #include <linux/module.h>
+ #include <linux/device.h>
+-#include <scsi/scsi_scan.h>
++#include "scsi_priv.h"
+ 
+ static int __init wait_scan_init(void)
+ {

Copied and modified: dists/trunk/linux-2.6/debian/patches/bugfix/all/hugetlb-fix-resv_map-leak-in-error-path.patch (from r19054, dists/sid/linux-2.6/debian/patches/bugfix/all/hugetlb-fix-resv_map-leak-in-error-path.patch)
==============================================================================
--- dists/sid/linux-2.6/debian/patches/bugfix/all/hugetlb-fix-resv_map-leak-in-error-path.patch	Fri Jun  1 15:11:07 2012	(r19054, copy source)
+++ dists/trunk/linux-2.6/debian/patches/bugfix/all/hugetlb-fix-resv_map-leak-in-error-path.patch	Mon Jun  4 20:35:59 2012	(r19082)
@@ -1,5 +1,5 @@
 From: Dave Hansen <dave at linux.vnet.ibm.com>
-Date: Fri, 18 May 2012 11:46:30 -0700
+Date: Tue, 29 May 2012 15:06:46 -0700
 Subject: hugetlb: fix resv_map leak in error path
 
 commit c50ac050811d6485616a193eb0f37bfbd191cc89 upstream.
@@ -19,16 +19,26 @@
 
 	http://marc.info/?l=linux-mm&m=133728900729735
 
+This patch applies to 3.4 and later.  A version for earlier kernels is at
+https://lkml.org/lkml/2012/5/22/418.
+
 Signed-off-by: Dave Hansen <dave at linux.vnet.ibm.com>
-[Christoph Lameter: I have rediffed the patch against 2.6.32 and 3.2.0.]
-Signed-off-by: Ben Hutchings <ben at decadent.org.uk>
+Acked-by: Mel Gorman <mel at csn.ul.ie>
+Acked-by: KOSAKI Motohiro <kosaki.motohiro at jp.fujitsu.com>
+Reported-by: Christoph Lameter <cl at linux.com>
+Tested-by: Christoph Lameter <cl at linux.com>
+Cc: Andrea Arcangeli <aarcange at redhat.com>
+Signed-off-by: Andrew Morton <akpm at linux-foundation.org>
+Signed-off-by: Linus Torvalds <torvalds at linux-foundation.org>
 ---
  mm/hugetlb.c |   28 ++++++++++++++++++++++------
  1 file changed, 22 insertions(+), 6 deletions(-)
 
+diff --git a/mm/hugetlb.c b/mm/hugetlb.c
+index 41a647d..285a81e 100644
 --- a/mm/hugetlb.c
 +++ b/mm/hugetlb.c
-@@ -2068,6 +2068,15 @@
+@@ -2157,6 +2157,15 @@ static void hugetlb_vm_op_open(struct vm_area_struct *vma)
  		kref_get(&reservations->refs);
  }
  
@@ -44,7 +54,7 @@
  static void hugetlb_vm_op_close(struct vm_area_struct *vma)
  {
  	struct hstate *h = hstate_vma(vma);
-@@ -2083,7 +2092,7 @@
+@@ -2173,7 +2182,7 @@ static void hugetlb_vm_op_close(struct vm_area_struct *vma)
  		reserve = (end - start) -
  			region_count(&reservations->regions, start, end);
  
@@ -53,7 +63,7 @@
  
  		if (reserve) {
  			hugetlb_acct_memory(h, -reserve);
-@@ -2884,12 +2893,16 @@
+@@ -2991,12 +3000,16 @@ int hugetlb_reserve_pages(struct inode *inode,
  		set_vma_resv_flags(vma, HPAGE_RESV_OWNER);
  	}
  
@@ -64,26 +74,26 @@
 +		goto out_err;
 +	}
  
- 	/* There must be enough filesystem quota for the mapping */
--	if (hugetlb_get_quota(inode->i_mapping, chg))
+ 	/* There must be enough pages in the subpool for the mapping */
+-	if (hugepage_subpool_get_pages(spool, chg))
 -		return -ENOSPC;
-+	if (hugetlb_get_quota(inode->i_mapping, chg)) {
++	if (hugepage_subpool_get_pages(spool, chg)) {
 +		ret = -ENOSPC;
 +		goto out_err;
 +	}
  
  	/*
  	 * Check enough hugepages are available for the reservation.
-@@ -2898,7 +2911,7 @@
+@@ -3005,7 +3018,7 @@ int hugetlb_reserve_pages(struct inode *inode,
  	ret = hugetlb_acct_memory(h, chg);
  	if (ret < 0) {
- 		hugetlb_put_quota(inode->i_mapping, chg);
+ 		hugepage_subpool_put_pages(spool, chg);
 -		return ret;
 +		goto out_err;
  	}
  
  	/*
-@@ -2915,6 +2928,9 @@
+@@ -3022,6 +3035,9 @@ int hugetlb_reserve_pages(struct inode *inode,
  	if (!vma || vma->vm_flags & VM_MAYSHARE)
  		region_add(&inode->i_mapping->private_list, from, to);
  	return 0;

Copied: dists/trunk/linux-2.6/debian/patches/bugfix/all/mm-fix-vma_resv_map-null-pointer.patch (from r19054, dists/sid/linux-2.6/debian/patches/bugfix/all/mm-fix-vma_resv_map-null-pointer.patch)
==============================================================================
--- /dev/null	00:00:00 1970	(empty, because file is newly added)
+++ dists/trunk/linux-2.6/debian/patches/bugfix/all/mm-fix-vma_resv_map-null-pointer.patch	Mon Jun  4 20:35:59 2012	(r19082, copy of r19054, dists/sid/linux-2.6/debian/patches/bugfix/all/mm-fix-vma_resv_map-null-pointer.patch)
@@ -0,0 +1,66 @@
+From: Dave Hansen <dave at linux.vnet.ibm.com>
+Date: Wed, 30 May 2012 07:51:07 -0700
+Subject: mm: fix vma_resv_map() NULL pointer
+
+commit 4523e1458566a0e8ecfaff90f380dd23acc44d27 upstream.
+
+hugetlb_reserve_pages() can be used for either normal file-backed
+hugetlbfs mappings, or MAP_HUGETLB.  In the MAP_HUGETLB, semi-anonymous
+mode, there is not a VMA around.  The new call to resv_map_put() assumed
+that there was, and resulted in a NULL pointer dereference:
+
+  BUG: unable to handle kernel NULL pointer dereference at 0000000000000030
+  IP: vma_resv_map+0x9/0x30
+  PGD 141453067 PUD 1421e1067 PMD 0
+  Oops: 0000 [#1] PREEMPT SMP
+  ...
+  Pid: 14006, comm: trinity-child6 Not tainted 3.4.0+ #36
+  RIP: vma_resv_map+0x9/0x30
+  ...
+  Process trinity-child6 (pid: 14006, threadinfo ffff8801414e0000, task ffff8801414f26b0)
+  Call Trace:
+    resv_map_put+0xe/0x40
+    hugetlb_reserve_pages+0xa6/0x1d0
+    hugetlb_file_setup+0x102/0x2c0
+    newseg+0x115/0x360
+    ipcget+0x1ce/0x310
+    sys_shmget+0x5a/0x60
+    system_call_fastpath+0x16/0x1b
+
+This was reported by Dave Jones, but was reproducible with the
+libhugetlbfs test cases, so shame on me for not running them in the
+first place.
+
+With this, the oops is gone, and the output of libhugetlbfs's
+run_tests.py is identical to plain 3.4 again.
+
+[ Marked for stable, since this was introduced by commit c50ac050811d
+  ("hugetlb: fix resv_map leak in error path") which was also marked for
+  stable ]
+
+Reported-by: Dave Jones <davej at redhat.com>
+Cc: Mel Gorman <mel at csn.ul.ie>
+Cc: KOSAKI Motohiro <kosaki.motohiro at jp.fujitsu.com>
+Cc: Christoph Lameter <cl at linux.com>
+Cc: Andrea Arcangeli <aarcange at redhat.com>
+Cc: Andrew Morton <akpm at linux-foundation.org>
+Signed-off-by: Linus Torvalds <torvalds at linux-foundation.org>
+Signed-off-by: Ben Hutchings <ben at decadent.org.uk>
+---
+ mm/hugetlb.c |    3 ++-
+ 1 file changed, 2 insertions(+), 1 deletion(-)
+
+diff --git a/mm/hugetlb.c b/mm/hugetlb.c
+index 285a81e..e198831 100644
+--- a/mm/hugetlb.c
++++ b/mm/hugetlb.c
+@@ -3036,7 +3036,8 @@ int hugetlb_reserve_pages(struct inode *inode,
+ 		region_add(&inode->i_mapping->private_list, from, to);
+ 	return 0;
+ out_err:
+-	resv_map_put(vma);
++	if (vma)
++		resv_map_put(vma);
+ 	return ret;
+ }
+ 
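
Editorial note on the patch above: the fix guards a shared error label so that cleanup is skipped when no VMA exists. The following is a minimal user-space sketch of that pattern, not the kernel code; struct demo_vma, demo_resv_map_put() and demo_reserve_pages() are hypothetical stand-ins.

```c
#include <stddef.h>

/* Hypothetical stand-in for a VMA holding a reservation reference. */
struct demo_vma {
	int refs;
};

/* Stand-in for resv_map_put(): dereferences its argument unconditionally. */
static void demo_resv_map_put(struct demo_vma *vma)
{
	vma->refs--;
}

/*
 * Stand-in for hugetlb_reserve_pages(). vma may be NULL, as in the
 * MAP_HUGETLB / shmget path described in the commit message.
 */
static int demo_reserve_pages(struct demo_vma *vma, long chg)
{
	int ret;

	if (chg < 0) {
		ret = (int)chg;
		goto out_err;
	}
	return 0;

out_err:
	/* Without this check a NULL vma would be dereferenced here. */
	if (vma)
		demo_resv_map_put(vma);
	return ret;
}
```

Calling demo_reserve_pages(NULL, -12) returns -12 without touching the cleanup helper, whereas the unguarded form would crash on the same input.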

Copied: dists/trunk/linux-2.6/debian/patches/bugfix/x86/mm-pmd_read_atomic-fix-32bit-pae-pmd-walk-vs-pmd_populate-smp-race.patch (from r19054, dists/sid/linux-2.6/debian/patches/bugfix/x86/mm-pmd_read_atomic-fix-32bit-pae-pmd-walk-vs-pmd_populate-smp-race.patch)
==============================================================================
--- /dev/null	00:00:00 1970	(empty, because file is newly added)
+++ dists/trunk/linux-2.6/debian/patches/bugfix/x86/mm-pmd_read_atomic-fix-32bit-pae-pmd-walk-vs-pmd_populate-smp-race.patch	Mon Jun  4 20:35:59 2012	(r19082, copy of r19054, dists/sid/linux-2.6/debian/patches/bugfix/x86/mm-pmd_read_atomic-fix-32bit-pae-pmd-walk-vs-pmd_populate-smp-race.patch)
@@ -0,0 +1,214 @@
+From: Andrea Arcangeli <aarcange at redhat.com>
+Date: Tue, 29 May 2012 15:06:49 -0700
+Subject: mm: pmd_read_atomic: fix 32bit PAE pmd walk vs pmd_populate SMP race
+ condition
+
+commit 26c191788f18129af0eb32a358cdaea0c7479626 upstream.
+
+When holding the mmap_sem for reading, pmd_offset_map_lock should only
+run on a pmd_t that has been read atomically from the pmdp pointer,
+otherwise we may read only half of it leading to this crash.
+
+PID: 11679  TASK: f06e8000  CPU: 3   COMMAND: "do_race_2_panic"
+ #0 [f06a9dd8] crash_kexec at c049b5ec
+ #1 [f06a9e2c] oops_end at c083d1c2
+ #2 [f06a9e40] no_context at c0433ded
+ #3 [f06a9e64] bad_area_nosemaphore at c043401a
+ #4 [f06a9e6c] __do_page_fault at c0434493
+ #5 [f06a9eec] do_page_fault at c083eb45
+ #6 [f06a9f04] error_code (via page_fault) at c083c5d5
+    EAX: 01fb470c EBX: fff35000 ECX: 00000003 EDX: 00000100 EBP:
+    00000000
+    DS:  007b     ESI: 9e201000 ES:  007b     EDI: 01fb4700 GS:  00e0
+    CS:  0060     EIP: c083bc14 ERR: ffffffff EFLAGS: 00010246
+ #7 [f06a9f38] _spin_lock at c083bc14
+ #8 [f06a9f44] sys_mincore at c0507b7d
+ #9 [f06a9fb0] system_call at c083becd
+                         start           len
+    EAX: ffffffda  EBX: 9e200000  ECX: 00001000  EDX: 6228537f
+    DS:  007b      ESI: 00000000  ES:  007b      EDI: 003d0f00
+    SS:  007b      ESP: 62285354  EBP: 62285388  GS:  0033
+    CS:  0073      EIP: 00291416  ERR: 000000da  EFLAGS: 00000286
+
+This should be a longstanding bug affecting x86 32bit PAE without THP.
+Only archs with 64bit large pmd_t and 32bit unsigned long should be
+affected.
+
+With THP enabled the barrier() in pmd_none_or_trans_huge_or_clear_bad()
+would partly hide the bug when the pmd transition from none to stable,
+by forcing a re-read of the *pmd in pmd_offset_map_lock, but when THP is
+enabled a new set of problem arises by the fact could then transition
+freely in any of the none, pmd_trans_huge or pmd_trans_stable states.
+So making the barrier in pmd_none_or_trans_huge_or_clear_bad()
+unconditional isn't good idea and it would be a flakey solution.
+
+This should be fully fixed by introducing a pmd_read_atomic that reads
+the pmd in order with THP disabled, or by reading the pmd atomically
+with cmpxchg8b with THP enabled.
+
+Luckily this new race condition only triggers in the places that must
+already be covered by pmd_none_or_trans_huge_or_clear_bad() so the fix
+is localized there but this bug is not related to THP.
+
+NOTE: this can trigger on x86 32bit systems with PAE enabled with more
+than 4G of ram, otherwise the high part of the pmd will never risk to be
+truncated because it would be zero at all times, in turn so hiding the
+SMP race.
+
+This bug was discovered and fully debugged by Ulrich, quote:
+
+----
+[..]
+pmd_none_or_trans_huge_or_clear_bad() loads the content of edx and
+eax.
+
+    496 static inline int pmd_none_or_trans_huge_or_clear_bad(pmd_t
+    *pmd)
+    497 {
+    498         /* depend on compiler for an atomic pmd read */
+    499         pmd_t pmdval = *pmd;
+
+                                // edi = pmd pointer
+0xc0507a74 <sys_mincore+548>:   mov    0x8(%esp),%edi
+...
+                                // edx = PTE page table high address
+0xc0507a84 <sys_mincore+564>:   mov    0x4(%edi),%edx
+...
+                                // eax = PTE page table low address
+0xc0507a8e <sys_mincore+574>:   mov    (%edi),%eax
+
+[..]
+
+Please note that the PMD is not read atomically. These are two "mov"
+instructions where the high order bits of the PMD entry are fetched
+first. Hence, the above machine code is prone to the following race.
+
+-  The PMD entry {high|low} is 0x0000000000000000.
+   The "mov" at 0xc0507a84 loads 0x00000000 into edx.
+
+-  A page fault (on another CPU) sneaks in between the two "mov"
+   instructions and instantiates the PMD.
+
+-  The PMD entry {high|low} is now 0x00000003fda38067.
+   The "mov" at 0xc0507a8e loads 0xfda38067 into eax.
+----
+
+Reported-by: Ulrich Obergfell <uobergfe at redhat.com>
+Signed-off-by: Andrea Arcangeli <aarcange at redhat.com>
+Cc: Mel Gorman <mgorman at suse.de>
+Cc: Hugh Dickins <hughd at google.com>
+Cc: Larry Woodman <lwoodman at redhat.com>
+Cc: Petr Matousek <pmatouse at redhat.com>
+Cc: Rik van Riel <riel at redhat.com>
+Signed-off-by: Andrew Morton <akpm at linux-foundation.org>
+Signed-off-by: Linus Torvalds <torvalds at linux-foundation.org>
+Signed-off-by: Ben Hutchings <ben at decadent.org.uk>
+---
+ arch/x86/include/asm/pgtable-3level.h |   50 +++++++++++++++++++++++++++++++++
+ include/asm-generic/pgtable.h         |   22 +++++++++++++--
+ 2 files changed, 70 insertions(+), 2 deletions(-)
+
+diff --git a/arch/x86/include/asm/pgtable-3level.h b/arch/x86/include/asm/pgtable-3level.h
+index effff47..43876f1 100644
+--- a/arch/x86/include/asm/pgtable-3level.h
++++ b/arch/x86/include/asm/pgtable-3level.h
+@@ -31,6 +31,56 @@ static inline void native_set_pte(pte_t *ptep, pte_t pte)
+ 	ptep->pte_low = pte.pte_low;
+ }
+ 
++#define pmd_read_atomic pmd_read_atomic
++/*
++ * pte_offset_map_lock on 32bit PAE kernels was reading the pmd_t with
++ * a "*pmdp" dereference done by gcc. Problem is, in certain places
++ * where pte_offset_map_lock is called, concurrent page faults are
++ * allowed, if the mmap_sem is hold for reading. An example is mincore
++ * vs page faults vs MADV_DONTNEED. On the page fault side
++ * pmd_populate rightfully does a set_64bit, but if we're reading the
++ * pmd_t with a "*pmdp" on the mincore side, a SMP race can happen
++ * because gcc will not read the 64bit of the pmd atomically. To fix
++ * this all places running pmd_offset_map_lock() while holding the
++ * mmap_sem in read mode, shall read the pmdp pointer using this
++ * function to know if the pmd is null nor not, and in turn to know if
++ * they can run pmd_offset_map_lock or pmd_trans_huge or other pmd
++ * operations.
++ *
++ * Without THP if the mmap_sem is hold for reading, the
++ * pmd can only transition from null to not null while pmd_read_atomic runs.
++ * So there's no need of literally reading it atomically.
++ *
++ * With THP if the mmap_sem is hold for reading, the pmd can become
++ * THP or null or point to a pte (and in turn become "stable") at any
++ * time under pmd_read_atomic, so it's mandatory to read it atomically
++ * with cmpxchg8b.
++ */
++#ifndef CONFIG_TRANSPARENT_HUGEPAGE
++static inline pmd_t pmd_read_atomic(pmd_t *pmdp)
++{
++	pmdval_t ret;
++	u32 *tmp = (u32 *)pmdp;
++
++	ret = (pmdval_t) (*tmp);
++	if (ret) {
++		/*
++		 * If the low part is null, we must not read the high part
++		 * or we can end up with a partial pmd.
++		 */
++		smp_rmb();
++		ret |= ((pmdval_t)*(tmp + 1)) << 32;
++	}
++
++	return (pmd_t) { ret };
++}
++#else /* CONFIG_TRANSPARENT_HUGEPAGE */
++static inline pmd_t pmd_read_atomic(pmd_t *pmdp)
++{
++	return (pmd_t) { atomic64_read((atomic64_t *)pmdp) };
++}
++#endif /* CONFIG_TRANSPARENT_HUGEPAGE */
++
+ static inline void native_set_pte_atomic(pte_t *ptep, pte_t pte)
+ {
+ 	set_64bit((unsigned long long *)(ptep), native_pte_val(pte));
+diff --git a/include/asm-generic/pgtable.h b/include/asm-generic/pgtable.h
+index e2768f1..6f2b45a 100644
+--- a/include/asm-generic/pgtable.h
++++ b/include/asm-generic/pgtable.h
+@@ -445,6 +445,18 @@ static inline int pmd_write(pmd_t pmd)
+ #endif /* __HAVE_ARCH_PMD_WRITE */
+ #endif /* CONFIG_TRANSPARENT_HUGEPAGE */
+ 
++#ifndef pmd_read_atomic
++static inline pmd_t pmd_read_atomic(pmd_t *pmdp)
++{
++	/*
++	 * Depend on compiler for an atomic pmd read. NOTE: this is
++	 * only going to work, if the pmdval_t isn't larger than
++	 * an unsigned long.
++	 */
++	return *pmdp;
++}
++#endif
++
+ /*
+  * This function is meant to be used by sites walking pagetables with
+  * the mmap_sem hold in read mode to protect against MADV_DONTNEED and
+@@ -458,11 +470,17 @@ static inline int pmd_write(pmd_t pmd)
+  * undefined so behaving like if the pmd was none is safe (because it
+  * can return none anyway). The compiler level barrier() is critically
+  * important to compute the two checks atomically on the same pmdval.
++ *
++ * For 32bit kernels with a 64bit large pmd_t this automatically takes
++ * care of reading the pmd atomically to avoid SMP race conditions
++ * against pmd_populate() when the mmap_sem is hold for reading by the
++ * caller (a special atomic read not done by "gcc" as in the generic
++ * version above, is also needed when THP is disabled because the page
++ * fault can populate the pmd from under us).
+  */
+ static inline int pmd_none_or_trans_huge_or_clear_bad(pmd_t *pmd)
+ {
+-	/* depend on compiler for an atomic pmd read */
+-	pmd_t pmdval = *pmd;
++	pmd_t pmdval = pmd_read_atomic(pmd);
+ 	/*
+ 	 * The barrier will stabilize the pmdval in a register or on
+ 	 * the stack so that it will stop changing under the code.
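
Editorial note on the patch above: the race in the commit message can be modelled deterministically in user space. This is only a sketch under simplifying assumptions: the "other CPU" is a callback invoked between the two 32-bit loads, the entry only moves from zero to populated, and the memory barrier is omitted because the model is single-threaded. All demo_* names are hypothetical.

```c
#include <stdint.h>

/* A 64-bit entry stored as two 32-bit halves, as on 32-bit PAE. */
struct demo_pmd {
	uint32_t low;
	uint32_t high;
};

/* Hook standing in for another CPU populating the entry mid-read. */
typedef void (*demo_racer_fn)(struct demo_pmd *);

/* Populate with the value quoted in the commit message. */
static void demo_populate(struct demo_pmd *p)
{
	p->high = 0x00000003u;
	p->low = 0xfda38067u;
}

/* Naive read: high half first, then low, like the quoted disassembly. */
static uint64_t demo_read_naive(struct demo_pmd *p, demo_racer_fn racer)
{
	uint32_t high = p->high;
	uint32_t low;

	if (racer)
		racer(p);	/* the other CPU wins the race here */
	low = p->low;
	return ((uint64_t)high << 32) | low;
}

/* Ordered read modelled on the non-THP pmd_read_atomic() in the patch. */
static uint64_t demo_read_ordered(struct demo_pmd *p, demo_racer_fn racer)
{
	uint64_t ret = p->low;

	if (racer)
		racer(p);
	if (ret) {
		/* The kernel version issues smp_rmb() before this load. */
		ret |= (uint64_t)p->high << 32;
	}
	return ret;
}
```

Starting from a zero entry, the naive read combined with demo_populate yields a torn value whose high half is missing, while the ordered read yields either zero or the complete value in this model.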

Modified: dists/trunk/linux-2.6/debian/patches/series/base
==============================================================================
--- dists/trunk/linux-2.6/debian/patches/series/base	Mon Jun  4 20:11:45 2012	(r19081)
+++ dists/trunk/linux-2.6/debian/patches/series/base	Mon Jun  4 20:35:59 2012	(r19082)
@@ -60,3 +60,36 @@
 + features/all/wacom/0026-Input-wacom-return-proper-error-if-usb_get_extra_des.patch
 
 + bugfix/all/acpi-battery-only-refresh-the-sysfs-files-when-pertinent.patch
+
+# Update be2net driver to 3.5ish
++ features/all/be2net/0043-be2net-fix-ethtool-get-settings.patch
++ features/all/be2net/0044-be2net-Fix-VLAN-multicast-packet-reception.patch
++ features/all/be2net/0045-be2net-Fix-FW-download-in-Lancer.patch
++ features/all/be2net/0046-be2net-Fix-ethtool-self-test-for-Lancer.patch
++ features/all/be2net/0047-be2net-Fix-traffic-stall-INTx-mode.patch
++ features/all/be2net/0048-be2net-Fix-Lancer-statistics.patch
++ features/all/be2net/0049-be2net-Fix-wrong-status-getting-returned-for-MCC-com.patch
++ features/all/be2net/0050-be2net-Fix-FW-download-for-BE.patch
++ features/all/be2net/0051-be2net-Ignore-status-of-some-ioctls-during-driver-lo.patch
++ features/all/be2net/0052-be2net-fix-speed-displayed-by-ethtool-on-certain-SKU.patch
++ features/all/be2net/0053-be2net-update-the-driver-version.patch
++ features/all/be2net/0054-be2net-Fix-to-not-set-link-speed-for-disabled-functi.patch
++ features/all/be2net/0055-be2net-Fix-to-apply-duplex-value-as-unknown-when-lin.patch
++ features/all/be2net/0056-be2net-Record-receive-queue-index-in-skb-to-aid-RPS.patch
++ features/all/be2net/0057-be2net-Fix-EEH-error-reset-before-a-flash-dump-compl.patch
++ features/all/be2net/0058-be2net-avoid-disabling-sriov-while-VFs-are-assigned.patch
+
+# Add CoDel from 3.5, and prerequisites
++ features/all/codel/0001-codel-Controlled-Delay-AQM.patch
++ features/all/codel/0002-codel-use-Newton-method-instead-of-sqrt-and-divides.patch
++ features/all/codel/0003-fq_codel-Fair-Queue-Codel-AQM.patch
++ features/all/codel/0004-net-codel-Add-missing-include-linux-prefetch.h.patch
++ features/all/codel/0005-net-codel-fix-build-errors.patch
++ features/all/codel/0006-codel-use-u16-field-instead-of-31bits-for-rec_inv_sq.patch
++ features/all/codel/0007-fq_codel-should-use-qdisc-backlog-as-threshold.patch
+
++ bugfix/x86/mm-pmd_read_atomic-fix-32bit-pae-pmd-walk-vs-pmd_populate-smp-race.patch
++ bugfix/all/hugetlb-fix-resv_map-leak-in-error-path.patch
++ bugfix/all/mm-fix-vma_resv_map-null-pointer.patch
+
++ bugfix/all/fix-scsi_wait_scan.patch


