[kernel] r14318 - in dists/lenny/linux-2.6/debian: . patches/bugfix/all patches/series

Ben Hutchings benh at alioth.debian.org
Sun Oct 4 04:00:43 UTC 2009


Author: benh
Date: Sun Oct  4 04:00:33 2009
New Revision: 14318

Log:
dm-snap: Fix crash when using both snapshot and origin volumes (Closes: #545999)

Added:
   dists/lenny/linux-2.6/debian/patches/bugfix/all/dm-snapshot-fix-primary_pe-race.patch
Modified:
   dists/lenny/linux-2.6/debian/changelog
   dists/lenny/linux-2.6/debian/patches/series/20

Modified: dists/lenny/linux-2.6/debian/changelog
==============================================================================
--- dists/lenny/linux-2.6/debian/changelog	Sun Oct  4 03:51:06 2009	(r14317)
+++ dists/lenny/linux-2.6/debian/changelog	Sun Oct  4 04:00:33 2009	(r14318)
@@ -9,6 +9,8 @@
     thanks to Helge Deller <deller at gmx.de>
   * virtio_balloon: Fix towards_target when deflating balloon
     (Closes: #544619)
+  * dm-snap: Fix crash when using both snapshot and origin volumes
+    (Closes: #545999)
 
   [ dann frazier ]
   * autofs4: don't make expiring dentry negative, avoiding an oops

Added: dists/lenny/linux-2.6/debian/patches/bugfix/all/dm-snapshot-fix-primary_pe-race.patch
==============================================================================
--- /dev/null	00:00:00 1970	(empty, because file is newly added)
+++ dists/lenny/linux-2.6/debian/patches/bugfix/all/dm-snapshot-fix-primary_pe-race.patch	Sun Oct  4 04:00:33 2009	(r14318)
@@ -0,0 +1,107 @@
+From 7c5f78b9d7f21937e46c26db82976df4b459c95c Mon Sep 17 00:00:00 2001
+From: Mikulas Patocka <mpatocka at redhat.com>
+Date: Tue, 21 Oct 2008 17:44:51 +0100
+Subject: [PATCH] dm snapshot: fix primary_pe race
+
+Fix a race condition with primary_pe ref_count handling.
+
+put_pending_exception runs under dm_snapshot->lock, it does atomic_dec_and_test
+on primary_pe->ref_count, and later does atomic_read primary_pe->ref_count.
+
+__origin_write does atomic_dec_and_test on primary_pe->ref_count without holding
+dm_snapshot->lock.
+
+This opens the following race condition:
+Assume two CPUs, CPU1 is executing put_pending_exception (and holding
+dm_snapshot->lock). CPU2 is executing __origin_write in parallel.
+primary_pe->ref_count == 2.
+
+CPU1:
+if (primary_pe && atomic_dec_and_test(&primary_pe->ref_count))
+	origin_bios = bio_list_get(&primary_pe->origin_bios);
+... decrements primary_pe->ref_count to 1. Doesn't load origin_bios
+
+CPU2:
+if (first && atomic_dec_and_test(&primary_pe->ref_count)) {
+	flush_bios(bio_list_get(&primary_pe->origin_bios));
+	free_pending_exception(primary_pe);
+	/* If we got here, pe_queue is necessarily empty. */
+	return r;
+}
+... decrements primary_pe->ref_count to 0, submits pending bios, frees
+primary_pe.
+
+CPU1:
+if (!primary_pe || primary_pe != pe)
+	free_pending_exception(pe);
+... this has no effect.
+if (primary_pe && !atomic_read(&primary_pe->ref_count))
+	free_pending_exception(primary_pe);
+... sees ref_count == 0 (written by CPU 2), does double free !!
+
+This bug can happen only if someone is simultaneously writing to both the
+origin and the snapshot.
+
+If someone is writing only to the origin, __origin_write will submit kcopyd
+request after it decrements primary_pe->ref_count (so it can't happen that the
+finished copy races with primary_pe->ref_count decrementation).
+
+If someone is writing only to the snapshot, __origin_write isn't invoked at all
+and the race can't happen.
+
+The race happens when someone writes to the snapshot --- this creates
+pending_exception with primary_pe == NULL and starts copying. Then, someone
+writes to the same chunk in the snapshot, and __origin_write races with
+termination of already submitted request in pending_complete (that calls
+put_pending_exception).
+
+This race may be reason for bugs:
+  http://bugzilla.kernel.org/show_bug.cgi?id=11636
+  https://bugzilla.redhat.com/show_bug.cgi?id=465825
+
+The patch fixes the code to make sure that:
+1. If atomic_dec_and_test(&primary_pe->ref_count) returns false, the process
+must no longer dereference primary_pe (because someone else may free it under
+us).
+2. If atomic_dec_and_test(&primary_pe->ref_count) returns true, the process
+is responsible for freeing primary_pe.
+
+Signed-off-by: Mikulas Patocka <mpatocka at redhat.com>
+Signed-off-by: Alasdair G Kergon <agk at redhat.com>
+Cc: stable at kernel.org
+---
+ drivers/md/dm-snap.c |   10 +++-------
+ 1 files changed, 3 insertions(+), 7 deletions(-)
+
+diff --git a/drivers/md/dm-snap.c b/drivers/md/dm-snap.c
+index 6e5528a..4ed9b7a 100644
+--- a/drivers/md/dm-snap.c
++++ b/drivers/md/dm-snap.c
+@@ -824,8 +824,10 @@ static struct bio *put_pending_exception(struct dm_snap_pending_exception *pe)
+ 	 * the bios for the original write to the origin.
+ 	 */
+ 	if (primary_pe &&
+-	    atomic_dec_and_test(&primary_pe->ref_count))
++	    atomic_dec_and_test(&primary_pe->ref_count)) {
+ 		origin_bios = bio_list_get(&primary_pe->origin_bios);
++		free_pending_exception(primary_pe);
++	}
+ 
+ 	/*
+ 	 * Free the pe if it's not linked to an origin write or if
+@@ -834,12 +836,6 @@ static struct bio *put_pending_exception(struct dm_snap_pending_exception *pe)
+ 	if (!primary_pe || primary_pe != pe)
+ 		free_pending_exception(pe);
+ 
+-	/*
+-	 * Free the primary pe if nothing references it.
+-	 */
+-	if (primary_pe && !atomic_read(&primary_pe->ref_count))
+-		free_pending_exception(primary_pe);
+-
+ 	return origin_bios;
+ }
+ 
+-- 
+1.6.4.3
+

Modified: dists/lenny/linux-2.6/debian/patches/series/20
==============================================================================
--- dists/lenny/linux-2.6/debian/patches/series/20	Sun Oct  4 03:51:06 2009	(r14317)
+++ dists/lenny/linux-2.6/debian/patches/series/20	Sun Oct  4 04:00:33 2009	(r14318)
@@ -3,4 +3,5 @@
 + bugfix/mips/ip22-no-early-printk.patch
 + bugfix/parisc/ensure-tlb-purge-runs-single-threaded.patch
 + bugfix/all/virtio_balloon-fix-towards_target-when-deflating.patch
++ bugfix/all/dm-snapshot-fix-primary_pe-race.patch
 + features/all/ftdi_sio-openrd.patch



More information about the Kernel-svn-changes mailing list