[kernel] r12486 - in dists/etch-security/linux-2.6/debian: . patches/bugfix patches/series

Dann Frazier dannf at alioth.debian.org
Sun Dec 7 22:28:49 UTC 2008


Author: dannf
Date: Sun Dec  7 22:28:47 2008
New Revision: 12486

Log:
* Fix race conditions between inotify removal and umount
   - bugfix/inotify-watch-removal-umount-races.patch
  See CVE-2008-5182

Added:
   dists/etch-security/linux-2.6/debian/patches/bugfix/inotify-watch-removal-umount-races.patch
Modified:
   dists/etch-security/linux-2.6/debian/changelog
   dists/etch-security/linux-2.6/debian/patches/series/23etch1

Modified: dists/etch-security/linux-2.6/debian/changelog
==============================================================================
--- dists/etch-security/linux-2.6/debian/changelog	(original)
+++ dists/etch-security/linux-2.6/debian/changelog	Sun Dec  7 22:28:47 2008
@@ -44,8 +44,11 @@
     /proc/net/atm/*vc:
      - bugfix/atm-duplicate-listen-on-socket-corrupts-the-vcc-table.patch
     See CVE-2008-5079
+  * Fix race conditions between inotify removal and umount
+     - bugfix/inotify-watch-removal-umount-races.patch
+    See CVE-2008-5182
 
- -- dann frazier <dannf at debian.org>  Sat, 06 Dec 2008 10:29:25 -0700
+ -- dann frazier <dannf at debian.org>  Sat, 06 Dec 2008 15:40:30 -0700
 
 linux-2.6 (2.6.18.dfsg.1-23) stable; urgency=high
 

Added: dists/etch-security/linux-2.6/debian/patches/bugfix/inotify-watch-removal-umount-races.patch
==============================================================================
--- (empty file)
+++ dists/etch-security/linux-2.6/debian/patches/bugfix/inotify-watch-removal-umount-races.patch	Sun Dec  7 22:28:47 2008
@@ -0,0 +1,346 @@
+commit 8f7b0ba1c853919b85b54774775f567f30006107
+Author: Al Viro <viro at ZenIV.linux.org.uk>
+Date:   Sat Nov 15 01:15:43 2008 +0000
+
+    Fix inotify watch removal/umount races
+    
+    Inotify watch removals suck violently.
+    
+    To kick the watch out we need (in this order) inode->inotify_mutex and
+    ih->mutex.  That's fine if we have a hold on inode; however, for all
+    other cases we need to make damn sure we don't race with umount.  We can
+    *NOT* just grab a reference to a watch - inotify_unmount_inodes() will
+    happily sail past it and we'll end with reference to inode potentially
+    outliving its superblock.
+    
+    Ideally we just want to grab an active reference to superblock if we
+    can; that will make sure we won't go into inotify_umount_inodes() until
+    we are done.  Cleanup is just deactivate_super().
+    
+    However, that leaves a messy case - what if we *are* racing with
+    umount() and active references to superblock can't be acquired anymore?
+    We can bump ->s_count, grab ->s_umount, which will almost certainly wait
+    until the superblock is shut down and the watch in question is pining
+    for fjords.  That's fine, but there is a problem - we might have hit the
+    window between ->s_active getting to 0 / ->s_count - below S_BIAS (i.e.
+    the moment when superblock is past the point of no return and is heading
+    for shutdown) and the moment when deactivate_super() acquires
+    ->s_umount.
+    
+    We could just do drop_super() yield() and retry, but that's rather
+    antisocial and this stuff is luser-triggerable.  OTOH, having grabbed
+    ->s_umount and having found that we'd got there first (i.e.  that
+    ->s_root is non-NULL) we know that we won't race with
+    inotify_umount_inodes().
+    
+    So we could grab a reference to watch and do the rest as above, just
+    with drop_super() instead of deactivate_super(), right? Wrong.  We had
+    to drop ih->mutex before we could grab ->s_umount.  So the watch
+    could've been gone already.
+    
+    That still can be dealt with - we need to save watch->wd, do idr_find()
+    and compare its result with our pointer.  If they match, we either have
+    the damn thing still alive or we'd lost not one but two races at once,
+    the watch had been killed and a new one got created with the same ->wd
+    at the same address.  That couldn't have happened in inotify_destroy(),
+    but inotify_rm_wd() could run into that.  Still, "new one got created"
+    is not a problem - we have every right to kill it or leave it alone,
+    whatever's more convenient.
+    
+    So we can use idr_find(...) == watch && watch->inode->i_sb == sb as
+    "grab it and kill it" check.  If it's been our original watch, we are
+    fine, if it's a newcomer - nevermind, just pretend that we'd won the
+    race and kill the fscker anyway; we are safe since we know that its
+    superblock won't be going away.
+    
+    And yes, this is far beyond mere "not very pretty"; so's the entire
+    concept of inotify to start with.
+    
+    Signed-off-by: Al Viro <viro at zeniv.linux.org.uk>
+    Acked-by: Greg KH <greg at kroah.com>
+    Signed-off-by: Linus Torvalds <torvalds at linux-foundation.org>
+
+Backported to Debian's 2.6.18 by dann frazier <dannf at debian.org>
+
+diff -urpN linux-source-2.6.18.orig/fs/inotify.c linux-source-2.6.18/fs/inotify.c
+--- linux-source-2.6.18.orig/fs/inotify.c	2006-09-19 21:42:06.000000000 -0600
++++ linux-source-2.6.18/fs/inotify.c	2008-12-06 14:32:56.000000000 -0700
+@@ -105,6 +105,20 @@ void get_inotify_watch(struct inotify_wa
+ }
+ EXPORT_SYMBOL_GPL(get_inotify_watch);
+ 
++int pin_inotify_watch(struct inotify_watch *watch)
++{
++	struct super_block *sb = watch->inode->i_sb;
++	spin_lock(&sb_lock);
++	if (sb->s_count >= S_BIAS) {
++		atomic_inc(&sb->s_active);
++		spin_unlock(&sb_lock);
++		atomic_inc(&watch->count);
++		return 1;
++	}
++	spin_unlock(&sb_lock);
++	return 0;
++}
++
+ /**
+  * put_inotify_watch - decrements the ref count on a given watch.  cleans up
+  * watch references if the count reaches zero.  inotify_watch is freed by
+@@ -123,6 +137,13 @@ void put_inotify_watch(struct inotify_wa
+ }
+ EXPORT_SYMBOL_GPL(put_inotify_watch);
+ 
++void unpin_inotify_watch(struct inotify_watch *watch)
++{
++	struct super_block *sb = watch->inode->i_sb;
++	put_inotify_watch(watch);
++	deactivate_super(sb);
++}
++
+ /*
+  * inotify_handle_get_wd - returns the next WD for use by the given handle
+  *
+@@ -485,6 +506,112 @@ void inotify_init_watch(struct inotify_w
+ }
+ EXPORT_SYMBOL_GPL(inotify_init_watch);
+ 
++/*
++ * Watch removals suck violently.  To kick the watch out we need (in this
++ * order) inode->inotify_mutex and ih->mutex.  That's fine if we have
++ * a hold on inode; however, for all other cases we need to make damn sure
++ * we don't race with umount.  We can *NOT* just grab a reference to a
++ * watch - inotify_unmount_inodes() will happily sail past it and we'll end
++ * with reference to inode potentially outliving its superblock.  Ideally
++ * we just want to grab an active reference to superblock if we can; that
++ * will make sure we won't go into inotify_umount_inodes() until we are
++ * done.  Cleanup is just deactivate_super().  However, that leaves a messy
++ * case - what if we *are* racing with umount() and active references to
++ * superblock can't be acquired anymore?  We can bump ->s_count, grab
++ * ->s_umount, which will almost certainly wait until the superblock is shut
++ * down and the watch in question is pining for fjords.  That's fine, but
++ * there is a problem - we might have hit the window between ->s_active
++ * getting to 0 / ->s_count - below S_BIAS (i.e. the moment when superblock
++ * is past the point of no return and is heading for shutdown) and the
++ * moment when deactivate_super() acquires ->s_umount.  We could just do
++ * drop_super() yield() and retry, but that's rather antisocial and this
++ * stuff is luser-triggerable.  OTOH, having grabbed ->s_umount and having
++ * found that we'd got there first (i.e. that ->s_root is non-NULL) we know
++ * that we won't race with inotify_umount_inodes().  So we could grab a
++ * reference to watch and do the rest as above, just with drop_super() instead
++ * of deactivate_super(), right?  Wrong.  We had to drop ih->mutex before we
++ * could grab ->s_umount.  So the watch could've been gone already.
++ *
++ * That still can be dealt with - we need to save watch->wd, do idr_find()
++ * and compare its result with our pointer.  If they match, we either have
++ * the damn thing still alive or we'd lost not one but two races at once,
++ * the watch had been killed and a new one got created with the same ->wd
++ * at the same address.  That couldn't have happened in inotify_destroy(),
++ * but inotify_rm_wd() could run into that.  Still, "new one got created"
++ * is not a problem - we have every right to kill it or leave it alone,
++ * whatever's more convenient.
++ *
++ * So we can use idr_find(...) == watch && watch->inode->i_sb == sb as
++ * "grab it and kill it" check.  If it's been our original watch, we are
++ * fine, if it's a newcomer - nevermind, just pretend that we'd won the
++ * race and kill the fscker anyway; we are safe since we know that its
++ * superblock won't be going away.
++ *
++ * And yes, this is far beyond mere "not very pretty"; so's the entire
++ * concept of inotify to start with.
++ */
++
++/**
++ * pin_to_kill - pin the watch down for removal
++ * @ih: inotify handle
++ * @watch: watch to kill
++ *
++ * Called with ih->mutex held, drops it.  Possible return values:
++ * 0 - nothing to do, it has died
++ * 1 - remove it, drop the reference and deactivate_super()
++ * 2 - remove it, drop the reference and drop_super(); we tried hard to avoid
++ * that variant, since it involved a lot of PITA, but that's the best that
++ * could've been done.
++ */
++static int pin_to_kill(struct inotify_handle *ih, struct inotify_watch *watch)
++{
++	struct super_block *sb = watch->inode->i_sb;
++	s32 wd = watch->wd;
++
++	spin_lock(&sb_lock);
++	if (sb->s_count >= S_BIAS) {
++		atomic_inc(&sb->s_active);
++		spin_unlock(&sb_lock);
++		get_inotify_watch(watch);
++		mutex_unlock(&ih->mutex);
++		return 1;	/* the best outcome */
++	}
++	sb->s_count++;
++	spin_unlock(&sb_lock);
++	mutex_unlock(&ih->mutex); /* can't grab ->s_umount under it */
++	down_read(&sb->s_umount);
++	if (likely(!sb->s_root)) {
++		/* fs is already shut down; the watch is dead */
++		drop_super(sb);
++		return 0;
++	}
++	/* raced with the final deactivate_super() */
++	mutex_lock(&ih->mutex);
++	if (idr_find(&ih->idr, wd) != watch || watch->inode->i_sb != sb) {
++		/* the watch is dead */
++		mutex_unlock(&ih->mutex);
++		drop_super(sb);
++		return 0;
++	}
++	/* still alive or freed and reused with the same sb and wd; kill */
++	get_inotify_watch(watch);
++	mutex_unlock(&ih->mutex);
++	return 2;
++}
++
++static void unpin_and_kill(struct inotify_watch *watch, int how)
++{
++	struct super_block *sb = watch->inode->i_sb;
++	put_inotify_watch(watch);
++	switch (how) {
++	case 1:
++		deactivate_super(sb);
++		break;
++	case 2:
++		drop_super(sb);
++	}
++}
++
+ /**
+  * inotify_destroy - clean up and destroy an inotify instance
+  * @ih: inotify handle
+@@ -496,11 +623,15 @@ void inotify_destroy(struct inotify_hand
+ 	 * pretty.  We cannot do a simple iteration over the list, because we
+ 	 * do not know the inode until we iterate to the watch.  But we need to
+ 	 * hold inode->inotify_mutex before ih->mutex.  The following works.
++	 *
++	 * AV: it had to become even uglier to start working ;-/
+ 	 */
+ 	while (1) {
+ 		struct inotify_watch *watch;
+ 		struct list_head *watches;
++		struct super_block *sb;
+ 		struct inode *inode;
++		int how;
+ 
+ 		mutex_lock(&ih->mutex);
+ 		watches = &ih->watches;
+@@ -509,8 +640,10 @@ void inotify_destroy(struct inotify_hand
+ 			break;
+ 		}
+ 		watch = list_entry(watches->next, struct inotify_watch, h_list);
+-		get_inotify_watch(watch);
+-		mutex_unlock(&ih->mutex);
++		sb = watch->inode->i_sb;
++		how = pin_to_kill(ih, watch);
++		if (!how)
++			continue;
+ 
+ 		inode = watch->inode;
+ 		mutex_lock(&inode->inotify_mutex);
+@@ -524,7 +657,7 @@ void inotify_destroy(struct inotify_hand
+ 
+ 		mutex_unlock(&ih->mutex);
+ 		mutex_unlock(&inode->inotify_mutex);
+-		put_inotify_watch(watch);
++		unpin_and_kill(watch, how);
+ 	}
+ 
+ 	/* free this handle: the put matching the get in inotify_init() */
+@@ -675,7 +808,9 @@ EXPORT_SYMBOL_GPL(inotify_add_watch);
+ int inotify_rm_wd(struct inotify_handle *ih, u32 wd)
+ {
+ 	struct inotify_watch *watch;
++	struct super_block *sb;
+ 	struct inode *inode;
++	int how;
+ 
+ 	mutex_lock(&ih->mutex);
+ 	watch = idr_find(&ih->idr, wd);
+@@ -683,9 +818,12 @@ int inotify_rm_wd(struct inotify_handle 
+ 		mutex_unlock(&ih->mutex);
+ 		return -EINVAL;
+ 	}
+-	get_inotify_watch(watch);
++	sb = watch->inode->i_sb;
++	how = pin_to_kill(ih, watch);
++	if (!how)
++		return 0;
++
+ 	inode = watch->inode;
+-	mutex_unlock(&ih->mutex);
+ 
+ 	mutex_lock(&inode->inotify_mutex);
+ 	mutex_lock(&ih->mutex);
+@@ -696,7 +834,7 @@ int inotify_rm_wd(struct inotify_handle 
+ 
+ 	mutex_unlock(&ih->mutex);
+ 	mutex_unlock(&inode->inotify_mutex);
+-	put_inotify_watch(watch);
++	unpin_and_kill(watch, how);
+ 
+ 	return 0;
+ }
+diff -urpN linux-source-2.6.18.orig/include/linux/inotify.h linux-source-2.6.18/include/linux/inotify.h
+--- linux-source-2.6.18.orig/include/linux/inotify.h	2006-09-19 21:42:06.000000000 -0600
++++ linux-source-2.6.18/include/linux/inotify.h	2008-12-06 14:29:21.000000000 -0700
+@@ -126,6 +126,8 @@ extern void inotify_remove_watch_locked(
+ 					struct inotify_watch *);
+ extern void get_inotify_watch(struct inotify_watch *);
+ extern void put_inotify_watch(struct inotify_watch *);
++extern int pin_inotify_watch(struct inotify_watch *);
++extern void unpin_inotify_watch(struct inotify_watch *);
+ 
+ #else
+ 
+@@ -220,6 +222,15 @@ static inline void put_inotify_watch(str
+ {
+ }
+ 
++extern inline int pin_inotify_watch(struct inotify_watch *watch)
++{
++	return 0;
++}
++
++extern inline void unpin_inotify_watch(struct inotify_watch *watch)
++{
++}
++
+ #endif	/* CONFIG_INOTIFY */
+ 
+ #endif	/* __KERNEL __ */
+diff -urpN linux-source-2.6.18.orig/kernel/auditfilter.c linux-source-2.6.18/kernel/auditfilter.c
+--- linux-source-2.6.18.orig/kernel/auditfilter.c	2006-09-19 21:42:06.000000000 -0600
++++ linux-source-2.6.18/kernel/auditfilter.c	2008-12-06 14:30:47.000000000 -0700
+@@ -991,8 +991,8 @@ static void audit_inotify_unregister(str
+ 	list_for_each_entry_safe(p, n, in_list, ilist) {
+ 		list_del(&p->ilist);
+ 		inotify_rm_watch(audit_ih, &p->wdata);
+-		/* the put matching the get in audit_do_del_rule() */
+-		put_inotify_watch(&p->wdata);
++		/* the unpin matching the pin in audit_do_del_rule() */
++		unpin_inotify_watch(&p->wdata);
+ 	}
+ }
+ 
+@@ -1274,9 +1274,13 @@ static inline int audit_del_rule(struct 
+ 				/* Put parent on the inotify un-registration
+ 				 * list.  Grab a reference before releasing
+ 				 * audit_filter_mutex, to be released in
+-				 * audit_inotify_unregister(). */
+-				list_add(&parent->ilist, &inotify_list);
+-				get_inotify_watch(&parent->wdata);
++				 * audit_inotify_unregister().
++				 * If filesystem is going away, just leave
++				 * the sucker alone, eviction will take
++				 * care of it.
++				 */
++				if (pin_inotify_watch(&parent->wdata))
++					list_add(&parent->ilist, &inotify_list);
+ 			}
+ 		}
+ 	}

Modified: dists/etch-security/linux-2.6/debian/patches/series/23etch1
==============================================================================
--- dists/etch-security/linux-2.6/debian/patches/series/23etch1	(original)
+++ dists/etch-security/linux-2.6/debian/patches/series/23etch1	Sun Dec  7 22:28:47 2008
@@ -14,3 +14,4 @@
 + bugfix/net-fix-recursive-descent-in-__scm_destroy.patch
 + bugfix/net-unix-gc-fix-soft-lockups-oom-issues.patch
 + bugfix/atm-duplicate-listen-on-socket-corrupts-the-vcc-table.patch
++ bugfix/inotify-watch-removal-umount-races.patch



More information about the Kernel-svn-changes mailing list