Bug#587550: mdadm: "--manage --remove faulty" does not remove all faulty disks

Neil Brown neilb at suse.de
Wed Jun 30 07:23:35 UTC 2010


On Tue, 29 Jun 2010 14:24:07 -0400
Jim Paris <jim at jtan.com> wrote:

> My guess is that in Manage.c:Manage_subdevs, the loops like
>     for (; j < array.raid_disks + array.nr_disks ; j++) {
> are missing disks because the disk numbers are changing as they are
> removed, but I didn't have the time to follow the code in detail.
> 
>

Thanks for the report.
Good guess.

This is the patch that I have just checked in to mdadm.

Thanks,
NeilBrown


commit b3b4e8a7a229cccca915421329a5319f996b0842
Author: NeilBrown <neilb at suse.de>
Date:   Wed Jun 30 17:20:38 2010 +1000

    Avoid skipping devices where removing all faulty/detached devices.
    
    When using 0.90 metadata, devices can be renumbered when
    earlier devices are removed.
    So when iterating all devices looking for 'failed' or 'detached'
    devices, we need to re-check the same slot we checked last time
    to see if maybe it has a different device now.
    
    Reported-by: Jim Paris <jim at jtan.com>
    Resolves-Debian-Bug: 587550
    Signed-off-by: NeilBrown <neilb at suse.de>

diff --git a/Manage.c b/Manage.c
index 6bc5d0a..edf41e9 100644
--- a/Manage.c
+++ b/Manage.c
@@ -376,6 +376,7 @@ int Manage_subdevs(char *devname, int fd,
 		return 1;
 	}
 
+	stb.st_rdev = 0;
 	for (dv = devlist, j=0 ; dv; dv = next, j = jnext) {
 		unsigned long long ldsize;
 		char dvname[20];
@@ -394,6 +395,7 @@ int Manage_subdevs(char *devname, int fd,
 				return 1;
 			}
 			for (; j < array.raid_disks + array.nr_disks ; j++) {
+				int dev;
 				disc.number = j;
 				if (ioctl(fd, GET_DISK_INFO, &disc))
 					continue;
@@ -401,9 +403,15 @@ int Manage_subdevs(char *devname, int fd,
 					continue;
 				if ((disc.state & 1) == 0) /* faulty */
 					continue;
-				stb.st_rdev = makedev(disc.major, disc.minor);
+				dev = makedev(disc.major, disc.minor);
+				if (stb.st_rdev == dev)
+					/* already did that one */
+					continue;
+				stb.st_rdev = dev;
 				next = dv;
-				jnext = j+1;
+				/* same slot again next time - things might
+				 * have reshuffled */
+				jnext = j;
 				sprintf(dvname,"%d:%d", disc.major, disc.minor);
 				dnprintable = dvname;
 				break;
@@ -419,6 +427,7 @@ int Manage_subdevs(char *devname, int fd,
 			}
 			for (; j < array.raid_disks + array.nr_disks; j++) {
 				int sfd;
+				int dev;
 				disc.number = j;
 				if (ioctl(fd, GET_DISK_INFO, &disc))
 					continue;
@@ -435,9 +444,15 @@ int Manage_subdevs(char *devname, int fd,
 					continue;
 				if (errno != ENXIO)
 					continue;
-				stb.st_rdev = makedev(disc.major, disc.minor);
+				dev = makedev(disc.major, disc.minor);
+				if (stb.st_rdev == dev)
+					/* already did that one */
+					continue;
+				stb.st_rdev = dev;
 				next = dv;
-				jnext = j+1;
+				/* same slot again next time - things might
+				 * have reshuffled */
+				jnext = j;
 				dnprintable = dvname;
 				break;
 			}





More information about the pkg-mdadm-devel mailing list