[Parted-commits] GNU Parted Official Repository: Changes to 'next'

Jim Meyering meyering at alioth.debian.org
Fri Aug 28 17:02:05 UTC 2009


 libparted/arch/linux.c                |  231 ++--------------------------------
 libparted/labels/pt-tools.c           |    2 
 tests/lvm-utils.sh                    |    4 
 tests/t4100-dvh-partition-limits.sh   |    5 
 tests/t4100-msdos-partition-limits.sh |    5 
 tests/t6000-dm.sh                     |   11 +
 tests/t8000-loop.sh                   |    8 -
 7 files changed, 38 insertions(+), 228 deletions(-)

New commits:
commit 2a6936fab4d4499a4b812dd330d3db50549029e0
Author: Hans de Goede <hdegoede at redhat.com>
Date:   Fri Aug 28 17:05:55 2009 +0200

    linux-commit: do not unnecessarily open partition device nodes
    
    After patching parted with my do-not-use-BLKPG patch, I started
    to get EBUSY errors on commit_to_os.  Note that this is not caused
    by the do-not-use-BLKPG patch: it was already happening, but
    parted was silently ignoring the errors (and the kernel was
    not notified of the changes, which is bad).  The error now
    actually gets reported.
    
    The problem turns out to be in libparted/arch/linux.c's
    _flush_cache function, which walks all the partitions of the
    disk and does BLKFLSBUF calls on them.  This causes the following:
    
    commit_to_os -> device_open -> fd = open /dev/sda ->
    _flush_cache -> for each /dev/sda# open, ioctl, close
    -> ioctl(fd, BLKRRPART) -> EBUSY
    
    What is happening here is that the step:
    for each /dev/sda# open, ioctl, close
    
    causes udev change events for all the /dev/sda# nodes, which in
    turn causes udev to call blkid on all these nodes (on systems
    which use DeviceKit).  So blkid has the /dev/sda# nodes open
    while BLKRRPART gets called on /dev/sda -> EBUSY.
    
    I've checked with two independent storage subsystem kernel
    developers, and /dev/sda and /dev/sda# are guaranteed to be cache
    coherent nowadays.  So there is no need to do this for 2.6, which also
    eliminates the need to call _flush_cache() on device open at all.
    
    * libparted/arch/linux.c (_have_kern26): New function.
    (_flush_cache): For linux kernels 2.6 and newer, don't flush
    partition devices.
    (linux_open): Skip _flush_cache on newer kernels here, too.
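
The version gate described above can be illustrated in isolation. The following is a minimal sketch, not parted's code: KVER, parse_release and release_is_26_or_newer are hypothetical names, and the release string is passed in as a parameter because the diff shows only the tail of _get_linux_version(), not how it obtains the string.

```c
#include <stdio.h>

/* Same packing the kernel headers use for KERNEL_VERSION; redefined
 * here under the hypothetical name KVER so the sketch is self-contained. */
#define KVER(a, b, c) (((a) << 16) + ((b) << 8) + (c))

/* Parse a release string such as "2.6.30-rc1" into a packed version
 * code.  Returns -1 if the string does not start with "X.Y.Z". */
static int
parse_release (const char *release)
{
        int major, minor, teeny;

        if (sscanf (release, "%d.%d.%d", &major, &minor, &teeny) != 3)
                return -1;
        return KVER (major, minor, teeny);
}

/* Return 1 if RELEASE is 2.6.0 or newer, else 0.  An unparsable
 * string yields -1 above and is therefore treated as "older". */
static int
release_is_26_or_newer (const char *release)
{
        return parse_release (release) >= KVER (2, 6, 0);
}
```

With this sketch, release_is_26_or_newer ("2.4.37") is false and release_is_26_or_newer ("2.6.30-rc1") is true. The patch below additionally caches the answer in a static variable so the version is computed only once.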

diff --git a/libparted/arch/linux.c b/libparted/arch/linux.c
index 575a659..6d38ab3 100644
--- a/libparted/arch/linux.c
+++ b/libparted/arch/linux.c
@@ -581,6 +581,19 @@ _get_linux_version ()
         return kver = KERNEL_VERSION (major, minor, teeny);
 }
 
+static int
+_have_kern26 ()
+{
+        static int have_kern26 = -1;
+        int kver;
+
+        if (have_kern26 != -1)
+                return have_kern26;
+
+        kver = _get_linux_version();
+        return have_kern26 = kver >= KERNEL_VERSION (2,6,0) ? 1 : 0;
+}
+
 static void
 _device_set_sector_size (PedDevice* dev)
 {
@@ -1356,8 +1369,8 @@ linux_is_busy (PedDevice* dev)
         return 0;
 }
 
-/* we need to flush the master device, and all the partition devices,
- * because there is no coherency between the caches.
+/* we need to flush the master device, and with kernel < 2.6 all the partition
+ * devices, because there is no coherency between the caches with old kernels.
  * We should only flush unmounted partition devices, because:
  *  - there is never a need to flush them (we're not doing IO there)
  *  - flushing a device that is mounted causes unnecessary IO, and can
@@ -1375,6 +1388,10 @@ _flush_cache (PedDevice* dev)
 
         ioctl (arch_specific->fd, BLKFLSBUF);
 
+        /* With linux-2.6.0 and newer, we're done.  */
+        if (_have_kern26())
+                return;
+
         for (i = 1; i < 16; i++) {
                 char*           name;
                 int             fd;
@@ -1439,7 +1456,9 @@ retry:
                 dev->read_only = 0;
         }
 
-        _flush_cache (dev);
+        /* With kernels < 2.6 flush cache for cache coherence issues */
+        if (!_have_kern26())
+                _flush_cache (dev);
 
         return 1;
 }

commit 1d8f9bece138e4d8e58f7b059b4195aff6f39deb
Author: Hans de Goede <hdegoede at redhat.com>
Date:   Fri Aug 28 10:29:00 2009 +0200

    linux-commit: remove the use of the BLKPG ioctl
    
    While testing partitionable mdraid I noticed that the kernel's
    view of the partition table never changes even though I was successfully
    making commit_to_os() calls.
    
    This led me to dive into libparted's commit_to_os() code for Linux,
    and there are multiple issues hiding in there:
    
    1) Parted reads /sys/block/foo/range to determine how many partitions
       the device type supports and then makes BLKPG ioctls to update the
       kernel's view of the partition table for partitions which fall into
       this range. However, for example, /sys/block/sda/range contains 16,
       and there are 2 issues with libparted using this number:
       1) scsi majors only support 15 partitions, since 1 of the range of 16
          is reserved for the whole device, yet libparted will try
          to notify the kernel about 16 partitions if present
       2) If the major's partition minors run out, the kernel will switch
          to the mdp major for the other partitions, iow range no longer
          limits the number of partitions.
    
    2) libparted assumes the user knows what he is doing, and will ignore
       -EBUSY errors for partitions, assuming that the user is smart enough
       to only change unused partitions. Parted does this without checking
       whether the partitions which return EBUSY actually are unchanged,
       causing REAL errors to go unreported (BAD, really really BAD)
    
    3) because of 1) libparted will only sync 1 partition on /dev/md#
       devices (would be 0 if not for the off-by-one bug, as all md#p#
       partitions use the mdp major), and it fails to do even that,
       without reporting an error.
    
    ###
    
    1) we can fix by simply not checking /sys/block/foo/range, but instead
       just syncing max partitions.
    
    2) is more troublesome: we could just make -EBUSY an error,
       but that may annoy / bug some users. OTOH in certain cases libparted
       already falls back to BLKRRPART, which will return EBUSY, so users
       should already be prepared to handle EBUSY.
    
    3) Could be fixed by making libparted recognize mdraid as a device type
       and exempt mdraid from using BLKPG, like it already is doing with
       DASD, but it might be better to just get rid of using BLKPG
       altogether.  See below.
    
    An even bigger problem IMHO is the use of the BLKPG ioctl instead of
    BLKRRPART at all. What this does is tell the kernel parted's view
    of the partition table and make it use that, instead of telling
    the kernel to reread the partition table.  According to the parted
    sources this is done for the case where the kernel does not know
    the disklabel type. However as soon as the system is rebooted, the
    system will be using the kernel's view. So IMHO it would be much
    better to always use the kernel's view and just always call BLKRRPART
    in commit_to_os(), this would solve all of the above issues, *and*
    make the way the system views the partition table consistent between
    just after running parted and after a reboot.
    
    I've attached a patch which removes the use of the BLKPG ioctl.  Notice
    that this also removes a lot of special case code and workarounds,
    whose existence to me clearly indicates that using the BLKPG ioctl is
    a bad idea.
    
    * libparted/arch/linux.c (linux_disk_commit): Remove the use of the
    BLKPG ioctl.
    (_blkpg_add_partition, _blkpg_part_command, _blkpg_remove_partition):
    (_device_get_partition_range, _disk_sync_part_table, _have_blkpg):
    (_have_devfs): Remove functions thus rendered unused.
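
For readers unfamiliar with the ioctl this commit standardizes on, here is a minimal sketch of asking the kernel to re-read a partition table with BLKRRPART. It is an illustration under stated assumptions, not parted's _kernel_reread_part_table(): the function name reread_partition_table and its negative-errno return convention are hypothetical.

```c
#include <errno.h>
#include <fcntl.h>
#include <sys/ioctl.h>
#include <unistd.h>
#include <linux/fs.h>   /* BLKRRPART */

/* Ask the kernel to re-read the partition table of the whole-disk
 * device at PATH.  Returns 0 on success or a negative errno value. */
static int
reread_partition_table (const char *path)
{
        int fd = open (path, O_RDONLY);
        int err = 0;

        if (fd < 0)
                return -errno;
        if (ioctl (fd, BLKRRPART) != 0)
                err = -errno;   /* e.g. -EBUSY if a partition is in use */
        close (fd);
        return err;
}
```

A caller would pass a whole-disk node such as /dev/sda and needs sufficient privileges. A return of -EBUSY is the case discussed in point 2) above, which callers should be prepared to handle rather than ignore.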

diff --git a/libparted/arch/linux.c b/libparted/arch/linux.c
index 06b9327..575a659 100644
--- a/libparted/arch/linux.c
+++ b/libparted/arch/linux.c
@@ -581,22 +581,6 @@ _get_linux_version ()
         return kver = KERNEL_VERSION (major, minor, teeny);
 }
 
-static int
-_have_devfs ()
-{
-        static int have_devfs = -1;
-        struct stat sb;
-
-        if (have_devfs != -1)
-                return have_devfs;
-
-        /* the presence of /dev/.devfsd implies that DevFS is active */
-        if (stat("/dev/.devfsd", &sb) < 0)
-                return have_devfs = 0;
-
-        return have_devfs = S_ISCHR(sb.st_mode) ? 1 : 0;
-}
-
 static void
 _device_set_sector_size (PedDevice* dev)
 {
@@ -2219,182 +2203,6 @@ linux_partition_is_busy (const PedPartition* part)
         return 0;
 }
 
-static int
-_blkpg_part_command (PedDevice* dev, struct blkpg_partition* part, int op)
-{
-        LinuxSpecific*          arch_specific = LINUX_SPECIFIC (dev);
-        struct blkpg_ioctl_arg  ioctl_arg;
-
-        ioctl_arg.op = op;
-        ioctl_arg.flags = 0;
-        ioctl_arg.datalen = sizeof (struct blkpg_partition);
-        ioctl_arg.data = (void*) part;
-
-        return ioctl (arch_specific->fd, BLKPG, &ioctl_arg) == 0;
-}
-
-static int
-_blkpg_add_partition (PedDisk* disk, const PedPartition *part)
-{
-        struct blkpg_partition  linux_part;
-        const char*             vol_name;
-        char*                   dev_name;
-
-        PED_ASSERT(disk != NULL, return 0);
-        PED_ASSERT(disk->dev->sector_size % PED_SECTOR_SIZE_DEFAULT == 0,
-                   return 0);
-
-        if (!_has_partitions (disk))
-                return 0;
-
-        if (ped_disk_type_check_feature (disk->type,
-                                         PED_DISK_TYPE_PARTITION_NAME))
-                vol_name = ped_partition_get_name (part);
-        else
-                vol_name = NULL;
-
-        dev_name = _device_get_part_path (disk->dev, part->num);
-        if (!dev_name)
-                return 0;
-
-        memset (&linux_part, 0, sizeof (linux_part));
-        linux_part.start = part->geom.start * disk->dev->sector_size;
-        /* see fs/partitions/msdos.c:msdos_partition(): "leave room for LILO" */
-        if (part->type & PED_PARTITION_EXTENDED)
-                linux_part.length = part->geom.length == 1 ? 512 : 1024;
-        else
-                linux_part.length = part->geom.length * disk->dev->sector_size;
-        linux_part.pno = part->num;
-        strncpy (linux_part.devname, dev_name, BLKPG_DEVNAMELTH);
-        if (vol_name)
-                strncpy (linux_part.volname, vol_name, BLKPG_VOLNAMELTH);
-
-        free (dev_name);
-
-        if (!_blkpg_part_command (disk->dev, &linux_part,
-                                  BLKPG_ADD_PARTITION)) {
-                return ped_exception_throw (
-                        PED_EXCEPTION_ERROR,
-                        PED_EXCEPTION_IGNORE_CANCEL,
-                        _("Error informing the kernel about modifications to "
-                          "partition %s -- %s.  This means Linux won't know "
-                          "about any changes you made to %s until you reboot "
-                          "-- so you shouldn't mount it or use it in any way "
-                          "before rebooting."),
-                        linux_part.devname,
-                        strerror (errno),
-                        linux_part.devname)
-                                == PED_EXCEPTION_IGNORE;
-        }
-
-        return 1;
-}
-
-static int
-_blkpg_remove_partition (PedDisk* disk, int n)
-{
-        struct blkpg_partition  linux_part;
-
-        if (!_has_partitions (disk))
-                return 0;
-
-        memset (&linux_part, 0, sizeof (linux_part));
-        linux_part.pno = n;
-        return _blkpg_part_command (disk->dev, &linux_part,
-                                    BLKPG_DEL_PARTITION);
-}
-
-/*
- * The number of partitions that a device can have depends on the kernel.
- * If we don't find this value in /sys/block/DEV/range, we will use our own
- * value.
- */
-static unsigned int
-_device_get_partition_range(PedDevice* dev)
-{
-        int         range, r;
-        char        path[128];
-        FILE*       fp;
-        bool        ok;
-
-        r = snprintf(path, sizeof(path), "/sys/block/%s/range",
-                     last_component(dev->path));
-        if(r < 0 || r >= sizeof(path))
-                return MAX_NUM_PARTS;
-
-        fp = fopen(path, "r");
-        if(!fp)
-                return MAX_NUM_PARTS;
-
-        ok = fscanf(fp, "%d", &range) == 1;
-        fclose(fp);
-
-        /* (range <= 0) is none sense.*/
-        return ok && range > 0 ? range : MAX_NUM_PARTS;
-}
-
-/*
- * Sync the partition table in two step process:
- * 1. Remove all of the partitions from the kernel's tables, but do not attempt
- *    removal of any partition for which the corresponding ioctl call fails.
- * 2. Add all the partitions that we hold in disk.
- *
- * To achieve this two step process we must calculate the minimum number of
- * maximum possible partitions between what linux supports and what the label
- * type supports. EX:
- *
- * number=MIN(max_parts_supported_in_linux,max_parts_supported_in_msdos_tables)
- */
-static int
-_disk_sync_part_table (PedDisk* disk)
-{
-        PED_ASSERT(disk != NULL, return 0);
-        PED_ASSERT(disk->dev != NULL, return 0);
-        int lpn;
-
-        /* lpn = largest partition number. */
-        if(ped_disk_get_max_supported_partition_count(disk, &lpn))
-                lpn = PED_MIN(lpn, _device_get_partition_range(disk->dev));
-        else
-                lpn = _device_get_partition_range(disk->dev);
-
-        /* Its not possible to support largest_partnum < 0.
-         * largest_partnum == 0 would mean does not support partitions.
-         * */
-        if(lpn < 0)
-                return 0;
-
-        int *rets = ped_malloc(sizeof(int) * lpn);
-        int *errnums = ped_malloc(sizeof(int) * lpn);
-        int ret = 1;
-        int i;
-
-        for (i = 1; i <= lpn; i++) {
-                rets[i - 1] = _blkpg_remove_partition (disk, i);
-                errnums[i - 1] = errno;
-        }
-
-        for (i = 1; i <= lpn; i++) {
-                const PedPartition *part = ped_disk_get_partition (disk, i);
-                if (part) {
-                        /* busy... so we won't (can't!) disturb ;)  Prolly
-                         * doesn't matter anyway, because users shouldn't be
-                         * changing mounted partitions anyway...
-                         */
-                        if (!rets[i - 1] && errnums[i - 1] == EBUSY)
-                                        continue;
-
-                        /* add the (possibly modified or new) partition */
-                        if (!_blkpg_add_partition (disk, part))
-                                ret = 0;
-                }
-        }
-
-        free (rets);
-        free (errnums);
-        return ret;
-}
-
 #ifdef ENABLE_DEVICE_MAPPER
 static int
 _dm_remove_map_name(char *name)
@@ -2640,19 +2448,6 @@ _kernel_reread_part_table (PedDevice* dev)
 }
 
 static int
-_have_blkpg ()
-{
-        static int have_blkpg = -1;
-        int kver;
-
-        if (have_blkpg != -1)
-                return have_blkpg;
-
-        kver = _get_linux_version();
-        return have_blkpg = kver >= KERNEL_VERSION (2,4,0) ? 1 : 0;
-}
-
-static int
 linux_disk_commit (PedDisk* disk)
 {
        if (!_has_partitions (disk))
@@ -2663,19 +2458,6 @@ linux_disk_commit (PedDisk* disk)
                 return _dm_reread_part_table (disk);
 #endif
         if (disk->dev->type != PED_DEVICE_FILE) {
-                /* The ioctl() command BLKPG_ADD_PARTITION does not notify
-                 * the devfs system; consequently, /proc/partitions will not
-                 * be up to date, and the proper links in /dev are not
-                 * created.  Therefore, if using DevFS, we must get the kernel
-                 * to re-read and grok the partition table.
-                 */
-                /* Work around kernel dasd problem so we really do BLKRRPART */
-                if (disk->dev->type != PED_DEVICE_DASD &&
-                    _have_blkpg () && !_have_devfs ()) {
-                        if (_disk_sync_part_table (disk))
-                                return 1;
-                }
-
                 return _kernel_reread_part_table (disk->dev);
         }
 

commit d16300a88d9200e0f1e08d56e39392e028412611
Author: Jim Meyering <meyering at redhat.com>
Date:   Fri Aug 28 18:53:39 2009 +0200

    tests: make two partition-related tests work for other sector sizes
    
    These two root-only tests would fail with the PARTED_SECTOR_SIZE envvar
    set to anything other than 512.  Now they also work for multiples.
    * tests/t4100-dvh-partition-limits.sh: Make sector-size agnostic.
    * tests/t4100-msdos-partition-limits.sh: Likewise.

diff --git a/tests/t4100-dvh-partition-limits.sh b/tests/t4100-dvh-partition-limits.sh
index 0606a7e..89302f1 100755
--- a/tests/t4100-dvh-partition-limits.sh
+++ b/tests/t4100-dvh-partition-limits.sh
@@ -23,6 +23,7 @@ privileges_required_=1
 : ${srcdir=.}
 . $srcdir/test-lib.sh
 require_xfs_
+ss=$sector_size_
 
 ####################################################
 # Create and mount a file system capable of dealing with >=2TB files.
@@ -58,7 +59,7 @@ do_mkpart()
   start_sector=$1
   end_sector=$2
   # echo '********' $(echo $end_sector - $start_sector + 1 |bc)
-  dd if=/dev/zero of=$dev bs=1b count=2k seek=$end_sector 2> /dev/null &&
+  dd if=/dev/zero of=$dev bs=$ss count=2k seek=$end_sector 2> /dev/null &&
   parted -s $dev mklabel $table_type &&
   parted -s $dev mkpart p xfs ${start_sector}s ${end_sector}s
 }
@@ -136,7 +137,7 @@ test_expect_success \
 cat > exp <<EOF
 Model:  (file)
 Disk: 4294970342s
-Sector size (logical/physical): 512B/512B
+Sector size (logical/physical): ${ss}B/${ss}B
 Partition Table: $table_type
 
 Number  Start        End          Size   Type      File system  Name  Flags
diff --git a/tests/t4100-msdos-partition-limits.sh b/tests/t4100-msdos-partition-limits.sh
index d58c387..554b230 100755
--- a/tests/t4100-msdos-partition-limits.sh
+++ b/tests/t4100-msdos-partition-limits.sh
@@ -23,6 +23,7 @@ privileges_required_=1
 : ${srcdir=.}
 . $srcdir/test-lib.sh
 require_xfs_
+ss=$sector_size_
 
 ####################################################
 # Create and mount a file system capable of dealing with >=2TB files.
@@ -58,7 +59,7 @@ do_mkpart()
   start_sector=$1
   end_sector=$2
   # echo '********' $(echo $end_sector - $start_sector + 1 |bc)
-  dd if=/dev/zero of=$dev bs=1b count=2k seek=$end_sector 2> /dev/null &&
+  dd if=/dev/zero of=$dev bs=$ss count=2k seek=$end_sector 2> /dev/null &&
   parted -s $dev mklabel $table_type &&
   parted -s $dev mkpart p xfs ${start_sector}s ${end_sector}s
 }
@@ -136,7 +137,7 @@ test_expect_success \
 cat > exp <<EOF
 Model:  (file)
 Disk: 4294970342s
-Sector size (logical/physical): 512B/512B
+Sector size (logical/physical): ${ss}B/${ss}B
 Partition Table: $table_type
 
 Number  Start        End          Size   Type     File system  Flags

commit f78cfbbe09e8c91d4e904a53e1f3e386d8be6bf5
Author: Jim Meyering <meyering at redhat.com>
Date:   Fri Aug 28 11:47:40 2009 +0200

    tests: make it easier to diagnose loop_setup_ failure
    
    * tests/lvm-utils.sh: Don't redirect stderr to /dev/null.

diff --git a/tests/lvm-utils.sh b/tests/lvm-utils.sh
index 2aba445..c0c82f9 100644
--- a/tests/lvm-utils.sh
+++ b/tests/lvm-utils.sh
@@ -1,7 +1,7 @@
 # Put lvm-related utilities here.
 # This file is sourced from test-lib.sh.
 
-# Copyright (C) 2007, 2008 Red Hat, Inc. All rights reserved.
+# Copyright (C) 2007-2009 Red Hat, Inc. All rights reserved.
 #
 # This copyrighted material is made available to anyone wishing to use,
 # modify, copy, or redistribute it subject to the terms and conditions
@@ -43,7 +43,7 @@ loop_setup_()
     || { warn "loop_setup_ failed: Unable to create tmp file $file"; return 1; }
 
   # NOTE: this requires a new enough version of losetup
-  dev=$(unsafe_losetup_ "$file" 2>/dev/null) \
+  dev=$(unsafe_losetup_ "$file") \
     || { warn "loop_setup_ failed: Unable to create loopback device"; return 1; }
 
   echo "$dev"

commit bebaf76ce468230fe250c5d367c5467c2614941e
Author: Jim Meyering <meyering at redhat.com>
Date:   Fri Aug 28 11:30:40 2009 +0200

    tests: avoid spurious failure due to extra space in diagnostic
    
    * libparted/labels/pt-tools.c (ptt_partition_max_start_len):
    Remove stray space in diagnostic that was causing the root-only
    regression test, t4100-dvh-partition-limits.sh, to fail.

diff --git a/libparted/labels/pt-tools.c b/libparted/labels/pt-tools.c
index ce087b3..622cedd 100644
--- a/libparted/labels/pt-tools.c
+++ b/libparted/labels/pt-tools.c
@@ -123,7 +123,7 @@ ptt_partition_max_start_len (char const *label_type, const PedPartition *part)
             ped_exception_throw (
                                  PED_EXCEPTION_ERROR, PED_EXCEPTION_CANCEL,
                                  _("starting sector number, %jd exceeds"
-                                   " the  %s-partition-table-imposed maximum"
+                                   " the %s-partition-table-imposed maximum"
                                    " of %jd"),
                                  part->geom.start,
                                  label_type,

commit acd6967293385a1b508a512246e417757f19bcf3
Author: Jim Meyering <meyering at redhat.com>
Date:   Fri Aug 28 11:17:34 2009 +0200

    tests: avoid spurious failure on "nodev" mounted partition
    
    * tests/t8000-loop.sh: Skip this test if loop_setup_ fails.
    * tests/t6000-dm.sh: Likewise.

diff --git a/tests/t6000-dm.sh b/tests/t6000-dm.sh
index bda600b..49905c0 100755
--- a/tests/t6000-dm.sh
+++ b/tests/t6000-dm.sh
@@ -1,6 +1,6 @@
 #!/bin/sh
 
-# Copyright (C) 2008 Free Software Foundation, Inc.
+# Copyright (C) 2008-2009 Free Software Foundation, Inc.
 
 # This program is free software; you can redistribute it and/or modify
 # it under the terms of the GNU General Public License as published by
@@ -44,10 +44,15 @@ cleanup_() {
     rm -f "$f1" "$f2" "$f3";
 }
 
+f1=$(pwd)/1; d1=$(loop_setup_ "$f1") || {
+    say "skipping $0: is this partition mounted with 'nodev'?"
+    test_done
+    exit
+}
+
 test_expect_success \
     "setup: create loop devices" \
-    'f1=$(pwd)/1 && d1=$(loop_setup_ "$f1") && \
-     f2=$(pwd)/2 && d2=$(loop_setup_ "$f2") && \
+    'f2=$(pwd)/2 && d2=$(loop_setup_ "$f2") && \
      f3=$(pwd)/3 && d3=$(loop_setup_ "$f3")'
 
 #
diff --git a/tests/t8000-loop.sh b/tests/t8000-loop.sh
index f599b11..e83c606 100755
--- a/tests/t8000-loop.sh
+++ b/tests/t8000-loop.sh
@@ -40,9 +40,11 @@ emit_expected_diagnostic()
       'Warning: The kernel was unable to re-read the partition table on'
 }
 
-test_expect_success \
-    "setup: create loop devices" \
-    'f1=$(pwd)/1 && d1=$(loop_setup_ "$f1")'
+f1=$(pwd)/1; d1=$(loop_setup_ "$f1") || {
+    say "skipping $0: is this partition mounted with 'nodev'?"
+    test_done
+    exit
+}
 
 test_expect_success \
     'run parted -s "$d1" mklabel msdos' \


