Bug#686189: pvmove hangs after moving one of many LVs, all IO on affected LVs hangs
Pokotilenko Kostik
casper at meteor.dp.ua
Wed Aug 29 17:59:46 UTC 2012
Package: lvm2
Version: 2.02.66-5
Severity: critical
Tags: upstream
First of all, links to this problem:
https://bugzilla.redhat.com/show_bug.cgi?id=602516
https://bugzilla.redhat.com/show_bug.cgi?id=706036
As stated in the links above, this bug is fixed upstream in 2.02.86; squeeze still
has 2.02.66 and, as far as I can see, no fix has been backported.
Also as stated in the links above, there is a workaround for this bug - move one LV at a time, i.e.:
pvmove -i0 -n $lvname
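A minimal sketch of that workaround: move each LV off the PV individually instead of moving the whole PV in one call. The LV names below are hypothetical, and the pvmove invocations are only echoed (a dry run), since the real commands need root and the actual volumes:

```shell
# Workaround sketch: one pvmove per LV instead of one for the whole PV.
# LV names are made up; in a real run the list would come from:
#   lvs --noheadings -o lv_name VGa
src=/dev/sda8
for lvname in lvroot lvhome lvdata; do
    # Dry run: echo the command instead of executing it.
    echo pvmove -i0 -n "$lvname" "$src"
done
```

Per the linked reports, restricting each pvmove to a single LV avoids the suspend/lock race that a multi-LV move can trigger.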
Now about my story...
I recently upgraded to squeeze and wanted to use the new grub-pc's ability to boot
directly from LVM over RAID1. This way I could migrate my system from partitions to LVM
over RAID and have failover across 3 drives in RAID1.
To do so I had to convert one drive at a time. I successfully converted 2 drives and
started to convert the 3rd one in exactly the same way, which resulted in pvmove hanging
along with all IO to the affected LVs.
The layout before converting 3rd drive was:
sda[1,2,5,6,7]: old unused system partitions
sda8: LVM PV of VGa
sda9: Raid5 member
sdb1: Raid1 member
sdb3: LVM PV of VGb
sdb4: Raid5 member
sdc1: Raid1 member
sdc3: LVM PV of VGc
sdc4: Raid5 member
md0: Raid5 of sda9, sdb4, sdc4
md1: Raid1 of missing, sdb1, sdc1
md0: PV of VGraid5
md1: PV of VGsystem
As I did not have a spare drive to move to, I used an LV on VGraid5 as a PV for VGa:
lvcreate -n pvVGa VGraid5
pvcreate /dev/mapper/VGraid5-pvVGa
vgextend VGa /dev/mapper/VGraid5-pvVGa
And started the move:
pvmove /dev/sda8 /dev/mapper/VGraid5-pvVGa
The process of moving data started, but got stuck at about 26%.
At this point pvmove, all other LVM commands, and every process that tried to access
LVs on VGa ended up in D state, and I was not able to kill -9 them.
Any further attempt to access data on VGa hung; the devices' IO queues never
drained, and atop showed 100% device usage with no read/write activity
(no dequeue was performed).
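The hang shows up as processes stuck in uninterruptible sleep. A generic sketch (not part of the original report) of how to list them:

```shell
# List processes in uninterruptible sleep (state D) - the state pvmove and
# everything touching VGa ended up in. Keeps the ps header line (NR==1) and
# any row whose STAT column starts with D.
ps -eo pid,stat,comm | awk 'NR==1 || $2 ~ /^D/'
```

On a healthy system this usually prints only the header; during a hang like this one it would list pvmove and every blocked process on the VG.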
The system did not respond to reboot/shutdown, so to resolve the problem quickly I
had to shut down whatever I could and hard-reset. I was not even able to sync, as it
also hung.
After reboot VGa did not activate, as the pvmove had left it in an inconsistent state.
As I figured out, the first LV in VGa had been moved successfully, a mirror had been
created for the second one, and this mirror was left out of sync.
pvmove --abort canceled the move and I was able to activate VGa. All data was
there, safe.
Then I made sure nothing was accessing VGa, started pvmove again with the same
command, and it successfully finished the move.
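The recovery sequence above can be sketched as follows. The commands are echoed (a dry run) because the real ones need root and a VG actually left mid-pvmove; the VG and device names are the ones from this report:

```shell
# Recovery after the hard reset, as described above (dry run: commands are
# echoed rather than executed).
echo pvmove --abort                              # cancel the half-finished move
echo vgchange -ay VGa                            # activate the VG again
# Once nothing is accessing the VG, retry the move:
echo pvmove /dev/sda8 /dev/mapper/VGraid5-pvVGa
```

pvmove --abort removes the temporary mirror left over from the interrupted move, which is what allowed VGa to activate again.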
During the first failed move, VGa was being actively accessed by one of the KVM
guests, which I think, in addition to the multi-level dm/LV/PV/LV setup, caused the
suspend/lock race explained in the links above.
Squeeze should provide a solution to this problem, as LVM is considered production
stable and many people rely on it.
-- System Information:
Debian Release: 6.0.5
APT prefers stable
APT policy: (500, 'stable')
Architecture: amd64 (x86_64)
Kernel: Linux 3.2.0-0.bpo.2-amd64 (SMP w/8 CPU cores)
Locale: LANG=ru_UA.UTF-8, LC_CTYPE=ru_UA.UTF-8 (charmap=UTF-8)
Shell: /bin/sh linked to /bin/bash
Versions of packages lvm2 depends on:
ii dmsetup 2:1.02.48-5 The Linux Kernel Device Mapper use
ii libc6 2.11.3-3 Embedded GNU C Library: Shared lib
ii libdevmapper1.02.1 2:1.02.48-5 The Linux Kernel Device Mapper use
ii libreadline5 5.2-7 GNU readline and history libraries
ii libudev0 164-3 libudev shared library
ii lsb-base 3.2-23.2squeeze1 Linux Standard Base 3.2 init scrip
lvm2 recommends no packages.
lvm2 suggests no packages.
-- no debconf information