sarge2etch upgrade destroys xfs filesystem on lvm2 over md-filesystem

Thomas Stegbauer thomas at stegbauer.info
Sat Apr 21 12:07:54 UTC 2007


hi all,

sorry for cross-posting.

i upgraded two machines from the latest sarge (3.1r5, kernel 2.6.8) to debian etch.
the machines are completely different: one is a celeron or sempron with two ide hard disks, the
other is an fsc econel 50 server with an intel chipset, a pentium 4 and four sata drives.
what was identical?
debian sarge 3.1r5
md raid1 devices
lvm2
xfs on all lvs and on root (on /dev/md0)

i found a similar problem on the internet:
http://www.debianhelp.org/node/6006
the hardware there is different, but the software setup looks identical.

during the upgrade, the xfs filesystems on the lvs get shut down.
kern.log shows the following:

Apr 10 16:18:05 hornet kernel: XFS internal error XFS_WANT_CORRUPTED_GOTO at line 1583 of file
fs/xfs/xfs_alloc.c.  Caller 0xf8935305
Apr 10 16:18:05 hornet kernel:  [<f8934091>] xfs_free_ag_extent+0x471/0x7a0 [xfs]
Apr 10 16:18:05 hornet kernel:  [<f8935305>] xfs_free_extent+0xe5/0x110 [xfs]
Apr 10 16:18:05 hornet kernel:  [<f8935305>] xfs_free_extent+0xe5/0x110 [xfs]
Apr 10 16:18:05 hornet kernel:  [<f89977fc>] kmem_zone_alloc+0x4c/0xa0 [xfs]
Apr 10 16:18:05 hornet kernel:  [<f8968686>] xfs_efd_init+0x86/0x90 [xfs]
Apr 10 16:18:05 hornet kernel:  [<f898bee8>] xfs_trans_get_efd+0x38/0x50 [xfs]
Apr 10 16:18:05 hornet kernel:  [<f8948b8f>] xfs_bmap_finish+0x13f/0x1e0 [xfs]
Apr 10 16:18:05 hornet kernel:  [<f8992e7e>] xfs_remove+0x2fe/0x500 [xfs]
Apr 10 16:18:05 hornet kernel:  [<f899f0f0>] linvfs_unlink+0x30/0x70 [xfs]
Apr 10 16:18:05 hornet kernel:  [<c017267a>] vfs_unlink+0x10a/0x1e0
Apr 10 16:18:05 hornet kernel:  [<c01727fe>] sys_unlink+0xae/0x130
Apr 10 16:18:05 hornet kernel:  [<c0175b60>] sys_getdents64+0xa0/0xaa
Apr 10 16:18:05 hornet kernel:  [<c01759c0>] filldir64+0x0/0x100
Apr 10 16:18:05 hornet kernel:  [<c01061eb>] syscall_call+0x7/0xb
Apr 10 16:18:05 hornet kernel: xfs_force_shutdown(dm-2,0x8) called from line 4049 of file
fs/xfs/xfs_bmap.c.  Return address = 0xf89a244b
Apr 10 16:18:05 hornet kernel: Filesystem "dm-2": Corruption of in-memory data detected.  Shutting
down filesystem: dm-2
Apr 10 16:18:05 hornet kernel: Please umount the filesystem, and rectify the problem(s)
Apr 10 16:22:06 hornet kernel: xfs_force_shutdown(dm-2,0x1) called from line 353 of file
fs/xfs/xfs_rw.c.  Return address = 0xf89a244b
Apr 10 16:25:04 hornet kernel: xfs_force_shutdown(dm-2,0x1) called from line 353 of file
fs/xfs/xfs_rw.c.  Return address = 0xf89a244b
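
To see at a glance which device-mapper devices were hit, the xfs_force_shutdown lines in a log like the one above can be counted per device. This is only a minimal sketch: the function name count_xfs_shutdowns and the regular expression are illustrative assumptions, and the sample text is three lines copied from the excerpt above with the mail line wrapping undone.

```python
import re
from collections import Counter

# Matches e.g. "xfs_force_shutdown(dm-2,0x8)"; the pattern is an assumption
# based only on the log lines quoted in this mail.
SHUTDOWN_RE = re.compile(
    r"xfs_force_shutdown\((?P<dev>[^,]+),(?P<flags>0x[0-9a-fA-F]+)\)"
)


def count_xfs_shutdowns(log_text):
    """Return a Counter mapping device name to number of forced shutdowns."""
    counts = Counter()
    for match in SHUTDOWN_RE.finditer(log_text):
        counts[match.group("dev")] += 1
    return counts


# Three lines taken from the kern.log excerpt, unwrapped.
SAMPLE = """\
Apr 10 16:18:05 hornet kernel: xfs_force_shutdown(dm-2,0x8) called from line 4049 of file fs/xfs/xfs_bmap.c.  Return address = 0xf89a244b
Apr 10 16:22:06 hornet kernel: xfs_force_shutdown(dm-2,0x1) called from line 353 of file fs/xfs/xfs_rw.c.  Return address = 0xf89a244b
Apr 10 16:25:04 hornet kernel: xfs_force_shutdown(dm-2,0x1) called from line 353 of file fs/xfs/xfs_rw.c.  Return address = 0xf89a244b
"""

for device, hits in count_xfs_shutdowns(SAMPLE).items():
    print(f"{device}: {hits} forced shutdown(s)")
```

Because the device name and flags sit on the first half of each wrapped line, the same pattern should also work on the log text exactly as it appears in the mail.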

on the fsc machine this happened on /tmp and /var.
on the other server it destroyed /tmp, and after rebooting with 2.6.18-4, /usr was gone after a while.

the only solution was to umount the partition (if possible, otherwise boot a rescue system) and run
xfs_repair. on the "other server" (not the fsc ;) can't log in currently), xfs_repair failed because it
complained about an unreplayed log, and mounting/unmounting didn't replay it. so i had to recover with
xfs_repair -L, which zeroes the log. happily all the data in tmp still exists; of course, because it
wasn't important ;)
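
The recovery steps described above can be written down as a small dry-run helper that only builds the command lines and never executes them. This is a sketch under assumptions: the helper name xfs_recovery_plan, the device path /dev/vg0/tmp and the mount point are hypothetical, and only umount, xfs_repair and its -L option (zero the log) come from the description above. Zeroing the log can lose the most recent metadata changes, so zero_log is off by default.

```python
import shlex


def xfs_recovery_plan(device, mountpoint, zero_log=False):
    """Build (but do not run) the commands for repairing one xfs filesystem.

    device and mountpoint are supplied by the caller; zero_log=True selects
    the last-resort "xfs_repair -L" variant that discards the log.
    """
    plan = [["umount", mountpoint]]
    if zero_log:
        # Last resort: only after a plain xfs_repair refused to run
        # because of an unreplayed log.
        plan.append(["xfs_repair", "-L", device])
    else:
        plan.append(["xfs_repair", device])
    return plan


# Hypothetical device and mount point, for illustration only.
for command in xfs_recovery_plan("/dev/vg0/tmp", "/tmp"):
    print(shlex.join(command))
```

Printing the plan first and running the commands by hand keeps the destructive -L step an explicit decision.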

i already checked the bugs for xfsprogs and linux-2.6: there was nothing for xfsprogs and several xfs
bugs for linux-2.6. the closest one may be
http://bugs.debian.org/cgi-bin/bugreport.cgi?bug=410204 but there the problem seems to be in dm-crypt.

any ideas?

thomas



-- 
# Thomas Stegbauer
# https://keyserver1.pgp.com/vkd/DownloadKey.event?keyid=0x9A3F1866FC68E91D
# Key fingerprint = 5A2D FEDC 8A50 F1BB 25FB  967B 9A3F 1866 FC68 E91D

