Bug#580312: multipath-tools: multipathd segfaults when restarted

Vincent.McIntyre at csiro.au Vincent.McIntyre at csiro.au
Mon May 10 01:09:41 UTC 2010


I changed the setup slightly, connecting the storage unit and the host
to an FC switch. There are still two LUNs, and there are now four paths
to each.

# multipath -l
mpath1 (2227300015530e20d) dm-1 Promise ,VTrak E610f
[size=13T][features=1 queue_if_no_path][hwhandler=0]
\_ round-robin 0 [prio=0][active]
 \_ 1:0:0:1 sdf 8:80  [active][undef]
 \_ 1:0:1:1 sdh 8:112 [active][undef]
 \_ 1:0:2:1 sdj 8:144 [active][undef]
 \_ 1:0:3:1 sdl 8:176 [active][undef]
mpath0 (2228f000155e2acda) dm-0 Promise ,VTrak E610f
[size=13T][features=1 queue_if_no_path][hwhandler=0]
\_ round-robin 0 [prio=0][active]
 \_ 1:0:0:0 sde 8:64  [active][undef]
 \_ 1:0:1:0 sdg 8:96  [active][undef]
 \_ 1:0:2:0 sdi 8:128 [active][undef]
 \_ 1:0:3:0 sdk 8:160 [active][undef]

I could go back to the other configuration briefly if you wish.


> Could you run multipathd under valgrind please?

I ran it, then tried to stop it with the init script (which didn't seem
to work), then with 'kill -HUP', and finally with a plain 'kill'.
Somehow the last command caused my shell to attach to the process,
so I then hit ^C.
Details below.

I tried this twice, once without the filesystems mounted and once with
them mounted. Results were similar. Between mounting and running the
second time I briefly exercised the filesystems by copying a bit of data
from one to the other. I didn't try to stop/start the daemon under load.

# /etc/init.d/multipath-tools stop
# ps -fade|grep multi
root     12124 12098  0 10:45 pts/2    00:00:00 grep multi

# valgrind multipathd
==12134== Memcheck, a memory error detector.
==12134== Copyright (C) 2002-2007, and GNU GPL'd, by Julian Seward et al.
==12134== Using LibVEX rev 1854, a library for dynamic binary translation.
==12134== Copyright (C) 2004-2007, and GNU GPL'd, by OpenWorks LLP.
==12134== Using valgrind-3.3.1-Debian, a dynamic binary instrumentation framework.
==12134== Copyright (C) 2000-2007, and GNU GPL'd, by Julian Seward et al.
==12134== For more details, rerun with: -v
==12134==
==12135==
==12135== ERROR SUMMARY: 0 errors from 0 contexts (suppressed: 8 from 1)
==12135== malloc/free: in use at exit: 1,120 bytes in 23 blocks.
==12135== malloc/free: 48 allocs, 25 frees, 4,716 bytes allocated.
==12135== For counts of detected errors, rerun with: -v
==12135== searching for pointers to 23 not-freed blocks.
==12135== checked 208,864 bytes.
==12135==
==12135== LEAK SUMMARY:
==12135==    definitely lost: 0 bytes in 0 blocks.
==12135==      possibly lost: 0 bytes in 0 blocks.
==12135==    still reachable: 1,120 bytes in 23 blocks.
==12135==         suppressed: 0 bytes in 0 blocks.
==12135== Rerun with --leak-check=full to see details of leaked memory.
==12134==
==12134== ERROR SUMMARY: 0 errors from 0 contexts (suppressed: 8 from 1)
==12134== malloc/free: in use at exit: 216 bytes in 1 blocks.
==12134== malloc/free: 48 allocs, 47 frees, 4,716 bytes allocated.
==12134== For counts of detected errors, rerun with: -v
==12134== searching for pointers to 1 not-freed blocks.
==12134== checked 208,048 bytes.
==12134==
==12134== LEAK SUMMARY:
==12134==    definitely lost: 0 bytes in 0 blocks.
==12134==      possibly lost: 0 bytes in 0 blocks.
==12134==    still reachable: 216 bytes in 1 blocks.
==12134==         suppressed: 0 bytes in 0 blocks.
==12134== Rerun with --leak-check=full to see details of leaked memory.

# ps -fade|grep multip
root     12136     1  5 10:45 ?        00:00:00 /usr/bin/valgrind.bin multipathd
root     12166 12098  0 10:45 pts/2    00:00:00 grep multip

# /etc/init.d/multipath-tools stop
Stopping multipath daemon: multipathd.

# ps -fade|grep multip
root     12136     1  2 10:45 ?        00:00:00 /usr/bin/valgrind.bin multipathd
root     12172 12098  0 10:45 pts/2    00:00:00 grep multip

# kill -HUP 12136
# ps -fade|grep multip
root     12136     1  2 10:45 ?        00:00:00 /usr/bin/valgrind.bin multipathd
root     12190 12098  0 10:45 pts/2    00:00:00 grep multip

# kill 12136
==12136== Thread 9:
==12136== Invalid read of size 4
==12136==    at 0x4E2E4FE: pthread_mutex_lock (in /lib/libpthread-2.7.so)
==12136==    by 0x42B596: (within /sbin/multipathd)
==12136==    by 0x42BC83: (within /sbin/multipathd)
==12136==    by 0x4E2CFC6: start_thread (in /lib/libpthread-2.7.so)
==12136==    by 0x59A959C: clone (in /lib/libc-2.7.so)
==12136==  Address 0x6068788 is 8 bytes inside a block of size 40 free'd
==12136==    at 0x4C2130F: free (vg_replace_malloc.c:323)
==12136==    by 0x415770: xfree (in /sbin/multipathd)
==12136==    by 0x4069F8: (within /sbin/multipathd)
==12136==    by 0x406D98: (within /sbin/multipathd)
==12136==    by 0x58F81A5: (below main) (in /lib/libc-2.7.so)
==12136==
==12136== Invalid read of size 4
==12136==    at 0x4E2E509: pthread_mutex_lock (in /lib/libpthread-2.7.so)
==12136==    by 0x42B596: (within /sbin/multipathd)
==12136==    by 0x42BC83: (within /sbin/multipathd)
==12136==    by 0x4E2CFC6: start_thread (in /lib/libpthread-2.7.so)
==12136==    by 0x59A959C: clone (in /lib/libc-2.7.so)
==12136==  Address 0x606878c is 12 bytes inside a block of size 40 free'd
==12136==    at 0x4C2130F: free (vg_replace_malloc.c:323)
==12136==    by 0x415770: xfree (in /sbin/multipathd)
==12136==    by 0x4069F8: (within /sbin/multipathd)
==12136==    by 0x406D98: (within /sbin/multipathd)
==12136==    by 0x58F81A5: (below main) (in /lib/libc-2.7.so)
==12136==
==12136== Invalid write of size 4
==12136==    at 0x4E2E50D: pthread_mutex_lock (in /lib/libpthread-2.7.so)
==12136==    by 0x42B596: (within /sbin/multipathd)
==12136==    by 0x42BC83: (within /sbin/multipathd)
==12136==    by 0x4E2CFC6: start_thread (in /lib/libpthread-2.7.so)
==12136==    by 0x59A959C: clone (in /lib/libc-2.7.so)
==12136==  Address 0x6068788 is 8 bytes inside a block of size 40 free'd
==12136==    at 0x4C2130F: free (vg_replace_malloc.c:323)
==12136==    by 0x415770: xfree (in /sbin/multipathd)
==12136==    by 0x4069F8: (within /sbin/multipathd)
==12136==    by 0x406D98: (within /sbin/multipathd)
==12136==    by 0x58F81A5: (below main) (in /lib/libc-2.7.so)
==12136==
==12136== Invalid read of size 8
==12136==    at 0x42B5E8: (within /sbin/multipathd)
==12136==    by 0x42BC83: (within /sbin/multipathd)
==12136==    by 0x4E2CFC6: start_thread (in /lib/libpthread-2.7.so)
==12136==    by 0x59A959C: clone (in /lib/libc-2.7.so)
==12136==  Address 0x6068738 is 0 bytes inside a block of size 24 free'd
==12136==    at 0x4C2130F: free (vg_replace_malloc.c:323)
==12136==    by 0x415770: xfree (in /sbin/multipathd)
==12136==    by 0x406A0C: (within /sbin/multipathd)
==12136==    by 0x406D98: (within /sbin/multipathd)
==12136==    by 0x58F81A5: (below main) (in /lib/libc-2.7.so)
==12136==
==12136== Invalid read of size 4
==12136==    at 0x4E2FBF5: __pthread_mutex_unlock_usercnt (in /lib/libpthread-2.7.so)
==12136==    by 0x42B5EF: (within /sbin/multipathd)
==12136==    by 0x42BC83: (within /sbin/multipathd)
==12136==    by 0x4E2CFC6: start_thread (in /lib/libpthread-2.7.so)
==12136==    by 0x59A959C: clone (in /lib/libc-2.7.so)
==12136==  Address 0x10 is not stack'd, malloc'd or (recently) free'd
==12136==
==12136==
==12136== Process terminating with default action of signal 11 (SIGSEGV)
==12136==  Access not within mapped region at address 0x10
==12136==    at 0x4E2FBF5: __pthread_mutex_unlock_usercnt (in /lib/libpthread-2.7.so)
==12136==    by 0x42B5EF: (within /sbin/multipathd)
==12136==    by 0x42BC83: (within /sbin/multipathd)
==12136==    by 0x4E2CFC6: start_thread (in /lib/libpthread-2.7.so)
==12136==    by 0x59A959C: clone (in /lib/libc-2.7.so)
==12136==
==12136== ERROR SUMMARY: 6 errors from 5 contexts (suppressed: 9 from 2)
==12136== malloc/free: in use at exit: 13,258 bytes in 44 blocks.
==12136== malloc/free: 3,990 allocs, 3,946 frees, 3,016,169 bytes allocated.
==12136== For counts of detected errors, rerun with: -v
==12136== searching for pointers to 44 not-freed blocks.
==12136== checked 399,480 bytes.
==12136==
==12136== LEAK SUMMARY:
==12136==    definitely lost: 0 bytes in 0 blocks.
==12136==      possibly lost: 1,152 bytes in 4 blocks.
==12136==    still reachable: 12,106 bytes in 40 blocks.
==12136==         suppressed: 0 bytes in 0 blocks.
==12136== Rerun with --leak-check=full to see details of leaked memory.
^C

# !ps
root     12196 12098  0 10:46 pts/2    00:00:00 grep multip

# /etc/init.d/multipath-tools start
Starting multipath daemon: multipathd.
# !ps
root     12207     1  0 10:46 ?        00:00:00 /sbin/multipathd
root     12236 12230  0 10:46 pts/2    00:00:00 grep multi
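
A tentative reading of the trace above: thread 9 calls pthread_mutex_lock
on an address inside a 40-byte block that had already been released through
xfree on the main thread's path (the frames ending in "(below main)"). That
would be consistent with the main thread freeing shared state at shutdown
while a worker thread is still running. The binary has no debug symbols, so
this is only a guess. The sketch below is not multipathd code; the names
struct shared, worker and run_and_shutdown are hypothetical. It only
illustrates a shutdown order that avoids this class of error: ask the worker
to stop, join it, and only then destroy and free the lock.

```c
#include <assert.h>
#include <pthread.h>
#include <stdlib.h>

/* Hypothetical stand-in for the heap block that holds the mutex. */
struct shared {
    pthread_mutex_t lock;
    int stop;    /* set by the main thread to request shutdown */
    long ticks;  /* work counter, protected by lock */
};

/* Worker loop: takes the lock on every pass until asked to stop. */
static void *worker(void *arg)
{
    struct shared *s = arg;
    int stop;

    do {
        pthread_mutex_lock(&s->lock);
        stop = s->stop;
        if (!stop)
            s->ticks++;
        pthread_mutex_unlock(&s->lock);
    } while (!stop);
    return NULL;
}

/* Start one worker, then shut down in a safe order.
 * Returns the final tick count, or -1 if setup failed. */
long run_and_shutdown(void)
{
    struct shared *s = malloc(sizeof(*s));
    pthread_t tid;
    long ticks;
    int rc;

    if (s == NULL)
        return -1;
    if (pthread_mutex_init(&s->lock, NULL) != 0) {
        free(s);
        return -1;
    }
    s->stop = 0;
    s->ticks = 0;
    if (pthread_create(&tid, NULL, worker, s) != 0) {
        pthread_mutex_destroy(&s->lock);
        free(s);
        return -1;
    }

    /* 1. Ask the worker to stop, under the lock. */
    pthread_mutex_lock(&s->lock);
    s->stop = 1;
    pthread_mutex_unlock(&s->lock);

    /* 2. Wait until the worker has really exited. */
    rc = pthread_join(tid, NULL);
    assert(rc == 0);
    (void)rc;

    /* 3. Only now is it safe to destroy the mutex and free the block. */
    ticks = s->ticks;
    pthread_mutex_destroy(&s->lock);
    free(s);
    return ticks;
}
```

Build with something like "cc -pthread" and call run_and_shutdown() from
main. Swapping steps 2 and 3 would leave the worker locking freed memory,
which is the kind of access Memcheck flags as "Invalid read ... inside a
block of size 40 free'd".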
