[pkg-db-devel] Bug#622909: Deadlock in libdb4.5

Fredrik Tolf fredrik at dolda2000.com
Fri Apr 15 18:52:15 UTC 2011


Package: libdb4.5
Version: 4.5.20-13
Severity: normal

I'm using db4.5 via the bsddb module in Python in a multithreaded
program, and am having problems with deadlocking where, every few
days, two or more threads deadlock inside db4.5.

I've been over my code (which really isn't very complex at all) many
times by now, and I can hardly even imagine anymore that I could be
doing anything wrong, so I'm suspecting some kind of bug in libdb4.5
itself. I'm opening the environment with DB_THREAD and DB_INIT_LOCK,
and the databases themselves with DB_THREAD, and from all I know, that
should be enough to ensure libdb doesn't deadlock.
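
For reference, a minimal sketch of the kind of setup described above. It
is not the actual program. The function name, the home directory, and
the file name are hypothetical. DB_CREATE and DB_INIT_MPOOL are
assumptions added so the sketch can open a fresh environment. The hash
access method is inferred from the __ham_* frames in the backtraces
below.

```python
def open_store(db, home, filename):
    """Open an environment and one hash database with the flags from the report.

    `db` is the Berkeley DB binding module (in Python 2.5: `from bsddb import db`).
    """
    env = db.DBEnv()
    # DB_THREAD and DB_INIT_LOCK are the flags named in the report.
    # DB_CREATE and DB_INIT_MPOOL are assumptions, not taken from the report.
    env.open(home,
             db.DB_CREATE | db.DB_THREAD | db.DB_INIT_LOCK | db.DB_INIT_MPOOL)

    # The database handle is also opened with DB_THREAD, as described.
    # DB_HASH is inferred from the backtraces; DB_CREATE is an assumption.
    d = db.DB(env)
    d.open(filename, dbtype=db.DB_HASH, flags=db.DB_CREATE | db.DB_THREAD)
    return env, d
```

With the Python 2.5 bsddb module this would be called as
open_store(db, "/path/to/env", "data.db") after "from bsddb import db",
and the returned handle then shared between threads for d[key] lookups
and d[key] = value stores.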

I've been able to make two instances of the program coredump under the
deadlock condition, and in both cases there were three threads
deadlocking with the exact same call stacks in both instances. They
look like this.

Thread 1:
#0  0x00007f027543fd29 in pthread_cond_wait@@GLIBC_2.3.2 () from /lib/libpthread.so.0
#1  0x00007f0271e529be in __db_pthread_mutex_lock () from /usr/lib/libdb-4.5.so
#2  0x00007f0271f01f4a in __lock_get_internal () from /usr/lib/libdb-4.5.so
#3  0x00007f0271f02056 in __lock_get () from /usr/lib/libdb-4.5.so
#4  0x00007f0271eda293 in __db_lget () from /usr/lib/libdb-4.5.so
#5  0x00007f0271e7cf45 in __ham_get_meta () from /usr/lib/libdb-4.5.so
#6  0x00007f0271e760dc in ?? () from /usr/lib/libdb-4.5.so
#7  0x00007f0271ecc84c in __db_c_get () from /usr/lib/libdb-4.5.so
#8  0x00007f0271ed8002 in __db_get () from /usr/lib/libdb-4.5.so
#9  0x00007f0271ed82de in __db_get_pp () from /usr/lib/libdb-4.5.so
#10 0x00007f027216d5ba in DB_subscript (self=0x10f62b0, keyobj=0x144fc48) at /build/buildd-python2.5_2.5.2-15+lenny1-amd64-gBDyED/python2.5-2.5.2/Modules/_bsddb.c:2792
(More frames follow inside the Python interpreter)

Thread 2:
#0  0x00007f027543fd29 in pthread_cond_wait@@GLIBC_2.3.2 () from /lib/libpthread.so.0
#1  0x00007f0271e529be in __db_pthread_mutex_lock () from /usr/lib/libdb-4.5.so
#2  0x00007f0271f01f4a in __lock_get_internal () from /usr/lib/libdb-4.5.so
#3  0x00007f0271f02056 in __lock_get () from /usr/lib/libdb-4.5.so
#4  0x00007f0271eda293 in __db_lget () from /usr/lib/libdb-4.5.so
#5  0x00007f0271edb80c in __db_new () from /usr/lib/libdb-4.5.so
#6  0x00007f0271e7e78c in __ham_add_ovflpage () from /usr/lib/libdb-4.5.so
#7  0x00007f0271e7f103 in __ham_add_el () from /usr/lib/libdb-4.5.so
#8  0x00007f0271e75905 in ?? () from /usr/lib/libdb-4.5.so
#9  0x00007f0271eceb8a in __db_c_put () from /usr/lib/libdb-4.5.so
#10 0x00007f0271ec6b10 in __db_put () from /usr/lib/libdb-4.5.so
#11 0x00007f0271ed6480 in __db_put_pp () from /usr/lib/libdb-4.5.so
#12 0x00007f027216db81 in DB_ass_sub (self=0x10f62b0, keyobj=<value optimized out>, dataobj=0x11ebd00) at /build/buildd-python2.5_2.5.2-15+lenny1-amd64-gBDyED/python2.5-2.5.2/Modules/_bsddb.c:678

Thread 3:
#0  0x00007f027543fd29 in pthread_cond_wait@@GLIBC_2.3.2 () from /lib/libpthread.so.0
#1  0x00007f0271e529be in __db_pthread_mutex_lock () from /usr/lib/libdb-4.5.so
#2  0x00007f0271f01f4a in __lock_get_internal () from /usr/lib/libdb-4.5.so
#3  0x00007f0271f02056 in __lock_get () from /usr/lib/libdb-4.5.so
#4  0x00007f0271eda293 in __db_lget () from /usr/lib/libdb-4.5.so
#5  0x00007f0271edb80c in __db_new () from /usr/lib/libdb-4.5.so
#6  0x00007f0271e7e78c in __ham_add_ovflpage () from /usr/lib/libdb-4.5.so
#7  0x00007f0271e7f103 in __ham_add_el () from /usr/lib/libdb-4.5.so
#8  0x00007f0271e75905 in ?? () from /usr/lib/libdb-4.5.so
#9  0x00007f0271eceb8a in __db_c_put () from /usr/lib/libdb-4.5.so
#10 0x00007f0271ec6b10 in __db_put () from /usr/lib/libdb-4.5.so
#11 0x00007f0271ed6480 in __db_put_pp () from /usr/lib/libdb-4.5.so
#12 0x00007f027216db81 in DB_ass_sub (self=0x10f62b0, keyobj=<value optimized out>, dataobj=0x194d780) at /build/buildd-python2.5_2.5.2-15+lenny1-amd64-gBDyED/python2.5-2.5.2/Modules/_bsddb.c:678

I suspect that the first thread (in db->get) is just collateral
damage, and that it is threads 2 and 3 that cause the deadlock
via the __ham_add_ovflpage function, since that sounds like an
important function with global side effects on the database. :-)

Of course, the troubleshooting is made harder by the fact that
there are no debugging symbols to be had for libdb4.5.

-- System Information:
Debian Release: 5.0.8
  APT prefers oldstable
  APT policy: (500, 'oldstable')
Architecture: amd64 (x86_64)

Kernel: Linux 2.6.26-2-amd64 (SMP w/8 CPU cores)
Locale: LANG=en_US.UTF-8, LC_CTYPE=en_US.UTF-8 (charmap=UTF-8)
Shell: /bin/sh linked to /bin/bash

Versions of packages libdb4.5 depends on:
ii  libc6                       2.7-18lenny7 GNU C Library: Shared libraries

libdb4.5 recommends no packages.

libdb4.5 suggests no packages.

-- no debconf information




