Bug#842796: libc recently more aggressive about pthread locks in stable ?

Ian Jackson ijackson at chiark.greenend.org.uk
Sun Nov 6 20:02:56 UTC 2016


Henrique de Moraes Holschuh writes ("Re: libc recently more aggressive about pthread locks in stable ?"):
> Per logs from message #15 on bug #842796:
> https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=842796#15
> 
> SIGSEGV on __lll_unlock_elision is a signature (IME with very high
> confidence) of an attempt to unlock an already unlocked lock while
> running under hardware lock elision.

I don't know anything about hardware lock elision...

> Well, unlocking an already unlocked lock is a pthreads API rule
> violation, and it is going to crash the process on something that
> implements hardware lock elision.

... but you are of course correct about this.  I debugged the problem
with ghostscript, and it was indeed violating the pthreads rules.  I
have filed #843324 with a patch for Debian to backport the
corresponding upstream fix.  I don't understand the wider logic in
ghostscript; the bug was in the colour space management code and
occurred when a function was called with two pointer arguments which
were actually aliases of the same colourspace-related data structure.
Converting ghostscript to use recursive mutexes was IMO clearly
correct and fixed the bug.
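
To illustrate the aliasing case, here is a minimal sketch in C.  It is
not ghostscript's code: struct cspace, cspace_init, cspace_combine and
demo_aliased_combine are hypothetical names invented for this example.
It assumes a function that locks both of its arguments; when both
pointers refer to the same object, one thread locks and unlocks the same
mutex twice, which is only permitted for a recursive mutex.

```c
#define _XOPEN_SOURCE 700   /* expose PTHREAD_MUTEX_RECURSIVE under strict -std modes */
#include <assert.h>
#include <pthread.h>

/* Hypothetical stand-in for a colour-space object; NOT ghostscript's real type. */
struct cspace {
    pthread_mutex_t lock;
    int uses;
};

/* Initialise the object with a recursive mutex.  Returns 0 or an error number. */
static int cspace_init(struct cspace *cs)
{
    pthread_mutexattr_t attr;
    int rc = pthread_mutexattr_init(&attr);
    if (rc != 0)
        return rc;
    rc = pthread_mutexattr_settype(&attr, PTHREAD_MUTEX_RECURSIVE);
    if (rc == 0)
        rc = pthread_mutex_init(&cs->lock, &attr);
    pthread_mutexattr_destroy(&attr);
    cs->uses = 0;
    return rc;
}

/*
 * Hypothetical operation that locks both arguments.  If a == b, this
 * thread locks the same mutex twice and unlocks it twice, which is
 * valid only because the mutex is recursive.
 */
static int cspace_combine(struct cspace *a, struct cspace *b)
{
    int rc = pthread_mutex_lock(&a->lock);
    if (rc != 0)
        return rc;
    rc = pthread_mutex_lock(&b->lock);
    if (rc != 0) {
        pthread_mutex_unlock(&a->lock);
        return rc;
    }
    a->uses++;
    b->uses++;
    int rc_b = pthread_mutex_unlock(&b->lock);
    int rc_a = pthread_mutex_unlock(&a->lock);
    return rc_b != 0 ? rc_b : rc_a;
}

/* Call cspace_combine with aliased arguments and return the use count. */
static int demo_aliased_combine(void)
{
    struct cspace cs;
    int rc = cspace_init(&cs);
    assert(rc == 0);
    rc = cspace_combine(&cs, &cs);   /* both pointers alias one object */
    assert(rc == 0);
    (void)rc;                        /* silence warnings when NDEBUG is set */
    int uses = cs.uses;
    pthread_mutex_destroy(&cs.lock);
    return uses;
}
```

With a default (non-recursive) mutex, the second pthread_mutex_lock in
the aliased call would be a rules violation, and code that works around
it by skipping the second lock but still unlocking twice ends up
unlocking an already unlocked lock.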

> If the problem is too widespread and too hard to fix on a large number
> of packages, I suppose we could ask the glibc maintainers to consider
> disabling hardware lock elision support in stable through a stable
> update.

I think this would be a good idea.

ogg123 and ghostscript are hardly obscure programs.  It's difficult to
know how bad this problem is, but we would like stable to be useful
even on recent hardware.

> And what should we do about Debian stretch, then?

Perhaps we could add the assert you suggest, on non-lock-elision
hardware.  Whether to do that would depend on its performance impact.
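
For comparison, POSIX already provides a per-mutex way to detect this
violation: an error-checking mutex reports it through the return value
rather than crashing.  The sketch below is only an illustration of that
mutex type, not the glibc assert under discussion; the function name
unlock_unlocked_errorcheck is made up for the example.

```c
#define _XOPEN_SOURCE 700   /* expose PTHREAD_MUTEX_ERRORCHECK under strict -std modes */
#include <assert.h>
#include <errno.h>
#include <pthread.h>

/* Unlock a never-locked error-checking mutex and return the result. */
static int unlock_unlocked_errorcheck(void)
{
    pthread_mutexattr_t attr;
    pthread_mutex_t m;

    int rc = pthread_mutexattr_init(&attr);
    assert(rc == 0);
    rc = pthread_mutexattr_settype(&attr, PTHREAD_MUTEX_ERRORCHECK);
    assert(rc == 0);
    rc = pthread_mutex_init(&m, &attr);
    assert(rc == 0);
    pthread_mutexattr_destroy(&attr);

    /* Rules violation: m is not locked by this (or any) thread. */
    rc = pthread_mutex_unlock(&m);

    pthread_mutex_destroy(&m);
    return rc;
}
```

For an error-checking mutex, POSIX requires pthread_mutex_unlock to
fail with EPERM when the calling thread does not own the mutex, so a
caller can log or abort at the point of the violation.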

TBH I wonder whether we really want to be giving an evidently shonky
codebase boobytrapped mutexes by default.  We could change the default
mutex type to recursive and make all of these bugs go away.

Ian.

-- 
Ian Jackson <ijackson at chiark.greenend.org.uk>   These opinions are my own.

If I emailed you from an address @fyvzl.net or @evade.org.uk, that is
a private address which bypasses my fierce spamfilter.


