[Logcheck-devel] Bug#657641: /usr/sbin/logcheck: line 100: kill: (31667) - No such process
matthijs
matthijs at stdin.nl
Thu Sep 28 07:56:42 UTC 2017
Package: logcheck
Version: 1.3.18
Followup-For: Bug #657641
Hi,
this problem still exists in current versions, I just ran into it. It
seems to occur when there are a large number of log lines to process, so
that the previous run of logcheck is not complete when the next one is
started. I just caught a case of three logchecks running in parallel,
which is really what this locking is supposed to prevent.
Turns out the locking is actually ineffective (see below), allowing
multiple logcheck runs to run at the same time. Once the first one
finishes, it kills its own lockfile-touch and removes the lockfile.
I couldn't find any docs about how lockfile-touch behaves
when its lockfile is deleted, but the source confirms that it exits with
an error code. Due to this, when the second logcheck run completes, its
lockfile-touch process will have quit, leading to the kill error
message.
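A minimal sketch of where that message comes from, using `sleep` as a hypothetical stand-in for lockfile-touch (the variable names are made up for illustration and are not taken from logcheck):

```shell
#!/bin/sh
# Stand-in for the lockfile-touch helper; `sleep` is only a placeholder.
sleep 30 &
helper_pid=$!

# Simulate the helper exiting on its own before the script tries to stop it
# (in the real case: because another run removed the lockfile).
kill "$helper_pid"
wait "$helper_pid" 2>/dev/null || true

# The script's cleanup now signals a PID that no longer exists.
if kill "$helper_pid" 2>/dev/null; then
    cleanup_kill="succeeded"
else
    cleanup_kill="failed"
fi
echo "cleanup kill $cleanup_kill"
```

Without the `2>/dev/null` on the last kill, the shell prints its own "No such process" complaint, much like the one in the bug title.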
Some testing shows that the cause of this problem is these lines [1]:
    lockfile-create --retry 1 "$LOCKFILE" > /dev/null 2>&1
    if [ $? -eq 1 ]; then
        # Locked, error and quit
This treats only an exit code of 1 as an error, while lockfile-create
actually returns 4 when the lock is already held [2]. Changing the test to
`$? -ne 0`, or more compactly:
    if ! lockfile-create --retry 1 "$LOCKFILE" > /dev/null 2>&1; then
will fix this problem.
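To illustrate the difference between the two checks without needing lockfile-progs installed, here is a small sketch. `fake_lockfile_create` is a hypothetical stand-in that simply returns 4, the status reported above for an already-held lock; it is not the real tool.

```shell
#!/bin/sh
# Hypothetical stand-in for `lockfile-create --retry 1 "$LOCKFILE"` when the
# lock is already held: it exits with status 4, as described above.
fake_lockfile_create() {
    return 4
}

# Current check: only status 1 counts as failure, so status 4 slips through.
check_eq_1() {
    fake_lockfile_create
    if [ $? -eq 1 ]; then
        echo "locked"
    else
        echo "proceeding"
    fi
}

# Proposed check: any non-zero status counts as failure.
check_nonzero() {
    if ! fake_lockfile_create; then
        echo "locked"
    else
        echo "proceeding"
    fi
}

old_result=$(check_eq_1)
new_result=$(check_nonzero)
echo "old check: $old_result"
echo "new check: $new_result"
```

With the old check the script carries on as if it held the lock; with the new one it takes the "locked" branch.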
Gr.
Matthijs
[1]: https://anonscm.debian.org/cgit/logcheck/logcheck.git/tree/src/logcheck#n633
[2]: https://github.com/miquels/liblockfile/blob/master/lockfile.h#L31-L38