[Pkg-postgresql-public] Bug#828769: pgpool2: stale socket files after systemd kills pgpool processes during stop

Debian BTS debbugs at buxtehude.debian.org
Mon Jun 27 16:36:05 UTC 2016

Reply-To: Andreas Unterkircher <unki at netshadow.at>, 828769 at bugs.debian.org
From: Andreas Unterkircher <unki at netshadow.at>
To: Debian Bug Tracking System <submit at bugs.debian.org>
Message-ID: <20160627161152.25874.58895.reportbug at lnx-vie-430.vie.mm-karton.com>
X-Mailer: reportbug 6.6.3
Date: Mon, 27 Jun 2016 18:11:52 +0200

Package: pgpool2
Version: 3.4.3-1
Severity: minor

Dear Maintainer,

I'm seeing stale pgpool2 socket files after systemd has killed the pgpool
processes while stopping pgpool2.

First of all, there is a systemd service file for pgpool2 in
/lib/systemd/system/pgpool2.service. As it does not specify an explicit
ExecStop, systemd's default kill behaviour (see systemd.kill) is used: first
all pgpool processes receive a SIGTERM signal and, after a timeout of 90s, all
remaining processes get a final SIGKILL.
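
One way to give pgpool more time before the final SIGKILL might be a drop-in
that raises the stop timeout. The following is only a sketch:
TimeoutStopSec= is the standard systemd.service setting behind the 90s
default, but the helper name write_stop_timeout_dropin, the file name
stop-timeout.conf and the value used below are illustrative assumptions, not
part of the package.

```shell
#!/bin/sh
# Sketch: write a systemd drop-in that raises the stop timeout for a unit.
# write_stop_timeout_dropin is a hypothetical helper, not part of pgpool2.

# Usage: write_stop_timeout_dropin DROPIN_DIR SECONDS
write_stop_timeout_dropin() {
    dropin_dir=$1
    seconds=$2
    mkdir -p "$dropin_dir" || return 1
    # TimeoutStopSec= controls how long systemd waits after SIGTERM
    # before it sends SIGKILL to the remaining processes.
    cat > "$dropin_dir/stop-timeout.conf" <<EOF
[Service]
TimeoutStopSec=${seconds}
EOF
}
```

For the packaged unit this could be called (as root) as
`write_stop_timeout_dropin /etc/systemd/system/pgpool2.service.d 300`,
followed by `systemctl daemon-reload`. A longer timeout only makes SIGKILL
less likely; a pgpool that has locked up would still be killed eventually and
leave its sockets behind.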

In my setups, stopping pgpool2 sometimes takes longer than the above-mentioned
timeout (perhaps because of long-running SQL queries), and SIGKILL kicks in. In
(rare) cases where pgpool has somehow locked up, it may also fail to exit
cleanly, and SIGKILL will again be used when systemd stops pgpool2. When the
processes get killed hard, they leave residues in $socket_dir.

srwxrwxrwx 1 postgres postgres 0 Jun 23 07:02 /var/run/postgresql/.s.PGSQL.5432
srwxrwxrwx 1 postgres postgres 0 Jun 23 07:02 /var/run/postgresql/.s.PGSQL.9898

On the next start (or during a restart), pgpool2 then fails to start, claiming
that the socket file(s) already exist(s). For example:

pgpool[7643]: [5-1] 2016-06-27 16:58:00: pid 7643: FATAL:  failed to bind a
   socket: "/var/run/postgresql/.s.PGSQL.9898"
pgpool[7643]: [5-2] 2016-06-27 16:58:00: pid 7643: DETAIL:  bind socket failed
   with error: "Address already in use"

As a first-aid measure I have added two ExecStartPre lines to pgpool2.service
that clean up the sockets, provided no pgpool process is still running, before
trying to start pgpool2.

ExecStartPre=/bin/bash -c "{ /usr/bin/test -s /var/run/postgresql/pgpool.pid && ! /usr/bin/pgrep --pidfile /var/run/postgresql/pgpool.pid; } || { /usr/bin/test ! -s /var/run/postgresql/pgpool.pid && ! /usr/bin/pgrep pgpool; } && { /usr/bin/test -S /var/run/postgresql/.s.PGSQL.9898 && /bin/rm /var/run/postgresql/.s.PGSQL.9898; } || /usr/bin/test ! -S /var/run/postgresql/.s.PGSQL.9898"
ExecStartPre=/bin/bash -c "{ /usr/bin/test -s /var/run/postgresql/pgpool.pid && ! /usr/bin/pgrep --pidfile /var/run/postgresql/pgpool.pid; } || { /usr/bin/test ! -s /var/run/postgresql/pgpool.pid && ! /usr/bin/pgrep pgpool; } && { /usr/bin/test -S /var/run/postgresql/.s.PGSQL.5432 && /bin/rm /var/run/postgresql/.s.PGSQL.5432; } || /usr/bin/test ! -S /var/run/postgresql/.s.PGSQL.5432"

This is rather hacky because of the hardcoded socket filenames. Otherwise,
IMHO, the 'port' and 'pcp_port' parameters from pgpool.conf need to be looked
up first to derive the socket names.
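
A minimal sketch of such a lookup follows. It assumes pgpool.conf uses plain
"key = value" lines with optionally single-quoted values, that the socket
directories come from 'socket_dir' and 'pcp_socket_dir', and that the fallback
values (/tmp, 9999, 9898) match the defaults of the installed pgpool version.
The function names are hypothetical, and unlike the ExecStartPre lines above
it checks only for a running pgpool process, not the pid file.

```shell
#!/bin/sh
# Sketch: derive the pgpool2 socket paths from pgpool.conf instead of
# hardcoding them. All function names here are illustrative.

# conf_get FILE KEY DEFAULT: print the last uncommented value of KEY,
# or DEFAULT if KEY is not set in FILE.
conf_get() {
    value=$(sed -n "s/^[[:space:]]*$2[[:space:]]*=[[:space:]]*'\{0,1\}\([^'#[:space:]]*\).*/\1/p" "$1" | tail -n 1)
    printf '%s\n' "${value:-$3}"
}

# socket_paths FILE: print the two socket paths pgpool would bind.
# The fallback values are assumed defaults, not taken from the package.
socket_paths() {
    conf=$1
    dir=$(conf_get "$conf" socket_dir /tmp)
    pcp_dir=$(conf_get "$conf" pcp_socket_dir /tmp)
    port=$(conf_get "$conf" port 9999)
    pcp_port=$(conf_get "$conf" pcp_port 9898)
    printf '%s/.s.PGSQL.%s\n' "$dir" "$port" "$pcp_dir" "$pcp_port"
}

# clean_stale_sockets FILE: remove leftover sockets, but only when no
# pgpool process is running any more.
clean_stale_sockets() {
    if pgrep -x pgpool >/dev/null 2>&1; then
        return 0    # pgpool is still running: the sockets are in use
    fi
    socket_paths "$1" | while IFS= read -r sock; do
        if [ -S "$sock" ]; then
            rm -f -- "$sock"
        fi
    done
    return 0
}
```

Saved as a script that ends with a call such as
`clean_stale_sockets /etc/pgpool2/pgpool.conf` (the path is an assumption
about the Debian layout), this could replace both hardcoded lines with a
single ExecStartPre entry.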

My question - do you see a cleaner way to handle that situation?

-- System Information:
Debian Release: 8.5
  APT prefers stable
  APT policy: (500, 'stable')
Architecture: amd64 (x86_64)
Foreign Architectures: i386

Kernel: Linux 3.16.0-4-amd64 (SMP w/3 CPU cores)
Locale: LANG=en_US.UTF-8, LC_CTYPE=en_US.UTF-8 (charmap=UTF-8)
Shell: /bin/sh linked to /bin/dash
Init: systemd (via /run/systemd/system)


