[Pkg-postgresql-public] Bug#839954: PostgreSQL packages in Debian, problems on startup

Thu Oct 6 18:11:35 UTC 2016

Package: postgresql-common

Re: deavid 2016-10-05 <CAFR-75vif=hSH5E3f2SQSXV39vkNjcZVk+KQ3ya2UN6P25RziQ at mail.gmail.com>
> Hello Christoph,
> 
> First of all, thank you a lot for your hard work on your tools for
> postgresql clusters. It is very helpful indeed. PostgreSQL is a lot easier
> to work with in Debian & Ubuntu thanks to those tools.
> 
> I want to inform you about an issue, possibly caused by several bugs or
> by misuse on my part. I don't know.
> 
> The issue itself can be explained as: init.d scripts manage a list of
> servers/clusters different from the one shown by pg_lsclusters.
> 
> The effect is always the same: I end up with some particular cluster that
> doesn't start or stop when called from /etc/init.d/postgresql start/stop.
> When I try to start it manually, it starts without problems.
> 
> I had several talks on IRC in the #postgres channel on Freenode; I also
> asked on #debian and #debian-next, and no one had a clue.
> 
> Finally I traced the problem. And it seems to be a lack of documentation,
> and maybe some bugs. I could trace it thanks to your documentation in
> README files of postgresql-common package, but I think it could be better
> explained.
> 
> The problem is that the init.d scripts rely on systemctl, and I don't know
> how that works. It seems the second list is in
> /var/run/systemd/generator/postgresql.service.wants/
> 
> I would recommend adding a tool that checks this folder, does some repair,
> and outputs a log of possible problems.
> 
> I've found 3 ways to get a broken list in systemd:
> 
> 1) Failed pg_upgradecluster due to failure on postgresql.conf migration:
> From 9.3 to 9.5 several options have changed and older options are no
> longer supported. If you have modified those options, they are
> uncommented. The most notable is checkpoint_segments, whose default is too
> low and which disappears in newer versions. When the script finally tries
> to "start" the cluster, it fails and forgets to update the systemd
> services, so when you manually fix postgresql.conf there's no way to
> "resume" the upgrade.

> The pg_upgradecluster failure is an old problem by now. I believe it was
> on wheezy, so it should be version 134wheezy4 or similar.
> The other two occurred this week on an up-to-date Jessie system, so
> they should be version 165+deb8u1.

Hi deavid,

both the unknown options and the systemd integration with
pg_upgradecluster have been fixed in the meantime; unfortunately not
in the jessie version of postgresql-common. Though if you say
9.5, is that from apt.postgresql.org? If so, make sure to upgrade to
the postgresql-common version from there as well.

> 2) Placing a dash (-) in the cluster name. It seems it's a separator for
> your scripts. When you do this, pg_createcluster doesn't notice and
> continues, leaving you with a working cluster that doesn't start on boot.

This has also been, well, addressed. It's not really fixable because
of the way instanced systemd service units work (see
/lib/systemd/system/postgresql\@.service). pg_createcluster will now
warn if you create a cluster with a dash in the name.
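
To illustrate why the dash is ambiguous, here is a minimal sketch. It
assumes the unit instance is named "version-cluster" and that systemd
turns dashes in the instance name into slashes when it unescapes it to
build a path; the function names are made up for this example, and the
real unescaping also handles \x escapes, which this ignores.

```shell
# Hypothetical helper: mimic how an instance name such as "9.5-main"
# is turned into a path fragment by replacing every dash with a slash.
instance_to_path() {
    printf '%s\n' "$1" | sed 's|-|/|g'
}

# Hypothetical helper: succeed (exit status 0) if a cluster name
# contains a dash, which is the case pg_createcluster now warns about.
name_has_dash() {
    case "$1" in
        *-*) return 0 ;;
        *)   return 1 ;;
    esac
}
```

With these, "9.5-main" maps to 9.5/main as intended, while a cluster
called "my-db" gives "9.5-my-db", which maps to 9.5/my/db and no longer
matches the directory under /etc/postgresql.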

> 3) Using pg_renamecluster. It leaves old names in systemd.

That item is still open, thanks for spotting.

> I would recommend:
> * Add --skip-systemctl-redirect to the man page of pg_ctlcluster. It is
> useful for debugging problems. I had to dig through the Perl scripts to
> find the option. (I do not know Perl.)

Aye.

> * Document in pg_ctlcluster how to fix a cluster that doesn't start on boot.

Possibly, will think about it.

> What I did manually to fix this:
> * Modified the names of symlinks in
> /var/run/systemd/generator/postgresql.service.wants/

"systemctl daemon-reload" should have done that (that's what was
missing in the old pg_upgradecluster version).
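
To see which cluster units the generator currently has linked, the
directory mentioned above can simply be listed and compared with the
output of pg_lsclusters. A small sketch; the function name is made up
for this example, and the default path is the one quoted in the report:

```shell
# Hypothetical helper: print the postgresql@*.service entries found in
# the generator "wants" directory (or in the directory given as $1).
list_generated_units() {
    dir=${1:-/var/run/systemd/generator/postgresql.service.wants}
    for f in "$dir"/postgresql@*.service; do
        # Skip the unexpanded pattern when nothing matches; also accept
        # symlinks whose target does not exist.
        [ -e "$f" ] || [ -L "$f" ] || continue
        basename "$f"
    done
}
```

Running list_generated_units next to pg_lsclusters shows at a glance
whether a cluster is missing from the list, or is still listed under an
old name.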

> * systemctl daemon-reload
> * systemctl stop postgresql
> * pg_lsclusters to see if any cluster is still running, and stop it with
> "pg_ctlcluster --skip-systemctl-redirect"

That was the other half of the breakage; the cluster would not be
started via systemd during the upgrade.

> * systemctl start postgresql
> * pg_lsclusters to see if every cluster is running again.
> 
> I've checked /etc/postgresql/9.5/*/start.conf a dozen times, but it has
> its original contents. I tried "manual" and it didn't help, so I put
> "auto" back in it.

If you change start.conf, you need to "systemctl daemon-reload" to get
the generator symlinks updated.
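
Put together, the sequence from the report could be sketched like
this. It is only an outline: the function name and the RUN variable
are made up for this example, and by default the commands are printed
instead of executed.

```shell
# Dry run by default: each command is echoed instead of executed.
# Set RUN to the empty string (as root) to really run the commands.
RUN=${RUN-echo}

# Hypothetical helper: regenerate the systemd units, then restart all
# clusters through the postgresql service.
refresh_clusters() {
    $RUN systemctl daemon-reload     # re-run the generator
    $RUN systemctl stop postgresql
    $RUN pg_lsclusters               # anything still online here was not
                                     # stopped by systemd; stop it by hand,
                                     # e.g. for a cluster 9.5/main:
                                     # pg_ctlcluster --skip-systemctl-redirect 9.5 main stop
    $RUN systemctl start postgresql
    $RUN pg_lsclusters               # every cluster should be online again
}
```

Calling refresh_clusters as is prints the five commands in order, which
makes it easy to check the sequence before running it for real.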

Christoph