[Pkg-postgresql-public] pg_upgrade support in pg_upgradecluster

Martin Pitt mpitt at debian.org
Wed Jul 25 05:35:36 UTC 2012


Hello Peter,

just answering to your questions, as I'm about to leave for GUADEC and
then vacation. I haven't looked closely at the patch yet.

Can we please keep this out of bzr at least until after the next
upload, and getting that into testing? This doesn't look like
something we want to try and get past the release team at this point.

Peter Eisentraut [2012-07-24 15:25 +0300]:
> I have added pg_upgrade support to pg_upgradecluster.  I have it
> basically running, and I would like to discuss some details.

\o/ Rocking! I thought there was something rather fundamental still
missing from pg_upgrade that prevented us from using it so far? Not
being able to handle the config vs. data dir separation into /etc/ and
/var/? Did that get fixed upstream?

> First, invocation interface: I have used -m|--method={dump,copy,link},
> where dump is the existing method, and also the default, copy is
> pg_upgrade in copy mode, and link is pg_upgrade in link mode.  I'm not
> too happy with that, because it obscures that pg_upgrade is being
> used, and hides that copy and link are more closely related than dump.
> Maybe someone has a better idea.

Let me ask that the other way round: If we are in a situation where we
can use pg_upgrade (e. g. in Wheezy we'll only support 9.1, so the
Wheezy -> Wheezy+1 upgrade will be from 9.1), is there any reason not
to use that as a method instead of the pg_dump/pg_restore method? It
does seem to retain the old data dir etc., so that the basic
properties of p-common (running several clusters in parallel, always
able to go back to the original cluster if your upgrade doesn't
succeed) are retained.

If pg_upgrade dominates dump/restore in terms of performance and
safety, then I propose that pg_upgrade in copy mode becomes the
default for versions that it supports, and for older versions it uses
the dump/restore approach. --link seems rather dangerous in that
regard as it prevents you from using the original cluster if the
upgraded doesn't work (sufficiently well) for you, or this was just a
test upgrade to a beta version, etc. Adding an option for --link mode
seems ok, but it should get a proper warning in the
--help/man page.

> Scripts in /etc/postgresql-common/pg_upgradecluster.d: I have left it
> so that they are used in any of the methods, but I have my doubts
> about whether that is appropriate.  I couldn't find any examples of
> packages that are actually using this, so it's hard to tell.  Maybe
> it's obsolescent?

This was originally implemented for extensions like PostGIS which need
custom steps to set up the new cluster before/after dumping, but it
seems the PostGIS maintainers (or others) never acutally implement
this. If the "init" phase doesn't work any more with pg_upgrade,
perhaps we can at least leave the "finish" phase?

This has been around for 6 years, so at this point I'm hesitant about
dropping it. Some admins might have put their own scripts there.

> pg_upgradecluster is also used in a number of other tests.  I haven't
> taught those about the different upgrade methods, but that could be
> done once the interface is settled.

For tests like 070_non_postgres_clusters.t or
120_pg_upgradecluster_scripts.t which do not test the particular
details of the upgrade and just need it working at all, I think it is
sufficient to just run it in its default mode (which might very well
be pg_upgrade, see above).

If pg_upgradecluster does not even have an option which method to use
(it just selects whichever method is available), then the tests should
be fine as they are.

> -print "Restarting old cluster with restricted connections...\n";
> - at argv = ('pg_ctlcluster', $version, $cluster, 'start');
> -error "Could not restart old cluster" if system @argv;
> +if ($method eq 'dump') {
> +    print "Restarting old cluster with restricted connections...\n";
> +    @argv = ('pg_ctlcluster', $version, $cluster, 'start');
> +    error "Could not restart old cluster" if system @argv;
> +}

Is there no danger of accessing the cluster while it's being upgraded
with pg_upgrade?

> +	# Make a temporary directory for pg_upgrade to store its
> +	# reports and log files.  Will be removed on success, but
> +	# retained on error.
> +	my $tmpdir = tempdir("pg_upgradecluster-pg_upgrade-$cluster-$version-$newversion.XXXXXX", TMPDIR => 1);

How about putting that into /var/log/postgresql/ instead, and
unlinking it on success? That would be more permanent than /tmp, which
gets removed after a reboot, and it could also get a predictable file
name.


> +	# pg_upgrade doesn't support separate configuration and data
> +	# directories, so we must fix this up temporarily.
> +	foreach my $file ('postgresql.conf', 'pg_hba.conf', 'pg_ident.conf') {
> +	    symlink($info{'configdir'}."/$file", $info{'pgdata'}."/$file") or error 'could not symlink configuration file';
> +	    symlink($newinfo{'configdir'}."/$file", $newinfo{'pgdata'}."/$file") or error 'could not symlink configuration files';
> +	}

Oh, I see. That addresses my question from above :-) Could that cause
any trouble with relative paths in any of those files, which expect
ident maps etc. in /etc/postgresql/, and with this pg_upgrade suddenly
would look for them in the data dir?

How hard would it be to teach pg_upgrade about data_directory and
friends?

> +	# Run pg_upgrade.
> +	(system @argv) == 0 or error 'pg_upgrade run failed';
> +	# And remove the temporary configuration files again.
> +	foreach my $file ('postgresql.conf', 'pg_hba.conf', 'pg_ident.conf') {
> +	    unlink($newinfo{'pgdata'}."/$file");
> +	}

That should also be done if pg_upgrade fails. I. e. save the return
value of system, clean up, and then call error.

Thanks!

Martin
-- 
Martin Pitt                        | http://www.piware.de
Ubuntu Developer (www.ubuntu.com)  | Debian Developer  (www.debian.org)
-------------- next part --------------
A non-text attachment was scrubbed...
Name: signature.asc
Type: application/pgp-signature
Size: 836 bytes
Desc: Digital signature
URL: <http://lists.alioth.debian.org/pipermail/pkg-postgresql-public/attachments/20120725/c9e5ee05/attachment.pgp>


More information about the Pkg-postgresql-public mailing list