[Buildd-tools-devel] RFC: Creation of a PostgreSQL database schema for wanna-build data

Adeodato Simó dato at net.com.org.es
Mon Jan 26 18:35:08 UTC 2009


* Marc Brockschmidt [Mon, 26 Jan 2009 19:25:08 +0100]:

> > We need to figure out how to store extra access. In principle, a simple
> > table "extra_access" with (login, arch) rows should suffice. However, I
> > would like there's a way to say "all arches", and neither of the two
> > ways I thought of doing that pleased me much.

> > (Way #1: have a boolean in "extra_access"; then the "arch" column
> > becomes meaningless for that row. Way #2: allow an "all" value for arch
> > in extra_access; then, you can't reference architectures and have to
> > reference package_architectures instead.)

> I would like to join the information in extra_access (or however it is
> named) with the buildd_admins table anyway, creating an "access"
> view. In that process, translating "all" to an entry for each arch
> shouldn't be too hard. It would allow to do actual access control
> against a single view, instead of checking two separate tables.

Yeah, that's exactly what I had in mind. In fact, I have the view
written.

The question was what to point the "arch" column in extra_access to: the
"architectures" table does not have "all", and pointing it to
package_architectures just to gain "all" doesn't appeal me much.

> > Ok, I added those as well, but without a NOT NULL restriction, otherwise
> > it'll be impossible to import old logs. If somebody cares enough, once
> > we have this DB up and the old logs imported we can run a daily job to
> > slowly fill in that information, if it's easily retrievable from the log
> > itself (which I think it is).

> tail -n 1 on any build log will yield
> | Build needed $time, $sizek disk space
> This could (and should) be imported when the old build logs are
> imported.

Ok. I'll suggest a two-phase import, first the paths/pkg/ver/date/status,
and then the bit that needs bzcat, so as not to make the import eternal,
and have the second-stage easily interrupt/resume-eable.

> > Also, please check that build_time = interval, and used_space => integer
> > are appropriate names and types. (In particular, I'm not sure if "space
> > statistics" is just a number, or more.)

> It's the size of the build dir, in KB. This doesn't count installed
> dependencies, which makes for interesting problems when trying to
> guess how much space you need to build packages (for example, OO.org's
> build-deps add more than a GB to the space requirements). We might want
> to change the computation of that at some point.

Ok, so an integer should be ok.

> I have to admit that I'm not a big fan of using natural joins instead of
> the self-documenting "join ... on" construct. It requires people to keep
> the table definitions always in mind to understand what is used to join
> them.

Yes, I can understand that point of view. But still, with this
improvement you can use JOIN ... USING (column), which is more concise.

Thanks,

-- 
Adeodato Simó                                     dato at net.com.org.es
Debian Developer                                  adeodato at debian.org
 
The pure and simple truth is rarely pure and never simple.
                -- Oscar Wilde




More information about the Buildd-tools-devel mailing list