[debpool] Refactoring the main loop

Magnus Holmgren holmgren at lysator.liu.se
Sat Apr 28 07:45:44 UTC 2007


On Wednesday 11 April 2007 19:49, Magnus Holmgren wrote:
> 1. Check_Files(): Parse the .changes file; check that all the files it
> lists are there and have the correct MD5 checksum; check that the list of
> files is consistent with the Source, Binary, and Architecture fields, and
> that there are no unrecognized files. Parse the .dsc file if source is
> uploaded; check that all files listed have the same properties as in the
> .changes file and/or (for a .orig.tar.gz) as the existing file in the
> archive, that the file names are consistent with the Version field and
> Source fields, and that there are no unrecognized files. While we're at it,
> why not perform similar checks on the debs too? Hmm, if we're going to
> perform such extensive checking I guess it could be broken down further.

Here's my revised plan:

for each file in the .changes list:
  Check MD5 checksum and other basic things,
  Look at the filename and figure out the file type,
  if the file is a binary package:
    Extract control file using Dpkg_Info(),
    Check that Package, Version, Architecture fields match file name,
    Check that there isn't a package of the same name, version and arch [2],
     (looking out for arch-indep packages)
    Check that the architecture is valid,
    Check that there isn't already a binary package of the same name (and
     arch?) in the upload,
    Add package information structure to list of binaries [1].
  if the file is a DSC:
    Parse it[3],
    Check that Source and Version fields match file name and .changes fields,
    Check that there isn't a source package of the same name and version [2],
    Save DSC information structure in a variable [1].
  if the file is some other source file:
    Check if there already is a file of the same name [2][3].
    Add file info to a list.

for each package in the list of binaries:
  Optionally, check that the source package mentioned in the Source field
   exists [and is the one uploaded now, if any], and that it lists the
   binary in its Binary field.

if a DSC was uploaded:
  Check that all files listed in it exist (in the upload or in the archive)
if a DSC was not uploaded:
  Check that there were no "other source files".
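
The first two steps of the per-file loop (basic checks, then working out the
file type from the name) could be sketched roughly as follows. This is only an
illustration in Python, not DebPool code: the patterns, `classify()` and
`md5_matches()` are hypothetical names, and the patterns assume the usual
name_version_arch.deb and name_version.dsc naming.

```python
import hashlib
import re

# Hypothetical patterns (an assumption, not DebPool's actual rules).
# Note that a version embedded in a file name normally lacks the epoch.
DEB_RE = re.compile(
    r"^(?P<package>[^_]+)_(?P<version>[^_]+)_(?P<arch>[^_.]+)\.(?P<type>u?deb)$"
)
DSC_RE = re.compile(r"^(?P<source>[^_]+)_(?P<version>[^_]+)\.dsc$")


def classify(file_name):
    """Return (kind, fields) where kind is 'binary', 'dsc' or 'other'."""
    m = DEB_RE.match(file_name)
    if m:
        return "binary", m.groupdict()
    m = DSC_RE.match(file_name)
    if m:
        return "dsc", m.groupdict()
    # Anything else is treated as "some other source file".
    return "other", {}


def md5_matches(data, expected_hex):
    """Compare the MD5 of the file contents with the value from .changes."""
    return hashlib.md5(data).hexdigest() == expected_hex.lower()
```

The fields parsed from the name are what the later "match file name" checks
would compare against the control file or the DSC.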

[1] Here we add the other information that goes into the Packages and Sources 
files as well, i.e. everything Generate_Package() and Generate_Source() do, 
but without writing anything yet.

[2] In general, a duplicate can be ignored if it matches the existing file; 
otherwise the upload is rejected. This allows packages to be added to new 
suites by re-uploading. There are two ways to find existing files: looking up 
the pertinent package record, or keeping a list of all files. (Two files with 
the same name shouldn't exist, except in the convoluted case where two 
different Debian versions of a package, with the same upstream version, exist 
in two different components.)
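
The duplicate rule in [2] amounts to a three-way decision. A minimal sketch,
assuming that "matches" means equal size and MD5 sum (`FileInfo` and
`check_duplicate()` are hypothetical names used only for illustration):

```python
from typing import NamedTuple, Optional


class FileInfo(NamedTuple):
    file_name: str
    size: int
    md5sum: str


def check_duplicate(new: FileInfo, existing: Optional[FileInfo]) -> str:
    """Return 'new', 'ignore' or 'reject' for an uploaded file."""
    if existing is None:
        return "new"      # nothing with this name yet, so accept it
    if (new.size, new.md5sum) == (existing.size, existing.md5sum):
        return "ignore"   # identical re-upload, e.g. adding to a new suite
    return "reject"       # same name, different content
```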

[3] I think it's an advantage if we don't have to care too much about
what files make up a source package. Just make sure that all files exist, and 
that two files with the same name but different content don't exist.

The following data will have to be used to keep track of packages, suites, 
arches and components (the rest will just sit in the .package and .source 
files, to be concatenated together when DebPool has figured out which packages 
are in a certain suite, component and arch):

Binaries: package_name, version, arch, source_package, source_version, 
component, pool_dir[4], file_name[5]. Key: (package_name, version, arch)

Sources: source_name, version, component, pool_dir[4], file_name[5], 
other_files. Key: (source_name, version)

Files[6]: file_name, size, md5sum. Key: file_name

Bin_assoc/Src_assoc: package, version, suite
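
To show how these records would be used to work out which packages belong in a
given Packages file, here is a rough sketch of the Binaries table and the
Bin_assoc lookup. The field names come from the list above; the in-memory
representation (a dict keyed by the tuple, a set of association tuples), the
function name `binaries_in()`, and the decision to include arch "all" packages
in every architecture are assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Binary:
    package_name: str
    version: str
    arch: str
    source_package: str
    source_version: str
    component: str
    pool_dir: str
    file_name: str

    @property
    def key(self):
        # Key: (package_name, version, arch)
        return (self.package_name, self.version, self.arch)


def binaries_in(binaries, bin_assoc, suite, component, arch):
    """Select the binaries for one suite/component/arch combination.

    binaries:  dict mapping Binary.key -> Binary
    bin_assoc: set of (package, version, suite) tuples
    """
    return sorted(
        (
            b
            for b in binaries.values()
            if (b.package_name, b.version, suite) in bin_assoc
            and b.component == component
            and b.arch in (arch, "all")  # assumption: arch-indep goes everywhere
        ),
        key=lambda b: b.key,
    )
```

The Sources table and Src_assoc would follow the same pattern with the key
(source_name, version).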

[4] By storing pool_dir, a pool layout change can be effected gradually, but 
this is probably not an important feature.

[5] The file name can be deduced from the other fields, except for the type 
(deb or udeb). As of right now, dpkg-genchanges can't handle deb and udeb 
variants of the same package (the udeb variant has to have a different name), 
and dak's database schema doesn't allow it either, but we might want to 
support it anyway, in which case we'd either have to add `type' to the key, 
or store a list of types in the same record.
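
As a small illustration of [5]: everything in the file name except the type
follows from the key fields, so the type is the one thing that has to be
supplied separately. The function name is hypothetical, and dropping the epoch
from the version is an assumption based on the usual file naming convention.

```python
def deduce_file_name(package_name, version, arch, pkg_type="deb"):
    """Build name_version_arch.type from the record fields."""
    # Assumption: the epoch ("1:" in "1:2.0-1") is not part of the file name.
    version_without_epoch = version.split(":", 1)[-1]
    return f"{package_name}_{version_without_epoch}_{arch}.{pkg_type}"
```

With only (package_name, version, arch) as the key, the deb and udeb variants
of one package would map to the same record, which is why `type' would have to
go into the key or be stored as a list.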

[6] And finally, the properties of debs and DSCs can be stored in the 
respective tables (Binaries and Sources), but there would still need to be a 
table for "other source files". Or I can scrap that and simply look on the 
disk (in all possible places).

-- 
Magnus Holmgren        holmgren at lysator.liu.se
                       (No Cc of list mail needed, thanks)