Thoughts about the future of PET

Martín Ferrari martin.ferrari at gmail.com
Fri Sep 5 01:51:27 UTC 2008


To put these in writing before my mind forgets something. :)

Regarding buildstat:

As I said before, I think that right now the most interesting and easy thing
to do is to somehow access the build status and lintian reports, so
those can be shown alongside the rest of the current information in pet.
Another thing to think about is the git backend, but I foresee that this
won't be easy.

If we want to merge the two tools, we need to discuss what to keep
from each. I know Gonéri is interested in keeping the database schema,
but if we're heading towards udd or something similar in the future,
converting all the fuzzy and amorphous data structures of pet into
buildstat schemas would be a lot of effort that would just be discarded
in the near future.

Regarding udd, ddpo and pet itself:

pet is completely VCS-centric; that's why it can run anywhere: it's
not tied to alioth or any other Debian machine. At the same time, that
impedes merging it with the other two.

In moving to something that's repository-wide, you have the problem of
reaching personal repositories which aren't hosted on alioth, and may
never be. A way of triggering repository checks from outside should be
taken into account, or else we should keep a way of using it outside
alioth, without udd dependencies.

There's also the problem of mining data for packages that have no
known repository. Maybe it could use the source package from unstable
as another pseudo-VCS backend. In my mind, the move away from svn-only
would be to have a VCS-independent API that works more or less like a
VFS; that could integrate source package reading transparently.
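
To make that more concrete, here is a rough sketch of what such a
VFS-like backend API could look like (Python, and every class and
function name here is invented on the spot, not existing code):

    # Sketch of a VCS-independent, VFS-like backend interface.
    # All names are hypothetical, just to illustrate the idea.
    from abc import ABC, abstractmethod


    class RepoBackend(ABC):
        """Read-only view of a package's source, whatever stores it."""

        @abstractmethod
        def list_files(self, package):
            """Return the paths available for this package."""

        @abstractmethod
        def read_file(self, package, path):
            """Return the contents of one file as bytes."""


    class SvnBackend(RepoBackend):
        # today's svn-only code would move behind this interface
        ...


    class UnstableSourceBackend(RepoBackend):
        # pseudo-VCS backend that unpacks the source package from unstable
        ...


    def changelog(backend, package):
        # callers never know (or care) which backend they are talking to
        return backend.read_file(package, "debian/changelog")
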

Other problems: you cannot reasonably process 10000 source packages in
one run, and different repositories can overlap while a package is being
adopted (in pkg-perl there are about 100 packages which have been adopted
but not yet uploaded with the new maintainer set).

In any case, the flow would have to be completely rethought. I guess
that breaking the code into many small "providers" would be the best
option; the big question mark is how to reach the many repositories
out there.
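
Something along these lines is what I have in mind for the providers,
purely as a sketch; none of these names exist in the current code, and
the example data is made up:

    # Sketch of the "many small providers" idea: each provider knows how
    # to compute one small piece of information about a package, and a
    # driver merges the results. Names and values are illustrative only.

    PROVIDERS = []


    def provider(fn):
        """Register a small, independent data provider."""
        PROVIDERS.append(fn)
        return fn


    @provider
    def watch_file_status(package):
        # e.g. compare the upstream version from debian/watch
        return {"upstream_status": "up to date"}


    @provider
    def bts_summary(package):
        # e.g. count of open bugs fetched from the BTS
        return {"open_bugs": 0}


    def collect(package):
        """Merge whatever each provider knows about a package."""
        info = {}
        for fn in PROVIDERS:
            info.update(fn(package))
        return info


    print(collect("libfoo-bar-perl"))
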

A problem that I see with udd is that it could turn out to be too rigid;
one good thing about the horrible data structures that pet uses is that
adding information is trivial. How do you think UDD will evolve?
Also, the data there still needs a lot of processing, so some
intermediate storage for pre-computed data would be needed. Remember
that PET shows data in real time, which is very different from most
other services in Debian that prepare reports in batch.
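
As a sketch of what I mean by intermediate storage (table, column and
function names are made up, and sqlite plus a free-form JSON blob are
just stand-ins for whatever we would actually use):

    # Batch jobs write pre-computed results here; the web frontend only
    # does cheap lookups, so it can stay real-time.
    import json
    import sqlite3

    db = sqlite3.connect("pet-cache.db")
    db.execute(
        """CREATE TABLE IF NOT EXISTS package_status (
               package  TEXT PRIMARY KEY,
               data     TEXT NOT NULL,  -- free-form JSON, easy to extend
               computed TIMESTAMP DEFAULT CURRENT_TIMESTAMP
           )"""
    )


    def store(package, info):
        """Batch side: write the pre-computed blob for one package."""
        db.execute(
            "INSERT OR REPLACE INTO package_status (package, data) "
            "VALUES (?, ?)",
            (package, json.dumps(info)),
        )
        db.commit()


    def lookup(package):
        """Web side: a cheap read, fast enough to serve in real time."""
        row = db.execute(
            "SELECT data FROM package_status WHERE package = ?", (package,)
        ).fetchone()
        return json.loads(row[0]) if row else None
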

Input welcomed :)

-- 
Martín Ferrari


