[newmaint-site] contributors.d.o source for git.debian.org

Martín Ferrari tincho at debian.org
Fri Jan 9 12:44:33 UTC 2015


Hi Zack!

On 09/01/15 08:42, Stefano Zacchiroli wrote:
> Hey Martín, thanks *a lot* for this work. It is incredibly important to
> properly list Git commits as contributions on contributors.d.o, and I've
> constantly felt bad for not having found the time to work on this
> myself. So thanks a bunch for relieving my pain :)

Glad it is useful :-)

> I wonder if other repositories, in addition to those explicitly
> mentioned below, are excluded from data collection by your new source,
> or if there's maybe a bug here.
> 
> Case in point: the Git repo we use for Debsources,
> i.e. http://anonscm.debian.org/cgit/qa/debsources.git . Based on the
> following observations, it doesn't seem to be considered:
> 
> - my most recent Git contribution seems to date back to October 2014
>   https://contributors.debian.org/contributor/zack%40debian whereas I
>   regularly commit to the above repo
> 
> - neither Matthieu Caneill (matthieuc-guest on alioth) nor my OPW intern
>   Jingjie Jiang (sophiejjj-guest) seems to be listed at
>   https://contributors.debian.org/source/git.debian.org
> 
> So, am I right in assuming that your data source only considers package
> repositories on git.d.o? That is consistent with the name "pkg-commit" which
> I see at the above URL, but it wasn't clear to me from your email.

This is exactly the problem. I am sorry I forgot to mention this. Let me
explain what's going on:

The naive (and very fast) way to get commit info from git is to check
for the owner of the object files. This has a big problem: if somebody
runs a git gc, or uploads on behalf of somebody else, those
contributions are lost.
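
To make the naive approach concrete, here is a minimal sketch, assuming
a repository on local disk. The function name owner_spans is made up
for illustration and is not taken from the actual data source. It only
looks at file ownership and mtime under objects/, which is exactly why
a repack by someone else changes the result.

```python
import os

def owner_spans(git_dir):
    """Map uid -> (earliest mtime, latest mtime) of files under objects/.

    Illustrative only: after a `git gc`, the objects are rewritten into
    pack files owned by whoever ran the gc, so the original owners vanish.
    """
    spans = {}
    for root, _dirs, files in os.walk(os.path.join(git_dir, "objects")):
        for name in files:
            st = os.stat(os.path.join(root, name))
            first, last = spans.get(st.st_uid, (st.st_mtime, st.st_mtime))
            spans[st.st_uid] = (min(first, st.st_mtime),
                                max(last, st.st_mtime))
    return spans
```

The uids would still have to be mapped to account names, which this
sketch leaves out.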

The alternative is to use git log (my preferred way). It is pretty slow
(hence the many hours to process git.d.o), but honours whatever the
committer put as their id.
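
A rough sketch of the git log approach, assuming the committer email
and committer date are the fields of interest (the author fields %ae and
%at would work the same way). The helper names parse_log and
committer_spans are hypothetical, and this is not the code actually
running for git.d.o.

```python
import subprocess

# committer email, a TAB, committer date as a Unix timestamp
LOG_FORMAT = "--format=%ce%x09%ct"

def parse_log(text):
    """Map committer email -> (first, last) commit timestamp."""
    spans = {}
    for line in text.splitlines():
        email, sep, stamp = line.rpartition("\t")
        if not sep or not stamp.isdigit():
            continue  # skip lines that do not match the expected format
        email, when = email.strip().lower(), int(stamp)
        first, last = spans.get(email, (when, when))
        spans[email] = (min(first, when), max(last, when))
    return spans

def committer_spans(repo):
    """Run git log over all refs of one repository and parse the output."""
    out = subprocess.run(
        ["git", "-C", repo, "log", "--all", LOG_FORMAT],
        check=True, capture_output=True,
        encoding="utf-8", errors="replace",
    ).stdout
    return parse_log(out)
```

Walking the full history of every repository this way is what makes the
run slow, but the ids come from the commits themselves and survive a gc.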

This brings two further problems:

* Many of the ids in the commit logs are unusable (svn conversions,
people with misconfigured git clients). This is not a big problem, though.
* With many people importing upstream history into their repositories,
all upstream committers suddenly become Debian contributors!

So, a quick-and-dirty way to solve this conundrum (suggested by Gregoa,
IIRC) is to keep only the commits that modify the debian/ directory.
This worked pretty well for the pkg-perl team, and it is what I used here.
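
One way to express that filter, as a sketch: git log accepts a pathspec
after "--" and then lists only commits that change matching paths. The
function names below are invented for the example, and the output format
is an assumption. The second helper shows the same test applied to a
list of changed paths, in case the filtering is done in Python instead.

```python
def debian_log_command(repo, fmt="%ce%x09%ct"):
    """Build a git log command limited to commits touching debian/."""
    return ["git", "-C", repo, "log", "--all",
            f"--format={fmt}", "--", "debian/"]

def touches_debian(paths):
    """True if any changed path is inside the top-level debian/ directory."""
    return any(p == "debian" or p.startswith("debian/") for p in paths)
```

For example, touches_debian(["debian/control", "Makefile"]) is True,
while a commit that only changes src/debian_utils.py is not counted.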

The obvious drawback is that non-packaging repos get ignored. I have
just checked: no contributions are taken from the qa/debsources git repo.

For the pkg-perl team, what I've done is to create two contribution
types (which are handled separately): one for packaging, and the other
for the website and tools.

> If so, why is that the case? Have you thought about generalizing it to
> *all* git.d.o repositories? My take is that everyone who has committed
> anything to one such repo is contributing something to Debian, and as
> such deserves to see her work acknowledged publicly. YMMV.

I fully agree with your view, but as per the previous explanation, I
don't think I can do it on a global scale, only case-by-case. After all,
the vast majority of the repos in git.d.o are packaging repos, so it
makes sense to special-case the others, IMHO.
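
The special-casing could look roughly like this. It is only a sketch:
the SPECIAL_REPOS table, the "tool-commit" type name and the plan_for
helper are all hypothetical, and "pkg-commit" is simply the name you saw
on the source page.

```python
# Hypothetical table of non-packaging repos, keyed by path on git.d.o.
SPECIAL_REPOS = {
    "qa/debsources.git": "tool-commit",  # assumed contribution type name
}

def plan_for(repo_path):
    """Return (contribution type, pathspec or None) for one repository.

    Special-cased repos count every commit; all the others are treated
    as packaging repos and only commits touching debian/ are counted.
    """
    kind = SPECIAL_REPOS.get(repo_path)
    if kind is not None:
        return kind, None
    return "pkg-commit", "debian/"
```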


> QA repos are not listed here, but do not seem to be currently
> considered by your data source either. Do you think they should be
> tracked separately?
> 
> Personally, I think it would be much better to have a single git.d.o
> data source, so that we can deploy/update it once, without having
> several instances of it to separately maintain.

I did not know about the content of the QA repos, but now that you
pointed it out to me, I am sure that the way to go is to handle them
separately. If you give me some guidelines, I can implement this easily
as a new data source.

About the single data source... I agree it has appeal, but at this point
it seems it would be a bit messy to implement. Maybe once we have good
coverage of most contributions we can try to tidy up and unify?

Thanks for the bug report!

-- 
Martín Ferrari (Tincho)


