[Teammetrics-discuss] Updating commitstats

Sukhbir Singh sukhbir.in at gmail.com
Tue Apr 23 04:45:55 UTC 2013


Hi Andreas,

Andreas Tille wrote:
> status on several places is (obviously) the wrong approach.  If we build
> a database on two different hosts (say for testing purpose or as we do
> now an initial import) this will break import data.  Simply assume I
> would create a test database instance at home and will run a daily
> import.  The consequence would be that the status on vasks.d.o will be
> updated daily and if our production import runs at the beginning of a
> month it will see no new commits.  That's broken design.

I think I couldn't get my message across clearly. Maybe I did, but let
me reiterate it:

The SVN data is stored in the database *only* (on blends) and the state
is saved on vasks. We did this because we were taking the number of
lines changed into account for SVN commits, and computing that is an
expensive task for repositories that have thousands of commits and
thousands of lines changed per commit.

So when we run svnstat, instead of fetching the state from blends, we
just read it locally from vasks. If a commit is already recorded there,
we skip it without parsing it (and therefore don't send it back).
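
To make the skip logic concrete, here is a minimal sketch. It is not the
actual svnstat code: the JSON state file and the function names
(load_seen, save_seen, new_commits) are assumptions made up for
illustration.

```python
import json
import os
import tempfile


def load_seen(path):
    """Return the set of revisions already parsed (empty if no state file yet)."""
    if not os.path.exists(path):
        return set()
    with open(path) as f:
        return set(json.load(f))


def save_seen(path, seen):
    """Write the state via a temp file so a crash never leaves a half-written file."""
    directory = os.path.dirname(path) or "."
    with tempfile.NamedTemporaryFile("w", dir=directory, delete=False) as tmp:
        json.dump(sorted(seen), tmp)
    os.replace(tmp.name, path)


def new_commits(revisions, seen):
    """Keep only the revisions that still need the expensive line-count parsing."""
    return [rev for rev in revisions if rev not in seen]
```

Only the revisions returned by new_commits() would be parsed and sent to
the database; the rest are skipped because the local state already lists
them.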

> The consequence needs to be that we do the housekeeping *inside* each
> database because the housekeeping and the data belong together and it
> actually needs to be done in a *transaction* to make sure that the
> housekeeping will fit the data status exactly.

I am not sure what you mean, but there is just one database that we
populate and that is on vasks. Also, if a commit has already been parsed
on vasks, it is not sent back, so the consistency remains.

So technically, yes, I can just decide to save the state entirely on
blends, but I am very sure that there was some reason we didn't do this.
In fact, I had to put in extra work just to handle this special case of
saving the state on vasks for SVN commits.

> So the very quick hack to cure the situation above would be to also
> store the svn data in /var/cache/teammetrics.

Ok. I can do that.

> The "real" solution would probably be, as I mentioned briefly in my
> previous mail, to store these data inside the database as well, rather
> than in /var/cache/teammetrics.  This would enable us to do clean
> backups of the database.  OK, with some proper backup method we could
> also keep the dir /var/cache/teammetrics - but hmmm, I'm somehow lacking
> the motivation to keep one part of the data in files and the other part
> inside the database.

So do you mean to save the state in the database too? Other than the
backup thing, is there any reason why you would want to do that?

> BTW, in the debian-l10n team some names need to be adjusted.
> There is some Nicolas_F and Nicolas_F? as well as a user fzt.  In
> debian-science there is sebastien-guest and sebastien and barbier-guest.

I will update it.

-- 
Sukhbir


