[Pkg-bazaar-commits] ./bzr-stats/unstable r19: Merge trunk.
Jelmer Vernooij
jelmer at samba.org
Wed Jul 16 17:31:20 UTC 2008
------------------------------------------------------------
revno: 19
committer: Jelmer Vernooij <jelmer at samba.org>
branch nick: debian
timestamp: Wed 2008-07-16 19:31:20 +0200
message:
Merge trunk.
removed:
test_stats.py
added:
classify.py
test_classify.py
modified:
__init__.py
debian/changelog
setup.py
------------------------------------------------------------
revno: 10.1.2
committer: Jelmer Vernooij <jelmer at samba.org>
branch nick: trunk
timestamp: Mon 2007-11-05 05:26:47 +0100
message:
Split out functionality that sorts revids by committer.
modified:
__init__.py
------------------------------------------------------------
revno: 10.2.1
committer: John Arbash Meinel <john at arbash-meinel.com>
branch nick: stats
timestamp: Mon 2007-11-05 20:31:49 -0600
message:
merge in Jelmer's setup.py and split out sorting functionality.
added:
setup.py
modified:
__init__.py
------------------------------------------------------------
revno: 10.2.2
committer: Jelmer Vernooij <jelmer at samba.org>
branch nick: trunk
timestamp: Tue 2007-11-20 19:18:21 +0100
message:
Change name to committer-stats, to allow for other sorts of stats too.
modified:
__init__.py
------------------------------------------------------------
revno: 10.2.3
committer: Jelmer Vernooij <jelmer at samba.org>
branch nick: trunk
timestamp: Sun 2007-12-09 05:21:27 +0100
message:
Provide enough information for setup.py register to work.
modified:
setup.py
------------------------------------------------------------
revno: 10.2.4
committer: Jelmer Vernooij <jelmer at samba.org>
branch nick: trunk
timestamp: Sun 2008-03-09 16:51:12 +0100
message:
Merge upstream.
modified:
__init__.py
------------------------------------------------------------
revno: 10.3.1
committer: John Arbash Meinel <john at arbash-meinel.com>
branch nick: stats
timestamp: Thu 2008-03-06 10:21:59 +0000
message:
Make a lot of imports lazy since they may not actually be used.
modified:
__init__.py
------------------------------------------------------------
revno: 10.3.2
committer: John Arbash Meinel <john at arbash-meinel.com>
branch nick: stats
timestamp: Fri 2008-03-07 16:55:43 +0000
message:
(Wesley J. Landaker) properly import ui before using it.
modified:
__init__.py
------------------------------------------------------------
revno: 10.4.1
committer: Wesley J. Landaker <wjlanda at sandia.gov>
branch nick: stats
timestamp: Fri 2008-03-07 09:16:58 -0700
message:
Added ui to bzrlib lazy imports.
modified:
__init__.py
------------------------------------------------------------
revno: 10.2.5
committer: Jelmer Vernooij <jelmer at samba.org>
branch nick: trunk
timestamp: Sat 2008-06-28 20:54:37 +0200
message:
Add code for classifying commits.
added:
classify.py
test_classify.py
------------------------------------------------------------
revno: 10.2.6
committer: Jelmer Vernooij <jelmer at samba.org>
branch nick: trunk
timestamp: Sat 2008-06-28 21:06:38 +0200
message:
Rename collapse_by_author -> collapse_by_person since author has an unambiguous meaning
modified:
__init__.py
------------------------------------------------------------
revno: 10.2.7
committer: Jelmer Vernooij <jelmer at samba.org>
branch nick: trunk
timestamp: Sat 2008-06-28 21:08:44 +0200
message:
Use get_apparent_author rather than committer.
modified:
__init__.py
------------------------------------------------------------
revno: 10.2.8
committer: Jelmer Vernooij <jelmer at samba.org>
branch nick: trunk
timestamp: Sat 2008-06-28 22:07:45 +0200
message:
Add credits command, test classify code by default, add comments to classify code.
modified:
__init__.py
classify.py
------------------------------------------------------------
revno: 10.2.9
committer: Jelmer Vernooij <jelmer at samba.org>
branch nick: trunk
timestamp: Sat 2008-06-28 22:31:01 +0200
message:
List contributors with more contributions first.
modified:
__init__.py
------------------------------------------------------------
revno: 10.2.10
committer: Jelmer Vernooij <jelmer at samba.org>
branch nick: trunk
timestamp: Sat 2008-06-28 22:34:11 +0200
message:
Add --show-class argument to stats command.
modified:
__init__.py
------------------------------------------------------------
revno: 10.2.11
committer: Lukáš Lalinský <lalinsky at gmail.com>
branch nick: stats
timestamp: Fri 2008-07-04 14:10:24 +0200
message:
Some stats fixes:
- Don't use full name as email when there is no email
- Use name/email parsing function from bzrlib.config
- Always use rev.get_apparent_author()
modified:
__init__.py
------------------------------------------------------------
revno: 10.2.12
committer: Jelmer Vernooij <jelmer at samba.org>
branch nick: trunk
timestamp: Fri 2008-07-04 14:24:39 +0200
message:
Remove now-obsolete tests.
removed:
test_stats.py
modified:
__init__.py
------------------------------------------------------------
revno: 10.2.13
committer: Jelmer Vernooij <jelmer at samba.org>
branch nick: trunk
timestamp: Fri 2008-07-04 14:43:31 +0200
message:
Add another progress bar.
modified:
__init__.py
-------------- next part --------------
=== modified file '__init__.py'
--- a/__init__.py 2007-07-17 16:17:40 +0000
+++ b/__init__.py 2008-07-04 12:43:31 +0000
@@ -2,28 +2,21 @@
import re
-from bzrlib import errors, tsort
-from bzrlib.branch import Branch
-import bzrlib.commands
-from bzrlib.config import extract_email_address
-from bzrlib.workingtree import WorkingTree
-
-
-_fullname_re = re.compile(r'(?P<fullname>.*?)\s*<')
-
-def extract_fullname(committer):
- """Try to get the user's name from their committer info."""
- m = _fullname_re.match(committer)
- if m:
- return m.group('fullname')
- try:
- email = extract_email_address(committer)
- except errors.BzrError:
- return committer
- else:
- # We found an email address, but not a fullname
- # so there is no fullname
- return ''
+from bzrlib.lazy_import import lazy_import
+lazy_import(globals(), """
+from bzrlib import (
+ branch,
+ commands,
+ config,
+ errors,
+ option,
+ tsort,
+ ui,
+ workingtree,
+ )
+from bzrlib.plugins.stats.classify import classify_delta
+from itertools import izip
+""")
def find_fullnames(lst):
@@ -31,14 +24,14 @@
counts = {}
for committer in lst:
- fullname = extract_fullname(committer)
+ fullname = config.parse_username(committer)[0]
counts.setdefault(fullname, 0)
counts[fullname] += 1
return sorted(((count, name) for name,count in counts.iteritems()), reverse=True)
-def collapse_by_author(committers):
- """The committers list is sorted by email, fix it up by author.
+def collapse_by_person(committers):
+ """The committers list is sorted by email, fix it up by person.
Some people commit with a similar username, but different email
address. Which makes it hard to sort out when they have multiple
@@ -56,7 +49,7 @@
counter_to_info = {}
counter = 0
for email, revs in committers.iteritems():
- fullnames = find_fullnames(rev.committer for rev in revs)
+ fullnames = find_fullnames(rev.get_apparent_author() for rev in revs)
match = None
for count, fullname in fullnames:
if fullname and fullname in name_to_counter:
@@ -85,30 +78,36 @@
for revs, email, fname in counter_to_info.values()), reverse=True)
+def sort_by_committer(a_repo, revids):
+ committers = {}
+ pb = ui.ui_factory.nested_progress_bar()
+ try:
+ pb.note('getting revisions')
+ revisions = a_repo.get_revisions(revids)
+ for count, rev in enumerate(revisions):
+ pb.update('checking', count, len(revids))
+ email = config.parse_username(rev.get_apparent_author())[1]
+ committers.setdefault(email, []).append(rev)
+ finally:
+ pb.finished()
+
+ return committers
+
+
def get_info(a_repo, revision):
"""Get all of the information for a particular revision"""
- pb = bzrlib.ui.ui_factory.nested_progress_bar()
- committers = {}
+ pb = ui.ui_factory.nested_progress_bar()
a_repo.lock_read()
try:
pb.note('getting ancestry')
ancestry = a_repo.get_ancestry(revision)[1:]
- pb.note('getting revisions')
- revisions = a_repo.get_revisions(ancestry)
- for count, rev in enumerate(revisions):
- pb.update('checking', count, len(ancestry))
- try:
- email = extract_email_address(rev.committer)
- except errors.BzrError:
- email = rev.committer
- committers.setdefault(email, []).append(rev)
+ committers = sort_by_committer(a_repo, ancestry)
finally:
a_repo.unlock()
pb.finished()
- info = collapse_by_author(committers)
- return info
+ return collapse_by_person(committers)
def get_diff_info(a_repo, start_rev, end_rev):
@@ -116,7 +115,7 @@
This lets us figure out what has actually changed between 2 revisions.
"""
- pb = bzrlib.ui.ui_factory.nested_progress_bar()
+ pb = ui.ui_factory.nested_progress_bar()
committers = {}
a_repo.lock_read()
try:
@@ -131,18 +130,19 @@
for count, rev in enumerate(revisions):
pb.update('checking', count, len(ancestry))
try:
- email = extract_email_address(rev.committer)
+ email = config.extract_email_address(rev.get_apparent_author())
except errors.BzrError:
- email = rev.committer
+ email = rev.get_apparent_author()
committers.setdefault(email, []).append(rev)
finally:
a_repo.unlock()
pb.finished()
- info = collapse_by_author(committers)
+ info = collapse_by_person(committers)
return info
-def display_info(info, to_file):
+
+def display_info(info, to_file, gather_class_stats=None):
"""Write out the information"""
for count, revs, emails, fullnames in info:
@@ -172,23 +172,29 @@
to_file.write("''\n")
else:
to_file.write("%s\n" % (email,))
-
-
-class cmd_statistics(bzrlib.commands.Command):
+ if gather_class_stats is not None:
+ print ' Contributions:'
+ classes, total = gather_class_stats(revs)
+ for name,count in sorted(classes.items(), lambda x,y: cmp((x[1], x[0]), (y[1], y[0]))):
+ to_file.write(" %4.0f%% %s\n" % ((float(count) / total) * 100.0, "Unknown" if name is None else name))
+
+
+class cmd_committer_statistics(commands.Command):
"""Generate statistics for LOCATION."""
- aliases = ['stats']
+ aliases = ['stats', 'committer-stats']
takes_args = ['location?']
- takes_options = ['revision']
+ takes_options = ['revision',
+ option.Option('show-class', help="Show the class of contributions")]
encoding_type = 'replace'
- def run(self, location='.', revision=None):
+ def run(self, location='.', revision=None, show_class=False):
alternate_rev = None
try:
- wt = WorkingTree.open_containing(location)[0]
+ wt = workingtree.WorkingTree.open_containing(location)[0]
except errors.NoWorkingTree:
- a_branch = Branch.open(location)
+ a_branch = branch.Branch.open(location)
last_rev = a_branch.last_revision()
else:
a_branch = wt.branch
@@ -208,13 +214,15 @@
info = get_info(a_branch.repository, last_rev)
finally:
a_branch.unlock()
- display_info(info, self.outf)
-
-
-bzrlib.commands.register_command(cmd_statistics)
-
-
-class cmd_ancestor_growth(bzrlib.commands.Command):
+ def fetch_class_stats(revs):
+ return gather_class_stats(a_branch.repository, revs)
+ display_info(info, self.outf, fetch_class_stats if show_class else None)
+
+
+commands.register_command(cmd_committer_statistics)
+
+
+class cmd_ancestor_growth(commands.Command):
"""Figure out the ancestor graph for LOCATION"""
takes_args = ['location?']
@@ -223,9 +231,9 @@
def run(self, location='.'):
try:
- wt = WorkingTree.open_containing(location)[0]
+ wt = workingtree.WorkingTree.open_containing(location)[0]
except errors.NoWorkingTree:
- a_branch = Branch.open(location)
+ a_branch = branch.Branch.open(location)
last_rev = a_branch.last_revision()
else:
a_branch = wt.branch
@@ -247,16 +255,122 @@
self.outf.write('%4d, %4d\n' % (revno, cur_parents))
-bzrlib.commands.register_command(cmd_ancestor_growth)
+commands.register_command(cmd_ancestor_growth)
+
+
+def gather_class_stats(repository, revs):
+ ret = {}
+ total = 0
+ pb = ui.ui_factory.nested_progress_bar()
+ try:
+ repository.lock_read()
+ try:
+ i = 0
+ for delta in repository.get_deltas_for_revisions(revs):
+ pb.update("classifying commits", i, len(revs))
+ for c in classify_delta(delta):
+ if not c in ret:
+ ret[c] = 0
+ ret[c] += 1
+ total += 1
+ i += 1
+ finally:
+ repository.unlock()
+ finally:
+ pb.finished()
+ return ret, total
+
+
+def display_credits(credits):
+ (coders, documenters, artists, translators) = credits
+ def print_section(name, lst):
+ if len(lst) == 0:
+ return
+ print "%s:" % name
+ for name in lst:
+ print "%s" % name
+ print ""
+ print_section("Code", coders)
+ print_section("Documentation", documenters)
+ print_section("Art", artists)
+ print_section("Translations", translators)
+
+
+def find_credits(repository, revid):
+ """Find the credits of the contributors to a revision.
+
+ :return: tuple with (authors, documenters, artists, translators)
+ """
+ ret = {"documentation": {},
+ "code": {},
+ "art": {},
+ "translation": {},
+ None: {}
+ }
+ repository.lock_read()
+ try:
+ ancestry = filter(lambda x: x is not None, repository.get_ancestry(revid))
+ revs = repository.get_revisions(ancestry)
+ pb = ui.ui_factory.nested_progress_bar()
+ try:
+ for i, (rev,delta) in enumerate(izip(revs, repository.get_deltas_for_revisions(revs))):
+ pb.update("analysing revisions", i, len(revs))
+ # Don't count merges
+ if len(rev.parent_ids) > 1:
+ continue
+ for c in set(classify_delta(delta)):
+ author = rev.get_apparent_author()
+ if not author in ret[c]:
+ ret[c][author] = 0
+ ret[c][author] += 1
+ finally:
+ pb.finished()
+ finally:
+ repository.unlock()
+ def sort_class(name):
+ return map(lambda (x,y): x,
+ sorted(ret[name].items(), lambda x,y: cmp((x[1], x[0]), (y[1], y[0])), reverse=True))
+ return (sort_class("code"), sort_class("documentation"), sort_class("art"), sort_class("translation"))
+
+
+class cmd_credits(commands.Command):
+ """Determine credits for LOCATION."""
+
+ takes_args = ['location?']
+ takes_options = ['revision']
+
+ encoding_type = 'replace'
+
+ def run(self, location='.', revision=None):
+ try:
+ wt = workingtree.WorkingTree.open_containing(location)[0]
+ except errors.NoWorkingTree:
+ a_branch = branch.Branch.open(location)
+ last_rev = a_branch.last_revision()
+ else:
+ a_branch = wt.branch
+ last_rev = wt.last_revision()
+
+ if revision is not None:
+ last_rev = revision[0].in_history(a_branch).rev_id
+
+ a_branch.lock_read()
+ try:
+ credits = find_credits(a_branch.repository, last_rev)
+ display_credits(credits)
+ finally:
+ a_branch.unlock()
+
+
+commands.register_command(cmd_credits)
def test_suite():
from unittest import TestSuite
from bzrlib.tests import TestLoader
- import test_stats
suite = TestSuite()
loader = TestLoader()
- testmod_names = ['test_stats']
+ testmod_names = [ 'test_classify']
suite.addTest(loader.loadTestsFromModuleNames(['%s.%s' % (__name__, i) for i in testmod_names]))
return suite
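
The grouping done by sort_by_committer and the "more contributions first" ordering can be tried outside bzrlib. What follows is a minimal Python 3 sketch, not the plugin's code. It assumes plain author strings in place of revision objects, and `parse_username` here is a hypothetical regex-based stand-in for `bzrlib.config.parse_username`, whose handling of edge cases may differ. `group_by_email` and `rank` are illustrative names as well.

```python
import re
from collections import defaultdict

# Hypothetical stand-in for bzrlib.config.parse_username; the real
# function may treat edge cases differently.
_USER_RE = re.compile(r'^(?P<name>.*?)\s*<(?P<email>[^>]*)>\s*$')


def parse_username(username):
    """Split 'Full Name <email>' into a (name, email) pair."""
    match = _USER_RE.match(username)
    if match:
        return match.group('name'), match.group('email')
    if '@' in username:
        return '', username      # bare address, no name
    return username, ''          # bare name, no address


def group_by_email(authors):
    """Map each email address to the author strings that used it."""
    groups = defaultdict(list)
    for author in authors:
        groups[parse_username(author)[1]].append(author)
    return dict(groups)


def rank(groups):
    """Return (count, email) pairs, largest count first."""
    return sorted(((len(v), k) for k, v in groups.items()), reverse=True)


authors = [
    "John Doe <joe@example.com>",
    "J. Doe <joe@example.com>",
    "Jane Roe <jane@example.com>",
]
groups = group_by_email(authors)
ranking = rank(groups)
```

In this sketch an author string without an address is filed under the empty string, so the full name is never used as an email key, in line with the 10.2.11 commit message.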
=== added file 'classify.py'
--- a/classify.py 1970-01-01 00:00:00 +0000
+++ b/classify.py 2008-06-28 20:07:45 +0000
@@ -0,0 +1,48 @@
+"""Classify a commit based on the types of files it changed."""
+
+from bzrlib import urlutils
+from bzrlib.trace import mutter
+
+
+def classify_filename(name):
+ """Classify a file based on its name.
+
+ :param name: File path.
+ :return: One of code, documentation, translation or art.
+ None if determining the file type failed.
+ """
+ # FIXME: Use mime types? Ohcount?
+ basename = urlutils.basename(name)
+ try:
+ extension = basename.split(".")[1]
+ if extension in ("c", "h", "py", "cpp", "rb", "ac"):
+ return "code"
+ if extension in ("html", "xml", "txt", "rst", "TODO"):
+ return "documentation"
+ if extension in ("po"):
+ return "translation"
+ if extension in ("svg", "png", "jpg"):
+ return "art"
+ except IndexError:
+ if basename in ("README", "NEWS", "TODO",
+ "AUTHORS", "COPYING"):
+ return "documentation"
+ if basename in ("Makefile"):
+ return "code"
+
+ mutter("don't know how to classify %s", name)
+ return None
+
+
+def classify_delta(delta):
+ """Determine what sort of changes a delta contains.
+
+ :param delta: A TreeDelta to inspect
+ :return: List with classes found (see classify_filename)
+ """
+ # TODO: This is inaccurate, since it doesn't look at the
+ # number of lines changed in a file.
+ types = []
+ for d in delta.added + delta.modified:
+ types.append(classify_filename(d[0]))
+ return types
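
Two details of classify_filename as committed are worth noting. `("po")` and `("Makefile")` are plain strings, not one-element tuples, so `in` performs a substring test. `split(".")[1]` picks the second dot-separated component, not the last. The following is one possible variant, written as a Python 3 sketch and not part of this commit. It assumes `posixpath.basename` and `posixpath.splitext` are acceptable replacements for `urlutils.basename` plus `split`, and the lookup tables `_BY_EXTENSION` and `_BY_BASENAME` are hypothetical names.

```python
import posixpath

# Extension -> class table, built from the same lists as the commit.
_BY_EXTENSION = {}
for _cls, _exts in (
    ("code", ("c", "h", "py", "cpp", "rb", "ac")),
    ("documentation", ("html", "xml", "txt", "rst", "TODO")),
    ("translation", ("po",)),   # trailing comma makes this a tuple
    ("art", ("svg", "png", "jpg")),
):
    for _ext in _exts:
        _BY_EXTENSION[_ext] = _cls

# Well-known file names that carry no extension.
_BY_BASENAME = {
    name: "documentation"
    for name in ("README", "NEWS", "TODO", "AUTHORS", "COPYING")
}
_BY_BASENAME["Makefile"] = "code"


def classify_filename(name):
    """Return 'code', 'documentation', 'translation', 'art' or None."""
    basename = posixpath.basename(name)
    # splitext keeps only the last suffix, e.g. '.py' for 'a.test.py'.
    extension = posixpath.splitext(basename)[1][1:]
    if extension:
        return _BY_EXTENSION.get(extension)
    return _BY_BASENAME.get(basename)
```

With this variant `classify_filename("notes.p")` returns None, whereas a substring test against `"po"` would accept the extension `p`. Unlike the committed code, the sketch does not log unclassified names.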
=== modified file 'debian/changelog'
--- a/debian/changelog 2008-07-03 14:42:20 +0000
+++ b/debian/changelog 2008-07-16 17:31:20 +0000
@@ -1,4 +1,4 @@
-bzr-stats (0.0.1~bzr20-1) unstable; urgency=low
+bzr-stats (0.0.1~bzr23-1) unstable; urgency=low
* Initial release. (Closes: #XXXXXX)
=== modified file 'setup.py'
--- a/setup.py 2007-10-26 02:33:18 +0000
+++ b/setup.py 2007-12-09 04:21:27 +0000
@@ -8,6 +8,8 @@
version='0.0.1',
license='GPL',
author='John Arbash Meinel',
+ author_email="john at arbash-meinel.com",
+ url="http://launchpad.net/bzr-stats",
long_description="""
Simple statistics plugin for Bazaar.
""",
=== added file 'test_classify.py'
--- a/test_classify.py 1970-01-01 00:00:00 +0000
+++ b/test_classify.py 2008-06-28 18:54:37 +0000
@@ -0,0 +1,22 @@
+from bzrlib.tests import TestCase
+from bzrlib.plugins.stats.classify import classify_filename, classify_delta
+
+
+class TestClassify(TestCase):
+ def test_classify_code(self):
+ self.assertEquals("code", classify_filename("foo/bar.c"))
+
+ def test_classify_documentation(self):
+ self.assertEquals("documentation", classify_filename("bla.html"))
+
+ def test_classify_translation(self):
+ self.assertEquals("translation", classify_filename("nl.po"))
+
+ def test_classify_art(self):
+ self.assertEquals("art", classify_filename("icon.png"))
+
+ def test_classify_unknown(self):
+ self.assertEquals(None, classify_filename("something.bar"))
+
+ def test_classify_doc_hardcoded(self):
+ self.assertEquals("documentation", classify_filename("README"))
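
The tests above cover classify_filename only. classify_delta reads just the `added` and `modified` lists of a delta and the first element of each entry, so it can be exercised without a repository. The Python 3 sketch below makes several assumptions: `FakeDelta` is a hypothetical stand-in for a TreeDelta, the tuple fields after the path are illustrative filler, and `toy_classify` is a deliberately tiny classifier passed in as a parameter so the example stays self-contained. The percentage step mirrors the arithmetic in display_info.

```python
from collections import Counter, namedtuple

# Hypothetical stand-in exposing only what classify_delta reads.
FakeDelta = namedtuple("FakeDelta", ["added", "modified"])


def classify_delta(delta, classify):
    """Classify the path of every added or modified entry."""
    return [classify(entry[0]) for entry in delta.added + delta.modified]


def toy_classify(path):
    """Tiny illustrative classifier, not the plugin's."""
    if path.endswith(".py"):
        return "code"
    if path.endswith(".txt"):
        return "documentation"
    return None


delta = FakeDelta(
    added=[("classify.py", "id-1", "file")],
    modified=[
        ("__init__.py", "id-2", "file"),
        ("NOTES.txt", "id-3", "file"),
        ("blob.bin", "id-4", "file"),
    ],
)

classes = classify_delta(delta, toy_classify)
counts = Counter(classes)
total = sum(counts.values())
# Share of each class as a percentage of all classified entries.
shares = {name: 100.0 * n / total for name, n in counts.items()}
```

Unclassified files are counted under the key None, which display_info prints as "Unknown".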
=== removed file 'test_stats.py'
--- a/test_stats.py 2007-07-17 16:17:40 +0000
+++ b/test_stats.py 1970-01-01 00:00:00 +0000
@@ -1,17 +0,0 @@
-from bzrlib.tests import TestCase
-from bzrlib.plugins.stats import extract_fullname
-
-
-class TestFullnameExtractor(TestCase):
- def test_standard(self):
- self.assertEquals("John Doe",
- extract_fullname("John Doe <joe at example.com>"))
-
- def test_only_email(self):
- self.assertEquals("",
- extract_fullname("joe at example.com"))
-
- def test_only_fullname(self):
- self.assertEquals("John Doe",
- extract_fullname("John Doe"))
-