Should licensecheck list skipped files? (was: Re: Bug#806424: scan-copyrights: Failed to notice copyrights in lisp files)

Dominique Dumont dod at debian.org
Mon Nov 30 19:50:39 UTC 2015


On Friday 27 November 2015 10:52:47 Wookey wrote:
> This correctly catalogued the copyrights in the python file, but none of the
> lisp files

Indeed. The .lisp extension is missing from the regexp used by licensecheck to 
decide whether to scan a file or not. That's easy to fix.
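
To illustrate why a missing extension makes files vanish from the output, here is a minimal Python sketch of an extension-based filter. The patterns NARROW and WIDER and the helper should_scan are hypothetical and made up for this example; they are not licensecheck's actual default regexp (licensecheck itself is written in Perl).

```python
import re

# Hypothetical patterns, for illustration only. They are NOT the
# default regexp shipped with licensecheck.
NARROW = re.compile(r"\.(c|h|py|pl|sh)$")
WIDER = re.compile(r"\.(c|h|py|pl|sh|lisp)$")


def should_scan(path, pattern):
    """Return True when the file name matches the scan pattern."""
    return pattern.search(path) is not None


if __name__ == "__main__":
    for name in ("setup.py", "s-xml-rpc/test/test-base64.lisp"):
        print(name, should_scan(name, NARROW), should_scan(name, WIDER))
```

With the narrower pattern the .lisp file is silently dropped before any copyright detection runs, which matches the behaviour reported above.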

> Now if I use licensecheck manually with a changed -c regex:
> licensecheck -r -c=* --copyright *
> It notices the copyrights in the lisp files:
> s-xml-rpc/test/test-base64.lisp: UNKNOWN
>   [Copyright: 2002, 2004 Sven Van Caekenberghe, Beta Nine BVBA]
> 
> But it still fails to grok the licence.

I can tweak licensecheck to recognize the LLGPL.

> Can scan-copyrights call licensecheck in such a way that it looks
> in more (all?) files by default?

Scanning all files by default is a can of worms: some files are binary (png, 
jpg...) and will lead to a lot of garbage in the output of licensecheck. This 
can't be the default. And I'm reluctant to change scan-copyrights to use 
-c '.*' because of the extra processing required to weed the garbage out.
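
For reference, one common way to weed out binary files before scanning is to look for a NUL byte in the first few kilobytes. The Python sketch below shows that heuristic only; it is an assumption for illustration, not something licensecheck or scan-copyrights is known to do, and the names looks_binary and file_looks_binary are hypothetical.

```python
def looks_binary(data: bytes, sample_size: int = 8192) -> bool:
    """Heuristic: data is treated as binary if its leading bytes hold a NUL."""
    return b"\x00" in data[:sample_size]


def file_looks_binary(path: str, sample_size: int = 8192) -> bool:
    """Apply the same heuristic to the first sample_size bytes of a file."""
    with open(path, "rb") as handle:
        return looks_binary(handle.read(sample_size), sample_size)
```

The heuristic is cheap but imperfect: UTF-16 text contains NUL bytes and would be misclassified, so the result is a hint rather than a guarantee.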

> Or perhaps this bugreport should be
> directed to licensecheck to make the default more comprehensive.

Yes. Making licensecheck more comprehensive will benefit more tools (and 
people). I'm thinking of license-reconcile and some other packages like 
ghostscript that have a fairly advanced processing of the output of 
licensecheck.

I will reassign this bug to devscripts.

> Really
> I want a tool like this to look in everything it can (not just code
> and docs: graphics and test files too). Missing things entirely is
> much worse than false positives I can check and weed out. 
> If it can't
> reliably find nearly all the licence and copyright notices in the tree
> then it doesn't really help much as I still have to look in every damn
> file myself, by hand.

Good point.

Maybe licensecheck should list the files that are not scanned (instead of 
returning garbage)?
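
A rough Python sketch of that idea, assuming the decision is still made by matching file names against a pattern: split the candidates into scanned and skipped, and report the skipped ones explicitly so the user knows what still needs a manual look. SCAN_PATTERN, partition, report and the output format are all hypothetical; none of this is existing licensecheck behaviour.

```python
import re

# Hypothetical scan pattern, for illustration only.
SCAN_PATTERN = re.compile(r"\.(c|h|py|pl|sh)$")


def partition(paths, pattern=SCAN_PATTERN):
    """Split paths into (scanned, skipped), preserving input order."""
    scanned, skipped = [], []
    for path in paths:
        (scanned if pattern.search(path) else skipped).append(path)
    return scanned, skipped


def report(paths, pattern=SCAN_PATTERN):
    """Build one report line per file, marking the skipped ones."""
    scanned, skipped = partition(paths, pattern)
    lines = [f"{path}: (would be scanned)" for path in scanned]
    lines += [f"{path}: SKIPPED (not scanned)" for path in skipped]
    return lines


if __name__ == "__main__":
    print("\n".join(report(["a.py", "b.lisp", "logo.png"])))
```

Listing skipped files this way keeps the default output free of binary garbage while making the gaps visible.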

Thoughts?

All the best






More information about the devscripts-devel mailing list