licensecheck: new BSD detection algorithm

Dmitry Smirnov onlyjob at member.fsf.org
Fri Oct 12 04:36:20 UTC 2012


Hello team,

As you may know, licensecheck has problems detecting BSD licenses.

Basically, it uses regexes to search for known clauses.
This is wrong: imagine a situation where someone adds the following

	"Software must not be sold without prior written permission."

to the classic BSD-2-clause license. 
Licensecheck will then find the 2 unmodified clauses and incorrectly report
the license as BSD-2-clause (using "licensecheck" notation).

I think I have an elegant solution to this: capture everything between the
first paragraph and the disclaimer into a variable, then remove the known
clauses. If anything is left, report the license as "BSD-N-clause (modified)".
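
The idea above can be sketched roughly as follows. This is an illustration
only, not the code from the attached patch: it is written in Python rather
than licensecheck's Perl, the clause patterns and the classify_bsd name are
hypothetical, and it assumes comment decoration (leading " * ", "#" and so on)
has already been stripped from the text.

```python
import re

# Hypothetical patterns for the three classic BSD clauses (assumption:
# the real licensecheck patterns differ).
KNOWN_CLAUSES = [
    r"redistributions? of source code must retain the above copyright notice, "
    r"this list of conditions and the following disclaimer\.?",
    r"redistributions? in binary form must reproduce the above copyright notice, "
    r"this list of conditions and the following disclaimer in the documentation "
    r"and/or other materials provided with the distribution\.?",
    r"neither the name of .{1,200}? may be used to endorse or promote products "
    r"derived from this software without specific prior written permission\.?",
]

# End of the first paragraph and start of the disclaimer (assumed wording).
HEAD = (r"redistribution and use in source and binary forms, with or without "
        r"modification, are permitted provided that the following conditions are met:")
DISCLAIMER = r"this software is provided by"


def classify_bsd(text):
    """Return 'BSD-N-clause', 'BSD-N-clause (modified)', or None if not recognised."""
    # Collapse whitespace and lowercase so line wrapping does not matter.
    flat = re.sub(r"\s+", " ", text).lower()
    m = re.search(HEAD + r"(.*?)" + DISCLAIMER, flat)
    if not m:
        return None
    body = m.group(1)  # everything between first paragraph and disclaimer
    count = 0
    for pattern in KNOWN_CLAUSES:
        # Remove each known clause at most once and count how many were found.
        body, n = re.subn(pattern, " ", body, count=1)
        count += n
    if count == 0:
        return None
    # List markers such as "1.", "*" or "a)" are ignored; any remaining word
    # of two or more letters is treated as unknown text.
    leftover = re.findall(r"[a-z]{2,}", body)
    name = "BSD-%d-clause" % count
    return name + " (modified)" if leftover else name
```

With this sketch, a plain two-clause text should come back as "BSD-2-clause",
while the same text with the "must not be sold" sentence added should come
back as "BSD-2-clause (modified)".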

The major improvement is that a detected "BSD-N-clause" is guaranteed to be
an unmodified license.

The proposed implementation is in the attached patch, which is meant to be
applied to "jessie" on top of my previous patches plus the attached
0001[..].patch, which adds another case to GPL detection.
(I didn't try applying the new BSD patch to the current state of "jessie".)

You can also try the attached licensecheck, where this is already implemented.
(It is worth trying with the "--tests" argument. There will be some noise from
unrelated failing test cases.)

I put the new BSD detection algorithm before the old ones, so if the new code
recognises the license it removes its text from further processing; otherwise
it falls back gracefully to the old code.

The new detection also introduces a unified loop for license detection, as I
don't quite like the sequence of if-elses that the old code uses.
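
A unified loop with this kind of fallback might look roughly like the sketch
below. It is an assumption-laden illustration in Python, not the Perl code
from the patch: make_detector, DETECTORS, detect_licenses and the patterns
are all hypothetical names and values. Detectors run in order, and a detector
that matches removes its text so that later (older) detectors do not report
the same license again.

```python
import re


def make_detector(name, pattern):
    """Build a detector that returns (name, (start, end)) on a match, else None."""
    rx = re.compile(pattern, re.I | re.S)

    def detect(text):
        m = rx.search(text)
        return (name, m.span()) if m else None

    return detect


# Order matters: the new BSD detector comes first, the old one is the fallback.
# All names and patterns here are hypothetical placeholders.
DETECTORS = [
    make_detector("BSD (new algorithm)",
                  r"redistribution and use in source and binary forms"
                  r".*?this software is provided by"),
    make_detector("BSD (old regex)",
                  r"redistributions? of source code must retain"),
    make_detector("GPL", r"gnu general public license"),
]


def detect_licenses(text, detectors=DETECTORS):
    found = []
    for detect in detectors:
        hit = detect(text)
        if hit is None:
            continue  # not recognised here; later detectors still get a try
        name, (start, end) = hit
        found.append(name)
        # Remove the recognised text from further processing.
        text = text[:start] + " " + text[end:]
    return found
```

Adding a new license then means adding one entry to the list instead of
another branch in an if-else chain.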

Feedback is very welcome. (Benjamin? Adam? James?)

Thank you.

-- 
Regards,
Dmitry.
-------------- next part --------------
A non-text attachment was scrubbed...
Name: licensecheck.pl
Type: application/x-perl
Size: 55997 bytes
Desc: not available
URL: <http://lists.alioth.debian.org/pipermail/devscripts-devel/attachments/20121012/0277629e/attachment-0003.bin>
-------------- next part --------------
A non-text attachment was scrubbed...
Name: 0001-new-case-either-followed-by-one-version-and-full-sto.patch
Type: text/x-patch
Size: 2247 bytes
Desc: not available
URL: <http://lists.alioth.debian.org/pipermail/devscripts-devel/attachments/20121012/0277629e/attachment-0004.bin>
-------------- next part --------------
A non-text attachment was scrubbed...
Name: 0002-new-BSD-detection.patch
Type: text/x-patch
Size: 8232 bytes
Desc: not available
URL: <http://lists.alioth.debian.org/pipermail/devscripts-devel/attachments/20121012/0277629e/attachment-0005.bin>

