[buildd-tools-devel] Bug#589889: schroot: session names being inconsistently restricted

Zach Carter z.carter at f5.com
Thu Jul 22 16:36:13 UTC 2010


On Thursday 22 July 2010 02:04:52 Roger Leigh wrote:
> Agreed on all counts and the patch looks great.  I'll review it in
> more detail when I have time at the weekend and make a new release
> then.

Cool!  thanks.

Just some additional background info.  When I was troubleshooting this issue
I noticed some inconsistent behavior in the boost regex logic.  Some of my
session names were allowed and some were not, and I was banging my head
against the wall trying to figure out what was different.  A friend of mine
suggested it might have to do with how ranges such as "a-z" are handled.
Testing confirmed that hypothesis, at least in my environment.  Apparently
the behavior of those ranges depends on your locale setting, so they are not
reliable.

From http://www.cs.brown.edu/~jwicks/boost/libs/regex/doc/faq.html:

"Q. Why don't character ranges work properly (POSIX mode only)?
A. The POSIX standard specifies that character range expressions are locale 
sensitive - so for example the expression [A-Z] will match any collating 
element that collates between 'A' and 'Z'. That means that for most locales 
other than "C" or "POSIX", [A-Z] would match the single character 't' for 
example, which is not what most people expect - or at least not what most 
people have come to expect from regular expression engines. For this reason, 
the default behaviour of boost.regex (perl mode) is to turn locale sensitive 
collation off by not setting the regex_constants::collate compile time flag. 
However if you set a non-default compile time flag - for example 
regex_constants::extended or regex_constants::basic, then locale dependent 
collation will be enabled, this also applies to the POSIX API functions which 
use either regex_constants::extended or regex_constants::basic internally. 
[Note - when regex_constants::nocollate in effect, the library behaves "as if" 
the LC_COLLATE locale category were always "C", regardless of what its 
actually set to - end note]."

So, it might be advisable to change the regexes used in sbuild-util.cc to use 
the more reliable character classes, such as [:lower:] and [:digit:], 
documented here:

http://www.boost.org/doc/libs/1_43_0/libs/regex/doc/html/boost_regex/syntax/character_classes/std_char_clases.html

Alternatively, the regexes could be compiled with flags that turn the
locale-sensitive collation off, as described in the FAQ entry above.
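
To illustrate the character-class suggestion, here is a minimal sketch.  It
uses std::regex from the C++ standard library as a stand-in for boost::regex,
and both the function name is_valid_name and the pattern are hypothetical;
the actual expressions in sbuild-util.cc may well differ.  The idea is simply
to name the classes instead of writing ranges, and to leave the collate flag
unset so that nothing depends on collation order.

```cpp
#include <regex>
#include <string>

// Hypothetical validator, for illustration only.  The pattern is an
// assumption and is not taken from sbuild-util.cc.
bool is_valid_name(const std::string& name)
{
    // Named classes ([:lower:], [:digit:]) replace ranges such as a-z
    // and 0-9.  The collate flag is deliberately not set, so matching
    // does not depend on the LC_COLLATE ordering.  Class membership
    // still follows the locale the regex was built with, which is the
    // classic "C" locale unless the program changes the global locale.
    static const std::regex pattern("[[:lower:][:digit:]_]+",
                                    std::regex::extended);

    // regex_match requires the whole string to match, so no anchors
    // are needed and an empty name is rejected.
    return std::regex_match(name, pattern);
}
```

With the default locale, is_valid_name("abc123") should accept the name,
while "Abc" and "a b" should be rejected because 'A' and the space are
outside the listed classes.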

-Zach




