[buildd-tools-devel] Bug#589889: Bug#589889: schroot: session names being inconsistently restricted
Zach Carter
z.carter at f5.com
Thu Jul 22 16:36:13 UTC 2010
On Thursday 22 July 2010 02:04:52 Roger Leigh wrote:
> Agreed on all counts and the patch looks great. I'll review it in
> more detail when I have time at the weekend and make a new release
> then.
Cool! Thanks.
Just some additional background info. When I was troubleshooting this issue I
noticed some inconsistent behavior in the boost regex logic. Some of my
session names were allowed and some were not, and I was banging my head
against the wall trying to figure out what was different. A friend of mine
suggested it might have to do with how ranges such as "a-z" are handled.
Testing confirmed that hypothesis, at least in my environment. Apparently,
what those ranges match depends on your locale setting.
From http://www.cs.brown.edu/~jwicks/boost/libs/regex/doc/faq.html:
"Q. Why don't character ranges work properly (POSIX mode only)?
A. The POSIX standard specifies that character range expressions are locale
sensitive - so for example the expression [A-Z] will match any collating
element that collates between 'A' and 'Z'. That means that for most locales
other than "C" or "POSIX", [A-Z] would match the single character 't' for
example, which is not what most people expect - or at least not what most
people have come to expect from regular expression engines. For this reason,
the default behaviour of boost.regex (perl mode) is to turn locale sensitive
collation off by not setting the regex_constants::collate compile time flag.
However if you set a non-default compile time flag - for example
regex_constants::extended or regex_constants::basic, then locale dependent
collation will be enabled, this also applies to the POSIX API functions which
use either regex_constants::extended or regex_constants::basic internally.
[Note - when regex_constants::nocollate in effect, the library behaves "as if"
the LC_COLLATE locale category were always "C", regardless of what its
actually set to - end note]."
So, it might be advisable to change the regexes used in sbuild-util.cc to use
the more reliable character classes, such as [:lower:] and [:digit:],
documented here:
http://www.boost.org/doc/libs/1_43_0/libs/regex/doc/html/boost_regex/syntax/character_classes/std_char_clases.html
Or, set some compile-time flags to force the locale sensitivity off.
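
To illustrate the first suggestion, here is a rough sketch of a validator that
uses named character classes instead of ranges. It is only an illustration
under assumptions: it uses std::regex from the C++ standard library as a
stand-in so it has no external dependency (boost::regex offers a similar
interface but is not guaranteed to behave identically), and both the function
name is_valid_session_name and the pattern are hypothetical, not what
sbuild-util.cc actually contains.

```cpp
#include <regex>
#include <string>

// Hypothetical validator, not the actual schroot code. The rule assumed
// here: a name starts with an alphanumeric character and continues with
// alphanumerics, '_', '.', or '-'.
bool is_valid_session_name(const std::string& name)
{
  // Named classes such as [:alnum:] avoid spelling out ranges like "a-z",
  // whose meaning can depend on collation order when collation is enabled.
  // The hyphen is escaped so it is read as a literal, not a range operator.
  static const std::regex pattern("[[:alnum:]][[:alnum:]_.\\-]*");

  // regex_match requires the whole string to match, so no anchors needed.
  return std::regex_match(name, pattern);
}
```

With this sketch, a name like "squeeze-build-01" would be accepted and
"bad/name" rejected. Note that named classes still consult the locale for
character classification, so this removes the dependence on collation order
rather than on the locale altogether.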
-Zach