[Logcheck-devel] Bug#401259: logcheck: logcheck needs to override locale for grep

Chris Hanson cph at debian.org
Sat Dec 2 06:17:28 UTC 2006


Package: logcheck
Version: 1.2.51
Severity: normal

Logcheck has an implicit assumption that the default locale should be
used by grep when processing log files.  However, that's not always
the case.  For example, I use the locale "en_US.UTF-8", and
consequently grep assumes that its inputs are encoded as UTF-8.  But
the log files appear to be encoded as ISO 8859-1, which means that
sometimes my rules don't match.

Specifically, I have a rule that reads 

^\w{3} [ :0-9]{11} [._[:alnum:]-]+ kernel: input: .* as /class/input/input[0-9]+$

which is supposed to ignore messages from the kernel announcing input
devices.  But the following log line isn't matched by it:

Dec  2 00:05:01 ravna kernel: input: Microsoft Microsoft Wireless Optical Mouse® 1.0A as /class/input/input3

The reason it doesn't match is that the "R in a circle" character is
encoded in the log file as the single ISO 8859-1 byte 0xae, which
isn't a valid first byte of a UTF-8 sequence.  Consequently, the "."
pattern doesn't match it.  In fact, I don't think there's _any_ way to
match this byte sequence in a UTF-8 locale.
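
To make the byte-level problem concrete, here is a minimal Python
sketch.  It is only an illustration and not anything logcheck or grep
does internally; the helper names (is_valid_utf8, is_continuation_byte)
are made up for this example.  It shows that 0xae has the bit pattern
of a UTF-8 continuation byte, so a string containing it on its own is
not valid UTF-8, while the same bytes decode fine as ISO 8859-1.

```python
# Part of the device name from the log line, with the registered sign
# stored as the single ISO 8859-1 byte 0xae.
RAW = b"Microsoft Wireless Optical Mouse\xae 1.0A"


def is_valid_utf8(data: bytes) -> bool:
    """Return True if data decodes cleanly as UTF-8."""
    try:
        data.decode("utf-8")
    except UnicodeDecodeError:
        return False
    return True


def is_continuation_byte(byte: int) -> bool:
    """UTF-8 continuation bytes have the bit pattern 10xxxxxx."""
    return (byte & 0b11000000) == 0b10000000


if __name__ == "__main__":
    print(is_valid_utf8(RAW))          # the lone 0xae makes this invalid
    print(is_continuation_byte(0xAE))  # 0xae can only follow a lead byte
    print(RAW.decode("latin-1"))       # decodes cleanly as ISO 8859-1
    print("\u00ae".encode("utf-8"))    # the UTF-8 form is two bytes
```

In valid UTF-8 the same character would be the two-byte sequence
0xc2 0xae, which is why a lone 0xae cannot be read as a character.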

Unfortunately I'm not sure what to do about this, because it's not
obvious how the log-file messages relate to the locale.  This message
comes from the kernel, which presumably doesn't know what the locale
is.  Furthermore, this particular text is coming directly from the
device, and just being passed along by the kernel -- I have no idea if
USB specifies the character coding that is used in these strings, or
if it's just an uninterpreted sequence of bytes that are encoded any
way the manufacturer pleases.

One thing that works in this case is to set "LC_ALL=C" prior to
calling grep.  But if the log files sometimes contain UTF-8 coding,
this will mess that up.  Perhaps the kernel log lines need to be
handled differently?
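
The trade-off can be sketched in Python, using byte-oriented regular
expressions as a rough analogue of running grep with LC_ALL=C.  This
is an assumption-laden illustration, not logcheck's code: the function
name matches_bytewise is hypothetical, and [:alnum:] is spelled out as
ASCII ranges because Python's re module has no POSIX character
classes.

```python
import re

# The rule from this report, as a bytes pattern.  In bytes mode "."
# matches any single byte except a newline, much like grep in the C
# locale.
RULE = (rb"^\w{3} [ :0-9]{11} [._0-9A-Za-z-]+ kernel: input: .*"
        rb" as /class/input/input[0-9]+$")

# The problem log line, with the registered sign as the raw byte 0xae.
LINE = (b"Dec  2 00:05:01 ravna kernel: input: Microsoft Microsoft "
        b"Wireless Optical Mouse\xae 1.0A as /class/input/input3")

# A genuine UTF-8 character: e-acute is the two bytes 0xc3 0xa9.
E_ACUTE = "\u00e9".encode("utf-8")


def matches_bytewise(pattern: bytes, line: bytes) -> bool:
    """Match without interpreting the input as any character encoding."""
    return re.search(pattern, line) is not None


if __name__ == "__main__":
    # Byte-wise matching accepts the stray 0xae byte.
    print(matches_bytewise(RULE, LINE))
    # The cost: one UTF-8 character now counts as two "characters".
    print(matches_bytewise(rb"^.$", E_ACUTE))
    print(matches_bytewise(rb"^..$", E_ACUTE))
```

So byte-wise matching makes the rule above work, but any rule that
counts or classifies individual non-ASCII characters would see UTF-8
text as a run of separate bytes.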

I hope you have a better idea about how to handle this.

-- System Information:
Debian Release: 4.0
  APT prefers unstable
  APT policy: (500, 'unstable')
Architecture: i386 (i686)
Shell:  /bin/sh linked to /bin/bash
Kernel: Linux 2.6.19-cph1
Locale: LANG=en_US.UTF-8, LC_CTYPE=en_US.UTF-8 (charmap=UTF-8)

Versions of packages logcheck depends on:
ii  adduser          3.100                   Add and remove users and groups
ii  cron             3.0pl1-99               management of regular background p
ii  debconf          1.5.9                   Debian configuration management sy
ii  exim4            4.63-10                 metapackage to ease exim MTA (v4) 
ii  exim4-daemon-lig 4.63-10                 lightweight exim MTA (v4) daemon
ii  grep             2.5.1.ds2-6             GNU grep, egrep and fgrep
ii  lockfile-progs   0.1.10                  Programs for locking and unlocking
ii  logtail          1.2.51                  Print log file lines that have not
ii  mailx            1:8.1.2-0.20050715cvs-1 A simple mail user agent
ii  sysklogd [system 1.4.1-20                System Logging Daemon

Versions of packages logcheck recommends:
ii  logcheck-database             1.2.51     database of system log rules for t

-- debconf information:
  logcheck/changes:
  logcheck/install-note:




