[Evolution] Bug#506373: complement on the subject line & body
Cyrille Chépélov
cyrille at chepelov.org
Thu Nov 20 23:31:53 UTC 2008
retitle 506373 Evolution recklessly ignores the charset on text/html
email fragments and causes glib's death by ana-utf8-phylactic shock
thanks
Although the subject line is (correctly) encoded in windows-1252 and
appears to contain the offending string, it does not appear to be the
cause of trouble.
The offending string can be found in the scrap of html sent by Google as
the first MIME part of the message body; quoting the bit:
<div style="width:370px; background:#D2E6D2; border-style:solid;
border-color:#ccc; border-width:1px 1px 0 1px; padding:15px 15px
5px 15px; margin:0 auto"><p
style="margin:0;color:#0">cyrille at chepelov.org,
vous êtes invité(e) à participer à</p>
<h2 style="margin:5px 0; font-size:18px;
line-height:1.4;color:#0">Concert Paris-Novembre (Réxx
Vyyyyé)</h2>
Here, gedit automatically converted the text from ISO-8859-15 to UTF-8,
hence none of the diacritics appear mutilated. Hexdumping the MIME part
confirms a single-byte encoding consistent with ISO-8859-15:
000001c0  20 73 74 79 6c 65 3d 22  6d 61 72 67 69 6e 3a 30  | style="margin:0|
000001d0  3b 63 6f 6c 6f 72 3a 23  30 22 3e 63 79 72 69 6c  |;color:#0">cyril|
000001e0  6c 65 40 63 68 65 70 65  6c 6f 76 2e 6f 72 67 2c  |le@chepelov.org,|
000001f0  0a 76 6f 75 73 20 ea 74  65 73 20 69 6e 76 69 74  |.vous .tes invit|
00000200  e9 28 65 29 20 e0 20 70  61 72 74 69 63 69 70 65  |.(e) . participe|
00000210  72 20 e0 3c 2f 70 3e 0a  3c 68 32 20 73 74 79 6c  |r .</p>.<h2 styl|
00000220  65 3d 22 6d 61 72 67 69  6e 3a 35 70 78 20 30 3b  |e="margin:5px 0;|
00000230  20 66 6f 6e 74 2d 73 69  7a 65 3a 31 38 70 78 3b  | font-size:18px;|
00000240  20 6c 69 6e 65 2d 68 65  69 67 68 74 3a 31 2e 34  | line-height:1.4|
00000250  3b 63 6f 6c 6f 72 3a 23  30 22 3e 43 6f 6e 63 65  |;color:#0">Conce|
00000260  72 74 20 50 61 72 69 73  2d 4e 6f 76 65 6d 62 72  |rt Paris-Novembr|
00000270  65 20 28 52 e9 78 78 20  56 79 79 79 79 e9 29 3c  |e (R.xx Vyyyy.)<|
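For anyone wanting to check the dump without the original message, here is a minimal Python sketch (not anything Evolution or glib actually runs). It takes the bytes at offsets 0x1f0 through 0x212 and shows that they are not valid UTF-8, while they decode to the expected French text as a single-byte charset. The helper name is_valid_utf8 is made up for illustration. Note that 0xea, 0xe9 and 0xe0 map to the same characters in windows-1252 and ISO-8859-15, so this excerpt alone cannot tell the two apart.

```python
# Bytes copied from offsets 0x1f0..0x212 of the hexdump above.
raw = bytes.fromhex(
    "0a 76 6f 75 73 20 ea 74 65 73 20 69 6e 76 69 74"
    "e9 28 65 29 20 e0 20 70 61 72 74 69 63 69 70 65"
    "72 20 e0"
)


def is_valid_utf8(data: bytes) -> bool:
    """Hypothetical helper: True if data decodes cleanly as UTF-8."""
    try:
        data.decode("utf-8")
        return True
    except UnicodeDecodeError:
        return False


# Decode the same bytes under both single-byte charsets mentioned in this report.
decoded = {name: raw.decode(name) for name in ("windows-1252", "iso-8859-15")}

print(is_valid_utf8(raw))                                  # False
print(decoded["windows-1252"] == decoded["iso-8859-15"])   # True
print(decoded["windows-1252"].strip())
```

Handing `raw` unchanged to anything that insists on UTF-8 is therefore bound to fail; re-encoding the decoded string with `.encode("utf-8")` yields a valid sequence.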
Inspecting the raw RFC-2822 message, it appears that the HTML part does
carry the header Content-Type: text/html; charset=windows-1252.
While I regret that Google did not include redundant metadata within the
text/html part itself, not only was there proper warning that this was
not UTF-8, but the default encoding was also set to 8859-15. Therefore,
what happened is that Evolution failed to convert this fragment into
proper UTF-8 before handing it over to glib (and in any case, it
definitely should have bleached it so as not to pass an invalid UTF-8
fragment down to the HTML renderer). Assigning the blame to Evolution for
sure.
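To make the expected behaviour concrete, here is a rough sketch in Python using the standard email module. It illustrates the conversion step I would expect from a mail client and is not Evolution's code. The function name part_text_as_utf8, the FALLBACK_CHARSET constant and the SAMPLE message are all invented for the example. The idea: honour the charset declared on the MIME part, fall back to the configured default when it is missing or unusable, and never pass on bytes that are not valid UTF-8.

```python
from email import message_from_bytes
from email.message import Message

# Assumption: stands in for the user's configured default charset.
FALLBACK_CHARSET = "iso-8859-15"


def part_text_as_utf8(part: Message, fallback: str = FALLBACK_CHARSET) -> bytes:
    """Return the part's text re-encoded as valid UTF-8 (hypothetical helper)."""
    # decode=True undoes the Content-Transfer-Encoding and returns raw bytes.
    payload = part.get_payload(decode=True) or b""
    # Charset declared in the part's Content-Type header, if any.
    charset = part.get_content_charset() or fallback
    try:
        text = payload.decode(charset)
    except (LookupError, UnicodeDecodeError):
        # Unknown or wrong charset: use the fallback and replace bad bytes
        # rather than letting them through.
        text = payload.decode(fallback, errors="replace")
    return text.encode("utf-8")


# Made-up single-part message mimicking the headers described above.
SAMPLE = (
    b"Content-Type: text/html; charset=windows-1252\r\n"
    b"Content-Transfer-Encoding: 8bit\r\n"
    b"\r\n"
    b"<p>vous \xeates invit\xe9(e) \xe0 participer \xe0</p>\r\n"
)

html_utf8 = part_text_as_utf8(message_from_bytes(SAMPLE))
print(html_utf8.decode("utf-8"))
```

Whatever the real fix looks like, the point is the same: the renderer should only ever see the output of such a conversion, never the raw part body.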
I will gladly provide the raw RFC-2822 offending message, but on a
non-disclosure basis.
Thanks in advance.
-- Cyrille