[gopher] RFC submission?

Nuno Silva nunojsilva at ist.utl.pt
Sat Jan 3 11:39:07 UTC 2015


On 2015-01-03 21:25, James Mills wrote:
> On Sat, Jan 3, 2015 at 8:53 PM, Nuno Silva <nunojsilva at ist.utl.pt> wrote:
> 
> > Aren't you mixing ISO-8859-1 with ASCII? AFAIK, byte values 0-127 in
> > UTF-8 represent the same characters as in ASCII. While Wikipedia says
> > that "ISO-8859-1 was incorporated as the first 256 code points of
> > ISO/IEC 10646 and Unicode.", this is, if I'm not mistaken, the
> > *character set*, not the encoding for these characters. In the specific
> > case of utf8, the lowest 128 codepoints are represented in a way that is
> > compatible with ASCII (and thus with ISO-8859-*[1]), but codepoints from
> > 128 upwards, even those from ISO-8859-1, are encoded as multi-byte
> > sequences and so aren't compatible with ISO-8859-1.
> >
> > See, for example,
> > https://en.wikipedia.org/wiki/%C3%81#Character_mappings
> >
> 
> You are of course quite right :) Thank you for clearing that up.
> 
> I think the point is clear though?
> Older clients *can* handle UTF-8 encoding even now
> (even if not rendered properly).

Improperly rendered UTF-8 will easily become unreadable[1], which is my
main problem when mixing encodings. By "unreadable" I mean that you
can't get the meaning of the text.

It won't make clients explode, but it will ruin the content.
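
To illustrate the kind of damage meant here, a minimal sketch, assuming a
client that decodes everything as ISO-8859-1. The helper name
view_as_latin1 and the sample strings are made up for illustration; this
is not taken from any particular gopher client.

```python
# Sketch: what an ISO-8859-1-only client would show for UTF-8 bytes.
def view_as_latin1(text: str) -> str:
    """Encode text as UTF-8, then decode those bytes as ISO-8859-1."""
    return text.encode("utf-8").decode("iso-8859-1")


ascii_only = "gopher menu"
portuguese = "não há ligação"

# Pure ASCII survives, because UTF-8 and ISO-8859-1 agree on bytes 0-127.
print(view_as_latin1(ascii_only))   # gopher menu

# Each accented letter is two bytes in UTF-8, so it turns into two
# unrelated ISO-8859-1 characters.
print(view_as_latin1(portuguese))   # nÃ£o hÃ¡ ligaÃ§Ã£o
```

With only a few accented letters the text can still be guessed at, but the
more non-ASCII characters a language needs, the less of it survives.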

[1] Unless you're writing in a language like English, which uses
ASCII-compatible codepoints for the words and where UTF-8 or ISO 8859 will
probably be used only for non-word parts of the text, like writing
'™'. Several languages require characters that are not part of ASCII,
including Finnish, Spanish, French and Portuguese.

But preferring ISO-8859-1 won't help for languages which use other ISO
8859 encodings.
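
As a small sketch of why one default can't cover the whole ISO 8859
family: the same byte decodes to a different character in each part. The
helper decode_with_each and the choice of byte 0xE6 are only an
illustration.

```python
# Sketch: one byte, three different ISO 8859 parts, three characters.
def decode_with_each(raw: bytes, codecs: list) -> dict:
    """Decode the same bytes with every codec in the list."""
    return {codec: raw.decode(codec) for codec in codecs}


results = decode_with_each(
    bytes([0xE6]),
    ["iso-8859-1", "iso-8859-5", "iso-8859-7"],
)

for codec, char in results.items():
    print(codec, char)
# iso-8859-1 æ
# iso-8859-5 ц
# iso-8859-7 ζ
```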

Are there any gopher clients that try to autodetect whether the text is
UTF-8 or ISO-8859? (If that's even possible without false positives - I
guess it's easier with ISO-8859-1...)
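
One heuristic such a client could use, as a rough sketch only: try a
strict UTF-8 decode first and fall back to a single-byte encoding when
that fails. The function guess_decode and its fallback parameter are
hypothetical, and I'm not claiming any existing client works this way.

```python
from typing import Tuple


def guess_decode(raw: bytes, fallback: str = "iso-8859-1") -> Tuple[str, str]:
    """Return (text, encoding), preferring UTF-8 when the bytes are valid."""
    try:
        # Strict decoding raises on any byte sequence that is not valid UTF-8.
        return raw.decode("utf-8"), "utf-8"
    except UnicodeDecodeError:
        # Assumption: anything that is not valid UTF-8 is in the fallback.
        return raw.decode(fallback), fallback


print(guess_decode("olá".encode("utf-8")))       # ('olá', 'utf-8')
print(guess_decode("olá".encode("iso-8859-1")))  # ('olá', 'iso-8859-1')

# A false positive: ISO-8859-1 text that happens to be valid UTF-8.
print(guess_decode("Ã£".encode("iso-8859-1")))   # ('ã', 'utf-8')
```

The false positive needs an unlikely character pair, so in practice the
UTF-8 test is fairly reliable. What this can't do is tell the ISO 8859
parts apart: every byte value decodes successfully as ISO-8859-1, so the
fallback never fails even when the text was really in another part.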

-- 
Nuno Silva (aka njsg)
Helsinki, Finland


