[gopher] GopherMole - a gopher media crawler

Matjaž Mešnjak matjaz85 at gmail.com
Sat Jan 3 11:29:30 UTC 2015


Well, theoretically you could use a UTF-8 BOM to mark UTF-8 files on a
per-document basis. But implementing BOM detection in existing clients
is of course not always an option. In my opinion, using UTF-8 as the
default encoding is a good compromise. RFC-compliant clients will still
work and English-only text will be displayed as expected. For other
languages, having the option to display accented characters in some
clients (the UTF-8-enabled ones) is still better than not being able to
display them at all.
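
To make the idea concrete, here is a minimal sketch of what such a client
could do with the raw bytes of a text document. The function name
decode_gopher_text is made up for illustration, and the fallback to
ISO-8859-1 when the bytes are not valid UTF-8 is my own assumption about
how a client might stay compatible with RFC 1436 content; no existing
client is claimed to work this way.

```python
import codecs


def decode_gopher_text(raw: bytes) -> str:
    """Decode a fetched text document (hypothetical helper, illustration only)."""
    # A leading UTF-8 BOM (bytes EF BB BF) marks the document as UTF-8.
    if raw.startswith(codecs.BOM_UTF8):
        return raw[len(codecs.BOM_UTF8):].decode("utf-8", errors="replace")
    try:
        # No BOM: treat UTF-8 as the default encoding.
        return raw.decode("utf-8")
    except UnicodeDecodeError:
        # Assumption: bytes that are not valid UTF-8 are legacy
        # ISO-8859-1 content, as RFC 1436 clients expect.
        return raw.decode("iso-8859-1")
```

Pure ASCII decodes identically under both encodings, so English-only
documents are unaffected by the choice of default.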

cheers
Matjaz

2015-01-03 12:13 GMT+01:00 Mateusz Viste <mateusz at viste.fr>:

> On 01/03/2015 11:46 AM, James Mills wrote:
>
>> Or does a Client query a Gopherd for CAPS
>> and if it sees "Encoding: utf-8" assumes *all*
>> content it receives from *that* Gopherd is
>> encoded in UTF-8?
>>
>
> That's what I was suggesting, yes.
>
> One could argue that a single server might contain a plethora of
> documents, each of which would be encoded in a specific charset, and that's
> certainly a possibility. But in practice, I have always seen servers saying
> (in human language, mainly on their root page) "this server is serving
> content in utf-8", and rarely or never "this specific document is encoded
> in xyz".
>
> But still, the CAPS capability I was suggesting was about a "default"
> encoding, that is, "if not specified otherwise, assume everything on this
> server is encoded in this encoding". That way, if one day there is a
> mechanism that allows specifying the charset on a per-document basis, the
> two won't collide (although I doubt such a specific mechanism will appear,
> but of course one can never be sure of the future).
>
> Currently, gopher clients are supposed to assume ISO Latin 1, as per RFC
> 1436. The ServerDefaultCharset CAPS setting I suggested in my message
> of 31 December 2014 was simply a way to override that RFC charset.
>
> Mateusz
>
>
>
>
>
>> On Sat, Jan 3, 2015 at 8:38 PM, Mateusz Viste <mateusz at viste.fr> wrote:
>>
>>     On 01/03/2015 11:27 AM, James Mills wrote:
>>
>>         Mis-rendered, correct (which is what I meant),
>>         but the client "won't break".
>>
>>
>>     That's correct.
>>
>>         That's what I meant by "degrade".
>>
>>
>>     Sure, but that's hardly 'graceful'. And doesn't have anything to do
>>     with ISO-8859-1. Which doesn't mean I am opposed to UTF-8 usage in
>>     the gopherspace, on the contrary, I'm 100% for it. But it's
>>     important to keep in mind the exact impact it will have on legacy
>>     clients.
>>
>>         *I think* a Gopher server that spits out UTF-8 encoded data to
>>         a client that doesn't support UTF-8 encoding will still display
>>         the content (just not any codepoint higher than 255)?
>>
>>
>>     Only low ASCII will be rendered correctly; that is, anything above
>>     code point 127 will be scrambled.
>>
>>     Here's an example:
>>
>>     gopher://gopher.viste.fr/0/docs/other/Little%2520Big%2520Adventure%2520-%2520Soluce%2520du%2520jeu%2520%2528french%2529.txt
>>
>>     Same thing here (but with a Polish document):
>>
>>     gopher://gopher.viste.fr/0/docs/opowiadania%2520%2528polish%2529/sendbajt.txt
>>
>>     When I open these documents with Overbite, all French or Polish
>>     diacritics are broken (until I manually set my browser to UTF-8).
>>
>>     Of course there are thousands of such examples across the gopherspace.
>>
>>     Mateusz
>>
>>
>>     _______________________________________________
>>     Gopher-Project mailing list
>>     Gopher-Project at lists.alioth.debian.org
>>     http://lists.alioth.debian.org/cgi-bin/mailman/listinfo/gopher-project