[gopher] GopherMole - a gopher media crawler
Mateusz Viste
mateusz at viste.fr
Sat Jan 3 11:13:10 UTC 2015
On 01/03/2015 11:46 AM, James Mills wrote:
> Or does a Client query a Gopherd for CAPS
> and if it sees "Encoding: utf-8" assumes *all*
> content it receives from *that* Gopherd is
> encoded in UTF-8?
That's what I was suggesting, yes.
One could argue that a single server might contain a plethora of
documents, each of which would be encoded in a specific charset, and
that's certainly a possibility. But in practice, I have always seen
servers saying (in human language, mainly on their root page) "this
server is serving content in utf-8", and rarely or never "this specific
document is encoded in xyz".
But still, the CAPS capability I was suggesting was about a "default"
encoding, that is, "if not specified otherwise, assume everything on
this server is encoded in this encoding". That way, if one day there is
a mechanism that allows specifying the charset on a per-document basis,
the two won't collide (although I doubt such a specific mechanism will
appear, but of course one can never be sure of the future).
Currently, gopher clients are supposed to assume ISO Latin 1, as per RFC
1436. The ServerDefaultCharset CAPS setting I was suggesting in my
message from 31st of December, 2014, was simply a way to override that
RFC charset.
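
To make the precedence concrete, here is a rough sketch of how a client
could pick a charset. It is only an illustration under assumptions: it
treats the caps file as plain key=value lines, ServerDefaultCharset is
the proposed (not standardized) key, and the per-document argument
stands for a mechanism that does not exist today. The names parse_caps
and pick_charset are made up for this example.

```python
RFC1436_DEFAULT = "iso-8859-1"  # what clients are supposed to assume today


def parse_caps(caps_text):
    """Collect key=value pairs from a caps file, skipping other lines."""
    caps = {}
    for line in caps_text.splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        caps[key.strip()] = value.strip()
    return caps


def pick_charset(caps, per_document=None):
    """A per-document charset (hypothetical) wins, then the server-wide
    default from CAPS (proposed key), then the RFC 1436 fallback."""
    return per_document or caps.get("ServerDefaultCharset") or RFC1436_DEFAULT


# Example caps content; the ServerDefaultCharset line is the proposal.
caps = parse_caps("CAPS\nCapsVersion=1\nServerDefaultCharset=utf-8\n")
raw = "fran\u00e7ais".encode("utf-8")  # bytes as a UTF-8 server sends them

with_caps = raw.decode(pick_charset(caps))  # decoded as UTF-8
legacy = raw.decode(pick_charset({}))       # decoded as ISO Latin 1
```

With the CAPS entry the text decodes cleanly; without it the client
falls back to ISO Latin 1 and every byte above 127 turns into a wrong
character, which is the breakage described in the quoted message below.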
Mateusz
> On Sat, Jan 3, 2015 at 8:38 PM, Mateusz Viste <mateusz at viste.fr> wrote:
>
> On 01/03/2015 11:27 AM, James Mills wrote:
>
> Mis-rendered, correct (which is what I meant)
> but the client "won't break".
>
>
> That's correct.
>
> That's what I meant by "degrade".
>
>
> Sure, but that's hardly 'graceful'. And doesn't have anything to do
> with ISO-8859-1. Which doesn't mean I am opposed to UTF-8 usage in
> the gopherspace, on the contrary, I'm 100% for it. But it's
> important to keep in mind the exact impact it will have on legacy
> clients.
>
> *I think* a Gopher server that spits out UTF-8 encoded data to a
> Client that doesn't support UTF-8 encoding will still display the
> content (just not any codepoint higher than 255)?
>
>
> Only low-ascii will be rendered correctly, that is anything above
> code point 127 will be scrambled.
>
> Here's an example:
>
> gopher://gopher.viste.fr/0/docs/other/Little%2520Big%2520Adventure%2520-%2520Soluce%2520du%2520jeu%2520%2528french%2529.txt
>
> Same thing here (but with a Polish document):
>
> gopher://gopher.viste.fr/0/docs/opowiadania%2520%2528polish%2529/sendbajt.txt
>
> When I open these documents with Overbite, all French or Polish
> diacritics are broken (until I set my browser manually to UTF-8).
>
> Of course there are thousands of such examples across the gopherspace.
>
> Mateusz
>
>
> _______________________________________________
> Gopher-Project mailing list
> Gopher-Project at lists.alioth.debian.org
> http://lists.alioth.debian.org/cgi-bin/mailman/listinfo/gopher-project