[gopher] CAPS capability: ServerDefaultCharset

Mateusz Viste mateusz at viste.fr
Sat Jan 3 17:23:35 UTC 2015


Hello,

For those interested in Unicode and UTF-8 handling: not long ago I wrote 
a converter that decodes UTF-8 content and encodes it into several 8-bit 
codepages (it can also work the other way). The source code is pretty 
readable, and it's available here:

http://sourceforge.net/p/utf8tocp/code/HEAD/tree/utf8tocp.c

It comes with several lookup tables already. I developed it primarily 
for the FreeDOS localization project.
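
To give a rough idea of the general approach (decode UTF-8 into a code 
point, then look it up in a codepage table), here is a minimal sketch. It 
is not the actual utf8tocp code: the names cp_entry, sample_table, 
utf8_decode and utf8_to_cp are made up for illustration, and the table 
holds only four CP437 entries instead of a full codepage.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical table entry: a Unicode code point and its 8-bit code. */
struct cp_entry {
    uint32_t unicode;
    unsigned char byte;
};

/* Tiny illustrative excerpt of CP437; a real table covers 0x80..0xFF. */
static const struct cp_entry sample_table[] = {
    { 0x00FC, 0x81 }, /* u with diaeresis */
    { 0x00E9, 0x82 }, /* e with acute     */
    { 0x00E4, 0x84 }, /* a with diaeresis */
    { 0x00F6, 0x94 }, /* o with diaeresis */
};

/* Decode one UTF-8 sequence from s (n bytes available) into *cp.
 * Returns the number of bytes consumed, or 0 on malformed input.
 * Simplified: overlong forms and surrogates are not rejected. */
static size_t utf8_decode(const unsigned char *s, size_t n, uint32_t *cp)
{
    size_t len, i;

    if (n == 0) return 0;
    if (s[0] < 0x80) { *cp = s[0]; return 1; }
    else if ((s[0] & 0xE0) == 0xC0) { *cp = s[0] & 0x1F; len = 2; }
    else if ((s[0] & 0xF0) == 0xE0) { *cp = s[0] & 0x0F; len = 3; }
    else if ((s[0] & 0xF8) == 0xF0) { *cp = s[0] & 0x07; len = 4; }
    else return 0;

    if (n < len) return 0;
    for (i = 1; i < len; i++) {
        if ((s[i] & 0xC0) != 0x80) return 0; /* not a continuation byte */
        *cp = (*cp << 6) | (s[i] & 0x3F);
    }
    return len;
}

/* Map a code point to its 8-bit code, or '?' if the table lacks it. */
static unsigned char cp_lookup(uint32_t cp)
{
    size_t i;

    if (cp < 0x80) return (unsigned char)cp; /* ASCII maps to itself */
    for (i = 0; i < sizeof sample_table / sizeof sample_table[0]; i++)
        if (sample_table[i].unicode == cp) return sample_table[i].byte;
    return '?';
}

/* Convert a NUL-terminated UTF-8 string into the 8-bit codepage.
 * The output is NUL-terminated and truncated if out is too small.
 * Returns the number of bytes written, not counting the NUL. */
size_t utf8_to_cp(const char *in, char *out, size_t outsize)
{
    const unsigned char *s = (const unsigned char *)in;
    size_t n, i = 0, o = 0;

    assert(in != NULL && out != NULL && outsize > 0);
    n = strlen(in);
    while (i < n && o + 1 < outsize) {
        uint32_t cp;
        size_t used = utf8_decode(s + i, n - i, &cp);
        if (used == 0) {          /* malformed byte: emit '?' and skip it */
            out[o++] = '?';
            i++;
            continue;
        }
        out[o++] = (char)cp_lookup(cp);
        i += used;
    }
    out[o] = '\0';
    return o;
}
```

Calling utf8_to_cp("caf\xC3\xA9", buf, sizeof buf) on this sample table 
writes the four bytes 'c', 'a', 'f', 0x82. Going the other way is the 
same table searched on the byte column, followed by a UTF-8 encoder.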

The utf8tocp project's main page is this:

http://sourceforge.net/projects/utf8tocp/

Mateusz




On 01/03/2015 06:18 PM, Nuno Silva wrote:
> On 2015-01-03 17:55, Kim Holviala wrote:
>>> On 03 Jan 2015, at 17:49, Nuno Silva <nunojsilva at ist.utl.pt> wrote:
>>>
>>> You mean Gophernicus can even handle both ISO-8859-1 and UTF-8 if
>>> they're mixed inside the *same* document? That's neat! (And it also
>>> degrades in a nice way!)
>>
>> Yep, it works even if they are used within a single line of text. I first tried to use the GNU iconv() but that function was just incredibly stupid so I wrote my own. While writing it I realized I can just autodetect all input on a char-by-char basis, skip most of the “official” conversion tables and just focus on US-ASCII/Latin-1/first plane of UTF-8. My strniconv() is purely an 80/20 implementation, and that’s good enough for me.
>>
>
> Out of curiosity, have you made a standalone (iconv-like) tool using the
> code you wrote? Even if it is just 80/20, that is something I could use
> in some situations.
>
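
The char-by-char autodetection described in the quoted text could look 
roughly like the sketch below. This is not Gophernicus's strniconv(); the 
names utf8_seq_len and mixed_to_utf8 are invented here, and the output is 
assumed to be UTF-8. Each byte is either plain ASCII, the start of a 
well-formed UTF-8 sequence (copied as is), or otherwise taken to be 
ISO-8859-1 and re-encoded as two UTF-8 bytes.

```c
#include <assert.h>
#include <stddef.h>

/* Length (2..4) of a well-formed UTF-8 multi-byte sequence starting at s,
 * or 0 if s does not start one. Only the bit patterns are checked. */
static size_t utf8_seq_len(const unsigned char *s, size_t n)
{
    size_t len, i;

    if (n == 0) return 0;
    if ((s[0] & 0xE0) == 0xC0) len = 2;
    else if ((s[0] & 0xF0) == 0xE0) len = 3;
    else if ((s[0] & 0xF8) == 0xF0) len = 4;
    else return 0;

    if (n < len) return 0;
    for (i = 1; i < len; i++)
        if ((s[i] & 0xC0) != 0x80) return 0; /* not a continuation byte */
    return len;
}

/* Normalize mixed ASCII / ISO-8859-1 / UTF-8 input to UTF-8.
 * The output is NUL-terminated and truncated at a character boundary
 * if out is too small. Returns bytes written, not counting the NUL. */
size_t mixed_to_utf8(const char *in, size_t inlen, char *out, size_t outsize)
{
    const unsigned char *s = (const unsigned char *)in;
    size_t i = 0, o = 0;

    assert(in != NULL && out != NULL && outsize > 0);
    while (i < inlen) {
        size_t len = (s[i] < 0x80) ? 1 : utf8_seq_len(s + i, inlen - i);

        if (len > 0) {
            /* ASCII or valid UTF-8: copy unchanged */
            if (o + len >= outsize) break;
            while (len-- > 0) out[o++] = (char)s[i++];
        } else {
            /* assume Latin-1: U+0080..U+00FF needs two UTF-8 bytes */
            if (o + 2 >= outsize) break;
            out[o++] = (char)(0xC0 | (s[i] >> 6));
            out[o++] = (char)(0x80 | (s[i] & 0x3F));
            i++;
        }
    }
    out[o] = '\0';
    return o;
}
```

The heuristic can guess wrong: a Latin-1 byte pair that happens to form a 
valid UTF-8 sequence is taken as UTF-8. In ordinary text that is rare, 
which fits the 80/20 spirit.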


