[Debburn-devel] Character sets in UDF

Florent Rougon f.rougon at free.fr
Sun Jan 21 21:49:44 CET 2007


Hi,

Eduard Bloch <edi at gmx.de> wrote:

> Uhm, that is the point of Unicode. If you know the encoding and it
> covers the whole Unicode and this is the only allowed mode, what is the
> point in additional metadata? 

I'm not sure what you mean by "If you know the encoding". If you're
talking about my files: yes, sure, I know the encoding. But if you're
talking about a random UDF filesystem burnt to DVD, I don't know whether
it's possible to find out the encoding, and that's a big part of my
question.

IOW: yes, I can probably use UTF-8 in my UDF filesystems, but if I give
the resulting DVDs to someone else, can his software know without
resorting to ugly guesswork that file names are UTF-8-encoded? (Unicode
is only the charset, not the encoding...)
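
To show what I mean by guesswork, here is a minimal Python sketch. The
file name and the looks_like_utf8() helper are made up for illustration,
and nothing here is UDF-specific: it only shows that raw file name bytes
do not reveal their own encoding, and that a "is it valid UTF-8?" check
is a heuristic which can be wrong.

```python
# Hypothetical file name, used only for illustration.
name = "résumé.txt"

# The bytes a writer using UTF-8 would store for this name.
utf8_bytes = name.encode("utf-8")

# A reader that assumes ISO-8859-1 decodes the same bytes without any
# error, but gets different text (mojibake).
as_latin1 = utf8_bytes.decode("iso-8859-1")


def looks_like_utf8(raw: bytes) -> bool:
    """Heuristic guess: True if raw happens to be well-formed UTF-8."""
    try:
        raw.decode("utf-8")
        return True
    except UnicodeDecodeError:
        return False


# A name that really was written in ISO-8859-1 can still pass the check:
# the two Latin-1 characters below are stored as bytes C3 A9, which is
# also the UTF-8 encoding of a single "é".
ambiguous = "Ã©".encode("iso-8859-1")

print(repr(as_latin1))
print(looks_like_utf8(utf8_bytes), looks_like_utf8(ambiguous))
```

Without metadata recorded in the filesystem, the reader can only apply
this kind of check and hope.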

The only clean solutions to all these charset problems are either to
have only one charset _and_ encoding allowed (which is not very
flexible), or to record the charset and encoding used as metadata. I
looked for this information regarding UDF, but the answer is not clear
to me... (page 11 of the specs for UDF 2.60 downloaded from
http://www.osta.org/specs/, no idea what a d-character is, for a start).

Would a UDF fs with filenames encoded in UTF-8 be readable on Windows
and Mac OS X?

> There is a problem with large files. 2..4GB are possible with UDF, but
> then something goes wrong, looks like a subtle bug or design limitation
> in genisoimage's code. You can disable the largefile prevention hook but
> the resulting FS has broken files. I did not have time to investigate it
> yet, somebody needs to step out and fix it. Competing tools like
> commercial NeroLinux seem to do it well.

OK, thanks for the info.

-- 
Florent


