
Re: character encoding

From: Vincent Lefevre <vincent(at)vinc17.org>
Date: Mon Dec 31 2007 - 23:52:07 EST


On 2007-12-31 15:08:24 -0800, Kelly Clowers wrote:
> On Dec 31, 2007 1:41 PM, ChadDavis <chadmichaeldavis@gmail.com> wrote:
> > 3) What is the encoding of the file name? Is this a feature of the
> > filesystem?
>
> This is also based on your locale.

And this is nasty: it means that if a user changes locale (or uses different locales depending on the context), existing filenames will be decoded with the wrong encoding and appear garbled; the same happens with system scripts that run under the C locale. Also, users working under different locales won't easily be able to share files.
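To illustrate the problem, here is a minimal Python sketch. It assumes one user works in an ISO-8859-1 locale and another in a UTF-8 locale; the name "café" and the variable names are only examples. No file is created: the point is just that the filesystem stores bytes, and the locale decides how they are read back.

```python
# The filesystem stores filenames as bytes; the locale decides how to decode them.
name = "caf\u00e9"  # "café", hypothetical example name

# Bytes written by a user in an ISO-8859-1 (Latin-1) locale.
latin1_bytes = name.encode("latin-1")
# Bytes written by a user in a UTF-8 locale.
utf8_bytes = name.encode("utf-8")

# Same visible name, two different byte sequences on disk.
print(latin1_bytes)  # b'caf\xe9'
print(utf8_bytes)    # b'caf\xc3\xa9'

# The UTF-8 user reading the Latin-1 name gets an invalid sequence,
# shown here as the U+FFFD replacement character.
seen_by_utf8_user = latin1_bytes.decode("utf-8", errors="replace")

# The Latin-1 user reading the UTF-8 name gets two wrong characters.
seen_by_latin1_user = utf8_bytes.decode("latin-1")

print(seen_by_utf8_user)
print(seen_by_latin1_user)
```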

Workaround 1: don't use non-ASCII characters in filenames. This may not be very user-friendly, but it is 100% compatible with everything.

Workaround 2 (if ASCII isn't sufficient): always use UTF-8. But be careful about normalization problems (NFC/NFD...). Linux does not normalize filenames, so you may end up with several files that have the same name (but are encoded differently) in the same directory.
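The normalization problem can be sketched in Python with the standard unicodedata module. The name "café" and the helper same_name() are only illustrations, not something any tool provides; the sketch compares byte strings and does not touch the filesystem.

```python
import unicodedata

name = "caf\u00e9"  # "café", hypothetical example name

# NFC: "é" is one precomposed code point (U+00E9).
nfc_bytes = unicodedata.normalize("NFC", name).encode("utf-8")
# NFD: "é" is "e" followed by a combining acute accent (U+0065 U+0301).
nfd_bytes = unicodedata.normalize("NFD", name).encode("utf-8")

# Both are valid UTF-8 and look the same on screen, but the bytes differ,
# so a filesystem that compares names byte by byte treats them as two files.
print(nfc_bytes)  # b'caf\xc3\xa9'
print(nfd_bytes)  # b'cafe\xcc\x81'


def same_name(a: bytes, b: bytes) -> bool:
    """Hypothetical helper: compare two UTF-8 filenames after NFC normalization."""
    to_nfc = lambda raw: unicodedata.normalize("NFC", raw.decode("utf-8"))
    return to_nfc(a) == to_nfc(b)


print(nfc_bytes == nfd_bytes)           # False
print(same_name(nfc_bytes, nfd_bytes))  # True
```

One possible precaution, under these assumptions, is to normalize names to a single form (for example NFC) before creating files.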

-- 
Vincent Lefèvre <vincent(at)vinc17.org> - Web: <http://www.vinc17.org/>
100% accessible validated (X)HTML - Blog: <http://www.vinc17.org/blog/>
Work: CR INRIA - computer arithmetic / Arenaire project (LIP, ENS-Lyon)


Received on Tue Jan 1 00:09:36 2008


