Re: character encoding
From: Kelly Clowers <kelly.clowers(at)gmail.com>
Date: Mon Dec 31 2007 - 18:08:24 EST
On new Etch installs, UTF-8 is the default. On older systems, it depends on your locale (I'm not sure whether a system upgraded to Etch would be UTF-8 or not). In the US it would be ISO-8859-1 or ISO-8859-15, I think. Run the command "locale" and see what it says. Mine says en_US.UTF-8

> 2) It seems that a file itself doesn't have any encoding as it is sitting on

All files have an encoding or format of some kind. Text files do, of course, but so do binary files like .jpg or .mp3. Even binary executables and libraries have a specific format (binary executables are in ELF format on non-ancient Linux systems).

When a text file is opened, I believe most simple apps try to interpret it based on your system's locale. Some smarter programs may apply fairly complicated heuristics to determine the encoding. Some plain-text-based file types, such as XML, declare the encoding near the beginning of the file.

> 3) What is the encoding of the file name? Is this a feature of the

This is also based on your locale. Note that if you download a text file that is in, say, Shift-JIS (a common Japanese encoding), the file and perhaps the filename will still be in Shift-JIS. Even if your system is UTF-8 and has Japanese fonts installed, it will not display the file correctly if it simply interprets it based on your locale.

There are programs that can convert between encodings, including the "convmv" package (which converts only filenames), the "utf8-migration-tool" package, and the "recode" package.

> I realize these questions may not be that "smart"; please tell me what I'm

For general info start with these wiki pages and some of the other pages they link to:
http://en.wikipedia.org/wiki/Locale

If you want more in-depth, programmer-oriented info on Unicode, check out Joel's article:
http://www.joelonsoftware.com/articles/Unicode.html

There is more Debian-specific info about charsets, locales, etc. in the Debian Reference section on L10n (Localization) [take out 10 letters]:
http://www.debian.org/doc/manuals/debian-reference/ch-tune.en.html#s-l10n

and in the Debian i18n (internationalization) [take out 18 letters] Guide:
http://www.debian.org/doc/manuals/intro-i18n/index.en.html
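
To make the Shift-JIS point above concrete, here is a minimal Python sketch (not something from the packages mentioned; the sample string and variable names are just made up for illustration). It shows that the bytes of a file carry no label: the same bytes fail to decode as UTF-8 but decode fine once the right encoding is named, after which they can be re-encoded as UTF-8.

```python
import locale

# Encoding Python would pick from the current locale, roughly what the
# "locale" command reports in its charmap part. The value depends on the system.
locale_encoding = locale.getpreferredencoding(False)

# Hypothetical sample text, encoded the way a Shift-JIS text file would store it.
text = "日本語"
sjis_bytes = text.encode("shift_jis")

# Interpreting those bytes as UTF-8 (as a UTF-8 locale would) does not work.
try:
    sjis_bytes.decode("utf-8")
    decodes_as_utf8 = True
except UnicodeDecodeError:
    decodes_as_utf8 = False

# Naming the correct encoding recovers the text; re-encoding converts it.
recovered = sjis_bytes.decode("shift_jis")
utf8_bytes = recovered.encode("utf-8")
```

The same decode-then-encode step is, in spirit, what a conversion tool has to do; the hard part in practice is knowing (or guessing) the source encoding, which this sketch simply assumes.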
Cheers,
Received on Mon Dec 31 18:09:00 2007