Re: LOAD DATA INTO doesn't work correctly with utf8
Edward Kay wrote:
>> I would like to import data from a utf8-coded comma seperated file. I
>> created my database with "DEFAULT CHARACTER SET utf8 COLLATE
>> utf8_general_ci" and I started my mysql-client with the
>> --default-character-set=utf8 option. Nevertheless, when I input primary
>> key fields, which differ only in one umlaut character (e.g. "achten" and
>> "ächten") I get the following error message:
>>
>> ERROR 1062 (23000): Duplicate entry 'ächten' for key 1
>>
>> (Same thing happens when I try to manually INSERT the row.)
>>
>> When I display my variable settings with "SHOW variables LIKE
>> 'c%';" I receive the following result:
>>
>> +--------------------------+----------------------------+
>> | Variable_name            | Value                      |
>> +--------------------------+----------------------------+
>> | character_set_client     | utf8                       |
>> | character_set_connection | utf8                       |
>> | character_set_database   | utf8                       |
>> | character_set_filesystem | binary                     |
>> | character_set_results    | utf8                       |
>> | character_set_server     | latin1                     |
>> | character_set_system     | utf8                       |
>> | character_sets_dir       | /usr/share/mysql/charsets/ |
>> | collation_connection     | utf8_general_ci            |
>> | collation_database       | utf8_general_ci            |
>> | collation_server         | latin1_swedish_ci          |
>> | completion_type          | 0                          |
>> | concurrent_insert        | 1                          |
>> | connect_timeout          | 5                          |
>> +--------------------------+----------------------------+
>> 14 rows in set (0.02 sec)
>>
>> From this I conclude it is the server setting, which causes the trouble
>> here. When I manipulate the settings manually from the client (with "SET
>> character_set_server=utf8; SET collation_server=utf8_general_ci;") the
>> values do change, but not the behaviour. But this can be expected, since
>> the server is already up and running with the wrong settings.
>>
>> Does anybody know how I restart my mysql-server with the correct
>> character and collation settings, if this is the cause for my problem,
>> or if there might be any other reason for it. My mysql version is
>> 5.0.26-12, running on a Suse Linux 10.2.
>>
>> Best regards,
>> H.
>>
>
> Try using the SET NAMES 'utf8' statement [1] to tell MySQL that your client
> is sending data in UTF-8. I believe that as your server is latin1, it will
> assume this is the character set used by the command line client.
>
> [1] http://dev.mysql.com/doc/refman/5.0/en/charset-connection.html
>
> Edward
>
>
>
In my experience SET NAMES does not help with LOAD DATA. What matters is
that the character set of the database is the same as the character set
of the file, and that condition is met here.
To be sure, I used a script:
USE database_with_correct_charset;
LOAD DATA ...;
This worked fine for files in cp1250 and also in keybcs2 (I had two
databases, of course).
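
For illustration, the script approach described above might look roughly like the sketch below. The database name words_utf8, the table name words, the file path /tmp/words.csv and the FIELDS/LINES clauses are all made-up assumptions for the example, not taken from the original setup, so adjust them to your own schema and file.

```sql
-- Hypothetical names throughout: database words_utf8, table words,
-- file /tmp/words.csv. None of these come from the original post.

-- Select the database whose character set matches the file's encoding.
USE words_utf8;

-- Check which character set the current database uses.
SHOW VARIABLES LIKE 'character_set_database';

-- Load the comma-separated file. LOCAL reads the file from the client
-- host; drop the keyword if the file is on the server itself.
LOAD DATA LOCAL INFILE '/tmp/words.csv'
INTO TABLE words
FIELDS TERMINATED BY ','
LINES TERMINATED BY '\n';
```

Running the USE statement first is the point of the sketch: the reply above relies on the current database's character set matching the file, not on the client connection settings.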
HTH,
Dusan
Received on Thu Aug 30 06:28:57 2007