Change connection settings to utf8mb4 (not utf8) MySQL.

[expired user #7958]'s profile image [expired user #7958] posted 11 years ago in General Permalink
My MySQL database is set to use utf8mb4 with a utf8mb4_unicode_ci collation, but HeidiSQL defaults to utf8 & utf8_general_ci when it connects. I ran the following in the Query tab to get the client settings,

"SHOW VARIABLES WHERE Variable_name LIKE 'character\_set\_%' OR Variable_name LIKE 'collation%';".

It seems to work fine for the 3 byte utf8 characters, but as soon as I try inserting an Emoji (
[expired user #7958]'s profile image [expired user #7958] posted 11 years ago Permalink
oops looks like Emoji didn't like the forums and cut my post in half :(

It seems to work find for the 3 byte utf8 characters, but as soon as I try inserting an Emoji (hex 0xF09F9884 as an example), I get the following error,

"/* SQL Error (1267): Illegal mix of collations (utf8mb4_unicode_ci,IMPLICIT) and (utf8_general_ci,COERCIBLE) for operation '=' */"

The insert will work fine if I execute "SET CHARACTER SET utf8mb4" in the Query tab at the start of the session (it changes all the connection settings to utf8mb4 etc).

Is there a setting somewhere that I am missing to get this happening automatically when I connect?

Sorry if this post has ended up sounding confusung :(
ansgar's profile image ansgar posted 11 years ago Permalink
Server communication in Heidi is done in utf8, and an INSERT or whatever is also sent as utf8, not utf8mb4. The server converts your query data then from utf8 to the charset of your table column. The database and table charset are irrelevant by the way, as long as the column has a different charset. However, if the column has utf8mb4 collation, the server should convert your query data from utf8 to utf8mb4.

The only exception I can think of is when you send characters which are not contained in utf8, which is probably the case here with utf8mb4. I'm not certain.

Probably I should use utf16 in Heidi.

Can you please test if the INSERT works when you fire "SET CHARACTER SET utf16" ?
[expired user #7958]'s profile image [expired user #7958] posted 11 years ago Permalink
Thanks for the quick reply Ansgar :)

"SET CHARACTER SET utf16" gives me an error: "/* SQL Error (1231): Variable 'character_set_client' can't be set to the value of 'utf16' */"

I made a small table to test this on, will show the create table and output I was referring to,

The MySQL table,
CREATE TABLE `table` (
`column` VARCHAR(50) NULL DEFAULT NULL COLLATE 'utf8mb4_unicode_ci'
)
COLLATE='utf8mb4_unicode_ci'
ENGINE=InnoDB;

When I log into MySQL on the server,
SHOW VARIABLES WHERE Variable_name LIKE 'character\_set\_%' OR Variable_name LIKE 'collation%';
+--------------------------+--------------------+
| Variable_name | Value |
+--------------------------+--------------------+
| character_set_client | utf8mb4 |
| character_set_connection | utf8mb4 |
| character_set_database | utf8mb4 |
| character_set_filesystem | binary |
| character_set_results | utf8mb4 |
| character_set_server | utf8mb4 |
| character_set_system | utf8 |
| collation_connection | utf8mb4_unicode_ci |
| collation_database | utf8mb4_unicode_ci |
| collation_server | utf8mb4_unicode_ci |
+--------------------------+--------------------+

When I log into HeidiSQL,
SHOW VARIABLES WHERE Variable_name LIKE 'character\_set\_%' OR Variable_name LIKE 'collation%';
+--------------------------+--------------------+
| Variable_name | Value |
+--------------------------+--------------------+
| character_set_client | utf8 |
| character_set_connection | utf8 |
| character_set_database | utf8mb4 |
| character_set_filesystem | binary |
| character_set_results | utf8 |
| character_set_server | utf8mb4 |
| character_set_system | utf8 |
| collation_connection | utf8_general_ci |
| collation_database | utf8mb4_unicode_ci |
| collation_server | utf8mb4_unicode_ci |
+--------------------------+--------------------+

If I run "SET CHARACTER SET utf8mb4" in HeidiSQL it will change to the output of that command to what the server output showed me.

It seems that in MySQL/MariaDB that utf8 can only store encoded symbols up to 3 bytes long, but official UTF-8 should be able to store encoded symbols up to 4 bytes long (so utf8mb4 is the "correct" UTF-8 to use if you want all those 4 bytes of encoding in MySQL).

The 4 byte encoded Emoji characters (for example) exist in UTF-8 but not in MySQL "utf8", however they do exist in "utf8mb4".

I noticed this website goes into more detail and gives a better explanation that I can give,

http://mathiasbynens.be/notes/mysql-utf8mb4
ansgar's profile image ansgar posted 10 years ago Permalink
See here
pvcon13's profile image pvcon13 posted 9 years ago Permalink
I am currently having this same or related issue, and I guess after reading this older sequence I do not understand the correct procedure to prevent this.

Because of user entered messaging (emoji and others) I need to maintain all not enumerated columns at utf8mb4_unicode_ci (or so it seems). error is select statements in mysql 5.6x getting coerce errors on equals.

EVENT FIND --> ERRORError: ER_CANT_AGGREGATE_2COLLATIONS: Illegal mix of collations (utf8mb4_general_ci,IMPLICIT) and (utf8_general_ci,COERCIBLE) for operation '='

UPDATE --> ERRORError: ER_TRUNCATED_WRONG_VALUE_FOR_FIELD: Incorrect string value: '\xF0\x9F\x8D\xA3",...' for column 'Preferences' at row 1

When I use heidi in Maintenance Option for an entire database all tables do indeed change from utf8mb4_general_ci to what I asked for utf8_unicode_ci.

THEN...
:)

I go into each table and check convert data (all the general_ci columns highlight and table by table I convert the data to utf8mb4_unicode_ci and each conversion appears to work, BUT, refreshing the database page I notice that one by one as I convert each tables data the collation list on that table switches back to utf8mb4_unicode.

If I repeat these steps in another order I always get back to the place that Heidi indicates is utf8mb4_general_ci.

The reason I am doing these conversion is the exact same coerce equal failures in select statements when emoji are entered in chat messages by users.

I must be doing something very wrong. My apologizes if this is something obvious

Do you have a procedure, not only for the insert statements listed above, but how I have actually convert the standing collation at database and table and column level and the existing data as well?
pvcon13's profile image pvcon13 posted 9 years ago Permalink
Correcting typo above

When I use heidi in Maintenance Option for an entire database all tables do indeed change from utf8mb4_general_ci to what I asked for utf8mb4_unicode_ci.

Please login to leave a reply, or register at first.