Viewing Unicode characters

Danya's profile image Danya posted 10 years ago in General Permalink
Hi all!

Does anyone know how to set the character encoding for viewing the database in HeidiSQL?
The chars work fine on the website and are stored in a utf8mb4 database via the admin, which works great.
But the moments I do use the editor, I would like to see what's in there, instead of a square or, in this case a 4 byte character, a question mark.



Thanks!
kalvaro's profile image kalvaro posted 10 years ago Permalink
I'd say you just need to set an appropriate fond at Tools-> Preferences-> Data appearance. I use "Arial Unicode MS" in my PC.
Danya's profile image Danya posted 10 years ago Permalink
Thanks for your help, Kavaro, but that would not change anything to special characters being displayed or not. That's *how* they are displayed, not if.
Anyone else, maybe? :)
jfalch's profile image jfalch posted 10 years ago Permalink
care to explain how a program will display an extended utf-8 character if the font selected for display does not contain a glyph for that character ? what do you think display of square / '?' means ?
Danya's profile image Danya posted 10 years ago Permalink
Well, I'm using Tahoma (a standard true type font) for displaying them on my website and also in HeidiSQL, and the website. That does the job perfect.
The problem is that HeidiSQL probably can't handle 4 byte characters? 2 byte chars, as you see in the example image are displayed well.
Oh well, not really a problem though, since I use a CMS I build myself which displays them perfectly.
Was just wondering if there is just some way to display them correctly in HeidiSQL.
ansgar's profile image ansgar posted 10 years ago Permalink
HeidiSQL does all client-server communication in UTF-8, after calling mysql_set_character_set(FHandle, 'utf8').

Is there a slight chance that utf8mb4 contains characters which are not present in utf8?
Danya's profile image Danya posted 10 years ago Permalink
That's correct, ansgar, you know what I'm talking about! :D

UTF8mb4 contains 4 byte characters, which UTF8 doesn't support. That's why the other UTF8 characters show as they should, but these ones don't.
I use mysql_query('SET NAMES utf8mb4') to get them properly from the database. And as you said, if HeidiSQL calls it with mysql_set_character_set(FHandle, 'utf8'), they will not appear.
Is that something I can change or is that deeply embedded in HeidiSQL software code?
ansgar's profile image ansgar posted 10 years ago Permalink
That utf8 is hardcoded in HeidiSQL. No way to change that from the users point. A workaround could be to create a startup script with "SET NAMES utf8mb4" in it, and apply it to the sessions advanced options. (See tab "Advanced" in the session manager). But I'm afraid that's only one part of the story. Values written to and read from the server are still expected to be encoded in utf8, and so various hardcoded places in HeidiSQL will apply the Delphi method "Utf8ToString" to strings. That's something I cannot give you some workaround for. HeidiSQL should use utf16 or even utf32 in the future, not utf8, I guess.
Danya's profile image Danya posted 10 years ago Permalink
Thanks, ansgar! Your answer is very appreciated!happy
Well, I hope with a next version or so it's going to be working.
Like said, I'm using a CMS, so it's not really a problem. :)
ansgar's profile image ansgar posted 10 years ago Permalink
I'm curious. Could you provide me an .sql dump of your utf8mb4 table with some values in it? So I could experiment with utf16-based communication in HeidiSQL.
Danya's profile image Danya posted 10 years ago Permalink
I could, but the output file is utf-8, so I guess it's not helpful.

But, try this:

CREATE TABLE IF NOT EXISTS `testing_table` (
`index_id` int(11) NOT NULL DEFAULT '0',
`charlist_id` int(11) NOT NULL,
`cOriginal` varchar(20) CHARACTER SET utf8 COLLATE utf8_unicode_ci NOT NULL,
`cReplace` varchar(20) COLLATE utf8mb4_unicode_ci NOT NULL
) ENGINE=MyISAM DEFAULT CHARSET=utf8mb4 COLLATE=utf8mb4_unicode_ci ROW_FORMAT=DYNAMIC;

And insert some of the characters in the cReplace column from the following URL: http://www.fileformat.info/info/unicode/block/mathematical_alphanumeric_symbols/list.htm

And in this case pick some of the chars from MATHEMATICAL FRAKTUR or MATHEMATICAL SCRIPT.
ansgar's profile image ansgar posted 10 years ago Permalink
I just inserted the "A" character via HeidiSQL's query tab:
insert into utable (cReplace) values (_ucs2'\xD835\xDC00')

But I get chinese chars in the data grid, no matter if I ran "SET NAMES utf8" or "SET NAMES utf8mb4" before.
Danya's profile image Danya posted 10 years ago Permalink
Yeah, don't know about that, when it comes to inserting them directly via the Query tab.
I'm using it via my CMS with a normal query. That's how I got them into the database. The output is fine too.
It's just only the viewing from within HeidiSQL, the character itself is inserted properly in the database.
ansgar's profile image ansgar posted 10 years ago Permalink
You could post the result of "SELECT HEX(cReplace) from "yourtable" LIMIT 10" here. That should preserve the character integrity.
Danya's profile image Danya posted 10 years ago Permalink
HEX(cReplace)
E28284
E28283
E28282
E28281
E28280
E292B5
E292B4
E292B3
E292B2
E292B1
Danya's profile image Danya posted 10 years ago Permalink
Sorry, these are the MATHEMATICAL FRAKTUR characters:
HEX(cReplace)
F09D9484
F09D9485
F09D9487
F09D9488
F09D9489
F09D948A
F09D948D
F09D948E
F09D948F
F09D9490
ansgar's profile image ansgar posted 10 years ago Permalink
Got it working here. Now that I read about utf8 issues here, I would like to switch from utf8 to utf8mb4 in HeidiSQL. However, that requires an update of the libmysql.dll to a newer one which knows the utf8mb4 charset. Seems that utf8mb4 was introduced not so long ago as a fix to the normal utf8 charset?

See my variables and values:
2 attachment(s):
  • utf8mb4-variables
  • utf8mb4-data
Danya's profile image Danya posted 10 years ago Permalink
Good to see you got the characters working!
Hopefully the HeidiSQL team will introduce their system to a utf8mb4 character encoding soon! :D
ansgar's profile image ansgar posted 10 years ago Permalink
I'm on the way. I did not know about that bug in MySQL's utf8 charset up to yesterday. So, give me some hours for a downwards-compatible fix.
ansgar's profile image ansgar posted 10 years ago Permalink
The bad thing is that the libmysql.dll shipped with the HeidiSQL installer does not yet contain that utf8mb4 charset. In order not to break older fixes again I will need a newer MariaDB client library and test other things with that as well. See r4532
ansgar's profile image ansgar posted 10 years ago Permalink
Eh, I meant http://code.google.com/p/heidisql/source/detail?r=4532
Danya's profile image Danya posted 10 years ago Permalink
Sounds like you're in the develop team, ansgar?
Anyway, great job of trying to fix it!
ansgar's profile image ansgar posted 10 years ago Permalink
I am the develop teamsmileYes, HeidiSQL coding is a one-man-show. There were other developers for short times in the past, but I'd say 99.5% of the code is from me. And of course there are other contributors, for example the myriad of translators, then the Transifex manager, the Facebook writer etc.
Danya's profile image Danya posted 10 years ago Permalink
Ok, great. Well, you're doing an amazing job, mate!
I used MySQL Front before HeidiSQL, and wasn't satisfied with the utf8 support, it had almost none. Copying tables meant loss of all utf8 characters, even though everything was set to utf8_unicode_ci. That was why I switched to HeidiSQL (and because MySQL Front wasn't updated anymore).
Using HeidiSQL now for 6 months or so and very satisfied, which I expressed with a little donation, since you're doing a great job!
I run about 30 MySQL driven websites, so I'm using HeidiSQL a lot. It's stable, the design is nice, the functions work superb and the utf8 support is great. :D
Code modification/commit 6f6c594 from ansgar.becker, 10 years ago, revision 8.3.0.4822
Switch character set from utf8 to the fixed one called utf8mb4 in MySQL. Attempt to still use utf8 when that newer charset does not exist in the client library. At the same time, update libmysql to v2.0 from https://downloads.mariadb.org/client-native/2.0.0/ . See http://www.heidisql.com/forum.php?t=16425
ansgar's profile image ansgar posted 10 years ago Permalink
Done in r4822. HeidiSQL now tries to use utf8mb4 by default, and if that charset does not exist in libmysql, it uses utf8 as fallback. At the same time I committed a newer libmysql from https://downloads.mariadb.org/client-native/2.0.0/ so the installer contains the utf8mb4 charset. If you just update your HeidiSQL build it will spit out a warning in the sql log about the non existent utf8mb4 charset.
Danya's profile image Danya posted 10 years ago Permalink
Nice! I updated HeidiSQL to r8422, but still question marks. Do I need to install the mariadb-native-client-2.0.0-win32.msi from MariaDB.org too to get it working?
lemon_juice's profile image lemon_juice posted 10 years ago Permalink
When connecting I see the following in the log:

/* Attempt to create plink.exe process, waiting 4s for response ... */
/* Connected. Thread-ID: 310502 */
/* Can't initialize character set utf8mb4 (path: C:\Program Files (x86)\MySql\share\charsets\) */
/* Characterset: utf8 */


This doesn't look right because there's no folder "C:\Program Files (x86)" on my system, is it hard coded?
jfalch's profile image jfalch posted 10 years ago Permalink
probably should use the value of server var @character_sets_dir
ansgar's profile image ansgar posted 10 years ago Permalink
The message with that faulty path comes from libmysql.dll and you can ignore the fact that it does not use a valid directory.

You should use the nightly build installer in order to get a newer libmysql which has that utf8mb4 charset. When you have the libmysql from the last official 8.3 installer you will always get that error message. Well, HeidiSQL still uses utf8 in that case, so it's not a real problem.
ansgar's profile image ansgar posted 10 years ago Permalink
Danya: No, you don't need to install anything from MariaDB. Just use the nightly installer: http://www.heidisql.com/installers/HeidiSQL_8.3.0.4822_Setup.exe
lemon_juice's profile image lemon_juice posted 10 years ago Permalink
Yes, the nightly installer did it - thanks! Nice to have this feature.
Danya's profile image Danya posted 10 years ago Permalink
Hm, the question marks are now two squares, which is a tad better, but still not like in your sqreenshot.
Might now be one of my own settings.
They display correctly in the webbrowser, so my system supports it. The database standard encoding is utf8 though, the table too. Just the column has utf8mb4, since I just started experimenting with it. But I'm not eager to change the complete database/table encoding (yet) on this domain.
It might work on a clean database, with everything set to mb4.
I'll update you on that tomorrow!
ansgar's profile image ansgar posted 10 years ago Permalink
And you installed the nightly build installer r4822? If yes, what do you get in the very first lines of your SQL log at the bottom. I have an old .dll here and I get:
/* Can't initialize character set utf8mb4 (path: compiled_in) */
/* Characterset: utf8 */
ansgar's profile image ansgar posted 10 years ago Permalink
A problem might be that not only HeidiSQL + libmysql.dll needs to support that utf8mb4 charset, but also the server needs to have support for it. On an MySQL 5.1.66 I get that above error message, but on a MySQL 5.6.19 not so.
Danya's profile image Danya posted 10 years ago Permalink
The first lines are:
/* Delimiter changed to ; */
/* Connecting to *******.com via MySQL (TCP/IP), username *******, using password: Yes ... */
/* Connected. Thread-ID: 8794864 */
/* Characterset: utf8mb4 */

I'm using MySQL 5.5.37, and it supports the charset since everything is working fine on the frontend.

I did not had the time to test on a fresh domain though and will try ASAP.

By the way, since I updated to r4822 connecting to the server is taking about 10-15 seconds instead of 1 ish second.
Danya's profile image Danya posted 10 years ago Permalink
Ok, so, I created a fresh database on a clean domain, with default collation set to utf8mb4 and created a table with one column, varchar utf8mb4. Still 2 squares.

ALTER DATABASE `in4_db` COLLATE 'utf8mb4_unicode_ci';
CREATE TABLE `test` (
`test` VARCHAR(50) NULL COLLATE 'utf8mb4_unicode_ci'
)
COLLATE='utf8mb4_unicode_ci'
ENGINE=MyISAM;
INSERT INTO `in4_db`.`test` (`test`) VALUES ('
Danya's profile image Danya posted 10 years ago Permalink
Whoops, forum breaks on that character...
ansgar's profile image ansgar posted 10 years ago Permalink
Shit. What's wrong here. Worked for me in my tests out of the box.
Danya's profile image Danya posted 10 years ago Permalink
Don't know. Like said, might just be my system. So many variables.
But as said too when I started this topic, it's not a big problem! I don't use HeidiSQL to edit those characters, that's all done by the CMS.
ansgar's profile image ansgar posted 10 years ago Permalink
Well I thought I could bump HeidiSQL's feature set by introducing that fixed utf8 charset, so I find that thing quite interesting and important. And I too have this latency on *some* of my connection attempts, mostly not, I'd say in 1 out of 5 sessions I get a 3 second waiter. Weird.
Danya's profile image Danya posted 10 years ago Permalink
Hm, well, I never experienced the wait time but with the r4822 I do, for every domain and database. Once connected it's all good, but the first connection takes 15 seconds instead of 1.
Hopefully that'll get fixed soon.

If I can do or test anything or supply you with data, let me know.
ansgar's profile image ansgar posted 10 years ago Permalink
The delay is directly in the mysql_real_connect method, as I see that when writing to session log files:
/* 2014-10-03 14:06:47 [localhost] */ /* Connecting to 127.0.0.1 via MariaDB (TCP/IP), username root, using password: No ... */
/* 2014-10-03 14:06:59 [localhost] */ /* Connected. Thread-ID: 103 */

Will try a newer libmysql from the current MariaDB GA release.
Danya's profile image Danya posted 10 years ago Permalink
Ok, and by the way, holy crap, have you changed the font family on HeidiSQL.com? o.O Gee wiz.
Code modification/commit e6d0eb2 from ansgar.becker, 10 years ago, revision 8.3.0.4827
Update libmysql again to those from the MariaDB 10.0.14 package. Probably fixes a connection latency, described in http://www.heidisql.com/forum.php?t=16425
ansgar's profile image ansgar posted 10 years ago Permalink
r4827 updates libmysql.dll again to those from the MariaDB 10.0.14 package (current GA release)

And please try again with my exported characters:

CREATE TABLE `utable` (
`id` INT(11) NOT NULL AUTO_INCREMENT,
`val` VARCHAR(20) NOT NULL COLLATE 'utf8mb4_unicode_ci',
PRIMARY KEY (`id`)
)
COLLATE='utf8mb4_unicode_ci';
INSERT INTO utable (id, val) VALUES (1, UNHEX('F09D9484'));
INSERT INTO utable (id, val) VALUES (2, UNHEX('F09D9485'));
INSERT INTO utable (id, val) VALUES (3, UNHEX('F09D9487'));
INSERT INTO utable (id, val) VALUES (4, UNHEX('F09D9488'));
INSERT INTO utable (id, val) VALUES (5, UNHEX('F09D9489'));
INSERT INTO utable (id, val) VALUES (6, UNHEX('F09D948A'));
INSERT INTO utable (id, val) VALUES (7, UNHEX('F09D948D'));
INSERT INTO utable (id, val) VALUES (8, UNHEX('F09D948E'));
INSERT INTO utable (id, val) VALUES (9, UNHEX('F09D948F'));
INSERT INTO utable (id, val) VALUES (10, UNHEX('F09D9490'));
Danya's profile image Danya posted 10 years ago Permalink
I'm sorry.. bad news.
1) The connecting latency of about 15 seconds is still there on r4827
2) Still seeing squares after running the query:
ansgar's profile image ansgar posted 10 years ago Permalink
Be sure to use the installer from r4827. Another similar delay issue was fixed by the newer libmysql: http://www.heidisql.com/forum.php?t=16547 . And also the connection delay is gone for me here.

And try another font in Preferences > Data appearance, like kalvaro said. Try "Calibri" or the one kalvaro meant.

These hex values should work here, they should work for you too.
Danya's profile image Danya posted 10 years ago Permalink
Thanks, the installer did the job. It's connecting within a second. Great!!happy

Did not do anything for the squares though.
Like I said before, changing the font is not gooing to be helping. As long as you use true type fonts, they just work. I'm using Tahoma, which is a truetype font.
But, to prove that, here's a screenshot with Calibri:
[expired user #8438]'s profile image [expired user #8438] posted 10 years ago Permalink
After upgrading to 8.3.0.4839, my Chinese characters become corrupted.

/* Delimiter changed to ; */
/* Connecting to 10.55.99.29 via MySQL (TCP/IP), username root, using password: Yes ... */
/* Connected. Thread-ID: 102 */
/* Can't initialize character set utf8mb4 (path: C:\Program Files (x86)\MySql\share\charsets\) */
/* Characterset: latin1 */


ansgar's profile image ansgar posted 10 years ago Permalink
You should download a nighly installer, as you're probably missing some dll.
[expired user #8438]'s profile image [expired user #8438] posted 10 years ago Permalink
I have tried to install the nightly installer but the problem still exists.

Now I have rolled back to use the stable version 8.3.0.4694.
/* Delimiter changed to ; */
/* Connecting to 10.55.99.29 via MySQL (TCP/IP), username root, using password: Yes ... */
/* Connected. Thread-ID: 165 */
SHOW STATUS;


Thanks.
ansgar's profile image ansgar posted 10 years ago Permalink
Hm, well it's clear that latin1 breaks your characters, but why does the installer not use utf8mb4 and not even utf8 as the fallback? I have no clue.
[expired user #8438]'s profile image [expired user #8438] posted 10 years ago Permalink
It should use big5 as the character_set and latin1 big5 czech euc_kr gb2312 gbk latin1_de sjis tis620 ujis as the character_sets.

My OS is 32-bit.
The path C:\Program Files (x86)\MySql\share\charsets\ doesn't exist at all.
ansgar's profile image ansgar posted 10 years ago Permalink
I just used the 32 bit installer to try to reproduce your situation, but I get utf8mb4:
/* Connecting to 127.0.0.1 via MariaDB (TCP/IP), username root, using password: No ... */
/* Connected. Thread-ID: 284 */
/* Characterset: utf8mb4 */
SHOW STATUS;

What OS is it on your side?
ansgar's profile image ansgar posted 10 years ago Permalink
Even on my Windows XP box I get utf8mb4.
[expired user #8438]'s profile image [expired user #8438] posted 10 years ago Permalink
My PC is Windows 7 32-bit.
The database is MySQL 4.0.23 which is not installed in my PC.

Do I need to install any MySQL stuff in my PC to support that?
If so, why would the HeidiSQL stable version do not need to install this?
ansgar's profile image ansgar posted 10 years ago Permalink
MySQL 4.0.23? That's really, *really* old. And utf8mb4 is supported from MySQL 5.5.3 upwards: http://dev.mysql.com/doc/refman/5.5/en/charset-unicode-utf8mb4.html
[expired user #8438]'s profile image [expired user #8438] posted 10 years ago Permalink
Yea... That's a legacy system in my workplace.
Could you please make it backward compatible? smile
[expired user #8438]'s profile image [expired user #8438] posted 10 years ago Permalink

MySQL 4.0.23? That's really, *really* old. And utf8mb4 is supported from MySQL 5.5.3 upwards: http://dev.mysql.com/doc/refman/5.5/en/charset-unicode-utf8mb4.html


Would you mind adding an option to let the user choose the default character_set to fallback?
ansgar's profile image ansgar posted 10 years ago Permalink
You can set the charset at session startup. Just create a .sql script with "SET NAMES latin1" or whatever and apply that file to your session's option "Startup script" (in the "Advanced" tab)

Please login to leave a reply, or register at first.