Unicode character set

[expired user #9567]'s profile image [expired user #9567] posted 9 years ago in Feature discussion Permalink

Hey all. I've played with different character sets and tried utf8 and utf8mb4 collation order when creating a MySQL database/table on the server. Trying to manage a multilingual database, which has 3 languages: English, Russian and Latvian. Server default encoding is utf-8. Whenever I enter row data via Webmin MySQL server all characters are fine.

But whenever I add characters from HeidiSQL, they get displayed as question marks on the website. Or when they are grabbed by Heidi they get converted into jibberish.

I believe something happens during the MySQL connection. Would appreciate any help.

Example pic is attached.

1 attachment(s):
  • Heidi_SQL_character_set
BubikolRamios's profile image BubikolRamios posted 9 years ago Permalink

So, did you noticed that collation can be set to table and to each column separtarely. Check columns collation .....

[expired user #9567]'s profile image [expired user #9567] posted 9 years ago Permalink

Thanks, I've just noticed that. However, setting even to utf8, utfmb4 and utf-32_unicode_ci doesn't resolve the issue. Now the HeidiSQL does see normal text, but when the query is made on the front end I get ??? marks. And that seems to work in reverse, i.e. whenever website shows ???, HeidiSQL is ok. Whenever the website is ok, HeidiSQL shows jibberish. It has no problems with English, but Russian and Latvian are scrambled.

Pic of the front end and pic of variables.

2 attachment(s):
  • Website_output
  • Character_set
BubikolRamios's profile image BubikolRamios posted 9 years ago Permalink

sample, see 3. column:

1.I use utf8_slovenian_ci on MySql SIDE but I guess utf8_general_ci is more or less the same. 2.do you have this on your page source: <meta http-equiv="Content-Type" content="text/html; charset=utf-8" /> <meta http-equiv="Content-Style-Type" content="text/css; charset=utf-8" /> <meta http-equiv="Content-Script-Type" content="text/javascript; charset=utf-8" /> 3. do you have this in case of submit form: accept-charset="UTF-8" > 3.do you have (in case of java) on that same page: <%@ page language="java" contentType="text/html; charset=utf-8" pageEncoding="utf-8"%>

BubikolRamios's profile image BubikolRamios posted 9 years ago Permalink

continued, coz forum eate something:

<%@ page language="java" contentType="text/html; charset=utf-8" pageEncoding="utf-8"%>
BubikolRamios's profile image BubikolRamios posted 9 years ago Permalink

OK Had to delete soem uninportant code: page language="java" contentType="text/html; charset=utf-8" pageEncoding="utf-8"

BubikolRamios's profile image BubikolRamios posted 9 years ago Permalink

one post up is server side code, for rest look for utf8 on given link source.

[expired user #9567]'s profile image [expired user #9567] posted 9 years ago Permalink

sample, see 3. column:

1.I use utf8_slovenian_ci on MySql SIDE but I guess utf8_general_ci is more or less the same. 2.do you have this on your page source: &lt;meta http-equiv="Content-Type" content="text/html; charset=utf-8" /> &lt;meta http-equiv="Content-Style-Type" content="text/css; charset=utf-8" /> &lt;meta http-equiv="Content-Script-Type" content="text/javascript; charset=utf-8" /> 3. do you have this in case of submit form: accept-charset="UTF-8" > 3.do you have (in case of java) on that same page: &lt;%@ page language="java" contentType="text/html; charset=utf-8" pageEncoding="utf-8"%>

The page is an html containing &lt;meta http-equiv="Content-Type" content="text/html; charset=utf-8" /> in the head section with an embedded PHP code for user search. In return, the user search results are tied to a MySQL database for querying.

When I create MySQL data rows via Webmin on the server, it does allow all kinds of characters to be written and the search is able to pick up all the characters upon user query. However, when a data row is created/edited in HeidiSQL, then it's a trouble.

BubikolRamios's profile image BubikolRamios posted 9 years ago Permalink

try chcp in win command prompt mine shows 852

[expired user #9567]'s profile image [expired user #9567] posted 9 years ago Permalink

It was 775, and I changed it now to chcp 65001, which is UTF-8. No luck either.

[expired user #9567]'s profile image [expired user #9567] posted 9 years ago Permalink

Just tried another piece of software - MySQL workbench. The same thing happens upon row upload.

BubikolRamios's profile image BubikolRamios posted 9 years ago Permalink

Try to set it to 852...

[expired user #9567]'s profile image [expired user #9567] posted 9 years ago Permalink

Nothing. As far as I can see and read on the Internet chcp is only a temporary change. Making the change within registry entries may cause whole Windows system to fail and make it unbootable.

BubikolRamios's profile image BubikolRamios posted 9 years ago Permalink

Maybe: https://technet.microsoft.com/en-us/library/bb490874.aspx

Programs that you start after you assign a new code page use the new code page, however, 
programs (except Cmd.exe) that you started before assigning the new code page use 
the original code page. 
....
850 (Multilingual), type
[expired user #9567]'s profile image [expired user #9567] posted 9 years ago Permalink

To no avail.

[expired user #9567]'s profile image [expired user #9567] posted 9 years ago Permalink

I believe there is something wrong in the parsing of the character encoding in the process of connecting to the database and sending data rows to it. HeidiSQL and MySQL Workbench are now correctly displaying the symbols on a local PC. However, once a query is made on the website anything other than English is ?????

Again, this doesn't happen when adding/editing data rows on the server itself within the MySQL database.

ansgar's profile image ansgar posted 9 years ago Permalink

Your website application needs to set the utf8 for the database connection character set, just as HeidiSQL does by calling

mysql_set_character_set(FHandle, 'utf8mb4')

Also, your application needs to be UTF-8 aware. Most PHP functions are, but there are some which are not, like substr(). Replacement functions like mb_substr() exist in the Multibyte String PHP extension, which need to be activated in your php.ini (assuming you run PHP, which I don't know)

Also, your frontend should send a UTF-8 HTTP header. There are at least two ways of doing that. First one is from PHP:

header("Content-Type: text/html; charset=utf-8");

... second one is an alternative way of doing that per HTML meta tag:

< meta http-equiv="Content-Type" content="text/html; charset=utf-8"/>

If that all is done, your Website and HeidiSQL and all other MySQL clients show and store correct characters in your database. You will probably see broken characters first, as your current database is broken. But there are ways to convert that.

ansgar's profile image ansgar posted 9 years ago Permalink
mysql_set_character_set(FHandle, 'utf8mb4')

The PHP replacement for that is calling

mysqli_set_charset($connection, "utf8mb4");
// mysqli_set_charset($connection,"utf8"); // alternative for older libmysql.dll versions
// mysqli_query($connection, "SET NAMES utf8"); // baddest alternative, but better than nothing
[expired user #9567]'s profile image [expired user #9567] posted 9 years ago Permalink

Thanks Ansgar. Adding the mysqli_set_charset($connection, "utf8mb4"); to PHP page did the trick. Weird, all of other linked pages contained &lt; meta http-equiv="Content-Type" content="text/html; charset=utf-8"/> But that did nothing. Somewhere on stackoverflow.com, it was mentioned that the frontend, backend and transport all need to be the same. Seems like my transport was not set.

Spent 2 days on it. I wish there was universal utf32 for every character possible, so that people do not get stuck over this. Was about to leave it to the highly trained monkeys.

Please login to leave a reply, or register at first.