
The Chinese data problem is, in essence, a character set problem

A computer only recognizes binary, while humans recognize symbols, so a mapping between binary values and characters is needed: a character set.
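To make the mapping concrete, here is a minimal Python sketch (Python is used only for illustration; the sample text "中文" is an assumption, any Chinese string would do). It shows the same characters turning into different bytes under GBK and UTF-8.

```python
# Sample text is an assumption chosen for illustration.
text = "中文"

# The same characters map to different byte sequences in different character sets.
gbk_bytes = text.encode("gbk")      # GBK: two bytes per Chinese character
utf8_bytes = text.encode("utf-8")   # UTF-8: three bytes per common Chinese character

# Print the byte count and the hexadecimal form of each encoding.
print(len(gbk_bytes), gbk_bytes.hex())
print(len(utf8_bytes), utf8_bytes.hex())
```

The byte counts differ (4 versus 6), which is why both sides must agree on the character set before exchanging data.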

use mydatabase;
-- Insert data (Chinese)
insert into my_student values (5, 'itcast0005', 'Zhang Yue', 'male');


\x indicates hexadecimal notation.

Cause of the error: the server does not recognize the four bytes it received. The server assumes the data is UTF8, where one Chinese character takes three bytes. It reads three bytes and tries to turn them into a Chinese character (this fails), then tries to read another three bytes from what is left (there are not enough), and finally reports failure.
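The failure can be reproduced outside the database with a short Python sketch. This is only an approximation: Python's UTF-8 decoder is not MySQL's, and the sample text "中文" is an assumption. It encodes two Chinese characters as GBK (four bytes) and then tries to read those bytes as UTF-8.

```python
text = "中文"                        # sample text, assumed for illustration
client_bytes = text.encode("gbk")   # what a GBK client actually sends: 4 bytes

try:
    # The receiving side wrongly assumes the bytes are UTF-8.
    client_bytes.decode("utf-8")
    decoded_ok = True
except UnicodeDecodeError as exc:
    decoded_ok = False
    print("cannot read the bytes as UTF-8:", exc)
```

The decoder rejects the bytes rather than guessing, which matches the server refusing the insert.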

Everything the database server assumes about such behavior is stored in server-side variables; the system reads its own variables first to decide how it should behave.

-- See which character sets the server recognizes
show character set;




Basically, the server is all-purpose: it supports practically every character set.

Since the server recognizes so many, one of them has to be the default character set that the server uses when dealing with clients.

-- View the default character sets the server uses for external processing
show variables like 'character_set%';



The root of the problem: the client can only send data in GBK, while the server assumes the data is UTF8, so the two contradict each other.

Solution: change the server so that the character set it assumes for incoming data is GBK.

-- Tell the server that the data sent by the client is GBK
set character_set_client = gbk;



Result of inserting the Chinese data:


Viewing the data: it is still garbled.


Reason: the data comes from the server and is parsed by the client. The client only understands GBK (two bytes per Chinese character), but the server actually returns the data as UTF8 (three bytes per Chinese character), so the result is garbled text.

Solution: change the character set of the data the server returns to the client to GBK.

-- Make the server return its data in GBK
set character_set_results = gbk;
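The garbling on the read path can be sketched the same way in Python. The sample text "中文" is an assumption, and `errors="replace"` is used only so the sketch never raises. UTF-8 bytes are parsed as if they were GBK, and for contrast the same text is sent and parsed as GBK on both sides.

```python
text = "中文"                            # sample text, assumed for illustration

# Server returns UTF-8 bytes, but the client parses them as GBK.
server_bytes = text.encode("utf-8")     # three bytes per character
garbled = server_bytes.decode("gbk", errors="replace")
print(garbled)                          # no longer the original text

# When the server returns GBK bytes, the GBK client reads them correctly.
fixed = text.encode("gbk").decode("gbk")
print(fixed)
```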



Viewing the data again:


set variable = value; only changes the value at the session level: it is valid for the current client and the current connection, and is lost once the connection is closed.


To set the server's understanding of the client's character set in one step, use the shortcut: set names followed by the character set.

set names gbk; is equivalent to setting character_set_client, character_set_results and character_set_connection.

-- Quickly set the character set
set names gbk;
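As a small sketch of what the shortcut stands for, the Python helper below builds the three statements named above as plain strings. `expand_set_names` is a name made up for this illustration, not part of MySQL or any driver, and it does not talk to a server.

```python
from typing import List


def expand_set_names(charset: str) -> List[str]:
    """Return the three SET statements that `set names <charset>` stands for.

    Hypothetical helper for illustration only; it just formats strings.
    """
    variables = (
        "character_set_client",      # charset the server assumes for incoming data
        "character_set_results",     # charset the server uses for returned data
        "character_set_connection",  # charset of the intermediate connection layer
    )
    return [f"set {name} = {charset};" for name in variables]


for statement in expand_set_names("gbk"):
    print(statement)
```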


connection: the connection layer is the intermediary for character set conversion. Keeping it the same as the other two is more efficient, but it still works if it is not.
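The following Python sketch models the chain of conversions as a rough approximation. `round_trip` and all of its parameter names are hypothetical, the storage character set is assumed to be UTF-8, and real servers do this internally. It suggests why a different connection character set only adds a conversion step without changing the result.

```python
def round_trip(text, client, declared_client, connection, results, column="utf-8"):
    """Simulate one insert and one select; a simplified model, not MySQL's code."""
    sent = text.encode(client)                       # bytes the client really sends
    understood = sent.decode(declared_client)        # server reads them per character_set_client
    in_connection = understood.encode(connection)    # converted to character_set_connection
    stored = in_connection.decode(connection).encode(column)  # converted for storage (assumed UTF-8)
    returned = stored.decode(column).encode(results)          # converted to character_set_results
    return returned.decode(client)                   # client parses with its own charset


# Connection layer unified with the client: no extra conversion needed.
unified = round_trip("中文", "gbk", "gbk", "gbk", "gbk")

# Connection layer left as UTF-8: one more conversion, same final text.
not_unified = round_trip("中文", "gbk", "gbk", "utf-8", "gbk")

print(unified, not_unified)
```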

