The dreaded MySQL import encoding issue - revisited The dreaded MySQL import encoding issue - revisited wordpress wordpress

The dreaded MySQL import encoding issue - revisited


This is how I ended up solving my problem:

First mysqldump -uusername -ppassword --default-character-set=latin1 database -r dump.sql

Then run this script:

$search = array('/latin1/');$replace = array('utf8');foreach (range(128, 255) as $dec) {    $search[] = "/\x".dechex($dec)."/";    $replace[] = "&#$dec;";}$input = fopen('dump.sql', 'r');$output = fopen('result.sql', 'w');while (!feof($input)) {    $line = fgets($input);    $line = preg_replace($search, $replace, $line);    fwrite($output, $line);}fclose($input);fclose($output);

The script finds all the hex characters above 127 and encoded them into their HTML entities.

Then mysql -uusername -ppassword database < result.sql


A common problem with older WordPress databases and even newer ones is that the database tables get set as latin-1 but the contents are actually encoded as UTF-8. If you try to export as UTF-8 MySQL will attempt to convert the (supposedly) Latin-1 data to UTF-8 resulting in double encoded characters since the data was already UTF-8.

The solution is to export the tables as latin-1. Since MySQL thinks they are already latin-1 it will do a straight export.

Change the character set from ‘latin1′ to ‘utf8′.Since the dumped data was not converted during the export process, it’s actually UTF-8 encoded data.

Create your new table as UTF-8 If your CREATE TABLE command is in your SQL dump file, change the character set from ‘latin1′ to ‘utf8′.

Import your data normally. Since you’ve got UTF-8 encoded data in your dump file, the declared character set in the dump file is now UTF-8, and the table you’re importing into is UTF-8, everything will go smoothly


I was able to resolve this issue by modifying my wp-config.php as follows:

/** Database Charset to use in creating database tables. */define('DB_CHARSET', 'utf8');/** The Database Collate type. Don't change this if in doubt. */define( 'DB_COLLATE', 'utf8_general_ci' );