How to properly handle international characters in PHP / MySQL / Apache


Apache

The server encoding must either be left unset or be set to UTF-8. This is done with Apache's AddDefaultCharset directive, which can go in a virtual host or in the main configuration file (see the documentation).

AddDefaultCharset utf-8

MySQL

  • Set the database character set to UTF-8 (utf8 in MySQL), with a matching collation such as utf8_unicode_ci
  • Set the connection encoding. As another answer notes, this can be done with mysqli_set_charset, or by sending this statement just after connecting:
    SET NAMES 'utf8' COLLATE 'utf8_unicode_ci'

PHP

1- You should set the HTML charset of the page to be UTF-8, via a meta tag on the page, or via a PHP header:

    <meta http-equiv="Content-Type" content="text/html; charset=utf-8" />

    -or-

    header('Content-type: text/html; charset=utf-8');

2- You should always use the mb_* versions of the string functions, for example mb_strlen instead of strlen, so that the length of a string is counted in characters rather than in bytes.
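To illustrate the difference between the byte-oriented and the multibyte functions, here is a minimal sketch. It assumes PHP 7 or later (for the "\u{...}" escape) and that the mbstring extension is loaded; the variable names are only illustrative.

```php
<?php
// "\u{00E9}" is the PHP 7+ escape for U+00E9 (é), which UTF-8 encodes as two bytes.
$word = "h\u{00E9}llo";

$byteCount = strlen($word);             // counts bytes: 6
$charCount = mb_strlen($word, 'UTF-8'); // counts characters: 5

// mb_strtoupper() is the multibyte-aware counterpart of strtoupper().
$upper = mb_strtoupper($word, 'UTF-8'); // "HÉLLO"

echo $byteCount, ' bytes, ', $charCount, ' characters, upper: ', $upper, "\n";
```

Passing the encoding explicitly avoids depending on whatever mbstring's internal encoding happens to be on the server.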

This should give you UTF-8 everywhere, from the pages to the data. A test you can do: right-click anywhere on the page in Firefox and select View Page Info. The effective encoding is listed in that dialog.


Important: You should also ensure that you use UTF-8 as the connection charset when connecting to MySQL from PHP!

For mysqli this is done with the following call. Note that MySQL's name for the charset is utf8, without a hyphen:

mysqli_set_charset($dblink, 'utf8')

http://de3.php.net/manual/en/mysqli.set-charset.php
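In context, the call might look like the sketch below. The function name connect_utf8 and its parameters are hypothetical and only there for illustration; the mysqli calls themselves are the standard procedural API.

```php
<?php
/**
 * Opens a mysqli connection and switches the connection charset to utf8.
 * connect_utf8 and its parameter names are hypothetical, made up for this sketch.
 */
function connect_utf8(string $host, string $user, string $password, string $database): mysqli
{
    $dblink = mysqli_connect($host, $user, $password, $database);
    if (!$dblink) {
        throw new RuntimeException('Connection failed: ' . mysqli_connect_error());
    }

    // Set the charset right after connecting, before any query is sent.
    if (!mysqli_set_charset($dblink, 'utf8')) {
        throw new RuntimeException('Could not set charset: ' . mysqli_error($dblink));
    }

    return $dblink;
}
```

With placeholder credentials, usage would be something like $dblink = connect_utf8('localhost', 'user', 'secret', 'mydb'); afterwards mysqli_character_set_name($dblink) should report the charset in effect.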


Some things you will need to look into:

PHP

Make sure your content is marked as UTF-8 in php.ini:

default_charset = "utf-8"

Install the mbstring extension.

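Ensure that you are talking UTF-8 between PHP and MySQL.
Call mysql_set_charset("utf8"); (or use the SQL query SET NAMES utf8). Note that the mysql_* functions were removed in PHP 7; with mysqli, use mysqli_set_charset instead.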
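A quick way to check the PHP side of these settings is a small script like this one. It is only a sketch: it reads the default_charset setting and checks whether mbstring is loaded, and the variable names are my own.

```php
<?php
// Read the charset PHP puts in the Content-Type header by default.
$defaultCharset = ini_get('default_charset');

// Check whether the mbstring extension is available in this PHP build.
$hasMbstring = extension_loaded('mbstring');

printf("default_charset: %s\n", $defaultCharset === '' ? '(not set)' : $defaultCharset);
printf("mbstring loaded: %s\n", $hasMbstring ? 'yes' : 'no');
```

Run it both from the command line and through the web server if you can, since the two may load different php.ini files.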

Apache

You can also set the charset in the Content-Type header of your pages here, with something like this:

AddDefaultCharset utf-8

MySQL

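Make sure all your tables use the utf8 character set with a collation such as utf8_general_ci. To set the default for a database, e.g.:

ALTER DATABASE mydb CHARACTER SET utf8;

Note that this only changes the default for tables created afterwards; existing tables have to be converted individually with ALTER TABLE ... CONVERT TO CHARACTER SET utf8.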

Finally

Test everything with fun Unicode samples, like this one:

٩(͡๏̯͡๏)۶
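As a rough sanity check on the PHP side, a sample like this can be run through mb_check_encoding. This sketch assumes the script file itself is saved as UTF-8 and that mbstring is loaded; the variable names are illustrative.

```php
<?php
// Assumes this file is saved as UTF-8.
$sample = "٩(͡๏̯͡๏)۶";

// True only if the bytes form a valid UTF-8 sequence.
$isValidUtf8 = mb_check_encoding($sample, 'UTF-8');

// Multibyte text has more bytes than characters, so these two should differ.
$byteCount = strlen($sample);
$charCount = mb_strlen($sample, 'UTF-8');

printf("valid UTF-8: %s, %d bytes, %d characters\n",
    $isValidUtf8 ? 'yes' : 'no', $byteCount, $charCount);
```

The same check is useful on a value after it has been stored in MySQL and read back: if the round trip turns it into question marks or invalid UTF-8, one of the layers is still not set to UTF-8.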

More helpful information from when I tried this...