Problem: Using specific characters from European languages like Romanian, Bulgarian, Czech and so on (usually the ones without support in ISO 8859-1) rises errors when displaying the content in browsers turning special characters in unrecognizable ones.
My fix for this problem is using UTF-8 character set encoding for every page of the website and the MySQL tables that contain the fields you are using. Also all the html encodings from PHP use the UTF-8 character set encoding (this is not mandatory).
If you already have the database, but with the default character set (latin1) and collation (latin1_swedish_ci) for the tables with text fields (of type CHAR, VARCHAR, TEXT etc) in which you need to have special characters, you should change the character set of each of those tables like this:
ALTER TABLE my_table CONVERT TO CHARACTER SET utf8;
If you don’t have the database then you should create it and when you create a table that you need to use with specific language characters, you should specify the character set for that table:
CREATE TABLE `my_table` ( `idmy_table` tinyint(3) unsigned NOT NULL auto_increment, `my_field` varchar(255) NOT NULL default '', PRIMARY KEY (`idmy_table`) ) CHARSET=utf8;
The most important thing is that in PHP, after opening a database connection, before executing any query to the database, you should ensure that this code is executed
mysql_query("SET NAMES utf8", $my_conn);
This tells the server what character set the client is using for sending SQL statements and the character set the server should use to return the results to the client.
A simple example:
<?php $my_conn = @mysql_connect("localhost", "user", "pass") or die("There was a problem connecting to MySQL. Please try again later."); if(!@mysql_select_db("test", $my_conn)) { die ("There was a problem connecting to the database. Please try again later."); } mysql_query("SET NAMES utf8", $my_conn); if(!empty($_GET['mystr'])) { // insert the string into the database $str = htmlspecialchars($_GET['mystr'], ENT_QUOTES, "UTF-8"); $query = "INSERT INTO my_table_t (my_field) VALUES('".$str."')"; $result = mysql_query($query, $my_conn); if($result) { // save the id of the table row inserted $last_insert_id = mysql_insert_id($my_conn); // get the last inserted value $query = "SELECT my_field FROM my_table_t WHERE idmy_table = '".$last_insert_id."'"; $result = mysql_query($query, $my_conn); if($result && $row = mysql_fetch_array($result, MYSQL_ASSOC)) { $db_string = $row['my_field'] ; } } } elseif(!empty($_GET['searchstr'])) { $str = htmlspecialchars($_GET['searchstr'], ENT_QUOTES, "UTF-8"); $query = "SELECT * FROM my_table_t WHERE my_field LIKE '%".$str."%'"; $result = mysql_query($query, $my_conn); if($result) { while($row = mysql_fetch_array($result, MYSQL_ASSOC)) { $search_results[$row['idmy_table']] = $row['my_field']; } } } mysql_close($my_conn); ?> <html> <head> <title>Page title</title> <meta http-equiv="Content-Type" content="text/html; charset=utf-8"></head> <body> <?php if(!empty($db_string)) { echo "<strong>Inserted string</strong>: $db_string<br />"; } ?> <form method="get" action=""> String to insert into the database <input type="text" name="mystr"/> <input type="submit" value="GO"/> </form> <?php if(!empty($search_results)) { echo "<strong>Search results</strong>:<br />"; foreach($search_results as $id => $value) { echo $value."<br />"; } } ?> <form method="get" action=""> Search query <input type="text" name="searchstr"/> <input type="submit" value="GO"/> </form> </body> </html>
Tip: The search in Romanian language over the database (tested with MySQL LIKE operator) works like a charm when searching words that have special characters or not.
For example: In Romanian language the word “peasant” is written as “ţăran” and someone who searches it gets the same result for the search terms “taran” or “ţăran” or “ţaran” or “tărân” and so on – so this is the real magic.
UPDATE: You may also need to add a header to the php script if you use ob_start or similar php functions like this:
header("Content-type: text/html; charset=UTF-8");
this usually fixes the encoding selection in Internet Explorer for this case.