iso-8859-1 to UTF-8 conversion.

laurent le cadet

Tuesday 23 August 2005 2:05:09 am

Hi,

Here's one more message about DB charset conversion, but what I have read so far leaves me with many questions ( http://ez.no/products/ez_publish_cms/documentation/configuration/configuration/language_and_charset/unicode_with_ez_publish ).

I'm using eZp 3.5.2 revision 10972 with PHP 4.3.8 / MySQL 3.23.58, DB internal charset iso-8859-1.

Currently there is a lot of content on the site (fre-FR + en-GB) and I want to add Chinese and Japanese.

As I understand it, I need to change the charset to UTF-8.

The previous threads about this referred to MySQL version problems or to many different things to do...

Is there a step-by-step document for doing this?

Regards.

Laurent

Georg Franz

Tuesday 23 August 2005 6:27:51 am

Hi Laurent,

First of all, you need a newer MySQL (the version should be 4.1.11 or later).
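
The version requirement can be checked before touching anything. A minimal sketch, assuming you already have the server's version string (for example from `SELECT VERSION()`); the helper name `mysqlVersionIsAtLeast` is made up for this illustration:

```php
<?php
// Hypothetical helper: true when the reported MySQL version is at least $minimum.
function mysqlVersionIsAtLeast($reported, $minimum = '4.1.11')
{
    // Server strings often carry a suffix such as "-log" or "-standard".
    // Keep only the leading dotted number so version_compare() sees clean input.
    if (!preg_match('/^\d+(?:\.\d+)*/', $reported, $m)) {
        return false;
    }
    return version_compare($m[0], $minimum, '>=');
}

var_dump(mysqlVersionIsAtLeast('3.23.58'));    // bool(false)
var_dump(mysqlVersionIsAtLeast('4.1.11-log')); // bool(true)
```

The suffix is stripped because `version_compare()` ranks an unknown trailing word below a plain release, so "4.1.11-log" would otherwise compare as older than "4.1.11".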

Then you need to change the charset of the eZ tables. I've rewritten a small script for that purpose, which I found on the MySQL forums:

<?php
// put in your username, password
$conn = mysql_connect("localhost", "root", "mypassword");

//change this to false to alter on the fly
$printonly=true; 

$charset="utf8";
$collate="utf8_general_ci";

$altertablecharset=true;
$alterdatabasecharser=true;

// put here your databases ...
$currentDBArray = array();
$currentDBArray[] = "mydb";


function PMA_getDbCollation($db)
{
	$sq='SHOW CREATE DATABASE `'.$db.'`;';
	$res = mysql_query($sq);
	if(!$res) echo "\n\n".$sq."\n".mysql_error()."\n\n"; else
	if($row = mysql_fetch_assoc($res))
	{
		$tokenized = explode(' ', $row['Create Database']); // mysql_fetch_assoc() keys the row by column name
		unset($row, $res, $sql_query);
		for ($i = 1; $i + 3 < count($tokenized); $i++)
		{
			if ($tokenized[$i] == 'DEFAULT' && $tokenized[$i + 1] == 'CHARACTER' && $tokenized[$i + 2] == 'SET')
			{
				if (isset($tokenized[$i + 5]) && $tokenized[$i + 4] == 'COLLATE')
				{
					 return array($tokenized [$i + 3],$tokenized[$i + 5]); // We found the collation!
				}
				else
				{
					return array($tokenized [$i + 3]);
				}
			}
		} 
	}
	return '';
}

$rs2 = mysql_query("SHOW DATABASES"); 
if(!$rs2)
	echo "\n\nSHOW DATABASES\n".mysql_error()."\n\n";
else
	while ($data2 = mysql_fetch_row($rs2))
	{
		$db=$data2[0];
		$db_cha=PMA_getDbCollation($db);
		if ( in_array ( $db, $currentDBArray ) )
			if ( substr($db_cha[0],0,4)!='utf8' ) // limit to charset
			{
				mysql_select_db($db);
				$rs = mysql_query("SHOW TABLES"); 
				if(!$rs)
					echo "\n\nSHOW TABLES\n".mysql_error()."\n\n";
				else
					while ($data = mysql_fetch_row($rs))
					{
						if ( substr ( $data[0], 0,2 ) == "ez" )
						{
							$rs1 = mysql_query("show FULL columns from $data[0]");
							
							if(!$rs1)
								echo "\n\nSHOW FULL COLUMNS FROM $data[0]\n".mysql_error()."\n\n";
							else
								while ($data1 = mysql_fetch_assoc($rs1))
								{
									if(in_array(current(explode('(',$data1['Type'],2)),array(
																				//'national char',
																				//'nchar',
																				//'national varchar',
																				//'nvarchar',
																				'char',
																				'varchar',
																				'tinytext',
																				'text',
																				'mediumtext',
																				'longtext',
																				'enum',
																				'set'
																				  ))) 
									 {
										if(substr($data1['Collation'],0,4)!='utf8') // limit to charset
										{
											$sq="ALTER TABLE `$data[0]` CHANGE `".$data1['Field'].'` `'.$data1['Field'].'` '.$data1['Type'].' CHARACTER SET binary '.($data1['Default']==''?'':($data1['Default']=='NULL'?' DEFAULT NULL':' DEFAULT \''.mysql_escape_string($data1['Default']).'\'')).($data1['Null']=='YES'?' NULL ':' NOT NULL').';';
											if(!$printonly&&!mysql_query($sq)) echo "\n\n".$sq."\n".mysql_error()."\n\n"; 
											else
											{
												echo ($sq."\n") ; 
												$sq="ALTER TABLE `$data[0]` CHANGE `".$data1['Field'].'` `'.$data1['Field'].'` '.$data1['Type']." CHARACTER SET $charset ".($collate==''?'':"COLLATE $collate").($data1['Default']==''?'':($data1['Default']=='NULL'?' DEFAULT NULL':' DEFAULT \''.mysql_escape_string($data1['Default']).'\'')).($data1['Null']=='YES'?' NULL ':' NOT NULL').($data1['Comment']==''?'':' COMMENT \''.mysql_escape_string($data1['Comment']).'\'').';';
												if(!$printonly&&!mysql_query($sq)) echo "\n\n".$sq."\n".mysql_error()."\n\n"; 
												else echo ($sq."\n") ; 
											}
										}
									}
								}
								if($altertablecharset)
								{
									/*
									  $sq='ALTER TABLE `'.$data[0]."` DEFAULT CHARACTER SET binary";
									  echo ($sq."\n") ; 
									  if(!mysql_query($sq)) echo "\n\n".$sq."\n".mysql_error()."\n\n";
									*/
									$sq='ALTER TABLE `'.$data[0]."` DEFAULT CHARACTER SET $charset ".($collate==''?'':"COLLATE $collate");
									echo ($sq."\n") ; 
									if(!$printonly)
										if(!mysql_query($sq)) echo "\n\n".$sq."\n".mysql_error()."\n\n";
								}
						}
						else
							echo $data[0] . " not changed.\n";
					}
				// Change the database default once, after all of its tables have been processed.
				if( $alterdatabasecharser )
				{
					/*
					$sq='ALTER DATABASE `'.$data2[0]."` DEFAULT CHARACTER SET binary";
					echo ($sq."\n") ; 
					if(!mysql_query($sq)) echo "\n\n".$sq."\n".mysql_error()."\n\n";
					*/ 
					$sq='ALTER DATABASE `'.$data2[0]."` DEFAULT CHARACTER SET $charset ".($collate==''?'':"COLLATE $collate");
					echo ($sq."\n") ; 
					if(!$printonly)
						if(!mysql_query($sq)) echo "\n\n".$sq."\n".mysql_error()."\n\n";
				}
				}
			}
?>

Then you need to change the ini settings of eZ Publish.

-> site.ini.append: charset at db entry
-> i18n.ini.append: charset-setting
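
As a rough illustration of the two entries above, the overrides might look like the fragment below. The group names `[DatabaseSettings]` and `[CharacterSettings]` and the `settings/override/` location are assumptions based on a standard eZ Publish 3.x layout, so compare them with the ini files shipped with your version before copying anything:

```ini
# settings/override/site.ini.append (assumed location)
[DatabaseSettings]
Charset=utf-8

# settings/override/i18n.ini.append (assumed location)
[CharacterSettings]
Charset=utf-8
```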

After that, don't forget to clear the eZ Publish cache completely.

HTH.

Best wishes,
Georg.

--
http://www.schicksal.com Horoskop website which uses eZ Publish since 2004

laurent le cadet

Tuesday 23 August 2005 8:42:30 am

Hi Georg,

Thanks for your reply.
First step: upgrade MySQL...

After that, is there any risk for the content?
Is the content of each table re-encoded, or is that not needed?

About the script, I presume I just have to run it once, from the root of the site for example?

Regards

Laurent.

Georg Franz

Tuesday 23 August 2005 9:56:46 am

Hi Laurent,

backup - backup - backup ... of course :-))

The script converts the tables first to a binary format and then to utf8, so no data should be lost.
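
It may help to see what the binary step does at the byte level. Going through binary leaves the stored bytes untouched and only changes the charset label, so the result is valid UTF-8 only if the bytes already were. The sketch below is not part of the script; `latin1ToUtf8` and `isValidUtf8` are names made up for this illustration, and the sample string is an assumption about what an iso-8859-1 site would hold:

```php
<?php
// Transcode ISO-8859-1 bytes to UTF-8 by hand, so no extension is required.
function latin1ToUtf8($bytes)
{
    $out = '';
    for ($i = 0; $i < strlen($bytes); $i++) {
        $b = ord($bytes[$i]);
        // Bytes below 0x80 are identical in both encodings; the rest become two bytes.
        $out .= $b < 0x80
            ? chr($b)
            : chr(0xC0 | ($b >> 6)) . chr(0x80 | ($b & 0x3F));
    }
    return $out;
}

// The /u modifier makes PCRE reject byte sequences that are not valid UTF-8.
function isValidUtf8($bytes)
{
    return preg_match('//u', $bytes) === 1;
}

$latin1     = "caf\xE9";              // "café" as iso-8859-1 bytes
$relabelled = $latin1;                // binary round trip: same bytes, new label
$transcoded = latin1ToUtf8($latin1);  // real conversion: bytes change

echo bin2hex($relabelled), ' ', var_export(isValidUtf8($relabelled), true), "\n"; // 636166e9 false
echo bin2hex($transcoded), ' ', var_export(isValidUtf8($transcoded), true), "\n"; // 636166c3a9 true
```

So it is worth checking a few rows with accented characters in the backup copy first, to see which of the two cases your data falls into.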

The script simply produces the SQL statements for the conversion. When you run it the first time with $printonly set to true (at the beginning of the script), the SQL statements are only written to the screen and nothing else happens.

If you really want to do the conversion, set $printonly to false.
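
The print-first, execute-later pattern can be shown in isolation. This is a simplified sketch, not the script itself: it leaves out the DEFAULT, NULL and COMMENT clauses, the function names are invented, and `$execute` is a hypothetical callback standing in for the real query call. The table and column in the usage line are only an example:

```php
<?php
// Build the two statements issued for one text column:
// first to binary, then to the target charset.
function buildColumnConversion($table, $field, $type, $charset = 'utf8', $collate = 'utf8_general_ci')
{
    $head = "ALTER TABLE `$table` CHANGE `$field` `$field` $type";
    return array(
        "$head CHARACTER SET binary;",
        "$head CHARACTER SET $charset COLLATE $collate;",
    );
}

// Dry-run switch: always print each statement, execute only when $printOnly is false.
// Returns the number of statements handed to $execute.
function runStatements($statements, $printOnly, $execute)
{
    $executed = 0;
    foreach ($statements as $sql) {
        echo $sql, "\n";
        if (!$printOnly) {
            call_user_func($execute, $sql);
            $executed++;
        }
    }
    return $executed;
}

// Dry run: the statements are printed, the callback is never invoked.
$statements = buildColumnConversion('ezcontentobject_name', 'name', 'varchar(255)');
runStatements($statements, true, function ($sql) { /* the database call would go here */ });
```

Reviewing the printed statements, or saving them to a file and running them by hand against a copy of the database, is a cheap way to rehearse the conversion.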

Best wishes,
Georg.

