Multi-language (Chinese, English, German) installation with MySQL: how?

Kai Duebbert

Thursday 27 March 2003 9:53:56 pm

Hi,

How can I do a multi-language installation with MySQL? Which codepage should I use?

eZ publish 3.0 is unusable with PostgreSQL 7.3.x, so I can't go the Unicode way. ;-(

Any suggestions?

Kai

Jan Borsodi

Friday 28 March 2003 1:57:52 am

Generally you can't do multi-language installations on MySQL unless the languages use the same characters and are Western languages. A new version with Unicode support is supposed to be coming, but I don't know if it's here yet.

Regarding codepages you could have used cp932 for this.

The other problem is that multi-byte charsets are not 100% supported yet. The reason is that PHP and all its string handling functions are built around single-byte characters; for instance, regular expressions will fail. This is something that we will continue to improve in future releases.
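
To illustrate the single-byte assumption described above, here is a minimal sketch in Python rather than PHP. Python's bytes type stands in for a byte-oriented string API; the sample text and variable names are made up for the example and are not eZ publish code.

```python
# Illustration only: bytes plays the role of a single-byte string API.
text = "\u4e2d\u6587 Gr\u00f6\u00dfe"   # "中文 Größe": Chinese and German in one string
raw = text.encode("utf-8")              # what a byte-oriented function actually sees

char_count = len(text)   # counts characters
byte_count = len(raw)    # counts bytes

# Cutting at a byte offset can split a multi-byte character in half.
truncated = raw[:4]      # all 3 bytes of the first character plus 1 byte of the second
try:
    truncated.decode("utf-8")
    split_ok = True
except UnicodeDecodeError:
    split_ok = False
```

The same text is 8 characters but 14 bytes in UTF-8, so a function that counts or cuts by bytes gives a different answer from one that works on characters, and a byte-based cut can leave invalid data behind.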

--
Amos

Documentation: http://ez.no/ez_publish/documentation
FAQ: http://ez.no/ez_publish/documentation/faq

Tony Wood

Friday 28 March 2003 4:21:08 am

How stable is eZ3 under PostgreSQL? It would appear that this is a better database platform for eZ if you require multiple languages. Do you agree?

Tony Wood : twitter.com/tonywood
Vision with Technology
Experts in eZ Publish consulting & development

Power to the Editor!

Free eZ Training : http://www.VisionWT.com/training
eZ Future Podcast : http://www.VisionWT.com/eZ-Future

Sergiy Pushchin

Friday 28 March 2003 4:47:22 am

eZ publish 3 is stable on PostgreSQL 7.2.x but not on 7.3.x.
But there are still some problems with Unicode and eZ publish.
We will add support for 7.3.x ASAP and improve multi-language (Unicode) support.

Kai Duebbert

Saturday 29 March 2003 6:15:53 pm

OK, I will probably wait for PostgreSQL 7.3.x support. Any (rough!) guess when the fix will be released or be in svn?

Kai

Fady Chakik

Monday 31 March 2003 11:57:48 pm

Hello,

Read this about internationalization under MySQL:

http://www.mysql.com/doc/en/Nutshell_4.1_features.html

-Fadi

Tony Wood

Wednesday 02 April 2003 1:20:41 am

I was reading this, Fadi.

Have you used 4.1? I know 4.0 has achieved production status, but is 4.1 in any fit state?

David Heath

Thursday 10 April 2003 4:22:29 am

Hi Sergiy,

Can you reveal any more about the current status of UTF-8 support in eZ publish 3? It is a major issue for us at OneWorld. I presume eZ publish 3 supports PostgreSQL 7.2.x OK; surely this would support UTF-8 with no problem?

Best regards

David Heath

David Heath

Thursday 10 April 2003 4:25:28 am

Sorry, I have just read back through the thread, so I see you have confirmed support for PG 7.2.x.

I would be interested to know exactly what you mean when you say there are "still some problems" with Unicode support. Are these show-stoppers, or can they be lived with or worked around?

Thanks

Dave

Jan Borsodi

Thursday 10 April 2003 8:30:27 am

How well eZ publish works with Unicode really depends on which parts of eZ publish are used.

The major problem with Unicode (or other multibyte encodings) is that PHP was designed for single-byte encodings (the default is latin1), and thus many of the string handling functions have problems with such strings.
With the mbstring extension compiled in, it's possible to get more support for Unicode; however, there are still lots of string functions without support for it.

Currently some of the datatypes (which use regular expressions for validation) and the search engine (due to upper/lower case handling) will have problems with Unicode.
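
The upper/lower case problem for search can be sketched as follows. This is a Python illustration, not eZ publish or PHP code: `bytes.lower()` plays the role of a byte-oriented case function that only knows ASCII, and the sample word is an assumption chosen for the example.

```python
# Illustration only: byte-oriented vs character-aware lowercasing.
word = "GR\u00d6SSE"                      # "GRÖSSE"

# bytes.lower() only converts ASCII A-Z, so the two bytes of "Ö" are left alone.
byte_result = word.encode("utf-8").lower().decode("utf-8")

# str.lower() knows about Unicode case mappings.
char_result = word.lower()

# A search index built with one and queried with the other will not match.
same = (byte_result == char_result)
```

Here the byte-oriented version yields "grÖsse" while the character-aware version yields "grösse", which is the kind of mismatch that makes case-insensitive search miss non-ASCII words.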

We will continue to improve the support for Unicode with the next 3.x releases. For instance, the 3.1 release will solve some of the issues with case handling and searching.

David Heath

Thursday 10 April 2003 9:00:40 am

Hi Jan,

Thanks for your response. I understand the issues with PHP support for multibyte character encodings and have been researching them myself.

Re: regexp support for UTF8, there are two areas:

1. ereg

2. preg_match (the PCRE libraries)

For 1, the solution is simple: use mb_ereg, or even enable the mbstring override mode, which replaces the standard ereg with mb_ereg etc. I would hope that this is being done throughout the eZ3 code.

For 2, I have made further enquiries. Please see the following post which I made to the php.i18n group:

http://news.php.net/article.php?group=php.i18n&article=530

And the response from Wez Furlong, the PCRE maintainer:

http://news.php.net/article.php?group=php.i18n&article=531

In other words, it sounds like PCRE should support UTF-8 fine from PHP 4.3 onwards.
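
As a rough analogy for what a UTF-8-aware regex mode buys you, here is a small Python sketch (not PHP/PCRE code). Python's `re` module behaves in a byte-oriented way with bytes patterns and in a Unicode-aware way with str patterns; the sample name is an assumption for the example.

```python
import re

name = "M\u00fcller"            # "Müller"
raw = name.encode("utf-8")      # the same name as UTF-8 bytes

# Bytes pattern: \w only matches ASCII word characters, so validation fails.
byte_match = re.fullmatch(rb"\w+", raw)

# Str pattern: \w is Unicode-aware, so the name validates.
char_match = re.fullmatch(r"\w+", name)

# "." matches one byte in byte mode but one character in Unicode mode.
dot_bytes = len(re.findall(rb".", raw))
dot_chars = len(re.findall(r".", name))
```

The byte-mode pattern rejects a perfectly ordinary name and sees 7 units where there are 6 characters, which is the sort of failure a datatype validator would hit without UTF-8 support in the regex engine.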

Are there other specific problems with unicode support which you are aware of?

Obviously case handling is a difficult one, as it requires locale awareness, and I'm not sure whether any library supports that (possibly ICU? http://oss.software.ibm.com/icu/).

We are currently investigating the possibility of supporting UTF-8 in eZ 2.x.

Thanks for your help.

Dave

Jan Borsodi

Friday 11 April 2003 4:01:29 am

I'll look into the PCRE in PHP 4.3 and the ICU project when I start on the i18n improvements. Case and priority tables will probably solve most of the problems.

The only other problem with Unicode as I see it is the database support (sorting and string operations); using PostgreSQL, Oracle or the latest MySQL version will probably fix this problem.
Also, using Unicode will slow down the site, especially without the mbstring extension.
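
The sorting issue can be sketched like this in Python. It is an illustration only: `rough_key` is a hypothetical helper that crudely approximates language-aware ordering by stripping accents and ignoring case; it is not a real collation and not what any of the databases mentioned actually do.

```python
import unicodedata

def rough_key(s):
    # Hypothetical helper: decompose, drop combining marks, then casefold.
    decomposed = unicodedata.normalize("NFD", s)
    stripped = "".join(c for c in decomposed if not unicodedata.combining(c))
    return stripped.casefold()

words = ["Zebra", "apfel", "\u00c4pfel", "Banane"]   # "\u00c4pfel" is "Äpfel"

# Plain code point order: uppercase ASCII, then lowercase ASCII, then accented letters.
by_codepoint = sorted(words)

# Accent- and case-insensitive order, closer to what a reader expects.
by_rough_key = sorted(words, key=rough_key)
```

Code point order puts "Äpfel" after everything else, while the approximated order keeps it next to "apfel". A database without Unicode-aware sorting tends to behave like the first case.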
