Monday 31 October 2005 7:19:27 pm
Recently, we imported over 3,000 text articles into the eZ publish database using the import routine provided with release 3.70. It seems that this conversion function normalized the XML to eZ publish-compliant code but improperly stripped some special characters, converting them to question marks, and did not convert others. For example:

1) The special characters are encoded properly as UTF-8 in the import XML (e.g. the euro sign is 'E2-82-AC'; the em-dash is 'E2-80-94').

2) In the samples, these characters are incorrectly stored in the MySQL database simply as '3F' (the question mark). We had thought they might be in Latin1, Unicode, or encoded as HTML character entities, but they are not.

3) The eZ Online Editor lets you enter special characters with its special character entry button. You can also enter special characters using the Windows "Character Map." These characters are then stored in the database as proper UTF-8, just like in the XML (e.g. the euro sign is 'E2-82-AC'; the em-dash is 'E2-80-94').

Has anyone run into this problem? If so, do you have a workaround you can share for updating these special characters in the MySQL database? Thank you.
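To make the byte values above easy to check, here is a minimal Python sketch. It prints the UTF-8 bytes of the euro sign and the em-dash, and also shows one way a '3F' can arise: encoding a character into a charset that has no slot for it (Latin-1 here) with replacement turned on. The helper name `hex_bytes` is made up for this illustration, and the Latin-1 step is only a guess at the kind of conversion that could be happening, not a confirmed description of what the import routine does.

```python
def hex_bytes(text, encoding, errors="strict"):
    """Return the encoded bytes of text as dash-separated uppercase hex."""
    return "-".join(f"{b:02X}" for b in text.encode(encoding, errors=errors))


EURO = "\u20ac"      # euro sign
EM_DASH = "\u2014"   # em-dash

for name, ch in (("euro sign", EURO), ("em-dash", EM_DASH)):
    # Proper UTF-8 bytes, as seen in the import XML
    print(name, "as UTF-8:", hex_bytes(ch, "utf-8"))
    # Assumption for illustration: a lossy conversion to Latin-1, which
    # has neither character, substitutes a question mark (0x3F)
    print(name, "as Latin-1 with replacement:",
          hex_bytes(ch, "latin-1", errors="replace"))
```

If a step like this is what happened, the original characters cannot be recovered from the '3F' bytes alone, so any fix would have to go back to the source XML.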