Problem with Import of Text to UTF-Database


Rose Futchko

Monday 31 October 2005 7:19:27 pm

Recently, we imported over 3,000 text articles into the eZ publish database using the import routine provided with release 3.70. The conversion function appears to have normalized the XML to eZ publish compliant code, but it handled special characters incorrectly: some were replaced with question marks, and others were left unconverted.

For example,

1) The special characters are encoded properly as UTF-8 in the import XML (e.g. the euro sign is 'E2-82-AC'; the em-dash is 'E2-80-94').

2) In the imported samples, these characters are stored in the MySQL database simply as '3F' (the question mark), which is incorrect. We had thought they might be in Latin1, Unicode, or encoded as HTML character entities, but they are not.

3) The eZ Online Editor lets you enter special characters with its special character entry button. You can also enter special characters using the Windows "Character Map." Characters entered either way are stored in the database as proper UTF-8, just like in the XML (e.g. the euro sign is 'E2-82-AC'; the em-dash is 'E2-80-94').
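To make the byte values above concrete, here is a minimal Python sketch. It prints the UTF-8 bytes of the euro sign and the em-dash, and shows one way a '3F' can arise: a lossy conversion to a character set that lacks the character (Latin-1 is used here purely as an illustration). Whether the import routine actually performs such a conversion is an assumption, not something confirmed in this thread; the helper names are made up for the example.

```python
# Characters mentioned in the post, written as escapes to stay encoding-safe.
SAMPLES = {"euro sign": "\u20ac", "em-dash": "\u2014"}


def hex_bytes(data: bytes) -> str:
    """Format bytes in the 'E2-82-AC' notation used in the post."""
    return "-".join(f"{b:02X}" for b in data)


def utf8_hex(ch: str) -> str:
    """Hex of the character's UTF-8 encoding."""
    return hex_bytes(ch.encode("utf-8"))


def lossy_latin1_hex(ch: str) -> str:
    """Hex after a lossy conversion to Latin-1 (assumed failure mode).

    errors="replace" substitutes '?' for anything Latin-1 cannot represent.
    """
    return hex_bytes(ch.encode("latin-1", errors="replace"))


if __name__ == "__main__":
    for name, ch in SAMPLES.items():
        print(f"{name}: UTF-8 = {utf8_hex(ch)}, lossy Latin-1 = {lossy_latin1_hex(ch)}")
```

Once a character has been reduced to '3F' this way, the original byte sequence is gone, so the stored value alone cannot tell you which character it used to be.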

Has anyone run into this problem? If so, do you have a workaround you can share for fixing these special characters in the MySQL database?

Thank you.

Xavier Dutoit

Monday 31 October 2005 11:59:59 pm

Hi,

Which routine did you use? I don't quite follow.
Could you detail the steps?

As for the chars, I always have problems with win-1251 chars when I switch to utf-8.

I wrote a wee dirty script to convert the win chars (the quotes, dashes and all) to normal chars. I dump the database, run the script on the dump, and reimport the modified dump. However, if the chars are already screwed up in the MySQL database, that's probably not going to work.
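The script itself is not posted here, so the following is only a rough Python sketch of the dump, convert, reimport idea. The replacement table, the function names, and the choice to rewrite the dump as UTF-8 are all assumptions for illustration; the encoding of the dump has to be supplied by whoever runs it.

```python
# Hypothetical mapping from typographic characters to plain ASCII.
# The real script's table is not shown in the thread.
REPLACEMENTS = {
    "\u2018": "'",    # left single quote
    "\u2019": "'",    # right single quote
    "\u201c": '"',    # left double quote
    "\u201d": '"',    # right double quote
    "\u2013": "-",    # en dash
    "\u2014": "--",   # em dash
    "\u2026": "...",  # ellipsis
}
TABLE = str.maketrans(REPLACEMENTS)


def normalize_win_chars(text: str) -> str:
    """Replace the mapped typographic characters with ASCII equivalents."""
    return text.translate(TABLE)


def convert_dump(src_path: str, dst_path: str, source_encoding: str) -> None:
    """Read a SQL dump in source_encoding and write a normalized UTF-8 copy.

    source_encoding is whatever the dump was written in (an assumption the
    caller must supply). Undecodable bytes are replaced, not silently dropped.
    """
    with open(src_path, encoding=source_encoding, errors="replace", newline="") as src, \
         open(dst_path, "w", encoding="utf-8", newline="") as dst:
        for line in src:
            dst.write(normalize_win_chars(line))
```

For example, `normalize_win_chars("it\u2019s")` returns `it's`. As noted above, this only helps while the original characters are still present in the dump; anything already stored as '3F' cannot be recovered this way.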

X+

http://www.sydesy.com
