UTF-8, collate, fetch & sort with native characters

Piotrek Karaś

Wednesday 03 October 2007 11:55:26 pm

Hello everyone after a bit of a break :)

Can anyone give any tips on how to deal with sorting of fetched elements when native characters come into play? Polish, for instance, uses nine native characters (ąśżźćńłęó). How do I make eZ Publish sort elements like this:
(...)
o-word
ó-word [o acute]
p-word
(...)
MySQL provides the utf8_polish_ci collation - will that do the trick?
What about multi-language installations? Is there a way to force collation based on a siteaccess for example?

Of course, we're talking about UTF-8-based DB.

Any help/suggestions appreciated.
Thanks.

--
Company: mediaSELF Sp. z o.o., http://www.mediaself.pl
eZ references: http://ez.no/partners/worldwide_partners/mediaself
eZ certified developer: http://ez.no/certification/verify/272585
eZ blog: http://ez.ryba.eu

Kristof Coomans

Thursday 04 October 2007 7:32:33 am

Hi Piotrek

As far as I know, changing the collation will indeed affect your sorting results. But it's not possible to force a collation based on siteaccess at the moment. Maybe executing some SQL when the database connection is initialized could make it possible, but I haven't investigated this yet.

independent eZ Publish developer and service provider | http://blog.coomanskristof.be | http://ezpedia.org

Piotrek Karaś

Thursday 04 October 2007 9:11:02 am

Yup, you're right. Setting the collation before installation does the trick - I've just finished several tests of utf8_general_ci against utf8_[native]_ci for the entire database.
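For anyone who wants to repeat this, here is a rough sketch of what "setting the collation for the entire database" can look like in MySQL. The database name `ezpublish_pl` and the table name `ezcontentobject` are used purely for illustration (assumptions, adjust to your own install), and utf8_polish_ci stands in for whatever native collation you need:

```sql
-- Sketch only: the database and table names are illustrative assumptions.

-- New, empty database: tables created afterwards inherit this default
-- collation unless they specify their own.
CREATE DATABASE ezpublish_pl
  CHARACTER SET utf8
  COLLATE utf8_polish_ci;

-- Existing database: this changes only the default for tables created
-- later; columns that already exist keep their current collation.
ALTER DATABASE ezpublish_pl
  CHARACTER SET utf8
  COLLATE utf8_polish_ci;

-- Existing table: converts its character columns to the new collation.
ALTER TABLE ezcontentobject
  CONVERT TO CHARACTER SET utf8
  COLLATE utf8_polish_ci;

-- Check which collation each table ended up with.
SHOW TABLE STATUS;
```

Creating the database with the right collation before installing is the simplest route; converting table by table afterwards should be tried on a backup first.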

Too bad eZ Publish has no native way to tie collations to siteaccesses and/or language settings for a MySQL database. A global setting applied at the moment of connecting to the database would be very useful and easy to implement. Otherwise, it seems that most of the queries where the collation matters would have to be extended.

Thanks.


Piotrek Karaś

Friday 05 October 2007 11:17:28 am

I dug a bit and did some experimenting, unfortunately without results. MySQL seems to adhere strictly to its server -> database -> table -> column chain of default collation overrides, and I haven't found a way to change that behavior with any SQL statement issued upon establishing the DB connection. Query rewriting seems inevitable for my purpose, and I really hope I am wrong about it.
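To illustrate what I mean, here is a minimal sketch against a made-up table `items` with a `name` column stored as utf8_general_ci (both names are hypothetical, not part of the eZ Publish schema). As far as I understand MySQL's rules, a connection-level collation applies to string literals, while a plain ORDER BY on a column keeps using the column's own collation, so only an explicit COLLATE clause in the query changes the ordering:

```sql
-- Hypothetical table for illustration only.
CREATE TABLE items (
  name VARCHAR(255)
) CHARACTER SET utf8 COLLATE utf8_general_ci;

-- Connection-level setting: changes collation_connection,
-- which governs string literals, not stored columns.
SET NAMES utf8 COLLATE utf8_polish_ci;
SHOW VARIABLES LIKE 'collation%';

-- Still sorted by the column's utf8_general_ci collation.
SELECT name FROM items ORDER BY name;

-- Query rewriting: the explicit COLLATE clause overrides the column
-- collation for this sort only.
SELECT name FROM items ORDER BY name COLLATE utf8_polish_ci;
```

One likely cost of the rewritten form is that an index on `name` built with the column's collation cannot be used for the sort, so large result sets may get slower.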

Anyone with greater MySQL experience?
Thanks

