charset problem

laurent le cadet

Wednesday 12 December 2007 3:02:32 am

Hi,

I'm using eZ Find 1.0.2 with eZ Publish 3.9.3 (ISO-8859-1), and text is not indexed correctly.

For example:
V�rin hydraulique
V�rin hydraulique ... pompes, chaleur, hydraulique, v�rin

This should be "Vérin hydraulique"

Are any additional settings needed?

Regards.

Laurent
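
For readers wondering where the "�" comes from: one common cause is ISO-8859-1 bytes being decoded as UTF-8. The sketch below reproduces that symptom in plain Java. It is only an illustration of the mechanism, not eZ Find or Solr code, and the class and method names are made up for the example.

```java
import java.nio.charset.StandardCharsets;

/** Reproduces the symptom: ISO-8859-1 bytes misread as UTF-8. */
class MojibakeDemo {

    /** Encodes text as ISO-8859-1, then decodes those bytes as if they were UTF-8. */
    static String latin1ReadAsUtf8(String text) {
        byte[] latin1Bytes = text.getBytes(StandardCharsets.ISO_8859_1);
        // 'é' is the single byte 0xE9 in ISO-8859-1. Followed by an ASCII
        // letter, that byte is not a valid UTF-8 sequence, so the decoder
        // substitutes the replacement character U+FFFD.
        return new String(latin1Bytes, StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        // \u00e9 is 'é'; the escape avoids depending on the source file encoding.
        String broken = latin1ReadAsUtf8("V\u00e9rin hydraulique");
        System.out.println(broken.indexOf('\uFFFD')); // position of the damaged character
    }
}
```

The accented letter is lost at decode time, which is why no later processing step can bring it back.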

laurent le cadet

Thursday 13 December 2007 6:30:19 am

It sounds like the encoding is not correct.
Do we need a UTF-8 database?

Kåre Køhler Høvik

Thursday 13 December 2007 7:36:40 am

Hi

UTF-8 should not be required for eZ Find and eZ Publish 3. If you have a test environment available, please try commenting out these two lines in extension/ezfind/java/solr/conf/schema.xml:

....
<!--        <filter class="ISOLatin1AccentFilterFactory"/> -->
...
<!--        <filter class="ISOLatin1AccentFilterFactory"/> -->
...

Then restart Solr and reindex the data.

Kåre Høvik
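
As background on what that filter does: Lucene's ISOLatin1AccentFilter replaces accented ISO Latin 1 characters with their unaccented equivalents, so it works on characters that have already been decoded. The sketch below approximates that folding with java.text.Normalizer. It is not the filter's real implementation (the real filter also handles cases such as ligatures), and the class name is made up for the example.

```java
import java.text.Normalizer;

/** Rough approximation of accent folding, for illustration only. */
class AccentFoldingDemo {

    /** Decomposes accented letters and drops the combining marks. */
    static String fold(String text) {
        // NFD splits 'é' into 'e' plus a combining acute accent (U+0301).
        String decomposed = Normalizer.normalize(text, Normalizer.Form.NFD);
        // \p{M} matches the combining marks, which are removed here.
        return decomposed.replaceAll("\\p{M}", "");
    }

    public static void main(String[] args) {
        System.out.println(fold("V\u00e9rin hydraulique")); // Verin hydraulique
    }
}
```

Folding like this turns "é" into "e". It does not produce "�", which suggests the damage happens before the text reaches the analyzer.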

laurent le cadet

Friday 14 December 2007 3:08:13 am

Hi Kåre,

We commented out the lines:

<!-- <filter class="ISOLatin1AccentFilterFactory"/> -->

then restarted Solr and reindexed, but the results are still corrupted:

This text :

Le DMP est con�u pour r�aliser pour le microdosage de tr�s haute pr�cision de tous les produits

Should be :

Le DMP est conçu pour réaliser pour le microdosage de très haute précision de tous les produits

The characters ç, é, è (and I presume all special characters) are not encoded correctly.

I'm stuck at this point.

Any hints?

regards.

Laurent

laurent le cadet

Monday 17 December 2007 4:36:37 am

Hi,

I read this at http://lucene.apache.org/solr/tutorial.html#Requirements :

"SimplePostTool: WARNING: Make sure your XML documents are encoded in UTF-8, other encodings are not currently supported"

Is that related to our problem or can we override that?

I have tried almost everything, without any results so far.

Best regards

Laurent

Kåre Køhler Høvik

Monday 17 December 2007 4:59:55 am

Hi

Thank you for looking into this.

It looks like you found the problem. The resolution is to have eZ Find convert the data to UTF-8 before it's indexed. Please add a bug report about this in the issue tracker, and I'll fix it as soon as I have time.

Best regards
Kåre

Kåre Høvik
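
To make "convert the data to UTF-8 before it's indexed" concrete, here is a minimal sketch of the transcoding step. eZ Find itself is written in PHP, so this Java version only illustrates the idea and is not the actual fix; the class and method names are hypothetical.

```java
import java.nio.charset.StandardCharsets;

/** Illustrative transcoding step: ISO-8859-1 bytes in, UTF-8 bytes out. */
class Latin1ToUtf8 {

    /** Re-encodes ISO-8859-1 bytes as UTF-8 bytes. */
    static byte[] toUtf8(byte[] latin1Bytes) {
        // Decode with the charset the data is really stored in...
        String text = new String(latin1Bytes, StandardCharsets.ISO_8859_1);
        // ...then encode with the charset the receiver expects.
        return text.getBytes(StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        // "Vérin" as it would be stored in an ISO-8859-1 database.
        byte[] stored = {0x56, (byte) 0xE9, 0x72, 0x69, 0x6E};
        byte[] utf8 = toUtf8(stored);
        // One byte longer: 0xE9 becomes the two bytes 0xC3 0xA9.
        System.out.println(stored.length + " -> " + utf8.length);
    }
}
```

The point is that the conversion has to name both charsets explicitly; relying on a platform default on either side is what typically lets the mismatch slip through.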

laurent le cadet

Monday 17 December 2007 5:07:56 am

Kåre,

I'm going to report the bug.
I found additional information on encoding/decoding (java.net), and another alternative that requires additional code:

// Call this before the first request.getParameter(); after that it has no effect.
String encoding = request.getCharacterEncoding();
if (encoding == null) {
  // The client declared no charset: set your default encoding here
  request.setCharacterEncoding("UTF-8");
}
...
String value = request.getParameter("q");

I'm digging into the "java.net" solution. For the other one, I don't know whether it can help us or where to apply the "patch".

Any ideas?

Laurent
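
Assuming the "java.net" approach refers to java.net.URLEncoder and java.net.URLDecoder (the post does not say), the sketch below shows how the charset changes the bytes of an encoded query term, and why both sides must agree on it. The class and helper names are made up for the example.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLDecoder;
import java.net.URLEncoder;

/** Shows how the chosen charset changes a URL-encoded query term. */
class QueryEncodingDemo {

    /** Percent-encodes a value using the named charset. */
    static String encode(String value, String charsetName) {
        try {
            return URLEncoder.encode(value, charsetName);
        } catch (UnsupportedEncodingException e) {
            throw new IllegalArgumentException("Unknown charset: " + charsetName, e);
        }
    }

    /** Decodes a percent-encoded value using the named charset. */
    static String decode(String value, String charsetName) {
        try {
            return URLDecoder.decode(value, charsetName);
        } catch (UnsupportedEncodingException e) {
            throw new IllegalArgumentException("Unknown charset: " + charsetName, e);
        }
    }

    public static void main(String[] args) {
        String term = "V\u00e9rin"; // "Vérin"
        System.out.println(encode(term, "ISO-8859-1")); // V%E9rin
        System.out.println(encode(term, "UTF-8"));      // V%C3%A9rin
        // Decoding only restores the term when the same charset is used.
        System.out.println(term.equals(decode(encode(term, "UTF-8"), "UTF-8"))); // true
    }
}
```

A receiver that expects UTF-8 needs the second form. Sending the first form to it leads to the same kind of corruption as in the indexed text.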

laurent le cadet

Wednesday 19 December 2007 2:50:18 am

In the end, I converted the DB to UTF-8.
Everything works fine now.

(http://ez.no/developer/forum/general/convert_from_iso_8859_1_encoding_to_utf_8/)

Hope this helps.

laurent

John Smith

Tuesday 19 August 2008 10:26:10 am

Hi Laurent,

I used the script by Kristof Coomans to do the UTF-8 conversion while upgrading from 3.6.1 to 3.8.0. It is posted at

http://ez.no/developer/forum/install_configuration/update_to_3_8_and_codepage_problems

I am now getting the notice

SET NAMES 'utf8'

on both the administration interface and the public website.

Are you getting the same?

Please help.
