Author
|
Message
|
laurent le cadet
|
Wednesday 12 December 2007 3:02:32 am
Hi, I'm using eZ Find 1.0.2 with eZ Publish 3.9.3 (ISO-8859-1) and text is not indexed correctly.
For example:
V�rin hydraulique V�rin hydraulique ... pompes, chaleur, hydraulique, v�rin
This should be "Vérin hydraulique". Are any additional settings needed?
Regards,
Laurent
|
laurent le cadet
|
Thursday 13 December 2007 6:30:19 am
It sounds like the encoding is not correct. Do we need a UTF-8 database?
|
Kåre Køhler Høvik
|
Thursday 13 December 2007 7:36:40 am
Hi,
UTF-8 should not be required for eZ Find and eZP3. If you have a test environment available, please try commenting out these two lines in extension/ezfind/java/solr/conf/schema.xml:
....
<!-- <filter class="ISOLatin1AccentFilterFactory"/> -->
...
<!-- <filter class="ISOLatin1AccentFilterFactory"/> -->
...
Then restart Solr and reindex the data.
Kåre Høvik
|
laurent le cadet
|
Friday 14 December 2007 3:08:13 am
Hi Kåre,
We commented out the lines:
<!-- <filter class="ISOLatin1AccentFilterFactory"/> -->
then restarted Solr and reindexed, but the results are still corrupted.
This text:
Le DMP est con�u pour r�aliser pour le microdosage de tr�s haute pr�cision de tous les produits
should be:
Le DMP est conçu pour réaliser pour le microdosage de très haute précision de tous les produits
The characters ç, é, è (and I presume all the special characters) are not encoded correctly. We are stuck at this point. Any hints?
Regards,
Laurent
|
laurent le cadet
|
Monday 17 December 2007 4:36:37 am
Hi,
I read this at http://lucene.apache.org/solr/tutorial.html#Requirements:
"SimplePostTool: WARNING: Make sure your XML documents are encoded in UTF-8, other encodings are not currently supported"
Is that related to our problem, or can we override it? I have tried almost everything without any result so far.
Best regards,
Laurent
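To illustrate why that warning would explain the symptoms: the sketch below assumes the indexed text leaves the ISO-8859-1 site as raw bytes and is then decoded as UTF-8 on the Solr side. The class and method names are hypothetical and are not part of eZ Find or Solr.

```java
import java.nio.charset.StandardCharsets;

public class EncodingMismatchDemo {

    /** Encodes text as ISO-8859-1, then decodes those bytes as if they were UTF-8. */
    static String misreadLatin1AsUtf8(String text) {
        byte[] latin1Bytes = text.getBytes(StandardCharsets.ISO_8859_1);
        // In ISO-8859-1, 'é' is the single byte 0xE9. Followed by an ASCII
        // letter, that byte is not a valid UTF-8 sequence, so the decoder
        // substitutes U+FFFD, the replacement character.
        return new String(latin1Bytes, StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        // \u00e9 is 'é'; the escape keeps this source file encoding-independent.
        String original = "V\u00e9rin hydraulique";
        String garbled = misreadLatin1AsUtf8(original);
        System.out.println("Contains U+FFFD: " + garbled.contains("\uFFFD"));
    }
}
```

If this assumption holds, commenting out the accent filter cannot help, because the accented characters are already lost before any filter runs.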
|
Kåre Køhler Høvik
|
Monday 17 December 2007 4:59:55 am
Hi,
Thank you for looking into this. It looks like you found the problem. The fix is to have eZ Find convert the data to UTF-8 before it is indexed. Please add a bug report about this in the issue tracker, and I'll fix it as soon as I have time.
Best regards Kåre
Kåre Høvik
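As a rough illustration of "convert to UTF-8 before indexing": eZ Find itself is PHP, so this is not the actual fix, only a minimal Java sketch of the transcoding step. It assumes the input bytes are known to be ISO-8859-1; the class and method names are hypothetical.

```java
import java.nio.charset.StandardCharsets;

public class Latin1ToUtf8 {

    /** Re-encodes ISO-8859-1 bytes as UTF-8 bytes. */
    static byte[] toUtf8(byte[] latin1Bytes) {
        // Decode with the charset the data was really written in...
        String text = new String(latin1Bytes, StandardCharsets.ISO_8859_1);
        // ...then encode with the charset the indexer expects.
        return text.getBytes(StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        byte[] latin1 = "V\u00e9rin".getBytes(StandardCharsets.ISO_8859_1);
        byte[] utf8 = toUtf8(latin1);
        // 'é' is one byte in ISO-8859-1 and two bytes in UTF-8.
        System.out.println(latin1.length + " bytes in, " + utf8.length + " bytes out");
    }
}
```

The important part is decoding with the source charset first; passing the ISO-8859-1 bytes through unchanged is what produces the corrupted index entries.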
|
laurent le cadet
|
Monday 17 December 2007 5:07:56 am
Kåre,
I'm going to report the bug. As you can see, there is additional info on encoding/decoding (java.net), or an alternative that needs additional code:
String encoding = request.getCharacterEncoding();
if (null == encoding) {
    // Set your default encoding here
    request.setCharacterEncoding("UTF-8");
} else {
    request.setCharacterEncoding(encoding);
}
...
String value = request.getParameter("q");
I'm digging into the "java.net" solution. As for the other one, I don't know whether it can help us or where to apply the "patch". Any ideas?
Laurent
|
laurent le cadet
|
Wednesday 19 December 2007 2:50:18 am
In the end, I converted the DB to UTF-8 and everything works fine (http://ez.no/developer/forum/general/convert_from_iso_8859_1_encoding_to_utf_8/).
Hope this helps.
Laurent
|
John Smith
|
Tuesday 19 August 2008 10:26:10 am
Hi Laurent,
I used the script by Kristof Coomans, posted at http://ez.no/developer/forum/install_configuration/update_to_3_8_and_codepage_problems, to do the UTF-8 conversion while upgrading from 3.6.1 to 3.8.0. I am now getting a notice about SET NAMES 'utf8' on both the administration interface and the public website. Are you getting the same? Please help.
|