Wednesday 26 November 2008 10:59:20 am
pstotext is not the best solution for converting PDFs to raw text. I guess that it fails to convert the PDF file in question (try it on the command line to see what happens). It is better to use pdftotext from the xpdf project: create a new script, for example called ezpdftotext, with the following content (change the path to pdftotext to match your installation):
#!/bin/sh
<path to >/pdftotext -enc "UTF-8" "$1" -

Then configure this script in binaryfile.ini.

Note that the default installation will "normalize" Latin1 characters, so eZ Find/Solr will transform "reçu" to "recu" (and likewise for other accented characters), which means that searching for either form will produce the hit.

Best regards
eZ Publish, eZ Find, Solr expert consulting and training
http://twitter.com/paulborgermans
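As a sketch only, the ezpdftotext wrapper described above could be made a little more defensive. The variant below assumes pdftotext is on the PATH unless the PDFTOTEXT variable points elsewhere; that variable and the extract_pdf_text function name are made up for this illustration and are not part of eZ Publish or xpdf. It quotes the file name so paths with spaces survive, and it reports an unreadable input instead of failing silently.

```shell
#!/bin/sh
# Hypothetical, more defensive variant of the ezpdftotext wrapper.
# PDFTOTEXT is an assumed override variable (not an eZ Publish or xpdf setting);
# it defaults to whatever pdftotext is found on the PATH.
PDFTOTEXT="${PDFTOTEXT:-pdftotext}"

extract_pdf_text() {
    # Complain on stderr and return non-zero if the input cannot be read.
    if [ ! -r "$1" ]; then
        echo "ezpdftotext: cannot read '$1'" >&2
        return 1
    fi
    # Quote the path so file names with spaces are passed as one argument;
    # the trailing "-" tells pdftotext to write the text to stdout.
    "$PDFTOTEXT" -enc "UTF-8" "$1" -
}

# Only run when a file name is given, so the function can also be sourced.
if [ "$#" -gt 0 ]; then
    extract_pdf_text "$1"
fi
```

Running it by hand on the problematic file (for example ./ezpdftotext some.pdf) is a quick way to see whether the conversion works before pointing binaryfile.ini at it.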