eZFind and pdf indexing


Gwenal Le Bihan

Friday 19 September 2008 2:30:33 am

Hello,

I've been trying to index PDF files using eZ Publish 4.0.1, eZ Find, and this binary file parser: http://ez.no/developer/articles/indexing_multiple_binary_file_types/installation. It works fine except for large PDF files.

I tried it on a 2.7 MB PDF file (approx. 140,000 characters). There are no error messages in any log, but the file is only indexed up to a certain point and nothing after that (approx. 66% of the words are indexed).

Is there a limit (size, number of characters, ...) set in some configuration file such as solrconfig.xml? I don't quite understand everything in that file and don't want to mess with it.

Is there anybody out there with the same problem?

Thanks a lot

gwenal

Gwenal Le Bihan

Friday 19 September 2008 3:54:56 am

PS: In ezbinaryparser.php, I've disabled (commented out) this line:

//               $sData = substr($sData, 0, $iCharacterLimit);

so that the extracted text is no longer truncated and I don't get a partial index of my file during the original upload.
