Problems with indexing binary/pdf data in eZ Find

Jens Görisch

Friday 14 December 2007 10:02:50 am

Hi there,

I spent the whole day trying to get some PDF data into the Solr index of eZ Find, without success. The PDF document is converted to plain text correctly and finds its way into the eZPDFParser class. But when I search for its content, or even just for some defined tags, I get no results.
Can anybody give me a hint? What have I forgotten, and is there a way to get the whole index data as plain text, so I can check whether what should be in there actually is?

best regards and thanks in advance

Jens Görisch

Kåre Køhler Høvik

Friday 21 December 2007 1:17:40 am

Hi

Can you provide an example PDF for which you were not able to retrieve any search results? Which PDF-to-text tool did you use?

Kåre Høvik

Jordan Hirsch

Wednesday 26 March 2008 10:13:20 am

How are you handling binary file parsing with eZ Find? I've implemented various binary file parsers with the regular eZ publish search, but I haven't used eZ Find before. Is there a documented process for indexing binary files with it?

Me: http://jordan.teamhirsch.com
My blog: http://wiredformusic.blogspot.com
My other company: http://thinkimprov.com
eZ Certification: http://auth.ez.no/certification/verify/402488
eZ Award: http://ez.no/company/news/ez_awards_2007_prize_winners

Kåre Køhler Høvik

Wednesday 26 March 2008 1:45:21 pm

Hi

eZ Find uses the binary file handlers in eZ Publish. Set them up as you normally would, and it should work. Please report any misbehaviour in the issue tracker.

Best regards
Kåre

Kåre Høvik

Jordan Hirsch

Wednesday 26 March 2008 2:02:08 pm

Kåre,

Thank you for your response. I'm used to the approach from this article: http://ez.no/developer/articles/indexing_multiple_binary_file_types, which involves creating a custom indexing script.

If I don't want to use that custom script and only want to use eZ Find, I just edit the binaryfile.ini override file and tell it which custom parsers to use, right?

Thanks again for the help.


Kåre Køhler Høvik

Thursday 27 March 2008 2:55:55 am

Hi

Correct: override the binaryfile.ini file, then run the update search index script provided by eZ Find.

Kåre Høvik
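
The first post asked whether the index contents can be inspected as plain text. After re-indexing as described above, one possible way is to query Solr directly over HTTP. The following is only a minimal sketch, not something confirmed in this thread. It assumes the Solr instance used by eZ Find is reachable at http://localhost:8983/solr and exposes Solr's standard select handler with the JSON response writer. The base URL and the helper names build_select_url and fetch_documents are hypothetical and chosen for illustration.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen

# Assumption: address of a locally running Solr; adjust to your installation.
SOLR_BASE_URL = "http://localhost:8983/solr"


def build_select_url(base_url, query="*:*", rows=10):
    """Build a URL for Solr's select handler that asks for a JSON response.

    The default query "*:*" matches every document in the index.
    """
    params = urlencode({"q": query, "rows": rows, "wt": "json"})
    return f"{base_url.rstrip('/')}/select?{params}"


def fetch_documents(base_url, query="*:*", rows=10):
    """Return the documents Solr reports for the given query."""
    with urlopen(build_select_url(base_url, query, rows)) as response:
        payload = json.load(response)
    # A Solr JSON response lists the matching documents under response.docs.
    return payload["response"]["docs"]
```

Calling fetch_documents(SOLR_BASE_URL, rows=5) and printing each returned dictionary shows what Solr holds for the first few documents. Solr returns only stored fields, so text that is indexed but not stored will not appear in the output even though it is searchable. In that case, passing a word from the PDF as the query and checking whether any document comes back is a more direct test.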

Jordan Hirsch

Thursday 27 March 2008 7:10:52 am

Great, thank you very much for your replies!


