
Problems with indexing binary/pdf data in eZ Find


Jens Görisch

Friday 14 December 2007 10:02:50 am

Hi there,

I spent the whole day trying to get some PDF data into the Solr index of eZ Find, without success. The PDF document is correctly converted to plain text and finds its way into the eZPDFParser class, but when I search for its content, or even just for some defined tags, I get no results.
Can anybody give me a hint? What have I forgotten? And is there a way to get the whole index data as plain text, so I can check that it contains what it should?

best regards and thanks in advance

Jens Görisch
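
Editor's note on the question above about inspecting the index: one way to see what Solr actually holds is to query its select handler directly and print the returned documents. The following is a minimal sketch, not an eZ Find feature. The base URL (Solr's usual example address) and the function names are assumptions; adjust them to your installation. Solr returns only stored fields, so text that is indexed but not stored will not appear in the output.

```python
import json
import urllib.parse
import urllib.request

# Assumption: Solr's usual example address; change it to match your setup.
SOLR_BASE = "http://localhost:8983/solr"


def build_select_url(query="*:*", rows=10, base=SOLR_BASE):
    """Build a URL for Solr's select handler that asks for a JSON response."""
    params = urllib.parse.urlencode({"q": query, "rows": rows, "wt": "json"})
    return f"{base}/select?{params}"


def fetch_docs(query="*:*", rows=10, base=SOLR_BASE):
    """Return the stored fields of the matching documents as a list of dicts."""
    with urllib.request.urlopen(build_select_url(query, rows, base)) as resp:
        return json.load(resp)["response"]["docs"]


# Usage (needs a running Solr instance):
#   for doc in fetch_docs(rows=5):
#       print(json.dumps(doc, indent=2, ensure_ascii=False))
```

If a word that certainly occurs in the PDF never shows up in any returned field, the text most likely never reached the index, which points at the extraction step rather than at the search query.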

Kåre Køhler Høvik

Friday 21 December 2007 1:17:40 am

Hi

Can you provide an example PDF for which you were not able to retrieve any search results? Which PDF-to-text tool did you use?

Kåre Høvik
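
Editor's note: before looking at the index, it can help to confirm that the text extraction tool produces usable output for the PDF in question. The sketch below assumes pdftotext as the tool, which the thread does not confirm; the function names and the word-count threshold are made up for illustration.

```python
import subprocess

# Assumption: pdftotext is installed and is the extraction tool in use.
# Passing "-" as the output file makes pdftotext write the text to stdout.
DEFAULT_TOOL = ("pdftotext", "-enc", "UTF-8")


def extract_text(pdf_path, tool=DEFAULT_TOOL):
    """Run the extraction tool on pdf_path and return what it printed."""
    result = subprocess.run(
        [*tool, pdf_path, "-"],
        capture_output=True,
        check=True,  # raise CalledProcessError if the tool exits non-zero
    )
    return result.stdout.decode("utf-8", errors="replace")


def looks_indexable(text, min_words=5):
    """Rough heuristic: True if the text has at least min_words words."""
    return len(text.split()) >= min_words


# Usage (hypothetical file name):
#   text = extract_text("example.pdf")
#   print(looks_indexable(text), text[:200])
```

A scanned PDF without a text layer, for example, typically yields almost no text here, and nothing searchable would reach the index regardless of how eZ Find is configured.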

Jordan Hirsch

Wednesday 26 March 2008 10:13:20 am

How are you handling binary file parsing with eZ Find? I've implemented various binary file parsers with the regular eZ publish search, but I haven't used eZ Find before. Is there a documented process for indexing binary files with it?

Me: http://jordan.teamhirsch.com
My blog: http://wiredformusic.blogspot.com
My other company: http://thinkimprov.com
eZ Certification: http://auth.ez.no/certification/verify/402488
eZ Award: http://ez.no/company/news/ez_awards_2007_prize_winners

Kåre Køhler Høvik

Wednesday 26 March 2008 1:45:21 pm

Hi

eZ Find uses the binary file handlers in eZ Publish. Set them up as you normally would, and it should work. Please report any misbehaviour in the issue tracker.

Best regards
Kåre

Kåre Høvik

Jordan Hirsch

Wednesday 26 March 2008 2:02:08 pm

Kåre,

Thank you for your response. I'm used to the methodology from this article: http://ez.no/developer/articles/indexing_multiple_binary_file_types, which involves creating a custom indexing script.

If I don't want to use that custom script and just want to use eZ Find, I just edit the binaryfile.ini override file and tell it the custom parsers I want to use, right?

Thanks again for the help.

Kåre Køhler Høvik

Thursday 27 March 2008 2:55:55 am

Hi

Correct: override the binaryfile.ini file and run the update search index script provided by eZ Find.

Kåre Høvik

Jordan Hirsch

Thursday 27 March 2008 7:10:52 am

Great, thank you very much for your replies!
