Wednesday 10 August 2011 4:30:22 am
By: Paul Borgermans
I am happy to announce an updated version of the eZ Tika extension. It is a "helper" extension for indexing a wide variety of binary file types, including PDF, MS Word, PowerPoint, iWork, and others.
Check out the project page for downloads, source code and docs: http://projects.ez.no/eztika
Besides being a binary file plugin, eZ Tika is also a wrapper for the Apache Tika project, which aims to extract plain text and metadata from a large variety of files.
For more information on Apache Tika, visit http://tika.apache.org/