Indexing arbitrary error (XMLStreamException Message: null)

Jens Görisch

Monday 31 August 2009 7:11:30 am

Hello,

as the title implies, I have problems with indexing into the eZ Find Solr index.

First, I want to make clear that I don't have problems with the eZ Find index script itself. Or at least I haven't checked whether this error occurs with that script, too.

Explanation:
We are using a data model that is ezContentObject-compliant but more lightweight. To index this data model, we use the schema file of eZ Find, since the core fields are the same.

This model holds ~8300 objects, which are indexed twice so that we can switch between the "searchable index" and the "indexing index". eZ Publish has 97800 objects indexed, which results in more than 100k objects in the index. I have not noticed this error with smaller indexes.

Now to the error itself:
When indexing, the update process sometimes causes an error ("sometimes" meaning a few XML packets within a run, not a few indexing runs). The response from Solr is empty and the log file contains the following entry:

SEVERE: javax.xml.stream.XMLStreamException: ParseError at [row,col]:[4004,3038]
Message: null
	at com.sun.org.apache.xerces.internal.impl.XMLStreamReaderImpl.next(XMLStreamReaderImpl.java:586)
	at org.apache.solr.handler.XmlUpdateRequestHandler.readDoc(XmlUpdateRequestHandler.java:321)
	at org.apache.solr.handler.XmlUpdateRequestHandler.processUpdate(XmlUpdateRequestHandler.java:195)
	at org.apache.solr.handler.XmlUpdateRequestHandler.handleRequestBody(XmlUpdateRequestHandler.java:123)
	at org.apache.solr.handler.RequestHandlerBase.handleRequest(RequestHandlerBase.java:131)
	at org.apache.solr.core.SolrCore.execute(SolrCore.java:1204)
	at org.apache.solr.servlet.SolrDispatchFilter.execute(SolrDispatchFilter.java:303)
	at org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:232)
	at org.mortbay.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1089)
	at org.mortbay.jetty.servlet.ServletHandler.handle(ServletHandler.java:365)
	at org.mortbay.jetty.security.SecurityHandler.handle(SecurityHandler.java:216)
	at org.mortbay.jetty.servlet.SessionHandler.handle(SessionHandler.java:181)
	at org.mortbay.jetty.handler.ContextHandler.handle(ContextHandler.java:712)
	at org.mortbay.jetty.webapp.WebAppContext.handle(WebAppContext.java:405)
	at org.mortbay.jetty.handler.ContextHandlerCollection.handle(ContextHandlerCollection.java:211)
	at org.mortbay.jetty.handler.HandlerCollection.handle(HandlerCollection.java:114)
	at org.mortbay.jetty.handler.HandlerWrapper.handle(HandlerWrapper.java:139)
	at org.mortbay.jetty.Server.handle(Server.java:285)
	at org.mortbay.jetty.HttpConnection.handleRequest(HttpConnection.java:502)
	at org.mortbay.jetty.HttpConnection$RequestHandler.content(HttpConnection.java:835)
	at org.mortbay.jetty.HttpParser.parseNext(HttpParser.java:641)
	at org.mortbay.jetty.HttpParser.parseAvailable(HttpParser.java:202)
	at org.mortbay.jetty.HttpConnection.handle(HttpConnection.java:378)
	at org.mortbay.jetty.bio.SocketConnector$Connection.run(SocketConnector.java:226)
	at org.mortbay.thread.BoundedThreadPool$PoolThread.run(BoundedThreadPool.java:442)

I dumped the XML data and checked the reported position. It was always an ordinary (but different) character. Validating the XML data with xmllint also reports it as valid XML. Occasionally no error occurs at all and indexing succeeds.
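For anyone who wants to repeat this check on a dumped packet, here is a rough Python sketch (not part of eZ Find). The helper name inspect_position and the context width are made up for illustration, and it assumes the [row,col] values in the Solr log are 1-based. It parses the dump to confirm it is well-formed and returns the character at the reported position together with some surrounding text.

```python
import xml.etree.ElementTree as ET


def inspect_position(xml_text, row, col, context=20):
    """Return (is_well_formed, char_at_position, surrounding_text).

    row and col are assumed to be 1-based, as in the Solr log entry.
    """
    try:
        ET.fromstring(xml_text)
        well_formed = True
    except ET.ParseError:
        well_formed = False

    lines = xml_text.splitlines()
    # Position outside the dump: nothing to show.
    if row < 1 or row > len(lines):
        return well_formed, None, ""
    line = lines[row - 1]
    if col < 1 or col > len(line):
        return well_formed, None, ""

    start = max(0, col - 1 - context)
    return well_formed, line[col - 1], line[start:col - 1 + context]


# Tiny stand-in for a dumped update packet (hypothetical data).
sample = "<add>\n<doc><field name=\"id\">42</field></doc>\n</add>"
result = inspect_position(sample, 2, 6)
```

If the dump parses cleanly and the character at the position is unremarkable, as in my case, the XML itself is probably not the cause.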

I've found a workaround to bypass this temporarily: I simply retry the affected packets until eZSolrBase::addDocs() returns true (up to 3 attempts). Strangely, the same XML works the second or third time.
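The retry logic is roughly the following, shown here as a language-neutral Python sketch instead of the actual PHP code. The function add_with_retry, its parameters and the flaky_send stand-in are hypothetical; in the real setup the send callable would wrap the call to eZSolrBase::addDocs().

```python
import time


def add_with_retry(send, packet, max_attempts=3, delay=0.0):
    """Call send(packet) until it returns True or attempts run out.

    Returns the number of attempts used, or 0 if every attempt failed.
    """
    for attempt in range(1, max_attempts + 1):
        if send(packet):
            return attempt
        # Optional pause before the next try (off by default).
        if delay and attempt < max_attempts:
            time.sleep(delay)
    return 0


# Stand-in sender that fails once and then succeeds (hypothetical).
calls = []


def flaky_send(packet):
    calls.append(packet)
    return len(calls) >= 2


attempts = add_with_retry(flaky_send, "<add>...</add>")
```

Returning the attempt count rather than a plain boolean makes it easy to log how often a packet needed more than one try.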

Can anybody report similar problems? And has anyone perhaps already found a (real) solution and the reason for this?

Thanks in advance,

Jens Görisch
