memory leak on content/publish ?


*- pike

Friday 16 February 2007 8:02:31 pm

Hi

I'm publishing objects from PHP in eZ Publish 3.7.6:


	print "pre publish\t: ".memory_get_usage()."\n";

	$operationResult = eZOperationHandler::execute( 
		'content','publish', 
		array(
			'object_id' => $id,
			'version' => $version 
		) 
	);

	print "post publish\t: ".memory_get_usage()."\n";
		
	eZContentObject::clearCache();

	print "post clear cache\t: ".memory_get_usage()."\n";

node#1
pre publish : 17616080
post publish : 20010640
post clear cache : 19966224

node#80
pre publish : 80979984
post publish : 81725344
post clear cache : 81678512

node#130
pre publish : 133028024
post publish : 133954768
post clear cache : 133907912

So every node I publish eats ~800 kB of memory, usage keeps growing, and it doesn't get released by clearing the in-memory object cache. Importing 140 objects gives an out-of-memory error with a 128 MB memory limit. Not good enough.

I've unset all variables that might hold references to the new object (in fact, this whole routine is inside a function, so PHP should free its local variables by itself on every iteration of the loop).

So it seems that when publishing nodes, something (probably the node and a lot attached to it) gets stuck somewhere behind content/publish, beneath some global somewhere.

any ideas ?
how should I debug this ?
what globals should I look for ?
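
One possible way to look for growing globals, as a rough sketch only: take a snapshot of the approximate size of every entry in $GLOBALS before and after a publish and compare the two. The helper names below are hypothetical (they are not part of eZ Publish), and the serialized length is only a crude proxy for real memory use. Data held in static class members or in reference cycles will not show up here.

```php
<?php
// Hypothetical debugging helpers, not part of eZ Publish.

// Returns an approximate size in bytes for every entry in $GLOBALS,
// using the length of its serialized form as a rough proxy.
function snapshotGlobalSizes()
{
    $sizes = array();
    foreach ( array_keys( $GLOBALS ) as $name )
    {
        if ( $name === 'GLOBALS' )
        {
            continue; // skip the self-reference that older PHP versions expose
        }
        try
        {
            $sizes[$name] = strlen( serialize( $GLOBALS[$name] ) );
        }
        catch ( Exception $e )
        {
            $sizes[$name] = 0; // some values (closures, for example) cannot be serialized
        }
    }
    return $sizes;
}

// Returns the globals that grew between two snapshots, largest growth first.
function diffGlobalSizes( array $before, array $after )
{
    $growth = array();
    foreach ( $after as $name => $size )
    {
        $old = isset( $before[$name] ) ? $before[$name] : 0;
        if ( $size > $old )
        {
            $growth[$name] = $size - $old;
        }
    }
    arsort( $growth );
    return $growth;
}
```

Call snapshotGlobalSizes() before and after one publish and pass both results to diffGlobalSizes(). If you do this at the top level of a script, the snapshot variables are globals themselves and will appear in the result, so it is cleaner to call it from inside a function.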

thanks,
*pike

---------------
The class eZContentObjectTreeNode does.

Xavier Dutoit

Monday 19 February 2007 2:17:38 am

Hi,

That's not a memory leak, that's a cache for the objects. I've used an eZ method/function to empty it (no need to keep the nodes in memory if you're sure you won't use them again). Unfortunately I don't have access to that code right now, but it exists and it isn't a problem.

X+

http://www.sydesy.com

*- pike

Monday 19 February 2007 2:47:53 am

Thanks Xavier

I thought that was what the eZContentObject::clearCache(); was for.
It does clear something, as shown above, but not all.

Effectively, what it does in 3.7.4 is
unset( $GLOBALS['eZContentObjectContentObjectCache'] );
unset( $GLOBALS['eZContentObjectDataMapCache'] );
unset( $GLOBALS['eZContentObjectVersionCache'] );

If there are any other caches, I'd love to know. Anyone ?

thanks,
*pike


Kristof Coomans

Monday 19 February 2007 4:07:13 am

It's indeed the in-memory object cache: http://ezpedia.org/wiki/en/ez/in_memory_object_cache

There are other in-memory caches (like the one with policy limitations: http://issues.ez.no/8388), but as far as I know there aren't others which keep growing like the object cache.

independent eZ Publish developer and service provider | http://blog.coomanskristof.be | http://ezpedia.org

*- pike

Monday 19 February 2007 4:31:22 am

Thanks Kristof,

but I am clearing the in-memory-object cache, aren't I ? I checked the $GLOBALS that should have been cleared, and yes, they are "undefined" after the clearCache() call.

Nevertheless, there is *still* something that grows ~800k per newly added object. So the question is: what is it? And/or how do I get rid of it? :-)

still puzzled,
*pike


Xavier Dutoit

Monday 19 February 2007 9:58:17 am

Hi,

Sorry, yes, the clearCache method you already use is the one I was referring to.

About the unset: I remember having had better results with = NULL instead. I'm not sure which PHP version or context that was. Could you try it and see if it changes anything (borderline voodoo, I know)?
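
A quick way to compare the two forms is to measure them on a stand-in cache. This is only a sketch: the global name is hypothetical, and it says nothing about what else inside eZ Publish may still hold references to the cached objects. On a current PHP both forms release the array as long as nothing else references it; the difference is that unset() removes the variable, while = null keeps it defined.

```php
<?php
// Fills a stand-in cache global with about 1 MB of strings.
// The global name is hypothetical, not an eZ Publish one.
function fillHypotheticalCache( $entries )
{
    global $hypotheticalObjectCache;
    for ( $i = 0; $i < $entries; $i++ )
    {
        $hypotheticalObjectCache[$i] = str_repeat( 'x', 1024 );
    }
}

$start = memory_get_usage();

fillHypotheticalCache( 1000 );
$filled = memory_get_usage();

unset( $GLOBALS['hypotheticalObjectCache'] ); // removes the variable itself
$afterUnset = memory_get_usage();

fillHypotheticalCache( 1000 );
$GLOBALS['hypotheticalObjectCache'] = null;   // keeps the variable, drops the array
$afterNull = memory_get_usage();

printf( "filled: +%d, after unset: +%d, after = null: +%d\n",
    $filled - $start, $afterUnset - $start, $afterNull - $start );
```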

X+

PS. If my next post suggests sacrificing chickens, that's a sure sign I need a vacation.

http://www.sydesy.com

Kristof Coomans

Monday 05 March 2007 12:32:22 am

Hi Pike

Any luck finding the resource hog?

Maybe it's the in-memory data map cache, used in eZContentObject::fetchDataMap

global $eZContentObjectDataMapCache;

independent eZ Publish developer and service provider | http://blog.coomanskristof.be | http://ezpedia.org

Xavier Dutoit

Monday 05 March 2007 3:09:51 am

Can it be safely cleared/unset ?

X+

http://www.sydesy.com

*- pike

Saturday 31 March 2007 2:37:39 am

Hi Kristof and Xavier

The eZContentObjectDataMapCache is already cleared by eZContentObject::clearCache();

Effectively, what it does in 3.7.6 is

  unset( $GLOBALS['eZContentObjectContentObjectCache'] );
  unset( $GLOBALS['eZContentObjectDataMapCache'] );
  unset( $GLOBALS['eZContentObjectVersionCache'] );

Anyone know any other caches ?

thanks,
*-pike


*- pike

Wednesday 15 August 2007 5:11:56 pm

Hi

Still no luck. I'm not going to publish the import extension if it cannot import more than 200 objects because of memory leaks in content/publish.

I tracked down the eZOperationHandler::execute( 'content','publish' ... call
by putting debug output in moduleOperationInfo::executeBody. Publishing a node is of course a whole series of method calls. Each of them seems to add a bit to the memory used, adding up to about 400k (after clearing caches):

pre publish          : 18078528
-----------
pre  18087992 - set-version-pending
post 18192576 - set-version-pending
pre  18192440 - pre_publish
post 18192200 - pre_publish
pre  18192120 - begin-publish
post 18192048 - begin-publish
pre  18191912 - set-version-archived
post 18197304 - set-version-archived
pre  18197176 - update-section-id
post 18209400 - update-section-id
pre  18209264 - loop-nodes
post 18367200 - loop-nodes
pre  18367056 - set-version-published
post 18368816 - set-version-published
pre  18368688 - set-object-published
post 18369872 - set-object-published
pre  18369744 - remove-old-nodes
post 18371496 - remove-old-nodes
pre  18371360 - attribute-publish-action
post 18373616 - attribute-publish-action
pre  18373616 - clear-object-view-cache
post 18373584 - clear-object-view-cache
pre  18373456 - generate-object-view-cache
post 18425592 - generate-object-view-cache
pre  18425488 - register-search-object
post 18426432 - register-search-object
pre  18426304 - create-notification
post 18493968 - create-notification
pre  18493832 - end-publish
post 18493752 - end-publish
pre  18493608 - post_publish
post 18493728 - post_publish
-----------
post publish          : 18493080
post clear cache    : 18492528

So the biggest killers are

set-version-pending          100k
loop-nodes                   160k
generate-object-view-cache    50k
create-notification           70k
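
These figures can be recomputed from the pre/post pairs in the log. A minimal sketch, assuming the log format shown in this post; rankStepDeltas is a hypothetical helper name, and only the four largest steps are used as sample input:

```php
<?php
// Parses "pre <bytes> - <step>" / "post <bytes> - <step>" lines and returns
// the memory growth per step in bytes, largest first.
function rankStepDeltas( $log )
{
    $pre = array();
    $deltas = array();
    foreach ( preg_split( '/\R/', $log ) as $line )
    {
        if ( !preg_match( '/^(pre|post)\s+(\d+)\s+-\s+(\S+)\s*$/', $line, $m ) )
        {
            continue; // ignore separators and summary lines
        }
        list( , $phase, $bytes, $step ) = $m;
        if ( $phase === 'pre' )
        {
            $pre[$step] = (int) $bytes;
        }
        elseif ( isset( $pre[$step] ) )
        {
            $deltas[$step] = (int) $bytes - $pre[$step];
        }
    }
    arsort( $deltas );
    return $deltas;
}

// Sample input copied from the log above.
$log = <<<LOG
pre  18087992 - set-version-pending
post 18192576 - set-version-pending
pre  18209264 - loop-nodes
post 18367200 - loop-nodes
pre  18373456 - generate-object-view-cache
post 18425592 - generate-object-view-cache
pre  18426304 - create-notification
post 18493968 - create-notification
LOG;

foreach ( rankStepDeltas( $log ) as $step => $bytes )
{
    printf( "%-28s %7d\n", $step, $bytes );
}
```

For the sample lines this ranks loop-nodes first (157936 bytes), followed by set-version-pending (104584), create-notification (67664) and generate-object-view-cache (52136).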

They are all in eZContentOperationCollection, but none of them seems to do much with global variables. set-version-pending (setVersionStatus) creates a new version, but this should get cleared by my unset( $GLOBALS['eZContentObjectVersionCache'] ). loop-nodes publishes the main node in the DB and adds a lot of attributes to the object, but those should get cleared along with the ObjectCache and DataMapCache (I assume). generateObjectViewCache works on disk and createNotificationEvent adds a trigger to the DB. Nothing harmful.

So .. where do all the bits go ?

*-pike


Xavier Dutoit

Thursday 16 August 2007 2:43:26 am

Hi,

Bad news.

To try to improve the import, two suggestions:

1) Disable the generation of the cache (done by default for anonymous and the user, if I'm right)

2) Disable the indexing (you will run a cron job at the end)

Does it improve things ?

X+

http://www.sydesy.com

André R.

Thursday 16 August 2007 4:30:23 am

List of caches:

eZContentClassAttribute:
$eZContentClassAttributeCacheListFull
$eZContentClassAttributeCacheList[$this->attribute( 'contentclass_id' )]
$eZContentClassAttributeCache[$this->ID]

eZContentObject:
$eZContentObjectContentObjectCache[$this->ID]
$eZContentObjectDataMapCache[$this->ID]
$eZContentObjectVersionCache[$this->ID]

eZUser:
$GLOBALS['eZUserObject_' . $userID]

From 3.8.9 / 3.9.3 / 3.10beta1 and up:

eZSectionObject:
$eZContentSectionObjectCache[$sectionID]

eZContentClass:
$eZContentClassObjectCache[$this->ID]
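
A sketch of how the globals in this list could be cleared in one go between imported objects. The function name is hypothetical, and whether it is safe to drop each of these caches at a given point (the user cache in particular) is an assumption that needs testing on your eZ Publish version:

```php
<?php
// Hypothetical helper: unsets the cache globals named in the list above
// and returns the names that were actually present.
function clearListedGlobalCaches()
{
    $names = array(
        'eZContentClassAttributeCacheListFull',
        'eZContentClassAttributeCacheList',
        'eZContentClassAttributeCache',
        'eZContentObjectContentObjectCache',
        'eZContentObjectDataMapCache',
        'eZContentObjectVersionCache',
        'eZContentSectionObjectCache',
        'eZContentClassObjectCache',
    );
    $cleared = array();
    foreach ( $names as $name )
    {
        if ( array_key_exists( $name, $GLOBALS ) )
        {
            unset( $GLOBALS[$name] );
            $cleared[] = $name;
        }
    }
    // Per-user entries are stored under 'eZUserObject_' . $userID.
    foreach ( array_keys( $GLOBALS ) as $name )
    {
        if ( strpos( (string) $name, 'eZUserObject_' ) === 0 )
        {
            unset( $GLOBALS[$name] );
            $cleared[] = $name;
        }
    }
    return $cleared;
}
```

The returned list makes it easy to log which caches existed at all, which can help narrow down where the memory is held.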

eZ Online Editor 5: http://projects.ez.no/ezoe || eZJSCore (Ajax): http://projects.ez.no/ezjscore || eZ Publish EE http://ez.no/eZPublish/eZ-Publish-Enterprise-Subscription
@: http://twitter.com/andrerom

*- pike

Thursday 16 August 2007 4:41:29 am

Thanks Xavier, Andre

Andre: great - that's what I was looking for!
I'm going to inspect those globals closely!

This extension is meant to be distributed, so I can't really require users to rebuild their configuration before using it. Like

>1) Disable the generation of the cache
even if it would help, I wouldn't want that to be a requirement...

>2) disable the indexing (you will run a cron at the end)
that may be possible. how could I do that ?

thanks!
*-pike


Xavier Dutoit

Thursday 16 August 2007 7:43:23 am

Hi,

I meant disable for the time of the import, not permanently.

DelayedIndexing=enabled (in the site.ini)

http://www.sydesy.com

*- pike

Thursday 16 August 2007 7:45:08 am

Hi

DelayedIndexing=enabled (in the site.ini) .. yes .. i already have that by default.

eZContentClassAttribute does fill up a bit; it caches each content class attribute (as you might expect). But that doesn't account for n*100kB per imported object (in my test, they were all of the same class). The eZContentObject caches should all be cleared by the eZContentObject::clearCache() call.

I've been staring at $GLOBALS, and there seems to be nothing very suspicious on the surface. A lot of globals though. What could be, is a circular reference deeper down that Zend refuses to clean up in any of the unset() calls. The code gets very dense when deep in the trunk.
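
The circular-reference theory is plausible for the PHP versions of that time: before PHP 5.3 memory was reclaimed by reference counting only, so two objects pointing at each other were not freed until the end of the request. From PHP 5.3 on there is a cycle collector, which can also be triggered with gc_collect_cycles(). The sketch below only demonstrates the effect with a hypothetical class; it does not show that eZ Publish actually builds such cycles.

```php
<?php
// Two objects that point at each other, as a parent/child pair might.
// The class is hypothetical and stands in for any mutually linked objects.
class HypotheticalNode
{
    public $other;
    public $payload;
}

function makeCycle()
{
    $a = new HypotheticalNode();
    $b = new HypotheticalNode();
    $a->payload = str_repeat( 'x', 100000 );
    $a->other = $b;
    $b->other = $a; // cycle: the reference counts never reach zero on their own
}

$start = memory_get_usage();
for ( $i = 0; $i < 50; $i++ )
{
    makeCycle(); // locals go out of scope, but the pair keeps itself alive
}
$leaked = memory_get_usage() - $start;

$collected = gc_collect_cycles(); // explicit collection, PHP 5.3 or later
$afterGc = memory_get_usage() - $start;

printf( "held by cycles: %d bytes, after gc: %d bytes, collected: %d\n",
    $leaked, $afterGc, $collected );
```

If an import loop runs on PHP 5.3 or later, calling gc_collect_cycles() every few objects is a cheap experiment to confirm or rule out this explanation.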

I'm going to publish the contrib anyway and finally get it off my plate. I've built in a check that should stop importing once you approach the memory limit, because if you hit an out-of-memory error in the middle of the process, you might be left with inconsistent tables (:-O
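
Such a check could look roughly like the sketch below. It is not the code from the contrib; the function names and the 16 MB default margin are assumptions, and the margin should be sized from the largest per-object growth you have measured.

```php
<?php
// Converts a php.ini shorthand value such as "128M" to bytes.
// Returns -1 when no limit is set ("-1" or an empty value).
function iniSizeToBytes( $value )
{
    $value = trim( $value );
    if ( $value === '' || $value === '-1' )
    {
        return -1;
    }
    $number = (int) $value;
    switch ( strtolower( substr( $value, -1 ) ) )
    {
        case 'g': $number *= 1024; // fall through
        case 'm': $number *= 1024; // fall through
        case 'k': $number *= 1024;
    }
    return $number;
}

// True while current usage plus a safety margin still fits under the limit.
function hasMemoryHeadroom( $limitBytes, $usedBytes, $marginBytes = 16777216 )
{
    if ( $limitBytes < 0 )
    {
        return true; // no limit configured
    }
    return $usedBytes + $marginBytes < $limitBytes;
}

$limit = iniSizeToBytes( ini_get( 'memory_limit' ) );
// Inside the import loop, before starting the next object:
// if ( !hasMemoryHeadroom( $limit, memory_get_usage() ) ) { break; }
```

Checking between objects rather than in the middle of a publish means the import stops at a point where the last object has been written completely.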

Thanks for your input!
*-pike


Xavier Dutoit

Thursday 16 August 2007 7:45:44 am

Btw, for the import I always use php-cli; otherwise it times out or eats all the memory. I've been able to import a few thousand nodes. Takes time....

Felipe did something bigger (by several orders of magnitude), based more or less on the same code, and it went fine. Took more time ;)

X+

http://www.sydesy.com

*- pike

Thursday 16 August 2007 7:52:57 am

Hi

>Btw, for the import, I always use php-cli, otherwise it times out/eat
>all the memory. I've been able to import a few 1000's nodes.
>Takes time....

Time is fine. But memory is limited.

What import thing have you been using there ?

thanks,
*-pike


Lazaro Ferreira

Thursday 16 August 2007 11:44:33 am

Hi ,

We have managed to import (create eZ content objects from CSV) more than 15000 objects using a command-line import script (ezp 3.8.8) with no memory problems. We usually run PHP in the background and it runs smoothly, but of course it takes some time.

If the import breaks for any reason, no database corruption happens, because the DB transaction is committed for every object. If you try to import too many objects within one DB transaction, it can eat all the RAM.

On the other hand, we have also used the ez csvimport/export scripts, and they work fine too (the ez csvimport scripts are available from 3.9.0).

The reason we didn't use ezcsvimport first was that we run a custom import: the information comes from external sources (not an eZ Publish site) and it involves more than just importing. It also validates the information and in some cases creates object relations on the fly.

Next on the to-do list: running import scripts from the eZ Publish website administration page, so users don't need shell access to import a large number of objects.

By the way, if anybody knows a good way to do it, please post the info here :-)

Lazaro
http://www.mzbusiness.com

*- pike

Thursday 16 August 2007 12:13:15 pm

>Todo next , running import scripts from ezpublish website administration page, so users
>don't need shell access to execute the import of a large number of objects

>By the way, if anybody knows a good way to do it, please post the info here :-)

Well, take your chance: http://ez.no/community/contribs/import_export/xmlimport
Maybe you're luckier than I am. Or maybe you have some setting that differs from mine.

Let me know!
*-pike


Gunnstein Lye

Thursday 27 December 2007 12:34:31 am

Hi!
Do you have any good news regarding this problem?
