Memory problem while running cronjob


Damien MARTIN

Monday 16 August 2010 6:50:00 am

Hi there,

I have a problem with a simple script:

I have to fetch all the nodes in a folder, and there are a lot of them (18000+).
I have to add each node to an indexed array for comparison with a CSV file.

Here is the code:

$nodes = eZContentObjectTreeNode::subTreeByNodeID(null, 2283);
$existing = array();
foreach($nodes as $node){
        
    if($node->ClassIdentifier == "myclass"){

        $dm = $node->DataMap();      
        $existing[$dm["n_siren"]->DataText.$dm["siret"]->DataText] = $node->NodeID;
    
    }
    
    unset($dm);
    unset($site);
    
}

As you can see, I try to unset variables that are no longer used, but this part of the code never finishes: it always stops with a "not enough memory" error.

I'm running PHP 5.2.6 on GNU/Linux Lenny4 and I can't use PHP 5.3 and its garbage collector.

Can someone help me?

Yannick Komotir

Monday 16 August 2010 7:45:58 am

Hi,

You can use offset/limit to achieve this.

<|- Software Engineer @ eZ Publish developers -|>
@ http://twitter.com/yannixk

Damien MARTIN

Monday 16 August 2010 8:04:12 am

Thanks Yannick,

But I really need to fetch all the nodes at once.

This is how the script works in its complete form:

  1. Load the elements in the given directory (the part where the problem is)
  2. Load a CSV file containing essentially the same data as the stored elements
  3. Add new elements to eZ
  4. Modify existing elements in eZ
  5. Delete obsolete elements (an element is removed if it exists in eZ but not in the CSV file)

So the first part has to load all the nodes in order to compare them with the CSV file.
If not all the elements are loaded, I cannot tell whether an element from the CSV has to be added or modified.

The value used for the comparison is composed of two attributes (not the name, because that would be too easy...), which is why I have to load the DataMap of each node...

I hope this makes it clearer.
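
The add / modify / delete decision described in the steps above comes down to comparing the keys of two arrays. Below is a minimal sketch in plain PHP, assuming both sides have already been reduced to arrays keyed by the concatenated attribute value. The function name planSync and the sample keys and values are hypothetical, chosen only for illustration, and no eZ Publish API is involved:

```php
<?php
// Hypothetical helper: splits CSV rows and existing nodes into three groups
// by comparing array keys only (the values are carried along untouched).
function planSync( array $existing, array $csvRows )
{
    return array(
        // In the CSV but not in eZ: must be created.
        'add'    => array_diff_key( $csvRows, $existing ),
        // In both: candidates for modification.
        'update' => array_intersect_key( $csvRows, $existing ),
        // In eZ but not in the CSV: must be removed.
        'delete' => array_diff_key( $existing, $csvRows ),
    );
}

// Made-up sample data: key => node ID on one side, key => CSV row on the other.
$existing = array( '111A' => 101, '222B' => 102 );
$csvRows  = array(
    '222B' => array( 'name' => 'Beta' ),
    '333C' => array( 'name' => 'Gamma' ),
);

$plan = planSync( $existing, $csvRows );
```

Only the keys and node IDs of the existing elements need to stay in memory for this comparison, not the node objects themselves.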

Yannick Komotir

Monday 16 August 2010 9:28:58 am

Loading all the nodes at the same time is not the best way. You can do it sequentially, in batches.

Just an example:

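$limit = 50;
$offset = 0;
$continueRun = true;
while( $continueRun )
{
    // Process one batch, then move on to the next window of nodes
    $continueRun = dothis( $offset, $limit );
    $offset += $limit;
}

function dothis( $offset, $limit )
{
    $nodes = eZContentObjectTreeNode::subTreeByNodeID( array( 'Limit' => $limit, 'Offset' => $offset ), 2283 );
    // Stop when the fetch fails or no nodes are left
    if( !$nodes )
        return false;
    foreach( $nodes as $node )
    {
        //your task here
    }

    unset( $nodes );
    // Tell the caller there may be more nodes to process
    return true;
}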


Jérôme Vieilledent

Monday 16 August 2010 2:50:30 pm

Hello

eZ Publish uses an in-memory cache for optimization. If you want to iterate over a long list of nodes/objects, you need to clear this cache as you go:

$nodes = eZContentObjectTreeNode::subTreeByNodeID(null, 2283);
$existing = array();
foreach($nodes as $node){
 
    if($node->ClassIdentifier == "myclass"){
 
        $dm = $node->DataMap();      
        $existing[$dm["n_siren"]->DataText.$dm["siret"]->DataText] = $node->NodeID;
 
    }
 
    $node->object()->resetDataMap();
    eZContentObject::clearCache( array( $node->attribute( 'contentobject_id' ) ) );
 
}

Damien MARTIN

Tuesday 17 August 2010 12:20:43 am

Hi Jerome,

Your solution is very interesting because I can use it in my other import scripts. And it's wonderful not to have to edit the CSV file to re-run the cronjob from where it crashed!

I didn't think eZ stored so much data in its caches when you use PHP directly.

Thank you, Yannick and Jerome.

André R.

Tuesday 17 August 2010 1:08:55 am

> I didn't think eZ stored so much data in its caches when you use PHP directly.

It does. For years we have wanted to add a cache handler that manages the in-memory cache and provides a general cache API with handler support, so that we can move parts of the cache to, for instance, memcached, and thereby fix these memory issues, simplify the cache code, and possibly optimize things while at it.

eZ Online Editor 5: http://projects.ez.no/ezoe || eZJSCore (Ajax): http://projects.ez.no/ezjscore || eZ Publish EE http://ez.no/eZPublish/eZ-Publish-Enterprise-Subscription
@: http://twitter.com/andrerom

Jérôme Vieilledent

Tuesday 17 August 2010 1:10:21 am

My pleasure ;-).

Regarding imports, I plan to release a new import extension very soon: SQLIImport. You'll be able to handle any data source (XML, CSV...) by creating only one PHP class, with a really simplified API to create and retrieve content objects.

Stay tuned! :)

