Importing and Exporting Content Objects in eZ


Russell Michell

Sunday 18 April 2010 3:12:43 pm

No worries Nicolas - the 4.0.x days were over long ago for me, but the DB schema/data issue dates from that time, and I have been upgrading and patching ever since 4.0.1-rc2.

FYI - I have created an XSL stylesheet which seems to work OK transforming ezxmlexport's own XML format into data_import's own XML format. I have posted it over on the ezxmlexport forum: http://projects.ez.no/ezxmlexport/forum/general/suggestions.

Hope it may help someone else :-)

Cheers
Russ

PS - does anyone know how to get ezxmlexport to include the parent folder in an export? All I get right now is the contents of a folder, so I have to manually re-create its parent and copy its children into it. I will keep playing and try some stuff out though.

Russell Michell, Wellington, New Zealand.
We're building! http://www.theruss.com/blog/
I'm on Twitter: http://twitter.com/therussdotcom

Believe nothing, consider everything.

Nicolas Pastorino

Monday 19 April 2010 12:58:08 am

Thanks Russell, this XSL stylesheet may help a few others! Thanks for sharing.

About including the top node in the export, I'll try to investigate the question with the author of the extension :)
If anyone figures it out before then, please raise a hand here!

Cheers,

--
Nicolas Pastorino
Director Community - eZ
Member of the Community Project Board

eZ Publish Community on twitter: http://twitter.com/ezcommunity

t : http://twitter.com/jeanvoye
G+ : http://plus.tl/jeanvoye

Jérôme Renard

Monday 19 April 2010 1:13:43 am

Hello,

eZXMLExport never exports the parent node; it takes a top-down approach, not a bottom-up one.

For example, with the following directory structure:

 eZPublish
└── folder1
    └── folder2
        └── folder3

If you want to export folder2 and folder3, you have to export from folder1, and if you want to export folder1, 2 and 3 you have to choose to export from "eZPublish".
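The top-down behaviour described above can be sketched with a plain nested PHP array standing in for the content tree. This is only an illustration: the function names `collectExportedNodes()` and `flattenNames()` and the array layout are assumptions made for this sketch and are not part of eZXMLExport.

```php
<?php
// Hypothetical stand-in for the content tree: name => children.
$tree = array(
    'eZPublish' => array(
        'folder1' => array(
            'folder2' => array(
                'folder3' => array(),
            ),
        ),
    ),
);

/**
 * Return the names of everything below $start, never $start itself.
 * Returns null when $start is not found in the tree.
 */
function collectExportedNodes(array $tree, $start)
{
    foreach ($tree as $name => $children) {
        if ($name === $start) {
            return flattenNames($children);
        }
        $found = collectExportedNodes($children, $start);
        if ($found !== null) {
            return $found;
        }
    }
    return null;
}

/** Flatten a subtree into a list of names, parents before children. */
function flattenNames(array $children)
{
    $names = array();
    foreach ($children as $name => $grandchildren) {
        $names[] = $name;
        $names = array_merge($names, flattenNames($grandchildren));
    }
    return $names;
}

print_r(collectExportedNodes($tree, 'folder1'));   // folder2, folder3
print_r(collectExportedNodes($tree, 'eZPublish')); // folder1, folder2, folder3
```

Choosing "folder1" as the starting point yields only folder2 and folder3, which matches the behaviour described: the selected node is the container being walked, not part of the result.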

Cheers :)

Russell Michell

Monday 19 April 2010 1:18:43 pm

Hi Jérôme and thanks for clearing that up :-)

I have to say, though, that doing it this way doesn't seem intuitive to me. In the ezxmlexport Admin GUI, when you select the "Choose Contents" button, I would expect the content I selected to be exported, not the children of that content.

Perhaps it would be an improvement if the docs were more specific about this (possibly they already are; I have read them several times but don't recall - sorry!). An additional option in the export interface could also help, labelled 'Export Behaviour' with two values: 1) default and 2) complete, where 'default' is the behaviour as it is now and 'complete' also includes the parent directory the user selects when choosing content. Maybe even an .ini setting?

I will post this as another suggestion over on the ezxmlexport forum though.

Thanks again though Jérôme for an extremely useful and versatile extension - I do not wish to take this fact away from you :-)

Regards
Russell


Russell Michell

Monday 19 April 2010 4:36:43 pm

"

Well there is an even simpler solution.

eZXMLExport allows you to supply an XSLT file that is applied during the export process, so you can run a small, specific XSLT against each XML file created to generate the final result. If you already have a ready-to-go XML handler in data_import, that would be an interesting solution :)

"

I forgot to mention, I created a generic XMLHandler for data_import. It works at the moment with folders and articles. Hopefully you can see how to hack/modify it to do what you want it to do. The only problem is that the logger->write() method still needs implementing (To Do!)

<?php
/*
*       @description:   Generic exported eZ content-class import class
*       @author:        R.Michell 2010 r DOT michell AT gns DOT cri DOT nz
*       @package:       data_import
*       @To Do:         Implement logger->write() method using ezcomponents
*/

class XMLHandlerGeneric extends XmlHandlerPHP
{
        var $handlerTitle = 'Generic Handler';          // Default handler name.
        var $current_loc_info = array();                // Not sure if used.
        var $logfile = 'data_import.log';               // Log file-name.
        var $remoteID = '';                             // Not sure if used.
        const REMOTE_IDENTIFIER = 'xmlimport_';         // Default prefix; the row ID is appended to it later.

        var $root_node = 'all';                         // Source XML root node element.
        var $xml_source_path = 'extension/data_import/dataSource/exports'; // Path to parent dir of source XML file(s) for import.
        var $xml_source_file;                           // Export name; ezxmlexport uses it for both the export's parent dir and the XML filename.
        var $parent_id_fallback = 2;                    // Fallback to root node ('Main') if a parent_id cannot be found for an imported object.

        /*
        * Constructor
        */
        public function XMLHandlerGeneric()
        {
        }

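        function logger($message,$logfile)
        {
                // Minimal file-based fallback; the ezcomponents logger is still To Do.
                // is_writable() is false for a file that does not exist yet, so check its directory instead.
                $target = file_exists($logfile) ? $logfile : dirname($logfile);
                if(is_writable($target))
                {
                        return file_put_contents($logfile, $message."\n", FILE_APPEND) !== false;
                }
                return false;
        }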

        function writeLog( $message, $newlogfile = '')
        {
                if($newlogfile)
                {
                        $logfile = $newlogfile;
                }
                else
                {
                        $logfile = $this->logfile;
                }
                $this->logger->write(self::REMOTE_IDENTIFIER.$this->current_row->getAttribute('id').': '.$message,$logfile);
        }

// Mapping for source XML field name to an eZ attribute name:
        function geteZAttributeIdentifierFromField()
        {
                $field_name = $this->current_field->getAttribute('name');
                if($this->getTargetContentClass() == 'article')
                {
                        switch ($field_name)
                        {
                                // Each case returns, so no break statements are needed.
                                case 'name':
                                        return 'title';
                                case 'shortname':
                                        return 'short_title';
                                case 'description':
                                        return 'body';
                                case 'publishdate':
                                        return 'publish_date';
                                default:
                                        return $field_name;
                        }
                }
                elseif($this->getTargetContentClass() == 'folder')
                {
                        switch ($field_name)
                        {
                                case 'shortname':
                                        return 'short_name';
                                case 'showsubitems':
                                        return 'show_children';
                                case 'publishdate':
                                        return 'publish_date';
                                default:
                                        return $field_name;
                        }
                }
                else
                {
                        switch ($field_name)
                        {
                                case 'shortname':
                                        return 'short_name';
                                case 'showsubitems':
                                        return 'show_children';
                                case 'publishdate':
                                        return 'publish_date';
                                default:
                                        return $field_name;
                        }
                }
        }


        // Handles xml fields before storing them in ez publish
        function getValueFromField()
        {
                switch( $this->current_field->getAttribute('name') )
                {
                        case 'publishdate':
                        {
                                $return_unix_ts = time();
                                $us_formated_date = $this->current_field->nodeValue;
                                $parts = explode('/', $us_formated_date );
                                if( count( $parts ) == 3)
                                {
                                        $return_unix_ts = mktime( 0,0,0, $parts[0], $parts[1] , $parts[2] );
                                }
                                return $return_unix_ts;
                        }
                        break;
                        case 'short_description':
                        case 'description':
                        case 'intro':
                        case 'body_right':
                        {
                                $xml_text_parser = new XmlTextParser();
                                $xmltext = $xml_text_parser->Html2XmlText( $this->current_field->nodeValue );
                                if($xmltext !== false)
                                {
                                        return $xmltext;
                                }
                                else
                                {
                                        $message = 'Failed to parse XML for attribute: '.$this->current_field->getAttribute('name');
                                        //$this->logger->write(self::REMOTE_IDENTIFIER.$this->current_row->getAttribute('id').': '.$message,$logfile);
                                        return false;
                                }
                        }
                        break;
                        default:
                        {
                                return $this->current_field->nodeValue;
                        }
                }
        }

// Logic where to place the current content node into the content tree
        function getParentNodeId()
        {
                $parent_id = $this->parent_id_fallback;
                $parent_remote_id = $this->current_row->getAttribute('parent_id');
                if( $parent_remote_id )
                {
                        $eZ_object = eZContentObject::fetchByRemoteID(self::REMOTE_IDENTIFIER.$parent_remote_id );
                        if( $eZ_object )
                        {
                                $parent_id = $eZ_object->attribute('main_node_id');
                        }
                }
                return $parent_id;
        }

        function getDataRowId()
        {
                return self::REMOTE_IDENTIFIER.$this->current_row->getAttribute('id');
        }

        /*
        * - Allow the flexibility to extract data from multiple content-classes in one source XML file.
        * - See comments by Joachim Karl at: http://ez.no/developer/contribs/import_export/data_import
        */
        function getTargetContentClass()
        {
                if($this->current_row->getAttribute('type'))
                {
                        return $this->current_row->getAttribute('type');
                }
                else
                {
                        // The type attribute is empty here, so there is no class name to report.
                        $message = 'eZ content-class not found: the source row has no type attribute.';
                        $this->writeLog($message);
                        return false;
                }
        }

        function readData()
        {
                if(isset($this->xml_source_path) && isset($this->xml_source_file))
                {
                        $filename = $this->xml_source_path.'/'.$this->xml_source_file.'/'.$this->xml_source_file.'.transformed.xml';
                        if(is_file($filename))
                        {
                                return $this->parse_xml_document($filename,$this->root_node);
                        }
                        $message = 'Cannot open '.$filename.' for reading. Please check files/dirs exist and permissions are set correctly'."\n";
                }
                else
                {
                        $message = 'Source export file cannot be found or is not set. Please check files/dirs exist and permissions are set correctly'."\n";
                }
                // No row has been read at this point, so there is no row ID to add to the log line.
                $this->logger->write(self::REMOTE_IDENTIFIER.': '.$message,$this->logfile);
                return false;
        }

        function post_publish_handling( $eZ_object, $force_exit )
        {
                $force_exit = false;
                return true;
        }
}
?>
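
The publishdate branch of getValueFromField() above assumes a US-formatted date (month/day/year). A minimal standalone sketch of that conversion follows, so it can be tried without eZ Publish. The function name `usDateToTimestamp()` is made up for this example, and the `trim()` call and integer casts are additions that the handler itself does not have.

```php
<?php
/**
 * Convert a US-formatted date (m/d/Y) to a Unix timestamp at midnight.
 * Returns $fallback when the string does not split into three parts,
 * in the same way the handler falls back to time().
 */
function usDateToTimestamp($us_formatted_date, $fallback)
{
    $parts = explode('/', trim($us_formatted_date));
    if (count($parts) == 3) {
        // mktime() takes hour, minute, second, month, day, year.
        return mktime(0, 0, 0, (int) $parts[0], (int) $parts[1], (int) $parts[2]);
    }
    return $fallback;
}

echo date('Y-m-d', usDateToTimestamp('04/18/2010', time())), "\n"; // 2010-04-18
var_dump(usDateToTimestamp('2010-04-18', 0));                      // int(0)
```

A date in any other layout, such as the ISO form in the second call, silently falls back, so it may be worth logging that case in a real handler.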


Russell Michell

Tuesday 20 April 2010 3:00:57 pm

"

Hello,

eZXMLExport never exports the parent node; it takes a top-down approach, not a bottom-up one.

For example, with the following directory structure:

 eZPublish
└── folder1
    └── folder2
        └── folder3

If you want to export folder2 and folder3, you have to export from folder1, and if you want to export folder1, 2 and 3 you have to choose to export from "eZPublish".

Cheers :)

"

What if my tree looks like this and I only want to export folder 3, folder 4 and all their contents?

 

 eZPublish
 ├── folder1
 │   └── folder2
 │       ├── folder3
 │       │   ├── article 1
 │       │   └── article 2
 │       ├── folder4
 │       └── folder5
 └── folder6

I'm also having trouble with my exported items maintaining their relations. If I re-import the items, the default parent node (2) is used and if there are items within items (articles in folders) the articles aren't kept within their parent folders and are all placed at the same level in the folder hierarchy.

Granted, this is probably to do with the data_import extension and not yours Jérôme but any tips anyone might have are gratefully received to fix this.

I have tried including the parent_node_id in the exported content by hacking ezxmlexportexporter.php, adding the parent_node_id into the XSL stylesheet and pulling it out in data_import's getParentNodeId() function, but the same thing still happens.

Help! - I'm fully stuck :-(
Thanks everyone


Russell Michell

Sunday 25 April 2010 4:33:12 pm

Hi all, this will pretty much be the final post in this lengthy topic I reckon.

For those encountering this thread in the future (*wave* - what does the post carbon world look like?) I have solved my problems.

It appears data_import's own getParentNodeId() is lacking, so I've had to further hack ezxmlexport's eZXMLExportExporter XML creation class in ezxmlexportexporter.php to ensure a parent_id is included in the source (exported) XML. From there, getParentNodeId() in the import will work.
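
To illustrate why the parent_id matters, here is a minimal sketch of the lookup that getParentNodeId() performs, with an in-memory array standing in for eZContentObject::fetchByRemoteID(). The functions `resolveParentNodeId()` and `importRows()`, the row layout and the node IDs are all made up for this example.

```php
<?php
const REMOTE_IDENTIFIER = 'xmlimport_';
const PARENT_ID_FALLBACK = 2;

/**
 * Hypothetical stand-in for the remote-ID lookup: $imported maps
 * remote IDs to the node IDs created so far in this import run.
 */
function resolveParentNodeId(array $imported, $parent_remote_id)
{
    $key = REMOTE_IDENTIFIER . $parent_remote_id;
    if ($parent_remote_id && isset($imported[$key])) {
        return $imported[$key];
    }
    return PARENT_ID_FALLBACK;
}

/** Import rows in the given order; returns row id => parent node ID. */
function importRows(array $rows)
{
    $imported = array();
    $placement = array();
    $next_node_id = 100; // made-up node IDs for the sketch
    foreach ($rows as $row) {
        $placement[$row['id']] = resolveParentNodeId($imported, $row['parent_id']);
        $imported[REMOTE_IDENTIFIER . $row['id']] = $next_node_id++;
    }
    return $placement;
}

// Without a parent_id in the source XML, everything lands under node 2.
$flat = importRows(array(
    array('id' => '10', 'parent_id' => ''),
    array('id' => '11', 'parent_id' => ''),
));

// With a parent_id, the article (11) is placed under the folder (10).
$nested = importRows(array(
    array('id' => '10', 'parent_id' => ''),
    array('id' => '11', 'parent_id' => '10'),
));

print_r($flat);
print_r($nested);
```

In this sketch the lookup only succeeds when the parent row has already been imported, so the order of rows in the source file matters as well.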

I've also modified the XSL stylesheet to ensure container classes are created first by using <xsl:sort>. The updated stylesheet is in the ezxmlexport forum thread on projects.ez.no that I linked to earlier.
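
For anyone who has not used <xsl:sort> this way, here is a rough sketch of the idea using PHP's XSLTProcessor (it needs the dom and xsl extensions). The element and attribute names are invented for the example and are not ezxmlexport's or data_import's real formats, and the stylesheet is not the one posted on projects.ez.no.

```php
<?php
// Hypothetical source XML: an article appears before its folder.
$source = <<<XML
<all>
  <row id="11" type="article" parent_id="10"/>
  <row id="10" type="folder" parent_id=""/>
  <row id="12" type="article" parent_id="10"/>
</all>
XML;

$stylesheet = <<<XSL
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:template match="/all">
    <all>
      <xsl:for-each select="row">
        <!-- number(true) is 1, so descending order puts folders first -->
        <xsl:sort select="number(@type = 'folder')" data-type="number" order="descending"/>
        <xsl:copy-of select="."/>
      </xsl:for-each>
    </all>
  </xsl:template>
</xsl:stylesheet>
XSL;

$xml = new DOMDocument();
$xml->loadXML($source);
$xsl = new DOMDocument();
$xsl->loadXML($stylesheet);

$processor = new XSLTProcessor();
$processor->importStylesheet($xsl);
$result = $processor->transformToDoc($xml);

// Collect the rows in output order: the folder now comes before the articles.
$order = array();
foreach ($result->getElementsByTagName('row') as $row) {
    $order[] = $row->getAttribute('type') . ' ' . $row->getAttribute('id');
}
print_r($order);
```

With containers sorted to the top, each folder exists by the time the import reaches the articles that name it as their parent.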

Cheers all,
Russ

