Importing and Exporting Content Objects in eZ

Russell Michell

Wednesday 31 March 2010 7:11:46 pm

Hi everyone,

I have been exploring the various methods available for exporting and importing eZ content-objects and content-classes.

I wanted to ask: what single solution, or collection of solutions, have people found works best for them?

I'm using eZ 4.1.3 / PHP 5.2.12 / MySQL 5.0.22 / CentOS 5.2

  • ezpm.php looked good but generated a getAttribute() error when re-importing.
  • I looked at ezcsvimport.php and ezcsvexport.php but re-importing a content-node and subtree didn't re-create the parent folder, just the child nodes.
  • I looked at the "extract" project extension, but the parent folder that was recreated was named according to its attributes, complete with commas - which was weird.
  • I looked at the "ezxmlexport" project-extension which looks fantastic - but requires a companion solution for import also.
  • So I looked at the "data_import" extension, but it does require some additional customisation to complete the range of eZ content-classes that it will import.

My question is: of these solutions, and possibly others I haven't come across yet, what combination do you use, and is there a "complete" solution for eZ that is simply installed and "goes"? :-)

What would be ideal is this:

  • Download and install extension as normal
  • Via Admin siteaccess select "import/export"
  • Select the node/node-tree you're interested in exporting
  • Press "export" which exports the node-tree in whatever format is necessary for an eZ instance to recognise and import it later-on (.xml .ezpkg etc - the user need not know this, it's a back-end process)
  • Later - when you want to import the data into the same or another eZ install:
  • Via its admin siteaccess select "import/export" again
  • Select the parent node underneath which you'd like to import your previously exported content.
  • Press "Import"
  • Done.

Has anyone managed something like this? Perhaps not even with the GUI, but with a sequence of CLI scripts? I can go with the data_import + ezxmlexport extensions and make them work - possibly even package the two together, but having already spent 2 or 3 days working through the pros and cons of each solution, I'd really like to get a handle on what you folks do to achieve the same :-)

Many thanks and if you celebrate Easter - Happy Easter to you.
Russ

Russell Michell, Wellington, New Zealand.
We're building! http://www.theruss.com/blog/
I'm on Twitter: http://twitter.com/therussdotcom

Believe nothing, consider everything.

Jérôme Vieilledent

Wednesday 31 March 2010 11:38:47 pm

Hi

I have the same recurring problem... Package import/export often fails with content objects.

Nicolas Pastorino

Thursday 01 April 2010 12:20:07 am

Hi Russ,

Great topic, timeless question :)

My experience led me to use the ezxmlinstaller tool for content classes. Created initially at eZ ( by Dirk Schmedding ), it was felt to be useful in a generic fashion, and it actually is, imho.

Concerning importing data, i am a big fan of data_import for the following reasons :

  • extensible
  • reliable and tested on some of the most demanding projects ( among them www.caranddriver.com )
  • built-in support for many types of content sources : XML, SQL, CSV.
  • handles updates perfectly

Left is the question of exporting content objects. I pretty much think ezxmlexport is the best candidate here, and i tend to think it can be safely combined with data_import. I have not (yet) done this myself though.

Hope it helps,
and hope many community members will share their experience here.
Cheers !

--
Nicolas Pastorino
Director Community - eZ
Member of the Community Project Board

eZ Publish Community on twitter: http://twitter.com/ezcommunity

t : http://twitter.com/jeanvoye
G+ : http://plus.tl/jeanvoye

Kevin Gaudin

Thursday 01 April 2010 1:31:57 am

Hello,

This is a really painful (so really interesting) topic here.

Exporting/importing ezpackages of content objects is not reliable; we often experience issues with lost object relations or translations.

On this topic, we could also add the import of content classes definitions modifications (at least attributes addition) which is impossible and requires human actions through the web UI.

These issues make the process of developing an eZ Publish application and applying changes across different staging environments really difficult.

On our main project, we have to deal with dev / user acceptance tests / preproduction / production environments, which are managed by different teams, not all of which are trained in eZ Publish.

Without such reliable tools, eZ Publish cannot be considered a fully corporate-friendly solution.

We have not yet had time to test the available extensions extensively, as we could not identify an "all-in-one" solution without having to plan a full project only for this "basic" (at least for corporate management) subject.

Thanks a lot for your experiences on this topic !

Twitter: @kevingaudin

Jérôme Vieilledent

Thursday 01 April 2010 1:58:30 am

"

On this topic, we could also add the import of content classes definitions modifications (at least attributes addition) which is impossible and requires human actions through the web UI.

"

Actually you can do that with ezxmlinstaller and it's very efficient ;)

Jérôme Renard

Thursday 01 April 2010 2:00:02 am

"

Left is the question of exporting content objects. I pretty much think ezxmlexport is the best candidate here, and i tend to think it can be safely combined with data_import. I have not (yet) done this myself though.

"

Well, eZXMLExport exports in XML and data_import has an XML handler so that should work without any problem

:)

Nicolas Pastorino

Thursday 01 April 2010 2:03:24 am

"
"

Left is the question of exporting content objects. I pretty much think ezxmlexport is the best candidate here, and i tend to think it can be safely combined with data_import. I have not (yet) done this myself though.

"

Well, eZXMLExport exports in XML and data_import has an XML handler so that should work without any problem

:)

"

Yep, totally. It would be cool to have a data_import handler that complies with the ezxmlexport XML format. That would be a good step forward :)

Jérôme Renard

Thursday 01 April 2010 2:13:54 am

Well there is an even simpler solution.

eZXMLExport lets you supply an XSLT file that is applied during the export process, so a small, specific XSLT can be applied to each XML file created to generate the final result. If you already have a ready-to-go XML handler in data_import, that would be an interesting solution :)
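
A rough sketch of the reshaping idea, in Python. It uses the standard library's xml.etree.ElementTree instead of XSLT (the Python standard library has no XSLT processor), so it only illustrates the kind of mapping such a stylesheet would perform. Every element and attribute name below (`export`, `object`, `attribute`, `folders`, `folder`, `remote_id`) is made up for illustration: this thread shows neither ezxmlexport's real output format nor the input format any data_import handler expects.

```python
import xml.etree.ElementTree as ET

# Hypothetical export document; NOT the real ezxmlexport format.
EXPORTED = """
<export>
  <object id="10" class="folder">
    <attribute name="name">News</attribute>
    <attribute name="short_description">Latest news</attribute>
  </object>
  <object id="20" class="folder">
    <attribute name="name">Events</attribute>
  </object>
</export>
"""


def reshape(exported_xml):
    """Map the hypothetical export layout onto a hypothetical import layout."""
    source = ET.fromstring(exported_xml)
    target = ET.Element("folders")
    for obj in source.findall("object"):
        # Derive a stable remote id from the exported object id.
        folder = ET.SubElement(
            target, "folder", {"remote_id": "xmlfolder_" + obj.get("id")}
        )
        # Turn each <attribute name="x">value</attribute> into <x>value</x>.
        for attr in obj.findall("attribute"):
            child = ET.SubElement(folder, attr.get("name"))
            child.text = attr.text
    return ET.tostring(target, encoding="unicode")


result = reshape(EXPORTED)
print(result)
```

With a real XSLT file the same mapping would live in the stylesheet, and the import handler would only have to read the reshaped document.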

Matthieu Sévère

Thursday 01 April 2010 2:16:08 am

"
"

On this topic, we could also add the import of content classes definitions modifications (at least attributes addition) which is impossible and requires human actions through the web UI.

"

Actually you can do that with ezxmlinstaller and it's very efficient ;)

"

Yes, but it means that every modification to a content class has to be replicated in an XML file that reflects the state of every content class.

It's a bit heavy don't you think ?

The best would be a possibility to export a content class to a xml understandable by ezxmlinstaller.

--
eZ certified developer: http://ez.no/certification/verify/346216

Nicolas Pastorino

Thursday 01 April 2010 2:25:20 am

"
"
"

On this topic, we could also add the import of content classes definitions modifications (at least attributes addition) which is impossible and requires human actions through the web UI.

"

Actually you can do that with ezxmlinstaller and it's very efficient ;)

"

Yes, but it means that every modification to a content class has to be replicated in an XML file that reflects the state of every content class.

It's a bit heavy don't you think ?

"

The good point though is that one can version the content class definitions in any version control system (SVN/CVS/git..).

"The best would be a possibility to export a content class to a xml understandable by ezxmlinstaller."
Yes, that would help. Alternatively, making sure ezxmlinstaller understands an ezpkg could do the trick.

Cheers,

Kevin Gaudin

Thursday 01 April 2010 2:39:35 am

"
"

On this topic, we could also add the import of content classes definitions modifications (at least attributes addition) which is impossible and requires human actions through the web UI.

"

Actually you can do that with ezxmlinstaller and it's very efficient ;)

"

Interesting, this is not specified on the project page. I can only see ezcreateclass and ezmodifycontent, but nothing like 'ezmodifyclass', which would allow adding a new attribute to an existing class without deleting all its instances.

Jérôme Vieilledent

Thursday 01 April 2010 4:10:41 am

Actually, the ezcreateclass handler does handle class modification (see the handler code).

@ Matthieu : eZXMLInstaller provides an export module for classes and roles (/export/classes & /export/roles) :)

Matthieu Sévère

Thursday 01 April 2010 5:08:52 am

Nice !! I didn't see that export module :)

This extension is definitely useful !

Ivo Lukac

Thursday 01 April 2010 7:31:03 am

"

Concerning importing data, i am a big fan of data_import for the following reasons :

  • extensible
  • reliable and tested on some of the most demanding projects ( among them www.caranddriver.com )
  • built-in support for many types of content sources : XML, SQL, CSV.
  • handles updates perfectly
"

Second that for import. We have used data_import a few times already and it is a very good extension.

For example, we once wrote a MySQL import/update function with it pretty quickly and easily....

http://www.linkedin.com/in/ivolukac
http://www.netgen.hr/eng/blog
http://twitter.com/ilukac

Russell Michell

Sunday 04 April 2010 12:35:27 pm

Wow! Thanks guys - sometimes it's great to know "It's not just you" when you come across an issue in eZ or any other FOSS project and your head starts to hurt in thinking "Surely this should work, right?"

Anyway, thanks heaps for all your input, it looks like I was pretty much on the right track with data_import and ezxmlexport. I noticed you can use your own XSLT to match your content-classes but I too thought that a bit "heavy" - in addition to having to write your own handlers for it.

I'll check it out again when I'm back at work and report back.

Meanwhile, if anyone has any tips on using data_import or ezxmlexport that they had problems figuring out at the start, I'd be keen to learn from your experiences.

Thanks once again,
Russ

Russell Michell

Monday 05 April 2010 7:05:57 pm

Hi again,

OK, so I can see how I might hook data_import into ezxmlexport, and match ezxmlexport's XML format in a data_import custom source-handler. But just using data_import's basic examples (e.g. the XMLFolders.php example) I cannot see the imported content in any GUI - although the data is definitely there in the ezcontentobject MySQL table.

Can someone who's used data_import before show me what I'm missing or doing wrong? The default example should "just work", shouldn't it? I should be able to use XMLFolders.php as an example, create my own content-class handler from it and use it to parse the XML output from ezxmlexport - right?

Any pointers would be most welcome. I'd be just as happy to post this on data_import's projects.ez.no page, but that forum probably doesn't get the kind of traffic this forum gets...

Thanks again :-)
Russell

Nicolas Pastorino

Tuesday 06 April 2010 2:42:34 am

Hi Russell,

Have you cleared caches after having run the import ? :)

Have you passed the "-s" option to the runcronjobs.php script (setting the target siteaccess) ?

Otherwise, would you mind describing the procedure you are following to test the XMLFolders.php handler ( in a step-by-step manner, would be great :) )

Cheers !

Russell Michell

Tuesday 06 April 2010 1:08:53 pm

Hi Nicolas and thanks for your offer of help :-)

Yes I cleared the cache - always do :-)

I'm trying to use the data_import extension. There is no entry for it (run.php) in my settings/override/cronjobs.ini.append.php and there is no copy of cronjob.ini that is bundled with the data_import extension. There is no mention of using a cronjob in the data_import README and no 'cronjobs' folder in the data_import extension.

What am I missing?

I can see the imported content-objects in the ezcontentobject MySQL table, so the import itself does work, but there is something missing or not set correctly somewhere in the DB because these items do not show up at all in the admin siteaccess.

This is what I used to set it off:

#> php extension/data_import/scripts/run.php -i ImportOperator -d XMLFolders -s my_site_access

Starting import with "Folders  Handler" handler
Importing remote object (xmlfolder_10) updating object ID ( 6218 ).
Importing remote object (xmlfolder_20) updating object ID ( 6219 ).
Importing remote object (xmlfolder_30) updating object ID ( 6220 ).
Finished

#> mysql
Welcome to the MySQL monitor.  Commands end with ; or \g.
Your MySQL connection id is 6969 to server version: 5.0.22
Type 'help;' or '\h' for help. Type '\c' to clear the buffer.

mysql> use my_DB;
Reading table information for completion of table and column names
You can turn off this feature to get a quicker startup with -A

Database changed
mysql> SELECT * FROM ezcontentobject WHERE id IN (6218,6219,6220);
+-----------------+-----------------+------+---------------------+--------------+---------------+----------+------------+----------+-----------+--------------+------------+--------+
| contentclass_id | current_version | id   | initial_language_id | is_published | language_mask | modified | name       | owner_id | published | remote_id    | section_id | status |
+-----------------+-----------------+------+---------------------+--------------+---------------+----------+------------+----------+-----------+--------------+------------+--------+
|               1 |               1 | 6217 |                   2 |            0 |             3 |        0 | New Folder |       10 |         0 | xmlfolder_10 |          1 |      0 |
|               1 |               1 | 6218 |                   2 |            0 |             3 |        0 | New Folder |       10 |         0 | xmlfolder_20 |          1 |      0 |
|               1 |               1 | 6219 |                   2 |            0 |             3 |        0 | New Folder |       10 |         0 | xmlfolder_30 |          1 |      0 |
+-----------------+-----------------+------+---------------------+--------------+---------------+----------+------------+----------+-----------+--------------+------------+--------+
3 rows in set (0.00 sec)

The only differences I can see are that the remote_id isn't an MD5 hash - but I guess it becomes one at some point? Also, there is no entry for these object IDs in the ezcontentobject_link table, whereas there are entries for other content-objects I created manually earlier.
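
One check that may be worth running in a situation like this is whether each imported object has a location at all, i.e. a row in ezcontentobject_tree, and what its status is. The sketch below is only an illustration: it runs the query against an in-memory SQLite database with a heavily simplified stand-in schema so that it is self-contained, and the sample rows are invented. On a real install the same SELECT would be run against the MySQL database. It also assumes that status 1 means "published" and 0 means "draft".

```python
import sqlite3

# Stand-in schema: only the columns needed for the check, not the real tables.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE ezcontentobject (id INTEGER PRIMARY KEY, name TEXT, status INTEGER);
CREATE TABLE ezcontentobject_tree (node_id INTEGER PRIMARY KEY, contentobject_id INTEGER);
-- Invented sample data: one object with a node, two without.
INSERT INTO ezcontentobject VALUES (100, 'Visible folder', 1);
INSERT INTO ezcontentobject VALUES (101, 'New Folder', 0);
INSERT INTO ezcontentobject VALUES (102, 'New Folder', 0);
INSERT INTO ezcontentobject_tree VALUES (500, 100);
""")

# Objects that have no row in the tree table have no location to show.
QUERY = """
SELECT o.id, o.name, o.status
FROM ezcontentobject o
LEFT JOIN ezcontentobject_tree t ON t.contentobject_id = o.id
WHERE t.node_id IS NULL
ORDER BY o.id
"""

orphans = conn.execute(QUERY).fetchall()
for object_id, name, status in orphans:
    print(object_id, name, "status =", status)
```

Objects returned by this query exist in ezcontentobject but have no node, which would be one explanation for content that is present in the table yet invisible in the admin interface.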

I noticed the owner_id for these objects was set to a non-existent user, so I updated these to reflect the admin user which made no difference.

Any guidance would be very much appreciated :-)

[UPDATE]
I just installed a fresh version of eZ 4.1.3 and installed data_import. Ran the same instruction as above and it worked. There must be some issue with my DB - its schema has changed several times due to at least 3 upgrades and extensions - maybe an issue lies there? Still - no errors in error.log which is odd... I'll keep looking

More...

  • Dumped working DB (With data_import's pre-imported example content-objects), imported into faulty DB: The data_import example data showed-up OK
  • Dumped working DB (Without data_import's pre-imported example content-objects), imported into faulty DB: Re-ran the data_import procedure on faulty DB, and the data_import example data showed-up OK
  • This means:
  1. There's nothing wrong with eZ's codebase in the faulty eZ install
  2. There's something wrong (different) between the working and faulty DBs

In the faulty DB:

  • There are additional extension tables (ezxmlexport, ezgmaplocation and my own customisations - additional columns - to ezrss_export) which should have no impact - right?
  • Some tables have a slightly different column definition order in their CREATE TABLE declarations: ezpending_actions, ezsession, ezworkflow_event, ezuservisit & ezurlalias_ml.
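
The comparison described in this list can be sketched in Python as follows. The function takes two maps of table name to column list (for example gathered by hand from each database) and separates "same columns, different order" from real differences. The column names are placeholders (`col_a`, `custom_col`, ...), not the real eZ Publish schema. Column order is harmless for queries that name their columns, but it can matter for INSERT statements written without a column list.

```python
def compare_columns(working, faulty):
    """Classify per-table differences between two {table: [columns]} maps."""
    report = {}
    for table in sorted(set(working) | set(faulty)):
        a = working.get(table)
        b = faulty.get(table)
        if a is None or b is None:
            report[table] = "missing in one database"
        elif a == b:
            continue  # identical definition order, nothing to report
        elif sorted(a) == sorted(b):
            report[table] = "same columns, different order"
        else:
            report[table] = "different columns"
    return report


# Placeholder column names; only the table names come from this thread.
WORKING = {
    "ezuservisit": ["col_a", "col_b", "col_c"],
    "ezrss_export": ["col_a", "col_b"],
    "ezsession": ["col_a", "col_b"],
}
FAULTY = {
    "ezuservisit": ["col_b", "col_a", "col_c"],
    "ezrss_export": ["col_a", "col_b", "custom_col"],
    "ezsession": ["col_a", "col_b"],
    "ezgmaplocation": ["col_a"],
}

report = compare_columns(WORKING, FAULTY)
for table, verdict in report.items():
    print(table + ":", verdict)
```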

So might there be an issue with some existing data? Is there an admin script or extension you know of that can "clean" the database of "bad" objects? I have already run eZ's cron scripts internal_drafts_cleanup.php, subtreeexpirycleanup.php and staticcache_cleanup.php, which made no difference.
[/UPDATE]

Regards,
Russell

Russell Michell

Sunday 11 April 2010 1:39:35 pm

After about 4 days of working on this I narrowed my problems down to some corrupted data.

This install was originally a 4.0.1rc-2 install that has been upgraded at least 3 times and I had always had problems with urlalias errors in error.log.

I wasn't aware of cleandata.sql or kernel_schema.sql so I imported these and then ran the data_import examples = SUCCESS.

I accidentally found that re-importing data into my DB caused duplicate key 1-0 errors. Commenting out each of the dumped INSERT statements until the error went away located the problem: ezurlalias_ml and ezuservisit had some sort of corrupted data. Commenting these INSERTs out, re-importing the data and then running the data_import examples = SUCCESS.
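
The "comment out INSERTs until the error goes away" approach can be made a little less manual by first grouping the dump's INSERT statements by table, so each table can be re-imported and tested on its own. This is a minimal Python sketch, assuming one INSERT statement per line; the sample dump fragment and its values are invented.

```python
import re
from collections import defaultdict

# Matches the table name at the start of an INSERT statement, with or
# without backticks around it.
INSERT_RE = re.compile(r"^INSERT INTO `?(\w+)`?", re.IGNORECASE)


def inserts_by_table(dump_text):
    """Group INSERT statements from a SQL dump by the table they target."""
    groups = defaultdict(list)
    for line in dump_text.splitlines():
        statement = line.strip()
        match = INSERT_RE.match(statement)
        if match:
            groups[match.group(1)].append(statement)
    return dict(groups)


# Illustrative dump fragment; the values are invented.
SAMPLE_DUMP = """\
-- sample data only
INSERT INTO `ezuservisit` VALUES (10,0,0,0,0);
INSERT INTO `ezurlalias_ml` VALUES (1,2,3);
INSERT INTO `ezuservisit` VALUES (14,0,0,0,0);
"""

groups = inserts_by_table(SAMPLE_DUMP)
for table, statements in groups.items():
    print(table, len(statements))
```

Each group can then be written to its own file and imported separately, which points at the offending table without editing the dump by hand.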

Thanks for your help everyone, I got there in the end.

Nicolas Pastorino

Monday 12 April 2010 2:46:29 am

Hi Russell,

Good to hear you made it, with a little help from us :)

Note for later : never use any form of 4.0.x anymore :)

Cheers !
