data migration / data export

Author Message

nicholas hemley

Tuesday 27 November 2007 3:15:02 am

please could someone outline how I would go about exporting the content / users in ez and migrating it to a different CMS?

for drupal i have found the following:
http://drupal.org/node/5289

is there no easier way / export function?

Cheerio,
Nic

gabriele francescotto

Tuesday 27 November 2007 8:31:17 am

Hi Nicholas,

that example works only for the old eZ publish 2.x
for that version, I wrote a simpler PHP script to migrate all the contents to Drupal, that I can copy here.

But I have to say that, after the migration, I migrated again the contents to eZ publish (3.x version); eZ is definitely more powerful and complete than Drupal, there's no comparison.

gabriele

 


/**
 *
 *
 * this is a script to export data
 * from  eZ publish 2 into Drupal's postgres database
 *
 *
 * create a new database (testdata), containing ez publish's and drupal's tables
 NOTE: if your eZ publish database is set up with the character encoding (LATIN9),
you need to migrate the contents to the UTF-8/Unicode encoding. More information can be found in PostgreSQL documentation.

 * CREATE DATABASE testdata ENCODING='UTF8';

 * pg_dump --encoding=UTF8 ezpublish > ezpublish.sql
 * pg_dump drupal > drupal.sql
 * psql testdata < drupal.sql
 * psql testdata < ezpublish.sql
 *
 * practice command to extract ez publish's table list (tables to remove from the live database):
   psql -U drupaluser -W  -c "SELECT table_name FROM information_schema.tables WHERE table_schema = 'public'" drupal_database > tables.txt

 * run these queries:
 * drop table indexes_ez_drupal; // needed only if you made other migration tests before

*/

set_time_limit( 0 );

$sql = "
CREATE TABLE indexes_ez_drupal ( ez_id int4 NOT NULL, drupal_id int4 NOT NULL );
CREATE TABLE ez_cat ( ez_art int4 NOT NULL, ez_cat int4 NOT NULL, drupal_cat int4 NOT NULL );
CREATE TABLE ez_countries ( ez_art int4 NOT NULL, ez_country int4 NOT NULL, drupal_cat int4 NOT NULL );
ALTER TABLE node ALTER COLUMN title  TYPE varchar(255);
ALTER TABLE node_revisions ALTER COLUMN title  TYPE varchar(255);
";


/**************************************
* MAIN SETTINGS
**************************************/
$number_of_articles=30000;      // maximum number of articles parsed; the script can demand a lot of time to process all the articles
$import_abstract=true;          // not used
$import_body=true;              // not used

// it is possible to run the script with just an article, in order to
// to test the software or to analyse particular cases

$catch_exceptions=false;        // [true|false] true if you want to use the script with a single article, false to ignore this feature
$exception_article =94912;      // ez article ID

// these 2 variables can be useful in case the script stops
// for some reason

$starting_article=0;            // it must be the ID of firt article to process; the previous ones will be ignored
$remove_prev_imported = true;   // [true|false] true to reset the previous imported data, false to save them

$argv = $GLOBALS["HTTP_SERVER_VARS"]["argv"];
include_once( "classes/INIFile.php" );
include_once( "classes/eztemplate.php" );
include_once( "classes/ezhttptool.php" );
require_once('ezaddress/classes/ezcountry.php');
include_once( "ezarticle/classes/ezarticlecategory.php" );
include_once( "ezarticle/classes/ezarticlerenderer.php" );
#include_once( "lib/ezxml/classes/ezxml.php" );
#include_once( "lib/ezxml/classes/ezdomdocument.php" );
#include_once( "lib/ezxml/classes/ezdomnode.php" );
include_once( "ezuser/classes/ezobjectpermission.php" );
include_once( "classes/ezinifile.php" );
include_once( "ezimagecatalogue/classes/ezimage.php" );


$GLOBALS["DEBUG"] = true;
$Override_ParentCategory = 0;

//$ini = new INIFile( "site_custom.ini", false );
//$GLOBALS["INI_site.ini"] =& $ini;

$writePermGroup=-1;
$pathSeparator = '#!/!#';
//$pathSeparator = '-';
if ( $argv[1] != "" )
{
    for ( $i = 1; $i < count( $argv ); $i++ )
    {
        $option = preg_split( "#\=#is", $argv[$i] );
        $argument = $option[0];
        $value = $option[1];
        $replace = false;
        switch ( $argument )
        {
            case "-r":
                $replace = true;
                break;

            case "-s":
                $stripPathElements = $value;
                break;

            case "-sl":
                $stripPathElementsLive = $value;
                break;

            case "-q": print "

";
                return false;
                break;
            case "-p":
                $Override_ParentCategory = $value;
                break;

            case '-l':
            case '--language':
                $Language=$value;
               break;

            case "-f":
                $file = $value;
                break;

            case "--writepermgroup":
                $writePermGroup = $value;
                break;
        }
    }
} else {
        print "
        Please make sure that you run the following queries:

" . $sql . "

        If you just run them, please run again this script with the option '-q'.
";
}

/*#########################################################################################
                                    GLOBAL SETTINGS
#########################################################################################*/

//ARTICLES

/*#######################################################################################*/

$db =& eZDB::globalDatabase();
$category = new eZArticleCategory();


$db->array_query($articles, "select * from ezarticle_article limit $number_of_articles");
print "\n##############\n\nSelecting the ".count($articles)." articles to import in Drupal... please wait\n\n";
print " DONE\n\n";

print "\nLooking for exceptions (articles containing polls or forms)... please wait\n\n";
$db->array_query($articles_forms, "select id, name from ezarticle_article where contents like '%<form%'");
$db->array_query($articles_polls, "select id, name from ezarticle_article where contents like '%<poll%'");

$arr_exceptions = array();
        print "\n       The following articles contain FORMS and will be excluded from the migration: ";
foreach ($articles_forms as $form)
{
        print "
        " . $form[id] .": " . $form[name];
        array_push($arr_exceptions, $form[id]);
}

        print "\n\nThe following articles contain POLLS and will be excluded from the migration: ";
foreach ($articles_polls as $poll)
{
        print "
        " . $poll[id] .": " . $poll[name];
        array_push($arr_exceptions, $poll[id]);
}

print "\n";


/*  INIZIALIZZAZIONE NECESSARIA, DA EFFETTUARE DOPO OGNI TEST

delete from node;
delete from node_revisions;
delete from sequence;
delete sequence node_revisions_vid_seq;


###################################################*/

if ($remove_prev_imported) {
//$db->query('create table indexes_ez_drupal ( ez_id int4 NOT NULL, drupal_id int4 NOT NULL )');
$db->query('delete from indexes_ez_drupal');
$db->query('delete from localizernode');
$db->query('delete from node where nid>20');
$db->query('delete from term_data where tid>16');
$db->query('delete from term_node where nid>16');
$db->query('delete from term_hierarchy where tid>16');
$db->query('delete from ez_cat');
$db->query('delete from ez_countries');
$db->query('delete from node_revisions where nid>20');

}


function utf8_to_html ($data)
    {
    return preg_replace("/([\\xC0-\\xF7]{1,1}[\\x80-\\xBF]+)/e", '_utf8_to_html("\\1")', $data);
    }

function _utf8_to_html ($data)
    {
    $ret = 0;
    foreach((preg_split('//',strrev(chr((ord($data{0}) % 252 % 248 % 240 % 224 % 192) + 128) . substr($data, 1)))) as $k => $v)
        $ret += (ord($v) % 128) * pow(64, $k);
    return "&#$ret;";
    }

$body = '';
$title = '';
$pub_date = '';
$mod_date = '';
$images = array();
$author_id = '';
$external_url = '';
$keywords = '';
$country = ''; // category
$allow_comments = false;
$language = 'english';
$ispublished = true;



   print "

  > Importing articles: ";

if ($remove_prev_imported) $process_article = true; else $process_article=false;

foreach ($articles as $number => $art) {

  if (!$remove_prev_imported) {
        if($starting_article==$art[id]) $process_article=true;
  }
if ($process_article)
 if (!in_array($art[id], $arr_exceptions)) {
  if ($number < $number_of_articles) {


        $article =  new eZArticle ($art[id]);
        $article_id = $article->id();

        print "Processing Article : " . $article->name() . "  ID=" . $article->id();

        $rendered = new eZArticleRenderer ($article);
        $title = str_replace("\\", "", $article->name());
        $teaser = $rendered->renderIntro();
        $body_page = $rendered->renderPage();
        $body = $body_page[1];

        // images related to the selected article
        if (count($images)>0) {
                foreach ($images as $k=>$image) {
                        $img = $image['Image'];
                        $img_language = $article->language();
                        $img_caption = $img->caption();
                        $img_html_caption = $img->fullHTMLCaption( $img_language );
                        $img_path = "http://www.oneworldsee.org".$img->filePath();
                        $img_category = $img->categoryDefinition();
                        $img_name = $img->name();
                        if ($k==0) {
                                $teaser = "<div class=\"ezimage\"><img alt=\"".$img_name."\" src=\"".$img_path."\"/><div class=\"caption\">".$img_caption."</div></div>".$teaser;
                        } else {
                                $body = "<div class=\"ezimage\"><img alt=\"".$img_name."\" src=\"".$img_path."\" /><div class=\"caption\">".$img_caption."</div></div>".$body;
                        }
                }
        }



        $keywords = $article->keywords();
        $allCountries = eZCountry::getFlatList( false );
        $title = str_replace("'", "\'", $title);
        $title = str_replace("&quot;", "\"",$title);
        $title = str_replace("&amp;", "&",$title);
        $title = str_replace("&gt;", ">",$title);
        $title = str_replace("’’", "\"",$title);
        $title = str_replace("’", "\'",$title);
        $title = str_replace("‘", "\'",$title);
        $title = utf8_to_html($title);
$teaser = stripslashes(str_replace("\\", "", $teaser));
        $teaser = str_replace("'", "\'", $teaser);
        $teaser = str_replace("<br />", "
",$teaser);
        $teaser = str_replace("&quot;", "\"",$teaser);
        $teaser = str_replace("&gt;", ">",$teaser);
        $teaser = str_replace("&amp;", "&",$teaser);
        $teaser = str_replace("’’", "\"",$teaser);
        $teaser = str_replace("’", "\'",$teaser);
        $teaser = str_replace("‘", "\'",$teaser);
        $teaser = utf8_to_html($teaser);

        $body = str_replace("\\", "", $body);
        $body = str_replace("'", "\'", $body);
        $body = str_replace("<br />", "
", $body);
        $body = str_replace("&quot;", "\'", $body);
        $body = str_replace("&gt;", ">", $body);
        $body = str_replace("&amp;", "&", $body);
        $body = str_replace("’’", "\"", $body);
        $body = str_replace("’", "\'", $body);
        $body = str_replace("‘", "\'", $body);
        $body = utf8_to_html($body);

        $pub_date = $article->Published;
        $mod_date = $article->Modified;
        $images = $article->images();
        $author = $article->author();



        // USERS MAP: creates authors if they don't exist
        // manage groups and roles... TODO

        // INITIALIZE DRUPAL USERS, FROM EZ PUBLISH AUTHORS

        $user_id = $author->id();
        if ($user_id == '') $user_id=1;

        // find last drupal's userid
        $db->array_query($users, "SELECT last_value FROM users_uid_seq");
        $userid = $users[0]['last_value'];
        // catch ez's author and create a drupal user, includind the password
        if ($author->id()) {
                $db->array_query($authors_passwords, "SELECT password FROM ezuser_user WHERE id=". $author->id() );
                $author_password = $authors_passwords[0]['password'];
        } else $author_password=md5('owpass');
        $db->array_query($drupal_users, "SELECT name FROM users WHERE name='" . $author->login() . "'") ;
        if ( count($drupal_users)>0 ) {
                print "";
        } else { // insert a new author in drupal
                $db->query("INSERT INTO users ( name, signature, pass, mail, status, language )
                                VALUES ('".$author->login()."', '".$author->firstName().
                                " ".$author->lastName()."', '".$author_password."', '".
                                $author->email()."', 1, '".$short_language."' )");
        }

        if ($article->ispublished()==1) $ispublished=1; else $ispublished=0;
        $db->array_query($node_nid_seq, "SELECT last_value from node_nid_seq");
        $next_node_nid_seq= $node_nid_seq[0]['last_value'] + 1;

        $db->query("INSERT INTO node (vid, type, title, uid, status,
                        created,  changed, comment, promote, moderate, sticky)
                        VALUES ($next_node_nid_seq, 'story',
                        '$title', $userid , $ispublished, $mod_date, $pub_date, 2, 0, 0, 0)");
        $intro = $teaser . $link;
        // TODO revisions
        $sql_revision = "INSERT INTO node_revisions ( nid,vid,uid,title,body,teaser,log,timestamp,format) VALUES ($next_node_nid_seq, $next_node_nid_seq, $userid, '$title', '$body', '$intro', 'imported from OW eZ publish', $pub_date, 1)";
        $db->query($sql_revision);
        $db->query("INSERT INTO indexes_ez_drupal (ez_id, drupal_id) values ($article_id, $next_node_nid_seq)");


        // MANAGE EXTERNAL LINKS, ALTERNATIVE TO THE BODY
        $external_link = $article->linkobject();
        if ($article->haslinkurl()) {
                $link_name =str_replace("'", "\'", $external_link->name() );
                $link = '<br /><a class="external_link" target="_blank" href="' . str_replace("'", "\'", $external_link->url()) . '">' . $link_name . '</a>';
        }
                else
                $link = '';


        //take Drupal node_id
        $db->array_query($nextval_arr, "SELECT last_value from node_nid_seq");
        $nextval = $nextval_arr[0]['last_value'];

        // LOOK FOR THE ISOCODE : IsoCode TODO: shows always english
        $language = $article->language();
        $locale = new eZLocale($Language);
        $language_object = new eZLocale($article->language());
        $short_language = substr($language_object->IsoCode, 0, 2);
        $db->query("INSERT into localizernode (nid, locale, pid) values ($nextval, '".$short_language."', $nextval)");

        $categories = $article->categories(true);
        $countries = $article->countries(); // categories

        foreach ($categories as $category) {


          if ( substr($category->pathstring(), 0 , 22 ) == "/0/2373#2373/2481#2481" ) {
            $db->array_query($check, "select ez_cat from ez_cat where ez_cat=" .$category->id() );
            $categ_exists = false;

        // if the category doesn't exist, create a new one
            if ( count($check)==0 ) {
                        print " categ ez: ".$category->id() ."   ";
                $category_name = str_replace("'", "\'", $category->name());
                $category_name = str_replace("’", "\'", $category_name);
                $category_name = str_replace("‘", "\'", $category_name);
                $category_name = str_replace("&gl;", "<", $category_name);
                $category_name = str_replace("&gt;", ">", $category_name);
                $category_desc = str_replace("'", "\'", $category->description());
                $category_desc = str_replace("&gt;", ">", $category_desc);
                $category_desc = str_replace("&gl;", "<", $category_desc);
                $category_desc = str_replace("‘", "\'", $category_desc);
                $category_desc = str_replace("’", "\'", $category_desc);
        // if the category doesn't exist, check if there are drupal categories with the same name
                $db->array_query($check_name, "SELECT tid FROM term_data WHERE name='". $category_name ."'");
                if ( count($check_name)>0 ) {
                        $categ_exists=true;
                        // new drupal's term id
                        $drupal_cat = $check_name[0]['tid'];
                        $db->query("INSERT INTO ez_cat (ez_art, ez_cat, drupal_cat)
                                VALUES (".$article->id().", ".$category->id(). ", " . $drupal_cat . ")");
                        $db->array_query($check_country_name_node, "SELECT tid FROM term_node WHERE nid ='".$next_node_nid_seq."' AND tid='". $drupal_cat ."'");
                                                if (count($check_country_name_node)==0)
                        $db->query("INSERT into term_node values (". $next_node_nid_seq .", ". $drupal_cat . ")");
                }

// taxonomy


 if (!$categ_exists)
                        $db->query("INSERT into term_data ( vid, name, description) values (3, '" . $category_name . "', '" . $category_desc . "' )");
                $db->array_query($nextval_arr_term, "SELECT last_value from term_data_tid_seq");
                // new drupal's term id
                $drupal_cat = $nextval_arr_term[0]['last_value'];
                if (!$categ_exists)
                        $db->query("INSERT into term_hierarchy ( tid, parent) values ($drupal_cat , 0)");
                $db->query("INSERT into ez_cat (ez_art, ez_cat, drupal_cat) values (".$article->id().", ".$category->id(). ", " . $drupal_cat . ")");
            } else {

            // the category exists: find drupal's category
                $db->array_query($drupal_cat_arr, "SELECT drupal_cat FROM ez_cat WHERE ez_cat=".$category->id());
                $drupal_cat = $drupal_cat_arr[0]['drupal_cat'];

            }

        // insert drupal's node into drupal's category

                if (!$categ_exists) {
                        $db->array_query($check_country_name_node, "SELECT tid FROM term_node WHERE nid ='".$next_node_nid_seq."' AND tid='". $drupal_cat ."'");
                        if (count($check_country_name_node)==0)
                                $db->query("INSERT into term_node values (". $next_node_nid_seq .", ". $drupal_cat . ")");
                }
          }


        } // foreach $categories as $category

// ez countries as drupal taxonomy
        foreach ($countries as $country) {

                foreach ($allCountries as $c) {
                        if ($c->id()==$country) $cc = $c;
                }
                $db->array_query($check, "SELECT ez_country from ez_countries where ez_country=" . $country);
                $country_exists = false;
                if ( count($check)==0 ) {
                        $db->array_query($check_country_name, "SELECT tid FROM term_data WHERE name='". $cc->name() ."'");
                        if (count($check_country_name)>0) {
                                $country_exists = true;
                                $country_drupal_id=$check_country_name[0]['tid'];
                                $db->query("INSERT INTO ez_countries (ez_art, ez_country, drupal_cat) values (".$article->id().", ".$country. ", " . $country_drupal_id . ")");
                                $db->array_query($check_country_name_node, "SELECT tid FROM term_node WHERE nid ='".$next_node_nid_seq."' AND tid='". $country_drupal_id ."'");
                                if (count($check_country_name_node)==0)
                                        $db->query("INSERT INTO term_node VALUES (". $next_node_nid_seq .", ". $country_drupal_id . ")");
                        } else {
                                $country_exists = false;
                                $db->query("INSERT into term_data ( vid, name, description) values (2, '" . str_replace("'", "\'", $cc->name()) . "', 'Country' )");
                        $db->array_query($nextval_arr_term, "SELECT last_value from term_data_tid_seq");
                        $drupal_cat = $nextval_arr_term[0]['last_value'];
                        $db->query("INSERT into term_hierarchy ( tid, parent) values ($drupal_cat , 0)");
                        $db->query("INSERT into ez_countries (ez_art, ez_country, drupal_cat) values (".$article->id().", ".$country. ", " . $drupal_cat . ")");
                        $db->query("INSERT INTO term_node VALUES (". $next_node_nid_seq .", ". $drupal_cat . ")");
                        }
                } else {

                // the category exists: find drupal's category
                        $db->array_query($drupal_cat_arr, "SELECT drupal_cat FROM ez_countries WHERE ez_country=".$country);
                        $drupal_cat = $drupal_cat_arr[0]['drupal_cat'];
                // insert drupal's node into drupal's category
                        $db->array_query($check_country_name_node, "SELECT tid FROM term_node WHERE nid ='".$next_node_nid_seq."' AND tid='". $drupal_cat ."'");
                        if (count($check_country_name_node)==0)
                        $db->query("INSERT INTO term_node values (". $next_node_nid_seq .", ". $drupal_cat . ")");
                }

        }



        print "  Successfully processed!
";
  } // if
 } //if
} // for articles

 

www.opencontent.it
via Verdi 19, 38100 Trento

Paul Leclercq

Tuesday 27 November 2007 9:15:25 am

personnaly I would recommend using rss feeds to export your data.

There is a very good exemple on how to export your data using rss feeds on this page:
http://pwet.fr/blog/des_fils_rss_sur_mesure_dans_ez_publish

unfortunatly this is in french, but basically, all you need to do is create an overide for your pagelayout, and create an rss view. In which you will put the following code:

<?xml version="1.0" encoding="iso-8859-1"?>
<rss version="2.0">
<channel>
<language>fr-FR</language>
{$module_result.content}
</channel>
</rss>

You will then simply need to create overrides to display the various content:
design/<mon_design>/templates/node/view/rss_full.tpl

You will then need an import function to import your data in drupal.

this simple function will read the url you whish. All you will need to do the is get the info in the xml using the xml parser you would like and import the data in your database. This way you could

function getFlux($flux)
{
$handle = fopen($flux, "rb");
$contents = '';
while (!feof($handle)) {
$contents .= fread($handle, 8192);
}
fclose($handle);
return($contents);
}

Powered by eZ Publish™ CMS Open Source Web Content Management. Copyright © 1999-2014 eZ Systems AS (except where otherwise noted). All rights reserved.

eZ debug

Timing: Jan 18 2025 14:56:24
Script start
Timing: Jan 18 2025 14:56:24
Module start 'layout'
Timing: Jan 18 2025 14:56:24
Module start 'content'
Timing: Jan 18 2025 14:56:25
Module end 'content'
Timing: Jan 18 2025 14:56:25
Script end

Main resources:

Total runtime1.5048 sec
Peak memory usage4,096.0000 KB
Database Queries59

Timing points:

CheckpointStart (sec)Duration (sec)Memory at start (KB)Memory used (KB)
Script start 0.00000.0057 589.0313152.6250
Module start 'layout' 0.00570.0035 741.656339.4453
Module start 'content' 0.00921.4941 781.1016660.2578
Module end 'content' 1.50330.0015 1,441.359436.1641
Script end 1.5048  1,477.5234 

Time accumulators:

 Accumulator Duration (sec) Duration (%) Count Average (sec)
Ini load
Load cache0.00340.2279160.0002
Check MTime0.00140.0940160.0001
Mysql Total
Database connection0.00080.053010.0008
Mysqli_queries1.445996.0862590.0245
Looping result0.00060.0399570.0000
Template Total1.470497.720.7352
Template load0.00180.122320.0009
Template processing1.468597.587020.7343
Template load and register function0.00020.015010.0002
states
state_id_array0.00100.065210.0010
state_identifier_array0.00280.184720.0014
Override
Cache load0.00150.1027460.0000
Sytem overhead
Fetch class attribute can translate value0.00080.055930.0003
Fetch class attribute name0.00140.090050.0003
XML
Image XML parsing0.00460.305630.0015
class_abstraction
Instantiating content class attribute0.00000.000750.0000
General
dbfile0.00480.3190210.0002
String conversion0.00000.000640.0000
Note: percentages do not add up to 100% because some accumulators overlap

Templates used to render the page:

UsageRequested templateTemplateTemplate loadedEditOverride
1node/view/full.tplfull/forum_topic.tplextension/sevenx/design/simple/override/templates/full/forum_topic.tplEdit templateOverride template
3content/datatype/view/ezxmltext.tpl<No override>extension/community_design/design/suncana/templates/content/datatype/view/ezxmltext.tplEdit templateOverride template
10content/datatype/view/ezxmltags/paragraph.tpl<No override>extension/ezwebin/design/ezwebin/templates/content/datatype/view/ezxmltags/paragraph.tplEdit templateOverride template
7content/datatype/view/ezxmltags/line.tpl<No override>design/standard/templates/content/datatype/view/ezxmltags/line.tplEdit templateOverride template
2content/datatype/view/ezimage.tpl<No override>extension/sevenx/design/simple/templates/content/datatype/view/ezimage.tplEdit templateOverride template
1content/datatype/view/ezxmltags/literal.tpl<No override>extension/community/design/standard/templates/content/datatype/view/ezxmltags/literal.tplEdit templateOverride template
1print_pagelayout.tpl<No override>extension/community/design/community/templates/print_pagelayout.tplEdit templateOverride template
 Number of times templates used: 25
 Number of unique templates used: 7

Time used to render debug report: 0.0001 secs