David Ogilo
Monday 03 August 2009 3:20:50 am
Hi, I have been trying to understand how to import HTML data into eZ Publish. Here is a copy of the code:
case 'ezxmltext':
    $parser = new eZSimplifiedXMLInputParser( $contentObjectID, false, 0, false );
    $attributeValue = $dataString;
    // Strip line breaks and turn tabs into spaces before parsing
    $attributeValue = str_replace( "\r", '', $attributeValue );
    $attributeValue = str_replace( "\n", '', $attributeValue );
    $attributeValue = str_replace( "\t", ' ', $attributeValue );
    $document = $parser->process( $attributeValue );
    if ( !is_object( $document ) )
        $cli->output( 'Error in xml parsing' );
    $dataString = eZXMLTextType::domString( $document );
Somehow it strips out all image and object tags from the HTML data and imports only links, text, and other HTML tags. I have tried changing all image tags in the HTML data to <custom name="img"> and it still doesn't work. Does anyone know of a way to resolve this issue? Thanks, David
Monday 03 August 2009 3:37:51 am
Hello David,
In BC ImportCSV we did something similar. Perhaps this example will help: http://svn.projects.ez.no/bcimportcsv/trunk/extension/bcimportcsv/bin/bccsvjoomlacontenttablehtmlimport.php
case 'ezxmltext':
    if( $attribute->ContentClassAttributeIdentifier == 'caption' ) {
        $dataString = null;
    }
    // Filter for images, process, store and link
    if ( is_numeric( $imageContainerID ) )
    {
        $matches = array();
        $pattern = '/<img\b[^>]*\bsrc=(\\\\["\'])?((?(1)(?:(?!\1).)*|[^\s>]*))(?(1)\1)[^>]*>/si';
        preg_match_all( $pattern, $dataString, $matches );
        $matches_count = count( $matches[2] );
        // $cli->output( print_r( $matches ) );
        if ( $matches_count > 0 )
        {
            $toReplace = array();
            $replacements = array();
            $objectComplex = true;
            $cli->output( "Matches Count: " . $matches_count );
            // $imagenr = 0;
            foreach ( $matches[2] as $key => $match )
            {
                // Remember the full <img> tag so it can be swapped out later
                $toReplace[] = $matches[0][$key];
                $imageURL = trim( str_replace(chr(32), '%20', str_replace(' ', '%20', str_replace('\"', '', $matches[2][$key] ) ) ) );
                if ( substr($imageURL, 0, 1) == '/') {
                    $imageURL = 'http://www.diariodelhuila.com' . $imageURL;
                }
                $cli->output("Image link: " . $imageURL);
                // Placeholder test image: this URL, not $imageURL, is what gets copied below
                $imageTempURL = 'http://optics.kulgun.net/Blue-Sky/red-sunset-casey1.jpg';
                $imageTempFileName = 'def.jpg';
                $imageFileName = basename( $imageURL );
                $cli->output( 'Image File Name: '.$imageFileName );
                $imagePath = "/tmp/imgtmp/" . $imageTempFileName;
                // $imagePath = "/tmp/imgtmp" . $imagenr++ . ".jpg";
                if ( !copy($imageTempURL, $imagePath ) )
                {
                    $cli->output("Error copying image from remote server");
                    // Drop the image from the text if it could not be fetched
                    $replacements[] = '';
                }else {
                    // Create an image object below the image container node
                    $imageClass = eZContentClass::fetchByIdentifier( 'image' );
                    $imageObject = $imageClass->instantiate( $creator );
                    $imageObjectID = $imageObject->attribute( 'id' );
                    $imageNodeAssignment = eZNodeAssignment::create( array(
                        'contentobject_id' => $imageObject->attribute( 'id' ),
                        'contentobject_version' => $imageObject->attribute( 'current_version' ),
                        'parent_node' => $imageContainerID,
                        'is_main' => 1
                    ) );
                    $imageVersion = $imageObject->version( 1 );
                    $imageVersion->setAttribute( 'modified', $createDate );
                    $imageVersion->setAttribute( 'status', eZContentObjectVersion::STATUS_DRAFT );
                    $imageAttributes = $imageObject->attribute( 'contentobject_attributes' );
                    $cli->output("Image attributes:" . $cli->output( $imageAttributes, true ) );
                    $imageAttributes[0]->fromString( $imageTempFileName );
                    $imageAttributes[2]->fromString( $imagePath );
                    $operationResult = eZOperationHandler::execute( 'content', 'publish',
                        array( 'object_id' => $imageObjectID, 'version' => 1 ) );
                    // Replace the <img> tag with an embed of the new image object
                    $replacements[] = '<embed href="ezobject://' . $imageObject->attribute( 'id' ) . '" size="original" />';
                }
            }
            $dataString = str_replace( $toReplace, $replacements, $dataString );
            $cli->output( print_r( $toReplace ) );
            // $cli->output( print_r( $replacements ) );
            // $cli->output( $dataString ); echo "\n";
        }
        unset( $toReplace ); unset( $replacements ); unset( $imageAttributes ); unset( $imageNodeAssignment );
    }
    $parser = new eZSimplifiedXMLInputParser( $contentObjectID, false, 0 );
    $document = $parser->process( $dataString );
    // $dataString = eZXMLTextType::domString( $document );
    // $cli->output( print_r( $dataString ) );
    // get links
    $links = $document->getElementsByTagName( 'link' );
    if( is_object( $links ) && is_numeric( $links->length ) && $links->length > 0 ) {
        // $cli->output( print_r( $links ) );
        // for each link
        for( $li = 0; $li < $links->length; $li++ )
        {
            $linkNode = $links->item( $li );
            $url_id = $linkNode->getAttribute( 'url_id' );
            $cli->output( 'Link Item Count: '. $li );
            $cli->output( 'Link Item ID: '. $url_id );
            if( is_numeric( $url_id ) ) {
                // create link between url (link) and object
                $eZURLObjectLink = eZURLObjectLink::create( $url_id,
                    $contentObject->attribute('current_version') );
                $cli->output( print_r( $eZURLObjectLink ) );
                // $cli->output( print_r( $url_id ) );
            }
        }
    }
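The core idea of the image step (find each <img> tag, import the image, put an <embed> tag in its place) can be shown in isolation. The following is only a minimal sketch in plain PHP, assuming PHP 5.3+ for closures. The function name replaceImagesWithEmbeds(), the $resolveObjectId callback and the example.com base URL are hypothetical and not part of BC ImportCSV or eZ Publish. The pattern is a simplified one that handles quoted src values only, and the callback stands in for the download-and-publish code above.

```php
<?php
/**
 * Replace every <img> tag in $html with an eZ Publish <embed> tag.
 *
 * $resolveObjectId is a hypothetical callback: it receives the absolute image
 * URL and returns the ID of the content object created for that image, or
 * null if the image could not be imported.
 */
function replaceImagesWithEmbeds( $html, $resolveObjectId, $baseUrl = '' )
{
    // Simplified pattern: matches <img ...> and captures a quoted src value.
    $pattern = '/<img\b[^>]*\bsrc\s*=\s*(["\'])(.*?)\1[^>]*>/si';

    return preg_replace_callback(
        $pattern,
        function ( $m ) use ( $resolveObjectId, $baseUrl )
        {
            // Encode spaces so the URL can be fetched.
            $url = str_replace( ' ', '%20', trim( $m[2] ) );
            // Make site-relative paths absolute.
            if ( substr( $url, 0, 1 ) === '/' )
            {
                $url = $baseUrl . $url;
            }
            $objectId = call_user_func( $resolveObjectId, $url );
            if ( !is_numeric( $objectId ) )
            {
                // Drop the image if it could not be imported.
                return '';
            }
            return '<embed href="ezobject://' . (int) $objectId . '" size="original" />';
        },
        $html
    );
}

// Usage with a stub resolver that pretends every image became object 42.
$html = '<p>Text <img alt="x" src="/img/a b.jpg" /> more</p>';
echo replaceImagesWithEmbeds( $html, function ( $url ) { return 42; }, 'http://example.com' ), "\n";
```

The resulting string would then be handed to eZSimplifiedXMLInputParser as in the script above. Using a callback keeps the tag rewriting separate from the object creation, so the rewriting can be tried without a running eZ Publish installation.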
Cheers, Heath
Brookins Consulting | http://brookinsconsulting.com/
Certified | http://auth.ez.no/certification/verify/380350
Solutions | http://projects.ez.no/users/community/brookins_consulting
eZpedia community documentation project | http://ezpedia.org
Rainer Krauss
Monday 03 August 2009 6:27:06 am
Hi David, it depends on what you actually want to achieve.
Would you like to get something into eZ Publish that contains
- an image or
- a file?
Would you like to have the image / file shown, or should it be available for download? If you want to include files / images as links, you can add them to the media library and reference them via links to eznode:// or ezobject:// followed by the node or object ID.
The tags eZXMLObject accepts are documented here:
http://ez.no/doc/ez_publish/technical_manual/4_0/reference/xml_tags
You would have to look at replacing your tags with valid ones.
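As a rough illustration of replacing HTML tags with valid ones, here is a small sketch in plain PHP that builds link and embed tags pointing at eznode:// or ezobject:// references. The helper names ezReference(), ezLinkTag() and ezEmbedTag() are hypothetical, the IDs 123 and 456 are made up, and the embed syntax is taken from the import script earlier in this thread. Please check the tag reference above for the attributes your version accepts.

```php
<?php
/**
 * Build an internal reference such as "eznode://123" or "ezobject://456".
 * Hypothetical helper, not part of eZ Publish.
 */
function ezReference( $type, $id )
{
    if ( $type !== 'eznode' && $type !== 'ezobject' )
    {
        throw new InvalidArgumentException( 'Unsupported reference type: ' . $type );
    }
    return $type . '://' . (int) $id;
}

/** Build a link tag for a file that should be offered for download. */
function ezLinkTag( $href, $label )
{
    return '<link href="' . $href . '">' . htmlspecialchars( $label ) . '</link>';
}

/** Build an embed tag for an image that should be shown inline. */
function ezEmbedTag( $href, $size )
{
    return '<embed href="' . $href . '" size="' . $size . '" />';
}

// Example: a download link to node 123 and an inline image from object 456.
echo ezLinkTag( ezReference( 'eznode', 123 ), 'Annual report' ), "\n";
echo ezEmbedTag( ezReference( 'ezobject', 456 ), 'original' ), "\n";
```

The strings produced this way are meant to take the place of the original <a> and <img> tags before the text is passed to the XML parser.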
Best wishes, Rainer
Sébastien Antoniotti
Friday 28 August 2009 9:35:39 am
Hi, I'm reviving this topic because I'm doing a similar import, but I would like to create objects and nodes like this:
$billet_attributes = array(
    'titre' => "article 1",
    'contenu' => "<p>some xhtml content</p>"
);
$params = array();
$params['parent_node_id'] = '2';
$params['class_identifier'] = 'wx_billet';
$params['attributes'] = $billet_attributes;
$object = eZContentFunctions::createAndPublishObject( $params );
The problem is that the 'contenu' attribute is not set correctly. I think this is because I need to parse the XHTML content first, but I don't know how, because in my case I don't have the $contentObjectID needed in $parser = new eZSimplifiedXMLInputParser( $contentObjectID, false, 0, false );
Thanks in advance !
eZ Publish Freelance
web : http://www.webaxis.fr