Import RSS as literal HTML

michael depetrillo

Tuesday 15 July 2008 10:14:42 am

Hello everyone,

I need to import my RSS feed as literal HTML.

To do this I changed the setEZXMLAttribute method in cronjobs/rssimport.php (modified version below).

The RSS feed imports OK, but on the front end I do not see all the HTML tags.

If I open the object in the back end with the editor enabled, I still do not see all the HTML tags.

If I open it in the back end with the editor disabled, I see all the HTML. I can then hit save, and after that both the editor and the front end display the correct HTML.

What piece am I missing here?

The feed I am working with is http://www.cnbc.com/id/20040302/rssCmp/97305/device/rss/rss.xml

function setEZXMLAttribute( $attribute, $attributeValue, $link = false )
{
    //include_once( 'kernel/classes/datatypes/ezxmltext/handlers/input/ezsimplifiedxmlinputparser.php' );

    $contentObjectID = $attribute->attribute( "contentobject_id" );

    // echo $attributeValue ."\n";

    // ADDED FOR LP
    $contentClassID = $attribute->attribute( 'contentclassattribute_id' );
    if ( $contentClassID == 206 )
    {
        $inputData = "<?xml version=\"1.0\" encoding=\"utf-8\"?>\n";
        $inputData .= "<section xmlns:image=\"http://ez.no/namespaces/ezpublish3/image/\"\n";
        $inputData .= "         xmlns:xhtml=\"http://ez.no/namespaces/ezpublish3/xhtml/\"\n";
        $inputData .= "         xmlns:custom=\"http://ez.no/namespaces/ezpublish3/custom/\">\n";
        $inputData .= "<paragraph>\n<literal class=\"html\">";
        $inputData .= strip_tags( $attributeValue, "<span><a><p><h1><h2><h3><h4><h5><ul><li><br><table><tr><td><th><tbody><tfoot><hr><img><embed><object>" );
        $inputData .= "</literal></paragraph>";
        $inputData .= "</section>";

        $domString = $inputData;
    // END ADDED FOR LP
    }
    else
    {
        $parser = new eZSimplifiedXMLInputParser( $contentObjectID, false, 0, false );

        $attributeValue = str_replace( "\r", '', $attributeValue );
        $attributeValue = str_replace( "\n", '', $attributeValue );
        $attributeValue = str_replace( "\t", ' ', $attributeValue );

        $document = $parser->process( $attributeValue );
        if ( !is_object( $document ) )
        {
            $cli = eZCLI::instance();
            $cli->output( 'Error in xml parsing' );
            return;
        }
        $domString = eZXMLTextType::domString( $document );
    }

    // echo $domString;

    $attribute->setAttribute( 'data_text', $domString );
    $attribute->store();
}

Guillaume Kulakowski

Tuesday 15 July 2008 1:37:46 pm

Hello Michael,

I use eZ for a planet: http://planet.fedora-fr.org.

For that, I store the RSS content in a Text block.
To get valid XHTML content, I run it through tidy and a cleaning parser.

You can take inspiration from my code:
http://trac.llaumgui.com/browser/ez_publish/myutils/trunk/cronjobs/planet.php (look at setEZTXTAttribute)

My blog : http://www.llaumgui.com (not in eZ Publish ;-))
eZC on RHEL : http://blog.famillecollet.com/pages/Config-en
eZC on Fedora : just "yum install php-channel-ezc"

michael depetrillo

Thursday 17 July 2008 12:12:37 pm

What does the disabled editor do to the HTML before it saves it to a DOM document?

Or, to put the question the other way around:

What does the editor do to the HTML from the DOM document before it displays it?
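
A possible explanation for the symptoms above, not confirmed against the eZ Publish source: in the modified setEZXMLAttribute, the feed HTML is concatenated into the <literal class="html"> element unescaped, so tags such as <p> or <a> end up as child XML elements of the literal instead of as its text content. If the view and the editor only read the text content of the literal, those tags would disappear, and re-saving with the editor disabled would fix things because the input handler presumably escapes the markup on save. A minimal sketch of escaping the HTML before storing it follows. The helper name buildLiteralHtmlXml and its $allowedTags parameter are hypothetical and not part of eZ Publish; only strip_tags and htmlspecialchars are standard PHP.

```php
<?php
/**
 * Hypothetical helper (not an eZ Publish API): wraps feed HTML in an
 * ezxmltext <literal class="html"> block, escaping the markup so that it
 * is stored as character data rather than as child XML elements.
 */
function buildLiteralHtmlXml( $html, $allowedTags )
{
    // Same tag whitelist idea as in the original function.
    $clean = strip_tags( $html, $allowedTags );

    // Escape &, <, > and both quote styles. Entities already present in the
    // feed (e.g. &amp;) are escaped again on purpose, so that the XML parser
    // hands back exactly the original HTML.
    $escaped = htmlspecialchars( $clean, ENT_QUOTES, 'UTF-8' );

    $xml  = "<?xml version=\"1.0\" encoding=\"utf-8\"?>\n";
    $xml .= "<section xmlns:image=\"http://ez.no/namespaces/ezpublish3/image/\"\n";
    $xml .= "         xmlns:xhtml=\"http://ez.no/namespaces/ezpublish3/xhtml/\"\n";
    $xml .= "         xmlns:custom=\"http://ez.no/namespaces/ezpublish3/custom/\">\n";
    $xml .= "<paragraph><literal class=\"html\">" . $escaped . "</literal></paragraph>";
    $xml .= "</section>";

    return $xml;
}

// Example input; the URL is a placeholder.
$sample = '<p>Stocks &amp; bonds <a href="http://example.com/">more</a></p>';
echo buildLiteralHtmlXml( $sample, '<p><a>' ), "\n";
```

In the import script, the returned string would take the place of $inputData before the call to $attribute->setAttribute( 'data_text', $domString ). Comparing the data_text value before and after a manual save with the editor disabled should show whether escaping is really the difference.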
