Problem: corrupt contentobjects

Author Message

Jonny Bergkvist

Saturday 25 August 2007 12:26:15 pm

Hi,

I have a problem on all eZ databases I have where users have published content for a while.

Objects exist whose current_version has no content attributes stored. This makes the object corrupt: trying to edit it causes a fatal error.

I've run these eZ databases for a while, at first without database transactions enabled (transactions require InnoDB). The incomplete objects are probably from before transactions were enabled.

I want to know if others also have such incomplete objects in their eZ databases.

And I would like you fellows to suggest a way to resolve the situation.
These corrupt objects cause trouble for me when I run scripts that change objects.

Try running this php-cli script to see if you have this problem:

<?php

//script for finding objects where ezcontentobject.current_version has no attributes stored
//jonny.bergkvist@hit.no

include_once( 'kernel/common/template.php' );
include_once( "lib/ezutils/classes/ezhttptool.php" );
include_once( 'lib/ezutils/classes/ezcli.php' );
include_once( 'kernel/classes/ezscript.php' );
include_once( 'lib/ezdb/classes/ezdb.php' );

$cli =& eZCLI::instance();
$script =& eZScript::instance();
$script->initialize();
$db =& eZDB::instance();
set_time_limit( 0 );

$arrayResult = $db->arrayQuery( "SELECT id, current_version FROM ezcontentobject" );

$i = 0;
foreach( $arrayResult as $item ) {

    //check if current_version has content attributes
    $hasAttribute = $db->arrayQuery( "SELECT contentobject_id FROM ezcontentobject_attribute WHERE contentobject_id = " . $item['id'] . " AND version = " . $item['current_version'] );

    //or: check if object has no attributes of any version stored
    //$hasAttribute = $db->arrayQuery( "SELECT contentobject_id FROM ezcontentobject_attribute WHERE contentobject_id = " . $item['id'] );

    if ( empty( $hasAttribute ) ) {
        echo "Corrupt object: " . $item['id'] . "\n";
        $i++;
    }
}
echo "total corrupt objects: " . $i . "\n";

$script->shutdown();

?>

On a database with about 40 000 objects I find a total of 378 corrupt objects.
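
The script above runs one lookup query for every row in ezcontentobject. As a sketch only, the same check could probably be expressed as a single LEFT JOIN. buildCorruptObjectQuery() is a hypothetical helper name, not part of eZ Publish, and the query has not been tested against a real eZ database.

```php
<?php
// Hypothetical helper (not part of eZ Publish): returns one SQL statement that
// lists every object whose current_version has no rows in
// ezcontentobject_attribute. Table and column names are the ones used in the
// script above.
function buildCorruptObjectQuery()
{
    return "SELECT o.id, o.current_version
            FROM ezcontentobject o
            LEFT JOIN ezcontentobject_attribute a
                   ON a.contentobject_id = o.id
                  AND a.version = o.current_version
            WHERE a.contentobject_id IS NULL";
}

// Possible use inside the CLI script, after $db has been set up:
// $corrupt = $db->arrayQuery( buildCorruptObjectQuery() );
// echo "total corrupt objects: " . count( $corrupt ) . "\n";
```

Rows without a matching attribute come back with NULL in the joined columns, so the WHERE clause keeps only the objects the loop would have reported.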

Heath

Tuesday 28 August 2007 3:18:16 pm

Hello Jonny!

Thank you for starting this forum thread to discuss this issue in eZ Publish, and for your most valuable php-cli script snippet.

I am certain the discussion alone is very valuable for many users, both long-time and new.

Like you, I have encountered this problem with corrupt content objects in past customer projects. I cannot speak with certainty about the cause, though I have my own ideas ...

I would challenge you to submit your script as a contribution on http://projects.ez.no

Such a project would help new users find this useful tool more quickly, and would allow further collaboration and development of the script as a reliable tool for testing a site for a corrupt content object database.

Cheers!
Heath

Brookins Consulting | http://brookinsconsulting.com/
Certified | http://auth.ez.no/certification/verify/380350
Solutions | http://projects.ez.no/users/community/brookins_consulting
eZpedia community documentation project | http://ezpedia.org

Jonny Bergkvist

Thursday 20 September 2007 4:22:41 am

Hi,

I think I have resolved the problem :-)

I paste my new script here; it's based on the first version, which only reported the errors.
Please try it out and give feedback.

On my eZ database I've found two different kinds of corrupt objects. They must have become corrupt while transactions were not enabled, when eZ Publish or the user broke out of a content object creation or a content object edit:
1: a content object exists only in table ezcontentobject, with no references to the object in any other ezcontentobject* table. SOLUTION: delete from ezcontentobject and ezcontentobject_name.
2: a content object has attributes etc., but its ezcontentobject.current_version points to a version that doesn't have attributes. SOLUTION: roll back the version number.
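
The two scenarios can be told apart from the list of versions that actually have attribute rows. The following is a minimal sketch, not taken from the script in this thread: classifyContentObject() and its return keys are hypothetical names. Instead of assuming that current_version - 1 is usable, it picks the highest stored version below current_version. Whether that version is a published one would still have to be checked separately.

```php
<?php
// Hypothetical helper, not part of eZ Publish.
// $attributeVersions: version numbers that have rows in
// ezcontentobject_attribute for one object (strings or integers).
function classifyContentObject( $currentVersion, array $attributeVersions )
{
    $currentVersion = (int)$currentVersion;
    $versions = array_map( 'intval', $attributeVersions );

    if ( count( $versions ) === 0 )
    {
        // Scenario 1: no attributes of any version are stored.
        return array( 'state' => 'no_attributes', 'rollbackTo' => null );
    }
    if ( in_array( $currentVersion, $versions, true ) )
    {
        return array( 'state' => 'ok', 'rollbackTo' => null );
    }

    // Scenario 2: find the highest stored version below current_version.
    // rollbackTo stays null when no such version exists.
    $rollbackTo = null;
    foreach ( $versions as $version )
    {
        if ( $version < $currentVersion && ( $rollbackTo === null || $version > $rollbackTo ) )
        {
            $rollbackTo = $version;
        }
    }
    return array( 'state' => 'wrong_current_version', 'rollbackTo' => $rollbackTo );
}
```

A null rollbackTo in the second scenario means there is nothing safe to roll back to, so such an object is better inspected by hand.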

This script reports the corrupt objects and does the required job (updated version 2007.10.09):

<?php

// Script for finding and handling content objects that are not completely created
// That may occur under some circumstances when using a database without transactions enabled
//
// 2007.10.09, jonny.bergkvist@hit.no

// $doUpdate, true or false. Set to false for a dry test run
$doUpdate = true;

include_once( 'kernel/common/template.php' );
include_once( "lib/ezutils/classes/ezhttptool.php" );
include_once( 'lib/ezutils/classes/ezcli.php' );
include_once( 'kernel/classes/ezscript.php' );
include_once( 'lib/ezdb/classes/ezdb.php' );

$cli =& eZCLI::instance();
$script =& eZScript::instance();
$script->initialize();
$db =& eZDB::instance();
set_time_limit( 0 );

$arrayResult1 = $db->arrayQuery( "SELECT id, contentclass_id, current_version FROM ezcontentobject" );
echo "First checking for content objects that have no contentobject_attributes at all...\n";

$i = 0;
foreach( $arrayResult1 as $item ) {
    //check if object has no attributes of any version stored
    $hasAttribute = $db->arrayQuery( "SELECT contentobject_id FROM ezcontentobject_attribute WHERE contentobject_id = " . $item['id'] );

    if ( empty( $hasAttribute ) ) {
        if ( $doUpdate ) {
            echo "Corrupt object, no attributes: " . $item['id'] . ". Deleting corrupt object with no attributes...\n";
            $db->query( "DELETE FROM ezcontentobject WHERE ezcontentobject.id = " . $item['id'] );
            $db->query( "DELETE FROM ezcontentobject_name WHERE ezcontentobject_name.contentobject_id = " . $item['id'] );
            //contentclass_id 4 is the user class in a default installation
            if ( $item['contentclass_id'] == 4 ) {
                $db->query( "DELETE FROM ezuser WHERE ezuser.contentobject_id = " . $item['id'] );
            }
        }
        else {
            echo "Corrupt object, no attributes: " . $item['id'] . ", current_version:" . $item['current_version'] . "\n";
        }
        $i++;
    }
}
echo "Total corrupt objects with no attributes: " . $i . "\n\n";

$arrayResult2 = $db->arrayQuery( "SELECT id, current_version FROM ezcontentobject" );
echo "Then checking for content objects that have contentobject_attributes, but not of the current_version...\n";

$i = 0;
foreach( $arrayResult2 as $item ) {
    //check if current_version has content attributes
    $hasAttribute = $db->arrayQuery( "SELECT contentobject_id FROM ezcontentobject_attribute WHERE contentobject_id = " . $item['id'] . " AND version = " . $item['current_version'] );

    if ( empty( $hasAttribute ) ) {
        if ( $doUpdate ) {
            //assumes the version just below current_version has attributes; this is not verified here
            $previousCurrentVersion = $item['current_version'] - 1;
            echo "Corrupt object: " . $item['id'] . ", current_version: " . $item['current_version'] . ". Setting back to version: " . $previousCurrentVersion . "\n";
            $db->query( "UPDATE ezcontentobject SET current_version = " . $previousCurrentVersion . " WHERE id = " . $item['id'] );
        }
        else {
            echo "Corrupt object: " . $item['id'] . ", current_version: " . $item['current_version'] . "\n";
        }
        $i++;
    }
}
echo "Total objects with wrong current_version: " . $i . "\n";

$script->shutdown();

?>

Piotrek Karaś

Friday 04 July 2008 4:42:12 pm

Hey guys,

Similar experiences here. Here's what I did at first:

    public static function isContentobjectRegisterable( $contentobjectID )
    {
        $isRegistrable = false;
        
        $db = eZDB::instance();
        // This query checks the most important values and relations between
        // the object's values across different tables. It is used to determine
        // whether it should be possible to register this object, or whether it
        // should be considered invalid.
        $query = "SELECT COUNT(*) AS count_valid 
            FROM ezcontentobject co, ezcontentobject_tree cot 
            WHERE co.id=" . (int)$contentobjectID . " 
            AND co.id=cot.contentobject_id 
            AND cot.main_node_id>0 
            AND co.status=" . eZContentObject::STATUS_PUBLISHED ;
        $result = $db->arrayQuery( $query, array( 'column' => 'count_valid' ) );
        $validCount = (int)$result[0];
        if( $validCount > 0 )
        {
            $isRegistrable = true;
        }

        return $isRegistrable;
    } 

I decided not to fix the problem here, but to make my extension aware of possible corruption.

Yet, I will definitely check out Jonny's scripts.

Cheers,
Piotrek

--
Company: mediaSELF Sp. z o.o., http://www.mediaself.pl
eZ references: http://ez.no/partners/worldwide_partners/mediaself
eZ certified developer: http://ez.no/certification/verify/272585
eZ blog: http://ez.ryba.eu

Gabriele Francescotto

Thursday 04 February 2010 5:15:11 am

Hi,

I've faced a similar problem, in particular when importing content using the data_import extension. If the procedure stops abruptly for some reason, or if you decide to interrupt the import script, objects and attributes are generated in the database, but not the respective node (ezcontentobject_tree).

I've extended Jonny's script, and I would be very grateful if the eZ guys could take a look.

$arrayResult3 = $db->arrayQuery( "SELECT id, name FROM ezcontentobject" );

$count = 0;
$message = "Check nodes' consistency"."\n";
$cli->output( $cli->stylize( 'cyan', $message ), false );
foreach( $arrayResult3 as $item) {

        $hasNode = $db->arrayQuery( "SELECT contentobject_id FROM ezcontentobject_tree WHERE contentobject_id = " . $item['id'] );
        if ( empty( $hasNode ) ) {
                $count++;
        //      echo "The node does not exist for the object: " . $item['name'] . ", ID " . $item['id'] . "\n";
        }
}
$message = "Check completed. Affected objects:" . $count ."\n";
$cli->output( $cli->stylize( 'cyan', $message ), false );

if ( $doUpdate ) {
    // Note: drafts and trashed objects have no row in ezcontentobject_tree either,
    // so review the list from a dry run before deleting anything.
    foreach( $arrayResult3 as $item ) {
        $hasNode = $db->arrayQuery( "SELECT contentobject_id FROM ezcontentobject_tree WHERE contentobject_id = " . $item['id'] );
        if ( empty( $hasNode ) ) {
            $message = "The node does not exist for the object: " . $item['name'] . ", ID " . $item['id'] . "\n";
            $cli->output( $cli->stylize( 'cyan', $message ), false );
            $db->query( "DELETE FROM ezcontentobject WHERE id = " . $item['id'] );
            $message = "Object removed; ";
            $cli->output( $cli->stylize( 'cyan', $message ), false );
            // The dependent tables reference the object through contentobject_id
            $db->query( "DELETE FROM ezcontentobject_name WHERE contentobject_id = " . $item['id'] );
            $message = "Object name removed; ";
            $cli->output( $cli->stylize( 'cyan', $message ), false );
            //$db->query( "DELETE FROM ezcontentobject_link WHERE from_contentobject_id = " . $item['id'] . " OR to_contentobject_id = " . $item['id'] );
            //$message = "Object links removed; ";
            //$cli->output( $cli->stylize( 'cyan', $message ), false );
            $db->query( "DELETE FROM ezcontentobject_version WHERE contentobject_id = " . $item['id'] );
            $message = "Object versions removed; ";
            $cli->output( $cli->stylize( 'cyan', $message ), false );
            $db->query( "DELETE FROM ezcontentobject_attribute WHERE contentobject_id = " . $item['id'] );
            $message = "Object attributes removed" . "\n";
            $cli->output( $cli->stylize( 'cyan', $message ), false );
        }
    }
    $message = "Corrupted elements have been removed from the database.\n";
    $cli->output( $cli->stylize( 'cyan', $message ), false );
}

OpenContent [free software solutions]
via Verdi 19, 38100 Trento (TN) Italy
www.opencontent.it
skype : gabricocek1
twitter: gabricocek

Gaetano Giunta

Thursday 04 February 2010 5:51:08 am

Thanks for the script Gabriele;

I think the best solution here would be to add the missing transaction wrappers in the data_import extension. Could you provide more detailed info about where they are missing?

Principal Consultant International Business
Member of the Community Project Board

Gabriele Francescotto

Wednesday 24 February 2010 3:48:07 am

Sure!

The problem is related to the "ImportOperator" class; the transaction wrappers should be added to the "run" function:

function run()
{
    $db = eZDB::instance();
    $db->begin();

    [...]

    $db->commit();
}
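
A begin/commit pair alone leaves the transaction open if the import throws halfway through. As a hedged sketch, the wrapper could roll back explicitly on failure. runInTransaction() is a hypothetical helper, not part of data_import; it assumes only that the database object offers begin(), commit() and rollback(), and that the import code signals errors with exceptions.

```php
<?php
// Hypothetical helper, not part of data_import or eZ Publish.
// $db must provide begin(), commit() and rollback(); $work is any callable
// that performs the writes belonging to one unit of work.
function runInTransaction( $db, $work )
{
    $db->begin();
    try
    {
        $result = call_user_func( $work );
        $db->commit();
        return $result;
    }
    catch ( Exception $e )
    {
        // Undo partial writes so that no object is left without its node.
        $db->rollback();
        throw $e;
    }
}
```

Wrapping each imported object in its own unit of work, rather than the whole run, would keep transactions short; which granularity fits data_import is a design choice this sketch does not settle.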

