ezpublish for LONG term data storage

Author Message

Gabriel Ambuehl

Friday 03 December 2004 2:39:49 am

We have a project where the underlying data will be used for at least ten years, quite possibly even longer. Now I'm wondering whether I should use ezpublish for that or if we should go for a real custom SQL schema (with "proper" tables). ezpublish's data structure is, while very flexible, extremely complex to deal with (meaning: near impossible from outside ezpublish). Did anybody ever try to convert ezpublish data structures to a more conventional relational schema? I'm understandably somewhat hesitant (especially as there's family involved...) to commit to this, so I'd appreciate any comments on this issue.

Visit http://triligon.org

Tore Jørgensen

Friday 03 December 2004 5:10:18 am

Could you tell a bit more about the project? Do you need to access the data from outside eZ, or do you just need to be able to export your data to some unknown program sometime in the future? How well structured is your data? Is it only textual data, or various kinds of binary data as well? How much data is it? I'm no expert on eZ, but I think even an expert will need more information :-)

Generally speaking, to get data out of the eZ database, I think you'll need to use some part of eZ. We did make some stuff that retrieved data from eZ objects for a Flash program, and it was a lot of work compared to just opening a well-known SQL table. However, if you need a lot of the other eZ stuff, it's probably worth it. You could also look into storing some information in normal SQL tables from a module or something like that.

Tore Skobba

Friday 03 December 2004 6:01:00 am

Hmm, 10 years... You know that even MySQL could be gone in 10 years... My bet then would be to go for XML files or something plain text. If it is only text data (like integers etc.), you could simply export all data from eZ to an XML text file or something. Just write a template which dumps the data. I have been working directly on the data in the database, and it is lots of joins. However, now that MySQL has support for nested SQL queries, it is a bit easier, I think. But you still need to keep your tongue straight.

Gabriel Ambuehl

Friday 03 December 2004 6:06:08 am

Oh my. I must have been around economists for too long (who generally don't give you anything useful at all). Of course that information isn't enough.

Basically, it's going to be a rather huge product database for an antique dealer. It's gonna be mostly structured text (which ez excels at as we all know) and obviously scans and pics of the stuff offered, so not a lot of binaries all in all.

ez + enhancedobjectrelation has all the features I really need to develop a complex data model:
As an example, take a book:

author: an object relation to the authors; the reverse relation makes it easy to list all the books by an author
publication date: date
title: string
genre: relation to proper genres, for the same reason as the authors
publisher: relation to the publisher

This makes for a somewhat slow, but incredibly flexible datamodel (of course I could do it in SQL myself, but it's a PITA to write a decent GUI for it). Performance isn't much of a worry (we have fast servers at hand if need be) and traffic will be light.
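For comparison, here is a rough sketch of what the same book model could look like as conventional "proper" tables. It uses Python's built-in sqlite3 module purely for illustration; every table and column name is made up for this example and has nothing to do with the schema eZ publish uses internally. The reverse relation (all books by an author) becomes a join over a link table.

```python
import sqlite3

# Hypothetical table and column names, invented for this sketch only.
SCHEMA = """
CREATE TABLE author    (id INTEGER PRIMARY KEY, name TEXT NOT NULL);
CREATE TABLE genre     (id INTEGER PRIMARY KEY, name TEXT NOT NULL);
CREATE TABLE publisher (id INTEGER PRIMARY KEY, name TEXT NOT NULL);
CREATE TABLE book (
    id               INTEGER PRIMARY KEY,
    title            TEXT NOT NULL,
    publication_date TEXT,
    publisher_id     INTEGER REFERENCES publisher(id)
);
-- Link tables, so a book can have several authors or genres.
CREATE TABLE book_author (
    book_id   INTEGER REFERENCES book(id),
    author_id INTEGER REFERENCES author(id),
    PRIMARY KEY (book_id, author_id)
);
CREATE TABLE book_genre (
    book_id  INTEGER REFERENCES book(id),
    genre_id INTEGER REFERENCES genre(id),
    PRIMARY KEY (book_id, genre_id)
);
"""


def books_by_author(conn, author_id):
    """Reverse relation: titles of all books linked to one author."""
    rows = conn.execute(
        "SELECT b.title FROM book b "
        "JOIN book_author ba ON ba.book_id = b.id "
        "WHERE ba.author_id = ? ORDER BY b.title",
        (author_id,),
    )
    return [row[0] for row in rows]


# Placeholder data, not real catalogue entries.
conn = sqlite3.connect(":memory:")
conn.executescript(SCHEMA)
conn.execute("INSERT INTO author VALUES (1, 'Example Author')")
conn.execute("INSERT INTO publisher VALUES (1, 'Example Press')")
conn.executemany(
    "INSERT INTO book VALUES (?, ?, ?, ?)",
    [(1, "Second Title", "1890-01-01", 1), (2, "First Title", "1885-06-15", 1)],
)
conn.executemany("INSERT INTO book_author VALUES (?, ?)", [(1, 1), (2, 1)])

print(books_by_author(conn, 1))
```

The schema itself is the easy part; the editing GUI on top of it is the work that eZ saves.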

As for the volume of the data: there are literally tens, maybe hundreds of thousands of items, but I don't know whether they will all be put online. I'd estimate somewhere in the lower 5 digits.

Essentially I don't need anything else to talk to it, but I need the possibility to get data out of it if need be. I'm guessing it will be some work (more than with a plain SQL data model), but using the ez classes it should be doable.

As for MySQL being gone: not much of a problem. MySQL's SQL dialect can be fed easily (meaning little s/r needed) to a whole bunch of RDBMS. Dumping the stuff to XML crossed my mind as well. I wonder if there's already some hidden feature for that?
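To give an idea of what such an XML dump might contain, here is a small sketch in Python. It is not an eZ publish feature: the record layout (dicts with id, title, publication_date and author_ids) and the element names are assumptions made up for this example, and in practice the records would have to come out of eZ first, for instance through a template.

```python
import xml.etree.ElementTree as ET


def dump_books(books):
    """Serialise a list of book records to an XML string.

    Each record is a dict with the hypothetical keys
    id, title, publication_date and author_ids.
    """
    root = ET.Element("books")
    for book in books:
        node = ET.SubElement(root, "book", id=str(book["id"]))
        ET.SubElement(node, "title").text = book["title"]
        ET.SubElement(node, "publication_date").text = book["publication_date"]
        # Store only the id of each related author, so the relation
        # can be rebuilt later from a separate author dump.
        for author_id in book["author_ids"]:
            ET.SubElement(node, "author", ref=str(author_id))
    return ET.tostring(root, encoding="unicode")


# Placeholder record, not real data.
sample = [
    {
        "id": 7,
        "title": "Example Title",
        "publication_date": "1890-01-01",
        "author_ids": [1, 2],
    }
]
xml_text = dump_books(sample)
print(xml_text)
```

Writing the related object's id instead of repeating its data keeps the dump small and keeps the relations recoverable.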

Visit http://triligon.org

Alex Jones

Friday 03 December 2004 7:54:35 am

Well, as Tore Skobba noted you could use templates to generate XML, and for that matter if another format is preferred 10 years from now, you should be able to build your template accordingly. At least, that's what I'm counting on. I'm not aware of any hidden features that will let you dump straight into XML without creating the templates. But, this could be a good project for eZ systems, or one of the partners should you need someone to develop it for you.

Alex
[ bald_technologist on the IRC channel (irc.freenode.net): #eZpublish ]

When in doubt, clear the cache.

Paul Borgermans

Friday 03 December 2004 10:03:28 am

Just a short response:

Long term persistence of data is one of our core requirements, along with data modelling ... and we chose ez publish for this :-)

The heavily normalised db schema of ez publish is indeed not the way I would "migrate". In its simplest form, you could create a template that just produces a "dump" file for import into a simpler relational schema (non-normalised).

You can even try to create an HTML table and copy/paste this into MS Excel or OOo directly.

I'm not worrying about long term data persistence.

Have a nice weekend

-paul (after this silence again for another week -- load avalanche coming)

eZ Publish, eZ Find, Solr expert consulting and training
http://twitter.com/paulborgermans

Marco Zinn

Saturday 04 December 2004 11:53:21 am

Hi Gabriel,
I agree with the others: if you're unsure about "what will happen in the next 10 years" with ezPublish, MySQL and your load of data... use ezPublish anyway ;)
As you mentioned, ez should be able to handle your data model nicely.

If you feel you need to store your data, or you need to migrate it later, create one or more templates that can give you a "dump" of your data in CSV, TXT, XML or some other "cross-platform" data format. I'm not sure how to deal with binary data (incl. images), though. But I guess this can be converted into XML data structures (ez is probably doing this for exporting data into packages).

Marco
http://www.hyperroad-design.com

Tore Jørgensen

Sunday 05 December 2004 11:10:49 pm

I assume that if/when you want to move to a new platform, it will be some kind of relational data model on the new platform as well. I think the key to getting the data across will be to keep the relations between objects like book, author and images. It will not be a problem to get text information across (except that the new system might want more metadata), and popular image formats like jpg and gif can be converted to something else. If you end up writing one template that exports all the books and one that exports all the authors, it should be possible to recreate the relations if your templates write the author's node id or object id in both the author list and the book list. I agree with the others: it's not likely to be considerably more work to get the data into a new system from eZ than it would be from another system.
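As a rough illustration of that last point, here is a small Python sketch that rejoins two such exports on the author's object id. The CSV layout, the column names and the sample rows are all invented for this example; the real exports would be whatever your templates write out.

```python
import csv
import io

# Hypothetical exports: one file per content class, with the author's
# object id written into both of them.
AUTHORS_CSV = """object_id,name
10,Example Author
11,Another Author
"""

BOOKS_CSV = """object_id,title,author_object_id
100,Example Title,10
101,Another Title,11
102,Third Title,10
"""


def load_rows(text):
    """Parse CSV text into a list of dicts keyed by the header row."""
    return list(csv.DictReader(io.StringIO(text)))


def books_with_authors(books_text, authors_text):
    """Recreate the book -> author relation from the two exports."""
    authors = {row["object_id"]: row["name"] for row in load_rows(authors_text)}
    # A book pointing at an id missing from the author list gets None,
    # which makes dangling relations easy to spot after a migration.
    return [
        (book["title"], authors.get(book["author_object_id"]))
        for book in load_rows(books_text)
    ]


for title, author in books_with_authors(BOOKS_CSV, AUTHORS_CSV):
    print(title, "-", author)
```

The same idea carries over to images and genres: as long as every export carries the ids, the relations can be rebuilt on the new platform.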

Powered by eZ Publish™ CMS Open Source Web Content Management. Copyright © 1999-2014 eZ Systems AS (except where otherwise noted). All rights reserved.
