Massive users import

laurent le cadet

Tuesday 31 July 2007 6:33:03 am

Hi,

I need to import 13,000 users into an eZ site.

Each user already has a six-digit number which could be used as the password.

I know there are some extensions that could more or less deal with this, but given the very large number of users I would like some community feedback.

Has anyone already faced this problem?
Any hints?

Regards.

Laurent

Felipe Jaramillo

Wednesday 01 August 2007 6:27:18 am

Hi Laurent,

We have previously handled several data import jobs successfully. The last job involved importing about 150,000 content elements into eZ. It took a few days but worked without many problems.

The process is quite straightforward using the API classes.

Some of my notes on this:

- The import must be done through the CLI, as any web-based extension will time out.
- The existing user import classes will come in handy to analyze how attributes are mapped.
- Take a look at bin/php/ezcsvimport.php for CSV import
- We have done most of our imports using XML as a source
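
As an illustration of preparing the source data before handing it to a CLI import, here is a minimal sketch in Python. It is not part of eZ Publish and does not call its API. The field names (first_name, last_name, email, number) and the column order are assumptions made for the example; the real order has to match whatever the import script and the user class expect.

```python
import csv
import io
import re

# Exactly six ASCII digits; kept as a string so leading zeros survive.
SIX_DIGITS = re.compile(r"[0-9]{6}")

def build_user_rows(records):
    """Validate source records and return (rows, rejected).

    Each record is a dict with hypothetical keys:
    first_name, last_name, email, number.
    """
    rows, rejected, seen = [], [], set()
    for rec in records:
        number = str(rec.get("number", "")).strip()
        email = str(rec.get("email", "")).strip().lower()
        # Reject bad numbers, unusable emails, and duplicate emails.
        valid = SIX_DIGITS.fullmatch(number) and "@" in email and email not in seen
        if not valid:
            rejected.append(rec)
            continue
        seen.add(email)
        rows.append([rec["first_name"], rec["last_name"], email, number])
    return rows, rejected

def rows_to_csv(rows):
    """Serialize validated rows to CSV text (hypothetical column order)."""
    buf = io.StringIO()
    csv.writer(buf, lineterminator="\n").writerows(rows)
    return buf.getvalue()
```

Rejected records are returned rather than silently dropped, so they can be reviewed and fixed before the actual import runs.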

Let me know if you need any more help. I could provide you with the basic script we use for your reference if you provide an email.

Regards,

Felipe

Felipe Jaramillo
eZ Certified Extension Developer
http://www.aplyca.com | Bogotá, Colombia

laurent le cadet

Wednesday 01 August 2007 6:34:35 am

Hi Felipe,

Thanks for your valuable hints.
I'll start from there and may ask you for more in the near future.

Regards.

Laurent

Lazaro Ferreira

Wednesday 01 August 2007 7:44:33 am

Hi Laurent,

We recently imported a similar number of users and related objects. Before importing, we looked at the available import extensions and the eZ Publish import/export scripts. None of them suited our import job, because we needed to import the users, then the related objects, and link them during the import (using ezrelationlist attributes). Tailoring the data source wasn't an option.

We developed some scripts to do the job, but we think this is something eZ Publish should provide: some kind of import framework for more advanced tasks during the import, such as linking related objects, creating objects conditionally, re-using external password hashes, etc.

On the other hand, if you can tailor your data source to the eZ Publish import script (ezcsvimport.php), it shouldn't be a problem to import user objects alone from the command line.
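
The users-then-related-objects-then-link job described above can be structured as a pass that creates everything followed by a pass that links by source ID. The Python sketch below shows only that bookkeeping. The callbacks create_object and link_objects are hypothetical stand-ins for the real CMS calls, and the record keys (source_id, related_ids) are assumptions; this is not the script Lazaro's team wrote.

```python
def import_with_relations(users, related, create_object, link_objects):
    """Create all objects first, then link users to related objects.

    create_object(kind, record) must return the new object's ID.
    link_objects(user_id, related_ids) attaches the relations.
    Both are hypothetical callbacks supplied by the caller.
    """
    id_map = {}

    # Pass 1: create objects and remember source ID -> new ID.
    for user in users:
        id_map[("user", user["source_id"])] = create_object("user", user)
    for item in related:
        id_map[("related", item["source_id"])] = create_object("related", item)

    # Pass 2: resolve each user's relations through the ID map.
    unresolved = []
    for user in users:
        targets = []
        for related_source_id in user.get("related_ids", []):
            key = ("related", related_source_id)
            if key in id_map:
                targets.append(id_map[key])
            else:
                # Keep dangling references for later inspection.
                unresolved.append((user["source_id"], related_source_id))
        if targets:
            link_objects(id_map[("user", user["source_id"])], targets)

    return id_map, unresolved
```

Because linking only starts once every object exists, the order of records in the source does not matter, and dangling references are reported instead of aborting the run.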

Lazaro
http://www.mzbusiness.com

laurent le cadet

Wednesday 01 August 2007 7:51:14 am

Thanks Lazaro.

I may also ask you for a little more help after some digging.

Regards.

Laurent

Felipe Jaramillo

Wednesday 01 August 2007 10:21:03 am

I agree with Lazaro about the need for a more robust, fast, foolproof way of importing large amounts of data.

We talked about this with other members of the community at the eZ Conference 2007, and some ideas came up. There is also an OpenFunding suggestion for integration with Vamosa, a company that handles content migration for large CMS solutions. See http://ez.no/community/open_funding/suggestions_for_new_functionality/vamosa_integration

Regards,

Felipe

Felipe Jaramillo
eZ Certified Extension Developer
http://www.aplyca.com | Bogotá, Colombia

Lazaro Ferreira

Wednesday 01 August 2007 11:32:19 am

Hi Felipe,

Does Vamosa support importing content from non-CMS sources, such as legacy database applications?

Regards

Lazaro
http://www.mzbusiness.com

Betsy Gamrat

Thursday 02 August 2007 8:11:52 pm

Assuming the user input is just user data, without additional information, it is really similar to a flat file and the import is fairly straightforward.

I have run many imports on eZ with many types of content. Some of my ideas are:

Back up the database.

Break the import into chunks, so if something goes wrong, recovery is easier.

Remember that an import tends to be a one-shot event. The code usually only has to run once, so it can be simple, brute-force code.

Converting the data to a convenient format is often as much work as the import itself, so custom code may actually be more cost-effective than pursuing a more flexible tool.

I do agree a standardized import would be nice, but in this case, I think a simple utility is fine.
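
The chunking idea can be sketched as a small, throwaway Python utility that splits one CSV file into numbered pieces, each of which can then be imported and checked on its own. The file naming (users_001.csv and so on) and the default chunk size are assumptions made for illustration.

```python
import csv
from pathlib import Path

def split_csv(source, out_dir, chunk_size=1000):
    """Split a CSV file into files of at most chunk_size rows each.

    Returns the list of chunk paths in order.
    """
    out_dir = Path(out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)
    paths = []
    with open(source, newline="", encoding="utf-8") as src:
        chunk = []
        for row in csv.reader(src):
            chunk.append(row)
            if len(chunk) == chunk_size:
                paths.append(_write_chunk(out_dir, len(paths) + 1, chunk))
                chunk = []
        if chunk:
            # Write the final, possibly shorter, chunk.
            paths.append(_write_chunk(out_dir, len(paths) + 1, chunk))
    return paths

def _write_chunk(out_dir, index, rows):
    """Write one chunk to a hypothetical users_NNN.csv file."""
    path = out_dir / f"users_{index:03d}.csv"
    with open(path, "w", newline="", encoding="utf-8") as dst:
        csv.writer(dst).writerows(rows)
    return path
```

With 13,000 users and 1,000 rows per chunk this produces 13 files, so a failure partway through means re-running one file rather than the whole import.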
