Large customer base - New project


Lars Eirik R

Tuesday 15 February 2011 3:20:13 am

Hi.

We are currently looking at undertaking a project where a large number of customers need to be imported into eZ from a CRM. The number today is approximately 500 000 customers, with up to approximately 10 different potential roles in the system.

Does anyone have any advice on how to scale such a large client base in terms of db? We are looking at using the clustered solution, but we are not really sure if this is required.

The sites (multi-language) will be hosted in the cloud and may of course be scaled to our requirements.

Any thoughts on this matter?

Thiago Campos Viana

Tuesday 15 February 2011 4:03:00 am

Don't know if it helps, but there's a clustering tutorial.

There's also a server architecture tutorial.

Anyway, here's a list that could help:

http://share.ez.no/learn/ez-publish/using-the-squid-reverse-proxy-to-improve-ez-publish-performance

http://share.ez.no/learn/ez-publish/ez-publish-performance-optimization-part-1-of-3-introduction-and-benchmarking

http://share.ez.no/learn/ez-publish/ez-publish-performance-optimization-part-2-of-3-identifying-trouble-spots-by-debugging

http://share.ez.no/learn/ez-publish/ez-publish-performance-optimization-part-3-of-3-practical-cache-and-template-solutions

eZ Publish Certified Developer: http://auth.ez.no/certification/verify/376924

Twitter: http://twitter.com/tcv_br

Marko Žmak

Tuesday 15 February 2011 4:24:47 am

Lars, there are no general recipes; it all depends on many factors. The complexity of the site, its refresh rate and the expected site traffic are among the most important.

For good site optimization, it's very important that you understand your site and know how eZP works internally.

Two of the key issues are:

  • doing eZ fetches smartly
  • good cache configuration

For example, I have a news site with more than 500.000 objects (which produces eZ tables with >500.000 and >2.000.000 rows) running on two servers (one for web and one for DB) without DB clustering and without a web accelerator. The site has a very high refresh rate and still runs perfectly.

Here are some resources I have dealt with lately that might help:

  • http://projects.ez.no/ezsi
  • http://share.ez.no/blogs/marko-zmak/when-ezsi-doesn-t-do-it
  • http://projects.ez.no/saarchive (my archiving extension)

and some good tutorials for optimizing performance:

  • http://share.ez.no/learn/ez-publish/ez-publish-performance-optimization-part-1-of-3-introduction-and-benchmarking
  • http://share.ez.no/learn/ez-publish/ez-publish-performance-optimization-part-2-of-3-identifying-trouble-spots-by-debugging
  • http://share.ez.no/learn/ez-publish/ez-publish-performance-optimization-part-3-of-3-practical-cache-and-template-solutions/(language)/eng-GB

P.S. I could also give you a few good pointers for importing all this data.

--
Nothing is impossible. Not if you can imagine it!

Hubert Farnsworth

Lars Eirik R

Wednesday 16 February 2011 2:15:20 am

Thanks for all the pointers.

We have been looking at the infrastructure options and are considering running without a clustered solution, that is, using the standard handler for file and database access in eZ.

We are considering having several eZ Publish instances access the same DB. Our hope is that we can use a dedicated GPFS file system to "fool" eZ Publish into using a var folder that looks local to eZ, whereas our GPFS configuration actually maps it to a dedicated server mounted on each of the eZ Publish instances.

Is this feasible, or do we have to use the clustered solution (using the mount point in the config option) to get this working successfully?
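
One quick sanity check for the shared-mount idea is to confirm that a file written through one instance's var path is immediately readable through another's. The sketch below is only an illustration under assumptions: the function name `shared_var_visible` and the marker-file naming are hypothetical and not part of eZ Publish, and it compares two directory paths on a single host. On real servers the write and the read would run on different machines.

```python
import os
import uuid


def shared_var_visible(write_dir, read_dir):
    """Write a marker file via write_dir and report whether read_dir sees it.

    Hypothetical helper: returns True only if both paths resolve to the
    same underlying storage, which is what a shared var folder requires.
    """
    name = f".shared-check-{uuid.uuid4().hex}"
    token = uuid.uuid4().hex
    write_path = os.path.join(write_dir, name)
    read_path = os.path.join(read_dir, name)

    with open(write_path, "w", encoding="utf-8") as fh:
        fh.write(token)
    try:
        try:
            with open(read_path, encoding="utf-8") as fh:
                return fh.read() == token
        except FileNotFoundError:
            # The marker never appeared on the reading side.
            return False
    finally:
        # Always clean up the marker so the var folder is left untouched.
        os.remove(write_path)
```

A check like this says nothing about locking or cache consistency under load; it only shows that the mount is wired up the way the instances expect.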

Marko Žmak

Wednesday 16 February 2011 3:15:40 am

Lars, this sounds like a good idea, I have been thinking about it too, but never had the time nor need to do it.

You should bear in mind that the main thing to be concerned about is the bottleneck. This is the part where most optimization is needed.

For example, if your bottleneck is database query execution, then having different servers serving data from the same DB server won't help much. In such a case, you should consider a database cluster.

But if your bottleneck is PHP execution, reading cache files and serving the pages to users, then your idea should be a good solution.
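
The reasoning above can be sketched as a small decision helper. This is only an illustration under assumptions: the `RequestTiming` type, the `classify_bottleneck` function and the 50% threshold are hypothetical choices, and the timings would have to come from your own measurements (for example, total runtime and query time per request).

```python
from dataclasses import dataclass


@dataclass
class RequestTiming:
    total_sec: float  # wall-clock time for the whole request
    db_sec: float     # part of that time spent in database queries


# Advice text paraphrased from the post above.
ADVICE = {
    "database": "consider a database cluster",
    "web": "add web servers sharing one DB and one var folder",
}


def classify_bottleneck(samples, db_share_threshold=0.5):
    """Return "database" or "web" based on the share of time spent in the DB.

    The threshold is an assumed cut-off, not a recommended value.
    """
    total = sum(s.total_sec for s in samples)
    if total <= 0:
        raise ValueError("need at least one sample with a positive total time")
    db_share = sum(s.db_sec for s in samples) / total
    return "database" if db_share >= db_share_threshold else "web"


if __name__ == "__main__":
    measured = [RequestTiming(1.0, 0.9), RequestTiming(0.8, 0.7)]
    kind = classify_bottleneck(measured)
    print(kind, "->", ADVICE[kind])
```

A single averaged share hides a lot (slow outliers, cached versus uncached pages), so treat the result as a starting point for deeper profiling rather than a verdict.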

As a general architecture guideline, I suggest having a separate DB server for the database, and a separate WEB server for storing files and serving requests (this includes storing DB data and file data on separate disks). The DB server should generally have more RAM and fast disks, and the WEB server should have more CPU and storage space.

--
Nothing is impossible. Not if you can imagine it!

Hubert Farnsworth

