This article examines the life-cycle of an eZ Publish project and offers suggestions for performance optimization that can be implemented during each stage of the deployment.
Planning is a major contributor to the success of a project. As with any sophisticated Content Management System, projects based on eZ Publish benefit from early planning and a complete understanding of the tasks that need to be performed.
Defining the scale of the project and listing the tasks associated with each project phase is critical to successful implementation. Performance is part of this planning process because it is a basic requirement. Performance expectations must be clearly defined using relevant criteria. Defining and meeting these criteria is a best practice for designing the right solution for a customer.
Criterion | Description |
---|---|
Simultaneous visitors | How many users will browse the website at the same time? |
Page views | How many pages will be displayed in a 24 hour period? |
Content objects | How many pieces of content (content objects) will be stored in the eZ Publish system? |
Content updates | How many times per day / hour will site content be added or modified? |
Simultaneous requests | What is the expected maximum number of simultaneous requests? To what degree should performance be impacted during the peak? (Peak hours usually occur at the same time every day. During these times, the number of visitors on your website can increase dramatically.) |
Load time | On average, how quickly should eZ Publish pages load? How quickly do you expect eZ Publish to answer requests? 1 to 2 seconds, 0.5 to 1 second, less than 0.5 seconds? |
Average content size | What is the average size of a page on your website (including external files)? Pages containing large external media objects (such as images) use more resources than pages containing only text. |
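
The criteria in the table can be turned into rough capacity targets with simple arithmetic. The sketch below shows one way to do so. The function name and all three input figures (100,000 page views per day, 20% of daily traffic in the peak hour, 1.5 requests reaching eZ Publish per page view) are illustrative assumptions, not values from a real project.

```python
def estimate_request_rates(page_views_per_day, peak_hour_share, requests_per_page_view):
    """Return (average, peak-hour) requests per second reaching eZ Publish.

    All three inputs are assumptions to replace with your own project's numbers.
    """
    requests_per_day = page_views_per_day * requests_per_page_view
    average_rps = requests_per_day / (24 * 3600)
    # Requests served during the busiest hour, spread over its 3600 seconds.
    peak_rps = requests_per_day * peak_hour_share / 3600
    return average_rps, peak_rps


average, peak = estimate_request_rates(
    page_views_per_day=100_000,   # hypothetical figure
    peak_hour_share=0.20,         # hypothetical figure
    requests_per_page_view=1.5,   # hypothetical figure
)
print(f"average: {average:.2f} req/s, peak hour: {peak:.2f} req/s")
```

With these inputs the peak-hour rate is several times the daily average, which is why the peak figure is usually the one to size the platform against.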
Large or mission-critical projects generally require load balancing. Load balancing enables multiple servers to respond to page requests, reducing the load on individual servers and improving the robustness of the system. It is important to design the load-balancing architecture for eZ Publish during the early phases of the project, rather than postponing it until the end.
Defining the performance criteria at the beginning of the project also enables the implementation team to devise tests to validate the system. These tests are not only useful for project acceptance and sign-off; they are also a valuable troubleshooting resource, providing a performance base-line that can be used as a comparison should the system's responsiveness later degrade.
From the point of view of performance, the most important part of the specification process is selecting and defining the caching mechanisms that eZ Publish will use. Caching must be considered during each stage of development and deployment, and has a critical impact on performance.
The Template Compilation and Override Cache settings may be disabled during development as described in the documentation of the site.ini. These settings should never be disabled in a production environment, because that would considerably slow down your website. For example, the Template compilation cache drastically reduces server load, because it generates pure PHP scripts from the eZ Publish template, so there is no need to parse the templates during production runtime.
The upcoming version 3.8 of eZ Publish introduces template development mode, which negates the need to disable these settings.
ViewCaching is the fundamental mechanism for content caching, and should always be enabled. However, ViewCaching can be disabled for individual pages if required. When enabled, eZ Publish will automatically create multiple cached versions based on roles and the view parameters.
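To see why one page can produce several cached versions, it helps to think of the cache entry as being keyed on everything that can change the rendered output. The following is a conceptual sketch only; it is not how eZ Publish names or stores its view cache files, and `view_cache_key` and its arguments are hypothetical.

```python
import hashlib


def view_cache_key(node_id, role_ids, view_parameters):
    """Build a cache key from the inputs that can change the rendered page.

    Conceptual model only: two visitors share a cached page when they have
    the same set of roles and request the same view parameters.
    """
    roles = ",".join(str(role) for role in sorted(role_ids))
    params = "&".join(f"{name}={value}" for name, value in sorted(view_parameters.items()))
    raw_key = f"{node_id}|{roles}|{params}"
    return hashlib.sha256(raw_key.encode("utf-8")).hexdigest()


# Same roles in a different order map to the same cached version.
anonymous = view_cache_key(50, [1], {"offset": 0})
editor_a = view_cache_key(50, [3, 5], {"offset": 0})
editor_b = view_cache_key(50, [5, 3], {"offset": 0})
```

In this model, every additional role combination or view parameter value multiplies the number of cached versions, which is worth keeping in mind when designing roles and paginated views.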
Cache blocks are used to cache dynamic parts of pagelayout.tpl and dependent templates (that is, templates that are not handled by ViewCaching). They store the HTML result of dynamically generated template code in a text file. This file can be loaded the next time the same code is requested. As an example, cache blocks are often used to store navigation menus. Note that too many cache blocks can have a negative effect on performance. More information on the proper use of cache blocks can be found in the eZ Publish documentation.
Finally, static caching is the most powerful cache mechanism for many projects. It stores and reads static HTML files, so that PHP is not called when a page is loaded. Performance can be boosted tremendously by using static caching, but you have to determine during the early stages of the project where it should be implemented, what parts of the project cannot use it and how cache expiration should be handled.
The cache strategy that will be used is directly dependent on the project specifications. For example, features that display personalized information cannot use StaticCache. Content that performs real-time rendering cannot use cache blocks. Therefore, the eZ Publish implementer needs site specifications in order to identify the best cache solution. These specifications don't need to be overly documented or obscure, or rely on a complex development process; they merely need to be factual and efficient. For example, a simple website will have a very short specification.
eZ Publish 3.8 will have new features that allow header alteration and make use of an external caching system like Squid. More information on these features will be published in the near future.
The good practices, tips and tricks in this section will make implementing the site specification easier and less error-prone, and will improve site performance.
Project components should always be as well-organized as possible. Store your project design and extensions in the extension directory, not in the directory that stores the default design. This will simplify migration and upgrades and make the project layout easier to understand.
Use a consistent and clean directory and file structure. Use consistent naming conventions. For example, use either underscores or camel case for the template names, not a mix of both. Do not keep backup files of templates in the "live" area, and clear deprecated files as soon as possible to avoid accidentally maintaining unused files.
These tips and tricks will make the implementation of your site smoother:
The $node variable is generated by the content/view module. It is therefore available in most pages during development. However, it should never be used in any templates other than the one that overrides the node/view/* template. When ViewCaching is enabled, the variable will no longer be available after the cache is generated, as the content module is not going to fetch the Node object from the database.
Located in content.ini, the AvailableSiteDesignList[] array lists the designs the system should include when clearing ViewCache. If your custom design name is not listed in your override configuration file, ViewCache files won't be cleared when content is published.
Cache blocks will mostly be used to cache pieces of template code that are not handled in templates cached by ViewCache (for example, a navigation menu's contextual information such as the latest news, connected users and the latest forum posts). Ideally, all calls to the fetch and content template operators should be embedded in a cache block.
In most cases, cache block expiration is handled using time expiry (TTL) or subtree expiration. Time expiry sets the lifetime of a cache block in seconds. For example, if it is set to 3600, the block will be renewed after one hour. Subtree expiry is used to expire cache blocks when content is published under a subtree, for instance /products or /content/view/full/50.
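The two expiry strategies can be illustrated with a toy in-memory model. This is a sketch of the concept only, not eZ Publish code: the class `CacheBlockStore`, its methods and the injectable clock are all hypothetical, and real cache blocks are stored as files rather than in memory.

```python
import time


class CacheBlockStore:
    """Toy model of cache blocks with time expiry and subtree expiry."""

    def __init__(self, clock=time.time):
        self._clock = clock
        self._blocks = {}  # key -> (html, created_at, ttl_seconds, subtree)

    def get(self, key, render, ttl_seconds=None, subtree=None):
        """Return cached HTML; call render() only if the block is missing or too old."""
        now = self._clock()
        entry = self._blocks.get(key)
        if entry is not None:
            html, created_at, entry_ttl, _ = entry
            if entry_ttl is None or now - created_at < entry_ttl:
                return html
        html = render()
        self._blocks[key] = (html, now, ttl_seconds, subtree)
        return html

    def publish(self, path):
        """Drop every block whose subtree contains the published path."""
        for key, (_, _, _, subtree) in list(self._blocks.items()):
            if subtree is None:
                continue
            prefix = subtree.rstrip("/") + "/"
            if path == subtree or path.startswith(prefix):
                del self._blocks[key]
```

A block fetched with `ttl_seconds=3600` is rendered again once an hour has passed, while a block registered with `subtree="/products"` is rendered again only after something is published at or below `/products`.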
Limit the number of cache blocks used on the same page, as the cache blocks themselves have an input/output overhead. When possible, concatenate multiple cache blocks into one, or use nested cache blocks.
Testing should not be postponed until just before the project goes live. Instead, it should be an ongoing effort that validates each step of development and implementation. This section describes some testing techniques that will help you analyze and optimize performance.
The debug output provided by eZ Publish is extremely useful. It is enabled in the site.ini configuration file ([DebugSettings], DebugOutput=enabled). Debug output generates verbose notices, warnings and errors, and also provides a profiling tool that shows how page generation time is divided among the page construction phases. While the profiler was not specifically designed as a performance or load testing tool, it will give you the ability to immediately identify bottlenecks and major issues during unit testing and thus to correct many potential problems.
For example, after enabling caching, you can use the profiler to ensure that SQL queries are not executed when cache blocks are enabled. You should see three SQL queries on a standard page load when ViewCaching is enabled. If you see more, for instance 20 or 30, something is wrong.
It's quite common to ignore non-critical messages in the debug output (such as compiler errors, uninitialized variables, etc.). However, even if the errors have (or seem to have) no impact, they should be fixed as soon as possible. Errors are indicators of system instability or misconfiguration, and may have side effects, including performance impacts. For the sake of project maintenance, the debug output should show a clean system.
Another useful source of information is the error log in var/log/error.log. It can, among other things, show you when a file that doesn't exist is being requested by a browser (for example a JavaScript file you have removed but forgot to remove from the template code or the website icon). When this happens, a request is sent to eZ Publish with a "module not found" error, and every page load actually starts two instances of eZ Publish, which uses a lot of unnecessary server resources.
Load testing must be run periodically during project development and implementation. Starting load tests only just before launching a project is extremely risky and unprofessional. Unfortunately, this is a common mistake because developers, testers and project managers usually focus on visible tasks (such as page layout, features, etc.) and not on transparent infrastructure issues (such as performance).
Several tools can help you in the load testing process, such as ApacheBench (ab, provided with the Apache HTTP server), Siege or JMeter. They show you how well your project scales as the number of visitors increases. However, be aware that these tools can severely impact your server if not used properly, since they can start hundreds of simultaneous requests to your website.
There are two aspects to benchmarking project performance. The first is benchmarking from within the local network or from the server itself. These benchmarks show HTTP server response time without any bandwidth variables. Second, benchmarking your platform's network availability in a "live" environment, where bandwidth varies, provides a view of your system's performance from your users' perspective.
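For quick checks between full load-testing sessions, a small script can send concurrent requests and summarize response times. The sketch below is a minimal illustration and not a replacement for dedicated load-testing tools; the function names are hypothetical, and the URL in the usage comment is a placeholder for a server you control.

```python
import statistics
import time
import urllib.request
from concurrent.futures import ThreadPoolExecutor


def fetch_url(url, timeout=10):
    """Fetch one page and return the HTTP status code (raises on HTTP errors)."""
    with urllib.request.urlopen(url, timeout=timeout) as response:
        response.read()
        return response.status


def run_load_test(fetch, total_requests, concurrency):
    """Call fetch() total_requests times from `concurrency` worker threads.

    Returns a summary of the response times in seconds.
    """
    def timed_call(_):
        start = time.perf_counter()
        try:
            fetch()
            succeeded = True
        except Exception:
            succeeded = False
        return time.perf_counter() - start, succeeded

    with ThreadPoolExecutor(max_workers=concurrency) as pool:
        results = list(pool.map(timed_call, range(total_requests)))

    durations = sorted(duration for duration, _ in results)
    return {
        "requests": total_requests,
        "failures": sum(1 for _, succeeded in results if not succeeded),
        "mean": statistics.mean(durations),
        "median": statistics.median(durations),
        "max": durations[-1],
    }


# Example (placeholder URL; only run this against a server you control):
# summary = run_load_test(lambda: fetch_url("http://localhost/"), 200, 10)
```

Running the same script from the server itself and from an external network gives the two views described above: raw server response time, and response time as users experience it.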
Performance is influenced by both the software stack and the hardware architecture. Weaknesses in any of these components will impact performance, no matter how well the other components perform.
Your ability to manage the software stack and the hardware architecture will, of course, depend on your site hosting arrangements. The following section describes platform considerations that should be discussed with your system administrator or hosting partner. By having complete information about the environment where your site will be running, you will be able to anticipate and resolve problems before going live.
To determine the environment in which you'll be working, you should ask your system administrator or hosting partner the following questions:
If your project requires load balancing, infrastructure is on the critical path. Various load balancing configurations can be created depending on the project requirements. Consult an experienced systems engineer for assistance. Articles about load balancing are available in the community section of the ez.no site. For additional assistance, eZ systems' experts have extensive experience with setting up high-availability platforms for eZ Publish.
The versions of the software in the stack, and the way the components interact, are critical to performance.
If you're running Apache version 2, note that some PHP extensions are not fully threadsafe. If at all possible, you should run Apache 1.x. If you must run Apache 2, be sure to run it under pre-fork mode, which prevents Apache 2 from using threads.
eZ Publish requires PHP 4.3.x for the 3.6 branch, PHP 4.4.x for the 3.7+ branch, and does not run at all on PHP 5.x. Regarding web server integration, SAPI is supported but CGI won't work correctly due to information required by eZ Publish that is not transmitted by CGI. (FastCGI will be implemented in eZ Publish 4.x as a replacement for CGI.)
Check the following PHP settings: Max execution time and memory limit should be set high enough (more than 64 MB is recommended for memory limit). Safe mode should always be disabled, as it is too restrictive for eZ Publish. If possible, memory limit should be disabled as it has a small performance impact.
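These checks can be scripted so that they are repeated on every server. The sketch below reads the text of a php.ini file and reports the two settings with concrete guidance above: `memory_limit` and `safe_mode`. The function names are hypothetical, and the code assumes the file starts with a `[PHP]` section header, as standard php.ini files do.

```python
import configparser


def parse_php_size(value):
    """Convert PHP shorthand such as '128M' to bytes; '-1' (no limit) gives None."""
    value = value.strip()
    if value == "-1":
        return None
    units = {"K": 1024, "M": 1024 ** 2, "G": 1024 ** 3}
    suffix = value[-1].upper()
    if suffix in units:
        return int(value[:-1]) * units[suffix]
    return int(value)


def check_php_settings(ini_text, recommended_above_bytes=64 * 1024 ** 2):
    """Return a list of warnings for the settings discussed in this article."""
    parser = configparser.ConfigParser(
        interpolation=None, strict=False, inline_comment_prefixes=(";",)
    )
    parser.read_string(ini_text)
    php = parser["PHP"] if parser.has_section("PHP") else {}

    warnings = []
    memory_limit = php.get("memory_limit")
    if memory_limit is None:
        warnings.append("memory_limit is not set explicitly")
    else:
        limit = parse_php_size(memory_limit)
        if limit is not None and limit <= recommended_above_bytes:
            warnings.append(f"memory_limit {memory_limit} is not above the recommended threshold")
    if (php.get("safe_mode") or "off").strip().lower() in {"on", "1", "true", "yes"}:
        warnings.append("safe_mode is enabled")
    return warnings
```

Running the same check against the php.ini used by the web server and the one used by the command-line PHP helps confirm that both are configured consistently.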
An opcode cache is strongly recommended when working with eZ Publish. An opcode cache stores the compiled form of PHP scripts in shared memory, which avoids parsing and compiling the same files again the next time they are requested. Opcode caching is recommended for most PHP applications, and since some eZ Publish scripts are quite large, it has a tremendous effect on performance. APC and eAccelerator are the recommended opcode cache implementations; they should be allocated sufficient shared memory to be effective.
The command-line version of PHP is required in order to run cronjobs and maintenance scripts. Since this is often not installed on production servers, make sure that this component is compiled and has the same parameters as the PHP version used by the web server.
If you need advanced multilingual support, you must be using MySQL version 4.1.x or greater to benefit from Unicode storage.
Depending on the eZ Publish installation location, you may have to alter DNS configuration to access the site via the desired URLs. Even if DNS modification is not required, mod_rewrite must be enabled on the server, and ideally configured in the website's virtual host.
An image conversion package must be available on the server. eZ Publish can use ImageMagick or GD (ImageMagick is recommended). If the image conversion tool is configured differently on the production server, you must change the image conversion settings.
Once a project is running on the production platform, regular monitoring and maintenance are required to ensure that performance continues to be optimal. With a few tools and procedures, it is easy to identify and resolve problems before they impact your users' site experience.
Monitoring tools like Nagios are extremely useful for ongoing site monitoring. They can alert you when the server load is too high, indicating that something is wrong with the system and that your server may not be able to handle peak volume.
If your database and file repository size regularly increases, you should monitor the available disk space. If eZ Publish runs out of disk space, it will not be able to create cache files, store new objects, etc.
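A disk space check is easy to automate and run from a cronjob. The following is a minimal sketch using Python's standard library; the function name and the 10% threshold are assumptions to adapt, and the path you pass should be wherever your installation stores its cache and content files.

```python
import shutil


def free_space_report(path, min_free_fraction=0.10):
    """Report free space on the filesystem holding `path`.

    `low` is True when the free fraction drops below `min_free_fraction`
    (10% by default, an arbitrary threshold chosen for illustration).
    """
    usage = shutil.disk_usage(path)
    free_fraction = usage.free / usage.total if usage.total else 0.0
    return {
        "path": path,
        "free_bytes": usage.free,
        "free_fraction": free_fraction,
        "low": free_fraction < min_free_fraction,
    }


report = free_space_report(".")
print(f"{report['path']}: {report['free_fraction']:.0%} free, low={report['low']}")
```

The result can be fed into whatever alerting you already use, for example by sending an email when `low` is true.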
The disk space required for content updates and additions depends on the nature of these changes. eZ Publish objects with text content use less disk space (a few kilobytes), XML blocks use slightly more, and images can use much more space, as each size variation of the image will be stored on disk (which means that a 1 megabyte image can use something like 2 megabytes depending on the image settings and usage).
Be aware that cache blocks also use significant disk space (depending on their parameters). For example, if you have distinct cache blocks for each user, the number of cache blocks depends on how many users are registered on your website.
The eZ Publish database can get corrupted under various circumstances. It is a good idea to use the maintenance tools provided by your RDBMS to check its health. For example, MySQL provides commands like CHECK TABLE and REPAIR TABLE to help you detect and fix issues on SQL tables.
Web statistics provide you with valuable information about site traffic. By monitoring site statistics, you are alerted if the website's usage is higher than expected, allowing you to upgrade the infrastructure before volume becomes a problem. AWStats and Webalizer are two examples of web statistics tools. They produce statistics based on your web server access logs that can be viewed via a web interface.
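When a full statistics package is not yet installed, the same access logs can be summarized with a short script. The sketch below counts requests per hour; it assumes the timestamp format of the Apache Common Log Format (for example `[10/Oct/2000:13:55:36 -0700]`) and an English locale for month names. The function name is hypothetical.

```python
import re
from collections import Counter
from datetime import datetime

# Matches the bracketed timestamp of a Common Log Format line.
TIMESTAMP = re.compile(r"\[(?P<time>[^\]]+)\]")


def requests_per_hour(lines):
    """Count requests per hour from access log lines; unparsable lines are skipped."""
    counts = Counter()
    for line in lines:
        match = TIMESTAMP.search(line)
        if not match:
            continue
        try:
            stamp = datetime.strptime(match.group("time"), "%d/%b/%Y:%H:%M:%S %z")
        except ValueError:
            continue
        counts[stamp.strftime("%Y-%m-%d %H:00")] += 1
    return counts


sample_lines = [
    '127.0.0.1 - - [10/Oct/2000:13:55:36 -0700] "GET /index.php HTTP/1.0" 200 2326',
    '127.0.0.1 - - [10/Oct/2000:13:59:01 -0700] "GET /news HTTP/1.0" 200 512',
    '127.0.0.1 - - [10/Oct/2000:14:00:10 -0700] "GET /favicon.ico HTTP/1.0" 404 209',
]
hourly = requests_per_hour(sample_lines)
busiest_hour, busiest_count = hourly.most_common(1)[0]
```

Comparing the busiest hours against the peak figures defined during planning shows whether real traffic is approaching the limits the platform was designed for.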
Even with proper testing, a few bugs will probably appear in the project. Making changes to the project in the production environment is very dangerous. Instead, you should always maintain a development version synchronized with the production website. Fixes and changes can be implemented on the development platform and then moved to production once they have been tested and validated.
eZ systems publishes new versions of eZ Publish on a regular basis. Minor versions of a branch (for instance 3.7.3, 3.7.4) are usually bug fixes and minor improvements, and it is generally safe to upgrade between versions within the same branch (for example, 3.7). As with site modifications, it's best to install the upgrade on your development environment first, and then duplicate the installation on the production site once everything has been tested and validated.
The tips and suggestions provided in this article should help you plan, implement and run your eZ Publish site with optimal performance. For additional assistance, consider joining the eZ Publish network, a yearly maintenance service that eliminates many potential problems before they occur and provides rapid assistance for critical issues. Depending on your project, subscribing to the eZ Publish network can be an asset that helps ensure your site runs smoothly over time and remains up-to-date with the latest security and performance improvements.