How to avoid duplicate content for search engines

Author	Message
Michael Fürst	Tuesday 09 June 2009 1:39:50 am Hi there, I have some questions about search engine optimization. I have done everything I can to avoid duplicate content on our site (e.g. a 302 redirect from domain.com/article to www.domain.com/article), and that works fine. BUT: as you know, an article can be reached both at its "original view URL" and at its "nice URL". For example, the same article can be found at http://www.domain.com/content/view/full/960 and at http://www.domain.com/Folder/Content/200906/Title-of-Article. I only use the nice URLs in the links on the site, but for some reason Google picks up the "original URL" for its index. And so we have a beautiful example of duplicate content... Does anyone have an idea how to avoid this? The best solution would be a 302 redirect whenever the original URL is requested, but I have no idea how to do that... Thanks in advance, Cheers, Mike
Gaetano Giunta	Tuesday 09 June 2009 8:56:33 am One way is to add a check at the top of pagelayout.tpl that tests whether the requested URL is content/view/full/* and, if it is, redirects to the node's virtual URL instead. You will need a simple template operator that sends custom HTTP headers for that. It might be dangerous if you get stuck in redirect loops, though. Principal Consultant International Business Member of the Community Project Board
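
The check described above, including a guard against the redirect loops it warns about, could be sketched roughly as follows. This is a framework-neutral illustration in Python, not eZ Publish code: `redirect_for`, `lookup_alias` and the `status` parameter are hypothetical names, and the alias lookup is assumed to be supplied by the CMS. The thread asks for a 302; the sketch defaults to 301 (Moved Permanently) on the assumption that the nice URL is meant to be the permanent address, and the status can be overridden.

```python
import re
from typing import Callable, Optional, Tuple

# Matches the system URL form quoted in the thread, e.g. /content/view/full/960
SYSTEM_URL = re.compile(r"^/content/view/full/(\d+)/?$")


def redirect_for(
    path: str,
    lookup_alias: Callable[[int], Optional[str]],  # hypothetical: node ID -> nice URL path
    status: int = 301,  # assumption: permanent redirect; pass 302 for a temporary one
) -> Optional[Tuple[int, str]]:
    """Return (status, target) if the request should be redirected, else None."""
    match = SYSTEM_URL.match(path)
    if match is None:
        # Already a nice URL (or something else entirely): serve it normally.
        return None

    alias = lookup_alias(int(match.group(1)))

    # Loop guard: do not redirect when there is no alias, when the alias is the
    # path we are already on, or when the alias is itself a system URL.
    if not alias or alias == path or SYSTEM_URL.match(alias):
        return None

    return (status, alias)
```

For example, with `aliases = {960: "/Folder/Content/200906/Title-of-Article"}`, calling `redirect_for("/content/view/full/960", aliases.get)` yields a redirect to the nice URL, while a request for the nice URL itself returns `None` and is served as usual.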
Radek Kuchta	Tuesday 09 June 2009 12:18:56 pm If you are worried about duplicate content, use a <link rel="canonical"> tag to specify your preferred version of the page (its URL). http://googlewebmastercentral.blogspot.com/2009/02/specify-your-canonical.html http://ez.no/certification/verify/272582
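
As a rough illustration of that suggestion, the element to emit in the page's `<head>` could be built like this. The helper name `canonical_link` is hypothetical and the sketch is plain Python rather than an eZ Publish template; it only shows the shape of the tag, with the nice URL from the first post used as the preferred address.

```python
from html import escape


def canonical_link(nice_url: str) -> str:
    """Build a <link rel="canonical"> element pointing at the preferred URL."""
    # Escape the URL so characters such as & or " cannot break the attribute.
    return f'<link rel="canonical" href="{escape(nice_url, quote=True)}" />'


print(canonical_link("http://www.domain.com/Folder/Content/200906/Title-of-Article"))
# prints: <link rel="canonical" href="http://www.domain.com/Folder/Content/200906/Title-of-Article" />
```

The same tag would be emitted on both the system URL and the nice URL of a node, so that either variant points search engines at the nice URL.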
Damien Pobel	Wednesday 10 June 2009 2:42:40 am Hi, Another solution is to write a "content read" workflow event that checks the URL and redirects if necessary. It's quite close to Gaetano's solution, but I find it dirty to implement such logic in a template operator. Cheers Damien Planet eZ Publish.fr : http://www.planet-ezpublish.fr Certification : http://auth.ez.no/certification/verify/372448 Publications about eZ Publish : http://pwet.fr/tags/keywords/weblog/ez_publish
Michael Fürst	Wednesday 10 June 2009 4:38:40 am Hi guys, Thanks for your responses. I think the easiest way would be to define the <link> tag. But I'm really interested in the workflow solution. I read the documentation, but it is really short. Do you have a link to a manual or some tutorials on how to write and implement your own workflow event? Thanks & regards, Michael
Damien Pobel	Wednesday 10 June 2009 6:15:48 am Hi Mike, You can start with the eZpedia article [1] or look at an existing workflow event type [2]. For a "content read" event, you also have to enable the content_read operation; see settings/workflow.ini. [1] http://ezpedia.org/ez/workflow_event_type [2] http://projects.ez.no/types/ez_publish/workflow_event_type/ Damien Planet eZ Publish.fr : http://www.planet-ezpublish.fr Certification : http://auth.ez.no/certification/verify/372448 Publications about eZ Publish : http://pwet.fr/tags/keywords/weblog/ez_publish