How to avoid duplicate content for search engines

Michael Fürst

Tuesday 09 June 2009 1:39:50 am

Hi there,

I have some questions about search engine optimization. I have done everything I can to avoid duplicate content on our site (e.g. a 302 redirect from domain.com/article to www.domain.com/article), and that works fine.

BUT:
As you know, articles can be reached both under their "original view URL" and under their "nice URL". For example, the same article can be found at:

http://www.domain.com/content/view/full/960
and
http://www.domain.com/Folder/Content/200906/Title-of-Article

I only use nice URLs in the links on my site, but for some reason Google has picked up the "original URL" in its index. So we have a textbook example of duplicate content.

Does anyone have an idea how to avoid this? The best solution would be a 302 redirect whenever the original URL is requested, but I have no idea how to do that.

Thanks in advance,
Cheers,
Mike

Gaetano Giunta

Tuesday 09 June 2009 8:56:33 am

One way is to add a check at the top of pagelayout.tpl: if the URL in use is content/view/full/*, redirect to the node's virtual URL instead.

You will need a simple template operator that sends custom http headers for that.

Might be dangerous if you get stuck in loops, though.
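
To make the check and the loop risk concrete, here is a minimal, language-neutral sketch of the decision logic in Python. It is not eZ Publish code: `redirect_for`, `nice_url_for` and the `aliases` table are hypothetical stand-ins for whatever the template operator or CMS lookup would provide. The status code is a parameter; the thread mentions 302, although a permanent 301 is what is usually recommended for consolidating duplicate URLs, so the sketch defaults to that.

```python
import re
from typing import Callable, Optional, Tuple

# Matches the system URL form from the thread, e.g. /content/view/full/960
SYSTEM_URL = re.compile(r"^/content/view/full/(\d+)/?$")


def redirect_for(
    path: str,
    nice_url_for: Callable[[int], Optional[str]],
    status: int = 301,
) -> Optional[Tuple[int, str]]:
    """Return (status, location) if `path` should be redirected, else None."""
    match = SYSTEM_URL.match(path)
    if match is None:
        # Not a system URL (e.g. already the nice URL): serve it normally.
        return None
    target = nice_url_for(int(match.group(1)))
    # Loop guard: never redirect to a missing alias, to the same path,
    # or to another system URL.
    if not target or target == path or SYSTEM_URL.match(target):
        return None
    return (status, target)


# Hypothetical alias table standing in for the CMS's URL alias lookup.
aliases = {960: "/Folder/Content/200906/Title-of-Article"}

print(redirect_for("/content/view/full/960", aliases.get))
print(redirect_for("/Folder/Content/200906/Title-of-Article", aliases.get))
```

The guard only redirects when the request is a system URL and the target is not one, so a request for the nice URL can never be redirected again.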

Principal Consultant International Business
Member of the Community Project Board

Radek Kuchta

Tuesday 09 June 2009 12:18:56 pm

If you are worried about duplicate content, use a <link rel="canonical"> tag to specify the preferred version of a page (its URL).

http://googlewebmastercentral.blogspot.com/2009/02/specify-your-canonical.html
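
As a rough illustration, the tag that would go into the <head> of both URL variants can be built like this. The sketch is in Python rather than eZ Publish template code, and `canonical_link` is a hypothetical helper; in practice the nice URL would come from the CMS.

```python
from html import escape


def canonical_link(nice_url: str) -> str:
    """Build the <link> element that names the preferred URL of a page."""
    # Escape the URL so characters such as & or " cannot break the attribute.
    return f'<link rel="canonical" href="{escape(nice_url, quote=True)}" />'


print(canonical_link("http://www.domain.com/Folder/Content/200906/Title-of-Article"))
```

The same tag is emitted regardless of which URL was requested, so the content/view/full/960 variant also points search engines at the nice URL.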

http://ez.no/certification/verify/272582

Damien Pobel

Wednesday 10 June 2009 2:42:40 am

Hi,

Another solution is to write a "content read" workflow event that checks the URL and redirects if necessary. It's quite close to Gaetano's solution, but I find it dirty to implement that kind of logic in a template operator.

Cheers

Damien
Planet eZ Publish.fr : http://www.planet-ezpublish.fr
Certification : http://auth.ez.no/certification/verify/372448
Publications about eZ Publish : http://pwet.fr/tags/keywords/weblog/ez_publish

Michael Fürst

Wednesday 10 June 2009 4:38:40 am

Hi guys,

Thanks for your responses.
I think the easiest way would be to add the <link> tag.

But I'm really interested in the workflow solution. I read the documentation, but it is very short. Do you have a link to a manual or some tutorials on how to write and implement a custom workflow event?

Thanks & regards,
Michael

Damien Pobel

Wednesday 10 June 2009 6:15:48 am

Hi Mike,

You can start with the eZpedia article [1] or look at an existing workflow event type [2].

For a "content read" event, you also have to enable the content_read operation; see settings/workflow.ini.

[1] http://ezpedia.org/ez/workflow_event_type
[2] http://projects.ez.no/types/ez_publish/workflow_event_type/

Damien
