Manage multi robots.txt for multi subdomains

Fabien Scantamburlo

Wednesday 30 June 2010 2:21:35 am

Hi,

I'm trying to set up one robots.txt per subdomain, but I can't find out how to do it. I know there's a robots.txt in the root directory, and for the moment I use it for the main domain (www).

I'm building a mobile version of the site and need a separate robots.txt for it. Does anyone know how to do this (with an .htaccess rule, perhaps)?

Cheers,

Fabien.

Jérôme Vieilledent

Wednesday 30 June 2010 3:02:11 am

Hi Fabien

We also had this issue on a multilingual site and managed to fix it:

  • Create a new "Robots.txt" content class, with at least one field dedicated to storing the content of your robots.txt file
  • Instantiate your new content class in your content tree, using translations to vary the content of your robots.txt
  • Add a URL translator rule to map robots.txt to your content node
  • In your Apache Virtual Host / .htaccess file, comment out the robots.txt line

Another approach would be to build a module instead of using translations (useful when your websites are totally different). Then you could store your content in a foreign table...

Hope this helps

Fabien Scantamburlo

Wednesday 30 June 2010 5:43:29 am

Hi Jérôme,

Thanks for your help. This is certainly the best way to customize robots.txt.

Good job.

Cheers,

Fabien.

Luca Pezzoli

Monday 16 August 2010 9:04:01 am

Hi Jérôme,

We run a multisite installation, so each extension has a different URL.
We want to serve a different robots.txt file for each extension/site, so we are trying to follow your instructions.

Could you be more specific in your explanation? Do we have to set up a different layout? Must the physical file "robots.txt" exist in the root of eZ Publish?
A practical example would be ideal.

Thanks in advance

Luca

Jérôme Vieilledent

Monday 16 August 2010 3:22:28 pm

Hi Luca

As I said, you'll need to create a dedicated content class:

  • Create a new "Robots.txt" content class, with at least one field dedicated to storing the content of your robots.txt file
  • Instantiate your new content class in your content tree, using translations to vary the content of your robots.txt (yes, you will have to override the pagelayout for this content, as it must be a blank layout, not HTML)
  • Add a URL wildcard rule to map robots.txt to your content node

To add a URL wildcard, go to Setup / URL Wildcards:

  • New URL wildcard: robots.txt
  • Destination: <url_of_your_content_node>
  • Leave Redirecting URL unchecked

Finally, check that the following line in your Apache VHost (or .htaccess file) is commented out:

RewriteRule ^/robots\.txt - [L]

It should become:

#RewriteRule ^/robots\.txt - [L]
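
To show why commenting out that line matters, here is a minimal sketch of how the rewrite section of the virtual host might look afterwards. Only the robots.txt line comes from this thread; the favicon rule and the catch-all rule to index.php are assumptions about a typical eZ Publish setup, so adapt them to your own configuration. In an .htaccess file the patterns are matched without the leading slash.

```apache
# Sketch only: assumed layout of the rewrite section in an eZ Publish VHost.
RewriteEngine On

# Commented out on purpose. While this rule is active, Apache serves the
# physical robots.txt and stops rewriting ([L]), so eZ Publish never sees
# the request and the URL wildcard cannot apply.
#RewriteRule ^/robots\.txt - [L]

# Illustrative example of a static file that still bypasses eZ Publish.
RewriteRule ^/favicon\.ico - [L]

# Assumed catch-all: everything else, now including /robots.txt, is handed
# to eZ Publish, where the URL wildcard maps it to the content node.
RewriteRule .* /index.php
```

With the rule disabled, the request for /robots.txt falls through to the catch-all, so each siteaccess (or translation) can return its own content.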

That should do the job :)

Luca Pezzoli

Tuesday 17 August 2010 1:44:52 am

Perfect!
The RewriteRule was blocking the URL wildcard.

Thank you very much, Jérôme!