Robots.txt - SEO Best Practices
Every subdomain on your site should have a robots.txt file that links to a sitemap and describes any crawler restrictions.
A robots.txt file is a file at the root of your site that indicates those parts of your site you don't want accessed by search engine crawlers.
Use robots.txt files
Add a robots.txt file to every subdomain so you can specify sitemap locations and set web crawler rules. Robots.txt files are always located in the root folder with the name robots.txt. Each robots.txt file applies only to URLs with the same protocol, subdomain, domain and port as the robots.txt URL. For example, http://example.com/robots.txt would be the robots.txt URL for http://example.com but not for https://example.com or http://www.example.com. Even an empty robots.txt file is useful for cleaning up server logs, because it reduces 404 errors from visiting bots. Keep in mind that if you use a robots.txt file to tell search bots not to visit a certain page, that page can still appear in search results if it's linked to from another page. To hide pages from search results, use noindex meta tags instead, and leave those pages crawlable so that bots can read the tag.
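The scoping rule above can be sketched in a few lines of Python. This is only an illustration built on the standard urllib.parse module: the function names robots_url_for and same_scope are made up for this example, and the comparison is deliberately simple (it does not treat http://example.com and http://example.com:80 as the same origin).

```python
from urllib.parse import urlsplit, urlunsplit


def robots_url_for(page_url: str) -> str:
    """Return the robots.txt URL that would govern page_url.

    Hypothetical helper: keeps the protocol and the host (including
    any explicit port) and replaces the path with /robots.txt.
    """
    parts = urlsplit(page_url)
    return urlunsplit((parts.scheme, parts.netloc, "/robots.txt", "", ""))


def same_scope(robots_url: str, page_url: str) -> bool:
    """True if robots_url is the robots.txt file that applies to page_url.

    Simplification: plain string comparison, so default ports and
    hostname case are not normalized.
    """
    return robots_url_for(page_url) == robots_url


robots = "http://example.com/robots.txt"
print(same_scope(robots, "http://example.com/products/"))      # same protocol and host
print(same_scope(robots, "https://example.com/products/"))     # different protocol
print(same_scope(robots, "http://www.example.com/products/"))  # different subdomain
```

The first check matches, while the https and www variants do not, which is why each protocol and subdomain needs its own robots.txt file.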
Learn more
- Learn about robots.txt files (support.google.com)
- Robots.txt Specifications (developers.google.com)
- How to Create a Robots.txt file (www.bing.com)
- Meta tags and robots.txt in Yahoo Search (help.yahoo.com)
Set sitemap locations
Each robots.txt file should specify sitemap file locations. Sitemap files list the page URLs that you want indexed and are read by search bots. These files can also include metadata describing when each page was last updated and how often it changes, which helps crawlers index your site more intelligently. Specify a sitemap location in the robots.txt file with a line such as Sitemap: http://example.com/sitemap.xml. A robots.txt file can include more than one sitemap reference.
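As a rough illustration, the sketch below parses a small robots.txt file with Python's standard urllib.robotparser module and reads back its sitemap references and one crawler rule. The file contents, the /private/ path, the second sitemap URL, the ExampleBot user agent and the read_sitemaps helper are all assumptions made up for this example. The site_maps() method requires Python 3.8 or later.

```python
from typing import List
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt contents: one crawler rule and two sitemap lines.
ROBOTS_TXT = """\
User-agent: *
Disallow: /private/

Sitemap: http://example.com/sitemap.xml
Sitemap: http://example.com/sitemap-news.xml
"""


def read_sitemaps(robots_txt: str) -> List[str]:
    """Return every sitemap URL listed in the given robots.txt text."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    # site_maps() returns None when no Sitemap lines are present.
    return parser.site_maps() or []


parser = RobotFileParser()
parser.parse(ROBOTS_TXT.splitlines())

print(read_sitemaps(ROBOTS_TXT))
# Check the crawler rule for a made-up bot name.
print(parser.can_fetch("ExampleBot", "http://example.com/private/page.html"))
print(parser.can_fetch("ExampleBot", "http://example.com/index.html"))
```

Running a check like this against your own file is a quick way to confirm that every sitemap line is picked up and that the rules block only what you intend.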
Learn more
- Build and submit a sitemap (support.google.com)
- What are Sitemaps? (www.sitemaps.org)
- Robots.txt Specifications (developers.google.com)
More articles in this series
- This article is from our comprehensive SEO Best Practices guide.
- Next article in this series: Redirects
- Previous article in this series: Links