robots.txt

A robots.txt file is a plain-text file that tells search engine crawlers which parts of your website they are allowed — or not allowed — to access. It lives at the root of your domain: yourdomain.com/robots.txt.

Quick definition: robots.txt is a set of instructions for web crawlers. It controls which pages Googlebot and other bots can request from your server. It does not prevent pages from appearing in search results on its own — it controls crawl access, not indexing. Last verified: April 2026.


What does robots.txt do in WordPress?

WordPress generates a virtual robots.txt automatically. You do not need a physical file on your server unless you want to customize behavior; a real file at the root overrides the virtual one. Out of the box, WordPress allows all bots to crawl everything — except when you enable Settings → Reading → “Discourage search engines from indexing this site”, which adds a blanket Disallow: / directive. We see that checkbox left enabled by accident on client sites more often than any other SEO error.
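
With that checkbox enabled, the served file reduces to a two-line blanket block that denies compliant crawlers access to every URL on the site:

User-agent: *
Disallow: /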

The robots.txt format is standardized as the Robots Exclusion Protocol (RFC 9309); Google Search Central documents how Googlebot interprets it. Two directives do the core work of controlling crawl access (a sketch of how they resolve for a given URL follows the list):

  • Allow: /path/ — explicitly permits crawling of a URL or directory
  • Disallow: /path/ — blocks the crawler from requesting that URL
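
To see how these rules resolve for a given URL, here is a minimal sketch using Python's standard-library urllib.robotparser; the domain and paths are placeholders. Note that Python's parser applies rules in file order, while Googlebot uses the most specific match, so treat this as a rough sanity check rather than an exact model of Googlebot.

from urllib.robotparser import RobotFileParser

# Parse an in-memory robots.txt rather than fetching one over HTTP.
rules = """\
User-agent: *
Disallow: /wp-admin/
""".splitlines()

parser = RobotFileParser()
parser.parse(rules)

# /wp-admin/ is a prefix rule: everything under that directory is blocked.
print(parser.can_fetch("*", "https://yourdomain.com/wp-admin/options.php"))  # False

# No rule matches regular content, so crawling is allowed by default.
print(parser.can_fetch("*", "https://yourdomain.com/blog/hello-world/"))     # True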

Why it matters for WordPress SEO

One wrong line blocks Google from your entire site. In our testing across 200+ client sites, a Disallow: / left in a staging robots.txt — then pushed to production — is the single most common cause of a site vanishing from search results overnight. Google Search Console flags it as “Blocked by robots.txt,” but the damage is already done if you miss the alert.

Conversely, a well-configured robots.txt reduces wasted crawl budget. Blocking /wp-admin/ and low-value parameter URLs means Googlebot spends its visits on pages that matter, not internal admin endpoints. Be cautious about blocking /wp-includes/ wholesale, though: Google renders pages before indexing them, and that directory holds CSS and JavaScript files the renderer needs.
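
As an illustration, here is a hypothetical expansion of the default rules for a site that generates internal-search and comment-reply parameter URLs. The exact paths depend on your site; the * wildcard is supported by Googlebot and standardized in RFC 9309.

User-agent: *
Disallow: /wp-admin/
# Internal search results add no value to the index
Disallow: /*?s=
# Comment-reply links create near-duplicate parameter URLs
Disallow: /*?replytocom=
Allow: /wp-admin/admin-ajax.php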


What a WordPress robots.txt looks like

A minimal, production-ready robots.txt for a WordPress site:

User-agent: *
Disallow: /wp-admin/
Allow: /wp-admin/admin-ajax.php

Sitemap: https://yourdomain.com/sitemap_index.xml

The admin-ajax.php allow rule is required because front-end functionality — contact forms, dynamic content — uses that endpoint. Block it and AJAX-dependent features break silently for crawlers.


robots.txt vs. noindex: what’s the difference?

These two controls are not interchangeable. robots.txt blocks a crawler from visiting a URL. A noindex meta tag is read when a crawler visits the page and tells the search engine to exclude that page from search results. If you block a URL in robots.txt, Google cannot read the noindex tag on it — the page may still appear in search results with no description, just a URL. For pages you want removed from search entirely, use noindex, not robots.txt.
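
For reference, noindex can be delivered two ways. In the page's HTML:

<meta name="robots" content="noindex">

Or as an HTTP response header, which also works for non-HTML files such as PDFs:

X-Robots-Tag: noindex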


