robots.txt
A robots.txt file is a plain-text file that tells search engine crawlers which parts of your website they are allowed — or not allowed — to access. It lives at the root of your domain: yourdomain.com/robots.txt.
Quick definition: robots.txt is a set of instructions for web crawlers. It controls which pages Googlebot and other bots can request from your server. It does not prevent pages from appearing in search results on its own — it controls crawl access, not indexing. Last verified: April 2026.
What does robots.txt do in WordPress?
WordPress generates a virtual robots.txt automatically. You do not need a physical file on your server unless you want to customize behavior. Out of the box, WordPress allows all bots to crawl everything except /wp-admin/. Note that Settings → Reading → Discourage search engines is a separate control: since WordPress 5.3 it outputs a noindex robots meta tag on every page rather than editing robots.txt (older versions added a blanket Disallow: / directive). We see that checkbox left on by accident on client sites more often than any other SEO error.
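For reference, this is roughly what the default virtual file looks like on a single-site install running WordPress 5.5 or later. The Sitemap line comes from core's built-in sitemaps; an SEO plugin may replace it with its own sitemap URL, and yourdomain.com is a placeholder:
User-agent: *
Disallow: /wp-admin/
Allow: /wp-admin/admin-ajax.php
Sitemap: https://yourdomain.com/wp-sitemap.xml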
The Robots Exclusion Protocol is standardized as RFC 9309, and Google Search Central maintains documentation of Google's implementation. It supports two core path directives:
- Allow: /path/ — explicitly permits crawling of a URL or directory
- Disallow: /path/ — blocks the crawler from requesting that URL
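When Allow and Disallow rules overlap for the same user agent, Google applies the most specific rule, meaning the one with the longest matching path. A quick sketch, using a hypothetical /private/ directory:
User-agent: *
Disallow: /private/
Allow: /private/press-kit/
Everything under /private/ is blocked except /private/press-kit/, because the Allow rule is the longer match.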
Why it matters for WordPress SEO
One wrong line blocks Google from your entire site. In our testing across 200+ client sites, a Disallow: / left in a staging robots.txt — then pushed to production — is the single most common cause of a site vanishing from search results overnight. Google Search Console flags it as “Blocked by robots.txt,” but the damage is already done if you miss the alert.
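The offending file is deceptively small, which is why it slips through review. A full block is just two lines:
User-agent: *
Disallow: /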
Conversely, a well-configured robots.txt reduces wasted crawl budget. Blocking /wp-admin/ and low-value parameter URLs means Googlebot spends its visits on pages that matter, not internal admin endpoints. Avoid blocking /wp-includes/, though: it serves scripts and styles that Googlebot needs to render your pages.
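As an illustration of parameter blocking (example parameters, not a universal recommendation): WordPress's internal search uses the s query parameter, and comment-reply links append replytocom. The * wildcard, supported by Googlebot and most major crawlers, matches any sequence of characters:
User-agent: *
Disallow: /*?s=
Disallow: /*?replytocom=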
What a WordPress robots.txt looks like
A minimal, production-ready robots.txt for a WordPress site:
User-agent: *
Disallow: /wp-admin/
Allow: /wp-admin/admin-ajax.php
Sitemap: https://yourdomain.com/sitemap_index.xml
The admin-ajax.php allow rule is required because front-end functionality — contact forms, dynamic content — uses that endpoint. Block it and AJAX-dependent features break silently for crawlers.
robots.txt vs. noindex: what’s the difference?
These two controls are not interchangeable. robots.txt blocks a crawler from visiting a URL. A noindex meta tag tells a crawler that visited the page to exclude it from search results. If you block a URL in robots.txt, Google cannot read the noindex tag on it — the page may still appear in search results with no description, just a URL. For pages you want removed from search entirely, use noindex, not robots.txt.
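For reference, a noindex directive lives in the page itself rather than in robots.txt, either as a meta tag in the HTML head or, for non-HTML resources such as PDFs, as an X-Robots-Tag HTTP response header:
<meta name="robots" content="noindex">
X-Robots-Tag: noindex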
Related terms
- noindex meta tag — controls indexing, not crawl access
- XML sitemap — tells crawlers which pages to prioritize
- crawl budget — the number of URLs Googlebot is willing to crawl on your site in a given timeframe
- canonical URL — signals the preferred version of duplicate content
- Search Console coverage report — shows which pages are blocked, indexed, or excluded
Additional reading
- How to edit your WordPress robots.txt file — step-by-step tutorial for customizing via Yoast, Rank Math, or direct file edit
- WordPress SEO checklist for beginners — covers robots.txt alongside sitemaps, titles, and schema
- Common WordPress indexing errors and how to fix them — includes the “Discourage search engines” checkbox fix