robots.txt

A robots.txt file is a plain-text file that tells search engine crawlers which parts of your website they are allowed — or not allowed — to access. It lives at the root of your domain: yourdomain.com/robots.txt.

Quick definition: robots.txt is a set of instructions for web crawlers. It controls which pages Googlebot and other bots can request from your server. It does not prevent pages from appearing in search results on its own — it controls crawl access, not indexing. Last verified: April 2026.


What does robots.txt do in WordPress?

WordPress generates a virtual robots.txt automatically. You do not need a physical file on your server unless you want to customize behavior; a real file at the root overrides the virtual one. Out of the box, WordPress allows all bots to crawl everything — except when you enable Settings → Reading → “Discourage search engines from indexing this site”, which adds a blanket Disallow: / directive. We see that checkbox left enabled by accident on client sites more often than any other SEO error.
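
With that checkbox enabled, the served file reduces to a two-line blanket block that denies compliant crawlers access to every URL on the site:

User-agent: *
Disallow: /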

The robots.txt format is standardized as the Robots Exclusion Protocol (RFC 9309); Google Search Central documents how Googlebot interprets it. Two directives do the core work of controlling crawl access (a sketch of how they resolve for a given URL follows the list):

  • Allow: /path/ — explicitly permits crawling of a URL or directory
  • Disallow: /path/ — blocks the crawler from requesting that URL
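
To see how these rules resolve for a given URL, here is a minimal sketch using Python's standard-library urllib.robotparser; the domain and paths are placeholders. Note that Python's parser applies rules in file order, while Googlebot uses the most specific match, so treat this as a rough sanity check rather than an exact model of Googlebot.

from urllib.robotparser import RobotFileParser

# Parse an in-memory robots.txt rather than fetching one over HTTP.
rules = """\
User-agent: *
Disallow: /wp-admin/
""".splitlines()

parser = RobotFileParser()
parser.parse(rules)

# /wp-admin/ is a prefix rule: everything under that directory is blocked.
print(parser.can_fetch("*", "https://yourdomain.com/wp-admin/options.php"))  # False

# No rule matches regular content, so crawling is allowed by default.
print(parser.can_fetch("*", "https://yourdomain.com/blog/hello-world/"))     # True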

Why it matters for WordPress SEO

One wrong line blocks Google from your entire site. In our testing across 200+ client sites, a Disallow: / left in a staging robots.txt — then pushed to production — is the single most common cause of a site vanishing from search results overnight. Google Search Console flags it as “Blocked by robots.txt,” but the damage is already done if you miss the alert.

Conversely, a well-configured robots.txt reduces wasted crawl budget. Blocking /wp-admin/ and low-value parameter URLs means Googlebot spends its visits on pages that matter, not internal admin endpoints. Be cautious about blocking /wp-includes/ wholesale, though: Google renders pages before indexing them, and that directory holds CSS and JavaScript files the renderer needs.
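
As an illustration, here is a hypothetical expansion of the default rules for a site that generates internal-search and comment-reply parameter URLs. The exact paths depend on your site; the * wildcard is supported by Googlebot and standardized in RFC 9309.

User-agent: *
Disallow: /wp-admin/
# Internal search results add no value to the index
Disallow: /*?s=
# Comment-reply links create near-duplicate parameter URLs
Disallow: /*?replytocom=
Allow: /wp-admin/admin-ajax.php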


What a WordPress robots.txt looks like

A minimal, production-ready robots.txt for a WordPress site:

User-agent: *
Disallow: /wp-admin/
Allow: /wp-admin/admin-ajax.php

Sitemap: https://yourdomain.com/sitemap_index.xml

The admin-ajax.php allow rule is required because front-end functionality — contact forms, dynamic content — uses that endpoint. Block it and AJAX-dependent features break silently for crawlers.


robots.txt vs. noindex: what’s the difference?

These two controls are not interchangeable. robots.txt blocks a crawler from visiting a URL. A noindex meta tag is read when a crawler visits the page and tells the search engine to exclude that page from search results. If you block a URL in robots.txt, Google cannot read the noindex tag on it — the page may still appear in search results with no description, just a URL. For pages you want removed from search entirely, use noindex, not robots.txt.
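
For reference, noindex can be delivered two ways. In the page's HTML:

<meta name="robots" content="noindex">

Or as an HTTP response header, which also works for non-HTML files such as PDFs:

X-Robots-Tag: noindex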


