WordPress robots.txt Is Blocking Your Pages from Google — Here’s How to Audit and Fix It
- WP SEO Pack
One misconfigured line in your robots.txt file can silently de-index your entire WordPress site from Google. No warnings. No error messages. Your site just quietly disappears from search results while you sit there wondering why your traffic dropped off a cliff. Robots.txt misconfigurations are the kind of SEO disaster that can take months to notice and weeks to recover from. Let’s talk about how to get this right.
What robots.txt Actually Does (And What It Doesn’t)
Your robots.txt file lives at yoursite.com/robots.txt and is one of the first things Googlebot requests when it visits your site. It contains directives that tell crawlers which parts of your site they’re allowed to access and which they should skip. The key word here is “allowed”: robots.txt is not a security mechanism, and it controls crawling, not indexing. A blocked URL can still show up in search results, usually without a description, if other sites link to it. All the file does is tell well-behaved bots not to crawl certain paths.
The most catastrophic mistake you can make is also embarrassingly easy: accidentally disallowing everything. The offending rule is Disallow: / placed under User-agent: *. That single forward slash tells every crawler that honors the file to stay away from your entire site. It can end up in your live robots.txt if you copy a staging configuration to production without checking it, or if a plugin generates an overaggressive robots.txt without your knowledge.
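To see the effect concretely, here is a minimal sketch using Python's standard-library urllib.robotparser. The example.com URLs are placeholders, and this parser only approximates Googlebot's matching (it does not implement Google's wildcard or longest-match rules), so treat it as an illustration rather than a definitive check.

```python
from urllib.robotparser import RobotFileParser


def is_allowed(robots_txt: str, url: str, agent: str = "Googlebot") -> bool:
    """Return True if `agent` may crawl `url` under the given robots.txt text."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(agent, url)


# The accidental "block everything" file described above.
BLOCK_EVERYTHING = """\
User-agent: *
Disallow: /
"""

# An empty Disallow value blocks nothing.
ALLOW_EVERYTHING = """\
User-agent: *
Disallow:
"""

# Placeholder paths; print (path, allowed-when-blocked, allowed-when-open).
for path in ("/", "/blog/hello-world/", "/wp-content/uploads/logo.png"):
    url = "https://example.com" + path
    print(path, is_allowed(BLOCK_EVERYTHING, url), is_allowed(ALLOW_EVERYTHING, url))
```

The only difference between the two files is a single slash, which is why this mistake is so easy to miss in a quick read.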
How WordPress Handles robots.txt
WordPress generates a virtual robots.txt file by default. If there’s no physical robots.txt file in your root directory, WordPress intercepts requests to that URL and generates one on the fly. This default file allows all crawlers to access all content, with a single exception: the /wp-admin/ directory, which is correctly excluded from crawling (recent versions still allow /wp-admin/admin-ajax.php inside it, because front-end features can depend on it).
The problem arises when a physical robots.txt file in your root directory overrides the WordPress-generated one, or when a plugin, such as an SEO or security plugin, generates its own robots.txt rules that conflict with your intentions. A physical file typically wins because the web server serves it before WordPress ever sees the request. Check your site’s root directory via FTP or your hosting file manager right now. If there’s a robots.txt file there, open it and read every line.
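If you have shell access instead of FTP, the same check can be scripted. This is a rough sketch: the function name disallow_lines and the /var/www/html path are assumptions for illustration, so substitute your own document root.

```python
from pathlib import Path


def disallow_lines(docroot: str):
    """Return (line_number, rule) for each Disallow rule in a physical
    robots.txt under `docroot`, or None if no physical file exists."""
    robots = Path(docroot) / "robots.txt"
    if not robots.is_file():
        # Nothing on disk, so requests fall through to WordPress's virtual file.
        return None
    found = []
    lines = robots.read_text(encoding="utf-8").splitlines()
    for number, line in enumerate(lines, 1):
        rule = line.split("#", 1)[0].strip()  # drop trailing comments
        if rule.lower().startswith("disallow:"):
            found.append((number, rule))
    return found


# Example call with a hypothetical document root:
# print(disallow_lines("/var/www/html"))
```

A result of None means there is no physical file to audit; any list entry reading Disallow: / deserves immediate attention.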
The Search Engine Visibility Setting: The Silent Killer
WordPress has a built-in setting that asks search engines not to index any page on your site. It’s located at Settings > Reading > “Discourage search engines from indexing this site.” This checkbox is often left enabled by accident on sites that were first built in a staging environment. It’s meant to keep your site out of search results while you’re building it, but it’s an absolute disaster if you forget to uncheck it before launch.
When this setting is enabled, current WordPress versions print a robots meta tag containing noindex in the head of every page. Versions before 5.3 instead wrote Disallow: / into the virtual robots.txt. Either one is enough to tank your rankings, and a site that carries both (say, a noindex tag plus a leftover Disallow: / in a physical file) is a complete catastrophe. Check this setting right now, especially if you launched or migrated your site recently.
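You can also confirm what a page actually sends rather than trusting the checkbox. The sketch below looks for noindex in a robots meta tag and in an X-Robots-Tag response header. The has_noindex helper is a hypothetical name, and you would pass it HTML and headers you have fetched yourself.

```python
from html.parser import HTMLParser
from typing import Mapping, Optional


class _RobotsMetaParser(HTMLParser):
    """Collect directives from <meta name="robots"> and <meta name="googlebot">."""

    def __init__(self):
        super().__init__()
        self.directives = []

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        attr = {key.lower(): (value or "") for key, value in attrs}
        if attr.get("name", "").lower() in ("robots", "googlebot"):
            content = attr.get("content", "")
            self.directives += [d.strip().lower() for d in content.split(",")]


def has_noindex(html: str, headers: Optional[Mapping[str, str]] = None) -> bool:
    """Return True if the page asks search engines not to index it."""
    parser = _RobotsMetaParser()
    parser.feed(html)
    header_value = ""
    for name, value in (headers or {}).items():
        if name.lower() == "x-robots-tag":
            header_value = value.lower()
    return "noindex" in parser.directives or "noindex" in header_value
```

Run it against your homepage and a few key posts after every launch; a single True on a page you want ranked means the setting (or a plugin) is still discouraging indexing.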
What Should Actually Be in Your robots.txt
For most WordPress sites, a clean robots.txt is minimal. You want to block crawlers from your admin area, your login page, and optionally your feeds if you don’t want them crawled. You do not want to block crawlers from /wp-content/uploads/, because that’s where your images live, and Google needs to crawl them to understand your page content.
A sensible baseline looks like this: Allow all user agents to crawl everything, then add specific disallow rules for /wp-admin/, /wp-login.php, and /xmlrpc.php. Add a Sitemap: line pointing to your sitemap index URL. That’s it. Anything more complicated than this should be deliberately chosen, not accidentally inherited from a template you found on Stack Overflow three years ago.
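Written out, that baseline might look like the file embedded below, checked with Python's standard-library parser. The sitemap URL and the example.com paths are placeholders, so adjust them to whatever your sitemap plugin actually generates.

```python
from urllib.robotparser import RobotFileParser

# The baseline described above; the Sitemap URL is a placeholder.
BASELINE = """\
User-agent: *
Disallow: /wp-admin/
Disallow: /wp-login.php
Disallow: /xmlrpc.php

Sitemap: https://example.com/sitemap_index.xml
"""


def load(robots_txt: str) -> RobotFileParser:
    """Parse robots.txt text without touching the network."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser


parser = load(BASELINE)

# Print whether Googlebot may crawl a few representative paths.
for path in ("/", "/wp-content/uploads/photo.jpg", "/wp-admin/", "/wp-login.php"):
    print(path, parser.can_fetch("Googlebot", "https://example.com" + path))

# Sitemap URLs declared in the file (site_maps() needs Python 3.8+).
print(parser.site_maps())
```

Content and uploads stay crawlable while the three administrative paths do not, which is the whole point of keeping the file this short.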
Auditing and Monitoring Your robots.txt
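Google Search Console used to offer a standalone robots.txt Tester under Legacy Tools, but Google retired it in late 2023. Its replacement is the robots.txt report under Settings, which lists the robots.txt files Google found for your site, when each was last fetched, and any errors or warnings. Use it. To test a specific URL, run it through the URL Inspection tool, which reports whether crawling is allowed. Also check the Page indexing report (formerly Coverage) for any URLs marked “Blocked by robots.txt” that shouldn’t be.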
Set a calendar reminder to check your robots.txt after every major plugin update or site migration. Plugin updates can regenerate your robots.txt with new directives. Site migrations almost always introduce robots.txt problems, usually because the staging configuration gets promoted to production without review. Treat your robots.txt like the critical infrastructure document it is, and you’ll avoid one of the most common — and most damaging — WordPress SEO mistakes out there.
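The calendar reminder can be backed by a small script. The sketch below is one possible approach, not a standard tool: it fingerprints the live file so you can spot changes between runs and lists any must-rank URLs that have become blocked. The function names and example.com URLs are assumptions, and Python's parser only approximates Googlebot's matching.

```python
import hashlib
from urllib.request import urlopen
from urllib.robotparser import RobotFileParser


def fetch_robots(site: str) -> str:
    """Download robots.txt for `site`, e.g. "https://example.com"."""
    with urlopen(site.rstrip("/") + "/robots.txt", timeout=10) as response:
        return response.read().decode("utf-8", errors="replace")


def fingerprint(robots_txt: str) -> str:
    """Stable hash of the file so changes can be detected between runs."""
    return hashlib.sha256(robots_txt.encode("utf-8")).hexdigest()


def blocked_urls(robots_txt: str, urls, agent: str = "Googlebot"):
    """Return the subset of `urls` that `agent` is not allowed to crawl."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return [u for u in urls if not parser.can_fetch(agent, u)]


def audit(robots_txt: str, must_crawl, last_fingerprint=None):
    """Summarise whether the file changed and which key URLs are blocked."""
    current = fingerprint(robots_txt)
    return {
        "fingerprint": current,
        "changed": last_fingerprint is not None and current != last_fingerprint,
        "blocked": blocked_urls(robots_txt, must_crawl),
    }


# Example run against a live site (placeholder domain):
# report = audit(fetch_robots("https://example.com"),
#                ["https://example.com/", "https://example.com/blog/"])
# print(report)
```

Store the fingerprint from each run and pass it back in as last_fingerprint next time; a changed file or a non-empty blocked list after a plugin update or migration is your cue to read the file line by line.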