Indexability
Indexability is the property of a URL being eligible to appear in search results. Five gates must all open for a URL to be indexed — close any one, and the page disappears from search regardless of content quality.
What is Indexability?
Indexability is the property of a URL being eligible to appear in a search engine's index. A page is indexable when every gatekeeper in the pipeline gives a green light: robots.txt allows crawling, the server returns 200, the HTML or HTTP header carries no noindex directive, the canonical points to the URL itself, and the URL is discoverable through internal links or the sitemap.
If any single gate is closed, the URL falls out of the index regardless of its content quality or backlink profile. This is why indexability audits are the first step of every technical SEO engagement.
The 5-Gate Indexability Checklist
- Crawlable: robots.txt does not disallow the path
- Reachable: server returns HTTP 200 (not 4xx, 5xx, or redirect chain)
- Indexable directive: no `noindex` in meta robots or X-Robots-Tag header
- Canonical self-reference: `rel=canonical` points to the URL itself, not a different version
- Discoverable: linked from at least one indexable page or listed in sitemap.xml
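The five gates can be sketched as a single pure check. This is a minimal sketch, not a crawler: `PageSnapshot` is a hypothetical container for data you would have already fetched, and the gate names mirror the checklist above.

```python
from dataclasses import dataclass
from urllib.robotparser import RobotFileParser

@dataclass
class PageSnapshot:
    """Inputs a crawler would have collected for one URL (hypothetical shape)."""
    url: str
    status_code: int
    robots_txt: str          # raw robots.txt body for the host
    robots_meta: str         # content of <meta name="robots">, "" if absent
    x_robots_tag: str        # X-Robots-Tag response header, "" if absent
    canonical: str           # href of rel=canonical, "" if absent
    has_inbound_link_or_sitemap_entry: bool

def indexability_gates(page: PageSnapshot) -> dict:
    """Evaluate the five gates; every value must be True for the URL to be indexable."""
    rp = RobotFileParser()
    rp.parse(page.robots_txt.splitlines())
    directives = (page.robots_meta + "," + page.x_robots_tag).lower()
    return {
        "crawlable": rp.can_fetch("Googlebot", page.url),
        "reachable": page.status_code == 200,
        "no_noindex": "noindex" not in directives,
        # A missing canonical counts as self-referencing: absence is not a block.
        "self_canonical": page.canonical in ("", page.url),
        "discoverable": page.has_inbound_link_or_sitemap_entry,
    }

page = PageSnapshot(
    url="https://example.com/blog/post-1",
    status_code=200,
    robots_txt="User-agent: *\nDisallow: /admin/",
    robots_meta="index, follow",
    x_robots_tag="",
    canonical="https://example.com/blog/post-1",
    has_inbound_link_or_sitemap_entry=True,
)
gates = indexability_gates(page)
print(all(gates.values()))  # True: all five gates open
```

Keeping the check as a pure function over collected data makes it easy to run across a full crawl export rather than one URL at a time.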
The 4 Search Console Indexability States
| State | Meaning | Action |
|---|---|---|
| Submitted and indexed | In sitemap, in index, served in SERPs | Healthy — monitor only |
| Crawled - currently not indexed | Google fetched but chose not to index | Improve content quality, add inbound links |
| Discovered - currently not indexed | Google found URL but did not crawl yet | Crawl budget issue or low-priority signal |
| Excluded by noindex tag | Directive blocks indexing | Remove noindex if accidental |
Common Indexability Killers
Accidental noindex
Most common cause: a staging environment ships `<meta name="robots" content="noindex">` to production, or a CMS template includes a default noindex on category archives. Audit with Meta Tag Generator or curl (note that `-I` fetches headers only, so it reveals an X-Robots-Tag header but not the meta tag in the HTML body):

```
curl -I -A "Googlebot" https://yourdomain.com/page
```
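Because the directive can live in either the HTML or the response header, a useful audit checks both. A minimal sketch using only the standard library (the example HTML strings are illustrative):

```python
from html.parser import HTMLParser

class RobotsMetaParser(HTMLParser):
    """Collect content of <meta name="robots"> and <meta name="googlebot"> tags."""
    def __init__(self):
        super().__init__()
        self.directives = []

    def handle_starttag(self, tag, attrs):
        if tag == "meta":
            a = dict(attrs)
            if a.get("name", "").lower() in ("robots", "googlebot"):
                self.directives.append(a.get("content", "").lower())

def has_noindex(html: str, x_robots_tag: str = "") -> bool:
    """True if a noindex directive appears in the meta tags or the header."""
    parser = RobotsMetaParser()
    parser.feed(html)
    parser.directives.append(x_robots_tag.lower())
    return any("noindex" in d for d in parser.directives)

staging_html = '<html><head><meta name="robots" content="noindex, nofollow"></head></html>'
print(has_noindex(staging_html))                              # True: staging directive shipped live
print(has_noindex("<html></html>", x_robots_tag="noindex"))   # True: header-level block
```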
Canonical pointing elsewhere
If /blog/post-1 declares rel=canonical href="/blog/post-1?utm=campaign", Google treats the UTM version as the preferred URL and may index it instead. Worse: if the canonical points to a 404 or a redirect target, the page can fall out of the index entirely.
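A self-reference check needs light normalization, since canonicals are often relative hrefs. A minimal sketch, deliberately keeping query strings in the comparison because `?utm=...` variants are exactly the mismatch to catch:

```python
from urllib.parse import urljoin, urlsplit

def canonical_mismatch(page_url: str, canonical_href: str) -> bool:
    """True when rel=canonical points somewhere other than the page itself.

    Resolves relative hrefs against the page URL, strips fragments and
    trailing slashes, but keeps the query string on purpose.
    """
    canonical = urljoin(page_url, canonical_href)
    norm = lambda u: urlsplit(u)._replace(fragment="").geturl().rstrip("/")
    return norm(canonical) != norm(page_url)

print(canonical_mismatch("https://example.com/blog/post-1", "/blog/post-1"))                # False: self-referencing
print(canonical_mismatch("https://example.com/blog/post-1", "/blog/post-1?utm=campaign"))   # True: UTM mismatch
```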
Soft 404
Server returns 200 but content is essentially empty ("No results found" pages, expired product pages with just a header). Google classifies these as soft 404s and silently excludes them.
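A rough soft-404 detector can combine a visible-word-count threshold with a list of "empty state" phrases. Both the 40-word cutoff and the phrase list below are heuristic assumptions, not Google's actual classifier:

```python
from html.parser import HTMLParser

class TextExtractor(HTMLParser):
    """Collect visible text, skipping script/style contents."""
    def __init__(self):
        super().__init__()
        self.chunks, self._skip = [], 0

    def handle_starttag(self, tag, attrs):
        if tag in ("script", "style"):
            self._skip += 1

    def handle_endtag(self, tag):
        if tag in ("script", "style") and self._skip:
            self._skip -= 1

    def handle_data(self, data):
        if not self._skip:
            self.chunks.append(data.strip())

EMPTY_PHRASES = ("no results found", "page not found", "0 items")  # heuristic list, not exhaustive

def looks_like_soft_404(html: str, min_words: int = 40) -> bool:
    """Flag a 200-status page whose body is effectively empty."""
    p = TextExtractor()
    p.feed(html)
    text = " ".join(c for c in p.chunks if c)
    return len(text.split()) < min_words or any(ph in text.lower() for ph in EMPTY_PHRASES)

thin = "<html><body><h1>Widgets</h1><p>No results found.</p></body></html>"
print(looks_like_soft_404(thin))  # True: 200 status, but no substantive content
```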
Render-blocked content
SPAs that fail to server-render critical content may have empty initial HTML. Googlebot sees a near-blank page, classifies it as thin content, and skips indexing. Verify with URL Inspection › View Crawled Page › HTML in Search Console.
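A quick smoke test for this failure mode is to check whether phrases you know belong on the page already exist in the raw initial HTML, before any JavaScript runs. The phrases and the SPA shell below are illustrative assumptions:

```python
def server_rendered(initial_html: str, critical_phrases: list[str]) -> bool:
    """Rough check: does the raw HTML (pre-JS) already contain the content
    Googlebot must see? Phrases are sample strings from the rendered page."""
    lowered = initial_html.lower()
    return all(p.lower() in lowered for p in critical_phrases)

# Typical client-side-only app shell: an empty mount point plus a bundle.
spa_shell = '<html><body><div id="root"></div><script src="/bundle.js"></script></body></html>'
print(server_rendered(spa_shell, ["Product specifications", "Add to cart"]))  # False: empty shell
```

If this returns False for content that matters, confirm what Googlebot actually received via URL Inspection before assuming a rendering problem.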
Frequently Asked Questions
What is the difference between crawlable and indexable?
Crawlable means Googlebot can fetch the page. Indexable means Google is allowed to include it in the index after fetching. A page can be crawlable but not indexable (noindex tag) or indexable but not crawlable (robots.txt block); both cases prevent the page from showing in search results.
How long does it take Google to index a new URL?
Anywhere from minutes to 6+ months depending on site authority, internal linking, and crawl budget. High-authority sites with strong internal links see new URLs indexed within 24 hours. Low-authority new sites may wait weeks even for clearly indexable pages.
Why does Google sometimes not index pages that meet all indexability criteria?
Because indexability is necessary but not sufficient. Google also evaluates quality, uniqueness, and demand. Thin content, duplicates, and pages with no traffic potential are routinely classified as 'Crawled - currently not indexed' even when fully indexable.
How do I check indexability for a single URL?
Use Search Console URL Inspection. Enter the URL and review the Indexing section - it shows the indexability verdict, the canonical URL Google chose, last crawl date, and any blocking directives detected.
Can a URL be indexed without being in the sitemap?
Yes. Sitemaps speed up discovery but are not required for indexing. URLs discovered through internal links, external backlinks, or direct submission via URL Inspection can all be indexed without sitemap entries.
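When auditing, it is still useful to know which indexable URLs are missing from the sitemap. A minimal sketch that extracts `<loc>` entries from a standard sitemap (the example sitemap is illustrative):

```python
import xml.etree.ElementTree as ET

SITEMAP_NS = "{http://www.sitemaps.org/schemas/sitemap/0.9}"

def urls_in_sitemap(sitemap_xml: str) -> set[str]:
    """Return the set of <loc> URLs declared in a sitemap.org-format sitemap."""
    root = ET.fromstring(sitemap_xml)
    return {loc.text.strip() for loc in root.iter(SITEMAP_NS + "loc")}

sitemap = """<?xml version="1.0" encoding="UTF-8"?>
<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
  <url><loc>https://example.com/</loc></url>
  <url><loc>https://example.com/blog/post-1</loc></url>
</urlset>"""

print("https://example.com/blog/post-1" in urls_in_sitemap(sitemap))  # True
print("https://example.com/blog/post-2" in urls_in_sitemap(sitemap))  # False: may still be indexed via links
```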
Related Terms & Resources
- Robots.txt Best Practices — the #1 cause of accidental deindexing
- X-Robots-Tag glossary — HTTP-level indexability directives
- Crawl Budget glossary — why some URLs never get crawled
- Robots.txt Tester — verify crawl rules per URL
Part of the PositiveBacklink SEO Glossary. Updated May 2026.