Your canonical tag is a hint, and you probably pointed it at the wrong URL.
The canonical tag is one line of HTML, which makes it easy to copy from a template, easy to leave pointing at the wrong place, and easy to forget. It tells search engines which URL is the "real" one when the same or very similar content lives at multiple addresses. Get it right and your ranking signals consolidate onto the page you actually want indexed. Get it wrong and you do one of two damaging things: you deindex pages you wanted to rank, or you split ranking authority across duplicate URLs that should have been merged.
Here is the part most developers miss: a canonical is a hint, not a command. Google treats <link rel="canonical"> as one signal among several, and it can quietly ignore a canonical that contradicts your internal links, your sitemap, or your redirects. That single fact explains most of the confusing canonical behavior you will ever see, and it is why the mistakes below are so costly. Every one of them is visible in your HTML or response headers before a crawler ever disagrees with you.
What's in this post
- What a canonical actually does, and why it is only a hint
- Self-referencing canonicals: the correct default
- Absolute, never relative: the ambiguity that bites
- The four canonical mistakes that deindex real pages
- Canonical vs. noindex: do not combine them
- How to verify all of this in one scan
What a canonical actually does, and why it is only a hint
When the same content is reachable at multiple URLs (a product page with tracking parameters, a www and a non-www version, a trailing-slash variant, a print view), search engines have to pick one URL to index and rank. Left to themselves, they guess. The canonical tag lets you make the call instead, consolidating the duplicate-content signals onto the URL you choose. Google documents this in Consolidate duplicate URLs.
The tag lives in the <head>:
<head>
<link rel="canonical" href="https://example.com/products/widget" />
</head>
The critical nuance: Google calls the URL it actually picks the canonical, and your tag is only one input to that decision. Internal links, the URL in your sitemap, and HTTP redirects all feed in. If those signals point one way and your tag points another, Google can and does override your tag. So the canonical tag is not a directive you issue; it is a vote you cast, and a low-confidence or self-contradictory vote gets discarded. Everything that follows is about making your vote unambiguous and consistent with every other signal.
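Before you can compare your vote against the other signals, you need to read it out of the page. The following is a minimal sketch using only Python's standard library; the class and function names (CanonicalFinder, find_canonicals) are illustrative, and it scans the whole document rather than only the <head>, which is a simplification.

```python
from html.parser import HTMLParser


class CanonicalFinder(HTMLParser):
    """Collects the href of every <link rel="canonical"> in a document."""

    def __init__(self):
        super().__init__()
        self.canonicals = []

    def handle_starttag(self, tag, attrs):
        # Self-closing <link ... /> tags are routed here as well.
        if tag != "link":
            return
        attr = dict(attrs)
        # rel is a space-separated, case-insensitive list of tokens.
        rel_tokens = (attr.get("rel") or "").lower().split()
        if "canonical" in rel_tokens and attr.get("href"):
            self.canonicals.append(attr["href"].strip())


def find_canonicals(html):
    """Return every canonical href found in the markup, in document order."""
    finder = CanonicalFinder()
    finder.feed(html)
    return finder.canonicals
```

An empty list means the page has no canonical tag, and more than one entry means the page is casting several votes at once, which is worth investigating on its own.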
Self-referencing canonicals: the correct default
For any page that should be indexed on its own, the recommended setup is a canonical that points to itself:
<!-- On https://example.com/blog/canonical-tags this is correct -->
<link rel="canonical" href="https://example.com/blog/canonical-tags" />
This looks redundant. It is not. A self-referencing canonical removes ambiguity for the crawler and, more importantly, protects you from the duplicate URLs you did not create on purpose. The moment someone shares your page with ?utm_source=newsletter or ?ref=twitter appended, that is a new URL serving identical content. Without a self-referencing canonical, Google may index the parameter version and split signals. With one, every parameter variant points back to the clean URL.
The rule of thumb: if a page should be indexed as itself, give it a self-referencing canonical. Only point a canonical at a different URL when the page is genuinely a duplicate that should not rank independently. Do not leave indexable pages with no canonical at all and hope Google figures it out, because that is exactly the guessing you are trying to prevent.
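The self-referencing check can be sketched in a few lines of Python. This version assumes that every query string on the site is tracking noise (as with ?utm_source=newsletter above); a site with meaningful query parameters would need a narrower rule. The function names are illustrative.

```python
from urllib.parse import urlsplit, urlunsplit


def strip_query_and_fragment(url):
    """Drop the query string and fragment, keeping scheme, host, and path."""
    parts = urlsplit(url)
    return urlunsplit((parts.scheme, parts.netloc.lower(), parts.path, "", ""))


def is_self_referencing(page_url, canonical_href):
    """True when the canonical names the clean form of the fetched URL."""
    return strip_query_and_fragment(page_url) == canonical_href
```

Called with the newsletter variant of https://example.com/blog/canonical-tags and the clean canonical from the example above, the check passes; called with a canonical of https://example.com/, it fails.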
Absolute, never relative: the ambiguity that bites
A canonical value should always be an absolute URL: full protocol, full host, full path. Relative canonicals are technically allowed, but they are ambiguous and easy to resolve incorrectly.
<!-- Wrong: relative, ambiguous across http/https and www/non-www -->
<link rel="canonical" href="/products/widget" />
<!-- Right: absolute, no room for misinterpretation -->
<link rel="canonical" href="https://example.com/products/widget" />
The problem with /products/widget is that it inherits the protocol and host of whatever URL the crawler happened to fetch. If your site answers on both http:// and https://, or on both www. and the bare domain, a relative canonical resolves to a different absolute URL depending on which variant got crawled. Now your "consolidation" tag is itself producing duplicates. An absolute canonical names exactly one URL and ends the ambiguity. This is the single cheapest canonical mistake to avoid, so make every canonical absolute, every time.
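The ambiguity is easy to reproduce with Python's urljoin, which applies the standard relative-reference resolution rules. This sketch assumes the page has no <base href> element (which would change the base URL) and that the three variants below all serve the same page.

```python
from urllib.parse import urljoin

# URL variants that all serve the same page (assumed for illustration).
FETCHED_VARIANTS = [
    "http://example.com/products/widget",
    "https://example.com/products/widget",
    "https://www.example.com/products/widget",
]


def resolved_targets(canonical_href):
    """Resolve one canonical href against every fetched variant."""
    return {urljoin(fetched, canonical_href) for fetched in FETCHED_VARIANTS}


relative_targets = resolved_targets("/products/widget")
absolute_targets = resolved_targets("https://example.com/products/widget")

print(len(relative_targets))  # 3: one canonical target per crawled variant
print(len(absolute_targets))  # 1: the same target no matter what was crawled
```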
The four canonical mistakes that deindex real pages
These four account for the large majority of canonical damage in production. All four are detectable in the page's HTML or response headers.
MISTAKE                       | WHAT IT DOES                            | BLAST RADIUS
-------------------------------------------------------------------------------------------
Everything to the homepage    | tells Google only the homepage is real  | whole site can be deindexed
Templated staging/example URL | canonicalizes prod to a dead URL        | affected templates gone
http/https or www mismatch    | points at a redirecting variant         | signals leak, page drops
HTTP Link header disagrees    | two conflicting canonicals              | Google may ignore both
1. Every page canonicalizing to the homepage. A developer hardcodes <link rel="canonical" href="https://example.com/" /> into the shared layout, and now every URL on the site tells Google that the homepage is the real version of it. Google obliges and drops the rest from the index. This can deindex an entire site, and it is alarmingly easy to ship because the homepage itself still looks fine.
2. A templated staging or example URL left in production. The canonical is generated from a config value that never got switched over, so production pages canonicalize to https://staging.example.com/... or a literal https://example.com/page placeholder that does not exist. Google follows the hint to a URL it cannot index, and the real page suffers.
3. An http-vs-https or www mismatch. Your live page is https://www.example.com/page, but the canonical points at http://example.com/page, which 301-redirects back. You have now pointed your canonical at a URL that redirects, which is a contradictory signal. Make the canonical match the final, non-redirecting URL exactly. If you are consolidating hosts, the redirects and the canonical must agree; a redirect checker is the fastest way to confirm the http/https and www/non-www variants all resolve to the one host your canonical names.
4. A conflicting HTTP Link header. Canonical can also be delivered in the HTTP response, not just the head tag:
Link: <https://example.com/products/widget>; rel="canonical"
This is the right way to canonicalize non-HTML files like PDFs, which have no <head>. But if you set both a Link header and a head tag and they disagree, you have handed Google two contradictory canonicals for the same page. When the signals conflict, Google's confidence drops and it may ignore your input entirely and pick its own canonical. If you use both, they must name the identical URL.
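The four checks above can be sketched as one small audit function. Everything named here is hypothetical: audit_canonical, the production_host parameter, and the finding codes are illustrative, not part of any tool. The sketch assumes tag_href is already absolute, that page_url is the final URL after redirects were followed, and that the Link header has no commas inside quoted parameter values. It does not fetch anything, so it cannot confirm whether the canonical target itself redirects.

```python
import re
from urllib.parse import urlsplit

# One "<url>; params" entry of a Link header value. Simplified: assumes no
# commas inside quoted parameter values.
_LINK_ENTRY = re.compile(r"<([^>]*)>([^,]*)")
_REL_PARAM = re.compile(r';\s*rel\s*=\s*"?([^";]*)"?', re.IGNORECASE)


def canonical_from_link_header(header_value):
    """Return the URL of the rel="canonical" entry in a Link header, or None."""
    for url, params in _LINK_ENTRY.findall(header_value or ""):
        match = _REL_PARAM.search(params)
        if match and "canonical" in match.group(1).lower().split():
            return url.strip()
    return None


def _bare_host(host):
    """Lowercase a hostname and drop a leading 'www.' for comparison."""
    host = (host or "").lower()
    return host[4:] if host.startswith("www.") else host


def audit_canonical(page_url, tag_href, production_host, link_header=None):
    """Return finding codes (hypothetical names) for the four mistakes."""
    findings = []
    page, tag = urlsplit(page_url), urlsplit(tag_href)

    # Mistake 1: a non-homepage URL canonicalizing to the site root.
    if tag.path in ("", "/") and page.path not in ("", "/"):
        findings.append("homepage")

    # Mistake 2: the canonical names a host that is not production.
    if _bare_host(tag.hostname) != _bare_host(production_host):
        findings.append("wrong-host")
    # Mistake 3: same site, but a different scheme or www variant.
    elif (tag.scheme, tag.hostname) != (page.scheme, page.hostname):
        findings.append("scheme-or-www-mismatch")

    # Mistake 4: the Link header names a different URL than the head tag.
    header_href = canonical_from_link_header(link_header)
    if header_href is not None and header_href != tag_href:
        findings.append("header-conflict")

    return findings
```

An empty list means none of the four patterns matched. Passing the production host in explicitly is a deliberate choice: a leftover staging host cannot be recognized from the page alone, only by comparison with the host you expect.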
Canonical vs. noindex: do not combine them
Canonical and noindex solve different problems and must not be mixed to mean "deindex this page."
<!-- noindex: keep this page OUT of the index entirely -->
<meta name="robots" content="noindex" />
<!-- canonical: this page IS a duplicate of another indexable page -->
<link rel="canonical" href="https://example.com/the-real-page" />
A canonical says "index that other URL instead of me, and merge my signals into it." A noindex says "do not index me, and do not pass my signals anywhere." Putting both on the same page sends mixed signals: you are simultaneously telling Google to consolidate this page into another (which requires Google to process the page) and to ignore the page completely. Google has warned against combining them precisely because the instructions contradict each other, and the result is unpredictable. Pick one. If a page is a duplicate that should consolidate, use canonical alone. If a page should never appear in search, use noindex alone.
The canonical tag is a head-tag concern that lives right next to your title, meta description, and robots directives, so it is worth auditing the whole <head> together; see common meta tag mistakes for the rest of that block.
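Spotting the combination is mechanical. The sketch below flags any page that carries both a canonical tag and a noindex robots directive, following the "pick one" rule above. The class and function names are illustrative, and it inspects only the markup: a noindex sent through an X-Robots-Tag response header is outside its scope.

```python
from html.parser import HTMLParser


class HeadSignals(HTMLParser):
    """Records whether a page declares a canonical and a noindex directive."""

    def __init__(self):
        super().__init__()
        self.canonical = None
        self.noindex = False

    def handle_starttag(self, tag, attrs):
        attr = dict(attrs)
        if tag == "link":
            if "canonical" in (attr.get("rel") or "").lower().split():
                self.canonical = attr.get("href")
        elif tag == "meta":
            if (attr.get("name") or "").lower() in ("robots", "googlebot"):
                content = (attr.get("content") or "").lower()
                directives = [d.strip() for d in content.split(",")]
                # "none" is shorthand for "noindex, nofollow".
                if "noindex" in directives or "none" in directives:
                    self.noindex = True


def has_conflicting_signals(html):
    """True when the page declares both a canonical and a noindex."""
    signals = HeadSignals()
    signals.feed(html)
    return signals.noindex and signals.canonical is not None
```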
How to verify all of this in one scan
Every mistake above is visible before Google ever disagrees with you. But checking by hand means reading the <head> on every page, resolving each canonical to an absolute URL, confirming it is self-referencing where it should be, checking the HTTP Link header for a second conflicting canonical, and verifying that the named URL does not redirect. Then you repeat all of that for every page on the site.
The LintPage Canonical Tag Checker does it in one request. It confirms a rel="canonical" tag is present, resolves it to an absolute URL, tells you whether the canonical is self-referencing or points at a different page or domain, checks whether a canonical is also declared in the HTTP Link header, and flags relative or malformed canonical values, the exact failure modes that quietly deindex pages.
Canonical problems rarely travel alone. The same pages with a stray homepage canonical often have duplicate URLs leaking into the sitemap, because only canonical, indexable URLs belong in a sitemap; if a non-canonical variant is listed, you are sending Google another contradictory signal (more on that in why your sitemap is broken). It is worth running the full set of checks so the canonical, the sitemap, the redirects, and the head tags all agree.
The 30-second version
The canonical tag tells search engines which URL is the real one when the same content lives at multiple addresses, but it is a hint, not a command: Google can ignore a canonical that contradicts your links, sitemap, or redirects. Give every indexable page a self-referencing canonical so tracking parameters do not split your signals. Make every canonical absolute (full https://host/path), never relative, because relative ones resolve differently across http/https and www/non-www. The four mistakes that deindex real pages: every page canonicalizing to the homepage, a templated staging or example URL left in production, an http/https or www mismatch pointing at a redirecting variant, and an HTTP Link header that disagrees with the head tag. Never combine canonical and noindex to mean "deindex this," because the two instructions contradict each other. Find these in your HTML and response headers now, before Search Console finds them for you.