What are Canonical URLs & Why Are They Important?

Naushil Jain
4 min read · Dec 12, 2020
Photo by MayoFi on Unsplash

A canonical URL tells search engines that certain similar URLs actually point to the same content. Sometimes a single page is accessible through multiple URLs, or different pages carry similar content (for example, a page with both a mobile and a desktop version). A search engine treats these as duplicate versions of the same page: it chooses one URL as the canonical version and crawls that one, while all other URLs are considered duplicates and crawled less often. The canonical link element was introduced by Google, Bing, and Yahoo! in February 2009.

A canonical URL is declared with an HTML link element carrying the attribute rel="canonical" (also known as a canonical tag), placed in the <head> element of the webpage.

Take for example the following URLs:

naushiljain.medium.com

https://naushiljain.medium.com

https://m.naushiljain.medium.com

www.naushiljain.medium.com

https://www.naushiljain.medium.com

Each URL refers to the same homepage content for my naushiljain.medium.com blog, but the URLs themselves are slightly different. This can be an issue because the search engine doesn’t necessarily know which URL should be the source of truth, so it may choose a canonical URL algorithmically for you.

In other words, if you have a web page accessible by multiple URLs, or different pages with similar content (e.g. separate mobile and desktop versions), you should tell the search engine which URL is authoritative (canonical) for that page.
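To make the idea concrete, here is a minimal Python sketch that maps the URL variants listed above onto one preferred form. The rules it applies (force HTTPS, drop a leading www. or m. label, drop a trailing slash) and the function name canonicalize are assumptions for illustration only; this is not how any search engine actually selects a canonical.

```python
from urllib.parse import urlsplit, urlunsplit

# Hypothetical rule set: host prefixes treated as variants of the same site.
VARIANT_PREFIXES = ("www.", "m.")


def canonicalize(url: str) -> str:
    """Map a URL variant to one preferred form (illustrative rules only)."""
    # Bare hosts such as "naushiljain.medium.com" carry no scheme; add one so
    # urlsplit places the host in netloc instead of path.
    if "://" not in url:
        url = "https://" + url
    parts = urlsplit(url)
    host = parts.netloc.lower()
    for prefix in VARIANT_PREFIXES:
        if host.startswith(prefix):
            host = host[len(prefix):]
            break
    path = parts.path.rstrip("/")
    # Always emit HTTPS and drop any fragment.
    return urlunsplit(("https", host, path, parts.query, ""))


variants = [
    "naushiljain.medium.com",
    "https://naushiljain.medium.com",
    "https://m.naushiljain.medium.com",
    "www.naushiljain.medium.com",
    "https://www.naushiljain.medium.com",
]
print({canonicalize(u) for u in variants})
# prints {'https://naushiljain.medium.com'}
```

All five variants collapse to a single URL, which is the outcome you want a search engine to reach as well.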

Why should we choose a canonical URL?

There are a number of reasons why you would want to explicitly choose a canonical page in a set of duplicate/similar pages:

  1. It will help you to show which URL you want people to see in search results.

You might prefer people to reach your profile page via:

https://naushiljain.medium.com

Rather than:

https://m.naushiljain.medium.com

Using canonicals can help you keep things “clean.”

2. You can manage your website's syndicated content.

If you syndicate your content for publication on other domains, you want to consolidate page ranking to your preferred URL.

3. It will avoid spending crawl time on duplicate pages.

4. It will simplify tracking metrics for a single product/topic.

If your website exposes a variety of URLs for the same content, it’s more challenging to get consolidated metrics for that specific piece of content.

How can I find my canonical URL, according to Google?

Use the URL Inspection tool to learn which page Google considers canonical. Note that even if you explicitly designate a canonical page, Google might choose a different canonical for various reasons, such as performance or content.


Important Notes

  • Do not use the robots.txt file for canonicalization purposes.
  • Do not use the URL removal tool for canonicalization. It removes all versions of a URL from search.
  • Do not use noindex as a means to prevent the selection of a canonical page. This directive is intended to exclude the page from the index, not to manage the choice of a canonical page.
  • Prefer HTTPS over HTTP for canonical URLs.

Methods to define canonical URLs

Choose one of the following methods to specify a canonical URL, and be sure to follow the important notes above for all methods.

  1. Using rel="canonical" link tag

You can use a <link> tag in the page header to indicate when a page is a duplicate of another page. Suppose you want https://example.com/games/bolo-tarara to be the canonical URL, even though a variety of URLs can access this content. Indicate this URL as canonical with these steps:

  • Mark all duplicate pages with a rel="canonical" link element. Add a <link> element with the attribute rel="canonical" to the <head> section of each duplicate page, pointing to the canonical page, like this one:

<link rel="canonical" href="https://example.com/games/bolo-tarara" />

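If the canonical page has a mobile variant, add a rel="alternate" link to the canonical page, pointing to the mobile version of the page (assumed here to live at m.example.com):

<link rel="alternate" media="only screen and (max-width: 640px)" href="https://m.example.com/games/bolo-tarara">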
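If you want to check which canonical a page actually declares, the tag can be read back out of the HTML. The following is a rough Python sketch using only the standard library; the class name CanonicalFinder and the helper find_canonical are made up for this example, and the parser does not verify that the tag sits inside <head>.

```python
from html.parser import HTMLParser
from typing import Optional


class CanonicalFinder(HTMLParser):
    """Records the href of the first <link rel="canonical"> it encounters."""

    def __init__(self) -> None:
        super().__init__()
        self.canonical: Optional[str] = None

    def handle_starttag(self, tag, attrs):
        # Only the first canonical link is kept; later ones are ignored.
        if tag != "link" or self.canonical is not None:
            return
        attr = dict(attrs)
        # rel may hold several space-separated values, so split before testing.
        rel_values = (attr.get("rel") or "").lower().split()
        if "canonical" in rel_values:
            self.canonical = attr.get("href")


def find_canonical(html: str) -> Optional[str]:
    """Return the declared canonical URL, or None if the page has none."""
    finder = CanonicalFinder()
    finder.feed(html)
    return finder.canonical


page = """<html><head>
<link rel="canonical" href="https://example.com/games/bolo-tarara" />
</head><body>...</body></html>"""
print(find_canonical(page))
# prints https://example.com/games/bolo-tarara
```

Keep in mind this only tells you what the page declares. As noted earlier, Google may still pick a different canonical.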

2. Using rel="canonical" HTTP header

If you can configure your server, you can use rel="canonical" HTTP headers (rather than HTML tags) to indicate the canonical URL for non-HTML documents such as PDF files.

For example, if you expose a PDF file through multiple URLs, you can return a rel="canonical" HTTP header such as the following for the duplicate URLs to tell Googlebot which URL is canonical for the PDF file:

Link: <http://www.example.com/downloads/docs.pdf>; rel="canonical"
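As a rough illustration of how a server could attach that header, here is a small Python sketch built on the standard library's http.server. The helper canonical_link_header, the handler class PdfHandler, and the placeholder response body are assumptions for the example; a real site would normally set this header in its web server or framework configuration.

```python
from http.server import BaseHTTPRequestHandler
from typing import Tuple

# Canonical URL taken from the example header above.
CANONICAL_PDF = "http://www.example.com/downloads/docs.pdf"


def canonical_link_header(canonical_url: str) -> Tuple[str, str]:
    """Build the (name, value) pair for a rel="canonical" Link header."""
    return ("Link", f'<{canonical_url}>; rel="canonical"')


class PdfHandler(BaseHTTPRequestHandler):
    """Hypothetical handler: every URL it serves points at one canonical PDF."""

    def do_GET(self):
        name, value = canonical_link_header(CANONICAL_PDF)
        self.send_response(200)
        self.send_header("Content-Type", "application/pdf")
        self.send_header(name, value)
        self.end_headers()
        # Placeholder bytes only; a real handler would stream the PDF file.
        self.wfile.write(b"%PDF-1.4\n")


print(": ".join(canonical_link_header(CANONICAL_PDF)))
# prints Link: <http://www.example.com/downloads/docs.pdf>; rel="canonical"
```

To try it locally, you could pass PdfHandler to http.server.HTTPServer and inspect the response headers with your browser's developer tools.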

3. Using Sitemap

Pick a canonical URL for each of your pages and submit them in a sitemap. All pages listed in a sitemap are suggested as canonicals; Googlebot will decide which pages (if any) are duplicates, based on the similarity of content. Google doesn’t guarantee that it will consider the sitemap URLs to be canonical, but a sitemap is a simple way of defining canonicals for a large site, and a useful way to tell Google which pages you consider most important on your site.
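For a small site, a sitemap like this can be generated with a few lines of code. Below is a minimal Python sketch, assuming you already have a list of the URLs you consider canonical; the function name build_sitemap is hypothetical, and only the required <loc> element of each entry is written.

```python
import xml.etree.ElementTree as ET

# Namespace defined by the sitemaps.org protocol.
SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"


def build_sitemap(canonical_urls):
    """Return sitemap XML (as a string) listing one <url> entry per canonical URL."""
    urlset = ET.Element("urlset", xmlns=SITEMAP_NS)
    for url in canonical_urls:
        entry = ET.SubElement(urlset, "url")
        ET.SubElement(entry, "loc").text = url
    body = ET.tostring(urlset, encoding="unicode")
    # Prepend the XML declaration by hand so the declared encoding is explicit.
    return '<?xml version="1.0" encoding="UTF-8"?>\n' + body


print(build_sitemap([
    "https://example.com/games/bolo-tarara",
    "https://naushiljain.medium.com",
]))
```

Only list the canonical version of each page here. Adding the duplicate URLs as well would blur the signal the sitemap is meant to send.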
