Tuesday, July 28, 2009

Duplicate Content and Canonical Issues Solved by rel=canonical Tag

One of the most common problems for website developers, particularly those with larger, dynamic sites powered by databases, is duplicate content. Search engines are primarily interested in unique documents and text, and when they find multiple instances of the same content, they are likely to select a single one as "canonical" and display only that page in their organic search results.

Duplicate Content and Canonical Issues

If a website has multiple pages with the same content, either because a content management system creates duplicates through separate navigation paths or because several versions of a page exist, the website may be hurting those pages' chances of ranking in the search engine result pages (SERPs). In addition, the value that comes from anchor text and link weight, through both internal and external links to the page, is diluted across the multiple versions. An effective SEO strategy must resolve duplicate content and canonicalization issues, because this is one of the most important on-page SEO elements for ranking successfully in the SERPs.

The good news is that Google and other search engines recently announced support for a canonical link element, rel="canonical", that can help website owners with duplicate content issues.

This tag lets a website owner publicly specify the preferred version of a URL. If a website has identical or very similar content that is accessible through multiple URLs, this element provides more control over the URL returned in search results. It also helps to make sure that properties such as link popularity are consolidated to the preferred version. To do this, specify a link tag in the head section of the page:

Link rel=canonical example:

<link rel="canonical" href="http://www.bangladesh-seo-company.com/services" />

The above tag tells search engine crawlers that the page it appears on should be treated as a copy of the canonical URL http://www.bangladesh-seo-company.com/services. Declaring this URL as canonical would eliminate the following URLs as duplicates:

http://www.bangladesh-seo-company.com/services?trackingid=707
http://www.bangladesh-seo-company.com/services?sessionid=bdseo707
http://www.bangladesh-seo-company.com/services?printable=yes&trackingid=seoexpertbd707
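
The mapping from the three duplicate URLs above to one canonical URL can be sketched in a few lines of Python. This is only an illustrative sketch, not something the search engines provide: the function names and the IGNORED_PARAMS list are hypothetical, and the sketch assumes that the trackingid, sessionid and printable parameters never change what the page is about.

```python
from html import escape
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit

# Assumption: these query parameters do not change the page content.
# The list is hypothetical and taken from the example URLs above.
IGNORED_PARAMS = {"trackingid", "sessionid", "printable"}


def canonical_url(url: str) -> str:
    """Return the URL without ignored query parameters and without a fragment."""
    parts = urlsplit(url)
    kept = [
        (key, value)
        for key, value in parse_qsl(parts.query, keep_blank_values=True)
        if key.lower() not in IGNORED_PARAMS
    ]
    return urlunsplit((parts.scheme, parts.netloc, parts.path, urlencode(kept), ""))


def canonical_link_tag(url: str) -> str:
    """Render the link element to place in the head section of the page."""
    href = escape(canonical_url(url), quote=True)
    return '<link rel="canonical" href="%s" />' % href


if __name__ == "__main__":
    for duplicate in (
        "http://www.bangladesh-seo-company.com/services?trackingid=707",
        "http://www.bangladesh-seo-company.com/services?sessionid=bdseo707",
        "http://www.bangladesh-seo-company.com/services?printable=yes&trackingid=seoexpertbd707",
    ):
        print(canonical_link_tag(duplicate))
```

Each of the three URLs yields the same link tag, which is the point of the exercise: every variant of the page declares one preferred address.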

A few technical points to keep in mind when implementing rel="canonical":
  • The URL paths in the link tag can be absolute or relative, though representatives from search engines recommend using absolute paths to avoid any chance of errors.
  • The tag can only point to a canonical URL within the same domain, not on other domains. For example, the tag on http://seobangladesh.mysite.com can point to a URL on http://www.mysite.com but not on http://www.yoursite.com or any other domain.
  • The tag allows slight differences between the content of a page and its canonical version.
  • If the URL given in rel="canonical" returns a 404, search engine crawlers continue to index the content and use a heuristic to find a canonical version, but it is recommended to specify live URLs as canonical.
  • The URL given in rel="canonical" can be a redirect. Google will then process the redirect as usual and try to index it.
The new tag is also supported by the major search engines Yahoo, Bing and Ask.
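
The first two points in the list above (relative versus absolute paths, and the same-domain restriction) can be checked with a small Python sketch. All function names here are hypothetical, and the domain comparison is a deliberately naive assumption: it compares only the last two labels of the host name, which is wrong for suffixes such as .co.uk and is not how any search engine is known to decide this.

```python
from urllib.parse import urljoin, urlsplit


def resolve_canonical(page_url: str, href: str) -> str:
    """Resolve a possibly relative canonical href against the page URL."""
    return urljoin(page_url, href)


def naive_domain(url: str) -> str:
    """Naive guess at the site's domain: the last two labels of the host name."""
    host = (urlsplit(url).hostname or "").lower()
    return ".".join(host.split(".")[-2:])


def same_domain(page_url: str, canonical: str) -> bool:
    """True if the page and its canonical URL appear to share a domain."""
    return naive_domain(page_url) == naive_domain(canonical)


if __name__ == "__main__":
    # A relative href without a leading slash resolves against the page's
    # directory, which is one way relative paths can go wrong.
    print(resolve_canonical("http://www.mysite.com/about/team", "services"))
    print(same_domain("http://seobangladesh.mysite.com", "http://www.mysite.com"))
    print(same_domain("http://seobangladesh.mysite.com", "http://www.yoursite.com"))
```

A relative href is resolved against the page it appears on, so the same tag can mean different things on different pages, whereas an absolute URL always means the same thing.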

3 comments:

  1. This is very good news from the search engine industry for solving duplicate content and canonical issues. But what about others who duplicate content across many different sites?

    Is the search industry going to take any steps to solve this? What effect will the new canonical tag have on PageRank sculpting?

  2. Search engines normally do not like duplicate content. Duplicate content arises in many ways, such as affiliate marketing, content plagiarism, or a content management system that creates duplicates through separate navigation.

    Recently the search engine industry went a step further to help webmasters solve duplicate content and canonical issues within the same domain.

    But for different domains or websites there is still no solution for duplicate content. I think solving the duplicate content issue across different domains will be neither very effective nor very easy.

    The new canonical tag has some effects on PageRank sculpting. It would be wise for webmasters to correct canonical problems as soon as possible.

  3. I would recommend using .htaccess 301 permanent redirects for all canonicalization issues. This may be a little complicated, but it is very effective. It tells the search engine that the page has moved permanently, so it will always follow the new link stated in the redirection code.
