Prevent Duplicate Content By Blocking Archives From Search Engines

Why would you want to block archive pages anyway? The answer is to prevent Google from penalizing your blog for duplicate content. You see, archive pages have no content of their own. What you see on them is duplicated from the individual post pages. If both the individual post page and the archive page carrying the same content are indexed by Google, that constitutes duplicate content.

That’s exactly the case for this blog: duplicate content due to archive pages. And I paid the price for it during the last Google PageRank update, when this blog’s PR went down from 4 to 3.

The problem probably started when I added the Blogger Archive widget to the sidebar. This widget gave web crawlers access to the archive pages, which eventually led to those pages being indexed by Google.

Preventing duplicate content

You can prevent duplicate content by telling search engines not to index archive pages. This can be achieved by adding a “noindex” robots meta tag to the archive pages (and archive pages only).

Here’s how:

Log in to your Blogger account.
Go to Dashboard > Design > Edit HTML.
Find the <head> tag and add the following code below it:

<b:if cond='data:blog.pageType == &quot;archive&quot;'>
<meta content='noindex,noarchive' name='robots'/>
</b:if>
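
Once the template is saved, it is worth confirming that the tag shows up on archive pages and nowhere else. Here is a minimal sketch of such a check in Python, using only the standard library. The helper names (`RobotsMetaParser`, `robots_directives`, `is_noindexed`) are made up for this illustration and are not part of Blogger or any Google tool; the sketch also assumes you already have the page's HTML source as a string.

```python
from html.parser import HTMLParser


class RobotsMetaParser(HTMLParser):
    """Collects the directives of every <meta name="robots"> tag in a page."""

    def __init__(self):
        super().__init__()
        self.directives = set()

    def handle_starttag(self, tag, attrs):
        # Self-closing tags such as <meta ... /> are routed here as well.
        if tag != "meta":
            return
        attr = {name: (value or "") for name, value in attrs}
        if attr.get("name", "").lower() != "robots":
            return
        # The content attribute is a comma-separated list, e.g. "noindex,noarchive".
        for directive in attr.get("content", "").split(","):
            directive = directive.strip().lower()
            if directive:
                self.directives.add(directive)


def robots_directives(html):
    """Return the set of robots directives found in the given HTML source."""
    parser = RobotsMetaParser()
    parser.feed(html)
    return parser.directives


def is_noindexed(html):
    """True if the page asks search engines not to index it."""
    return "noindex" in robots_directives(html)
```

Feed it the source of one archive page and one individual post page (for example, copied from your browser's "view source"). `is_noindexed` should return `True` for the archive page and `False` for the post page. If the post page also comes back `True`, the condition in the template is matching more than the archive pages.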


What happens to the archive pages already listed in SERPs?

The archive pages will eventually drop out of the search result pages. However, if you want them gone sooner, remove them from the SERPs using the URL removal tool.
