GoogleBot and Duplicate Content - How Google Helps You
So, I’m back again to my favorite topic: Google Bot, which at times has driven me to deep frustration and at other times has simply delighted me with terrific search results. Anyone who cares even a little about SEO is probably aware that duplicate content has lately been a major concern for websites. I just checked the official Google blog to see what Google has to say about this burning topic. Their stance on the use of URL parameters has changed somewhat, and it is moving in the right direction, which must be a great relief for webmasters who love to play with URL parameters.
Surprisingly, bots may consider your site to contain duplicate content even when you have not copied anyone else's content. Duplication can also arise from the use of
- Session IDs to track your visitors
- Affiliate Tracking IDs
- Or any other url parameters
In each of the above cases, you are pointing to the same physical file under different names.
For example:
http://mydomain.com/
http://mydomain.com/sess_id=123456
and http://mydomain.com/sess_id=abcdef
all actually point to the same web page (http://mydomain.com/). But bots treat them as three different pages, so it is natural for them to index all three versions of the page, each with the same content.
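To make the idea concrete, here is a minimal Python sketch of how the variants collapse once the tracking parameters are ignored. It assumes the session ID arrives as a query parameter (written here with an explicit `?`), and the parameter names `sess_id` and `aff_id` are only illustrative; this is not how any search engine actually does it.

```python
from urllib.parse import urlsplit, urlunsplit, parse_qsl, urlencode

# Hypothetical names of parameters that do not change the page content.
TRACKING_PARAMS = {"sess_id", "aff_id"}

def canonical_url(url: str) -> str:
    """Drop tracking parameters so every variant maps to one URL."""
    parts = urlsplit(url)
    kept = [(key, value)
            for key, value in parse_qsl(parts.query, keep_blank_values=True)
            if key not in TRACKING_PARAMS]
    path = parts.path or "/"
    # Rebuild the URL without the tracking parameters and without a fragment.
    return urlunsplit((parts.scheme, parts.netloc, path, urlencode(kept), ""))

variants = [
    "http://mydomain.com/",
    "http://mydomain.com/?sess_id=123456",
    "http://mydomain.com/?sess_id=abcdef",
]
canonical = {canonical_url(u) for u in variants}
print(canonical)
```

All three variants reduce to a single URL, while parameters that really select content (anything not listed in `TRACKING_PARAMS`) are left in place.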
Now look at the index. All three versions of the same web page may have been indexed with great care and honesty:
http://mydomain.com/
http://mydomain.com/sess_id=123456
and http://mydomain.com/sess_id=abcdef
And how will the bots interpret such pages? They will conclude that you are simply duplicating the same content on different pages, and they may impose a penalty. There is also every possibility that your PageRank (PR) gets diluted, because the same inbound links (IBLs, the links pointing to your web page) may be split among all three versions. This in turn will lower your position in the search engine result pages, a gross disaster for you!
But the best part of that post is the way Google Bot now handles such URL parameters; Google seems to have dropped its earlier reservations about their use…
The post says:
- They will try to cluster the three (or more) different versions of the URL.
- They will select the one they consider the “best” URL to represent the cluster.
- All properties of the URLs in the cluster are then attributed to the representative one.
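The three steps above can be sketched roughly as follows. This is only an illustration under stated assumptions: the crawl data is made up, pages are treated as duplicates when their bodies are byte-for-byte identical, and the shortest URL is taken as the "best" one. The post does not say how Google actually picks the representative.

```python
import hashlib
from collections import defaultdict

# Hypothetical crawl data: URL -> (page body, number of inbound links).
crawled = {
    "http://mydomain.com/": ("<html>Welcome</html>", 10),
    "http://mydomain.com/?sess_id=123456": ("<html>Welcome</html>", 3),
    "http://mydomain.com/?sess_id=abcdef": ("<html>Welcome</html>", 2),
    "http://mydomain.com/about": ("<html>About us</html>", 4),
}

def cluster_duplicates(pages):
    """Group URLs by a hash of their content, then merge each group."""
    # Step 1: cluster URLs whose content is identical.
    clusters = defaultdict(list)
    for url, (body, links) in pages.items():
        fingerprint = hashlib.sha256(body.encode("utf-8")).hexdigest()
        clusters[fingerprint].append((url, links))

    merged = {}
    for members in clusters.values():
        # Step 2 (assumption): the shortest URL represents the cluster.
        representative = min((url for url, _ in members), key=len)
        # Step 3: credit every member's inbound links to the representative.
        merged[representative] = sum(links for _, links in members)
    return merged

result = cluster_duplicates(crawled)
print(result)
```

In this toy data the links pointing at the two session-ID variants end up credited to http://mydomain.com/ instead of being split three ways.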
This is convincing and reassuring, particularly given the vast number of websites that use dynamic URL parameters. One more point is worth mentioning. As stated in that post: “webmasters need not be overly concerned with the loss of link popularity or loss of PageRank due to duplication”.
However, you can still help Google by:
- Submitting a sitemap
- Keeping your URLs as clean as possible.
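As a rough illustration of the first point, a sitemap is just an XML file listing the URLs you want crawled, so it is a natural place to list only the clean versions. The sketch below builds a minimal one with Python's standard library; the `/about` page is a made-up example, and a real sitemap may carry optional fields that are left out here.

```python
import xml.etree.ElementTree as ET

# Namespace defined by the sitemaps.org protocol.
SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"

def build_sitemap(urls):
    """Return a minimal sitemap document listing the given URLs."""
    urlset = ET.Element("urlset", xmlns=SITEMAP_NS)
    for url in urls:
        entry = ET.SubElement(urlset, "url")
        ET.SubElement(entry, "loc").text = url
    return ET.tostring(urlset, encoding="unicode")

# Only clean URLs go in: no session IDs, no affiliate tracking IDs.
clean_urls = ["http://mydomain.com/", "http://mydomain.com/about"]
sitemap_xml = build_sitemap(clean_urls)
print(sitemap_xml)
```

When writing the result to a file, you would normally put an XML declaration at the top before submitting it.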
There are a few other recommendations, particularly for merchant sites, which you can read here!
So, I think there is enough reason to relax a little for the time being, provided the bot behaves as that post claims.