Many websites have multiple URLs that display the same content, especially e-commerce sites. Often this is simply the result of various parameters appended to the URL, whether to sort products in a different order, track session IDs, or note the source of a referral. There are a number of ways to prevent search engines from indexing all of these URLs and thereby avoid duplicate content penalties. A combination of ‘nofollow’ tags in the internal site navigation and wildcard ‘disallow’ rules in the robots.txt file is a common solution. Unfortunately, this requires a thorough analysis of your internal site navigation, as well as some knowledge of robots.txt pattern matching and the proper logic to block only the duplicate URLs.
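For reference, the robots.txt approach might look something like the sketch below. The parameter names (‘sid’ and ‘sort’) are placeholders for illustration, and note that Googlebot supports only the ‘*’ wildcard and ‘$’ anchor in these rules, not full regular expressions:

```
# Hypothetical example: keep Googlebot out of URLs carrying a session ID
# or a sort parameter. 'sid' and 'sort' are placeholder parameter names.
User-agent: Googlebot
Disallow: /*?*sid=
Disallow: /*?*sort=
```

Getting rules like these right for every parameter combination on a large site is exactly the tedious work the new feature is meant to replace.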
Fortunately, Google has recently added a feature to Webmaster Tools that offers an easier, simpler way to block duplicate content URLs caused by numerous parameters. The ‘Parameter Handling’ feature in Google Webmaster Tools is located by going to ‘Site Configuration’ and then ‘Settings’:
Google’s explanation of the parameter handling feature is as follows. *Note that there’s no guarantee that Googlebot will accept your suggestions. 😉
“Dynamic parameters (for example, session IDs, source, or language) in your URLs can result in many different URLs all pointing to essentially the same content. For example, http://www.example.com/dresses?sid=12395923 might point to the same content as http://www.example.com/dresses. You can specify whether you want Google to ignore up to 15 specific parameters in your URL. This can result in more efficient crawling and fewer duplicate URLs, while helping to ensure that the information you need is preserved. (Note: While Google takes suggestions into account, we don’t guarantee that we’ll follow them in every case.)”
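In effect, telling Google to ignore a parameter asks it to collapse all such variations down to one canonical URL. A rough Python sketch of that idea, assuming ‘sid’ is the parameter being ignored (the parameter name and function are illustrative, not Google’s actual implementation):

```python
from urllib.parse import urlsplit, urlunsplit, parse_qsl, urlencode

# Hypothetical set of parameters we've told Google to ignore.
IGNORED_PARAMS = {"sid"}

def canonicalize(url):
    """Strip ignored query parameters, yielding one canonical URL."""
    parts = urlsplit(url)
    kept = [(k, v) for k, v in parse_qsl(parts.query) if k not in IGNORED_PARAMS]
    query = urlencode(kept)
    return urlunsplit((parts.scheme, parts.netloc, parts.path, query, parts.fragment))

# Both of Google's example URLs reduce to the same canonical form:
print(canonicalize("http://www.example.com/dresses?sid=12395923"))
# -> http://www.example.com/dresses
print(canonicalize("http://www.example.com/dresses"))
# -> http://www.example.com/dresses
```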