Does Google's "duplicate content penalty" harm Drupal sites? No! Here's why.
For years, Drupal has enjoyed a solid reputation as a search engine friendly CMS. It generates relatively clean, standards-compliant HTML out of the box; syncs up the important TITLE tag with semantically useful H1 and H2 tags in the body of each page; and provides short, human-readable URLs with plentiful options for customization. (Anecdotal evidence: several years back, I wrote a post on my Drupal-powered blog that mentioned the name of the company I worked for. Within two weeks, my blog post ranked higher than the company's own web site on Google.)
Recently, I've witnessed a number of discussions where people expressed concern about the way Drupal generates the human-readable URLs that help make it Google-friendly. In particular, they were worried about Google's dreaded Duplicate Content Penalty, a system designed to keep spammers from flooding Google with the same content at dozens (or hundreds!) of URLs. There's a lot of confusion floating around, so for the geeks in the crowd (and the not-so-geeky interested in learning how things work behind the scenes), I thought it would be useful to give a guided tour of how Drupal manages and generates URLs.
read more