Fixing Duplicate Content SEO Problems in Drupal

Duplicate content is a major problem for search engine optimization (SEO). Put simply, when multiple copies of the same content are online, you’re competing against yourself for search result rankings. Google has also been explicit that duplicate content on your site may result in lower search rankings. Avoiding this penalty is important for maximizing your site's search rankings. However, Drupal's default configuration can easily lead you to create duplicate content without knowing it. Read on to find out how to fix this issue.

The Problem: Drupal's URL Path System Causes Duplicate Content

In general, it’s pretty easy to avoid showing content in more than one place, but Drupal users need to pay special attention to the way URL paths are managed. Let’s take a made-up example: a page called "My Puppies" on the site https://www.example.com. We'll assume that the page’s internal Drupal path is “node/10”, and we’ve given it a friendly path alias of “my-puppies”. With the default configuration of Drupal, this page will now be visible at ALL of the following URLs:

https://www.example.com/node/10
https://www.example.com/node/10/
https://www.example.com/my-puppies
https://www.example.com/my-puppies/
https://www.example.com/My-Puppies
https://www.example.com/My-Puppies/
https://example.com/node/10
https://example.com/node/10/
https://example.com/my-puppies
https://example.com/my-puppies/
https://example.com/My-Puppies
https://example.com/My-Puppies/

That’s TWELVE different URLs serving exactly the same content, all from only one node. And there could be even more, since any variation in capitalization ("/mY-pUppIES", "/My-puppiES", etc.) will also load this page. Luckily, this problem is relatively easy to fix by doing the following two things.
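To see where the twelve comes from, here is a minimal Python sketch that builds the list above from its three independent variations: two hostnames, three paths, and an optional trailing slash. The values are the made-up ones from the "My Puppies" example, and the variable names are only for illustration.

```python
from itertools import product

# Made-up values from the "My Puppies" example above.
hosts = ["www.example.com", "example.com"]
paths = ["node/10", "my-puppies", "My-Puppies"]
trailing = ["", "/"]

# Every combination of host, path, and trailing slash is a separate URL.
urls = [f"https://{h}/{p}{t}" for h, p, t in product(hosts, paths, trailing)]

for url in urls:
    print(url)
print(len(urls))  # 2 hosts x 3 paths x 2 slash options = 12
```

Each extra capitalization variant added to the paths list contributes four more URLs, which is why the real number can grow well past twelve.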

Step One: Edit Drupal's .htaccess File to Redirect All Users to a Single Domain

The .htaccess file that comes with Drupal is located at the top level of the website folder tree (alongside index.php and update.php). In this file, there is a disabled section of options that looks like this:

 
# If your site can be accessed both with and without the 'www.' prefix, you
# can use one of the following settings to redirect users to your preferred
# URL, either WITH or WITHOUT the 'www.' prefix. Choose ONLY one option:
#
# To redirect all users to access the site WITH the 'www.' prefix,
# (https://example.com/... will be redirected to https://www.example.com/...)
# adapt and uncomment the following:
# RewriteCond %{HTTP_HOST} ^example\.com$ [NC]
# RewriteRule ^(.*)$ https://www.example.com/$1 [L,R=301]
#
# To redirect all users to access the site WITHOUT the 'www.' prefix,
# (https://www.example.com/... will be redirected to https://example.com/...)
# uncomment and adapt the following:
# RewriteCond %{HTTP_HOST} ^www\.example\.com$ [NC]
# RewriteRule ^(.*)$ https://example.com/$1 [L,R=301] 

Pick one of the domain options by deleting the # in front of the two associated lines, and don't forget to put your own domain name in there. So, on this website we use:

RewriteCond %{HTTP_HOST} ^www\.zengenuity\.com$ [NC]
RewriteRule ^(.*)$ https://zengenuity.com/$1 [L,R=301]

You have now eliminated half of the duplicate URLs in the list above.
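If the rewrite syntax is unfamiliar, the rough Python model below may help. It is only a sketch of what the two lines do, not how Apache implements them, and it assumes you chose the WITH 'www.' option for the placeholder domain example.com. The function name www_redirect is hypothetical.

```python
import re

# Mirrors: RewriteCond %{HTTP_HOST} ^example\.com$ [NC]
# The [NC] flag means "no case", hence re.IGNORECASE.
HOST_PATTERN = re.compile(r"^example\.com$", re.IGNORECASE)


def www_redirect(host, path):
    """Return the 301 target for a request, or None if the rule does not fire.

    `path` is the URL path without its leading slash, which is how
    mod_rewrite presents it to rules inside a .htaccess file.
    """
    if HOST_PATTERN.match(host):
        # Mirrors: RewriteRule ^(.*)$ https://www.example.com/$1 [L,R=301]
        return f"https://www.example.com/{path}"
    return None


# The bare domain is redirected; the www domain is left alone.
for host in ("example.com", "www.example.com"):
    print(host, "->", www_redirect(host, "my-puppies"))
```

Note that the path is passed through unchanged, so this step only collapses the two hostnames into one. Trailing slashes, capitalization, and node/10 are still untouched.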

Step Two: Install and Enable the Global Redirect Module

The Global Redirect module does three things for you. First, it will automatically remove any slash at the end of a URL. Second, it will redirect any URL that has different capitalization to the alias exactly as you defined it. So, https://www.example.com/My-Puppies will automatically redirect to https://www.example.com/my-puppies. Finally, Global Redirect will ensure that anyone accessing the original internal Drupal path (https://www.example.com/node/10) gets redirected to the path alias that you created.
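The three behaviors can be pictured with a short Python sketch. This is not the module's actual code (Global Redirect is a PHP module); it is just a model of the rules described above, using a hypothetical alias table that contains the "My Puppies" example.

```python
# Hypothetical alias table: internal Drupal path -> path alias.
ALIASES = {"node/10": "my-puppies"}


def canonical_path(path):
    """Return the path a request should be redirected to.

    `path` is given without a leading slash, e.g. "My-Puppies/".
    """
    # 1. Remove any slash at the end of the URL.
    path = path.rstrip("/")

    # 3. Internal Drupal path -> the alias created for it.
    if path in ALIASES:
        return ALIASES[path]

    # 2. Different capitalization -> the alias as it was defined.
    for alias in ALIASES.values():
        if path.lower() == alias.lower():
            return alias

    # Anything else is left as it is.
    return path


for requested in ("node/10", "node/10/", "my-puppies/", "My-Puppies", "My-Puppies/"):
    print(requested, "->", canonical_path(requested))
```

In this model, every path variant from the list at the top of the article ends up at my-puppies.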

Conclusion

After editing .htaccess (here, choosing the 'www.' option) and enabling Global Redirect, you will be left with a single URL for this content:

https://www.example.com/my-puppies

Every other variation of the URL will be redirected to this address with search-engine-friendly 301 redirects. By following these guidelines, you will eliminate the duplicate content that is generated by the Drupal system itself and take a big step towards maximizing the search engine ranking of your site content.
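If you want to confirm the result on your own site, a small script along these lines can help. It is a sketch under a few assumptions: the site is served over HTTPS, the server answers HEAD requests, and redirects come back with an absolute Location header. The function names fetch_status and follows_to are hypothetical, and a variant may take two hops to arrive (first to the preferred domain, then to the alias), so the check follows the whole chain.

```python
import http.client
from urllib.parse import urlsplit


def fetch_status(url):
    """Send a HEAD request without following redirects.

    Returns (status_code, Location header or None).
    """
    parts = urlsplit(url)
    conn = http.client.HTTPSConnection(parts.netloc, timeout=10)
    try:
        conn.request("HEAD", parts.path or "/")
        resp = conn.getresponse()
        return resp.status, resp.getheader("Location")
    finally:
        conn.close()


def follows_to(url, canonical, fetch=fetch_status, max_hops=5):
    """Return True if `url` reaches `canonical` using only 301 redirects."""
    for _ in range(max_hops):
        if url == canonical:
            return True
        status, location = fetch(url)
        if status != 301 or not location:
            return False  # not a permanent redirect, so a duplicate remains
        url = location
    return url == canonical


# Example use, with your own domain in place of example.com:
# follows_to("https://example.com/My-Puppies/",
#            "https://www.example.com/my-puppies")
```

The fetch parameter is there so the chain logic can be tried with canned responses before pointing it at a live server.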

Wayne Eaker
November 15, 2010