Google Custom Search: Clean Up Duplicate Search Results

Duplicate search results can be caused by two different issues.

  • Single sites with both https and http versions
  • Single sites with both www and non-www versions

To avoid duplication in search results and to streamline your site's analytics, set up every variant of a site to permanently redirect (HTTP 301) to a single canonical site. http should redirect to https, and www should redirect to non-www.
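As a rough illustration of the rule described above, the following shell sketch maps any variant of a URL to its canonical form. The function name canonical_url is hypothetical, and the sketch assumes the canonical form is https without www; mysite.umn.edu is only an example host.

```shell
#!/usr/bin/env bash
# Sketch: map an http/https, www/non-www variant of a URL to one canonical form.
# Assumption: the canonical form is https without a leading "www.".
canonical_url() {
  local url="$1"
  local rest="${url#*://}"          # drop the scheme, e.g. "http://"
  rest="${rest#[Ww][Ww][Ww].}"      # drop a leading "www." in any letter case
  printf 'https://%s\n' "$rest"
}

# All three variants resolve to the same canonical URL.
for u in http://www.mysite.umn.edu/about \
         http://mysite.umn.edu/about \
         https://www.mysite.umn.edu/about; do
  canonical_url "$u"
done
```

Every variant in the loop prints the same https, non-www URL, which is the single address a search index should end up with once the redirects are in place.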

Below are instructions for doing this with Enterprise Drupal sites and HTML sites.

Enterprise Drupal

  1. Enable the domain_301_redirect, securepages, and block modules if they aren't already enabled
  2. Domain 301 Redirect: Navigate to /admin/config/search/domain_301_redirect
  3. Select Enabled
  4. Enter the domain (https without www) to which you'd like all domains to resolve
  5. If you'd like redirection to occur only from specific pages, enter those pages in the Pages field at the bottom
  6. Save your changes
  7. *Secure Pages: Navigate to /admin/config/system/securepages
  8. Select Enabled
  9. Change the Redirect HTTP Code to 301
  10. Enter both secure (e.g. https://mysite.umn.edu) and non-secure (http://mysite.umn.edu) base URLs, without www
  11. Select the "Make Secure Only the Listed Pages" radio button
  12. In the Pages field, enter an asterisk (*) to redirect all pages
  13. After you save your changes, flush all caches and test to ensure redirection is working correctly

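* If you copy a site with securepages enabled between environments, or to a local copy, you will be redirected to the production environment. To correct this, delete securepages' two base path variables (securepages_basepath and securepages_basepath_ssl) in your testing or local environment. The following command matches both by partial name, so drush should prompt you to choose which one to delete; run it once for each variable: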

drush vdel securepages

HTML

If your site supports https, add the following to your .htaccess file:

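RewriteEngine On

# Redirect www to non-www (and to https in the same step)
RewriteCond %{HTTP_HOST} ^www\.(.*)$ [NC]
RewriteRule ^(.*)$ https://%1%{REQUEST_URI} [R=301,QSA,NC,L]

# Redirect http to https. Both conditions must hold: the request did not arrive
# over https directly, and no proxy or load balancer reports it as https.
RewriteCond %{HTTPS} off
RewriteCond %{HTTP:X-Forwarded-Proto} !https
RewriteRule ^(.*)$ https://%{HTTP_HOST}%{REQUEST_URI} [L,R=301]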

If your site does not support https, add the following to your .htaccess file:

RewriteEngine On

# Redirect www to non-www, staying on http
RewriteCond %{HTTP_HOST} ^www\.(.*)$ [NC]
RewriteRule ^(.*)$ http://%1%{REQUEST_URI} [R=301,QSA,NC,L]

NOTE: If the "RewriteEngine On" line is already in your .htaccess file, do not add it again.

Test to ensure the redirects work correctly. One way to validate your .htaccess rules is to run them against a range of sample URLs (http and https, with and without www) using the tester at http://htaccess.mwl.be/
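
Once the rules are live, the redirects can also be checked from the command line. The following is a minimal sketch, assuming curl and awk are installed; the function names parse_redirect and check_redirect are hypothetical, and mysite.umn.edu stands in for your own host. It reports the status code and Location header of the first response without following the redirect.

```shell
#!/usr/bin/env bash
# Sketch: show where a URL redirects to, without following the redirect.
# Assumptions: curl and awk are available; function names are illustrative only.

# Read raw response headers on stdin and print "<status> <location>".
parse_redirect() {
  awk '
    NR == 1 { status = $2 }                     # status line, e.g. "HTTP/1.1 301 Moved Permanently"
    tolower($1) == "location:" { loc = $2 }     # header names are case-insensitive
    END {
      gsub(/\r/, "", status)                    # strip carriage returns from CRLF line endings
      gsub(/\r/, "", loc)
      print status, loc
    }
  '
}

# Send a HEAD request (-I) quietly (-s) and summarize the first response.
check_redirect() {
  curl -sI "$1" | parse_redirect
}

# Example usage (replace the host with your own):
#   check_redirect http://www.mysite.umn.edu/
#   check_redirect http://mysite.umn.edu/
#   check_redirect https://www.mysite.umn.edu/
```

For a correctly configured site, each non-canonical variant should report a 301 status with a Location on the canonical host, while the canonical URL itself should not redirect.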