Good Practice

When to Ask for Support with Google Search Appliance Results

Challenge

The following are common issues with the Google Search Appliance, along with the next steps for resolving each one.

Issue

You manage a website and are seeing too much traffic from the Google Search Appliance.

Solution
  • In this situation, email your site’s URL, an explanation of the issue, and any available details from your web logs to web-search@umn.edu. The search appliance can be adjusted to crawl your site less intensively or less frequently.
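
One way to gather the web log details mentioned above is to count crawler requests per hour. The following is a minimal sketch, not an official tool. It assumes your server writes access logs in the common "combined" format and that the appliance's user agent contains the string `gsa-crawler`; both are assumptions to confirm against your own logs. The function name and the sample log lines are hypothetical.

```python
import re
from collections import Counter

# Assumption: the appliance identifies itself with a user agent containing
# this substring. Confirm the actual string in your own logs.
CRAWLER_MARKER = "gsa-crawler"

# Matches a combined-log-format timestamp such as
# [10/Oct/2016:13:55:36 -0500] and captures the date and the hour.
TIMESTAMP = re.compile(r"\[(\d{2}/\w{3}/\d{4}):(\d{2}):\d{2}:\d{2} [+-]\d{4}\]")


def crawler_hits_per_hour(lines, marker=CRAWLER_MARKER):
    """Count log lines per (date, hour) whose text contains the marker."""
    counts = Counter()
    for line in lines:
        if marker.lower() not in line.lower():
            continue
        match = TIMESTAMP.search(line)
        if match:
            counts[(match.group(1), match.group(2))] += 1
    return counts


# Hypothetical sample lines; in practice, pass an open log file instead.
sample = [
    '192.0.2.10 - - [10/Oct/2016:13:55:36 -0500] "GET / HTTP/1.1" 200 512 "-" "gsa-crawler (Enterprise; example)"',
    '192.0.2.10 - - [10/Oct/2016:13:58:01 -0500] "GET /about HTTP/1.1" 200 734 "-" "gsa-crawler (Enterprise; example)"',
    '198.51.100.7 - - [10/Oct/2016:13:59:12 -0500] "GET / HTTP/1.1" 200 512 "-" "Mozilla/5.0"',
    '192.0.2.10 - - [10/Oct/2016:14:02:45 -0500] "GET /contact HTTP/1.1" 200 301 "-" "gsa-crawler (Enterprise; example)"',
]

for (day, hour), hits in sorted(crawler_hits_per_hour(sample).items()):
    print(f"{day} {hour}:00  {hits} request(s)")
```

A per-hour summary like this, included in your email, makes it easier to judge how far the crawl rate should be reduced.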

Issue

You’ve noticed that each page of your website appears multiple times in search results, and you have not been able to resolve this with a forwarding rule configured on your web server (for example, forwarding www.yoursite.umn.edu to yoursite.umn.edu).

Solution
  • In this situation, email your site’s URL along with any details and an explanation of the issue to web-search@umn.edu. We can adjust how the Google Search Appliance sees the site; however, you will need to work with your website administrator to make sure that proper forwarding rules are in place.
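
Before emailing, it can help to confirm whether a forwarding rule is actually in effect. The sketch below uses Python's standard `http.client`, which does not follow redirects, to request the home page of one hostname and check whether the response forwards to the other. The function names are hypothetical, and treating only permanent redirects (status 301 or 308) as "proper" forwarding is an assumption, not a rule stated by this document.

```python
import http.client
from urllib.parse import urlsplit


def forwards_to(status, location, canonical_host):
    """Return True if a response is a permanent redirect to canonical_host.

    Assumption: only 301 and 308 count as proper forwarding here.
    """
    if status not in (301, 308):
        return False
    target_host = urlsplit(location or "").hostname or ""
    return target_host.lower() == canonical_host.lower()


def check_forwarding(source_host, canonical_host, use_https=True, timeout=10):
    """Request / on source_host (redirects are not followed) and test the reply."""
    if use_https:
        conn = http.client.HTTPSConnection(source_host, timeout=timeout)
    else:
        conn = http.client.HTTPConnection(source_host, timeout=timeout)
    try:
        conn.request("HEAD", "/")
        response = conn.getresponse()
        return forwards_to(response.status, response.getheader("Location"), canonical_host)
    finally:
        conn.close()


# Offline examples of the decision logic (no network access needed):
print(forwards_to(301, "https://yoursite.umn.edu/", "yoursite.umn.edu"))
print(forwards_to(200, None, "yoursite.umn.edu"))
```

For a live check, something like `check_forwarding("www.yoursite.umn.edu", "yoursite.umn.edu")` could be run against your own hostnames; the result is worth including in your message.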

Issue

You’ve created a new website or moved an existing website and are not seeing it in search results for your campus. It is not showing up in system-wide results either, even though a page that is available via University search links to it.

Solution

In this situation, there are a number of troubleshooting steps that can help locate the underlying issue and address it. Be sure to contact web-search@umn.edu for help in trying any of the solutions listed below.

  • Check in the Search Appliance Index Diagnostics
    • Go to Google Search Appliance; select Index → Index Diagnostics.
    • Enter the URL of the site you’re troubleshooting and search for it to see what the appliance has found.
      • If there is nothing for the site, email information to web-search@umn.edu.
      • If there is information about the site and some of the pages are listed as having an issue (e.g., blocked via robots.txt), correct that issue. Then, in the Index Diagnostics area, click to recrawl the affected URL pattern.
  • Check for anything that might be blocking the crawlers
    • Your site might be blocking crawlers with a firewall rule or via robots.txt file entries.
    • Your site might be in a Drupal dev or stage environment; only Drupal production sites can be crawled.
    • Look for and remove any noindex meta tag in the HTML of the site’s homepage.
  • Consider whether your site has changed protocol or subdomain
    • It may be that only the formerly used protocol (such as http rather than https) or only the www. version of the site is being crawled.
    • In this case, inform web-search@umn.edu of the old and new settings so that they can be updated within the Google Search Appliance.
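
Two of the checks above, robots.txt entries and a noindex meta tag, can be tested locally before contacting support. The following is a minimal sketch using only Python's standard library. The user agent token `gsa-crawler`, the function names, and the sample robots.txt and HTML are assumptions for illustration; substitute your site's real robots.txt contents and home page HTML.

```python
from html.parser import HTMLParser
from urllib.robotparser import RobotFileParser

# Assumption: the crawler's user agent token. Confirm it in your web logs.
CRAWLER_AGENT = "gsa-crawler"


def is_blocked_by_robots(robots_txt, url, agent=CRAWLER_AGENT):
    """Return True if the given robots.txt text disallows agent from url."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return not parser.can_fetch(agent, url)


class RobotsMetaFinder(HTMLParser):
    """Collects the directives of every <meta name="robots"> tag."""

    def __init__(self):
        super().__init__()
        self.directives = []

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        attributes = dict(attrs)
        if (attributes.get("name") or "").lower() == "robots":
            content = attributes.get("content") or ""
            self.directives.extend(part.strip().lower() for part in content.split(","))


def has_noindex(html):
    """Return True if the HTML contains a robots meta tag with noindex."""
    finder = RobotsMetaFinder()
    finder.feed(html)
    return "noindex" in finder.directives


# Hypothetical sample inputs.
sample_robots = "User-agent: *\nDisallow: /private/\n"
sample_html = '<html><head><meta name="robots" content="noindex, nofollow"></head></html>'

print(is_blocked_by_robots(sample_robots, "https://yoursite.umn.edu/private/page.html"))
print(is_blocked_by_robots(sample_robots, "https://yoursite.umn.edu/index.html"))
print(has_noindex(sample_html))
```

This does not cover firewall rules or the Drupal environment, which still need to be checked with your website administrator.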