Google Search Appliance: Initiate a Recrawl
If new pages are missing from your search results, or deleted pages still appear in them, you can have the Google Search Appliance (GSA) recrawl your site. You can recrawl only individual sites or URL patterns; you cannot trigger a recrawl for an entire collection.
Initiating a Recrawl
- Log in to the GSA at https://google.umn.edu:8443/.
- Expand Index.
- Expand Diagnostics, and then select Index Diagnostics.
- In the Show Diagnostics for Collection dropdown, select the collection that contains the site you want to recrawl.
- If you know exactly what you want recrawled:
  - In the URLs Starting With: text field, enter the URL pattern for the recrawl.
  - Click the Show URLs button to display a list of the URLs that will be recrawled.
  - Note: Make sure you use the correct protocol (http or https).
- If you don’t know exactly what you want recrawled:
  - In the first column of the All Hosts table, click the Host Name for the site you want to recrawl.
  - Select the protocol that you want to recrawl (http or https).
  - If you want only a subdirectory recrawled, click its folder in the table.
- Next to All hosts you should see the URL pattern with two buttons: Recrawl This Pattern and Cancel Recrawl Request.
  - Note: If you are recrawling a single page, there will be only one button: Recrawl This URL.
- Click Recrawl This Pattern or Recrawl This URL to put your site/subdirectory/page into the Crawl Queue to be reindexed by the GSA.
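The protocol note in the steps above matters because a "starts with" pattern is sensitive to every leading character, including http versus https. The following is a minimal illustrative sketch, assuming the URLs Starting With field behaves like a plain string-prefix match; it is not the GSA's own code, and the function name and example.edu URLs are hypothetical.

```python
def urls_matching_prefix(urls, prefix):
    """Return the URLs that begin with the given prefix, protocol included."""
    return [url for url in urls if url.startswith(prefix)]


# Hypothetical URLs for illustration only; not taken from a real index.
indexed_urls = [
    "http://www.example.edu/news/index.html",
    "https://www.example.edu/news/index.html",
    "https://www.example.edu/news/2015/story.html",
    "https://www.example.edu/about/",
]

# An https pattern selects only the https pages under /news/.
matches = urls_matching_prefix(indexed_urls, "https://www.example.edu/news/")
for url in matches:
    print(url)
```

Under this assumption, entering the same path with http instead of https would select a different set of URLs, which is why it is worth checking the Show URLs list before requesting the recrawl.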
Note that the recrawl is not immediate. Triggering a recrawl adds your site to the Crawl Queue, and the recrawl can take up to an hour to complete. There is no way to prioritize some requests in the Crawl Queue over others.
If you still do not see the changes after an hour, contact email@example.com for further help.