Searching, Crawling, and Indexing of Retired Pages
When a page on the Technology Help website is retired, it is placed behind authentication, so the web server returns a 403 (Forbidden) error to search crawlers.
- After a document has been indexed, a 403 error does not result in immediate removal from the search index.
- Retired pages are likely to drop from the search index in less than 12 days.
A page will be dropped after the search crawlers receive a 403 error on four consecutive attempts to retrieve the page.
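The four-consecutive-403 rule above can be sketched as a small counter. This is only an illustration, not the crawlers' actual implementation: the function name, the threshold constant, and the assumption that any non-403 response resets the count are hypothetical.

```python
# Consecutive 403 responses before a page is dropped, as stated in the text above.
DROP_THRESHOLD = 4


def attempt_when_dropped(status_codes, threshold=DROP_THRESHOLD):
    """Return the 1-based crawl attempt on which a page would be dropped.

    status_codes is the sequence of HTTP status codes a crawler received
    for one page, oldest first. Returns None if the page is never dropped.
    """
    consecutive_403s = 0
    for attempt, status in enumerate(status_codes, start=1):
        if status == 403:
            consecutive_403s += 1
            if consecutive_403s >= threshold:
                return attempt
        else:
            # Assumption: any other response breaks the streak.
            consecutive_403s = 0
    return None
```

For example, a page that returns 403 on every crawl would be dropped on the fourth attempt, while a page that is briefly restored in between would start the count over.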
The Google Search Appliance, which is used for IT@UMN searches and University of Minnesota search, may take approximately one month to drop a page from its search index, following the pattern described in the Google Search Appliance documentation.