As a natural continuation of this, today we’d like to announce that we’re adjusting our indexing system to look for more HTTPS pages. Specifically, we’ll start crawling HTTPS equivalents of HTTP pages, even when the former are not linked to from any page. When two URLs from the same domain appear to have the same content but are served over different protocol schemes, we’ll typically choose to index the HTTPS URL if:
- It doesn’t contain insecure dependencies.
- It isn’t blocked from crawling by robots.txt.
- It doesn’t redirect users to or through an insecure HTTP page.
- It doesn’t have a rel=”canonical” link to the HTTP page.
- It doesn’t contain a noindex robots meta tag.
- It doesn’t have on-host outlinks to HTTP URLs.
- The sitemaps lists the HTTPS URL, or doesn’t list the HTTP version of the URL
- The server has a valid TLS certificate.
Although our systems prefer the HTTPS version by default, you can also make this clearer for other search engines by redirecting your HTTP site to your HTTPS version and by implementing the HSTS header on your server.
We’re excited about taking another step forward in making the web more secure. By showing users HTTPS pages in our search results, we’re hoping to decrease the risk for users to browse a website over an insecure connection and making themselves vulnerable to content injection attacks. As usual, if you have any questions or comments, please let us know in the comments section below or in our webmaster help forums.
Posted by Zineb Ait Bahajji, WTA, and the Google Security and Indexing teams