What heuristics do browsers use to cache resources not explicitly set to be cachable?


Let's assume every browser we are interested in is at least as modern as Internet Explorer 8 (older browsers such as IE5 behave terribly with caching headers).

There is only ONE standards-based way of controlling caching (introduced with HTTP/1.1): the Cache-Control HTTP header.
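To make the header concrete, here is a minimal sketch of composing a Cache-Control value from its standard directives. The helper name `cache_control` and its parameters are hypothetical, made up for illustration; only the directive names themselves (`public`, `private`, `no-cache`, `no-store`, `max-age`, `must-revalidate`) come from HTTP/1.1.

```python
from typing import Optional


def cache_control(max_age: Optional[int] = None, *, public: bool = False,
                  private: bool = False, no_cache: bool = False,
                  no_store: bool = False,
                  must_revalidate: bool = False) -> str:
    """Build a Cache-Control header value (hypothetical helper)."""
    directives = []
    if public:
        directives.append("public")
    if private:
        directives.append("private")
    if no_cache:
        directives.append("no-cache")
    if no_store:
        directives.append("no-store")
    if max_age is not None:
        # max-age is a lifetime in seconds, relative to the response time.
        directives.append(f"max-age={max_age}")
    if must_revalidate:
        directives.append("must-revalidate")
    return ", ".join(directives)


# A static asset that any cache may keep for one hour.
print("Cache-Control:", cache_control(max_age=3600, public=True))
# A sensitive page that should never be written to a cache.
print("Cache-Control:", cache_control(no_store=True))
```

A response that sends one of these values is no longer subject to the heuristics discussed in this question, because the freshness lifetime is stated explicitly.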

Since at least 1996 IE has been using an opt-out policy for caching HTTPS content.

Seemingly since its introduction, Chrome has used opt-out caching for HTTPS (i.e. it will cache HTTPS content unless told not to). In 2011, Firefox 4 (but not Safari) switched to opt-out caching for HTTPS content. Source.

Recommendations

  1. Only use HTTP headers to control browser caching. If you decide to go against this, be aware that IE only recognizes two cache control directives that are set inside HTML:

    <META HTTP-EQUIV="Pragma" CONTENT="no-cache">
    <META HTTP-EQUIV="Expires" CONTENT="-1">

    and seemingly only the former is useful in the HTTPS scenario. Further, there can be problems when trying to use Pragma in IE. Finally, Chrome ignores cache directives in meta tags, which reduces their usefulness even further.

  2. Don't use the Expires header. In modern browsers Expires is superseded by Cache-Control. Expires: 0 and Pragma: no-cache are technically invalid response headers: 0 is not a valid HTTP date, and Pragma is only defined for requests. Yes, they have existed since the beginning, but not all modern browsers (e.g. Chrome) use them.

  3. The Vary header is a minefield. It behaves differently in older versions of IE and with XHR. Finding out the details is left as an exercise for the reader, and it leaves the impression that it is preferable to use different URLs for different content.

  4. Allow the browser to make conditional requests by setting ETags. ETags allow a browser to do a lightweight check to see whether the content has changed, so it can avoid downloading the full response if it hasn't.

  5. Be aware some browsers are just broken and need hacks. IE 8 can have issues downloading files which it has been told not to cache.
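The conditional-request flow from recommendation 4 can be sketched as follows. This is an illustration under stated assumptions, not a production server: `make_etag` and `respond` are hypothetical helpers, the ETag is simply a truncated SHA-256 of the body, and the If-None-Match parsing is deliberately naive (it splits on commas and ignores weak `W/` validators).

```python
import hashlib
from typing import Dict, Optional, Tuple


def make_etag(body: bytes) -> str:
    """Derive a quoted ETag from the body (assumption: a content hash)."""
    return '"' + hashlib.sha256(body).hexdigest()[:16] + '"'


def respond(body: bytes,
            if_none_match: Optional[str] = None
            ) -> Tuple[int, Dict[str, str], bytes]:
    """Return (status, headers, payload) for a GET of `body`."""
    etag = make_etag(body)
    headers = {"ETag": etag}
    if if_none_match is not None:
        # Naive parse: the header may list several ETags, or "*".
        candidates = [tag.strip() for tag in if_none_match.split(",")]
        if "*" in candidates or etag in candidates:
            # Unchanged: send 304 Not Modified with an empty payload.
            return 304, headers, b""
    return 200, headers, body


# First request: full response, and the browser stores the ETag.
status, headers, payload = respond(b"hello")
# Later request: the browser sends the stored ETag back.
status_again, _, payload_again = respond(b"hello", headers["ETag"])
print(status, status_again, len(payload_again))
```

The second exchange still costs a round trip, but the body is not transferred again, which is the "lightweight check" described above.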

Browser caching algorithms


From Chromium's source code: https://code.google.com/p/chromium/codesearch#chromium/src/net/http/http_response_headers.cc&l=1082&rcl=1421094684

  if ((response_code_ == 200 || response_code_ == 203 ||
       response_code_ == 206) && !must_revalidate) {
    // TODO(darin): Implement a smarter heuristic.
    Time last_modified_value;
    if (GetLastModifiedValue(&last_modified_value)) {
      // The last-modified value can be a date in the future!
      if (last_modified_value <= date_value) {
        lifetimes.freshness = (date_value - last_modified_value) / 10;
        return lifetimes;
      }
    }
  }


This blog post says that Internet Explorer 9 uses max-age = (DownloadTime - LastModified) * 0.1: http://blogs.msdn.com/b/ie/archive/2010/07/14/caching-improvements-in-internet-explorer-9.aspx

This is effectively the same heuristic that Mozilla uses (that post is rather old, and I don't know whether it has changed since): https://developer.mozilla.org/en-US/docs/HTTP_Caching_FAQ
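As a worked example of this 10% rule, here is a small Python sketch that mirrors the Chromium excerpt quoted above. The function name `heuristic_freshness` is hypothetical, and returning `None` when the rule does not apply is my own choice; the excerpt does not show what Chromium does in that case.

```python
from datetime import datetime, timedelta, timezone
from typing import Optional

# Status codes the quoted excerpt applies the heuristic to.
HEURISTIC_STATUS_CODES = {200, 203, 206}


def heuristic_freshness(status: int, date: datetime,
                        last_modified: Optional[datetime],
                        must_revalidate: bool = False
                        ) -> Optional[timedelta]:
    """Freshness lifetime as 10% of (Date - Last-Modified), else None."""
    if status not in HEURISTIC_STATUS_CODES or must_revalidate:
        return None
    # Last-Modified may be missing, or may lie in the future.
    if last_modified is None or last_modified > date:
        return None
    return (date - last_modified) / 10


# A resource last modified 10 days before it was served.
date = datetime(2015, 1, 11, tzinfo=timezone.utc)
last_modified = datetime(2015, 1, 1, tzinfo=timezone.utc)
print(heuristic_freshness(200, date, last_modified))  # 1 day, 0:00:00
```

So a file that had not changed for 10 days when it was fetched would be treated as fresh for roughly one more day before the browser revalidates it.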