
HTTP Caching and Proxy Behavior
HTTP was designed to be proxied and cached heavily. HTTP caching, by both intermediary proxies and by clients themselves, is to be expected in all cases where it is allowed. This is a fundamental design point that allows HTTP to work at high scale.
That means that whenever a response is defined as cacheable, for any reason, the server implementation should assume it will be cached. As a result, the server may never see follow-up requests if it does not specify appropriate Cache-Control directives on cacheable responses.
The following HTTP methods are defined as cacheable: HEAD, GET, and POST (RFC 7231, section 4.2.3).
Responses with a status code of any of the following are defined as cacheable: 200, 203, 204, 206, 300, 301, 404, 405, 410, 414, and 501 (RFC 7231, section 6.1).
A common misconception is that requests issued over a secure HTTP connection are not cached for security reasons. In fact, the HTTP specification makes no exception for HTTPS: caching works in exactly the same way as for unencrypted HTTP, and most modern browsers apply the same caching algorithm to secure connections.
Most Python HTTP client libraries are extremely conservative about caching, so a whole class of completely valid RFC caching behavior will never be seen when using those clients. Assuming that because something "works in the Python toolchain" it will work in all cases, or is the only way to implement HTTP, is a mistake. We expect that in-browser JavaScript clients will have vastly different cache semantics (that are completely valid by the RFC) than the existing Python clients.
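As a hedged illustration of how much behavior varies between client stacks, a plain requests session performs no HTTP caching at all, while wrapping it with the third-party cachecontrol package adds RFC-style caching on top of the same calls. The URL below is a placeholder:

    import requests
    from cachecontrol import CacheControl  # pip install cachecontrol

    plain = requests.Session()                  # every GET hits the origin server
    caching = CacheControl(requests.Session())  # honors Cache-Control headers

    resp = caching.get('https://example.com/v2/resources')
    # A repeat of the same GET may now be answered from the local cache,
    # depending entirely on the Cache-Control headers the server sent.
    resp = caching.get('https://example.com/v2/resources')

The same server response is therefore treated very differently depending on which client consumed it, which is exactly why the server must be explicit about its caching intent.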
Thinking carefully about cache semantics when implementing anything in the OpenStack API is critical to the API being compatible with the vast range of runtimes, programming languages, and proxy servers (open and commercial) that exist in the wild.
Cache Headers in Practice
Given what is said above ("caching [...] is to be expected in all cases"), services MUST provide appropriate Cache-Control headers to avoid bugs like those described in bug 1747935, wherein an intermediary proxy caches a response indefinitely despite a change in the underlying resource.
To avoid this problem, at a minimum, responses defined above as "cacheable" that do not otherwise control caching MUST include a header of:
Cache-Control: no-cache
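As a rough sketch (the application name and payload are illustrative, not taken from any real OpenStack service), a minimal WSGI application might attach the header to a cacheable 200 response like this:

    def resource_app(environ, start_response):
        body = b'{"name": "example"}'
        start_response('200 OK', [
            ('Content-Type', 'application/json'),
            ('Content-Length', str(len(body))),
            # Without this, an intermediary proxy is free to cache the
            # response using heuristic freshness, possibly indefinitely.
            ('Cache-Control', 'no-cache'),
        ])
        return [body]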
Despite how it sounds, no-cache (defined by RFC 7234, section 5.2.1.4) means only use a cached resource if it can be validated against the origin server. However, in the absence of headers which can be sent back to the server in an If-Modified-Since or If-None-Match conditional request, no-cache means no caching will happen. For more on validation see RFC 7234, section 4.3.
This means that at least all responses to GET requests that return a 200 status need the header, unless explicit caching requirements are expressed in the response.
MDN provides a good overview of the Cache-Control header and some guidance on ways to indicate that caching is desired. If caching is expected, then in addition to the Cache-Control header, headers such as ETag or Last-Modified must also be present.
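Where caching is actually desired, a sketch (again purely illustrative, with a made-up freshness lifetime) might pair an explicit freshness directive with a validator so that caches can revalidate once that lifetime expires:

    import email.utils
    import time

    def doc_app(environ, start_response):
        body = b'<html>relatively static content</html>'
        start_response('200 OK', [
            ('Content-Type', 'text/html'),
            ('Content-Length', str(len(body))),
            # Allow shared caches to serve this for up to five minutes.
            ('Cache-Control', 'public, max-age=300'),
            # Last-Modified (or an ETag) lets a cache send a conditional
            # request instead of refetching the full body after expiry.
            ('Last-Modified', email.utils.formatdate(time.time() - 3600,
                                                     usegmt=True)),
        ])
        return [body]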
Describing how to do cache validation and conditional request handling is out of scope for these guidelines because the requirements will be different from service to service.