HTTP caching 101

Every browser implements its own HTTP cache, held in memory and on disk. The information about the cache size per browser is spotty, but one thing is certain: the cache sizes vary. The great thing is that browsers are smart nowadays – they manage their caches transparently for us, the end users.

There are a few ways to put these caches to use, but it all starts with HTTP caching directives (or headers). The two HTTP response headers used for specifying freshness (that is, how long a cached response may be reused without asking the server again) are Cache-Control and Expires:

  • Expires sets an explicit date and time when the content expires;
  • Cache-Control specifies how long the browser can cache the content relative to the fetch time

When a response specifies both headers, Cache-Control takes precedence.
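
A minimal sketch of that precedence rule, assuming the response headers arrive as a plain Hash. The freshness_lifetime helper is a hypothetical name used only for illustration; it is not part of any library, and real caches handle many more directives than max-age.

```ruby
require 'time'

# Hypothetical helper: returns the freshness lifetime in seconds,
# or nil when neither header provides one.
def freshness_lifetime(headers)
  # Cache-Control: max-age wins whenever it is present.
  cache_control = headers['Cache-Control'].to_s
  if (match = cache_control.match(/\bmax-age=(\d+)/i))
    return match[1].to_i
  end

  # Otherwise fall back to Expires, measured relative to the Date header.
  expires = headers['Expires']
  date = headers['Date']
  return nil unless expires && date

  (Time.httpdate(expires) - Time.httpdate(date)).to_i
rescue ArgumentError
  # An unparseable date is treated as already expired.
  0
end
```

With both headers set, the Expires value is ignored; it only matters when max-age is absent.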

Another way to put the browser caches to use is by using conditional requests and revalidation. We use the Last-Modified and ETag (Entity Tag) response headers for that purpose:

  • Last-Modified is a timestamp that specifies when the backend last changed the object.
  • ETag is a unique identifier for the content as a string. The server decides its format, but usually, it’s some form of a hash.

As we advance, we will explore caching by using conditional GET requests. In-depth!

Conditional HTTP requests

Conditional requests are HTTP requests that include headers that indicate a certain precondition. When said headers are present, the HTTP server must check the condition. The server must perform the check before executing the HTTP method against the target resource. The result of the request can be different based on the result of the check of the precondition.

At the core of the conditional HTTP requests are validators. Validators are metadata the user-agent and the server use to discern if a resource is stale or not. The revalidation mechanisms implemented by the HTTP server check if the resource stored on the server matches a specific version that the user-agent has cached. The HTTP server performs the check (validation) based on the validators (metadata) sent by the user-agent.

Validators fall in two categories:

  • timestamp of last modification of the resource - the Last-Modified header
  • unique string representing the resource version - the ETag (entity tag) header

Validator types

Both validators, Last-Modified and ETag, allow two types of validation: weak and strong. Depending on the case, the complexity of implementing the validation can vary.

Strong validators

Strong validators are metadata that changes its value whenever the representational data of the resource changes. The representational data is the data format that would be observable in the payload body of an HTTP 200 OK response to GET /resource/:id.

I know that’s loaded, so let me explain through an oversimplified example. Imagine we have a user resource served by GET /users/:id. For example, the response body of GET /users/1 would be:

{
  "id": 1,
  "name": "Jane",
  "age": 30
}

A strong validator in this particular case can be an ETag composed based on the user’s id, name, and age attributes. In Ruby, the MD5 hash of these three values would be:

>> Digest::MD5.hexdigest("1-Jane-30")
=> "fb324ab8bda9e1cbb47c2a001fa36349"

If any of the attributes of the user change, given that our ETag is supposed to be a strong validator, we would like the ETag value to change. For example, after Jane’s birthday, the age will increase to 31. The new ETag would be:

>> Digest::MD5.hexdigest("1-Jane-31")
=> "f78e2fe80b589cd55a2fef324e877d34"

But also the change in the representational data of the resource is observable by GET-ing it from the server, e.g., GET /users/1:

{
  "id": 1,
  "name": "Jane",
  "age": 31
}

In other words, the ETag changed because of a change that the user-agent can also observe in the data of the resource itself – not because of a magical field in the server’s database that the user-agent won’t know about.

There are scenarios, such as content negotiation, in which strong validators can change. The bottom line is: the origin server should only change the validator’s value when it is necessary to invalidate the cached responses in caches and other user-agents.

HTTP treats an entity tag as strong by default, and it provides a way to mark a tag as weak: the origin server prefixes the tag with the W/ string. Which comparison is applied depends on the precondition header; If-None-Match, for instance, uses weak comparison.

Weak validators

A weak validator is metadata that might not change for every change to the representation data. It’s the opposite of the strong validators – even if the representation data changes, the HTTP server considers the change not worth busting the cached representation in the user-agents and other proxies/caches.

Building on the example from above, let’s imagine that the representation of the user is as follows:

{
  "id": 1,
  "name": "Jane",
  "age": 31,
  "position": "Software Engineer"
}

If this user is promoted, from Software Engineer to Senior Software Engineer, a strong validator would be updated, causing user-agents and proxies to invalidate their caches. But what about weak validators?

It depends. The validator’s weakness might stem from how the product uses the data or the context in which it resides. For example, say we have an endpoint as part of an employee directory within a company. In that context, the change of age (i.e. birthday) might not be the reason to invalidate a resource. Invalidation in such an event will bust the caches, and we will incur a performance penalty for a non-essential change.

If that is the case, the ETag will not take all attributes from the representation into account – only the relevant ones, hence being a weak ETag. For example:

>> etag = Digest::MD5.hexdigest("1-Jane-Software-Engineer")
=> "d18968bf453b0208dbbbcb5bd72af3e1"

But a change of position (a.k.a. promotion) is relevant information that should cause the resource to be revalidated by caches.

In other words, an origin server should change a weak entity tag whenever it considers prior representations to be unacceptable as a substitute for the current representation.
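
The difference can be sketched side by side. This is an illustration under assumptions: the User struct and the strong_etag and weak_etag helpers are hypothetical names, and the choice of which attributes count as "relevant" belongs to the application.

```ruby
require 'digest'

# Hypothetical stand-in for the user resource from the examples above.
User = Struct.new(:id, :name, :age, :position, keyword_init: true)

# Strong tag: covers every attribute in the representation.
def strong_etag(user)
  digest = Digest::MD5.hexdigest([user.id, user.name, user.age, user.position].join('-'))
  %("#{digest}")
end

# Weak tag: ignores age and carries the W/ prefix on the wire.
def weak_etag(user)
  digest = Digest::MD5.hexdigest([user.id, user.name, user.position].join('-'))
  %(W/"#{digest}")
end

jane = User.new(id: 1, name: 'Jane', age: 30, position: 'Software Engineer')
after_birthday = jane.dup
after_birthday.age = 31

puts strong_etag(jane) == strong_etag(after_birthday) # false: the representation changed
puts weak_etag(jane) == weak_etag(after_birthday)     # true: the change is not considered relevant
```

A promotion, on the other hand, changes the position attribute, so both tags change.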

Figure: Revalidating weak ETags

Imagine Jane getting a promotion and then still seeing their old title in the employees' directory, even though they worked so hard to get the promotion. Soul-crushing. And all that because of a weak ETag. So choose your validators wisely!

Deeper into the validators

Let’s look at the format of Last-Modified and ETag, how to generate them, how to perform the comparison, and when to use them.

Last-Modified

As we already established, the Last-Modified response header is a timestamp that indicates when the origin server believes the selected resource was last modified while handling the request.

For example, if the origin server serves a request to GET /users/1 at Date: 2021-05-05 12:00:00, then the Last-Modified header will be set to a timestamp that the server knows the user with an ID of 1 was last updated. When the origin server responds to a GET /users/1, even a second later in time (at Date: 2021-05-05 12:00:01), the server can potentially return a different Last-Modified as in the meantime the underlying entity might be updated.

Setting the Last-Modified

In the wild, the Last-Modified header on the response is rarely generated based on a single resource. When set on web pages, the Last-Modified header will be set based on all the different resources that are rendered on the page.

Therefore, the most straightforward way to generate the Last-Modified header is to take the most recent time that the server has changed any of those resources. An additional angle to consider is whether every resource rendered on the page is worth busting the cache for.

For example, a page that renders TV shows for watching, with a list of recommended shows in the footer, could set the Last-Modified to the most recent time the server updated the TV show entity. When the recommendations in the footer change, the origin server can still keep the Last-Modified tied to the show update time, as the recommendations are secondary to the page.

When the TV show gets a new season, this is an event when the server should update the Last-Modified header, as we would like all users to see that we have added a new season.

But if the recommendations keep people watching and discovering new shows, then more recent (and better) recommendations should also be taken into account when setting the Last-Modified header.
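
That decision can be sketched as a small function. The page_last_modified helper and its include_recommendations flag are hypothetical, and the timestamps are made up for the example; the point is only that the header value is the maximum over whichever resources the product considers essential.

```ruby
require 'time'

# Hypothetical helper: picks the Last-Modified time for the page from the
# resources that are considered worth busting the cache for.
def page_last_modified(show_updated_at, recommendation_times, include_recommendations: false)
  candidates = [show_updated_at]
  candidates.concat(recommendation_times) if include_recommendations
  candidates.max
end

show_updated_at = Time.utc(2021, 5, 5, 12, 0, 0)
recommendation_times = [Time.utc(2021, 5, 6, 9, 30, 0)]

puts page_last_modified(show_updated_at, recommendation_times).httpdate
# Wed, 05 May 2021 12:00:00 GMT
puts page_last_modified(show_updated_at, recommendation_times, include_recommendations: true).httpdate
# Thu, 06 May 2021 09:30:00 GMT
```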

ETag

ETag is an opaque validator - as we already mentioned, the server generates the entity tags, but how it generates them is unknown to the user-agent. In fact, from the user-agent perspective, the approach to generating the ETag is irrelevant. It’s only crucial that the ETag is changed once the resource representation changes, so the user-agent can revalidate when the change happens.

The main principle of generating and sending an ETag is always to create one from which the server can reliably and consistently determine changes to the underlying resource. Proper usage of ETags can substantially reduce HTTP network traffic and can be a significant factor in improving service scalability and reliability.

Comparing ETags

The opaqueness of ETags forced their inventors to add a weakness indicator: the W/ prefix, placed before the quoted value. For example, two ETag headers with the same value are handled differently depending on whether the indicator is present:

# Weak ETag
ETag: W/"3cb377-13f27-5c0c816db0d40"

# Strong ETag
ETag: "3cb377-13f27-5c0c816db0d40"

Although they contain the same value, these two headers are different due to the weakness indicator. Looking at the example above, we can see that strong is the default, or in other words, an ETag is weak only if the weak indicator (W/) is present.

A comparison table on weak and strong ETags looks like this:

ETag 1     ETag 2     Strong comparison   Weak comparison
W/"Foo"    W/"Foo"    No match            Match
W/"Foo"    W/"Bar"    No match            No match
W/"Foo"    "Foo"      No match            Match
"Foo"      "Foo"      Match               Match

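The two comparison functions can be sketched in a few lines of Ruby. The helper names (weak?, opaque_tag, strong_match?, weak_match?) are hypothetical, and the sketch assumes each argument is a single well-formed entity tag as it would appear in a header.

```ruby
# True when the tag carries the weakness indicator.
def weak?(etag)
  etag.start_with?('W/')
end

# Strips the weakness indicator, leaving only the quoted opaque value.
def opaque_tag(etag)
  etag.sub(%r{\AW/}, '')
end

# Strong comparison: both tags must be strong and character-for-character equal.
def strong_match?(first, second)
  !weak?(first) && !weak?(second) && first == second
end

# Weak comparison: the opaque values must be equal; weakness is ignored.
def weak_match?(first, second)
  opaque_tag(first) == opaque_tag(second)
end

puts strong_match?('W/"Foo"', '"Foo"') # false
puts weak_match?('W/"Foo"', '"Foo"')   # true
```
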
Conditional requests semantics and mechanisms

Now that we understand the validator headers and their intricacies, let’s look at the precondition and header semantics.

There are five headers used for communicating preconditions between the user-agent and the servers:

  1. If-Match
  2. If-None-Match
  3. If-Modified-Since
  4. If-Unmodified-Since
  5. If-Range

In this article, we are looking at only conditional reads, or conditional GETs, as a way of caching content at the user-agent. That’s why, out of the five headers above, we will look only at the 2nd and 3rd: If-None-Match and If-Modified-Since. The other three are used for conditional requests but of a different kind that we will look into in another article.

Before we continue investigating the two headers, let’s first familiarize ourselves with the example application that we’ll be using.

Familiarizing with our application

We will begin this exploration by looking at a sample application. For my comfort, we will use a Ruby on Rails application. Yet, the concepts discussed can be transferred to any web framework or language, as long as it can speak HTTP.

The application that we will be using to demonstrate these concepts is the “Sample app” built as part of the most popular Ruby on Rails book - Ruby on Rails Tutorial by Michael Hartl. The complete source code is available on GitHub.

The sample application is a Twitter clone. It is a multi-tenant application where users can create microposts. The application’s root path (/) points to the timeline of the logged-in user, which the StaticPagesController#home action renders:

class StaticPagesController < ApplicationController
  def home
    if logged_in?
      @feed_items = current_user.feed.paginate(page: params[:page])
    end
  end
end

Below you can see what the microposts feed looks like:

Let’s use this action and explore the various ways we can implement conditional HTTP requests.

Conditional requests using ETags and If-None-Match

As mentioned in RFC-7232, the implementation of ETags is left to the server itself. That’s why there is no single implementation of ETags. It is every server for itself.

The case for using conditional requests, in this case, is: when a user is logged in, we would like not to send bytes over the network as long as there are no new @feed_items on the page.

Why? Think about this: we open the page, get the latest microposts from the people we follow and read them. That’s it. The next time we refresh the page, if there are no new microposts on the page, our browser already has all the data in its cache, so there’s no need to fetch new bytes from the server.

In such cases, we want to skip (most of) the network traffic and get an HTTP 304 Not Modified from the server. As stated in section 4.1 of RFC-7232:

The 304 (Not Modified) status code indicates that a conditional GET or HEAD request has been received and would have resulted in a 200 (OK) response if it were not for the fact that the condition evaluated to false. In other words, there is no need for the server to transfer a representation of the target resource because the request indicates that the client, which made the request conditional, already has a valid representation; the server is therefore redirecting the client to make use of that stored representation as if it were the payload of a 200 (OK) response.

This excerpt is packed, but the main point is: the client made the request conditional (using a header), the server evaluated the condition, and by returning an HTTP 304 it told the client that its cached copy is still fresh.

In our example, having the latest @feed_items means that there’s nothing new to be returned by the server, so the server informs the client that the cached @feed_items are still valid to be used by returning HTTP 304.

Now, how can we practically do that?

A manual approach

As we mentioned before, the ETag is a string that differentiates between multiple representations of the same resource. It means that if the @feed_items collection changes, the ETag must also change. In addition, two identical @feed_items collections must have the same ETags.

Knowing all of this, our ETag has to consider the id and the content of the Micropost (the @feed_items collection contains multiple Micropost objects). If the id or content changes on any Micropost, we want the ETag to change. When the ETag changes, the client will fail the revalidation condition, and the server will respond with the latest data.

Therefore, the ETag of the @feed_items collection will consist of all ETags of each Micropost in the collection. The implementation would look like this:

# app/models/micropost.rb

def my_etag
  [id, Digest::MD5.hexdigest(content)].join('-')
end

If we run this method against a micropost:

>> m = Micropost.last
=> #<Micropost id: 1, content: "Soluta dolorem aspernatur doloremque vel.", user_id: 1, created_at:...
>> m.id
=> 1
>> m.content
=> "Soluta dolorem aspernatur doloremque vel."
>> m.my_etag
=> "1-3268ba55dd9915c975821eda93eb22dc"

We concatenate the MD5 hash of the content and the id to form an ETag for the micropost.

Usually, we generate the ETag using an MD5 hash of the resource, allowing us to use a single string value, with 32 hexadecimal digits, instead of long arbitrary strings.

The MD5 hash will also change if any of the parameters used to generate it get changed.

Now, back to our controller:

# app/controllers/static_pages_controller.rb

def home
  if logged_in?
    @feed_items = current_user.feed.paginate(page: params[:page])

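    # An entity tag travels as a quoted string, so wrap the joined value in double quotes.
    my_etag = %("#{@feed_items.map(&:my_etag).join('-')}")

    if my_etag != request.headers['If-None-Match']
      # The client's copy is missing or stale: send the new tag with the full response.
      response.set_header('ETag', my_etag)
    else
      # The client's copy is current: reply with a 304 and no body.
      head :not_modified
    end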
  end
end

We generate my_etag for the @feed_items collection by joining all the Micropost#my_etag outputs for each micropost in the collection. Then, we take the If-None-Match header from the request and compare it with the my_etag value. If the two are the same, that means that the browser already has the latest @feed_items in its cache, and there’s no point for the server to return it again. Instead, it returns a response with HTTP 304 status and no body.

If the two values differ, then we set the response header ETag to that new value, and we return the whole response.

But why the If-None-Match header? As described in section 3.2 of RFC-7232:

The “If-None-Match” header field makes the request method conditional on a recipient cache or origin server either not having any current representation of the target resource, or having a selected representation with an entity-tag that does not match any of those listed in the field-value.

In other words, the client sends the ETag value in the If-None-Match header, and our server decides whether that’s the valid representation of the content to be served. If it is valid, then the server will return an HTTP 304 to the client.
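
The header can carry more than a single tag: it may hold a comma-separated list, or the special value *. A rough sketch of handling those forms follows, assuming the resource exists and the request is a GET. The etag_matches? helper is a hypothetical name; it applies the weak comparison that If-None-Match calls for.

```ruby
# Hypothetical helper: returns true when the current entity tag matches the
# If-None-Match header, meaning a GET can be answered with a 304.
def etag_matches?(if_none_match, current_etag)
  return false if if_none_match.nil? || if_none_match.strip.empty?
  # "*" matches any current representation of an existing resource.
  return true if if_none_match.strip == '*'

  # Weak comparison: drop the W/ prefix before comparing the quoted values.
  strip_weak = ->(tag) { tag.sub(%r{\AW/}, '') }

  # Pull every (optionally weak) quoted tag out of the comma-separated list.
  candidates = if_none_match.scan(%r{(?:W/)?"[^"]*"})
  candidates.any? { |tag| strip_weak.call(tag) == strip_weak.call(current_etag) }
end

puts etag_matches?('"abc", W/"def"', '"def"') # true
puts etag_matches?('"abc"', '"def"')          # false
```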

If we send a request to the page again, we will see the following output:

Our (massive!) custom ETag is present. While such a long header is not practical, it still works. If we rerun the same request, by refreshing the page, we will see the server responding with an HTTP 304:

That happened because our browser sent the ETag value as part of the If-None-Match header:

Voila! We implemented our conditional requests using Rails. Before we go on, there’s one thing I’d like to point out – avoid custom implementations. Our implementation is problematic, as it does not consider the differences between strong and weak validators.

There’s a better way to do it – using the built-in helpers by the web framework or a library for the respective language/framework.

Using helpers

Similar to most web frameworks, Rails provides helpers to deal with conditional GET requests. In our particular case, we can substitute all of the custom code we wrote with a single method call: fresh_when.

Here’s the new version of our controller action:

# app/controllers/static_pages_controller.rb

def home
  if logged_in?
    @feed_items = current_user.feed.paginate(page: params[:page])

    fresh_when(etag: @feed_items)
  end
end

Internally fresh_when will do what our custom code did before, and a bit more.

If we inspect the method’s source code, we will see that fresh_when can handle both strong and weak ETags and the Last-Modified header (which we will look into soon).

In our code snippet above, we explicitly set the etag to be set by fresh_when based on the @feed_items collection.

But how does Rails know how to calculate the ETag? Well, internally fresh_when calls the ActionDispatch::Http::Cache::Request#fresh? method, which handles the ETag validation. It taps into the If-None-Match value of the request object and the ETag header value of the response and compares the two.

If we test out the new code, we will see a very similar behavior as before:

The server set the ETag response header to a string, prepending it with W/ denoting a weak ETag. The server will exhibit the same behavior on the next request: an empty response body with the HTTP 304 Not Modified status.

Conditional requests using Last-Modified and If-Modified-Since

As we mentioned before, the Last-Modified header contains a timestamp specifying when the server last changed the object. If we continue to use the same example with @feed_items, we run into an interesting problem: from all the @feed_items in the collection, we have to return only a single last modified date. Which object’s last modified date should we pick then?

The easiest way to do it is to find the most recent last updated date among all the objects in the collection. Neatly, most Rails models have an updated_at attribute that is updated when the record in the database is changed – perfect for us to use as the last updated date.

If your framework or application does not have the updated_at attribute (or similar), you need to figure out another way to deduce when each record was last updated. We can find the last updated timestamp through an audit trail or another field in the database.

Still, I recommend adding an updated_at field as it is a neat way to solve the problem.

In the example application, that would look like:

last_modified = @feed_items.map(&:updated_at).max.httpdate

Knowing all this, let’s implement this manually and then using fresh_when.

A manual approach

Validating If-Modified-Since comes down to comparing two timestamps: the last_modified and the if_modified_since. We want to return an HTTP 304 when the last_modified is no more recent than the if_modified_since, that is, when nothing has changed since the client fetched its copy.

If the client does not send the If-Modified-Since header, we still need to set the Last-Modified header on the response, so the client can send that value back as If-Modified-Since on the subsequent request.

All of that, in code:

# app/controllers/static_pages_controller.rb

def home
  if logged_in?
    @feed_items = current_user.feed.paginate(page: params[:page])

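    # Keep the value as a time object so the comparison below is chronological,
    # not a comparison of formatted strings.
    last_modified = @feed_items.map(&:updated_at).max

    if request.headers.key?('If-Modified-Since')
      if_modified_since = Time.httpdate(request.headers['If-Modified-Since'])

      # HTTP dates have one-second resolution, so compare whole seconds.
      head :not_modified if if_modified_since.to_i >= last_modified.to_i
    end

    response.set_header('Last-Modified', last_modified.httpdate)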
  end
end

If we give this a shot in the browser, we will see that on the first request-response cycle, we will get the Last-Modified header set:

Following the spec, on the subsequent request, the browser will send the If-Modified-Since condition header, which will cause the server to make the comparison of the two dates. Once it determines that the dates are the same, it will return an HTTP 304:

If we were to update any of the @feed_items or create a new item, the last_modified date would be changed, and the conditional validation will fail, resulting in an HTTP 200 instead of an HTTP 304.
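
The comparison itself is easy to get subtly wrong, because HTTP dates formatted as strings do not sort chronologically ("Thu, 06 May" sorts before "Wed, 05 May"). Here is a small sketch of the check in plain Ruby, comparing parsed times instead. The unmodified_since? helper is a hypothetical name, and ignoring an unparseable header is an assumption of this sketch.

```ruby
require 'time'

# Hypothetical helper: returns true when the resource has not changed since
# the time in the If-Modified-Since header, so a 304 is appropriate.
def unmodified_since?(if_modified_since, last_modified)
  return false if if_modified_since.nil?

  since = Time.httpdate(if_modified_since)
  # HTTP dates have one-second resolution, so compare whole seconds.
  last_modified.to_i <= since.to_i
rescue ArgumentError
  # An unparseable header is ignored and the full response is sent.
  false
end

last_modified = Time.utc(2021, 5, 5, 12, 0, 0)

puts unmodified_since?('Wed, 05 May 2021 12:00:00 GMT', last_modified) # true
puts unmodified_since?('Tue, 04 May 2021 12:00:00 GMT', last_modified) # false
```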

Even though this implementation works fine, let’s see how we can do that with fresh_when.

Using fresh_when

Similar to before, finding the last_modified stays in the code. But all the other logic goes away:

# app/controllers/static_pages_controller.rb

def home
  if logged_in?
    @feed_items = current_user.feed.paginate(page: params[:page])

    last_modified = @feed_items.map(&:updated_at).max

    fresh_when(last_modified: last_modified)
  end
end

That’s all! We substituted that logic with a single line of fresh_when. If we rerun the same tests, the behavior will stay identical:

In the first request-response cycle, we got the Last-Modified header set on the response.

And similar to before, on the following request, the browser will send the If-Modified-Since condition header, which will cause the server to make the comparison of the two dates. Once it determines that the dates are the same, it will return an HTTP 304:

As you can see, both the manual and the built-in solutions work identically. Revalidating requests using Last-Modified and If-Modified-Since is a powerful mechanism of speeding up our applications by not sending (useless) bytes over the network.

Outro

We began our exploration of conditional requests by looking at the specification. We familiarized ourselves with validators, conditions, and how they work. We then went on to explore how we can implement conditional HTTP requests with some header comparisons. Our Last-Modified implementation works as well as the built-in framework one!

We saw how implementing such optimizations can improve the performance of our web applications. We all know the fastest requests are the ones that are never sent. But as the title of this article says: the second-fastest are the ones that need no response body!

While there are more details that we could explore here, this covers the core of conditional GET requests. We could further explore conditional requests for static files (such as assets) in combination with Content Delivery Networks (CDNs). But that is a topic for another article.

And, as always, I hope you all learned something.
