There are several ways the browser or ProxySG can determine whether it should serve a cached object or retrieve a fresh copy from the web server.
Caching Method 1: Last-Modified
The server tells the browser what version of the file it is sending. A server can return a Last-Modified header along with the file like this:
Last-Modified: Fri, 16 Mar 2007 04:00:25 GMT
File Contents (could be an image, HTML, CSS, Javascript...)
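Now the browser knows that the file was last modified on Mar 16 2007. The next time the browser needs the file, it can do a special check with the server: it sends an If-Modified-Since header carrying that date, and if the file has not changed since then, the server replies with a short 304 “Not Modified” message instead of the file contents. Sending the short “Not Modified” message is a lot faster than downloading the file again, especially for giant JavaScript or image files.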
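The check described above can be sketched from the server's side. This is a minimal illustration, not how any particular web server or ProxySG implements it: the helper name respond_last_modified is hypothetical, and it assumes the browser repeats the stored date in an If-Modified-Since request header and that the server answers 304 when the file has not changed since then.

```python
from datetime import datetime, timezone
from email.utils import format_datetime, parsedate_to_datetime
from typing import Optional


def respond_last_modified(last_modified: datetime, if_modified_since: Optional[str]):
    """Return (status, headers) for a GET request (hypothetical helper).

    last_modified must be a timezone-aware UTC datetime.
    """
    headers = {"Last-Modified": format_datetime(last_modified, usegmt=True)}
    if if_modified_since is not None:
        try:
            since = parsedate_to_datetime(if_modified_since)
        except (TypeError, ValueError):
            since = None  # unparseable date: behave as if the header were absent
        if since is not None:
            if since.tzinfo is None:
                since = since.replace(tzinfo=timezone.utc)
            # HTTP dates have one-second resolution, so drop microseconds.
            if last_modified.replace(microsecond=0) <= since:
                return 304, headers  # Not Modified: no body needs to be sent
    return 200, headers  # send the full file
```

For a file last modified at Fri, 16 Mar 2007 04:00:25 GMT, a request that carries that same date gets a 304, while a request without the header gets the full 200 response.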
Caching Method 2: ETag
Comparing versions by modification time generally works, but it can lead to problems. For example, if the server’s clock was originally wrong and then got fixed, or if daylight saving time comes early and the server isn’t updated, the caches could be inaccurate.
ETags to the rescue. An ETag is a unique identifier given to every file, and behaves like a hash or fingerprint: every file gets a unique fingerprint, and if you change the file (even by one byte), the fingerprint changes as well.
Instead of sending back the modification time, the server can send back the ETag (fingerprint):
ETag: ead145f
File Contents (could be an image, HTML, CSS, Javascript...)
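The ETag can be any string which uniquely identifies the file. The next time the browser needs the file, it sends the ETag it has back to the server in an If-None-Match header. If the ETag still matches, the server replies with 304 “Not Modified”; otherwise it sends the new file along with its new ETag.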
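As a rough sketch of the fingerprint idea, the server can derive the ETag from a hash of the file contents and compare it with the value the browser sends back in an If-None-Match request header. The function names make_etag and respond_etag are hypothetical, and hashing the body is only one possible scheme; servers are free to build ETags differently. Note that in real HTTP traffic the ETag value is wrapped in double quotes.

```python
import hashlib
from typing import Optional


def make_etag(body: bytes) -> str:
    # Assumption: a truncated SHA-256 of the content serves as the fingerprint.
    return '"' + hashlib.sha256(body).hexdigest()[:16] + '"'


def respond_etag(body: bytes, if_none_match: Optional[str]):
    """Return (status, headers, body) for a GET request (hypothetical helper)."""
    etag = make_etag(body)
    headers = {"ETag": etag}
    if if_none_match is not None:
        # Simplification: split on commas and ignore the weak "W/" prefix.
        candidates = []
        for token in if_none_match.split(","):
            token = token.strip()
            if token.startswith("W/"):
                token = token[2:]
            candidates.append(token)
        if "*" in candidates or etag in candidates:
            return 304, headers, b""  # the browser's copy is still current
    return 200, headers, body
```

Changing even one byte of the body produces a different ETag, so a browser holding the old value receives the new file instead of a 304.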
Caching Method 3: Expires
Caching a file and checking with the server is effective, except that we are still checking with the server. If we know when the file expires, we keep using it until that date. As soon as it expires, we contact the server for a fresh copy, with a new expiration date. The header looks like this:
Expires: Tue, 20 Mar 2007 04:00:25 GMT
File Contents (could be an image, HTML, CSS, Javascript...)
In the meantime, we avoid even talking to the server if we’re in the expiration period.
There isn’t a conversation here; the browser has a monologue.
The web server didn’t have to do anything, and the user sees the file instantly.
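The browser's side of this "monologue" can be sketched as a simple date comparison. The helper name is_fresh_by_expires is hypothetical, and the sketch assumes that an unparseable Expires value is treated as already expired.

```python
from datetime import datetime, timezone
from email.utils import parsedate_to_datetime


def is_fresh_by_expires(expires_header: str, now: datetime) -> bool:
    """True if a cached copy may be used without contacting the server.

    now must be a timezone-aware datetime.
    """
    try:
        expires = parsedate_to_datetime(expires_header)
    except (TypeError, ValueError):
        return False  # invalid date: treat the copy as already expired
    if expires.tzinfo is None:
        expires = expires.replace(tzinfo=timezone.utc)
    return now < expires
```

With the header shown above, a check made on 19 Mar 2007 finds the copy fresh, while a check made on 21 Mar 2007 does not, and the browser goes back to the server.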
Caching Method 4: Max-Age
Using the expiration date is effective, but the server has to compute an explicit date for every response. The max-age directive of the Cache-Control header lets us say “This file expires 1 week from now”, which is simpler than setting an explicit date. For example, Cache-Control: max-age=604800 keeps the file fresh for one week.
Max-Age is measured in seconds. Here are a few quick conversions:
1 day in seconds = 86400
1 week in seconds = 604800
1 month in seconds = 2629000 (approximate; months vary in length)
1 year in seconds = 31536000 (effectively infinite on internet time)
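The conversions above do not need to be memorized. As a small sketch, the value can be computed from a time span, and freshness then becomes a comparison between the age of the cached copy and max-age. The function names cache_control_max_age and is_fresh_by_age are hypothetical.

```python
from datetime import timedelta


def cache_control_max_age(lifetime: timedelta) -> str:
    """Build a Cache-Control value that keeps a file fresh for `lifetime`."""
    return f"max-age={int(lifetime.total_seconds())}"


def is_fresh_by_age(age_seconds: int, max_age_seconds: int) -> bool:
    """True while the cached copy is younger than max-age."""
    return age_seconds < max_age_seconds
```

For example, cache_control_max_age(timedelta(weeks=1)) returns "max-age=604800".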
Bonus Header: Public and Private
The cache headers never cease. Sometimes a server needs to control who may cache certain resources, and whether they may be cached at all.
Cache-control: public means the cached version can be saved by proxies and other intermediate servers, where everyone can see it.
Cache-control: private means the file is different for different users (such as their personal homepage). The user’s private browser can cache it, but not public proxies.
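Cache-control: no-cache means a cached copy must not be reused without first checking with the server that it is still current. This is useful for things like search results where the URL appears the same but the content may change. To prevent the file from being stored at all, use Cache-control: no-store.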
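To illustrate how an intermediate cache might act on these directives, here is a simplified sketch. The function names are hypothetical, and real proxies weigh many more factors than this. It assumes the standard HTTP meanings: private and no-store keep a response out of shared caches, while no-cache allows storage but requires a check with the server before each reuse.

```python
def parse_cache_control(value: str) -> dict:
    """Parse a Cache-Control value into {directive: argument or None}.

    Simplification: quoted arguments that contain commas are not handled.
    """
    directives = {}
    for part in value.split(","):
        part = part.strip()
        if not part:
            continue
        name, sep, arg = part.partition("=")
        directives[name.strip().lower()] = arg.strip().strip('"') if sep else None
    return directives


def shared_cache_may_store(value: str) -> bool:
    """True if a proxy (a shared cache) may keep a copy of the response."""
    directives = parse_cache_control(value)
    return "private" not in directives and "no-store" not in directives


def needs_revalidation(value: str) -> bool:
    """True if a stored copy must be checked with the server before reuse."""
    return "no-cache" in parse_cache_control(value)
```

With this sketch, "public, max-age=604800" may be stored by a proxy, "private" may not, and "no-cache" may be stored but has to be revalidated on every use.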
However, be aware that the Cache-Control directives were introduced in HTTP/1.1, so older HTTP/1.0 browsers and proxies may ignore them. If you are doing special caching of authenticated pages, consult more detailed references on HTTP caching first.