To achieve optimum performance and results when caching content you should try to set both the Expires header and the Last-Modified header.
The Expires header should be set to a date in the future. It instructs the browser to cache the content and to reuse the cached copy, without contacting the server, until the Expires date has passed. If the content is very volatile (changes frequently) then the Expires date should be set to a lower value (in the near future); otherwise it should be set to a higher value (in the far future). This type of caching is referred to as "hard" caching because the browser does not reconnect to the server until the appointed time. Because of this, this type of caching is relatively fast.
The Last-Modified header should be set to a date in the past. It tells the browser when the content was last modified, so that the browser can keep using its cached copy for as long as the content has not been modified since that date. If the content has since been modified then the Last-Modified date should be updated to a new value (the date the content was modified) to reflect this. This type of caching is referred to as "soft" caching because the browser reconnects to the server on every request to check if the content has been modified. Because of this, this type of caching is relatively slow, although an unmodified response is still cheaper than sending the content again.
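To make the Expires value concrete, here is a minimal sketch of how such a header value could be computed for short-lived and long-lived content. The class name ExpiresSketch, the method expiresIn, and the sample timestamp are assumptions made up for illustration; only the standard java.time classes are used.

```java
import java.time.Instant;
import java.time.ZoneOffset;
import java.time.format.DateTimeFormatter;
import java.util.Locale;

// Hypothetical helper, not part of the servlet shown later in this article.
class ExpiresSketch {
    // HTTP dates use a fixed English format and are always expressed in GMT.
    private static final DateTimeFormatter HTTP_DATE =
        DateTimeFormatter.ofPattern("EEE, dd MMM yyyy HH:mm:ss 'GMT'", Locale.US)
                         .withZone(ZoneOffset.UTC);

    /** Returns an Expires value for content that stays fresh for the given number of seconds. */
    static String expiresIn(Instant now, long lifetimeSeconds) {
        return HTTP_DATE.format(now.plusSeconds(lifetimeSeconds));
    }

    public static void main(String[] args) {
        Instant now = Instant.parse("2015-10-21T07:28:00Z"); // sample "current" time
        System.out.println("Expires: " + expiresIn(now, 15));     // volatile content
        System.out.println("Expires: " + expiresIn(now, 86_400)); // stable content, one day
    }
}
```

With the sample time above, the first line printed is "Expires: Wed, 21 Oct 2015 07:28:15 GMT". In a servlet you would normally let response.setDateHeader do this formatting for you.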
The Last-Modified header works in conjunction with the If-Modified-Since header. The first time the content is requested the server sets the Last-Modified date in the response. The browser then receives this response and caches the content along with the Last-Modified date. In subsequent requests the browser sets the If-Modified-Since date equal to the Last-Modified date that was previously cached. The server then receives the request and gets the If-Modified-Since date for the purpose of comparing it to the content's Last-Modified date.
The server is responsible for comparing the If-Modified-Since date and the Last-Modified date to determine if the browser has the most recent version of the content. If the dates are equal then the browser already has the most recent version of the content and the server responds with an HTTP status code of 304 (Not Modified), which is the signal for the browser to load the content from its cache. If the dates are not equal then the browser does not have the most recent version of the content and the server responds by sending the updated content in its entirety along with the updated Last-Modified date.
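The comparison described above can be sketched in a few lines of plain Java. This is only an illustration under stated assumptions: the class ConditionalGetSketch and the method statusFor are hypothetical names, -1 stands for a missing If-Modified-Since header, and the sketch treats any If-Modified-Since date at or after the Last-Modified date as up to date rather than requiring strict equality.

```java
// Hypothetical sketch of the server-side decision; not taken from any servlet container.
class ConditionalGetSketch {
    static final int OK = 200;
    static final int NOT_MODIFIED = 304;

    /**
     * @param lastModifiedMs    when the content last changed, in epoch milliseconds
     * @param ifModifiedSinceMs date sent by the browser, or -1 if the header is absent
     * @return the HTTP status code the server would respond with
     */
    static int statusFor(long lastModifiedMs, long ifModifiedSinceMs) {
        if (ifModifiedSinceMs < 0) {
            return OK; // first request: the browser has nothing cached yet
        }
        // HTTP dates have one-second resolution, so compare whole seconds only.
        long lastModifiedSec = lastModifiedMs / 1000;
        long sinceSec = ifModifiedSinceMs / 1000;
        return sinceSec >= lastModifiedSec ? NOT_MODIFIED : OK;
    }

    public static void main(String[] args) {
        System.out.println(statusFor(5_000, -1));    // no cached copy
        System.out.println(statusFor(5_000, 5_000)); // dates equal
        System.out.println(statusFor(6_000, 5_000)); // content modified since
    }
}
```

The three calls print 200, 304 and 200 respectively. Comparing whole seconds matters because a timestamp with milliseconds would otherwise never match the date the browser sends back.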
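When you combine the Expires header and the Last-Modified header the resulting behavior is a hybrid of the two types of caching: until the Expires date passes the browser uses its cached copy without contacting the server, and after that it checks with the server and downloads the content again only if it has been modified. This gives the best result in terms of both performance and flexibility.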
Below is sample code that demonstrates how to set both the Expires header and the Last-Modified header in a Java Servlet. This Servlet makes use of the getLastModified method which automatically sets the Last-Modified date to the method's return value which is then compared to the If-Modified-Since date. Please note that the order in which the Expires header is set and the getLastModified method is called is important. The Expires header must always be set first which is why the service method is overridden for this purpose.
import java.io.IOException;
import javax.servlet.ServletException;
import javax.servlet.http.*;
import java.util.*;

public class CacheTestServlet extends HttpServlet
{
    private static long hits = 0;
    private static Date date;

    //This method sets the Expires date 15 seconds in the future
    //Typically the Expires date would be set to a higher value
    @Override
    public void service(HttpServletRequest request, HttpServletResponse response) throws ServletException, IOException
    {
        response.setDateHeader("Expires", new Date().getTime() + 15000);
        super.service(request, response);
    }

    //This method updates the Last-Modified date on every other request
    //This is meant to simulate content that is modified occasionally
    //Typically the Last-Modified date would be retrieved from a file or database
    @Override
    public long getLastModified(HttpServletRequest request)
    {
        hits++;
        if (hits % 2 == 1)
        {
            date = new Date();
            System.out.println("FULL HIT");
        }
        else
        {
            System.out.println("PARTIAL HIT");
        }
        //HTTP dates have one-second resolution, so the milliseconds are dropped
        return date.getTime() / 1000 * 1000;
    }

    @Override
    public void doGet(HttpServletRequest request, HttpServletResponse response) throws IOException
    {
        response.getWriter().println("The current date & time is " + new Date());
    }
}
Below is a summary of the Servlet's output:
- The first time you load the page you will see the current date & time and the phrase "FULL HIT" will be printed.
- If you immediately reload the page, then the page will be reloaded from the browser's cache without reconnecting to the server. The date & time will not change and neither the phrase "FULL HIT" nor the phrase "PARTIAL HIT" will be printed because the server does not actually receive any requests.
- If you wait at least 15 seconds before reloading the page, then the browser will reconnect to the server to determine if the content has been modified but since the Last-Modified date has not changed it will reload the page from the browser's cache as opposed to reloading it from the server. The date & time will not change but the phrase "PARTIAL HIT" will be printed.
- If you immediately reload the page, then the page will be reloaded from the browser's cache without reconnecting to the server. The date & time will not change and neither the phrase "FULL HIT" nor the phrase "PARTIAL HIT" will be printed because the server does not actually receive any requests.
- If you wait at least 15 seconds before reloading the page, then the browser will reconnect to the server to determine if the content has been modified and since the Last-Modified date has changed it will reload the page from the server as opposed to reloading it from the browser's cache. The date & time will change and the phrase "FULL HIT" will be printed.
- And so forth...
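The sequence above can also be reproduced without a browser or servlet container. The following is a rough simulation, not the real thing: the class CacheWalkthrough, its fields, and the "CACHE" label are all invented for illustration, and the browser is modeled as a cache that remembers only one Expires date and one Last-Modified date.

```java
// Hypothetical simulation of one browser cache talking to the servlet above.
class CacheWalkthrough {
    static final long LIFETIME_MS = 15_000; // same 15-second Expires window as the servlet

    // Server state, mirroring the servlet's getLastModified method.
    private long hits = 0;
    private long lastModified = 0;

    // Browser state: what the cache remembers about the page.
    private boolean cached = false;
    private long cachedExpires = 0;
    private long cachedLastModified = 0;

    /** Simulates one page load at the given time (epoch milliseconds). */
    String load(long nowMs) {
        if (cached && nowMs < cachedExpires) {
            return "CACHE"; // still fresh: no request is sent at all
        }
        // Expired or never cached: the request reaches the server.
        hits++;
        if (hits % 2 == 1) {
            lastModified = nowMs / 1000 * 1000; // content "modified" on every other request
        }
        boolean notModified = cached && cachedLastModified >= lastModified;
        // The servlet sets Expires before answering, so 200 and 304 both renew it.
        cachedExpires = nowMs + LIFETIME_MS;
        cachedLastModified = lastModified;
        cached = true;
        return notModified ? "PARTIAL HIT" : "FULL HIT";
    }

    public static void main(String[] args) {
        CacheWalkthrough sim = new CacheWalkthrough();
        long start = 1_000_000L; // arbitrary start time
        long[] offsetsSeconds = {0, 1, 16, 17, 32};
        for (long s : offsetsSeconds) {
            System.out.println("+" + s + "s: " + sim.load(start + s * 1000));
        }
    }
}
```

Running it prints FULL HIT, CACHE, PARTIAL HIT, CACHE and FULL HIT for the five loads, matching the summary above. Real browsers apply additional heuristics, so treat this only as a model of the two headers discussed here.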
Tip: Pressing F5 in your browser will cause the browser to ignore any "soft" caches and reload the page. Pressing CTRL+F5 in your browser will cause the browser to ignore any "hard" caches and reload the page.
Here is a good link regarding caching:
http://code.google.com/speed/page-speed/docs/caching.html