Tuesday, December 12, 2006

Response Cache - double request protection

A servlet filter that prevents the processing of double requests in Java web applications. When a subsequent identical request is detected, the original response is sent.
Problem statement

A standard problem in web applications is the handling of undesired double requests carrying the same data. A typical cause is double-clicking the submit button on a form, either out of user habit or because a slow server response makes the user think the action hasn't started. The user can also refresh the previous page, causing the last action to repeat, or unintentionally disturb the execution flow by using the browser's back button to send an old request again.

Double requests can have a serious negative impact on the application. Imagine a double shopping cart confirmation: in the worst case the user's account is charged twice (or more, if the submit button is clicked repeatedly). For database insert operations, double actions can duplicate data. Because after a double click both requests are processed concurrently for the same user session, this can lead to unpredictable interactions in the business logic or to database constraint violations.

Refreshing the last page has nearly the same negative effects as double clicking. However, because the actions are executed one after another, they are more predictable and easier to manage.

Use of the back button can disturb long-running transactions if the application requires a strict action sequence. Receiving a request that was already handled four steps earlier can damage the application state if it hasn't been designed for such a situation.

There are various methods for preventing such situations. The simplest double-click protection is disabling the submit button with JavaScript. This works perfectly against double clicking if JavaScript is enabled in the browser, but users don't always enable it, and it cannot protect the application from the consequences of page refreshes or the back button. If HTTP headers are set appropriately, the user gets a warning before re-sending the page, but such warnings are mostly ignored by users.

The only guaranteed method is server-side protection: double requests must be identified and handled accordingly. In a typical approach, a unique id is generated for each request and passed through a hidden field in the form. The last request id is stored in the user session and compared with the new one. If the new request id differs from the stored one, the stored id is updated and the request is processed; if the ids are the same, the user is redirected to an error page. The simple version of this mechanism doesn't recognize older requests and therefore doesn't protect against back-button problems, and it requires adding the unique id to each form and link. Displaying an error is unfortunately not optimal, since the user gets no status report about the previous action. Ignoring the repeated request isn't optimal either: the browser waits for a response until the time-out threshold is exceeded. The best solution is to implement special logic that makes the response meaningful, but this typically requires a lot of work.
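The traditional token check described above can be sketched as a small session-scoped helper. Class and method names here are hypothetical illustrations, not part of any library:

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical sketch of the classic token check: the last accepted request
// id is kept per session and compared against the id arriving with the new
// request. A repeated id means a double request.
public class RequestTokenCheck {

    // session id -> last accepted request id
    private final Map<String, String> lastIds = new ConcurrentHashMap<>();

    // Returns true if the request should be processed, false for a repeat.
    public boolean accept(String sessionId, String requestId) {
        // store the new id and compare it with the previously stored one
        String previous = lastIds.put(sessionId, requestId);
        return !requestId.equals(previous);
    }
}
```

Note that this naive version accepts an old id again as soon as a newer one has been stored, which is exactly why it offers no back-button protection.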
Universal solution

The general solution is realized as a servlet filter. Requests are identified either in the traditional manner, with a unique request id received from hidden fields or a link parameter, or fully automatically, without any preparation at the front end. This identification can cover only the last request (double-click and page-refresh protection) or the whole interaction during the user session (back-button protection as well). In every situation the user receives the last normal response; there is no redirection to an error page. The server application gets only one request, and every action is executed only once.

Response Cache is a cross-cutting solution, and applying it to an already implemented application requires almost no additional work. In the simplest case, all you need to do is add several classes to the project and declare one filter in the web.xml deployment descriptor.
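The web.xml declaration could look roughly like this; the filter class and mapping names are illustrative assumptions, to be replaced with the actual ones:

```xml
<filter>
    <filter-name>responseCacheFilter</filter-name>
    <!-- class name is illustrative; use the library's actual filter class -->
    <filter-class>responsecache.ResponseCacheFilter</filter-class>
</filter>
<filter-mapping>
    <filter-name>responseCacheFilter</filter-name>
    <!-- apply to every request; a narrower pattern also works -->
    <url-pattern>/*</url-pattern>
</filter-mapping>
```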

How it works


The first task of such a protection mechanism is reliable double request identification. This can be done by generating a unique id for each response and inserting it into each form as a hidden field and into each link as an additional, special parameter. The id is sent back with the following request and can be compared with the ids of previous requests. Id generation isn't provided by the filter and must be done manually in the application.
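A minimal sketch of such manual id generation might look as follows; the helper class and the parameter name `requestId` are assumptions for illustration:

```java
import java.util.UUID;

// Hypothetical helper: generates a per-response request id and renders it
// as a hidden form field, so the id travels back with the next request.
public class RequestIdField {

    public static final String PARAM_NAME = "requestId"; // assumed name

    // Generate a unique id for the response being rendered.
    public static String newId() {
        return UUID.randomUUID().toString();
    }

    // Render the hidden input carrying the id back with the form submit.
    public static String hiddenField(String id) {
        return "<input type=\"hidden\" name=\"" + PARAM_NAME
                + "\" value=\"" + id + "\"/>";
    }
}
```

For links, the same id would be appended as a query parameter instead of a hidden field.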

Another possibility is automatic detection: a hash value is calculated from the entire request and used in place of a request id. Any change in the request parameters (e.g. correcting a value in an input field) changes the hash value and thus produces a different id, but after a double click or page refresh the requests are identical and the generated ids are equal. The only limitation of the current solution concerns file fields: changes to them are not detected automatically, so explicit ids should be generated for pages containing file fields.
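The automatic detection could be sketched as a digest over the request URI and parameters; this is an illustrative stand-in, not the library's actual algorithm:

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.util.Map;
import java.util.TreeMap;

// Sketch of automatic request identification: a SHA-256 digest over the
// request URI and all parameters stands in for an explicit request id.
// Identical repeated requests (double click, refresh) yield the same
// digest; any changed parameter yields a different one.
public class RequestDigest {

    public static String of(String uri, Map<String, String[]> params) {
        try {
            MessageDigest md = MessageDigest.getInstance("SHA-256");
            md.update(uri.getBytes(StandardCharsets.UTF_8));
            // sort parameter names so the digest is order-independent
            for (Map.Entry<String, String[]> e : new TreeMap<>(params).entrySet()) {
                md.update(e.getKey().getBytes(StandardCharsets.UTF_8));
                for (String value : e.getValue()) {
                    md.update((byte) 0); // separator avoids ambiguous concatenation
                    md.update(value.getBytes(StandardCharsets.UTF_8));
                }
            }
            StringBuilder hex = new StringBuilder();
            for (byte b : md.digest()) {
                hex.append(String.format("%02x", b));
            }
            return hex.toString();
        } catch (java.security.NoSuchAlgorithmException e) {
            throw new IllegalStateException(e); // SHA-256 is always available
        }
    }
}
```

In a filter, the parameter map would come from `ServletRequest.getParameterMap()`; multipart file content never appears there, which is one way to see why file fields escape this kind of detection.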

Filtering can work in two modes: blocking a repeated last request (protection against double clicking and refreshing the last page) or blocking all already executed requests (protection against the back button and refreshing any previous page). In the first case only the id of the last correct request is stored in the filter; in the second case the ids of all requests are held. In both cases the ids are stored in a hash table indexed by the user session id, so concurrent users interact independently.
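The difference between the two modes can be sketched with a small session-keyed store; names and structure here are illustrative assumptions:

```java
import java.util.Map;
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;

// Sketch of the two filtering modes, keyed by session id so concurrent
// users do not interfere. LAST_REQUEST blocks only an immediate repeat;
// ALL_REQUESTS also blocks ids accepted earlier in the session.
public class SeenRequests {

    public enum Mode { LAST_REQUEST, ALL_REQUESTS }

    private final Mode mode;
    private final Map<String, String> lastId = new ConcurrentHashMap<>();
    private final Map<String, Set<String>> allIds = new ConcurrentHashMap<>();

    public SeenRequests(Mode mode) {
        this.mode = mode;
    }

    // Returns true if the id is new for this session, recording it as seen.
    public boolean record(String sessionId, String requestId) {
        if (mode == Mode.LAST_REQUEST) {
            // only the last id is kept; older ids become valid again
            return !requestId.equals(lastId.put(sessionId, requestId));
        }
        // keep every id ever accepted in this session
        Set<String> ids = allIds.computeIfAbsent(sessionId,
                k -> ConcurrentHashMap.newKeySet());
        return ids.add(requestId);
    }
}
```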

The main functionality concerns the reaction to a detected double request. To make the solution safe and universal, each proper response (or only the last one, in last-request mode) is cached in the filter together with its request id. A subsequent identical request is not passed to the application's business logic; instead, the response is sent with content retrieved from the cache. For requests arriving at the server nearly simultaneously (a double click), the subsequent request has to wait until the cache is populated.
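The "wait until the cache is populated" behavior amounts to a latch per cache entry. A minimal sketch, assuming hypothetical class and method names:

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.TimeUnit;

// Sketch of the near-simultaneous double-click case: the first request
// processes and fills the cache entry; an identical request arriving in
// the meantime blocks until the content is available, then serves it.
public class CachedResponse {

    private final CountDownLatch ready = new CountDownLatch(1);
    private volatile String content;

    // Called by the thread that processed the original request.
    public void write(String responseBody) {
        this.content = responseBody;
        ready.countDown(); // release any duplicate requests waiting below
    }

    // Called by threads serving duplicate requests; waits for the original.
    public String read(long timeout, TimeUnit unit) throws InterruptedException {
        if (!ready.await(timeout, unit)) {
            throw new IllegalStateException("original response not ready in time");
        }
        return content;
    }
}
```

A timeout on the wait keeps a failed original request from blocking its duplicates forever.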

The user receives the original content, exactly the same as for a single request. After the back button is used and an old page is requested again, the user also receives the last response - the back button simply doesn't work. All of this is completely transparent for the actual application; it isn't affected in any way and always receives only single requests.

Code example

public void doFilter(ServletRequest servletRequest,
        ServletResponse servletResponse, FilterChain chain)
        throws IOException, ServletException
{
    HttpServletRequest request = (HttpServletRequest) servletRequest;
    HttpServletResponse response = (HttpServletResponse) servletResponse;
    PrintWriter out = response.getWriter();

    // transparent wrapper capturing the response body
    CharResponseWrapper wrapper = new CharResponseWrapper(response);
    ResponseCache cache =
            ResponseCache.getInstance(request.getSession().getId());

    if (cache.isRequestValid(request))
    {
        // request hasn't been handled yet - pass it down the chain
        // (possibly a time-consuming operation)
        chain.doFilter(request, wrapper);

        // response buffer is ready - read it through the wrapper
        // and write the output to the response cache
        cache.write(wrapper.toString());

        // write the wrapper output to the real response buffer
        out.write(wrapper.toString());
    }
    else // double request detected
    {
        // do not call chain.doFilter - the request won't be processed;
        // write the previously cached content to the real response
        out.write(cache.read());
    }
    out.close();
}


When to use it

The Response Cache should be used especially in situations where the response data is meaningful to the user.

For example, when the application validates data before executing business logic or before saving it in the database, the response depends on whether the input data passed validation. When the data is invalid, the application displays an error message and asks the user to correct it; otherwise it executes e.g. an electronic payment process and displays a confirmation message with the payment status. Here the status message is also very important, because the interaction with the external payment provider cannot simply be repeated.

If the mechanism that detects identical subsequent requests instead redirected the user to a page with an error message about sending the same request twice, the user would learn nothing about the real status of the payment operation.

The Response Cache handles this problem in a universal and transparent way: it takes care of displaying the first meaningful response to the user, just as if no double click had occurred.
Limitations

As mentioned previously, automatic identification of double requests doesn't work for forms with file fields. This will probably be improved in a future library version. For now it can be worked around with an explicitly generated request id for that particular form.