As you probably know, HTTP supports several methods that define the nature of the current request. The two most important ones are GET and POST. GET is the primary method to get content (so called entities) from the server such as HTML pages, images, CSS style sheets etc. The POST method on the other hand is meant to transport entities to the server, for example login credentials or a blog comment. On the server side a POST request often results in an update of certain data (databases, session state).
Both GET and POST can return an entity as a response. For GET this is obvious - it's what the method exists for in the first place. For POST it might sound reasonable in the first place as well, but it brings a pile of problems.
A simple scenario
Imagine you fill in a sign-up form of some web based e-mail service and POST it to the server using a submit button. The server processes the new account and updates its database. Maybe it even logs you in directly. In response of the POST request the server directly shows you a view of your inbox. Here's a diagram of what happens between browser and server:
- The browser POSTs form data to an URL called signup.aspx
- The server processes the request
- The server responds with a status code of 200 (OK) and sends back a view of the new users inbox rendered as HTML
You leave the computer to have a coffee and when you come back 5 minutes later you refresh the page (using CTRL+R or F5 or whatever shortcut your browser uses) to see whether you already have new messages. You are a bit puzzled why this (or a similar) message box appears:
You click on OK and are even more confused as the page that appears says "This user name is already taken" instead of showing your inbox .
What has happened? Remember that the page you saw was the response of a POST request (submitting the sign up form). When you refreshed the page and confirmed to "resend the data" you actually repeated the POST request with the same form data. The server processed the "new" account and found that the user name is already in use (by yourself!), therefore it showed an error. "But wait", you say, "I just wanted the server to refresh the view of my inbox, what have I done wrong? " The answer is: nothing! The problem is that the application abused the POST response to transport an entity back to the client that should have been accessed with a GET request in the first place.
POST related issues
Here are some of the problems that occur if you abuse POST requests to return entities:
1. Refreshing the page results in a re-transmission of the POST data
This is what I described above. Hitting "refresh page" for a reponse based on a POST request will re-issue the POST request. Instead of refreshing what you see this will repeat what you did to reach the current page. This is not "refresh page" anymore, it becomes "repeat last action" - which is most likely not what the user wants. If you see a summary page after you have submitted an order in an online store, you don't want F5 to drop another order, do you?
2. POST responses are hard to bookmark
Bookmarks (or favourites etc.) normally only remember the URL of the bookmarked page (along with some user supplied meta data). Because a POST request transports data in the request body instead as query parameters in the URL like GET does, bookmarking the result of a POST will not work in most cases.
3. POST responses pollute the browser history
If the browser keeps the result of a POST request in it's history, going back to that history entry will normally result in POST data to be retransmitted. This again causes the same issues as mentioned in point 1.
"But I need POSTs to send forms to the server - how can I avoid the problems mentioned above?" you might say. Here's where the POST-Redirect-GET (PRG hereafter) pattern enters the stage.
Instead of sending entity content with the POST response after we processed the request, we return the URL of a new location which the browser should visit afterwards. Normally this new location shows the result of the POST or an updated view of some domain model.
This can be achieved by not returning a result code of 200 (success) but instead returning a status code that indicates a new location for the result, for example 303 ("See other") or 302("Found"/"Moved temporarily"), the latter of which is used most often nowadays. Together with the 30x result code a Location header is sent which contains the URL of the page to which the request is redirected. Only the headers are sent, no body is included.
If the browser sees the 30x status code, it will look for the Location header and issue a GET request to the URL mentioned there. Finally the user will see the body of that GET request in the browser.
The browser-server communication would look like this:
- The browser POSTs to signup.aspx
- The server updates some state etc.
- The response is 302 redirect with a Location header value of inbox.aspx
- The browser realizes that the response is redirected and issues a GET to inbox.aspx
- The server returns 200 together with the content of the resource.
What do we gain?
- The page can be safely refreshed. Refreshing will cause another GET to inbox.aspx which won't cause any updates on the server
- The result page can be easily bookmarked. Because the current resource is defined by the URL a bookmark to this URL is likely to be valid.
- The browser history stays clean. Responses that have a redirect status code (such as 302) will not be put into the browser cache by most browsers. Only the location to which the response is redirecting is. Therefore signup.aspx won't be added to the history and we can safely go back and forth through the history without having to resubmit any POST data
The drawbacks of POST-Redirect-GET
While it should be clear by now that the POST-Redirect-GET pattern is the way to go in most situations, I'd like to point at the few drawbacks that come along with this pattern.
First of all, redirection from one request to another causes an extra roundtrip to the server (one for the POST request, one for the GET request it redirects to). In this context the roundtrip should be understood as all processing and transmission time that is required and fixed per request, ie. transmission delay, creation and invation of the HTTP handler, opening and closing database connections/transactions, filling ORM caches etc.
If both requests can be handled very quickly by the server this will essentially double the response time. If your roundtrip time is 200ms, using PRG will cause a minimum delay of 400ms between submitting the form and the result page being visible. This issues has to be put in perspective with reality, however. The server will need some time processing both requests, so the percentage of time needed for the roundtrips decreases with the amount of time server processing time takes. The response from the POST itself can be extremely small (few hundred bytes), because only the headers need to be transmitted.
In practice I haven't noticed a real performance problem with PRG. A slow app will stay slow, a fast one won't truly suffer from the extra roundtrip. And besides, if you replace POSTs by GETs where appropriate the effect of PRG will be even less noticeable.
The problem with ASP.NET WebForms
Now that you know about POST-Redirect-GET you are of course eager to use it (at least I hope I could convince you). But as an ASP.NET WebForms developer you will soon run into problems: ASP.NET WebForms is fundamentally based on POSTs to the server. In essence, all ASP.NET web pages are wrapped in one huge <form> element with "method" set to "POST". Whenever you click a button, you essentially POST all form fields to the server. Of course you can redirect from a Button.Click handler. If you do so, you're applying PRG. At the same time you're working quite against the WebForms philosophy, especially the ViewState (which will get lost as soon as you redirect), which will force you to rethink a lot of your application logic. And if you don't rely on all this postback behaviour inherent to ASP.NET WebForms you might as well ask why you're using WebForms in the first place.
This makes clear why a lot of developers (including me) think that WebForms are inherently "broken" (viewstate, ubiquitous postbacks and the hard-to-mock HttpContext are just a few reasons). If you share these concerns but like .NET just as I do, you might want to look at alternate .NET based web frameworks such as Castle MonoRail or ASP.NET MVC.
PRG and AJAX
In situations where you use AJAX the whole PRG issue becomes a new story. AJAX responses don't appear in the history, you wouldn't want to bookmark them and refreshing a web page does not re-issue any AJAX requests (except those fired on page load). Therefore I have no problem with returning entitiest (HTML fragments, JSON, XML) from AJAX POSTs - PRG is not of much use here.
To conclude this article here's a list of some basic rules that have been useful to me:
- Use POST-Redirect-GET whenever you can, that is: whenever you process a POST request on the server, send a redirect to a GETtable resource as response. It's applicable in almost all cases and will make your site much more usable for the visitor
- Don't POST what you can GET. If you only want to retrieve a parameterised resource it might be completely suitable to use a GET request with query string parameters. Google is a good example. The start page contains a simple form with a single text field to enter the search terms. Submitting the form causes a GET to /search with the search terms passed as the query string parameter q. This can be easily done by providing method="GET" on the <form> element (or just leave out the method attribute, as GET is the default).
- POST requests from AJAX are allowed to return entities directly as they don't suffer from the problems like "full" POSTs.