Tuesday, January 9, 2007

Why PUT and DELETE?

In "Why REST Failed", Elliotte Rusty Harold described the difference between the four HTTP verbs GET, POST, PUT, and DELETE, and praised their virtues:

The beauty of REST is that those four verbs are all you need. Everything you need to do can be done with GET, PUT, POST, and DELETE. You don't need to invent a new verb for every operation you can imagine. This makes HTTP scalable, flexible, and extensible. Simplicity leads to power.

He then expressed the problem with the picture:

The problem is that GET, PUT, POST, and DELETE really are a minimal set. You truly do need all four, and we only have two; and I've never understood why. In 2006 browser vendors still don't support PUT and DELETE.

In this interview, Bill Venners asks Harold to explain what value PUT and DELETE really add that POST doesn't offer.

Idempotency of PUT and DELETE

Bill Venners: In your blog post entitled "Why REST Failed," you said that we need all four HTTP verbs (GET, POST, PUT, and DELETE) and lamented that browser vendors only support GET and POST. Why do we need all four verbs? Why aren't GET and POST enough?

Elliotte Rusty Harold: There are four basic methods in HTTP: GET, POST, PUT, and DELETE. GET is used most of the time. It is used for anything that's safe, that doesn't cause any side effects. GET is able to be bookmarked, cached, linked to, passed through a proxy server. It is a very powerful operation, a very useful operation.

POST, by contrast, is perhaps the most powerful operation. It can do anything. There are no limits as to what can happen, and as a result, you have to be very careful with it. You don't bookmark it. You don't cache it. You don't pre-fetch it. You don't do anything with a POST without asking the user, "Do you want to do this?" If the user presses the button, you can POST some content. But you're not going to look at all the buttons on a page and start randomly pressing them. By contrast, browsers might look at all the links on the page and pre-fetch them, or pre-fetch the ones they think are most likely to be followed next. And in fact some browsers and Firefox extensions and various other tools have tried to do that at one point or another.

PUT and DELETE are in the middle between GET and POST. The difference between PUT or DELETE and POST is that PUT and DELETE are idempotent, whereas POST is not. PUT and DELETE can be repeated if necessary. Let's say you're trying to upload a new page to a site. Say you want to create a new page at http://www.example.com/foo.html, so you type your content and you PUT it at that URL. The server creates that page at the URL you supply. Now, let's suppose for some reason your network connection goes down. You aren't sure: did the request get through or not? Maybe the network is slow. Maybe there was a proxy server problem. So it's perfectly OK to try it again, or again, as many times as you like, because PUT-ing the same document to the same URL ten times won't be any different than PUT-ing it once. The same is true for DELETE. You can DELETE something ten times, and that's the same as deleting it once.

By contrast, POST may cause something different to happen each time. Imagine you are checking out of an online store by pressing the buy button. If you send that POST request again, you could end up buying everything in your cart a second time. If you send it again, you've bought it a third time. That's why browsers have to be very careful about repeating POST operations without explicit user consent, because POST may cause two things to happen if you do it twice, three things if you do it three times. With PUT and DELETE, there's a big difference between zero requests and one, but there's no difference between one request and ten.
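
The difference Harold describes can be sketched with a toy in-memory server. This is only an illustration under stated assumptions: the ToyServer class, its method names, and the checkout behavior are invented here and are not part of HTTP or of any real server.

```python
# Toy in-memory "server"; the class and its method names are hypothetical.
class ToyServer:
    def __init__(self):
        self.pages = {}    # URL path -> document body
        self.orders = []   # one entry per accepted purchase

    def put(self, path, body):
        # PUT stores the body at the URL the client chose.
        self.pages[path] = body

    def delete(self, path):
        # DELETE removes the page; deleting a missing page changes nothing.
        self.pages.pop(path, None)

    def post_buy(self, cart):
        # A POST to an assumed checkout resource records a new order each time.
        self.orders.append(list(cart))


server = ToyServer()

for _ in range(10):
    server.put("/foo.html", "<p>hello</p>")
pages_after_ten_puts = dict(server.pages)

for _ in range(10):
    server.delete("/foo.html")
pages_after_ten_deletes = dict(server.pages)

for _ in range(2):
    server.post_buy(["book"])

print(pages_after_ten_puts)     # {'/foo.html': '<p>hello</p>'}
print(pages_after_ten_deletes)  # {}
print(len(server.orders))       # 2
```

Ten PUTs leave the same single page that one PUT would, and ten DELETEs leave the same empty state as one, while each repeated POST adds another order.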

Bill Venners: Let's say that I am collaborating on some resource, and I do a PUT, and I don't get a response back, even though it succeeded. You come along and do a PUT and update the document. My browser does a PUT again and it also succeeds. I've overwritten your change.

Elliotte Rusty Harold: That can happen. You can PUT and I can PUT without necessarily GET-ing. There's no locking in here. Perhaps the server could implement something of that nature, but that's not unique to the issue of PUT or a retry. You could upload something, and I could be working on my version of the same thing, and I could PUT five seconds after you and accidentally overwrite your changes, without ever having seen them. There's no requirement that I do a GET first, look at the resource, lock the state of the resource, and then re-PUT.
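
The overwrite scenario in this exchange can be replayed in a few lines. The dictionary and the put function below are hypothetical stand-ins for a server with no locking, where the last write simply wins.

```python
# Hypothetical in-memory store standing in for the server; last write wins.
pages = {}

def put(path, body):
    """Replace whatever is at `path`; no locking, no version check."""
    pages[path] = body

put("/doc.html", "Bill's version")      # Bill's PUT succeeds, but the response is lost
put("/doc.html", "Elliotte's version")  # Elliotte updates the document
put("/doc.html", "Bill's version")      # Bill's client retries the PUT it thinks failed

print(pages["/doc.html"])  # Bill's version
```

Each individual PUT is still idempotent; the lost update comes from the interleaving of two writers, which is the point Harold makes about the absence of locking.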

Bill Venners: And the same thing is true of DELETE. I could DELETE something, you could PUT something new at the same URL, then my browser could retry the DELETE, which would delete the content of your latest PUT.

Elliotte Rusty Harold: That is true. There's no locking built into this mechanism. The whole thing is stateless. Each operation acts on the resource at a given point in time. Of course a server could assign a particular owner to each resource and use HTTP authentication to ensure that neither of us could write or delete each other's files. Or it could use version control to make sure that all changes are reversible. (Most wikis do this.) However, that's all a server-side implementation decision, not something built into HTTP.
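
As a rough sketch of the version-control option mentioned above, a server might keep every revision instead of overwriting in place. Everything here (the history structure, the revert helper, the use of None to mark a deletion) is an assumption made for illustration, not a description of how any particular wiki works.

```python
# Hypothetical wiki-style store: every PUT or DELETE appends a revision
# instead of destroying the previous content.
history = {}  # path -> list of revisions; None marks a deletion

def put(path, body):
    history.setdefault(path, []).append(body)

def delete(path):
    history.setdefault(path, []).append(None)

def current(path):
    """Return the latest revision, or None if the page is deleted or unknown."""
    revisions = history.get(path, [])
    return revisions[-1] if revisions else None

def revert(path):
    """Undo the latest change by re-appending the revision before it."""
    revisions = history.get(path, [])
    if len(revisions) >= 2:
        revisions.append(revisions[-2])

put("/page", "old content")
delete("/page")              # first DELETE
put("/page", "new content")  # someone else PUTs new content at the same URL
delete("/page")              # the retried DELETE removes the newer content
deleted_state = current("/page")

revert("/page")              # the server-side history makes that reversible
restored_state = current("/page")

print(deleted_state)   # None
print(restored_state)  # new content
```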

URL selection with POST versus PUT

Bill Venners: Yes, but then I can't see how PUT and DELETE can actually be retried. If they can't be retried in practice, then I could just use POST. I can do putting and deleting with POST. I do it all the time.

Elliotte Rusty Harold: You can, but it doesn't work quite as well as if you do it with PUT and DELETE. There's one more difference between PUT and POST that's quite significant. If you're creating a new page, and you want it to go at a certain URL, you would PUT the page to http://www.example.com/foo.html. The server would then create the page at that URL, assuming it supports PUT at all. Of course all of this can be protected with usernames and passwords.

By contrast, if you POST to http://www.example.com/foo.html, there's nothing there to receive your POST. On the other hand you could POST to http://www.example.com/pageCreator.php, and that resource could take your document, choose a URL at which to place that document, and return in the response the URL of the document it had created in response to your POST. So generally speaking you use PUT to create a new document when the client wants to choose the URL. The client wants to say it goes here. You use POST to create a new document when you're posting to some existing URL. The server is going to take that content, choose the URL for that content, create a new page from that content, and then return that URL to the client. The client then knows where the server placed it. POST can do other things besides merely creating a new page, but if that's what it's doing, then that's another crucial difference between PUT and POST.
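
A minimal sketch of the two creation styles, assuming a toy server held in a dictionary. The handler names, the /pages/N.html naming scheme, and the returned status codes are illustrative choices rather than anything the interview specifies.

```python
import itertools

pages = {}                     # path -> document body
_next_id = itertools.count(1)  # source of server-chosen page numbers

def handle_put(path, body):
    """PUT: the client names the URL; the server stores the body there."""
    created = path not in pages
    pages[path] = body
    return (201 if created else 200), path

def handle_post_page_creator(body):
    """POST to /pageCreator.php: the server picks the URL and reports it back."""
    path = f"/pages/{next(_next_id)}.html"  # hypothetical naming scheme
    pages[path] = body
    return 201, path  # a real server would usually report this URL in its response

put_result = handle_put("/foo.html", "<p>client-chosen URL</p>")
first_post = handle_post_page_creator("<p>server-chosen URL</p>")
second_post = handle_post_page_creator("<p>server-chosen URL</p>")

print(put_result)   # (201, '/foo.html')
print(first_post)   # (201, '/pages/1.html')
print(second_post)  # (201, '/pages/2.html')
```

Note that repeating the POST creates a second page at a new URL, whereas repeating the PUT would only rewrite /foo.html.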

Bill Venners: I could also include in my POST a requested URL to PUT it at.

Elliotte Rusty Harold: Yes, and if you do that, you likely need to have some special encoding, probably application/x-www-form-urlencoded. Here's the content of the page, here's the URL. Here are the other various pieces. By contrast if you're PUT-ing, you can just say, here's the data of the page. Nothing else is required. The URL you're putting it to is part of the HTTP header. Any authentication credentials, username and password and so forth, are part of the HTTP header. So it's quite a bit simpler in terms of what you're uploading. If you want to PUT a JPEG, PUT a JPEG. If you want to PUT an XML document, PUT an XML document. It's all just raw bytes as far as the server is concerned. The server takes it and puts it where you said to PUT it, assuming your authentication checks out.
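
To make the contrast concrete, here is a sketch that builds both requests with Python's standard urllib without sending anything over the network. The form field names "url" and "content" are invented for this example; a real page-creation script would define its own.

```python
from urllib.parse import urlencode
from urllib.request import Request

page = b"<html><body>Hello</body></html>"

# PUT: the body is just the document; the target URL travels with the request itself.
put_request = Request(
    "http://www.example.com/foo.html",
    data=page,
    method="PUT",
    headers={"Content-Type": "text/html"},
)

# POST carrying a requested URL: document and target are packed into form fields.
# The field names "url" and "content" are assumptions made for this sketch.
form = urlencode({"url": "/foo.html", "content": page.decode("ascii")}).encode("ascii")
post_request = Request(
    "http://www.example.com/pageCreator.php",
    data=form,
    method="POST",
    headers={"Content-Type": "application/x-www-form-urlencoded"},
)

print(put_request.get_method(), put_request.data == page)    # PUT True
print(post_request.get_method(), post_request.data == page)  # POST False
```

The PUT body is byte-for-byte the page, while the POST body is an encoded envelope the server has to unpack before it can find the document.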

POST-ing to update a resource

Bill Venners: The way I think about resources is that in /articles, there's a collection of articles, so /articles is the collection resource. I can POST to that to add another one, or delete one. Once an article is there, such as at /articles/why_put_and_delete, I can POST to it to update it. That's why I couldn't find a very compelling need for PUT and DELETE.

Elliotte Rusty Harold: The key difference between PUT and POST for update is that POST merely adds to an existing resource while PUT replaces it entirely. If you're just adding a paragraph or comment to the article, and it's going down to the bottom of the page, then use POST. If you're not replacing the entire page, then yes, you're POST-ing new content to that article. By contrast, if you're replacing the entire article, all of it, by sending a completely new version of the article, then that's a PUT.
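
The append-versus-replace distinction drawn here might look like this in a toy model. The handler functions and the newline-joined storage format are assumptions made for this sketch only.

```python
# Hypothetical article store keyed by URL path.
articles = {"/articles/why_put_and_delete": "Original article text."}

def handle_post(path, addition):
    """POST as described above: add to the existing resource."""
    articles[path] = articles[path] + "\n" + addition

def handle_put(path, body):
    """PUT: replace the resource entirely."""
    articles[path] = body

path = "/articles/why_put_and_delete"

handle_post(path, "Comment: nice interview.")
handle_post(path, "Comment: nice interview.")  # a repeat adds the comment twice
after_posts = articles[path]

handle_put(path, "Completely new version.")
handle_put(path, "Completely new version.")    # a repeat changes nothing further
after_puts = articles[path]

print(after_posts.count("Comment"))  # 2
print(after_puts)                    # Completely new version.
```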

Bill Venners: But I could still do that with a POST. To me, POST means updating the thing I'm posting to. If you have four verbs, PUT, DELETE, GET, and POST, isn't that what POST kind of means?

Elliotte Rusty Harold: POST is the most general thing and I hesitate to say it means anything, because on some systems it means this. On others it means that. POST is your generic catch-all, with no real restrictions on what it can do. It is incredibly powerful, but the principle of least power would suggest that where we can get away with PUT or DELETE, we do so. If all you have to work with is POST and GET, you'll probably be OK so long as you're clear about the difference between those two. You can live without PUT and DELETE, but it's a somewhat nicer world if we in fact have them.