Weekly Research: WebDAV

I’ve been using WebDAV for many years now to synchronize multiple OmniFocus installations, but I never knew much about it apart from it being something like a virtual filesystem strapped onto HTTP. So this week I wanted to learn a little bit more about it.


The premise

Back when Tim Berners-Lee authored HTTP, the whole idea was to have a read-write web. The “write” part got quite complicated over the following years, and in 1996 WebDAV was started as part of a workshop for distributed authoring on the web.

WebDAV is an extension to HTTP that adds new methods, status codes and more in order to bring better management facilities for resources to the protocol. This includes features like listing resources within a collection, storing and retrieving complex properties of resources, moving resources from one collection to another and locking.

All of this was specified in two RFCs by the IETF: RFC2518 in 1999, which was replaced by RFC4918 in 2007.

While in HTTP a resource has only a limited set of metadata represented by its HTTP headers, in WebDAV these can be much more complex. And for everything that is a little bit more complex than a simple string or number the IETF back in the late 1990s and 2000s loved XML; so if you work with WebDAV you work with XML.

Data structures

First of all, to be able to move, lock or describe something we have to establish what that something might be. In WebDAV a resource can be a plain resource or a container for other resources, called a “collection”. To stay with the file-system analogy, think of a collection as a folder/directory. An item in a collection is called a “member” of that collection.

Each resource (be it a collection or something like an HTML page) can have multiple properties that are described using XML in all its glory (including schema definitions, DTDs etc. if you want). A property can either be “live” or “dead”. While a dead property is something that the client has to maintain, a live property is one that is enforced by the server (like a file’s size automatically attached to a response).

Operations

For these data structures, WebDAV provides a handful of new HTTP methods:

  • PROPFIND
  • PROPPATCH
  • MKCOL
  • COPY
  • MOVE
  • LOCK
  • UNLOCK

But since the semantics of some existing HTTP methods have also changed, I will describe a few common operations rather than go through the new methods one by one.

To give all this a try I’ve configured my local Apache installation to provide me with an /uploads/ directory that supports WebDAV. All the code samples were executed against that using curl.

Creating a resource/collection

For creating a collection, WebDAV has added a new command to HTTP: MKCOL

$ curl -i -X MKCOL http://localhost:8080/uploads/collection/
HTTP/1.1 201 Created
Date: Sat, 07 Sep 2013 14:31:24 GMT
Server: Apache/2.2.22 (Unix) DAV/2 mod_ssl/2.2.22 OpenSSL/0.9.8x
Location: http://localhost:8080/uploads/collection/
Content-Length: 194
Content-Type: text/html; charset=ISO-8859-1

...

If this is successful, 201 Created is returned. If the collection already existed there, you get a 405 Method Not Allowed.

Now that we have a collection to store stuff into, let’s create a test.txt file in it:

$ curl -i -X PUT http://localhost:8080/uploads/collection/test.txt
HTTP/1.1 201 Created
Date: Sat, 07 Sep 2013 14:27:10 GMT
Server: Apache/2.2.22 (Unix) DAV/2 mod_ssl/2.2.22 OpenSSL/0.9.8x
Location: http://localhost:8080/uploads/collection/test.txt
Content-Length: 200
Content-Type: text/html; charset=ISO-8859-1

...

Managing properties of a resource

The big feature of WebDAV is the whole property-management thing. So let’s see what mod_dav stores for our newly created test.txt (I’ve formatted the output a little bit for better readability):

$ curl -i --header "Depth: 0" -X PROPFIND http://localhost:8080/uploads/collection/test.txt
HTTP/1.1 207 Multi-Status
Date: Sat, 07 Sep 2013 14:35:38 GMT
Server: Apache/2.2.22 (Unix) DAV/2 mod_ssl/2.2.22 OpenSSL/0.9.8x
Content-Length: 902
Content-Type: text/xml; charset="utf-8"

<?xml version="1.0" encoding="utf-8"?>
<D:multistatus xmlns:D="DAV:">
    <D:response xmlns:lp1="DAV:" xmlns:lp2="http://apache.org/dav/props/">
        <D:href>/uploads/collection/test.txt</D:href>
        <D:propstat>
            <D:prop>
                <lp1:resourcetype/>
                <lp1:creationdate>2013-09-07T14:35:35Z</lp1:creationdate>
                <lp1:getcontentlength>0</lp1:getcontentlength>
                <lp1:getlastmodified>Sat, 07 Sep 2013 14:35:35 GMT</lp1:getlastmodified>
                <lp1:getetag>"1d9906e-0-4e5cc11689bc0"</lp1:getetag>
                <lp2:executable>F</lp2:executable>
                <D:supportedlock>
                    <D:lockentry>
                        <D:lockscope><D:exclusive/></D:lockscope>
                        <D:locktype><D:write/></D:locktype>
                    </D:lockentry>
                    <D:lockentry>
                        <D:lockscope><D:shared/></D:lockscope>
                        <D:locktype><D:write/></D:locktype>
                    </D:lockentry>
                </D:supportedlock>
                <D:lockdiscovery/>
                <D:getcontenttype>text/plain</D:getcontenttype>
            </D:prop>
            <D:status>HTTP/1.1 200 OK</D:status>
        </D:propstat>
    </D:response>
</D:multistatus>
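
A response like this is easy enough to pick apart with any namespace-aware XML parser. Here is a minimal Python sketch using only the standard library. The function name parse_multistatus and the trimmed-down SAMPLE document are my own, and it only collects properties from propstat blocks that report a 200 status:

```python
import xml.etree.ElementTree as ET

DAV_NS = "{DAV:}"  # ElementTree writes namespaced tags as "{namespace}local"


def parse_multistatus(xml_text):
    """Return {href: {property_tag: text}} for every propstat reporting 200."""
    result = {}
    root = ET.fromstring(xml_text)
    for response in root.findall(DAV_NS + "response"):
        href = response.findtext(DAV_NS + "href")
        props = result.setdefault(href, {})
        for propstat in response.findall(DAV_NS + "propstat"):
            status = propstat.findtext(DAV_NS + "status", default="")
            if " 200 " not in status:
                continue  # skip properties the server could not return
            prop = propstat.find(DAV_NS + "prop")
            if prop is None:
                continue
            for child in prop:
                props[child.tag] = (child.text or "").strip()
    return result


# Shortened version of the response above, for illustration only.
SAMPLE = """<?xml version="1.0" encoding="utf-8"?>
<D:multistatus xmlns:D="DAV:">
  <D:response xmlns:lp1="DAV:">
    <D:href>/uploads/collection/test.txt</D:href>
    <D:propstat>
      <D:prop>
        <lp1:getcontentlength>0</lp1:getcontentlength>
        <D:getcontenttype>text/plain</D:getcontenttype>
      </D:prop>
      <D:status>HTTP/1.1 200 OK</D:status>
    </D:propstat>
  </D:response>
</D:multistatus>"""

props = parse_multistatus(SAMPLE)["/uploads/collection/test.txt"]
print(props["{DAV:}getcontentlength"], props["{DAV:}getcontenttype"])
```

Note that the lp1 and D prefixes both point at the “DAV:” namespace, so a parser that compares prefixes instead of namespaces would get this wrong.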

As the example request indicates, you can also specify a “depth” for which properties should be returned. This is relevant for collections, where you can fetch properties not only for a single resource but also for the members of a collection. The Depth header accepts the values 0, 1 or “infinity”. If you omit it, the specs tell servers to treat the request as an infinite-depth one, and that is what Apache does. And if you do that on a collection, you get a nice message that the server doesn’t like it one bit:

$ curl -i --header "Depth: infinity" -X PROPFIND http://localhost:8080/uploads/collection/
HTTP/1.1 403 Forbidden
Date: Sat, 07 Sep 2013 18:09:48 GMT
Server: Apache/2.2.22 (Unix) DAV/2 mod_ssl/2.2.22 OpenSSL/0.9.8x
Content-Length: 235
Content-Type: text/html; charset=ISO-8859-1

<!DOCTYPE HTML PUBLIC "-//IETF//DTD HTML 2.0//EN">
<html><head>
<title>403 Forbidden</title>
</head><body>
<h1>Forbidden</h1>
<p>PROPFIND requests with a Depth of "infinity" are not allowed for /uploads/collection/.</p>
</body></html>

But just requesting info about the direct members of the collection (Depth: 1) works. Turns out that mod_dav has a setting for this, DavDepthInfinity, which is off by default and therefore blocks infinite-depth PROPFIND requests.

But back to the properties returned by our initial request. Nothing here is out of the ordinary compared to what a file-system would report about a file: an executable flag, last modified time, creation date, content type and size. There is also information about supported locking methods, which we will get into a little bit later.

The PROPFIND method also lets you request only a subset of the properties associated with a resource. Let’s say all we want is the file’s size. Then we would have to submit the following request body:

<?xml version="1.0" encoding="utf-8" ?>
<D:propfind xmlns:D="DAV:">
    <D:prop>
        <D:getcontentlength /> 
    </D:prop>
</D:propfind>

$ curl -i --header "Depth: 0" --header "Content-Type: text/xml" --data @propfind-size.xml -X PROPFIND http://localhost:8080/uploads/collection/test.txt
HTTP/1.1 207 Multi-Status
Date: Sat, 07 Sep 2013 18:33:23 GMT
Server: Apache/2.2.22 (Unix) DAV/2 mod_ssl/2.2.22 OpenSSL/0.9.8x
Content-Length: 365
Content-Type: text/xml; charset="utf-8"

<?xml version="1.0" encoding="utf-8"?>
<D:multistatus xmlns:D="DAV:" xmlns:ns0="DAV:">
    <D:response xmlns:lp1="DAV:" xmlns:lp2="http://apache.org/dav/props/">
        <D:href>/uploads/collection/test.txt</D:href>
        <D:propstat>
            <D:prop>
                <lp1:getcontentlength>0</lp1:getcontentlength>
            </D:prop>
            <D:status>HTTP/1.1 200 OK</D:status>
        </D:propstat>
    </D:response>
</D:multistatus>
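
If you would rather generate such a request body than keep XML files around, something along these lines should do. This is only a sketch in Python with the standard library; build_propfind is a name I made up, and it assumes all requested properties live in the “DAV:” namespace unless told otherwise:

```python
import xml.etree.ElementTree as ET


def build_propfind(*names, namespace="DAV:"):
    """Build a PROPFIND body that asks only for the named properties."""
    ET.register_namespace("D", "DAV:")  # keep the familiar D: prefix in the output
    root = ET.Element("{DAV:}propfind")
    prop = ET.SubElement(root, "{DAV:}prop")
    for name in names:
        ET.SubElement(prop, "{%s}%s" % (namespace, name))
    declaration = b'<?xml version="1.0" encoding="utf-8"?>\n'
    return declaration + ET.tostring(root, encoding="utf-8")


body = build_propfind("getcontentlength", "getlastmodified")
print(body.decode("utf-8"))
```

The resulting bytes can be written to a file and passed to curl with --data @file, just like propfind-size.xml above.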

Now let’s add a “dead” property to all these live ones using PROPPATCH:

<?xml version="1.0" encoding="utf-8" ?>
<D:propertyupdate xmlns:D="DAV:" xmlns:dc="http://purl.org/dc/elements/1.1/">
    <D:set>
        <D:prop>
            <dc:language>en</dc:language>
        </D:prop>
    </D:set>
</D:propertyupdate>

$ curl -i --header "Depth: 0" --header "Content-Type: text/xml" --data @proppatch-set.xml -X PROPPATCH http://localhost:8080/uploads/collection/test.txt
HTTP/1.1 207 Multi-Status
Date: Sat, 07 Sep 2013 18:42:36 GMT
Server: Apache/2.2.22 (Unix) DAV/2 mod_ssl/2.2.22 OpenSSL/0.9.8x
Content-Length: 322
Content-Type: text/xml; charset="utf-8"

<?xml version="1.0" encoding="utf-8"?>
<D:multistatus xmlns:D="DAV:" xmlns:ns1="http://purl.org/dc/elements/1.1/" xmlns:ns0="DAV:">
<D:response>
<D:href>/uploads/collection/test.txt</D:href>
<D:propstat>
<D:prop>
<ns1:language/>
</D:prop>
<D:status>HTTP/1.1 200 OK</D:status>
</D:propstat>
</D:response>
</D:multistatus>

The same structure can also be used to remove a property:

<?xml version="1.0" encoding="utf-8" ?>
<D:propertyupdate xmlns:D="DAV:" xmlns:dc="http://purl.org/dc/elements/1.1/">
    <D:remove>
        <D:prop>
            <dc:language />
        </D:prop>
    </D:remove>
</D:propertyupdate>
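
Since setting and removing share the same propertyupdate envelope, both can come out of one small helper. Again just a Python sketch with made-up names (build_proppatch, the DC constant); property names are given in ElementTree’s “{namespace}local” notation:

```python
import xml.etree.ElementTree as ET

DC = "http://purl.org/dc/elements/1.1/"  # Dublin Core namespace used above


def build_proppatch(set_props=None, remove_props=()):
    """Build a PROPPATCH body from properties to set and properties to remove."""
    ET.register_namespace("D", "DAV:")
    root = ET.Element("{DAV:}propertyupdate")
    if set_props:
        prop = ET.SubElement(ET.SubElement(root, "{DAV:}set"), "{DAV:}prop")
        for tag, value in set_props.items():
            ET.SubElement(prop, tag).text = value
    if remove_props:
        prop = ET.SubElement(ET.SubElement(root, "{DAV:}remove"), "{DAV:}prop")
        for tag in remove_props:
            ET.SubElement(prop, tag)
    return ET.tostring(root, encoding="utf-8")


set_body = build_proppatch(set_props={"{%s}language" % DC: "en"})
remove_body = build_proppatch(remove_props=["{%s}language" % DC])
print(set_body.decode("utf-8"))
print(remove_body.decode("utf-8"))
```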

Updating a resource

Now, if we want to change the content of our test.txt resource, we just execute a PUT request on it:

$ curl -i -X PUT --header "Content-Type: text/plain" --data "some content" http://localhost:8080/uploads/collection/test.txt
HTTP/1.1 204 No Content
Date: Sat, 07 Sep 2013 18:52:17 GMT
Server: Apache/2.2.22 (Unix) DAV/2 mod_ssl/2.2.22 OpenSSL/0.9.8x
Content-Length: 0
Content-Type: text/plain

Fetching a resource

To fetch the data associated with a resource, simply do a GET request on its URL:

$ curl -i http://localhost:8080/uploads/collection/test.txt
HTTP/1.1 200 OK
Date: Sat, 07 Sep 2013 18:52:22 GMT
Server: Apache/2.2.22 (Unix) DAV/2 mod_ssl/2.2.22 OpenSSL/0.9.8x
Last-Modified: Sat, 07 Sep 2013 18:52:17 GMT
ETag: "1d9906e-c-4e5cfa7707a40"
Accept-Ranges: bytes
Content-Length: 12
Content-Type: text/plain

some content

When it comes to collections, the GET method has no standardized behavior. Apache, by default, will just send you a 403 error. And even if you enable “Options Indexes” in your configuration, you only get an HTML page meant for humans, not something a client could reliably use as a directory listing. So how do you find the members of a collection?

Turns out this isn’t done with GET or some other collection-specific command but just with PROPFIND, which we already looked at before. Setting the request depth to 1 instead of 0 returns the collection itself plus its direct members.
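
As a rough Python sketch of that idea (standard library only): propfind sends the request and needs a running WebDAV server, while member_hrefs filters the collection’s own entry out of the response. Both function names and the SAMPLE response are hypothetical; I did not capture a real Depth: 1 response here.

```python
import http.client
import xml.etree.ElementTree as ET


def propfind(host, port, path, depth="1"):
    """Send a PROPFIND and return (status, body). Needs a running WebDAV server."""
    conn = http.client.HTTPConnection(host, port)
    try:
        conn.request("PROPFIND", path, headers={"Depth": depth})
        response = conn.getresponse()
        return response.status, response.read()
    finally:
        conn.close()


def member_hrefs(multistatus_xml, collection_href):
    """Return the hrefs of a Depth: 1 response, minus the collection itself."""
    root = ET.fromstring(multistatus_xml)
    hrefs = [r.findtext("{DAV:}href") for r in root.findall("{DAV:}response")]
    own = collection_href.rstrip("/")
    return [h for h in hrefs if h and h.rstrip("/") != own]


# Hypothetical, trimmed-down Depth: 1 response for /uploads/collection/
SAMPLE = """<D:multistatus xmlns:D="DAV:">
  <D:response><D:href>/uploads/collection/</D:href></D:response>
  <D:response><D:href>/uploads/collection/test.txt</D:href></D:response>
  <D:response><D:href>/uploads/collection/test2.txt</D:href></D:response>
</D:multistatus>"""

print(member_hrefs(SAMPLE, "/uploads/collection/"))
```

Against the setup from this post you would call propfind("localhost", 8080, "/uploads/collection/") and hand the returned body to member_hrefs.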

Deleting a resource

Deleting a resource is done using a simple DELETE request:

$ curl -i -X DELETE http://localhost:8080/uploads/collection/test.txt

If you do that to a collection, the collection as well as all its members are removed. If a member cannot be deleted (for instance because someone else has a lock on it), then none of its parents may be deleted either, because otherwise the namespace containing that member would end up being messed up.

Copying and moving

The COPY method copies a resource from its URL to another defined using the “Destination” header:

$ curl -i -X COPY \
--header "Destination: http://localhost:8080/uploads/collection/test2.txt" \
http://localhost:8080/uploads/collection/test.txt

The MOVE command combines a COPY and a DELETE into one atomic operation. If it succeeds you either get a 201 Created or a 204 No Content depending on the existence of a resource at the target location before the move.
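
To make the moving parts a bit more concrete, here is a small Python sketch that only assembles the request pieces and interprets the status code, without sending anything. The helper names are mine. The “Overwrite” header (T or F) and the 412 response for an existing target with “Overwrite: F” come from RFC4918; I have not tried them against my Apache setup.

```python
def build_transfer_request(method, source_path, destination_url, overwrite=True):
    """Return (method, path, headers) for a COPY or MOVE request."""
    if method not in ("COPY", "MOVE"):
        raise ValueError("method must be COPY or MOVE")
    headers = {
        "Destination": destination_url,
        "Overwrite": "T" if overwrite else "F",  # F = fail if the target exists
    }
    return method, source_path, headers


def describe_outcome(status):
    """Translate the status codes mentioned above into plain words."""
    outcomes = {
        201: "created a new resource at the destination",
        204: "replaced an existing resource at the destination",
        412: "destination exists and Overwrite was F",
    }
    return outcomes.get(status, "unexpected status %d" % status)


method, path, headers = build_transfer_request(
    "MOVE",
    "/uploads/collection/test.txt",
    "http://localhost:8080/uploads/collection/test2.txt",
    overwrite=False,
)
print(method, path, headers)
print(describe_outcome(201))
```

The three values can be passed straight to http.client’s HTTPConnection.request(method, path, headers=headers), which accepts arbitrary method names.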

Locking and unlocking

As mentioned above, WebDAV also supports locking resources so that, for instance, other clients cannot change them. The protocol so far only defines one lock type: write. So while you can request a lock scope of either shared or exclusive, the lock itself is always a write lock.

The point of a shared lock is that multiple principals can hold one on the same resource at the same time.
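
A LOCK request carries a lockinfo element describing the scope and type you are asking for. The following Python sketch builds such a body; it is a guess at what a file like lock.xml could contain for an exclusive write lock, not a copy of the one I used. The build_lockinfo name and the owner value are made up for illustration.

```python
import xml.etree.ElementTree as ET


def build_lockinfo(scope="exclusive", owner=None):
    """Build a LOCK request body for a write lock with the given scope."""
    if scope not in ("exclusive", "shared"):
        raise ValueError("scope must be 'exclusive' or 'shared'")
    ET.register_namespace("D", "DAV:")
    root = ET.Element("{DAV:}lockinfo")
    ET.SubElement(ET.SubElement(root, "{DAV:}lockscope"), "{DAV:}" + scope)
    # "write" is the only lock type the protocol defines so far.
    ET.SubElement(ET.SubElement(root, "{DAV:}locktype"), "{DAV:}write")
    if owner is not None:
        ET.SubElement(root, "{DAV:}owner").text = owner
    declaration = b'<?xml version="1.0" encoding="utf-8"?>\n'
    return declaration + ET.tostring(root, encoding="utf-8")


lock_body = build_lockinfo("exclusive", owner="someone@example.com")
print(lock_body.decode("utf-8"))
```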

$ curl -i -X LOCK --data @lock.xml http://localhost:8080/uploads/collection/test.txt
HTTP/1.1 200 OK
Date: Sat, 07 Sep 2013 19:52:19 GMT
Server: Apache/2.2.22 (Unix) DAV/2 mod_ssl/2.2.22 OpenSSL/0.9.8x
Lock-Token: <opaquelocktoken:e1471c1c-5e91-460a-aa9c-1fc7c2ab6540>
Content-Length: 378
Content-Type: text/xml; charset="utf-8"

<?xml version="1.0" encoding="utf-8"?>
<D:prop xmlns:D="DAV:">
<D:lockdiscovery>
<D:activelock>
<D:locktype><D:write/></D:locktype>
<D:lockscope><D:exclusive/></D:lockscope>
<D:depth>infinity</D:depth>
<D:timeout>Infinite</D:timeout>
<D:locktoken>
<D:href>opaquelocktoken:e1471c1c-5e91-460a-aa9c-1fc7c2ab6540</D:href>
</D:locktoken>
</D:activelock>
</D:lockdiscovery>
</D:prop>

To unlock the resource, take the Lock-Token returned as a response header and send it back up with the UNLOCK method:

$ curl -i -X UNLOCK --header "Lock-Token: <opaquelocktoken:e1471c1c-5e91-460a-aa9c-1fc7c2ab6540>" http://localhost:8080/uploads/collection/test.txt
HTTP/1.1 204 No Content
Date: Sat, 07 Sep 2013 19:52:38 GMT
Server: Apache/2.2.22 (Unix) DAV/2 mod_ssl/2.2.22 OpenSSL/0.9.8x
Content-Length: 0
Content-Type: text/plain

As we can see in the LOCK response above, a lock also has a depth and a timeout, both of which you can set on the LOCK request. Unlike PROPFIND, “Depth” only supports the values 0 and “infinity” here. 0 only locks the given URL while “infinity” locks the resource as well as all members, sub-members and so forth.

The timeout is requested using the “Timeout” header, either as a number of seconds (for example “Second-3600”) or as “Infinite”, as documented in the specs.
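
RFC4918 writes timeout values as “Second-” followed by a number, or as “Infinite” (which is also what the LOCK response above shows), and allows a comma-separated list of them. A minimal Python sketch for reading such a value could look like this; parse_timeout is my own name and it simply takes the first entry it understands:

```python
def parse_timeout(value):
    """Return the timeout in seconds, or None for an infinite timeout."""
    for candidate in value.split(","):
        candidate = candidate.strip()
        if candidate.lower() == "infinite":
            return None
        if candidate.lower().startswith("second-"):
            seconds = candidate[len("second-"):]
            if seconds.isdigit():
                return int(seconds)
    raise ValueError("no usable timeout in %r" % value)


print(parse_timeout("Second-3600"))
print(parse_timeout("Infinite"))
```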

Extensions

As the whole property system and the preparation for different lock types indicate, WebDAV is not complete but has open doors for extensions. Here are just a few of them:

I didn’t have the time to also look into these but they sound like a perfect match :-)

Compatibility

The WebDAV specs also define protocol compliance classes. Class 1 acts as a base layer including all the “MUST” requirements in the specification. Class 2 adds locking on top of that. Class 3 signals that the server implements the revisions RFC4918 made to RFC2518, and it can be advertised with or without locking. The idea here is that you can have an implementation that supports the current protocol to the letter except for locking. In this case it would be advertised as supporting classes 1 and 3.

On the other hand you could have an implementation that supports the basics as well as locking (1 and 2). Apache’s mod_dav claims to be of that kind.
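
A server advertises its classes in the “DAV” response header, which you can request with an OPTIONS call. Here is a small Python sketch for reading that header value; the function names are made up and the example values are illustrative rather than captured from my server:

```python
def parse_dav_header(value):
    """Split a DAV header value such as "1,2" into a set of tokens."""
    return {token.strip() for token in value.split(",") if token.strip()}


def supports_locking(classes):
    """Class 2 is the one that stands for locking support."""
    return "2" in classes


classes = parse_dav_header("1, 2")
print(sorted(classes), supports_locking(classes))
print(supports_locking(parse_dav_header("1,3")))
```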

Tools support

What makes WebDAV so interesting is that it’s so widely supported. I already mentioned mod_dav which, for instance, is used for sharing your SVN repositories (as part of another extension).

nginx also has a module for WebDAV, but it only implements a really small subset of the specification and does not even support properties. Because of that it does not even cover class 1 operations.

Regarding client support, I hardly know where to begin. Basically, wherever you have some file-system support, WebDAV is probably also there :-)