The HTTP "Range" Header

Published: 2013-04-03
Last Updated: 2013-04-03 17:58:20 UTC
by Johannes Ullrich (Version: 1)

One of the topics we cover in our "Defending Web Applications" class is how to secure static files. For example, you are faced with multiple PDFs with confidential information, and you need to integrate authorization to read these PDFs into your web application. The standard solution involves two steps:

- Move the file out of the document root
- Create a script that performs the necessary authorization and then streams the file back to the user

Typically, the process of streaming the file back to the user is pretty simple. Most languages offer the ability to read the file and then echo it back to the browser. Some languages, PHP for example, have a dedicated function for this (readfile). This makes writing these access control scripts pretty easy, until you are faced with a new twist: the "Range" header.
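Here is a minimal sketch of such an access control script, written in Python rather than PHP so it can be run on its own. The names serve_protected and is_authorized, the PDF content type, and the (status, headers, body) return value are all assumptions made for illustration; adapt them to whatever your framework expects.

```python
import os
from typing import Callable, Dict, Iterator, Tuple

CHUNK_SIZE = 64 * 1024  # read size per chunk; an arbitrary choice


def stream_file(path: str, chunk_size: int = CHUNK_SIZE) -> Iterator[bytes]:
    """Yield the file in chunks so it is never held in memory as a whole."""
    with open(path, "rb") as fh:
        while True:
            chunk = fh.read(chunk_size)
            if not chunk:
                break
            yield chunk


def serve_protected(
    user: str,
    name: str,
    base_dir: str,
    is_authorized: Callable[[str, str], bool],
) -> Tuple[int, Dict[str, str], Iterator[bytes]]:
    """Authorize first, then stream a file stored outside the document root."""
    # is_authorized is a hypothetical callback supplied by the application.
    if not is_authorized(user, name):
        return 403, {}, iter(())
    # basename() drops any directory part, so "../x" cannot escape base_dir.
    path = os.path.join(base_dir, os.path.basename(name))
    if not os.path.isfile(path):
        return 404, {}, iter(())
    headers = {
        "Content-Type": "application/pdf",
        "Content-Length": str(os.path.getsize(path)),
    }
    return 200, headers, stream_file(path)
```

Note that this version always sends the whole file with a 200 response and knows nothing about partial requests.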

The "Range" header is meant to be used to support partial downloads. A client may request just part of a file, instead of asking for the entire file.

RFC 2616 is a bit ambiguous when it comes to "Range" headers. First of all, it introduces the "Accept-Ranges" header, which can be used by the server to signal that it supports the "Range" header. Next, it states that the client may send a request using a "Range" header anyway, even if the server doesn't advertise support for it. The server also has the option to send "Accept-Ranges: none" to explicitly state that it does not support this type of header.

So what's the problem? It turns out that different HTTP clients deal with "Range" headers slightly differently. In particular, the iOS Podcast client requires support for the Range header, and will only download part of the file if the server does not support it. Apple recently advised iTunes publishers of this issue and requires content to be hosted on servers that support the Range header.

For a server, this would usually not be a problem, were it not for a recent Apache DoS attack that caused some administrators to block Range requests. Also, our "file streaming" script now needs to support Range requests.

Here is a quick outline of how to support "Range" requests properly:

  1. Figure out if the Range header is used and extract the requested range. The Range header should look like:
    Range: bytes=1234-5678
    but could look like:
    Range: bytes=0-
    If the upper end is missing, it is assumed to mean "until the end of the file".
  2. Load the file (if possible, only the part that needs to be sent).
  3. Send the file, but use a "206 Partial Content" response code. Also, add the "Content-Range" header to indicate what you are sending:
    Content-Range: bytes 1234-5678/1234567    (start-end/total size)
    One interesting twist: the "size" is a count of bytes, while the range is given as zero-based offsets. So the largest valid offset in a "Range" is size-1.
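The three steps above can be sketched in Python roughly as follows. The function names parse_range and range_response are made up for this example. Handling of the suffix form ("bytes=-500", meaning the last 500 bytes) comes from the RFC rather than from the outline above. For simplicity, anything the parser cannot use (malformed, multi-range, or out of bounds) falls back to sending the full file; a more complete script would answer an out-of-bounds range with a 416 status instead.

```python
import os
import re
from typing import Dict, Optional, Tuple

_RANGE_RE = re.compile(r"^bytes=(\d*)-(\d*)$")


def parse_range(header: str, size: int) -> Optional[Tuple[int, int]]:
    """Step 1: return inclusive (start, end) offsets, or None if unusable."""
    m = _RANGE_RE.match(header.strip())
    if not m or size == 0:
        return None  # malformed, multi-range, or nothing to send
    first, last = m.groups()
    if not first and not last:
        return None
    if not first:
        # Suffix form: "bytes=-500" asks for the last 500 bytes.
        length = int(last)
        if length == 0:
            return None
        return max(size - length, 0), size - 1
    start = int(first)
    # A missing upper end means "until the end of the file".
    end = int(last) if last else size - 1
    end = min(end, size - 1)  # offsets are zero-based, so the max is size-1
    if start > end:
        return None
    return start, end


def range_response(path: str, range_header: Optional[str]) -> Tuple[int, Dict[str, str], bytes]:
    """Steps 2 and 3: read only the requested part and build the response."""
    size = os.path.getsize(path)
    rng = parse_range(range_header, size) if range_header else None
    if rng is None:
        with open(path, "rb") as fh:
            body = fh.read()
        return 200, {"Accept-Ranges": "bytes", "Content-Length": str(size)}, body
    start, end = rng
    with open(path, "rb") as fh:
        fh.seek(start)
        body = fh.read(end - start + 1)
    headers = {
        "Accept-Ranges": "bytes",
        "Content-Range": f"bytes {start}-{end}/{size}",
        "Content-Length": str(len(body)),
    }
    return 206, headers, body
```

For a 100 byte file, a request with "Range: bytes=10-19" yields a 206 response carrying "Content-Range: bytes 10-19/100" and a 10 byte body.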

Aside from the annoyance of having to write a more complex script, why does this matter for security?

Think of intrusion detection systems, and maybe even web application firewalls: for example, it is now possible for an attacker to request your secret document one byte at a time, possibly defeating data leakage protection. Or an attacker streaming an exploit from a web server could do so in small chunks, again to defeat content filtering on the client side. I played with various overlapping ranges and such, and it looks like browsers will discard these requests as they should.

It is also possible to specify multiple ranges in one request (which is what the Apache DoS was about), but so far I haven't observed any requests like this.
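If you want to spot such multi-range requests, a rough Python sketch like the one below could be used to inspect the header. The function names and the max_ranges threshold of 5 are assumptions chosen for illustration, not values taken from any standard or product.

```python
import re
from typing import List, Optional, Tuple

_SPEC_RE = re.compile(r"^(\d*)-(\d*)$")

RangeSpec = Tuple[Optional[int], Optional[int]]


def list_ranges(header: str) -> List[RangeSpec]:
    """Split a Range header value into (first, last) pairs; [] if malformed."""
    unit, sep, specs = header.strip().partition("=")
    if not sep or unit.strip().lower() != "bytes":
        return []
    ranges: List[RangeSpec] = []
    for spec in specs.split(","):
        m = _SPEC_RE.match(spec.strip())
        if not m or m.group(0) == "-":
            return []
        first, last = m.groups()
        ranges.append((int(first) if first else None, int(last) if last else None))
    return ranges


def has_overlap(ranges: List[RangeSpec]) -> bool:
    """True if two ranges with an explicit start share at least one byte."""
    # A missing end is treated as "to the end of the file" (infinity here);
    # suffix ranges without a start are ignored by this simple check.
    spans = sorted(
        (s, float("inf") if e is None else e) for s, e in ranges if s is not None
    )
    return any(nxt[0] <= cur[1] for cur, nxt in zip(spans, spans[1:]))


def is_suspicious(header: str, max_ranges: int = 5) -> bool:
    """Flag headers with many ranges or with overlapping ranges."""
    ranges = list_ranges(header)
    return len(ranges) > max_ranges or has_overlap(ranges)
```

A plain "bytes=0-99" is not flagged, while a header such as "bytes=0-,5-0,5-1,5-2" is, because "0-" overlaps every other range in it.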

In short: watch it but don't block it. It may make sense to log and pay attention to Range requests, but you shouldn't blindly block all of them as they may be required by the browser/http client.
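One way to "watch but not block" is to log every Range request and raise a warning when a single client asks for many tiny ranges. The Python sketch below assumes a hypothetical RangeWatch class. The thresholds (16 bytes counts as "small", alert after 100 such requests) are arbitrary placeholders that you would need to tune for your own traffic.

```python
import logging
from collections import defaultdict
from typing import DefaultDict

log = logging.getLogger("range_watch")


class RangeWatch:
    """Count small Range requests per client and flag heavy users."""

    def __init__(self, small_bytes: int = 16, alert_after: int = 100) -> None:
        self.small_bytes = small_bytes  # ranges up to this length are "small"
        self.alert_after = alert_after  # small requests before we warn
        self.small_counts: DefaultDict[str, int] = defaultdict(int)

    def record(self, client: str, start: int, end: int) -> bool:
        """Log one served range; return True if the client looks suspicious."""
        length = end - start + 1  # offsets are inclusive
        log.info("range request client=%s bytes=%d-%d", client, start, end)
        if length <= self.small_bytes:
            self.small_counts[client] += 1
        suspicious = self.small_counts[client] >= self.alert_after
        if suspicious:
            # Only warn; the request itself is still served.
            log.warning(
                "client=%s made %d small range requests",
                client,
                self.small_counts[client],
            )
        return suspicious
```

The tracker never rejects anything itself. It only returns a flag and writes log entries, leaving the decision about what to do with a noisy client to you.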

RFC 2616:

Johannes B. Ullrich, Ph.D.
SANS Technology Institute

4 comment(s)


Form-based authentication (the recipe described in paragraph 1) is certainly *a* way to restrict access to web content. However, another common way is to use HTTP Basic Auth (preferably over https!) or HTTP Digest Auth, where the browser throws up its own login box. This way, the server handles everything, so there is no need to faff around with scripts that handle Range: headers etc etc.

The above method is what's defined by RFC (RFC2617) so arguably more standard. However, forms-based auth is perhaps more common (not sure why - maybe it's prettier?).

The comments about Range: headers being used to evade IDS etc are all valid.

Basic/Digest authentication are valid alternatives, but they do have their own problems. For example, there is typically no easy way to terminate stale sessions if you just use the built-in Apache functionality. Also, form-based authentication will usually integrate more easily with existing sites and authentication schemes.

Apache 2.4 does have the ability to expire sessions: check out mod_session, mod_session_cookie and mod_session_crypto. If you want pretty forms matching your site, there is mod_auth_form.

Some versions of Adobe Reader seen in the wild malfunction in such a way that they hammer a server with hundreds of these range requests while viewing a document, to the point that this can cause a DoS on a busy Apache system.

On affected systems I've been forced to use mod_headers to unset the Range header in the request; I believe that safely falls back to serving the file in full and is hopefully in accordance with spec.
