Testing for HTTP Splitting/Smuggling (OTG-INPVAL-016)

Brief Summary
This chapter illustrates attacks that leverage specific features of the HTTP protocol, either by exploiting weaknesses of the web application or by exploiting peculiarities in the way different agents interpret HTTP messages.

Description of the Issue
We analyze two different attacks that target specific HTTP headers: HTTP splitting and HTTP smuggling. The first exploits a lack of input sanitization that allows an attacker to insert CR and LF characters into the headers of the application response and so 'split' that answer into two different HTTP messages. The goal of the attack can vary from cache poisoning to cross-site scripting. The second exploits the fact that the same HTTP message can be parsed and interpreted in different ways depending on the agent that receives it. HTTP smuggling requires some knowledge of the different agents that handle the HTTP messages (web server, proxy, firewall) and is therefore covered only in the Gray Box testing section.

HTTP Splitting
Some web applications use part of the user input to generate the values of some headers in their responses. The classic example is a redirection in which the target URL depends on a user-submitted value. Suppose the user is asked to choose between a standard and an advanced interface. The choice is passed as a parameter that is used for a redirection to the corresponding page. If the parameter has the value 'advanced', for instance, the application will answer with the following:

    HTTP/1.1 302 Moved Temporarily
    Date: Sun, 03 Dec 2005 16:22:19 GMT
    Location: http://victim.com/main.jsp?interface=advanced

When receiving this message, the browser will ask for the page indicated in the Location header. However, if the application does not filter the user input, it is possible to insert the sequence %0d%0a, which represents the CRLF sequence used to separate lines, and use it to craft a response that will be interpreted as two different responses by the target, which can be, for instance, a web cache. An attacker can leverage this to poison the web cache so that it serves false content (controlled by the attacker) in response to all subsequent requests. Suppose that in the previous example an attacker passes the following data as the interface parameter:

    advanced%0d%0aContent-Length:%200%0d%0a%0d%0aHTTP/1.1%20200%20OK%0d%0aContent-Type:%20text/html%0d%0aContent-Length:%2035%0d%0a%0d%0a<html>Sorry,%20System%20Down</html>

The resulting answer from the vulnerable application will therefore be the following:

    HTTP/1.1 302 Moved Temporarily
    Date: Sun, 03 Dec 2005 16:22:19 GMT
    Location: http://victim.com/main.jsp?interface=advanced
    Content-Length: 0

    HTTP/1.1 200 OK
    Content-Type: text/html
    Content-Length: 35

    <html>Sorry,%20System%20Down</html>

The web cache will see two different responses. If the attacker then sends a second request asking for /index.html, the web cache will match this request with the second response, and all subsequent requests for http://victim.com/index.html passing through that web cache will receive the "system down" message. In this way, the attacker will have effectively defaced the site for every user of that cache.
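The splitting mechanism can be reproduced offline with a short Python sketch. It is only an illustration under stated assumptions: build_redirect is a hypothetical stand-in for the vulnerable application (it URL-decodes the whole parameter and copies it into the Location header unchanged), and split_messages is a deliberately naive model of a downstream agent that delimits messages using Content-Length. The body text and its computed length are chosen for this sketch and are not taken from a real target.

```python
from urllib.parse import quote, unquote


def build_redirect(interface: str) -> bytes:
    """Hypothetical vulnerable handler: the decoded parameter is placed
    in the Location header without filtering CR or LF."""
    value = unquote(interface)
    head = (
        "HTTP/1.1 302 Moved Temporarily\r\n"
        f"Location: http://victim.com/main.jsp?interface={value}\r\n"
        "\r\n"
    )
    return head.encode("latin-1")


def split_messages(raw: bytes) -> list:
    """Naive model of a cache: cut the byte stream into messages using
    the blank line after the headers and the Content-Length value."""
    messages = []
    while True:
        raw = raw.lstrip(b"\r\n")  # tolerate stray CRLF between messages
        if not raw:
            break
        head, sep, rest = raw.partition(b"\r\n\r\n")
        if not sep:  # incomplete message, keep it as-is
            messages.append(raw)
            break
        length = 0
        for line in head.split(b"\r\n")[1:]:
            name, _, value = line.partition(b":")
            if name.strip().lower() == b"content-length":
                length = int(value.strip())
        messages.append(head + sep + rest[:length])
        raw = rest[length:]
    return messages


# Illustrative payload: end the first response early, then append a second one.
body = "<html>Sorry, System Down</html>"
injected = (
    "advanced\r\nContent-Length: 0\r\n\r\n"
    "HTTP/1.1 200 OK\r\nContent-Type: text/html\r\n"
    f"Content-Length: {len(body)}\r\n\r\n{body}"
)
payload = quote(injected, safe="")

print(len(split_messages(build_redirect("advanced"))))  # 1
print(len(split_messages(build_redirect(payload))))     # 2
```

With the benign value the model sees a single 302 response. With the payload it sees the 302 followed by an attacker-controlled 200 response, which is the condition a cache poisoning attack relies on.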

Therefore, to look for this vulnerability, the tester needs to identify all user-controlled input that influences one or more headers in the response, and check whether a CR+LF sequence can be successfully injected into it. The headers that are the most likely candidates for this attack are:


 * Location
 * Set-Cookie
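A simple way to run this check is to inject a harmless marker header and then look for it in the raw response. The Python sketch below is a minimal illustration, not a complete tool: the marker name X-Split-Probe and both helper functions are hypothetical, and the tester is assumed to have captured the raw response head (status line and headers) as bytes, for example from an intercepting proxy, since some HTTP client libraries normalise or reject malformed headers.

```python
from urllib.parse import quote

# Hypothetical marker; any header name the application never emits will do.
MARKER_NAME = "X-Split-Probe"
MARKER_VALUE = "1"


def make_probe(benign_value: str) -> str:
    """Return a URL-encoded parameter value that tries to append
    a marker header after the legitimate value."""
    return quote(f"{benign_value}\r\n{MARKER_NAME}: {MARKER_VALUE}", safe="")


def header_was_injected(raw_head: bytes) -> bool:
    """True if the marker shows up as a header line of its own,
    meaning the CR+LF sequence reached the response unfiltered."""
    for line in raw_head.split(b"\r\n")[1:]:  # skip the status line
        name, sep, value = line.partition(b":")
        if (
            sep
            and name.strip().lower() == MARKER_NAME.lower().encode()
            and value.strip() == MARKER_VALUE.encode()
        ):
            return True
    return False


print(make_probe("advanced"))  # advanced%0D%0AX-Split-Probe%3A%201
```

If the application re-encodes or strips the CR+LF sequence, the marker stays inside the Location or Set-Cookie value and header_was_injected returns False. If it appears as a separate header line, the parameter deserves a closer look.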

Note that successful exploitation of this vulnerability in a real-world scenario can be quite complex, as several factors must be taken into account:


 * 1) The application, while not filtering the CR+LF sequence, might filter other characters that are needed for a successful attack (e.g.: "<" and ">"). In this case, the tester can try to use other encodings (e.g.: UTF-7).
 * 2) Different targets can use different methods to decide where the first HTTP message ends and where the second starts. Some will use the message boundaries, as in the previous example. Others will allocate to each message a number of chunks of predetermined length: in this case, the second message has to start exactly at the beginning of a chunk, which requires the tester to insert padding between the two messages. Other targets will assume that different messages are carried by different packets.
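For the chunk-aligned case in point 2, the amount of padding is simple modular arithmetic. The sketch below assumes the tester has already determined the target's chunk length by experiment; the function names and the filler byte are hypothetical, and whether a given target tolerates that filler between messages has to be verified case by case.

```python
def padding_needed(first_message_len: int, chunk_size: int) -> int:
    """Number of filler bytes required so that the next message
    starts exactly at the beginning of a chunk."""
    return (-first_message_len) % chunk_size


def align(first: bytes, second: bytes, chunk_size: int, filler: bytes = b" ") -> bytes:
    """Join two messages, padding the gap so that `second` begins
    on a chunk boundary. `filler` must be a single byte."""
    pad = padding_needed(len(first), chunk_size)
    return first + filler * pad + second


print(padding_needed(137, 1024))   # 887
print(padding_needed(2048, 1024))  # 0
```

For example, a first message of 137 bytes and a chunk length of 1024 bytes call for 887 bytes of padding, while a first message that already ends on a boundary needs none.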

For a detailed discussion of these attacks, see the papers referenced at the bottom of this page.

Gray Box testing and example
Testing for Topic X vulnerabilities: ... Result Expected: ...