XPath Injection

Author
Contact Author: [mailto:mark.bradshaw@gmail.com Mark Bradshaw]

Description
Similar to SQL Injection, XPath Injection attacks occur when a web site uses user-supplied information to construct an XPath query for XML data. By sending intentionally malformed information to the web site, an attacker can find out how the XML data is structured or access data that would normally be off limits. The attacker may even be able to elevate privileges on the web site if the XML data is used for authentication (such as an XML-based user file).

Querying XML is done with XPath, a simple descriptive expression language that lets a query locate a piece of information. As in SQL, you can specify attributes to find and patterns to match. When a web site uses XML, it commonly accepts some form of input on the query string to identify the content to locate and display on the page. This input must be sanitized so that it cannot alter the XPath query and return the wrong data.

Examples
We'll use this XML snippet for the examples.

 <?xml version="1.0" encoding="utf-8"?>
 <Employees>
    <Employee ID="1">
       <FirstName>Arnold</FirstName>
       <LastName>Baker</LastName>
       <UserName>ABaker</UserName>
       <Password>SoSecret</Password>
       <Type>Admin</Type>
    </Employee>
    <Employee ID="2">
       <FirstName>Peter</FirstName>
       <LastName>Pan</LastName>
       <UserName>PPan</UserName>
       <Password>NotTelling</Password>
       <Type>User</Type>
    </Employee>
 </Employees>
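To make the XPath syntax concrete, here is a minimal C# sketch that runs two harmless queries against the snippet, one matching on an attribute and one matching on element text. The class name XPathBasics is only illustrative; XmlDocument, SelectSingleNode and SelectNodes are the standard System.Xml calls.

```csharp
using System;
using System.Xml;

class XPathBasics
{
    static void Main()
    {
        // The employee snippet from above, loaded from a string for the demo.
        XmlDocument doc = new XmlDocument();
        doc.LoadXml(
            "<Employees>" +
            "<Employee ID=\"1\"><FirstName>Arnold</FirstName><LastName>Baker</LastName>" +
            "<UserName>ABaker</UserName><Password>SoSecret</Password><Type>Admin</Type></Employee>" +
            "<Employee ID=\"2\"><FirstName>Peter</FirstName><LastName>Pan</LastName>" +
            "<UserName>PPan</UserName><Password>NotTelling</Password><Type>User</Type></Employee>" +
            "</Employees>");

        // Match on an attribute: the UserName of the employee whose ID is 2.
        XmlNode user = doc.SelectSingleNode("//Employee[@ID='2']/UserName");
        Console.WriteLine(user.InnerText); // prints PPan

        // Match on element text: the LastName of every employee of type Admin.
        foreach (XmlNode node in doc.SelectNodes("//Employee[Type/text()='Admin']/LastName"))
        {
            Console.WriteLine(node.InnerText); // prints Baker
        }
    }
}
```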

Suppose we have a user authentication system on a web page that uses a data file of this sort to log in users. Once a username and password have been supplied, the software might use an XPath expression such as this to look up the user:

VB:
 ' Vulnerable: raw user input is concatenated into the XPath string.
 Dim FindUserXPath As String
 FindUserXPath = "//Employee[UserName/text()='" & Request("Username") & "' and Password/text()='" & Request("Password") & "']"

C#:
 // Vulnerable: raw user input is concatenated into the XPath string.
 String FindUserXPath;
 FindUserXPath = "//Employee[UserName/text()='" + Request["Username"] + "' and Password/text()='" + Request["Password"] + "']";

With a normal username and password this XPath would work, but an attacker can send a crafted username and password and get an XML node selected without knowing any valid credentials, like this:

 Username: blah' or 1=1 or 'a'='a
 Password: blah

FindUserXPath becomes:

 //Employee[UserName/text()='blah' or 1=1 or 'a'='a' and Password/text()='blah']

Because "and" binds more tightly than "or" in XPath, this is logically equivalent to:

 //Employee[(UserName/text()='blah' or 1=1) or ('a'='a' and Password/text()='blah')]

In this case, only the first parenthesized part of the predicate needs to be true. The password check becomes irrelevant, and the UserName part matches ALL employees because "1=1" is always true.
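The effect can be reproduced with a short C# console sketch. It is only an illustration: the class name XPathInjectionDemo and the helper BuildQuery are hypothetical, the employee data is trimmed to the fields the query touches, and the query is written with text() and a lowercase "and", which is the form XPath 1.0 requires.

```csharp
using System;
using System.Xml;

class XPathInjectionDemo
{
    // Employee data from the example, trimmed to the fields the query uses.
    const string Xml =
        "<Employees>" +
        "<Employee ID=\"1\"><UserName>ABaker</UserName><Password>SoSecret</Password><Type>Admin</Type></Employee>" +
        "<Employee ID=\"2\"><UserName>PPan</UserName><Password>NotTelling</Password><Type>User</Type></Employee>" +
        "</Employees>";

    // Hypothetical helper: builds the query by string concatenation,
    // which is exactly the pattern the attack targets.
    static string BuildQuery(string username, string password)
    {
        return "//Employee[UserName/text()='" + username +
               "' and Password/text()='" + password + "']";
    }

    static void Main()
    {
        XmlDocument doc = new XmlDocument();
        doc.LoadXml(Xml);

        // Real user, wrong password: the predicate is false for every node.
        Console.WriteLine(doc.SelectNodes(BuildQuery("ABaker", "wrong")).Count); // prints 0

        // Injected username: "1=1" makes the predicate true for every Employee.
        string attack = "blah' or 1=1 or 'a'='a";
        Console.WriteLine(doc.SelectNodes(BuildQuery(attack, "blah")).Count); // prints 2
    }
}
```

A login routine that simply takes the first selected node would, under these assumptions, authenticate the attacker as the first employee in the file, here the Admin account.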

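As with SQL injection, to protect yourself you must escape the quote character that delimits the string literal in your query: single quotes in the examples here, or double quotes if your application uses those instead. Note that XPath 1.0 has no escape sequence inside string literals, so the replacement below stops input from closing the literal, but a legitimate value containing an apostrophe will no longer match as typed.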

VB:
 ' Single quotes in user input are replaced before concatenation.
 Dim FindUserXPath As String
 FindUserXPath = "//Employee[UserName/text()='" & Request("Username").Replace("'", "&apos;") & "' and Password/text()='" & Request("Password").Replace("'", "&apos;") & "']"

C#:
 // Single quotes in user input are replaced before concatenation.
 String FindUserXPath;
 FindUserXPath = "//Employee[UserName/text()='" + Request["Username"].Replace("'", "&apos;") + "' and Password/text()='" + Request["Password"].Replace("'", "&apos;") + "']";
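As a quick check of the quote replacement, the following C# sketch sends the same attack string through an escaping helper and counts the selected nodes. EscapeQuote and BuildQuery are hypothetical names introduced for the illustration, and the sketch assumes every literal in the query is delimited by single quotes.

```csharp
using System;
using System.Xml;

class EscapedQueryDemo
{
    // Employee data from the example, trimmed to the fields the query uses.
    const string Xml =
        "<Employees>" +
        "<Employee ID=\"1\"><UserName>ABaker</UserName><Password>SoSecret</Password><Type>Admin</Type></Employee>" +
        "<Employee ID=\"2\"><UserName>PPan</UserName><Password>NotTelling</Password><Type>User</Type></Employee>" +
        "</Employees>";

    // Hypothetical helper: replaces the single quote so that user input
    // can no longer close the string literal it is placed in.
    static string EscapeQuote(string value)
    {
        return value.Replace("'", "&apos;");
    }

    static string BuildQuery(string username, string password)
    {
        return "//Employee[UserName/text()='" + EscapeQuote(username) +
               "' and Password/text()='" + EscapeQuote(password) + "']";
    }

    static void Main()
    {
        XmlDocument doc = new XmlDocument();
        doc.LoadXml(Xml);

        // The attack string stays inside the literal and matches no user.
        string attack = "blah' or 1=1 or 'a'='a";
        Console.WriteLine(doc.SelectNodes(BuildQuery(attack, "blah")).Count); // prints 0

        // A genuine username and password still select exactly one node.
        Console.WriteLine(doc.SelectNodes(BuildQuery("ABaker", "SoSecret")).Count); // prints 1
    }
}
```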

A better mitigation is to use a precompiled XPath expression. A precompiled XPath is fixed before the program handles any request, rather than built on the fly after the user's input has been added to the string; user-supplied values are then passed in separately (for example as XPath variables) instead of becoming part of the expression text. This is the better route because you don't have to worry about missing a character that should have been escaped.
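One way this might look in .NET is sketched below, assuming the user's values are bound as XPath variables through a custom XsltContext. The classes VariableContext, StringVariable and ParameterizedLookup are hypothetical helpers written for this illustration; XPathExpression.Compile, SetContext, XsltContext and IXsltContextVariable are the framework types they rely on.

```csharp
using System;
using System.Collections.Generic;
using System.Xml;
using System.Xml.XPath;
using System.Xml.Xsl;

// Hypothetical helper: resolves $name references from a dictionary.
class VariableContext : XsltContext
{
    private readonly Dictionary<string, string> values = new Dictionary<string, string>();

    public void Set(string name, string value) { values[name] = value; }

    public override IXsltContextVariable ResolveVariable(string prefix, string name)
    {
        return new StringVariable(values[name]);
    }

    // No custom functions are used, so there is nothing to resolve.
    public override IXsltContextFunction ResolveFunction(string prefix, string name, XPathResultType[] argTypes)
    {
        return null;
    }

    public override bool Whitespace { get { return true; } }
    public override bool PreserveWhitespace(XPathNavigator node) { return true; }
    public override int CompareDocument(string baseUri, string nextBaseUri) { return 0; }
}

// Hypothetical helper: a variable that always evaluates to a plain string.
class StringVariable : IXsltContextVariable
{
    private readonly string value;
    public StringVariable(string value) { this.value = value; }
    public bool IsLocal { get { return false; } }
    public bool IsParam { get { return false; } }
    public XPathResultType VariableType { get { return XPathResultType.String; } }
    public object Evaluate(XsltContext context) { return value; }
}

class ParameterizedLookup
{
    // Compiled once; user input never becomes part of the expression text.
    static readonly XPathExpression FindUser = XPathExpression.Compile(
        "//Employee[UserName/text()=$user and Password/text()=$pass]");

    static int CountMatches(XmlDocument doc, string username, string password)
    {
        VariableContext context = new VariableContext();
        context.Set("user", username);
        context.Set("pass", password);

        // Clone so the shared compiled expression is not modified per request.
        XPathExpression query = FindUser.Clone();
        query.SetContext(context);
        return doc.CreateNavigator().Select(query).Count;
    }

    static void Main()
    {
        XmlDocument doc = new XmlDocument();
        doc.LoadXml(
            "<Employees>" +
            "<Employee ID=\"1\"><UserName>ABaker</UserName><Password>SoSecret</Password></Employee>" +
            "<Employee ID=\"2\"><UserName>PPan</UserName><Password>NotTelling</Password></Employee>" +
            "</Employees>");

        Console.WriteLine(CountMatches(doc, "ABaker", "SoSecret"));             // prints 1
        Console.WriteLine(CountMatches(doc, "blah' or 1=1 or 'a'='a", "blah")); // prints 0
    }
}
```

Because the variable values are compared as data, no quote handling is needed, and a value that legitimately contains an apostrophe is compared as typed.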

Related Attacks

 * Injection problem
 * SQL injection