Preventing SQL Injection in Java

Status
Released 14/1/2008

Overview
As the name implies, SQL injection vulnerabilities allow an attacker to inject (or execute) SQL commands within an application. They are among the most widespread and dangerous application vulnerabilities. The CLASP project provides a good overview of SQL injection.

Example of SQL injection
The following Java servlet code, used to perform a login function, illustrates the vulnerability by accepting user input without performing adequate input validation or escaping meta-characters:

    conn = pool.getConnection();
    String sql = "select * from user where username='" + username + "' and password='" + password + "'";
    stmt = conn.createStatement();
    rs = stmt.executeQuery(sql);
    if (rs.next()) {
        loggedIn = true;
        out.println("Successfully logged in");
    } else {
        out.println("Username and/or password not recognized");
    }

It is possible for attackers to provide a username containing SQL meta-characters that subvert the intended function of the SQL statement. For example, by providing a username of:

    admin' OR '1'='1

and a blank password, the generated SQL statement becomes:

    select * from user where username='admin' OR '1'='1' and password=''

This allows an attacker to log in to the site without supplying a password: because AND binds more tightly than OR, the condition is true for the 'admin' row whatever the password is. Using the same technique, attackers can inject other SQL commands which could extract, modify or delete data within the database.
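The effect of the concatenation can be reproduced without a database. The sketch below only rebuilds the SQL text the way the vulnerable servlet does, so the injected statement can be inspected; the class and method names (InjectionDemo, buildLoginQuery) are hypothetical and exist only for this illustration.

```java
// Illustration only: reproduces the string concatenation from the vulnerable
// servlet so the resulting SQL text can be inspected. No database is involved.
final class InjectionDemo {

    // Builds the query the same way the vulnerable code does (do NOT do this
    // in real code; it is shown here to make the problem visible).
    static String buildLoginQuery(String username, String password) {
        return "select * from user where username='" + username
                + "' and password='" + password + "'";
    }

    public static void main(String[] args) {
        // Attacker-supplied username combined with an empty password.
        System.out.println(buildLoginQuery("admin' OR '1'='1", ""));
    }
}
```

The quote characters supplied by the attacker end the string literal early, so the rest of the input is parsed as SQL rather than treated as data.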

Attack techniques
For more information on SQL injection attacks see:
 * http://www.spidynamics.com/papers/SQLInjectionWhitePaper.pdf
 * http://www.nextgenss.com/papers/advanced_sql_injection.pdf
 * http://www.appsecinc.com/presentations/Manipulating_SQL_Server_Using_SQL_Injection.pdf

Defence Strategy
To prevent SQL injection, a two-pronged approach is recommended:
 * First, all queries should be parameterized, with explicit binding of all user-driven variables.
 * Second, all data accepted from user input should be thoroughly validated to ensure that the characters received are part of the set of valid characters for that field.

Escaping Meta-characters
All data access techniques provide some means for escaping SQL meta-characters automatically. The important thing to remember is to never construct SQL statements using string concatenation of unchecked input values. The following sections detail how to perform input validation and meta-character escaping using popular data access technologies.

Prepared Statements
Variables passed as arguments to prepared statements will automatically be escaped by the JDBC driver.

Example: ps.1

    String selectStatement = "SELECT * FROM User WHERE userId = ? ";
    PreparedStatement prepStmt = con.prepareStatement(selectStatement);
    prepStmt.setString(1, userId);
    ResultSet rs = prepStmt.executeQuery();

Although prepared statements help defend against SQL injection, they can still be exploited when used inappropriately. The example below shows such a scenario, where the input variable is concatenated directly into the statement text, paving the way for SQL injection attacks.

Example: ps.2

    String strUserName = request.getParameter("Txt_UserName");
    PreparedStatement prepStmt = con.prepareStatement("SELECT * FROM user WHERE userId = '" + strUserName + "'");

It is highly recommended to use bind variables as shown in example ps.1 above. Using PreparedStatement with bind variables defends against SQL injection attacks and can improve performance.
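As a minimal sketch, the login check from the vulnerable servlet could be rewritten with bind variables along the following lines. The table and column names are taken from that example; the class name LoginDao, the method isValidLogin and the constant LOGIN_SQL are assumptions made for illustration, and the password comparison is kept as in the original example.

```java
import java.sql.Connection;
import java.sql.PreparedStatement;
import java.sql.ResultSet;
import java.sql.SQLException;

// Sketch only: LoginDao is a hypothetical helper, not part of any framework.
final class LoginDao {

    // The SQL text is a constant; user input never becomes part of it.
    static final String LOGIN_SQL =
            "select 1 from user where username = ? and password = ?";

    // Returns true if a row matches the supplied credentials.
    static boolean isValidLogin(Connection conn, String username, String password)
            throws SQLException {
        try (PreparedStatement ps = conn.prepareStatement(LOGIN_SQL)) {
            ps.setString(1, username); // bound as data, not parsed as SQL
            ps.setString(2, password);
            try (ResultSet rs = ps.executeQuery()) {
                return rs.next();
            }
        }
    }
}
```

With this form, a username of admin' OR '1'='1 is compared literally against the username column and matches no row, because the quote characters are never interpreted as part of the statement.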

Hibernate
According to this forum thread, Hibernate uses prepared statements, so it is protected from direct SQL injection, but it could still be vulnerable to injection of HQL statements.

Ibatis
Ibatis creates prepared statements for database access. However, SQL injection is possible in Ibatis if the $$ variable replacement syntax is used.

Vulnerable:

    select * from table where id = $value$

The above query is called as follows:

    MyBean b = (MyBean)sqlMap.queryForObject("vuln", new Integer(1));

The SQL statement thus created looks as follows:

    select * from table where id = 1

The object passed as a parameter is fed directly into the SQL query, making it susceptible to SQL injection.

Secure:

    select * from table where id = #value#

Using this form instead generates the following SQL:

    select * from table where id = ?

The value of the parameter is sent directly to the driver and not used to modify the SQL statement itself. This approach thwarts SQL injection attacks by automatically escaping SQL meta-characters.
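The difference matters most when the parameter is a string taken from user input rather than a fixed Integer. The sketch below is not Ibatis code: it only mimics textual substitution with a plain string replacement, under the assumption that the $value$ form inserts the parameter's string form into the SQL text. SubstitutionDemo and substitute are hypothetical names.

```java
// Illustration only: this is NOT Ibatis code. It mimics, with plain string
// replacement, what textual substitution of $value$ does to the SQL text.
final class SubstitutionDemo {

    // Replaces the $value$ token with the string form of the parameter.
    static String substitute(String template, Object value) {
        return template.replace("$value$", String.valueOf(value));
    }

    public static void main(String[] args) {
        String template = "select * from table where id = $value$";
        // A harmless Integer parameter, as in the example above.
        System.out.println(substitute(template, Integer.valueOf(1)));
        // An attacker-controlled string changes the structure of the query.
        System.out.println(substitute(template, "1 OR 1=1"));
    }
}
```

In the second call the injected OR clause becomes part of the statement itself, which is the situation the #value# form avoids by sending the value to the driver separately.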

Validating Input
A general security principle which applies well to data validation is that of “deny by default”, where data is rejected unless it specifically matches the criteria for known good data. This is also known as a “white list” approach and is the preferred method for performing data validation. It allows one to define a restricted range for valid data and reject everything that does not fit this set. The set of valid data should be constrained by:
 * Type – string, integer, unsigned integer, float, etc.;
 * Length;
 * Set of characters – for example, only alphabetic characters [a-zA-Z]*;
 * Format – if appropriate, the data could be further constrained by specifying a format, e.g.: \d\d\/\d\d\/\d\d;
 * Reasonableness – where possible, values should be compared to expected ranges. For example, a customer ordering 1000 televisions could be suspicious.

It is essential that the data validation routines themselves can be trusted; therefore they must be performed on the server side. Client-side validation can be performed as a useful user interface feature, but it must be reinforced by server-side validation. Where input validation is performed on the server side will depend largely on the frameworks available. JSF and Struts provide validation functions that are defined in the view layer, while Spring and EJB 3.0 allow validation to be defined in the model.

Input validation provides the first line of defence in preventing dangerous characters from being processed by the application. But even if data is constrained in this way, it does not solve the meta-character problem: how should the application handle meta-characters that are defined as valid data, but cannot be used in certain processing contexts? For example, the single quote (') character may be a valid character in a surname, but this character cannot simply be used in a string that is used to form an SQL statement. The OWASP Guide project has more information on Data Validation.
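The constraints described in this section (type, length, character set, format and reasonableness) can be sketched as a small server-side white list validator. This is a minimal illustration using only the standard library; the class name InputValidator, the field names and the limits (40 characters, a maximum quantity of 100) are assumptions chosen for the example, not recommendations from any framework.

```java
import java.util.regex.Pattern;

// Sketch of "deny by default" validation: input is rejected unless it
// matches a known-good pattern. All names and limits here are hypothetical.
final class InputValidator {

    // Character set and length: 1 to 40 alphabetic characters only.
    private static final Pattern SURNAME = Pattern.compile("[a-zA-Z]{1,40}");
    // Format: two digits, slash, two digits, slash, two digits.
    private static final Pattern DATE = Pattern.compile("\\d\\d/\\d\\d/\\d\\d");
    // Type and length: an unsigned integer of at most four digits.
    private static final Pattern QUANTITY = Pattern.compile("\\d{1,4}");
    // Reasonableness: an assumed upper bound for a single order.
    private static final int MAX_QUANTITY = 100;

    static boolean isValidSurname(String s) {
        return s != null && SURNAME.matcher(s).matches();
    }

    static boolean isValidDate(String s) {
        return s != null && DATE.matcher(s).matches();
    }

    static boolean isValidQuantity(String s) {
        if (s == null || !QUANTITY.matcher(s).matches()) {
            return false; // wrong type or too long
        }
        int quantity = Integer.parseInt(s); // safe: 1 to 4 ASCII digits
        return quantity >= 1 && quantity <= MAX_QUANTITY;
    }
}
```

Note that this surname pattern rejects a value such as O'Brien. If the single quote is added to the allowed set, validation alone no longer protects the query, which is why bind variables are still needed.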