CRV2 CrawlingCode


Crawling code is the practice of scanning the code base under review for key pointers to places where a security vulnerability might reside. Certain APIs deal with interfacing to the external world, file I/O, or user management, all of which are key areas for an attacker to focus on, so in crawling code we look for APIs relating to these areas. We also need to look for business logic areas which may cause security issues. These are generally bespoke methods with bespoke names and cannot be detected directly by keyword, although we may come across them through their relationship with a key API.

We also need to look for common issues relating to a specific language: issues that may not be *security* related but which may affect the stability or availability of the application in extraordinary circumstances. Other items to check when performing a code review include simple matters such as a copyright notice to protect one’s intellectual property. Generally these issues should be part of your company's Coding Guidelines and should be enforceable during a code review (i.e. a reviewer can fail a code review because the code violates something in the Coding Guidelines, regardless of whether the code would work in its current state, and regardless of whether the original developer agrees).

Crawling code can be done manually or with automated tools. Crawling code manually is unlikely to be effective because, as the lists below show, a single language can have a large number of indicators. Tools as simple as grep or wingrep can be used, and other tools are available which search for keywords relating to a specific programming language. If you are using a review tool which allows you to specify strings to be highlighted in a review (e.g. Python based review tools using the pygments syntax highlighter, or an in-house tool for which you can change the source code), you could add the relevant string indicators from the lists below and have them highlighted to reviewers automatically.

The following sections cover crawling code for .NET, Java/J2EE, Classic ASP, JavaScript/AJAX, and C/C++ with Apache. This section is best used in conjunction with the transactional analysis section also detailed in this guide.

Searching for Key Indicators

The basis of the code review is to locate and analyse areas of code which may have application security implications. Assuming the code reviewer has a thorough understanding of the code, what it is intended to do, and the context in which it is to be used, firstly one needs to sweep the code base for areas of interest.

This can be done by performing a text search on the code base for keywords relating to APIs and functions. Below is a guide for the .NET Framework 1.1 and 2.0.

Searching for Code in .NET

Firstly one needs to be familiar with the tools one can use to perform text searching. Following this, one needs to know what to look for.

In this section we will assume you have a copy of Visual Studio (VS) .NET at hand. Two types of search are available: the "Find in Files" feature in VS and the Windows command line tool findstr.

To start off, one could scan through the code looking for common patterns or keywords such as "User", "Password", "Pswd", "Key", "Http", etc. This can be done using the "Find in Files" tool in VS or using findstr as follows:

findstr /s /m /i /d:c:\projects\codebase\sec "http" *.*
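
As a rough cross-platform counterpart to the findstr command above, the following Python sketch walks a directory tree and reports which files contain any of a list of indicator strings, matching case-insensitively as the /i switch does. The function name find_indicators, the default extension list, and the sample indicators are illustrative assumptions rather than part of any particular review tool.

```python
import os

def find_indicators(root, indicators, extensions=(".cs", ".aspx", ".config")):
    """Return {relative_path: sorted list of indicators found in that file}."""
    lowered = [s.lower() for s in indicators]
    hits = {}
    for dirpath, _dirs, files in os.walk(root):
        for name in files:
            # Only look at file types of interest (hypothetical default list)
            if not name.lower().endswith(tuple(extensions)):
                continue
            path = os.path.join(dirpath, name)
            # errors="ignore" keeps the sweep going on files with odd encodings
            with open(path, encoding="utf-8", errors="ignore") as fh:
                text = fh.read().lower()
            found = sorted(s for s in lowered if s in text)
            if found:
                hits[os.path.relpath(path, root)] = found
    return hits
```

A call such as find_indicators("codebase", ["password", "http"]) returns only the files worth opening, which is the same triage step that the /m switch of findstr provides.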

HTTP Request Strings

Requests from external sources are obviously a key area of a security code review. We need to ensure that all HTTP requests received are validated for composition and for maximum and minimum length, and that the data falls within the realms of the parameter white-list. The bottom line is that this is a key area to examine to ensure security controls are in place.

request.accepttypes request.browser request.files request.headers request.httpmethod request.item
request.querystring request.form request.cookies request.certificate request.rawurl request.servervariables
request.url request.urlreferrer request.useragent request.userlanguages request.IsSecureConnection request.TotalBytes
request.BinaryRead InputStream HiddenField.Value TextBox.Text recordSet

HTML Output

Here we are looking for responses to the client. Responses which go unvalidated, or which echo external input without data validation, are key areas to examine. Many client-side attacks result from poor response validation; cross-site scripting (XSS) in particular depends on it.

response.write <% = HttpUtility HtmlEncode UrlEncode
innerText innerHTML

SQL & Database

Locating where a database may be involved in the code is an important aspect of the code review. Looking at the database code will help determine if the application is vulnerable to SQL injection. One aspect of this is to verify that the code uses SqlParameter (System.Data.SqlClient), OleDbParameter (System.Data.OleDb), or OdbcParameter (System.Data.Odbc). These are typed and treat parameters as literal values, not as executable code in the database.

exec sp_ select from insert update delete from where delete
execute sp_ exec xp_ exec @ execute @ executestatement executeSQL
setfilter executeQuery GetQueryResultInXML adodb sqloledb sql server
driver Server.CreateObject .Provider .Open ADODB.recordset New OleDbConnection
ExecuteReader DataSource SqlCommand Microsoft.Jet SqlDataReader ExecuteReader
GetString SqlDataAdapter CommandType StoredProcedure System.Data.sql
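
The difference between concatenating input into SQL text and binding it as a parameter can be sketched in a few lines. The example below uses Python's built-in sqlite3 module purely for illustration; it is not the .NET SqlParameter API, and the table, data, and function names are hypothetical.

```python
import sqlite3

def make_db():
    # In-memory database with two hypothetical users
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE users (name TEXT, secret TEXT)")
    conn.executemany("INSERT INTO users VALUES (?, ?)",
                     [("alice", "a1"), ("bob", "b2")])
    return conn

def lookup_concatenated(conn, name):
    # Vulnerable pattern: the input becomes part of the SQL text itself
    sql = "SELECT secret FROM users WHERE name = '" + name + "'"
    return [row[0] for row in conn.execute(sql)]

def lookup_parameterized(conn, name):
    # Safer pattern: the input is bound as a literal value, never parsed as SQL
    sql = "SELECT secret FROM users WHERE name = ?"
    return [row[0] for row in conn.execute(sql, (name,))]
```

With an input such as x' OR '1'='1, the concatenated version returns every row while the parameterized version returns none, which is why string concatenation around the keywords above deserves a closer look during review.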


Cookies

Cookie manipulation can be key to various application security exploits, such as session hijacking/fixation and parameter manipulation. One should examine any code relating to cookie functionality, as this would have a bearing on session security.

System.Net.Cookie HTTPOnly document.cookie


HTML Tags

Many of the HTML tags below can be used for client side attacks such as cross site scripting. It is important to examine the context in which these tags are used and to examine any relevant data validation associated with the display and use of such tags within a web application.

HtmlEncode URLEncode <applet> <frameset> <embed> <frame> <html>
<iframe> <img> <style> <layer> <ilayer> <meta> <object>
<frame security <iframe security <body>

Input Controls

The input controls below are server classes used to produce and display web application form fields. Looking for such references helps locate entry points into the application.

htmlcontrols.htmlinputhidden webcontrols.hiddenfield webcontrols.hyperlink webcontrols.textbox webcontrols.label
webcontrols.linkbutton webcontrols.listbox webcontrols.checkboxlist webcontrols.dropdownlist


web.config

The .NET Framework relies on .config files to define configuration settings. The .config files are text-based XML files. Many .config files can, and typically do, exist on a single system. Web applications refer to a web.config file located in the application’s root directory. For ASP.NET applications, web.config contains information about most aspects of the application’s operation.

requestEncoding responseEncoding trace authorization compilation CustomErrors
httpCookies httpHandlers httpRuntime sessionState maxRequestLength debug
forms protection appSettings ConfigurationSettings appSettings connectionStrings authentication mode
allow deny credentials identity impersonate timeout remote
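
Because web.config is XML, a few of the settings above can be checked mechanically rather than by eye. The following Python sketch flags three commonly reviewed values. The function name risky_settings and the choice of checks are assumptions made for illustration, and the sketch assumes the file declares no XML namespace on its root element.

```python
import xml.etree.ElementTree as ET

def risky_settings(web_config_xml):
    """Return a list of findings for a web.config document given as a string."""
    root = ET.fromstring(web_config_xml)
    findings = []
    # <compilation debug="true"> leaves debug builds enabled
    for node in root.iter("compilation"):
        if node.get("debug", "").lower() == "true":
            findings.append("compilation debug is enabled")
    # <customErrors mode="Off"> shows detailed errors to every client
    for node in root.iter("customErrors"):
        if node.get("mode", "").lower() == "off":
            findings.append("customErrors mode is Off")
    # <trace enabled="true"> turns on application tracing
    for node in root.iter("trace"):
        if node.get("enabled", "").lower() == "true":
            findings.append("trace is enabled")
    return findings
```

An empty result does not mean the configuration is safe; it only means these three indicators were not found.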


global.asax

Each application has its own Global.asax if one is required. Global.asax sets the event code and values for an application using scripts. One must ensure that application variables do not contain sensitive information, as they are accessible to the whole application and to all users within it.

Application_OnAuthenticateRequest Application_OnAuthorizeRequest Session_OnStart Session_OnEnd


Logging

Logging can be a source of information leakage. It is important to examine all calls to the logging subsystem and to determine if any sensitive information is being logged. Common mistakes are logging the user ID in conjunction with passwords within the authentication functionality, or logging database requests which may contain sensitive data.

log4net Console.WriteLine System.Diagnostics.Debug System.Diagnostics.Trace
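
One way to narrow a logging review is to flag only those lines where a logging call and a sensitive-looking term appear together. The Python sketch below does this with two regular expressions. Both patterns, and the function name suspicious_log_lines, are illustrative assumptions that would need tuning for a real code base.

```python
import re

# Hypothetical patterns: common logging calls and sensitive-looking terms
LOG_CALL = re.compile(
    r"(log4net|Console\.WriteLine|Debug\.Write|Trace\.Write|\blog\.\w+\()",
    re.IGNORECASE,
)
SENSITIVE = re.compile(r"(password|pwd|pswd|secret|token)", re.IGNORECASE)

def suspicious_log_lines(source):
    """Return (line_number, stripped_line) pairs worth a manual look."""
    results = []
    for number, line in enumerate(source.splitlines(), 1):
        # Flag a line only when both a log call and a sensitive term occur
        if LOG_CALL.search(line) and SENSITIVE.search(line):
            results.append((number, line.strip()))
    return results
```

A match is only a prompt for manual inspection, since the sensitive term may appear in a message string rather than in the logged data.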


machine.config

It is important to note that many variables in machine.config can be overridden in the web.config file for a particular application.

validateRequest enableViewState enableViewStateMac

Threads and Concurrency

Locate code that contains multithreaded functions. Concurrency issues can result in race conditions, which may in turn result in security vulnerabilities. The Thread keyword is where new thread objects are created. Code that uses static global variables to hold sensitive security information may cause session issues. Code that uses static constructors may also cause issues between threads. Not synchronizing the Dispose method may cause resource release issues if a number of threads call Dispose at the same time.

Thread Dispose

Class Design

Public and Sealed relate to design at the class level. Classes which are not intended to be derived from should be sealed. Make sure every Public class field is public for a reason; don't expose anything you don't need to.

Public Sealed

Reflection, Serialization

Code may be generated dynamically at runtime. Code that is generated dynamically as a function of external input may give rise to issues. If your code contains sensitive data, does it need to be serialized?

Serializable AllowPartiallyTrustedCallersAttribute GetObjectData StrongNameIdentityPermission
StrongNameIdentity System.Reflection

Exceptions & Errors

Ensure that catch blocks do not leak information to the user when an exception occurs. Ensure that the finally block is used when dealing with resources. Having trace enabled risks information leakage, since trace output can expose internal application details. Ensure customised errors are properly implemented.

catch finally trace enabled customErrors mode


Cryptography

If cryptography is used, is a strong enough cipher used, e.g. AES or 3DES? What size key is used? The larger the better. Where is hashing performed? Are passwords that are being persisted hashed? They should be. How are random numbers generated? Is the PRNG "random enough"?

RNGCryptoServiceProvider SHA MD5 base64 xor
DES RC2 System.Random Random System.Security.Cryptography


Storage

If storing sensitive data in memory, consider using the following.

SecureString ProtectedMemory

Authorization, Assert & Revert

Code that bypasses code access security permission checks is rarely a good idea and should be examined closely. The list below also includes potentially dangerous permissions, such as the permission to call unmanaged code outside the CLR.

.RequestMinimum .RequestOptional Assert Debug.Assert
CodeAccessPermission ReflectionPermission.MemberAccess SecurityPermission.ControlAppDomain SecurityPermission.UnmanagedCode
SecurityPermission.SkipVerification SecurityPermission.ControlEvidence SecurityPermission.SerializationFormatter SecurityPermission.ControlPrincipal
SecurityPermission.ControlDomainPolicy SecurityPermission.ControlPolicy

Legacy Methods

Some standard functions that should be checked in any context include the following.

printf strcpy

Searching for Code in Java

Input and Output Streams

These are used to read data into one’s application. They may be potential entry points into an application. The entry points may be from an external source and must be investigated. These may also be used in path traversal attacks or DoS attacks.

java.util.jar FileInputStream ObjectInputStream
FilterInputStream PipedInputStream SequenceInputStream StringBufferInputStream BufferedReader
ByteArrayInputStream CharArrayReader File ObjectInputStream PipedInputStream
StreamTokenizer getResourceAsStream mkdir renameTo


Servlets

These API calls may be avenues for parameter, header, URL, and cookie tampering, HTTP Response Splitting and information leakage. They should be examined closely, as many of these APIs obtain their parameters directly from HTTP requests.

javax.servlet.* getParameterNames getParameterValues getParameter getParameterMap
getScheme getProtocol getContentType getServerName getRemoteAddr
getRemoteHost getRealPath getLocalName getAttribute getAttributeNames
getLocalAddr getAuthType getRemoteUser getCookies isSecure
HttpServletRequest getQueryString getHeaderNames getHeaders getPrincipal
getUserPrincipal isUserInRole getInputStream getOutputStream getWriter
addCookie addHeader setHeader setAttribute putValue
javax.servlet.http.Cookie getName getPath getDomain getComment
getMethod getPath getReader getRealPath getRequestURI
getRequestURL getServerName getValue getValueNames getRequestedSessionId

Cross Site Scripting

These API calls should be checked in code review as they could be a source of Cross Site Scripting vulnerabilities.

javax.servlet.ServletOutputStream.print javax.servlet.jsp.JspWriter.print

Response Splitting

Response splitting allows an attacker to take control of the response body by adding extra CRLFs into headers. In HTTP the headers and body are separated by two consecutive CRLF sequences, so if an attacker's input is used in a response header and that input contains two CRLFs, anything after them is interpreted as the response body. In code review, ensure that any information placed into headers is sanitized.

javax.servlet.http.HttpServletResponse.sendRedirect addHeader setHeader
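
The check a reviewer hopes to find before any header-setting call can be sketched as follows. The Python function below is a language-neutral illustration, not a servlet API; the name safe_header_value is hypothetical, and rejecting the value outright is only one possible policy (stripping or encoding the characters is another).

```python
def safe_header_value(value):
    """Return the value unchanged, or raise if it could split the response."""
    # CR or LF inside a header value lets input terminate the header block
    if "\r" in value or "\n" in value:
        raise ValueError("header value contains CR or LF")
    return value
```

Any path from request input to addHeader, setHeader, or sendRedirect that lacks an equivalent check is a candidate finding.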


Redirection

Any time your application sends a redirect response, ensure that the logic involved cannot be manipulated by an attacker's input, especially when input is used to determine where the redirect goes.

sendRedirect setStatus addHeader setHeader
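
A common way to keep input from steering a redirect is to accept only local paths or hosts on an allow-list. The Python sketch below illustrates the idea. The ALLOWED_HOSTS set, the host name in it, and the function name safe_redirect_target are hypothetical, and a real application would apply the equivalent logic before calling sendRedirect.

```python
from urllib.parse import urlsplit

ALLOWED_HOSTS = {"www.example.com"}  # hypothetical allow-list

def safe_redirect_target(target, default="/"):
    """Return target if it is local or allow-listed, otherwise the default."""
    try:
        parts = urlsplit(target)
    except ValueError:
        return default  # malformed URL
    if not parts.scheme and not parts.netloc:
        # Relative target: accept a single leading slash only, since
        # "//host" and "/\host" can be treated as absolute by browsers
        if (target.startswith("/")
                and not target.startswith("//")
                and not target.startswith("/\\")):
            return target
        return default
    if parts.scheme in ("http", "https") and parts.hostname in ALLOWED_HOSTS:
        return target
    return default
```

Falling back to a fixed default rather than raising an error keeps the user on the site while denying the attacker control of the destination.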

SQL & Database

When searching for Java database-related code, this list should help you pinpoint the classes and methods involved in the persistence layer of the application being reviewed.

jdbc executeQuery select insert update
delete execute executestatement createStatement java.sql.ResultSet.getString
java.sql.ResultSet.getObject java.sql.Statement.executeUpdate java.sql.Statement.executeQuery java.sql.Statement.execute java.sql.Statement.addBatch
java.sql.Connection.prepareStatement java.sql.Connection.prepareCall


SSL

Look for code which uses SSL as a medium for point-to-point encryption. The following fragments should indicate where SSL functionality has been developed.

SSLContext SSLSocketFactory
TrustManagerFactory HttpsURLConnection KeyManagerFactory

Session Management

The following APIs should be checked in code review when they control session management.

getSession invalidate getId

Legacy Interaction

Here we may be vulnerable to command injection attacks or OS injection attacks. Java linking to the native OS can cause serious issues and potentially give rise to total server compromise.

java.lang.Runtime.exec java.lang.Runtime.getRuntime


Logging

Examining the logging code in one’s application may reveal information leakage. The following keywords help locate it.

log4j jLo Lumberjack MonoLog
qflog just4log log4Ant JDLabAgent

Architectural Analysis

If we can identify the major architectural components within the application early on, it can help narrow our search, and we can then look for known vulnerabilities in those components and frameworks:

### Ajax

### Struts

### Spring

### Java Server Faces (JSF)
import javax.faces

### Hibernate
import org.hibernate

### Castor

### JAXB

### JMS

Ajax and JavaScript

Look for Ajax usage, and possible JavaScript issues:

document.write eval document.cookie
window.location document.URL window.createRequest

Searching for Code in Classic ASP


Inputs

These APIs in ASP are commonly used to retrieve input from the request. Code review should therefore ensure these requests (and dependent logic) cannot be manipulated by an attacker.

Request Request.QueryString Request.Form
Request.ServerVariables Query_String hidden
include .inc


Output

These APIs are used by ASP to write the response body that will be sent to the end user. Code review should check that these responses are used in a proper manner and that no sensitive information can be returned.

Response.Write Response.BinaryWrite <%=


Cookies

Cookies can be a source of information leakage.


Error Handling

Ensure errors in your application are handled properly, otherwise an attacker could use error conditions to manipulate your application.

err. Server.GetLastError On Error Resume Next
On Error GoTo 0

Information in URL

These APIs are used to extract information from the URL object in the request. Code review should check that the information extracted from the URL is sanitized.

location.href location.replace method="GET"


Database

These APIs can be used to interact with a database, which can lead to SQL injection attacks. Code review should check that these API calls use sanitized input.

commandText select from update
insert into delete from where exec
execute .execute .open
ADODB. commandtype ICommand


Session

These API calls can control sessions within ASP applications.

session.timeout session.abandon session.removeall

DoS Prevention

The following ASP APIs can help prevent DoS attacks against your application.

server.ScriptTimeout IsClientConnected


Logging

Leaking information to a log can be of use to an attacker, so logging calls should be checked in code review to ensure no sensitive information is being written to logs.



Redirection

Do not allow attacker input to control when and where redirection occurs.

Response.AddHeader Response.AppendHeader Response.Redirect
Response.Status Response.StatusCode Server.Transfer

Searching for Code in Javascript and AJAX

Ajax and JavaScript have brought functionality back to the client side, which has returned a number of old security issues to the forefront. The advent of AJAX and other Web 2.0 paradigms has pushed security concerns back to the client side without removing traditional server-side concerns. The following keywords relate to API calls used to manipulate user state or to control the browser.

Look for Ajax usage, and possible JavaScript issues:

eval document.cookie document.referrer document.attachEvent
document.body document.body.innerHtml document.body.innerText document.close
document.create document.execCommand document.forms[0].action
document.location document.URL document.URLUnencoded
document.write document.writeln location.hash location.href window.alert window.attachEvent window.createRequest
window.execScript window.location window.navigate
window.setInterval window.setTimeout XMLHTTP

Searching for Code in C++ and Apache

Commonly when a C++ developer is building a web service they will build a CGI program to be invoked by a web server (though this is not efficient) or they will use the Apache httpd framework and write a handler or filter to process HTTP requests/responses. To aid these developers, this section deals with generic C/C++ functions used when processing HTTP input and output, along with some of the common Apache APIs that are used in handlers.

Legacy C/C++ Methods

For any C/C++ code interacting with web requests, code that handles strings and outputs should be checked to ensure the logic does not have any flaws.

exec sprintf snprintf fprintf
printf stdio FILE strcpy
strncpy strcat cout cin
cerr system popen stringstream
fstringstream malloc free

Request Processing

When coding within Apache, the following APIs can be used to obtain data from the HTTP request object.

headers_in ap_read_request post_read_request

Response Processing

Depending on the type of response you wish to send to the client, the following Apache APIs can be used.

headers_out ap_rprintf ap_send_error_response
ap_send_fd ap_vprintf

Cookie Processing

Cookies can be obtained from the list of request headers, or from specialized Apache functions.

headers_in headers_out ap_cookie_write
ap_cookie_write2 ap_cookie_read ap_cookie_check_string


Logging

Log messages can be implemented using custom loggers included in your module (e.g. log4cxx, boost::log, etc.), by using the Apache provided logging API, or by simply writing to standard out or standard error.

cout cerr ap_open_stderr_log
ap_log_error ap_log_perror ap_log_rerror

HTML Encoding

When you have got a handle for the HTML input or output in the C/C++ handler, the following methods can be used to ensure/check HTML encoding.

ap_unescape_all ap_unescape_url ap_unescape_url_keep2f
ap_unescape_urlencoded ap_escape_path_segment