Test File Extensions Handling for Sensitive Information (OTG-CONFIG-003)

Brief Summary
File extensions are commonly used in web servers to easily determine which technologies / languages / plugins must be used to fulfil the web request.

While this behavior is consistent with RFCs and Web Standards, using standard file extensions provides the pentester useful information about the underlying technologies used in a web appliance and greatly simplifies the task of determining the attack scenario to be used on particular technologies.

In addition, misconfiguration in web servers could easily reveal confidential information about access credentials.

Description of the Issue
Determining how web servers handle requests corresponding to files having different extensions may help us to understand web server behaviour depending on the kind of files we try to access. For example, it can help us understand which file extensions are returned as text/plain versus those which cause execution on the server side. The latter are indicative of technologies / languages / plugins which are used by web servers or application servers, and may provide additional insight on how the web application is engineered. For example, a “.pl” extension is usually associated with server-side Perl support (though the file extension alone may be deceptive and not fully conclusive; for example, Perl server-side resources might be renamed to conceal the fact that they are indeed Perl related). See also next section on “web server components” for more on identifying server side technologies and components.

Black Box testing and example
Submit http[s] requests involving different file extensions and verify how they are handled. These verifications should be on a per web directory basis.

Verify directories which allow script execution. Web server directories can be identified by vulnerability scanners, which look for the presence of well-known directories. In addition, mirroring the web site structure allows us to reconstruct the tree of web directories served by the application.

If the web application architecture is load-balanced, it is important to assess all of the web servers. This may or may not be easy, depending on the configuration of the balancing infrastructure. In an infrastructure with redundant components there may be slight variations in the configuration of individual web / application servers; this may happen, for example, if the web architecture employs heterogeneous technologies (think of a set of IIS and Apache web servers in a load-balancing configuration, which may introduce slight asymmetric behaviour between themselves, and possibly different vulnerabilities).

'Example:

We have identified the existence of a file named connection.inc. Trying to access it directly gives back its contents, which are:

<?  	mysql_connect("127.0.0.1", "root", "") or die("Could not connect"); ?>

We determine the existence of a MySQL DBMS back end, and the (weak) credentials used by the web application to access it. This example (which occurred in a real assessment) shows how dangerous can be the access to some kind of files.

The following file extensions should NEVER be returned by a web server, since they are related to files which may contain sensitive information, or to files for which there is no reason to be served.


 * .asa
 * .inc

The following file extensions are related to files which, when accessed, are either displayed or downloaded by the browser. Therefore, files with these extensions must be checked to verify that they are indeed supposed to be served (and are not leftovers), and that they do not contain sensitive information.


 * .zip, .tar, .gz, .tgz, .rar, ...: (Compressed) archive files
 * .java: No reason to provide access to Java source files
 * .txt: Text files
 * .pdf: PDF documents
 * .doc, .rtf, .xls, .ppt, ...: Office documents
 * .bak, .old and other extensions indicative of backup files (for example: ~ for Emacs backup files)

The list given above details only a few examples, since file extensions are too many to be comprehensively treated here. Refer to http://filext.com/ for a more thorough database of extensions.

To sum it up, in order to identify files having a given extensions, a mix of techniques can be employed, including: Vulnerability Scanners, spidering and mirroring tools, manually inspecting the application (this overcomes limitations in automatic spidering), querying search engines (see Testing: Spidering and googling). See also Testing for Old, Backup and Unreferenced Files which deals with the security issues related to "forgotten" files.

Gray Box testing and example
Performing white box testing against file extensions handling amounts to checking the configurations of web server(s)/application server(s) taking part in the web application architecture, and verifying how they are instructed to serve different file extensions.

If the web application relies on a load-balanced, heterogeneous infrastructure, determine whether this may introduce different behaviour.