Testing for Cookie and Session Token Manipulation

Brief Summary
..here: we describe in "natural language" what we want to test.

Description of Issue
Cookies are used to implement session management and are described in detail in RFC 2965. In a nutshell, when a user accesses an application which needs to keep track of the actions and identity of that user across multiple requests, a cookie (or more than one) is generated by the server and sent to the client, which will send it back to the server in all following connections until the cookie expires or is destroyed. The data stored in the cookie can provide to the server a large spectrum of information about who the user is, what action has performed so far, what are his/her preferences, etc. therefore providing a state to a stateless protocol like HTTP.

A typical example is provided by an online shopping cart: along the whole session of a user, the application must keep track of its identity, its profile, the products that he/she has chosen to buy, the quantity, the individual prices, discounts, etc. Cookies are an efficient way to store and pass this information back and forth (other methods are URL parameters and hidden fields).

Due to the importance of the data that they store, cookies are therefore vital in the overall security of the application. Being able to tamper with cookies may result in hijacking the sessions of legitimate users, gaining higher privileges in an active session and more in general influencing the operations of the application in an unauthorized way. In this test we have to check whether the cookies issued to clients can resist to a wide range of attacks aimed to interphere with the sessions of legitimate users and with the application itself. The overall goal is to be able to forge a cookie that will be considered valid by the application and that will provide some kind of unauthorized access (session hijacking, privilege escalation, ...). Usually the main steps of the attack pattern are the following:
 * cookie collection: collection of a sufficient number of cookie samples;
 * cookie reverse engineering: analysis of the cookie generation algorithm;
 * cookie manipulation: forging of a valid cookie in order to perform the attack. This last step might require a large number of attempts, depending on how the cookie is created (cookie brute-force attack).

Another pattern of attack consists of overflowing a cookie. Strictly speaking, this attack has a different nature, since here we are not trying to recreate a perfectly valid cookie. Instead, our goal is to overflow a memory area, interphering with the correct behavior of the application and possibly injecting (and remotely executing) malicious code.

Black Box Testing and Examples
Cookie collection

The first step required in order to manipulate the cookie is obviously to understand how the application creates and manages cookies. For this task, we have to try to answer the following questions:

Surf the application. Note down when cookies are created. Make a list of received cookies, the page that sets them (with the set-cookie directive), the domain for which they are valid, their value and characteristics. Surfing the application, find which cookies remain constant and which get modified. What events modify the cookie ? Find out which parts of the application need a cookie. Access a page, then try again without the cookie, or with a modified value of it. Try to map which cookies are used where.
 * How many cookies are used by the application ?
 * Which parts of the the application generate and/or modify the cookie ?
 * Which parts of the application require this cookie in order to be accessed and utilized?

A spreadsheet mapping each cookie to the corresponding application parts and the related information can be a valuable output of this phase.

Cookie reverse engineering

Now that we have enumerated the cookies and have a general idea of their use, it's time to have a deeper look at cookies that seem interesting. What are we interested in? Well, a cookie, in order to provide a secure method of session management, must combine together several characteristics, each of which is aimed to protect the cookie from a different class of attacks. These characteristics are summarized below:
 * 1) Unpredictability: a cookie must contain some amount of hard to guess data. The harder it is to forge a valid cookie, the harder is to break into legitimate users' session. If an attacker can guess the cookie used in an active session of a legitimate user, he/she will be able to fully impersonate that user (session hijacking). In order to make a cookie unpredictable, random values and/or criptography can be used
 * 2) Tamper resistance: a cookie must resist to malicious attempts of modification. If we receive a cookie like IsAdmin=No, it is trivial to modify it to get administrative rights, unless the application performs a double check (for instance appending to the cookie an encrypted hash of its value)
 * 3) Expiration: a critical cookie must be valid only for an appropriate period of time and must be deleted from disk/memory afterwards, in order to avoid the risk of being replayed. This does not apply to cookie that store non-critical data that needs to be remembered across sessions (e.g.: site look-and-feel)
 * 4) “Secure” flag: a cookie whose value is critical for the integrity of the session should have this flag enabled, in order to allow its transmission only in an encrypted channel to deter eavesdropping.

The approach here is to collect a sufficient number of instances of a cookie and start looking for patterns in their value. The exact meaning of “sufficient” can vary from a handful of samples if the cookie generation method is very easy to break to several thousands if we need to proceed with some mathematical analysis (e.g.: chi-squares, attractors, ..., see later).

It is important to pay particular attention to the workflow of the application, as the state of a session can have a heavy impact on collected cookies: a cookie collected before being authenticated can be very different from a cookie obtained after the authentication.

Another aspect to keep into consideration is time: always record the exact time when a cookie has been obtained, when there is the doubt (or the certainty) that time plays a role in the value of the cookie (the server could use a timestamp as part of the cookie value). The time recorded could be the local time or the server's timestamp included in the HTTP response (or both).

Analyzing the collected values, try to figure out all variables that could have influenced the cookie value and try to vary them one at the time. Passing to the server modified versions of the same cookie can be very helpful in understanding how the application reads and precesses the cookie.

Examples of checks to be performed at this stage include:
 * What character set is used in the cookie ? Has the cookie a numeric value ? Alphanumeric ? Hexadecimal ? What happens inserting in a cookie characters that do not belong to the expected charset ?
 * Is the cookie composed of different sub-parts carrying different pieces of information ? How are the different parts separated ? With which delimiters ? Some parts of the cookie could have a higher variance, others might be constant, others could assume only a limited set of values. Breaking down the cookie to its base components is the first and fundamental step. An example of an easy-to-spot structured cookie is the following:

ID=5a0acfc7ffeb919:CR=1:TM=1120514521:LM=1120514521:S=j3am5KzC4v01ba3q

In this example we see 5 different fields, carrying different types of data:

ID – hexadecimal CR – small integer TM and LM – large integer. (And curiously they hold the same value. Worth to see what happens modifying one of them) S – alphanumeric

Even when no delimiters are used, having enough samples can help. As an example, let's see the following series:

0123456789abcdef

=
=== 1 323a4f2cc76532gj 2 95fd7710f7263hd8 3 7211b3356782687m 4 31bbf9ee87966bbs

We have no separators here, but the different parts start to show up. We seem to have a 2-digit decimal number (columns #0 and #1), a 7-digit hexadecimal number (#2-#8), a constant “7” (#9), a 3-digit decimal number (#a-#c) and a 3-character string (#d-#f). There are still some shades, tho: the first column is always odd, so maybe it's a value of its own where the least significant bit is always 1. Or maybe the first 9 columns are just one hexadecimal value. Collecting a few more samples will quickly answer our last questions.


 * Does the cookie name provide some hints about the nature of data it stores? As hinted before, a cookie named “IsAdmin” would be a great target to play with
 * Does the cookie (or its parts) seem to be encoded/encrypted? A 16 bytes long pseudo-random value could be a sign of a MD5 hash. A 20 bytes value could indicate a SHA-1 hash. A string of seemingly random alphanumeric characters could actually hide a base64 encoding that can be easily reversed using WebScarab or even a simple Perl script. A cookie whose value is “YWRtaW46WW91V29udEd1ZXNzTWU=” would translate into a more friendly “admin:YouWontGuessMe”. Another option is that the value has been obfuscated XORing it with some string.
 * What data is included in the cookie? Example of data that can be stored in the cookie include: username, password, timestamp, role (e.g.: user, admin,...), source IP address. It is important at this stage to distinguish which pieces of information have a deterministic value and which have a random nature.
 * If the cookie contains information about the source IP address, is it a corresponding check enforced server side? What happenes changing, inside the same session, the IP address with which we contact the server? Is the request rejected?
 * Does the cookie contain information about the application workflow? A cookie named “FailedLoginAttemps” could trigger an account logout. Being able to change its value keeping it to zero could allow a brute-force attack against one or more accounts.
 * In case of numeric values, what are their boundaries? In the previous example, CR can probably hold a very limited set of values, while TM and LM use a much broader space. Can a field contain a negative number? If not, what happens forcing a negative number in it ? Is it possible to guess how many bytes are allocated for the value? If a cookie seems to assume values between 0 and 65535 only, then probably it is stored in an unsigned 2-bytes variable. What happens trying to overflow it ? If the cookie holds a string, how long can it be?
 * If we start multiple separate sessions, how do the delivered cookies change? Let's say that we login 5 times in a row and we receive the following cookies:

id=7612542756:cnt=a5c8:grp=0 id=7612542756:cnt=a5c9:grp=0 id=7612542756:cnt=a5ca:grp=0 id=7612542756:cnt=a5cb:grp=0 id=7612542756:cnt=a5cd:grp=0


 * As we can see, we have two constant fields (“id” and “grp”) that probably identify us, so these parts are unlikely to change in future attempts. A third field (“cnt”) changes, however, and looks like a hexadecimal 2-bytes counter. Between the 4th and the 5th cookie however we see that we have missed a value, meaning that probably someone else logged in.
 * Does the cookie have an expiration time? Is it enforced server side (in order to do this check you can simply modify the set-cookie directive on the fly to indicate a much longer validity period and see whether the server respects it)? Enforcing of expiration times is extremely important as a defence against reply attacks.

If the cookie has authentication purposes, it is better to have at least 2 different users, in order to check how the cookie varies when belonging to different accounts. Sometimes, a cookie generation algorithm uses only deterministic values and once we have understood the algorithm logic we can easily forge a valid cookie. But sometimes things get more complex and a cookie (or parts of it) is generated by algorithms that do not let us easily forge valid cookies with a single attempt. For instance, a cookie might include a pseudo-random value. Another example is the use of encryption or hashing functions. Let's have a look at the following 5 cookies:

1: c75918d4144fc122975590ffa48627c3b1f01bb1 2: 9ec985ef773e19bab8b43e8ad7b6b4d322b5e50d 3: d49e0a658b323c4d7ee888275225b4381b70475c 4: 9ddc4dc3900890cf9c22c7b82fa3143a56b17cf6 5: fb000aa881948bffbcc01a94a13165fece3349c2

Is there any easy-to-spot generation algorithm? Except for the fact that they are all 20 bytes long, there is not much to be said. But they happen to be the SHA-1 hash of the five cookies of the previous example, which varied only by a 2-bytes counter. Therefore, they can assume only 65536 (216) different values, which is not a tiny number but still a lot less than the 2160 possible values of a SHA-1 hash. More precisely, we have reduced the cookie space of 2.23e+43 (2144) times.

The only way to spot this behavior of course would be to collect enough cookies, and a simple perl script would be enough for the task. Also WebScarab and CookieDigger provide very efficient and flexible cookie collection and analysis tools. Once we know that this cookie can assume only a very limited set of values, we now know that an impersonation attack against an active user has much higher chances to succeed than what would appear at first sight. We only have to change the user id and generate the 65536 corresponding possible hashed cookies.

More in general, a seemingly random cookie can be less random than it seems, and collecting a high number of cookies can provide valuable information about which values are more likely to be used, revealing hidden properties that could make guessing a valid cookie a viable attack. How many cookies are needed to perform such an analysis is a function of a high number of factors:
 * Algorithm resistance to pattern discovery
 * Computing resources that are available for the analysis
 * Time needed to collect a single cookie

Once enough samples have been collected, it's time to look for patterns: for example, some characters might be more frequent than others, and another perl script may be well enough to discover that.

There are some statistical methods that can help in finding patterns in apparently random numbers. A detailed discussion of these methods is outside the scope of this paper, but a few approaches are the following:
 * Strange Attractors and TCP/IP Sequence Number Analysis http://www.bindview.com/Services/Razor/Papers/2001/tcpseq.cfm
 * Correlation Coefficient - http://mathworld.wolfram.com/CorrelationCoefficient.html
 * ENT - http://fourmilab.ch/random/

If the cookie seems to have some kind of time dependency, a good approach is to collect a large amount of samples in a short time, in order to see whether it is possible to reduce (or almost eliminate) the time impact when guessing “nearby” cookies.

Cookie manipulation

Once you have squeezed out as much information as possible from the cookie, it is time to start to modify it. The methodologies here heavily depend on the results of the analysis phase, but we can provide some examples:

Example 1: cookie with identity in clear text

In figure 1 we show an example of cookie manipulation in an application that allows subscribers of a mobile telecom operator to send MMS messages via Internet. Surfing the application using OWASP WebScarab or BurpProxy we can see that after the authentication process the cookie msidnOneShot contains the sender’s telephone number: this cookie is used to identify the user for the service payment process. However, the phone number is stored in clear and is not protected in any way. Thus, if we modify the cookie from msidnOneShot=3*******59 to msidnOneShot=3*******99, the mobile user who owns the number 3*******99 will pay the MMS message!

Figure 1 - Example of Cookie with identy in clear text

Source: A Case Study of a Web Application Vulnerability - Matteo Meucci: http://www.owasp.org/docroot/owasp/misc/OWASP-Italy-MMS-Spoofing.zip

Example 2: guessable cookie 

An example of a cookie whose value is easy to guess and that can be used to impersonate other users can be found in OWASP WebGoat, in the “Weak Authentication cookie” lesson. In this example, you start with the knowledge of two username/password couples (corresponding to the users 'webgoat' and 'aspect'). The goal is to reverse engineer the cookie creation logic and break into the account of user 'alice'. Authenticating to the application using these known couples, you can collect the corresponding authentication cookies. In table 1 you can find the associations that bind each username/password couple to the corresponding cookie, together with the login exact time.

Table 1: Cookie collections

First of all, we can note that the authentication cookie remains constant for the same user across different logons, showing a first critical vulnerability to replay attacks: if we are able to steal a valid cookie (using for example a XSS vulnerabilty), we can use it to hijack the session of the corresponding user without knowing his/her credentials. Additionally, we note that the “webgoat” and “aspect” cookies have a common part: “65432u”. “65432” seems to be a constant integer. What about “u” ? The strings “webgoat” and “aspect” both end with the “t” letter, and “u” is the letter following it. So let's see the letter following each letter in “webgoat”:

1st char: “w” + 1 =“x” 2nd char: “e” + 1 = “f” 3rd char: “b” + 1 = “c” 4th char: “g” + 1= “h” 5th char: “o” + 1= “p” 6th char: “a” + 1= “b” 7th char: “t” + 1 = “u”

We obtain “xfchpbu”, which inverted gives us exactly “ubphcfx”. The algorithm fits perfectly also for the user 'aspect', so we only have to apply it to user 'alice', for which the cookie results to be “65432fdjmb”. We repeat the authentication to the application providing the “webgoat” credentials, substitute the received cookie with the one that we have just calculated for alice and…Bingo! Now the application identifies us as “alice” instead of “webgoat”.

Brute force

The use of a brute force attack to find the right authentication cookie, could be an heavy time consuming technique. FoundStone CookieDigger can help to collect a large number of cookies, giving the average length and the character set of the cookie. In advance, the tool compares the different values of the cookie to check how many characters are changing for every subsequent login. If the cookie values does not remain the same on subsequent logins, CookieDigger gives the attacker longer periods of time to perform brute force attempts. In the following table we show an example in which we have collected all the cookies from a public site, trying 10 authentication attempts. For every type of cookie collected you have an estimate of all the possible attempts needed to “brute force” the cookie.

Table 2: An example of CookieDigger report

Overflow

Since the cookie value, when received by the server, will be stored in one or more variables, there is always the chance of performing a boundary violation of that variable. Overflowing a cookie can lead to all the outcomes of buffer overflow attacks. A Denial of Service is usually the easiest goal, but the execution of remote code can also be possible. Usually, however, this requires some detailed knowledge about the architecture of the remote system, as any buffer overflow technique is heavily dependent on the underlying operating system and memory management in order to correctly calculate offsets to properly craft and align inserted code.

Example: http://seclists.org/lists/fulldisclosure/2005/Jun/0188.html

Gray Box Testing and Examples
When the analysis is performed with a White-box approach, the nature of the aspects of the cookie that we have to investigate do not vary. We still have to measure how easy is for an attacker to create a valid cookie. However, since we have full visibility on how the cookie is created, we do not have to put any effort on reverse-engineer the cookie creation algorithm. Instead, we can directly delve in the creation mechanics, and assess whether all best practices are used or not.

Most of the questions that we tried to answer in the black-box section. Exactly which ones is left as an exercise to the reader. Additionally, the following aspects can be analyzed.

Cryptographic algorithms: if one (or more) part of the cookie uses cryptographic algorithms, check that these algorithms are safe. Secret algorithms are to be considered not secure, as they have not been properly reviewed by the community. Algorithms that are considered secure include:


 * Symmetric algorithms:
 * 3DES
 * AES
 * Blowfish
 * Serpent
 * Twofish
 * Asymmetric algorithms:
 * RSA
 * Hashing algorithms:
 * SHA-256
 * SHA-512
 * SHA-1 (even if some attacks have been recently discovered, we still can consider it secure enough)

Examples of insecure algorithms include DES, RC2, MD2, MD4

If secure cryptographic algorithms are used, are keys long enough? Any encryption performed with a secret key that is shorter than 128 bits cannot be considered secure. If public key cryptography is used during cookie generation and handling, are secret keys properly secured and protected? For instance, if every cookie is signed by the server, stealing the corresponding secret key would allow the attacker to easily forge valid cookies.

Are cryptographic algorithms properly implemented? An algorithm can be perfectly secure, but it could be implemented badly. Use of public cryptographic libraries instead of home-made ones is strongly recommended.

If random data is included, is the entropy pool large enough? When using random data (that can be included directly in the cookie or used in some cryptographic algorithms to generate keys) it is of foremost importance that the PRNG (Pseudo Random Number Generator) is fed with enough entropy, to minimize predicibility. Usually, entropy is gathered from hard to predict events like keystrokes, mouse movements, I/O operations. A good reference for assessing whether enough entropy is used when creating random values is RFC 1750 “Randomness Recommendations for Security”.

Is the cookie content properly checked when received by the server? As a general rule, no client supplied data should be trusted, and cookies are no exception. Each piece of information that is enclosed in the cookie should be checked depending on its nature and goal. The goal is to protect both legitimate sessions and the application itself. For instance, if an integer value is expected, the application should control that this constraint is respected, before trying to utilize that value. If a value is used to convey some critical information about the client, the application should be able to detect whether that value has been tampered with. Another important check is size: no cookie (or part of it) should be allowed to overflow data structures of the application, as it might lead to stack or heap based buffer overflow attacks. Such attacks have been rarely applied to cookies, as a cookie is usually too small to provide enough space to a working shellcode. However, overflowing a variable can be enough to perform a denial of service attack, and corresponding check should always be properly put in place.