OWASP Hatkit Datafiddler Project

Main


The Hatkit Datafiddler is a tool for performing analysis of captured http traffic. It currently consists of two main views, one table-based and one tree-based. These views allow the user to study different aspects of the http traffic, with very high degree of configurability. The tool is also meant to be a framework which can utilize existing tools analyze traffic.

It is written in Python with a Qt-based UI and uses a MongoDB database. It has a sister-project, which is the Hatkit Proxy

Getting started
First of all, visit BitBucket download page to check which is the latest release. Then get it: $ wget https://bitbucket.org/holiman/hatkit-datafiddler/downloads/hatkit_datafiddler-0.5.0.zip $ unzip hatkit_datafiddler-0.5.0.zip $ cd hatkit_datafiddler-0.5.0/ $ python datafiddler.py Datafiddler will tell you about any missing dependencies with something like this: Unfortunately, you have the following missing dependencies: * python-qt4 : Python bindings for Qt4 * pymongo : Python drivers for MongoDB Fetch them via your favourite package manager (on *nix systems. Windows is currently not endorsed). Naturally, you need a MongoDB also. MongoDB is available either from the package repositories or from MongoDB download section.

If all goes well, you should be met by this screen, where you can choose which session to use. Sessions are really just databases, but Datafiddler only lists the databases in your MongoDB which contain a  called.



Table view


The table view displays data in a 1:1 mapping, where every line in a table corresponds to one 'object' in the database. One 'object' contains the request, the response and some extra data. What is special about the Datafiddler is that what you see here is fully customizable.

Settings
If you select Settings, you will be met by the settings-window. This window gives you tools to define what is displayed in the table view to suit your current task.

"The section below is pretty technical. You don't have to know python or javascript to use this tool, Datafiddler comes with predefined expressions and views that you can use. When you learn the ropes a bit, you can just make modifications to these and you should be fine."

On the left side, there are variables. For each object which is fetched from the database, these expressions determines exactly what parts are fetched and places these parts, into python variables with the names v0 and onward. On the right side, there are the actual definitions of the columns that are shown in the table. These column definitions are python expressions which are evaluated at display-time, and are therefore able to either display a variable directly or perform operations on them and display the result.

For example, a database object stored by Hatkit Proxy always contain these fields: request response (For more details about storage format, see OWASP_Hatkit_Proxy_Project If the request part of a database object is loaded into v0, it means that v0 will contain python dictionary containing everything that concerns the request. E.g. The python expression  will be the request verb (GET/POST/FOO),while the expression   will be another python dictionary containing the request headers.

This means that this object introspection can be performed either inside the database - which is using javascript, or in the application itself, using python. Example: v0 = request.headers.Host ==> v0 : "foobar.com" v0 = request ==> v0['headers']['Host'] : "foobar.com"

Worth mentioning though, is that accessing a non-existant attribute (or member) in javascripts returns undefined: var x = {}; alert(x.foo); // alerts "undefined" alert(x.foo.bar); // yields exception While in python, a similar operation yields exception sooner: >>> x={} >>> x["foo"] Traceback (most recent call last): File " ", line 1, in KeyError: 'foo' Also, it makes sense to fetch only what is required for the kind of view that you are interested in. If you are analysing session tokens, it is less resource intensive on your machine not to fetch the html content of each response.

What to show: Database Filtering
See the database filtering tab.

How to show it
Todo

Transformers
Todo

Aggregation
Todo

Database Filtering
The table view (and the upcoming third-party view) supports database filtering. A database filter is a set of expressions which are evaluated in the database, and determines which items are returned in the result of a given query. Note: This is much like the WHERE-part of an SQL statement (on steroids).



The image above contains two tabs. One is for 'native' expressions, where the syntax is a standard javascript object notation. Example below: a filter that returns only pages where (url-)parameters were present in the request.



In the other tab, you can enter javascript expressions (a V8 engine is embedded inside the database). Simply enter a function which returns true if the object should be returned, false otherwise. Example below: a filter which only returns items where one of the request (url-) parameters were present in the html response - i.e reflected content.



You can click 'Test' to see if the filter works. Javascript errors should be visible in the mongodb console, and the number of matches should be shown in the text pane:



If you write a good filter, you can click "Save as" so you don't have to retype it next time. A good set of standard filters for different purposes is planned for inclusion at a later stage. In the meantime, if you write any handy filters you are encouraged to post them to the datafiddler mailing list.

For further information about how filtering works, and examples of operators which may be used, see the MongoDB documentation

Development
Todo

Getting the source code
Todo