GSoC2013 Ideas/OWASP ZAP CMS SCANNER

Introduction
Recent statistics show that CMS usage has grown over the last five years. WordPress and Joomla alone account for more than 6% of the top 1 million sites. Usage has grown on both corporate and personal sites, and with the chaotic development of plugins and components for those CMSs, the risk of vulnerabilities and flaws keeps increasing.



OWASP ZAP CMS SCANNER (ZAP CMSS) is a scanner with more specific search methods.

Functionalities
- Enumerating Plugins and Components and Themes in the CMS (Passive and Aggressive Search methods)

- Enumerating from page content

- Enumerating from lists (or database)

- Version Fingerprinting (with multiple methods) - Labor intensive to add signatures

- Manually locate version in files or build regexes for headers

- Built-in options to remove identifiers (e.g. meta generator)



- Enumerating Vulnerable plugins, themes and Components

- Enumerating from a well-known list

- Enumerating from web search

- Enumerating using the ZAP API

- Enumerating Security measures (firewalls, security plugins ...)



Matches
Matches are made with:

- Text strings (case sensitive)

- Regular expressions

- Google Hacking Database (GHDB) queries (limited set of keywords)

- MD5 hashes

- URL recognition

- HTML tag patterns

- Custom Java code for passive and aggressive operations

Features
- Control the trade-off between speed/stealth and reliability

- Plugins include example URLs

- Performance tuning. Control how many websites to scan concurrently

- Result certainty awareness

- Fast

- Low resource usage

- Accurate (Low FP/FN)

- Resistant to hardening/banner removal

- Super easy to support new versions/apps

ZAP CMSS modules
The ZAP CMS scanner extension consists of four main modules:

1- CMS detection module
Aims to identify which CMS the application at a given URL uses. The CMS detector uses two methods:

a- Passive search
Based on page content analysis:

1 - using text strings (case sensitive)

2 - using regex

These two methods are used to recognize HTML tags, e.g. the meta generator tag, or to extract text from the page that names the tool with which the application was built.

3 - using Google hacks (from a list of predefined keywords)
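The first two passive methods can be illustrated with a minimal sketch. The class name `PassiveCmsDetector`, the `/wp-content/` marker string and the two generator prefixes are assumptions chosen for illustration, not signatures taken from the project.

```java
import java.util.Optional;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper, not part of ZAP: passive detection from page content only.
class PassiveCmsDetector {
    // Assumes the name attribute precedes content, as in typical generator tags.
    private static final Pattern GENERATOR = Pattern.compile(
            "<meta\\s+name=[\"']generator[\"']\\s+content=[\"']([^\"']+)[\"']",
            Pattern.CASE_INSENSITIVE);

    /** Returns the raw generator string, if the page declares one. */
    static Optional<String> generator(String html) {
        Matcher m = GENERATOR.matcher(html);
        return m.find() ? Optional.of(m.group(1)) : Optional.empty();
    }

    /** Returns a CMS name, or "unknown" when no rule matches. */
    static String detect(String html) {
        // Method 1: case-sensitive text string found in the page content.
        if (html.contains("/wp-content/")) {
            return "WordPress";
        }
        // Method 2: regex over the meta generator tag.
        String gen = generator(html).orElse("");
        if (gen.startsWith("WordPress")) {
            return "WordPress";
        }
        if (gen.startsWith("Joomla")) {
            return "Joomla";
        }
        return "unknown";
    }
}
```

Because a hardened site may strip the generator tag, a real detector would combine several such rules rather than rely on any single one.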

b- Aggressive search
Tries known and unique URLs from a predefined list. The presence of one of these paths indicates the CMS in use with certainty. The drawback is that this method is not very effective when several CMSs are supported, because no single file can be checked in that case.
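A rough sketch of the aggressive search follows. The HTTP probe is abstracted behind a `Predicate<String>` so that the example stays self-contained. The class name `KnownPathProber` and the signature paths are illustrative assumptions, not the project's predefined list.

```java
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.function.Predicate;

// Hypothetical helper, not part of ZAP: probes well-known paths per CMS.
class KnownPathProber {
    // Illustrative signature list; a real list would be loaded from a file or database.
    static final Map<String, List<String>> SIGNATURE_PATHS = new LinkedHashMap<>();
    static {
        SIGNATURE_PATHS.put("WordPress", Arrays.asList("/wp-login.php", "/wp-includes/"));
        SIGNATURE_PATHS.put("Joomla", Arrays.asList("/administrator/index.php"));
    }

    /**
     * Tries each known path under baseUrl. urlExists stands in for an HTTP
     * request that reports whether the URL is present on the target.
     */
    static String detect(String baseUrl, Predicate<String> urlExists) {
        for (Map.Entry<String, List<String>> entry : SIGNATURE_PATHS.entrySet()) {
            for (String path : entry.getValue()) {
                if (urlExists.test(baseUrl + path)) {
                    return entry.getKey();
                }
            }
        }
        return "unknown";
    }
}
```

The nested loop makes the cost visible: the number of requests grows with every CMS added to the list, which is the drawback described above.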

2- Plugin enumerating module
The purpose of this module is to list the plugins used by the application at the given URL. Both passive and aggressive methods can be used:

a- Passive search: analyzing the page content using regex

b- Aggressive search: using a list of plugin names, which are compared with any names found at specific URLs, for example WordPress plugins under url + /wp-content/plugins/
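Both enumeration modes can be sketched for the WordPress case. The class name `PluginEnumerator`, the character class used for plugin names and the `urlExists` probe are assumptions made for this example.

```java
import java.util.List;
import java.util.Set;
import java.util.TreeSet;
import java.util.function.Predicate;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper, not part of ZAP: lists WordPress plugins of a target.
class PluginEnumerator {
    // Captures the directory name that follows /wp-content/plugins/.
    private static final Pattern PLUGIN_PATH =
            Pattern.compile("/wp-content/plugins/([A-Za-z0-9_-]+)/");

    /** Passive: collects plugin names referenced in the page content. */
    static Set<String> passive(String html) {
        Set<String> names = new TreeSet<>();
        Matcher m = PLUGIN_PATH.matcher(html);
        while (m.find()) {
            names.add(m.group(1));
        }
        return names;
    }

    /** Aggressive: checks each candidate name under url + /wp-content/plugins/. */
    static Set<String> aggressive(String baseUrl, List<String> candidates,
                                  Predicate<String> urlExists) {
        Set<String> found = new TreeSet<>();
        for (String name : candidates) {
            if (urlExists.test(baseUrl + "/wp-content/plugins/" + name + "/")) {
                found.add(name);
            }
        }
        return found;
    }
}
```

The passive mode only sees plugins that leave a trace in the page. The aggressive mode can find silent plugins, but at the cost of one request per candidate name.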

3- Fingerprinting module
Identifies the version of the CMS and/or plugins in use, with controllable accuracy (depending on the method used and the processing time). A combined passive/aggressive search is used, relying on:

- Content analysis of specific files, for example a readme file that contains the text "package to Version 3.0.x"

- A list/database of unique file links/paths that determine the version of the CMS/plugin

- A list/database that maps plugin version / filePath / hashDigest. Once the presence of a file whose name is unique but whose content varies between versions has been verified, its digest is compared with the one in the database, and the result indicates the component version. Here is an example of a WordPress versions database:
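Separately from that database, the digest comparison step itself could look like the following minimal sketch. The class name `HashFingerprinter`, the in-memory map and the `filePath|digest` key format are assumptions. Any paths and versions registered with it in an example are placeholders, not real signature data.

```java
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
import java.util.HashMap;
import java.util.Map;
import java.util.Optional;

// Hypothetical helper, not part of ZAP: maps (filePath, MD5 digest) to a version.
class HashFingerprinter {
    // Key format "filePath|md5Hex" is an assumption made for this sketch.
    private final Map<String, String> versions = new HashMap<>();

    void register(String filePath, String md5Hex, String version) {
        versions.put(filePath + "|" + md5Hex, version);
    }

    /** Returns the lowercase hexadecimal MD5 digest of the given bytes. */
    static String md5Hex(byte[] content) {
        try {
            MessageDigest md = MessageDigest.getInstance("MD5");
            StringBuilder hex = new StringBuilder();
            for (byte b : md.digest(content)) {
                hex.append(String.format("%02x", b & 0xff));
            }
            return hex.toString();
        } catch (NoSuchAlgorithmException e) {
            // Every Java platform is required to provide MD5.
            throw new IllegalStateException(e);
        }
    }

    /** Looks up the version whose registered digest matches the fetched file. */
    Optional<String> versionOf(String filePath, byte[] content) {
        return Optional.ofNullable(versions.get(filePath + "|" + md5Hex(content)));
    }
}
```

A caller would fetch the file from the target, pass its bytes to `versionOf`, and treat an empty result as "version not in the database" rather than as proof that the component is absent.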



4- Vulnerability Enumeration module
Lists the vulnerabilities that affect a given plugin. This module is called after the CMS detection and plugin enumeration steps, and uses:

1- a database that maps plugin name and version to a list of vulnerabilities

2- a web search based on a list of links to useful sites
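The database lookup (option 1) can be sketched as a simple in-memory index. The class name `VulnerabilityIndex` and the `name@version` key format are assumptions. Entries must be supplied by the caller, since no real vulnerability data is included here.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Locale;
import java.util.Map;

// Hypothetical helper, not part of ZAP: maps plugin name + version to vulnerabilities.
class VulnerabilityIndex {
    private final Map<String, List<String>> byPluginVersion = new HashMap<>();

    // Plugin names are compared case-insensitively; versions must match exactly.
    private static String key(String plugin, String version) {
        return plugin.toLowerCase(Locale.ROOT) + "@" + version;
    }

    void add(String plugin, String version, String vulnerability) {
        byPluginVersion
                .computeIfAbsent(key(plugin, version), k -> new ArrayList<>())
                .add(vulnerability);
    }

    /** Returns the known vulnerabilities, or an empty list when none are recorded. */
    List<String> lookup(String plugin, String version) {
        return byPluginVersion.getOrDefault(key(plugin, version), Collections.emptyList());
    }
}
```

Exact version matching keeps the sketch short. A real index would probably need version ranges, since advisories usually cover every release up to a fixed one.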

Moving to a WebApp Scanner
After further research into fingerprinting techniques and progress on the detailed design, it became apparent that it is more appropriate and useful to go wider and work on web applications in general, not only CMSs. I spoke with Simon about it and he was pleased with the proposal, so I started implementing the core methods of the web application fingerprinter, based on existing tools such as BlindElephant and Wappalyzer.

Passive search

Passive search looks in the target application (HTML, HTTP header content, ...) for patterns that determine, as either likely or definite, the name and version of the technology used to build the application. Passive search changes nothing in the requests or in the content.
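As a minimal sketch of passive search over HTTP headers, the example below reads `product/version` tokens from the `Server` and `X-Powered-By` response headers without sending anything extra to the target. The class name `PassiveHeaderFingerprinter` and the choice of these two headers are assumptions, and header names are assumed to arrive with exactly this casing.

```java
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper, not part of ZAP: reads technology hints from response headers.
class PassiveHeaderFingerprinter {
    // A product token optionally followed by "/version", e.g. "Apache/2.2.22".
    private static final Pattern PRODUCT =
            Pattern.compile("([A-Za-z][A-Za-z0-9_.-]*)(?:/([0-9][0-9A-Za-z.]*))?");

    private static final List<String> HEADERS = Arrays.asList("Server", "X-Powered-By");

    /**
     * Maps each technology name found to its version, or to "" when the
     * header names the product without a version.
     */
    static Map<String, String> fingerprint(Map<String, String> responseHeaders) {
        Map<String, String> found = new LinkedHashMap<>();
        for (String header : HEADERS) {
            String value = responseHeaders.get(header);
            if (value == null) {
                continue;
            }
            // Only the first product token of each header is considered.
            Matcher m = PRODUCT.matcher(value);
            if (m.find()) {
                found.put(m.group(1), m.group(2) == null ? "" : m.group(2));
            }
        }
        return found;
    }
}
```

A name with a version can be reported as a definite match and a name without one as only likely, which is one simple way to express the certainty levels mentioned above.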