Difference between revisions of "Category:OWASP Favicon Database Project"

From OWASP
(42 intermediate revisions by 2 users not shown)
Line 3: Line 3:
 
= Idea  =
 
= Idea  =
  
The idea is to enumerate software via favicon.ico. How? Take the MD5 of favicon.ico and compare it against a database of known hashes. This article describes building an MD5 database of the most popular/frequent favicon.ico files.
+
The idea is to enumerate software via favicon.ico. How? Take a hash (in our case MD5) of favicon.ico and compare it against a database of known hashes. This project covers the favicon database itself and the process of building a database of the most frequent favicons by crawling the Internet.
  
I wrote an .nse script for nmap that enumerates software via favicon.ico. I noticed that the existing database of favicon.ico MD5 fingerprints is very small, and that most current MD5 fingerprinting implementations cover only web servers, so I also added some popular CMSs, wikis, etc. I added some of them manually, but that is a tedious process. Fyodor suggested that we do an Internet-wide scan (nmap -iR 0), gather statistics and MD5 fingerprints of the most common favicon.ico files, and document them.
+
= Problem and solution =
  
= Problem  =
+
The project set out to gather statistics on the MD5 fingerprints of the most common favicon.ico files. The problem we faced was how to enumerate HTTP(S) hosts on the Internet. So far we have identified two types of HTTP servers that we want to cover: the first is HTTP servers on network devices and appliances, and the second is ordinary web servers with virtual host support.
  
So, I set out to gather statistics on the MD5 fingerprints of the most common favicon.ico files. The problem I faced was how to enumerate HTTP(S) hosts on the Internet. I identified two types of HTTP servers that I want to cover: the first is HTTP servers on network devices and appliances, and the second is ordinary web servers with virtual host support.
+
You can read about the process, problem, and solution on [[OWASP_favicon_database_crawl]].
  
The first type is more or less straightforward to cover (with nmap -p80,443 -iR). The second type is problematic, because there is no direct way to learn all the virtual hosts behind a given IP address (ignoring services like mynetworkneighborhood or live.msn.com, which only know what they have crawled), so nmap -iR alone will not work.
+
= Results =
  
One idea was to extract all links from Wikipedia. In my view, Wikipedia is not a good source: it has started removing http:// references from regular articles, and only the top 5 (or some other number) links are kept under External references. I did a little research and found that the best source would be the Open Directory Project (DMOZ). Interestingly, DMOZ publishes XML files of its whole directory here, in a convenient format.
+
[[OWASP_favicon_database]] - favicon database in wiki format; '''feel free to contribute directly to the wiki'''
  
= Solution  =
+
[[File:Favicon-md5-20100222.zip]] - Favicon MD5 database of most popular favicons found on the Internet - latest
  
Note that I did not want to gather only from DMOZ or only via nmap -iR. With DMOZ alone, I would miss favicons from network devices and appliances, as they are usually not listed in DMOZ. With nmap -iR alone, I would miss virtual hosts, as there is no easy way to enumerate all virtual hosts behind a specific IP. So I am doing both, because I want to cover all possible cases.
+
[[File:Favicon-md5-20090925.zip]] - Favicon MD5 database of most popular favicons found on the internet - archive
  
== Solution of gathering via nmap -iR  ==
+
= Implementations =  
 +
[http://nmap.org Nmap] http-favicon.nse: NSE script for MD5 favicon fingerprinting
  
As an example, this solution gathers all favicons from port 80.
+
[http://www.openvas.org OpenVAS]  webserver_favicon.nasl: NASL NVT for MD5 favicon fingerprinting
  
Gather the data using modified version of favicon.nse:
+
[http://w3af.sourceforge.net w3af] favicon_identification.py: w3af plugin for MD5 favicon fingerprinting
<pre>nmap -v -sT -iR 0 -p80 -n -PN --script=http-favicon-get.nse -oN nmap-p80-ir-favicon </pre>
+
Extract only MD5 and IP from nmap output:  
+
<pre>grep -i "http-favicon.*Unknown" nmap-p80-ir-favicon | awk -F':' '{print $4,",",$2; }' &gt; content-p80.md5.url
+
</pre>
+
Display sorted list of most frequent MD5 and last IP:
+
<pre>./get-favicon-md5-count.pl &lt; content-p80.md5.url | sort -r -n | less </pre>
+
...and that's it. But if you're brave enough, you can do it all at once (note that the -iR count must then be specified):
+
<pre>nmap -v -sT -iR 100000 -p80 -n -PN --script=http-favicon-get.nse | grep -i "http-favicon.*Unknown" | awk -F':' '{print $4,",",$2; }' | ./get-favicon-md5-count.pl | sort -r -n | less
+
</pre>
+
== Solution of gathering via DMOZ  ==
+
  
Grab the XML file from DMOZ, extract URLs, make them unique and store them in content.url (URL per line):
+
= Roadmap =
<pre>wget -o /dev/null -O - http://rdf.dmoz.org/rdf/content.rdf.u8.gz | gunzip -dc | ./dmoz-extract-urls.pl -b -f - | sort | uniq &gt; content.url </pre>
+
For each URL get MD5 of favicon (if found) and write it to content.md5.url:
+
<pre>./get-favicon-md5.rb &lt; content.url &gt; content.md5.url
+
</pre>
+
Note that the Perl equivalent of this script did not work due to broken threads in Perl (even in Perl 5.10), and I badly need threads for this (performance!).
+
  
Display sorted list of most frequent MD5 and last URL:
+
The project roadmap is available on the [[OWASP_favicon_database_roadmap]] page.
<pre>./get-favicon-md5-count.pl &lt; content.md5.url | sort -r -n | less </pre>
+
...and that's it. But if you're brave enough, you can do it all at once:
+
<pre>wget -o /dev/null -O - http://rdf.dmoz.org/rdf/content.rdf.u8.gz | gunzip -dc | ./dmoz-extract-urls.pl -b -f - | sort | uniq | ./get-favicon-md5.rb | ./get-favicon-md5-count.pl | sort -r -n | less </pre>
+
== Notes  ==
+
  
I have limited the gathering scripts to fetch favicon.ico only from the server root (i.e. /favicon.ico), so the scripts do not parse HTML to find the favicon location. The reason is simplicity.
+
= Feedback and Participation  =
  
= Related  =
+
We hope you find the information in the OWASP Favicon Database project useful. Please contribute back to the project by sending your comments, questions, and suggestions to the OWASP Favicon mailing list. Thanks!
  
Original scripts and files can be found at [http://kost.com.hr/favicon.php].
+
You can contribute by editing the database on the wiki itself: [[OWASP_favicon_database]]. You can also contribute via [http://www.twitter.com Twitter] by sending the MD5, favicon name, and version (if possible) to [http://www.twitter.com/OWASPfavicon @OWASPfavicon].
  
= Feedback and Participation  =
+
To join the OWASP Favicon Database mailing list or view the archives, please visit the [https://lists.owasp.org/mailman/listinfo/owasp-favicon-database mailing list subscription page].
  
We hope you find the information in the OWASP Favicon Database project useful. Please contribute back to the project by sending your comments, questions, and suggestions to the OWASP Favicon mailing list. Thanks!
+
= Related  =
  
To join the OWASP Favicon Database mailing list or view the archives, please visit the [https://lists.owasp.org/mailman/listinfo/owasp-favicon-database mailing list subscription page].
+
* Original scripts and files can be found at [http://kost.com.hr/favicon.php http://kost.com.hr/favicon.php]
 +
* Nmap favicon poster project can be found at [http://nmap.org/favicon/ http://nmap.org/favicon/]
 +
* [http://www.nessus.org Nessus] webserver_favicon.nasl: NASL plugin for MD5 favicon fingerprinting
  
==== Project Identification ====
+
==== Project About ====
  
{{Template:OWASP Project Identification Tab
+
{{:Projects/OWASP Favicon Database Project | Project About}}  
| project_name = OWASP Favicon Database Project
+
| project_description = Software enumeration via favicon.ico
+
| project_license =
+
| leader_name = Vlatko Kosturjak
+
| leader_email = kost@linux.hr
+
| leader_username =
+
| maintainer_name = Vlatko Kosturjak
+
| maintainer_email = kost@linux.hr
+
| maintainer_username = 
+
| contributor_name1 = Fyodor
+
| contributor_email1 =
+
| contributor_username1 = 
+
| contributor_name2 = Brandon Enright
+
| contributor_email2 =
+
| contributor_username2 =
+
| contributor_name3 = Kris Katterjohn
+
| contributor_email3 =
+
| contributor_username3 =
+
| contributor_name4 =
+
| contributor_email4 =
+
| contributor_username4 =
+
| contributor_name5 =
+
| contributor_email5 =
+
| contributor_username5 =
+
| contributor_name6 =
+
| contributor_email6 =
+
| contributor_username6 =
+
| contributor_name7 =
+
| contributor_email7 =
+
| contributor_username7 =
+
| contributor_name8 =
+
| contributor_email8 =
+
| contributor_username8 =
+
| contributor_name9 =
+
| contributor_email9 =
+
| contributor_username9 =
+
| contributor_name10 =
+
| contributor_email10 =
+
| contributor_username10 = 
+
| pamphlet_link =
+
| presentation_link =
+
| mailing_list_name = owasp-favicon-database
+
| links_url1 =  http://kost.com.hr/favicon.php
+
| links_name1 = favicon.ico enumeration project
+
| links_url2 = http://seclists.org/nmap-dev/2009/q3/0462.html
+
| links_name2 = favicon survey script
+
| links_url3 =
+
| links_name3 =
+
| links_url4 =
+
| links_name4 =
+
| links_url5 =
+
| links_name5 =
+
| links_url6 =
+
| links_name6 =
+
| links_url7 =
+
| links_name7 =
+
| links_url8 =
+
| links_name8 =
+
| links_url9 =
+
| links_name9 =
+
| links_url10 =
+
| links_name10 =
+
| project_road_map =
+
| project_health_status =
+
| current_release_name = First Release
+
| current_release_date =
+
| current_release_download_link =
+
| current_release_rating =
+
| current_release_leader_name = Vlatko Kosturjak
+
| current_release_leader_email = kost@linux.hr
+
| current_release_leader_username =
+
| current_release_details = :Category:OWASP Favicon Database Project - First Release
+
| last_reviewed_release_name =
+
| last_reviewed_release_date =
+
| last_reviewed_release_download_link =
+
| last_reviewed_release_rating =
+
| last_reviewed_release_leader_name =
+
| last_reviewed_release_leader_email =
+
| last_reviewed_release_leader_username =
+
| old_release_name1 =
+
| old_release_date1 =
+
| old_release_download_link1 =
+
| old_release_name2 =
+
| old_release_date2 =
+
| old_release_download_link2 =
+
| old_release_name3 =
+
| old_release_date3 =
+
| old_release_download_link3 =
+
| old_release_name4 =
+
| old_release_date4 =
+
| old_release_download_link4 =
+
| old_release_name5 =
+
| old_release_date5 =
+
| old_release_download_link5 =
+
}}  
+
  
 
__NOTOC__ <headertabs />  
 
__NOTOC__ <headertabs />  
  
 
[[Category:OWASP_Project|Favicon Database Project]] [[Category:OWASP_Tool]] [[Category:OWASP_Alpha_Quality_Tool]]
 
[[Category:OWASP_Project|Favicon Database Project]] [[Category:OWASP_Tool]] [[Category:OWASP_Alpha_Quality_Tool]]

Revision as of 11:33, 24 June 2011

Main


The idea is to enumerate software via favicon.ico. How? Take a hash (in our case MD5) of favicon.ico and compare it against a database of known hashes. This project covers the favicon database itself and the process of building a database of the most frequent favicons by crawling the Internet.
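The hash-and-compare step can be sketched in a few lines of Python. This is only an illustration, not the project's own scripts: the database entry, the `KNOWN_FAVICONS` name and both function names are hypothetical, and a real lookup would load the published favicon MD5 database instead.

```python
import hashlib

# Hypothetical sample database: MD5 hex digest -> software name.
# Real entries would come from the project's favicon database.
KNOWN_FAVICONS = {
    hashlib.md5(b"example-icon-bytes").hexdigest(): "ExampleCMS (hypothetical)",
}


def favicon_md5(data: bytes) -> str:
    """Return the MD5 hex digest of the raw favicon.ico bytes."""
    return hashlib.md5(data).hexdigest()


def identify(data: bytes, database=KNOWN_FAVICONS) -> str:
    """Look the favicon's MD5 up in the database; 'Unknown' if absent."""
    return database.get(favicon_md5(data), "Unknown")
```

Calling `identify()` with the bytes of a downloaded /favicon.ico returns the matching software name, or "Unknown" when the fingerprint is not yet in the database.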

The project set out to gather statistics on the MD5 fingerprints of the most common favicon.ico files. The problem we faced was how to enumerate HTTP(S) hosts on the Internet. So far we have identified two types of HTTP servers that we want to cover: the first is HTTP servers on network devices and appliances, and the second is ordinary web servers with virtual host support.
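Gathering the statistics amounts to counting how often each MD5 appears across the crawled hosts. The following is a minimal sketch under stated assumptions: it assumes input lines of the form "md5 , source" (source being an IP or URL), and the `count_md5` function is hypothetical, not the project's own counting script.

```python
from collections import Counter


def count_md5(lines):
    """Tally MD5 fingerprints from 'md5 , source' lines (assumed format).

    Returns (count, md5, last_source) tuples, most frequent first.
    """
    counts = Counter()
    last_source = {}
    for line in lines:
        md5, sep, source = line.partition(",")
        md5 = md5.strip().lower()
        if not sep or not md5:
            continue  # skip lines without a comma or without a hash
        counts[md5] += 1
        last_source[md5] = source.strip()
    return [(n, md5, last_source[md5]) for md5, n in counts.most_common()]
```

Fingerprints at the top of the resulting list are the best candidates for identification and inclusion in the database.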

You can read about the process, problem, and solution on OWASP_favicon_database_crawl.

OWASP_favicon_database - favicon database in wiki format; feel free to contribute directly to the wiki

File:Favicon-md5-20100222.zip - Favicon MD5 database of most popular favicons found on the Internet - latest

File:Favicon-md5-20090925.zip - Favicon MD5 database of most popular favicons found on the internet - archive

Nmap http-favicon.nse: NSE script for MD5 favicon fingerprinting

OpenVAS webserver_favicon.nasl: NASL NVT for MD5 favicon fingerprinting

w3af favicon_identification.py: w3af plugin for MD5 favicon fingerprinting

The project roadmap is available on the OWASP_favicon_database_roadmap page.

We hope you find the information in the OWASP Favicon Database project useful. Please contribute back to the project by sending your comments, questions, and suggestions to the OWASP Favicon mailing list. Thanks!

You can contribute by editing the database on the wiki itself: OWASP_favicon_database. You can also contribute via Twitter by sending the MD5, favicon name, and version (if possible) to @OWASPfavicon.

To join the OWASP Favicon Database mailing list or view the archives, please visit the mailing list subscription page.

Pages in category "OWASP Favicon Database Project"

The following 3 pages are in this category, out of 3 total.