Difference between revisions of "OWASP JSON Sanitizer"

From OWASP
Line 5: Line 5:
 
| valign="top"  style="border-right: 1px dotted gray;padding-right:25px;" |
 
| valign="top"  style="border-right: 1px dotted gray;padding-right:25px;" |
  
==OWASP Java Encoder Project ==
+
== OWASP JSON Sanitizer Project ==
  
The OWASP Java Encoder is a Java 1.5 simple-to-use drop-in high-performance encoder class with no dependencies and little baggage. This project will help Java web developers defend against Cross Site Scripting!
+
Our Mission: Given JSON-like content, convert it to valid JSON!
 +
This can be attached at either end of a data-pipeline to help satisfy Postel's principle:
 +
be conservative in what you do, be liberal in what you accept from others. Applied to JSON-like content from others, it will produce well-formed JSON that should satisfy any parser you use.
 +
Applied to your output before you send, it will coerce minor mistakes in encoding and make it easier to embed your JSON in HTML and XML.
  
 
==Introduction==
 
==Introduction==
  
<i>Contextual Output Encoding</i> is a computer programming technique necessary to stop [https://www.owasp.org/index.php/XSS_Prevention_Cheat_Sheet Cross Site Scripting]. This project is a Java 1.5 simple-to-use drop-in high-performance encoder class with no dependencies and little baggage. It provides numerous encoding functions to help defend against XSS in a variety of different HTML, JavaScript, XML and CSS contexts.
+
Our Mission: Given JSON-like content, convert it to valid JSON! The OWASP JSON Sanitizer Project is a simple-to-use Java library that can be attached at either end of a data-pipeline to help satisfy Postel's principle: <i>be conservative in what you do, be liberal in what you accept from others</i><br/>
 +
Applied to JSON-like content from others, it will produce well-formed JSON that should satisfy any parser you use.<br/>
 +
Applied to your output before you send, it will coerce minor mistakes in encoding and make it easier to embed your JSON in HTML and XML.
  
==Quick Overview==
+
== Security ==
  
The OWASP Java Encoder library is intended for quick contextual encoding with very little overhead, either in performance or usage. To get started, simply add the encoder-1.1.1.jar, import org.owasp.encoder.Encode and start encoding.
+
Since the output is well-formed JSON, passing it to eval will have no side-effects and no free variables, so is neither a code-injection vector, nor a vector for exfiltration of secrets. This library only ensures that the JSON string → Javascript object phase has no side effects and resolves no free variables, and cannot control how other client side code later interprets the resulting Javascript object. So if client-side code takes a part of the parsed data that is controlled by an attacker and passes it back through a powerful interpreter like eval or innerHTML then that client-side code might suffer unintended side-effects.
 
+
Example usage:
+
 
+
PrintWriter out = ....;
+
out.println("<textarea>"+Encode.forHtml(userData)+"</textarea>");   
+
 
+
Please look at the [http://owasp-java-encoder.googlecode.com/svn/tags/1.1/core/apidocs/org/owasp/encoder/Encode.html javadoc for Encode] to see the variety of contexts for which you can encode.
+
 
+
If you want to try it out or see it in action, head over to "Can You XSS This? (.com)" and hit it with your best XSS attack vectors!
+
 
+
Happy Encoding!
+
  
 
==Licensing==
 
==Licensing==
The OWASP Java Encoder is free to use under the [http://opensource.org/licenses/BSD-3-Clause New BSD License].
+
The OWASP JSON Sanitizer is free to use under the [http://www.apache.org/licenses/LICENSE-2.0 Apache 2 License].
 
+
  
 
| valign="top"  style="padding-left:25px;width:200px;border-right: 1px dotted gray;padding-right:25px;" |
 
| valign="top"  style="padding-left:25px;width:200px;border-right: 1px dotted gray;padding-right:25px;" |
Line 36: Line 29:
 
== What is this? ==
 
== What is this? ==
  
The OWASP Java Encoder provides:
+
The OWASP JSON Sanitizer Project provides:
  
* Output Encoding functions to help stop XSS
+
* Java based JSON outbound or inbound sanitization library
* Java 1.5+ standalone library
+
  
 
== Code Repo ==
 
== Code Repo ==
  
[https://code.google.com/p/owasp-java-encoder/ OWASP Java Encoder at Google Code]
+
[https://code.google.com/p/json-sanitizer/ OWASP JSON Sanitizer at Google Code]
  
 
== Project Leader ==
 
== Project Leader ==
  
Project Leader:<br/>Jeff Ichnowski (The Encoding Grandmaster)
+
Project Leader:<br/>Mike Samuel
 
<br/><br/>
 
<br/><br/>
 
Contributors: <br/>
 
Contributors: <br/>
Jeremy Long<br/>
 
 
Jim Manico<br/>
 
Jim Manico<br/>
  
 
== Related Projects ==
 
== Related Projects ==
  
* [[XSS (Cross Site Scripting) Prevention Cheat Sheet]]
+
* [[OWASP HTML Sanitizer Project]]
  
 
| valign="top"  style="padding-left:25px;width:200px;" |
 
| valign="top"  style="padding-left:25px;width:200px;" |
Line 61: Line 52:
 
== Quick Download ==
 
== Quick Download ==
  
* [http://search.maven.org/remotecontent?filepath=org/owasp/encoder/encoder/1.1.1/encoder-1.1.1.jar encoder-1.1.1.jar]
+
* [https://code.google.com/p/json-sanitizer/downloads/detail?name=json-sanitizer-2012-10-17.jar https://code.google.com/p/json-sanitizer/downloads/detail?name=json-sanitizer-2012-10-17.jar]
  
 
== News and Events ==
 
== News and Events ==
* [30 Jan 2014] 1.1.1 Released!
+
* [Oct 17, 2012] 1.0 Released!
  
== In Print ==
 
We will be releasing a user guide soon!
 
  
 
==Classifications==
 
==Classifications==
Line 78: Line 67:
 
   | align="center" valign="top" width="50%"| [[File:Owasp-defenders-small.png|link=]]
 
   | align="center" valign="top" width="50%"| [[File:Owasp-defenders-small.png|link=]]
 
   |-
 
   |-
   | colspan="2" align="center"  | [http://opensource.org/licenses/BSD-3-Clause New BSD License]
+
   | colspan="2" align="center"  | [http://www.apache.org/licenses/LICENSE-2.0 Apache 2 License]
 
   |-
 
   |-
 
   | colspan="2" align="center"  | [[File:Project_Type_Files_CODE.jpg|link=]]
 
   | colspan="2" align="center"  | [[File:Project_Type_Files_CODE.jpg|link=]]

Revision as of 11:02, 4 February 2014

OWASP JSON Sanitizer Project

Our Mission: Given JSON-like content, convert it to valid JSON! This can be attached at either end of a data-pipeline to help satisfy Postel's principle: be conservative in what you do, be liberal in what you accept from others. Applied to JSON-like content from others, it will produce well-formed JSON that should satisfy any parser you use. Applied to your output before you send, it will coerce minor mistakes in encoding and make it easier to embed your JSON in HTML and XML.

Introduction

Our Mission: Given JSON-like content, convert it to valid JSON! The OWASP JSON Sanitizer Project is a simple-to-use Java library that can be attached at either end of a data-pipeline to help satisfy Postel's principle: be conservative in what you do, be liberal in what you accept from others.
Applied to JSON-like content from others, it will produce well-formed JSON that should satisfy any parser you use.
Applied to your output before you send, it will coerce minor mistakes in encoding and make it easier to embed your JSON in HTML and XML.

Security

Since the output is well-formed JSON, passing it to eval will have no side-effects and no free variables, so it is neither a code-injection vector nor a vector for exfiltration of secrets. This library only ensures that the JSON string → Javascript object phase has no side effects and resolves no free variables; it cannot control how other client-side code later interprets the resulting Javascript object. So if client-side code takes a part of the parsed data that is controlled by an attacker and passes it back through a powerful interpreter like eval or innerHTML, then that client-side code might suffer unintended side-effects.

Licensing

The OWASP JSON Sanitizer is free to use under the Apache 2 License.

What is this?

The OWASP JSON Sanitizer Project provides:

  • Java based JSON outbound or inbound sanitization library

Code Repo

OWASP JSON Sanitizer at Google Code

Project Leader

Project Leader:
Mike Samuel

Contributors:
Jim Manico

Related Projects

Quick Download

News and Events

  • [Oct 17, 2012] 1.0 Released!


Classifications

Apache 2 License

Our Mission: Given JSON-like content, convert it to valid JSON!

This can be attached at either end of a data-pipeline to help satisfy Postel's principle:

be conservative in what you do, be liberal in what you accept from others. Applied to JSON-like content from others, it will produce well-formed JSON that should satisfy any parser you use.

Applied to your output before you send, it will coerce minor mistakes in encoding and make it easier to embed your JSON in HTML and XML.

Architecture

Many applications have large amounts of code that uses ad-hoc methods to generate JSON outputs. Frequently these outputs all pass through a small amount of framework code before being sent over the network. This small amount of framework code can use this library to make sure that the ad-hoc outputs are standards compliant and safe to pass to (overly) powerful deserializers like Javascript's eval operator.

Applications also often have web service APIs that receive JSON from a variety of sources. When this JSON is created using ad-hoc methods, this library can massage it into a form that is easy to parse.

By hooking this library into the code that sends and receives requests and responses, this library can help software architects ensure system-wide security and well-formedness guarantees.
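
A minimal sketch of that hook, assuming a hypothetical framework class named JsonResponseGate through which every outbound JSON body passes. The sanitizer is injected as a plain function because this page does not give the library's class name; in a real application the library's sanitize method would be passed in.

```java
import java.util.function.UnaryOperator;

/**
 * Hypothetical choke point (not part of the library): all ad-hoc JSON
 * produced by the application passes through prepareBody before it is
 * written to the network, so sanitization happens in exactly one place.
 */
final class JsonResponseGate {
    // Assumption: the library's sanitize method fits this String -> String shape.
    private final UnaryOperator<String> sanitizer;

    JsonResponseGate(UnaryOperator<String> sanitizer) {
        this.sanitizer = sanitizer;
    }

    /** Called by framework code just before the response body is sent. */
    String prepareBody(String adHocJson) {
        return sanitizer.apply(adHocJson);
    }
}
```

The same wrapper shape can sit on the receiving side, applied to request bodies before they reach a strict JSON parser.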

The sanitizer takes JSON-like content and interprets it as JS eval would. Specifically, it deals with these non-standard constructs:

'...' Single quoted strings are converted to JSON strings.
\xAB Hex escapes are converted to JSON unicode escapes.
\012 Octal escapes are converted to JSON unicode escapes.
0xAB Hex integer literals are converted to JSON decimal numbers.
012 Octal integer literals are converted to JSON decimal numbers.
+.5 Decimal numbers are coerced to JSON's stricter format.
[0,,2] Elisions in arrays are filled with null.
[1,2,3,] Trailing commas are removed.
{foo:"bar"} Unquoted property names are quoted.
//comments JS style line and block comments are removed.
(...) Grouping parentheses are removed.
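
To make two of these rewrites concrete, here is a small illustrative sketch of converting hex and octal integer literals to JSON decimal numbers. It is not the library's implementation; the class and method names are invented for this example, and it assumes the literal has already been isolated from the surrounding text.

```java
import java.math.BigInteger;

/** Illustration only: rewrites one JS-style integer literal as decimal text. */
final class LegacyIntLiterals {
    static String toJsonDecimal(String literal) {
        // Hex literal such as 0xAB: parse the digits after the prefix in base 16.
        if (literal.startsWith("0x") || literal.startsWith("0X")) {
            return new BigInteger(literal.substring(2), 16).toString();
        }
        // Octal literal such as 012: a leading zero followed only by digits 0-7.
        if (literal.length() > 1 && literal.startsWith("0")
                && literal.chars().allMatch(c -> c >= '0' && c <= '7')) {
            return new BigInteger(literal.substring(1), 8).toString();
        }
        // Anything else (including inputs like 08) is returned unchanged here.
        return literal;
    }
}
```

With this sketch, 0xAB becomes 171 and 012 becomes 10, while an ordinary decimal such as 42 passes through untouched.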

The sanitizer fixes missing punctuation, end quotes, and mismatched or missing close brackets. If an input contains only white-space then the valid JSON string null is substituted.
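
The repair of missing close brackets and end quotes can be pictured with the following simplified sketch. It is an assumption-laden illustration rather than the library's algorithm: it only understands double-quoted strings, does not check that a closing bracket matches the kind that was opened, and the class name BracketCloser is made up.

```java
import java.util.ArrayDeque;
import java.util.Deque;

/** Illustration only: appends missing closers to truncated JSON-like text. */
final class BracketCloser {
    static String closeOpenBrackets(String jsonLike) {
        // Whitespace-only input is replaced by the valid JSON value null.
        if (jsonLike.trim().isEmpty()) {
            return "null";
        }
        Deque<Character> pendingClosers = new ArrayDeque<>();
        boolean inString = false;
        for (int i = 0; i < jsonLike.length(); i++) {
            char c = jsonLike.charAt(i);
            if (inString) {
                if (c == '\\') {
                    i++;                    // skip the escaped character
                } else if (c == '"') {
                    inString = false;
                }
            } else if (c == '"') {
                inString = true;
            } else if (c == '[') {
                pendingClosers.push(']');
            } else if (c == '{') {
                pendingClosers.push('}');
            } else if ((c == ']' || c == '}') && !pendingClosers.isEmpty()) {
                pendingClosers.pop();       // simplification: kind is not checked
            }
        }
        StringBuilder out = new StringBuilder(jsonLike);
        if (inString) {
            out.append('"');                // supply the missing end quote
        }
        while (!pendingClosers.isEmpty()) {
            out.append(pendingClosers.pop()); // innermost bracket closes first
        }
        return out.toString();
    }
}
```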

The output is well-formed JSON as defined by RFC 4627. The output satisfies four additional properties:

  1. The output will not contain the substring (case-insensitively) "</script" so can be embedded inside an HTML script element without further encoding.
  2. The output will not contain the substring "]]>" so can be embedded inside an XML CDATA section without further encoding.
  3. The output is a valid Javascript expression, so can be parsed by Javascript's eval builtin (after being wrapped in parentheses) or by JSON.parse. Specifically, the output will not contain any string literals with embedded JS newlines (U+2028 Line separator or U+2029 Paragraph separator).
  4. The output contains only valid Unicode scalar values (no isolated UTF-16 surrogates) that are allowed in XML unescaped.
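
These properties can be expressed as a simple check, which may be useful in a test suite that exercises sanitized output. The sketch below is an independent checker written for illustration (the class name EmbeddableJsonCheck is an assumption); it does not verify that the text is well-formed JSON, only the embedding properties listed above.

```java
import java.util.Locale;

/** Illustration only: checks the embedding properties on already-sanitized text. */
final class EmbeddableJsonCheck {
    static boolean isSafeToEmbed(String json) {
        // Property 1: no "</script", compared case-insensitively.
        if (json.toLowerCase(Locale.ROOT).contains("</script")) {
            return false;
        }
        // Property 2: no CDATA terminator.
        if (json.contains("]]>")) {
            return false;
        }
        // Property 3: no raw U+2028 or U+2029, which JS treats as newlines.
        if (json.indexOf(0x2028) >= 0 || json.indexOf(0x2029) >= 0) {
            return false;
        }
        // Property 4: every code point is in the XML 1.0 Char range. An
        // isolated surrogate is reported by codePoints() as itself and
        // falls outside that range, so it is rejected here too.
        return json.codePoints().allMatch(EmbeddableJsonCheck::isXmlChar);
    }

    private static boolean isXmlChar(int cp) {
        return cp == 0x9 || cp == 0xA || cp == 0xD
                || (cp >= 0x20 && cp <= 0xD7FF)
                || (cp >= 0xE000 && cp <= 0xFFFD)
                || (cp >= 0x10000 && cp <= 0x10FFFF);
    }
}
```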

Since the output is well-formed JSON, passing it to eval will have no side-effects and no free variables, so it is neither a code-injection vector nor a vector for exfiltration of secrets.

This library only ensures that the JSON string → Javascript object phase has no side effects and resolves no free variables; it cannot control how other client-side code later interprets the resulting Javascript object. So if client-side code takes a part of the parsed data that is controlled by an attacker and passes it back through a powerful interpreter like eval or innerHTML, then that client-side code might suffer unintended side-effects.

The sanitize method will return the input string without allocating a new buffer when the input is already valid JSON that satisfies the properties above. Thus, if used on input that is usually well formed, it has minimal memory overhead.
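
One common way to get this behaviour is to delay allocating an output buffer until the first character that actually needs rewriting. The sketch below shows that pattern on a single rule (escaping U+2028 and U+2029); it is an illustration of the general technique under that assumption, not the library's code, and LazyCopySketch is a made-up name.

```java
/** Illustration only: single pass, allocates a buffer only if a change is needed. */
final class LazyCopySketch {
    static String escapeJsNewlines(String input) {
        StringBuilder out = null;   // stays null while the input needs no changes
        int copiedUpTo = 0;         // start of the span not yet copied to out
        for (int i = 0; i < input.length(); i++) {
            char c = input.charAt(i);
            if (c == 0x2028 || c == 0x2029) {
                if (out == null) {
                    out = new StringBuilder(input.length() + 16);
                }
                out.append(input, copiedUpTo, i);   // copy the clean span so far
                out.append(c == 0x2028 ? "\\u2028" : "\\u2029");
                copiedUpTo = i + 1;
            }
        }
        if (out == null) {
            return input;           // nothing changed: hand back the same object
        }
        return out.append(input, copiedUpTo, input.length()).toString();
    }
}
```

Because the loop visits each UTF-16 code unit once and copies each clean span once, this sketch also runs in time linear in the input length.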

The sanitize method takes O(n) time where n is the length in UTF-16 code-units.