What are web applications?
In the early days of the web, web sites consisted of static pages, which severely limited interaction with the user. In the early 1990’s, this limitation was removed when web servers were modified to allow communication with server-side custom scripts. No longer were applications just static brochure-ware, edited only by those who knew the arcane mysteries of HTML; with this single change, normal users could interact with the application for the first time.
This is a huge and fundamental step towards the web as we know it today. Without interactivity, there would be no e-commerce (such as Amazon), no web e-mail (Hotmail or GMail), no Internet Banking, no blogs, no online share trading, and no web forums or communities like Orkut or Friendster. The static Internet would have been vastly different to today.
Initially, it was quite difficult to write sophisticated applications. The first generation web applications were primitive, usually little more than form submissions and search applications. Even these basic applications took quite a great deal of skill to craft.
Over time, the arcane knowledge required to write applications has been reduced. Today, it is relatively easy to write sophisticated applications with modern platforms and simpler languages, like PHP or VB.NET.
However, this push to make applications as easy to write as possible has a downside – many entry-level programmers are completely unaware of the security implications of their code. This is discussed further in the “Security Principles” chapter.
Let’s look at the various generations of web application technology.
First generation – CGI
CGI works by encapsulating user supplied data in environment variables. These are inherited by the custom written scripts or programs, usually developed in Perl or C. The custom programs process the supplied user data, and send fully formed HTML to the “standard out” (stdout), which is captured by the web server and passed back to the user.
As few sites today write new CGI applications, the techniques to secure CGI applications are not discussed within the Guide. However, many of the techniques discussed can be used with few or no changes.
Filters can be used to control access to a web site, implement a different web application framework (such as Perl, PHP, or ASP), or to provide a security check. Filters must be written in C or C++ because __________.
Because filters live within the execution context of the web server itself, they can be high performance. Typical examples of a filter interface include Apache web server modules, SunONE’s NSAPI, and Microsoft’s ISAPI. Because filters are rarely used specialist interfaces that can directly affect the availability of the web server, they are not considered further.
I’m not sure if that’s what you mean above but please clarify.
Unlike low-level languages, scripting languages rarely suffer from buffer overflows or resource leaks. Thus, programmers who use them can avoid two of the most common security issues. However, they do have their disadvantages:
- Most scripting languages aren’t strongly typed and do not promote good programming practices
- Scripting languages are generally slower than their compiled counterparts (sometimes as much as 100 times slower)
- Scripts often lead to unmanageable code bases that perform poorly as their size grows
- It’s difficult (but not impossible) to write multi-tier large scale applications in scripting languages, because often the presentation, application and data tiers reside on the same machine, thus limiting scalability and security
- Most scripting languages do not natively support remote method or web service calls, which makes it difficult to communicate with application servers and external web services.
Scripting frameworks include ASP, Perl, Cold Fusion, and PHP. However, many of these would be considered interpreted hybrids now, particularly later versions of PHP and Cold Fusion, which pre-tokenize and optimize scripts.
Web application frameworks – J2EE and ASP.NET
As scripting languages reached the boundaries of performance and scalability, many larger vendors jumped on Sun’s J2EE web development platform. There are many J2EE implementations, including Tomcat from the Apache Foundation and _______.
- Uses the Java language to produce applications which run nearly as quickly as C++ based ones, and that do not easily suffer from buffer overflows and memory leaks
- Allowed large distributed applications to run acceptably for the first time
- Possesses good session and authorization controls
- Enabled relatively transparent multi-tier applications via various remote component invocation mechanisms, and
- Is strongly typed to prevent many common security and programming issues before the program even runs
J2EE’s downside is that it has a steep learning curve which makes it difficult for web designers and entry-level programmers to use it to write applications. While certain graphical development tools make it somewhat easier to program with J2EE, a scripting language like PHP is still much easier to use.
When Microsoft updated their ASP technology to ASP.NET. which mimics the J2EE framework in many ways, they offered several improvements on the development process. For example, .NET:
- Makes it easy for entry level programmers and web designers to whip up smaller applications
- Allows large distributed applications
- Offers good session and authorization controls
- Allows programmers to use their favorite language, which is compiled to native code for excellent performance (near C++ speeds), along with buffer overflow and resource garbage collection
- Permits transparent communication with remote and external components
The choice of J2EE or ASP.NET frameworks is largely dependent upon platform. There is little reason to choose one over the other from a security perspective.
- ASP.Net is primarily available for Microsoft Windows. The Mono project (http://www.go-mono.com/) can run ASP.NET applications on many platforms including Solaris, Netware, and Linux.
Small to medium scale applications
Most applications are either small or medium scale. The usual architecture is a simple linear procedural script. This is the most common form of coding for ASP, Cold Fusion and PHP scripts, but rarer (but not impossible) for ASP.NET and J2EE applications.
The reason for this architecture is that it is easy to write, and few skills are required to maintain the code. For smaller applications, any perceived performance benefit from moving to a more scalable architecture will never be recovered in the runtime for those applications. For example, if it takes an additional three weeks of developer time to re-factor the scripts into an MVC approach, the three weeks will never be recovered (or noticed by end users) from the improvements in scalability.
It is typical to find many security issues in such applications, including dynamic database queries constructed from insufficiently validated data input, poor error handling and weak authorization controls.
This Guide provides advice throughout to help improve the security of these applications.
Large scale applications
Larger applications need a different architecture to that of a simple survey or feedback form. As applications get larger, it becomes ever more difficult to implement and maintain features and to keep scalability high. Using scalable application architectures becomes a necessity rather than a luxury when an application needs more than about three database tables or presents more than approximately 20 - 50 functions to a user.
Scalable application architecture is often divided into tiers, and if design patterns are used, often broken down into re-usable chunks using specific guidelines to enforce modularity, interface requirements and object re-use. Breaking the application into tiers allows the application to be distributed to various servers, thus improving the scalability of the application at the expense of complexity.
One of the most common web application architectures is model-view-controller (MVC), which implements the Smalltalk 80 application architecture. MVC is typical of most Apache Foundation Jakarta Struts J2EE applications, and the code-behinds of ASP.NET can be considered a partial implementation of this approach. For PHP, the WACT project (http://wact.sourceforge.net) aims to implement the MVC paradigm in a PHP friendly fashion.
The front-end rendering code, often called the presentation tier, should aim to produce the HTML output for the user with little to no application logic.
As many applications will be internationalized (i.e. contain no localized strings or culture information in the presentation layer), they must use calls into the model (application logic) to obtain the data required to render useful information to the user in their preferred language and culture, script direction, and units.
All user input is directed back to controllers in the application logic.
The controller (or application logic) takes input from the users and gates it through various workflows that call on the application’s model objects to retrieve, process, or store the data.
Well written controllers centrally server-side validate input data against common security issues before passing the data to the model for processing, and ensure that output is safe or in a ready form for safe output by the view code.
As the application is likely to be internationalized and accessible, the data needs to be in the local language and culture. For example, dates cannot only be in different orders, but an entirely different calendar could be in use. Applications need to be flexible about presenting and storing data. Simply displaying “7/4/2005” is ambiguous to anyone outside a few countries.
oAccount->TransferFunds(fromAcct, ToAcct, Amount)
rather than writing it such as this:
if oAccount->isMyAcct(fromAcct) &
amount < oAccount->getMaxTransferLimit() &
oAccount->getBalance(fromAcct) > amount &
if oAccount->withdraw(fromAcct, Amount) = OK then
end if The idea is to encapsulate the actual dirty work into the model code, rather than exposing primitives. If the controller and model are on different machines, the performance difference will be staggering, so it is important for the model to be useful at a high level.
The contract between the controller and the model needs to be carefully considered to ensure that data is strongly typed, with reasonable structure (syntax), and appropriate length, whilst allowing flexibility to allow for internationalization and future needs.
The best performance and highest security is often obtained through parameterized stored procedures, followed by parameterized queries (also known as prepared statements) with strong typing of the parameters and schema. The major reason for using stored procedures is to minimize network traffic for a multi-stage transaction or to remove security sensitive information from traversing the network.
Stored procedures are not always a good idea – they tie you to a particular database vendor and many implementations are not fast for numeric computation. If you use the 80/20 rule for optimization and move the slow and high-risk transactions to stored procedures, the wins can be worthwhile from a security and performance perspective.
Web applications can be written in many different ways, and in many different languages. Although the Guide concentrates upon three common choices for its examples (PHP, J2EE and ASP.NET), the Guide can be used with any web application technology.