Difference between revisions of "Code Review Metrics"

From OWASP
Jump to: navigation, search
m (Added navigation to facilitate sequential reading online)
(7 intermediate revisions by one user not shown)
Line 1: Line 1:
[[Category:OWASP Code Review Project]]
+
{{LinkBar
[[OWASP Code Review Guide Table of Contents]]__TOC__
+
  | useprev=PrevLink | prev=OCRG1.1:Application Threat Modeling | lblprev=Application Threat Modeling
 +
  | usemain=MainLink | main=OWASP Code Review Guide Table of Contents | lblmain=Table of Contents
 +
  | usenext=NextLink | next=Crawling Code | lblnext=
 +
}}
 +
__TOC__
 +
 
 
Code review is an excellent source of metrics that can be used to improve your software development process. There are two distinct classes of these software metrics: Relative and Absolute.  
 
Code review is an excellent source of metrics that can be used to improve your software development process. There are two distinct classes of these software metrics: Relative and Absolute.  
  
Line 22: Line 27:
 
===Defect Density===
 
===Defect Density===
  
The average occurrence of programming faults per Lines of code (LOC). This gives a high level view of the code quality but not much more. Fault density on its own does not give rise to a pragmatic metric. Defect density would cover minor issues as well as major security flaws in the code; all are treated the same way. Security of code can not be judged accurately using defect density alone.  
+
The average occurrence of programming faults per Lines of Code (LOC). This gives a high level view of the code quality but not much more. Fault density on its own does not give rise to a pragmatic metric. Defect density would cover minor issues as well as major security flaws in the code; all are treated the same way. Security of code can not be judged accurately using defect density alone.
  
===Lines of code (LOC)===
+
===Lines of Code (LOC)===
  
The count of the executable lines of code. Commented-out code or spaces don't count. This is another metric in an attempt to quantify the size of the code. This gives a rough estimate but is not particularly scientific. Some circles of thinking believe that the estimation of an application size by virtue of LOC is professional malpractice!  
+
The count of the executable lines of code. Commented-out code or spaces don't count. This is another metric in an attempt to quantify the size of the code. This gives a rough estimate but is not particularly scientific. Some circles of thinking believe that the estimation of an application size by virtue of LOC is professional malpractice!
  
 
===Function Point===
 
===Function Point===
Line 36: Line 41:
 
Similar to defect density, but discovered issues are rated by risk (high, medium & low). In doing this we can give insight into the quality of the code being developed via a  ['''''X Risk / LoC'''''] or ['''''Y Risk / Function Point'''''] value. (X&Y being high, medium or low risks) as defined by your internal application development policies and standards.  
 
Similar to defect density, but discovered issues are rated by risk (high, medium & low). In doing this we can give insight into the quality of the code being developed via a  ['''''X Risk / LoC'''''] or ['''''Y Risk / Function Point'''''] value. (X&Y being high, medium or low risks) as defined by your internal application development policies and standards.  
  
Eg:<br>
+
Example:<br>
 
4 High Risk Defects per 1000 (Lines of Code)<br>
 
4 High Risk Defects per 1000 (Lines of Code)<br>
 
2 Medium Risk Defects per 3 Function Points<br>
 
2 Medium Risk Defects per 3 Function Points<br>
Line 48: Line 53:
 
A decision could be considered commands such as:
 
A decision could be considered commands such as:
  
If....else
+
If....else<br>
switch
+
switch<br>
case
+
case<br>
catch
+
catch<br>
while
+
while<br>
do
+
do<br>
 
and so on.....
 
and so on.....
  
Line 67: Line 72:
  
 
===Inspection Rate===
 
===Inspection Rate===
This metric can be used to get a rough idea of the required duration to perform a code review. The inspection rate is the rate of coverage a code reviewer can cover per unit of time. From experience, a rate of 250 lines per hour would be a baseline. This rate should not be used as part of a measure of review quality but simply to determine duration of the task.  
+
This metric can be used to get a rough idea of the required duration to perform a code review. The inspection rate is the rate of coverage a code reviewer can cover per unit of time. From experience, a rate of 250 lines per hour would be a baseline. This rate should not be used as part of a measure of review quality, but simply to determine duration of the task.
  
===Defect detection Rate===
+
===Defect Detection Rate===
This metric measure the defects found per unit of time. Again can be used to measure performance of the code review team but not to be used as a quality measure. Defect detection rate would normally increase as the inspection rate (above) decreases.  
+
This metric measures the defects found per unit of time. Again, can be used to measure performance of the code review team, but not to be used as a quality measure. Defect detection rate would normally increase as the inspection rate (above) decreases.
  
 
===Code Coverage===
 
===Code Coverage===
Line 79: Line 84:
  
 
===Reinspection Defect Rate===
 
===Reinspection Defect Rate===
The rate at which upon re-inspection of the code more defects exist, some defects still exist or other defects manifest through an attempt to address previously discovered defects.
+
The rate at which upon re-inspection of the code more defects exist, some defects still exist, or other defects manifest through an attempt to address previously discovered defects.
 +
 
 +
{{LinkBar
 +
  | useprev=PrevLink | prev=OCRG1.1:Application Threat Modeling | lblprev=Application Threat Modeling
 +
  | usemain=MainLink | main=OWASP Code Review Guide Table of Contents | lblmain=Table of Contents
 +
  | usenext=NextLink | next=Crawling Code | lblnext=
 +
}}
 +
 
 +
[[Category:OWASP Code Review Project]]

Revision as of 10:30, 9 September 2010

««Application Threat Modeling«« Main
(Table of Contents)
»»»»

Contents


Code review is an excellent source of metrics that can be used to improve your software development process. There are two distinct classes of these software metrics: Relative and Absolute.

Absolute metrics are numerical values that describe a trait of the code such as the number of references to a particular variable in an application, or the number of lines of code (LOC). Absolute metrics, such as the number of lines of code, do not involve subjective context but are material fact.

Relative metrics are a representation of an attribute that cannot be directly measured, and are subjective and reliant on the context of where the metric was derived. There is no definitive way to measure such an attribute. Multiple variables are factored into an estimation of the degree of testing difficulty, and any numeric representation or rating is only an approximation and is subjective.

Some Metric Benefits

The objective of code review is to detect development errors which may cause vulnerabilities, and hence give rise to an exploit. Code review can also be used to measure the progress of a development team in their practice of secure application development. It can pinpoint areas where the development practice is weak, areas where secure development practice is strong, and give a security practitioner the ability to address the root cause of the weaknesses within a developed solution. It may give rise to investigation into software development policies and guidelines and the interpretation of them by the users; communcation is the key.

Metrics can also be recorded relating to the performance of the code reviewers and the accuracy of the review process, the performance of the code review function, and the efficiency and effectiveness of the code review function.


SCR Process.jpg

The figure above describes the use of Metrics throughout the code review process.

Secure Development Metrics

Defect Density

The average occurrence of programming faults per Lines of Code (LOC). This gives a high level view of the code quality but not much more. Fault density on its own does not give rise to a pragmatic metric. Defect density would cover minor issues as well as major security flaws in the code; all are treated the same way. Security of code can not be judged accurately using defect density alone.

Lines of Code (LOC)

The count of the executable lines of code. Commented-out code or spaces don't count. This is another metric in an attempt to quantify the size of the code. This gives a rough estimate but is not particularly scientific. Some circles of thinking believe that the estimation of an application size by virtue of LOC is professional malpractice!

Function Point

The estimation of software size by measuring functionality. The combination of a number of statements which perform a specific task, independent of programming language used or development methodology.

Risk Density

Similar to defect density, but discovered issues are rated by risk (high, medium & low). In doing this we can give insight into the quality of the code being developed via a [X Risk / LoC] or [Y Risk / Function Point] value. (X&Y being high, medium or low risks) as defined by your internal application development policies and standards.

Example:
4 High Risk Defects per 1000 (Lines of Code)
2 Medium Risk Defects per 3 Function Points

Path complexity/complexity-to-defect/cyclomatic complexity

Cyclomatic complexity can help establish risk and stability estimations on an item of code, such as a class or method or even a complete system. It was defined by Thomas McCabe in the 70's and it easy to calsulate and apply, hence its usefulness.

CC = Number of decisions +1

A decision could be considered commands such as:

If....else
switch
case
catch
while
do
and so on.....

As the decision count increases, so does the complexity. Complex code leads to less stability and maintainability.

The more complex the code, the higher risk of defects. One could establish thresholds for Cyclomatic complexity:

0-10: Stable code. Acceptable complexity
11-15: Medium Risk. More complex
16-20: High Risk code. Too many decisions for a unit of code.

Review Process Metrics

Inspection Rate

This metric can be used to get a rough idea of the required duration to perform a code review. The inspection rate is the rate of coverage a code reviewer can cover per unit of time. From experience, a rate of 250 lines per hour would be a baseline. This rate should not be used as part of a measure of review quality, but simply to determine duration of the task.

Defect Detection Rate

This metric measures the defects found per unit of time. Again, can be used to measure performance of the code review team, but not to be used as a quality measure. Defect detection rate would normally increase as the inspection rate (above) decreases.

Code Coverage

Measured as a % of LoC of function points, the code coverage is the proportion of the code reviewed. In the case of manual review we would aim for close to 100%, unlike automated testing wherein 80-90% is considered good.

Defect Correction Rate

The amount of time used to correct detected defects. This metric can be used to optimise a project plan within the SDLC. Average values can be measured over time, producing a measure of effort which must be taken into account in the planning phase.

Reinspection Defect Rate

The rate at which upon re-inspection of the code more defects exist, some defects still exist, or other defects manifest through an attempt to address previously discovered defects.


««Application Threat Modeling«« Main
(Table of Contents)
»»»»