Updated: Sep 2
A security code review is the act of analyzing an application's source code for vulnerabilities. As part of the Secure SDLC, take everything you’ve learned about the application during threat modeling and analyze the code to determine if all of the security requirements have been met. Everything you discover during the code review should be reported to the application’s engineering team and it will also help inform the final penetration test and ongoing testing.
Unfortunately, this part of the process can be tedious and sometimes frustrating. Quite a bit of effort goes into a security code review and can be the largest hurdle to complete in the Secure SDLC. However there are numerous resources online to help and a number of tools to assist you. A good security code review generally uses a mix of tools and manual analysis working in tandem. Let’s first go over some preliminary information.
Why should we manually review code for security?
As previously blogged, Application Security should be focused on not becoming a bottleneck for engineering. Any manual processes will exacerbate this and if engineering’s releases get halted or delayed because of security, then we might be ignored in the future.
However, manual code review should still occur, so the question then is when? Based on the criticality of the application, a manual code review should occur on the initial creation of a new application or if an application is going through major changes that affect its risk factor. If the application’s risk factor is Low, perhaps a security code review is not necessary. Once each step of the Secure SDLC has been satisfied for a new application, then we should think about security scanning for that application for future changes.
The learning curve
To manually review code for security we need to be able to read and understand the code. If the majority of applications are built in Java, you should know how to fully read and analyze Java code and its common frameworks. This is the single largest barrier for someone who does not come from a coding background, but it’s doable.
Luckily reading and analyzing code is easier than writing it (at least for me, since I do not have a coding background), so the learning curve is less steep. And there are numerous resources online to get better at it. You can download the application on your machine and run its unit tests to see how the application is intended to run.
Perform searches online first for the language the application is written in if you need, this will be most important to understand the coding conventions and syntax. Second, learn about whatever major frameworks and libraries the application uses. Frameworks can drastically alter how the code is put together. For example, Java written with Spring will look and read much differently than without or a different framework. Lastly, if the application uses any additional add-ons, plugins, or infrastructure code, learn what their purpose is and how they’re supposed to be used.
This obviously can take quite a bit of time but once you get on the trail it gets easier with each additional security review. Be sure to inform engineering teams creating new applications that, if they are using a language or framework outside of the norm for the organization, the application security review may take more time.
Good tools will help speed up your review and help with frustration. The first and best thing to have is a good code editor or IDE. A good code editor should have at least the following features:
Allow you to quickly cross-reference a class or function definition without having to search for it saves you a ton of time.
The editor should allow you to globally search the code base, or even multiple code bases, at once and should allow regex searching.
It should support multiple languages with syntax highlighting.
You should be comfortable using it (may take some time).
If it’s available, diagramming software will turn the code into a diagram to help you visualize the application. Use this to see if the data flow of the application matches the documents and diagrams originally given to you earlier in the Secure SDLC. For example, Eclipse IDE has plugins that do just this for Java code.
Finally, being able to quickly build/run/test the application locally will allow you to quickly create tests for the code, both statically and dynamically. There are numerous automated tools to analyze the code for security issues for you, but we will discuss that more below.
How to code review
Let’s get into the nitty gritty. There are two primary avenues for conducting a security code review for an application, manually and using automated tooling. While manual review will be more tedious, it will ensure that your findings are verified. Automated testing is quite faster but the findings still need to be reviewed for validity, and automated tests may have False Negatives (real vulnerabilities that the scanner could not detect).
The most straight-forward tactic to begin with is to search or ‘grep’ the code base for certain keywords that are known indicators for vulnerabilities or misconfigurations. This can be terms like ‘encrypt’ to find issues with encryption, ‘secret’ or ‘password’ to find hard-coded secure values, or known dangerous functions like eval or System.
Moving forward, there are two major terms you should understand for performing a code review, either manually or automated, and that is ‘Source’ and ‘Sink’. Source refers to an area of the code base where untrusted third party input enters the application. From the Source we can determine the ‘Tainted Path’ by traversing the code wherever the untrusted input goes. Where this untrusted input ends up is the Sink. Some examples of a dangerous Sink can be process creation, file system interactions, SQL queries, and HTTP requests and responses.
A lot of what you will be doing during the code review will be isolating areas where there is a Source of untrusted third party data and following the Tainted Path to the Sink. Somewhere in that Tainted Path there needs to be some form of security control to validate the input is valid and that the application will not act in an insecure manner. This additionally can be done in reverse, finding dangerous Sinks and traversing backwards to the Source.
As with any complex task, it’s best done in discrete chunks. So as you review code, mark off what has been reviewed and what the result of that review is.
Manual Code Review Techniques
Use your Threat Model The threat model you created in the previous steps of the Secure SDLC will be your guide for manually reviewing code. Trying to review all of the code for all possible code paths is a road to your own destruction. Most of the code paths are not relevant to your security code review, so your very first goal is to isolate those areas of the code that matter the most to the security review. Let’s take some examples from threat modeling in a previous blog post.
Authentication/Authorization When reviewing the application for possible issues in authentication or authorization, you first need to isolate the relevant packages that handle this exact thing. Load up the application into your editor or IDE and begin doing global searches for keywords that handle authentication. This will depend on the language and the frameworks being used, but this type of information should have been called out while building security requirements earlier in the process. Be sure to reference your previous notes when searching the code base.
Authentication and authorization functions generally assume that there is a call coming from outside of the application to gain access to it, so the first step is to find in which functions or packages that first take in that external input. Once that is identified, traverse the code until a success or failure to gain access is reached. Analyze the Source to Sink path to ensure that all authentication and authorization decisions are verified in the proper order to ensure access is only granted to legitimate users.
Injection and Output Encoding As before, you need to first isolate where in the application untrusted external input enters the application. This is, generally speaking for web applications, some type of Controller class that handles the routing and retrieving of information from a HTTP request and returns the resulting response. From here you should be able to follow from the Source, traversing the code base following the Source to the Sink, where the untrusted input ends up. If no where in that Tainted Path there are obvious security controls validating the input, then there is a potential issue.
For output encoding, you need to isolate in the code base wherever the application calls out to other services or clients. Remember, this also includes the response to the original client HTTP request. Determine what the outbound service or client is expecting as a data type and ensure that the code base performs that type of encoding. For example, when outputting to another service that is expecting JSON, ensure the application is using an officially supported JSON library to form the output before it sends it out. If the application is forming a HTTP response back to the requesting client, ensure that any input placed back into that response is encoded as the same type as the rest of the response. If it’s an HTML response body, ensure all input into that response is HTML encoded (prevents Cross-Site Scripting).
Data Protection When an application is dealing with critical data, it’s essential to analyze how the application is treating said data. Isolate in the code base where critical data is being handled and see how it is treated and where it is being sent. We need to ensure that critical data is only staying on the machine for a limited amount of time and is being sent outbound to or stored in already approved locations with proper encryption.
Communications While security code reviews will generally not tell you much about how transport protection will work for this application, you can at least ensure that all of its outbound connections are being sent securely. Check configuration or property files in the code base and ensure URLs are utilizing secure protocols. HTTPS should be the default protocol for all outbound web based communications.
Business Logic Depending on the application’s purpose, it may contain very sensitive logic that is critical to business functions. Isolate where in the code base this logic is and analyze it. Either you will discover a possible issue in the logic and can rule all usage of it unsafe or if there are no vulnerabilities you can rule all usage of it as safe.
Reuse of Secure Code If libraries or frameworks already exist for security functions in your organization, the application should not write its own custom code to implement it (unless that’s the entire intention of said application). For example, if your organization uses an internal library for single sign-on or certificate-based authentication or authorization, the application should be utilizing these libraries or frameworks and not creating its own version of it. This is especially true for encryption, as there is no need to create your own encryption libraries when there are known good standards (e.g. RSA or AES).
Review Patterns There are numerous patterns you can follow to perform the security code review. Each one has strengths and weaknesses, so it’s best to know which of these review patterns are best suited to which vulnerability types.
Trace malicious input: As described above, this is where you follow untrusted input from Source to Sink and analyze the Tainted Path for security protections.
Analyze a module: Review an individual module as it is from top to bottom. This is good to use when analyzing packages that are repeated multiple times throughout the code base, so you can rule out possible issues within it affecting the entire code base.
Analyze a Class or Object: In an Object Oriented based language, vulnerabilities in high-level objects may affect an entire code base, so it’s useful to isolate these high-level objects and review them individually.
Candidate Point Approach: Search the code for security or known dangerous functions and analyze backwards for any untrusted external input. For example, searching for a database function and then traversing backwards to see if any untrusted user input is placed within it. This is essentially tracing in reverse, from Sink to Source.
There are a number of other patterns that get even more in depth, but I will leave that for you to discover on your own. Check out the recommended reading below for more info.
Unit Tests A fantastic place to look in a code base for examples for how certain functions and objects are meant to be used are in the application’s testing files, or Unit Tests. You can model these for yourself or even modify and run them locally if you’re able to set up the application to build and run on your machine. This allows you to verify possible security issues in suspicious code.
An awesome testing strategy is to copy a function in its entirety and paste it into an spreadsheet application. Each row should be a line of the copied code (minus whitespace and just syntax like curly brace). You can then add a number of columns after the code column equal to the number of expected parameters the functions expects. You can then start from the top row with initial seed values for each parameter, then as you go down each row modify the parameter values along with each code statement. That way you will get an idea of what the function will return based on your seed values.
Take copious notes As with any complicated task, be sure you’re taking A LOT of notes. This will help keep you organized and also prevent you from duplicating work. Take notes about your findings in the code and where the offending code resides. Also take notes of your decision making about various aspects of the code. Made a decision that a certain class is free from possible security issues? Note it down. Found a suspicious code path in a specific function? Note it down. In the future when you run across usages of these classes or functions, you will know all about it from your notes instead of having to re-traverse back to that code block and look at it a second time.
Eventually you should get to a point in the code review where you should be able to traverse the code just from your notes and not the code itself. This way you can make quick decisions when you see the same class or function called upon for the umpteenth time.
Automated Code Review
Static Analysis Static Analysis Security Testing (or SAST for short) is the act of running either the source code or the compiled code through an automated scanning application. Static analyzers will model and traverse the code looking for known vulnerability signatures. Generally, static analyzers check for vulnerabilities in a few different ways, but the largest downside is they tend to be quite false-y. Any findings from a static analyzer needs to be double-checked to ensure you’re not reporting too many false positives to the responsible engineering team. However, this is your universal super tool to ensure full coverage of the code review.
Secret Scanner While static analyzers are good at finding secrets in a code base, there are some cool tools that do that explicitly and will also search through git history and comments. TruffleHog is one such tool and it allows you to customize your searches with a JSON dictionary of regex patterns.
Infrastructure-as-code static analysis Infrastructure-as-code like Ansible, Puppet, Terraform, or CloudFormation (AWS) also needs to be security scanned to ensure compliance. These tools are best utilized as part of an automated security pipeline. Luckily these security scanners can be used manually by yourself during a code review:
Cloud Formation: https://github.com/stelligent/cfn_nag
Dependency scanner An important check is to ensure that the application is not utilizing third party components with known vulnerabilities. There are a couple of good automated tools for this. Snyk is a tool you can run from the command line that will analyze package and build files for lists of dependencies and their versions, then search through its known vulnerability database. Dependency-Check from OWASP performs a similar task. These can then be reported to the engineering team to recommend patching.
So when you have these two approaches, which is best? Well both used in tandem! It will take some time and experience, but as you perform more and more code reviews, you will understand what tools or manual approaches work best for each type of vulnerability. Don’t constrain yourself to just one type of testing, utilize different tools and techniques that will allow you to confirm vulnerabilities quickly without sacrificing the coverage of the review.
Hopefully this is a good primer to help get you started completing thorough security code reviews, but there is plenty more to learn. Just remember the following:
Don’t start the code review first in the Secure SDLC. Ensure you have had the time to threat model the application’s documentation and diagrams first to set you up for the code review.
Don’t go too far down rabbit holes in the code base that have no relevance to its security or the goals you have set forward for yourself.
Take breaks often to refresh yourself, because this work can become tedious and it’s easy to just glaze over the code.
Coordinate with the engineers responsible for the application. If you’re stuck or have questions, ask them for help!
Once the code review is complete, write up a thorough report for the engineering team.
Follow up with your findings and ensure the identified vulnerabilities are either addressed or are in that engineering team’s backlog.
What are some tips, techniques, or tricks you use while performing security code reviews? We would love to know more and discuss it on Twitter at @HellaSecure.
About the Author
Aaron is an Application Security Engineer with over 10 years of experience. His unorthodox career path has led to many unique insights in the security industry.