The Binarly team is constantly researching ways to automate our proprietary deep code inspection technology to improve the discovery of different classes of bugs within system firmware. The Binarly efiXplorer team has decades of experience in program analysis and automation, enabling us to develop unique binary analysis techniques and technologies internally.
In one of our previous blogs “FwHunt The Next Chapter: Firmware Threat Detection At Scale" we covered the difference in functionality between Binarly Platform and open source tools Binarly provides to the industry. In this blog, we will cover our internal framework in more detail , and demonstrate its capabilities for bug hunting.
As a result of the limitations of current open source tooling, such as performance, and its adoption by our internal projects, we developed our own static analysis framework, written in Rust. The Binarly static analysis framework translates assembly code to an intermediate representation (IR) that preserves all semantic metadata from assembly code and allows for more advanced analysis techniques to be used. In contrast with pattern-based approaches that are heavily used in the industry, we developed FwHunt (fwhunt-scan) which uses semantic properties as a filtering criteria during the binary code search to improve detection accuracy and reduce the number of false positives. The semantics-based approach overcomes many of the problems faced when using lightweight syntactic scanning rules and other pattern or signature-based approaches.
A key component of the Binarly Vulnerability Management offering is FwHunt, which identifies and discovers existing vulnerabilities based on semantic properties. But, what if a class of vulnerability is known, can we precisely define its semantic properties to detect new vulnerabilities based on this previous knowledge? In order to improve detection capabilities, we started thinking about symbolic execution as a potential solution. Binarly co-founder Alex Matrosov used symbolic execution during his time at Intel for automated path discovery to improve fuzzing coverage and automate test cases generation (“Excite project: All the truth about Symbolic Execution for BIOS security”). The Binarly team decided to move in a different direction to avoid limitations on adopting disassembly code analysis, which is primarily based on syntactic code properties. Leveraging concolic execution methods helps us to answer the questions such as: what specific value is needed to reach a vulnerability trigger or is there a way for a vulnerable function to be controlled by an attacker (path discovery and exploitability)?
The Binarly custom static analysis framework extracts the uniform SSA form (Static Single-Assignment) from the intermediate representation (IR) produced by the aforementioned framework. Binarly’s form of IR explicitly encodes instruction side-effects to make all of the code properties from the original disassembly code available for analysis.
Our approach works by leveraging the semantic properties extracted from the Binarly IR form and applying lightweight code checkers on that representation to provide hints for deeper analysis. The analysis of the whole firmware image, which usually contains hundreds of modules, takes only 4-6 seconds.
The approach we use for symbolic execution is inspired by the USENIX Security paper (“Sys: A Static/Symbolic Tool for Finding Good Bugs in Good (Browser) Code” - Brown et al., USENIX Security 2020). The paper introduced a new approach combining lightweight static code checkers and under-constrained symbolic execution. The static code analysis (checking) identifies potential candidates of vulnerable code places and under-constrained symbolic execution helps to validate that the discovered problems are exploitable bugs. In the following figure it is presented the annotated IR with types and EFI service information which helps to identify analysis entry-points based on the module type: SMI handlers (DXE/SMM modules) or PEI notification callbacks (PEI modules).
Using our framework, we can develop code checkers using an embedded Domain Specific Language (eDSL) that makes it possible to quickly add new checking techniques. Binarly code checkers are based on a lightweight static analysis defined using an eDSL:
We define static checkers using different code properties that are used to build a checking “state machine.” The code properties include inferred information about call-sites, service information, and common API usage. These annotations can be user-supplied for a given analysis task or automatically inferred by leveraging the Binarly custom static analysis framework.
The following types of semantic properties are used during Binarly code analysis:
- Control-flow properties (reachability)
- Data-flow properties (data-dependence)
- Inferred call-site properties (e.g., arguments passed, type information)
- Domain-specific annotations
- Service-specific (e.g., GetVariable variants in PEI and DXE phases)
- Common APIs (e.g., CopyMem, ZeroMem, etc.)
While static checkers provide a lightweight way to identify potential vulnerabilities, there is the risk they can introduce false positives. To filter these false positives, we apply symbolic execution to check for violations of code properties that should hold, with respect to the bug classes identified using our lightweight checkers.
We used this approach to discover and confirm vulnerabilities for the Black Hat USA 2022 research (“Breaking Firmware Trust From Pre-EFI: Exploiting Early Boot Phases”) and demonstrated for the first time the implementation of vulnerability checkers based on under-constrained symbolic execution applied to Pre-EFI firmware code.
Here is an example of the discovery of BRLY-2022-014/CVE-2022-32579 (misuse of GetVariable() leading to arbitrary write) discovered originally on newest Intel NUC M15 device:
Another example of the discovery of BRLY-2022-027/CVE-2022-28858 (classical misuse of GetVariable() without sanity check of DataSize):
Using our combined static and symbolic approach we can significantly reduce the number of false positives by detecting symbolic model violations augmented by semantic code properties (as shown on the figure below).
In previous demonstrations, we discussed pre-EFI vulnerabilities, but the same approach is also effective for discovering classical SMI handler vulnerabilities. Let’s take a look at the detection of BRLY-2022-016/CVE-2022-33209 (classical stack overflow in SMI handler) vulnerability. The figure below shows the discovery of a buffer overflow in the SMI handler and the automated type reconstruction of the CommBuffer structure.
Binarly team used the BRLY-2022-027 (INTEL-SA-00712) vulnerability from the 12 additional bugs disclosed at Black Hat to perform the previously discussed Intel PPAM attack on a specific HP device (“Black Hat 2022: The Intel PPAM Attack Story”).
In this blog we focused on describing the automated discovery process of such vulnerabilities with symbolic execution. While pattern-based approaches will always be the most lightweight option to detect known bugs at scale, the techniques described here can complement such approaches by reducing false positives, and aid in identifying previously unknown vulnerabilities from known classes of bugs.
Discovering repeatable failures in firmware without automation is a difficult proposition. We are constantly improving our detection methods in order to keep up with the current and future threats. The Binarly team is constantly working to protect the firmware supply chain and reduce the attack surfaces of our customers industry-wide by delivering innovative technologies to the market.
Are you interested in learning more about Binarly Platform or other solutions? Don't hesitate to contact us at firstname.lastname@example.org.