December 27, 2021

The Firmware Supply-Chain Security is broken: Can we fix it?

Binarly Team

At the beginning of December, Binarly was very active in spreading the word about the problems in the firmware supply chain ecosystem at multiple security conferences. Alex Matrosov, the Binarly CEO, gave a keynote entitled “The Evolution of Threat Actors: Firmware is the Next Frontier” at the AVAR conference, in which he focused on the evolving threats coming from historically overlooked places below the operating system.

Another talk, entitled “The Firmware Supply-Chain Security is broken: Can we fix it?”, was presented at the Open Source Firmware Conference by the Binarly team in collaboration with our friends from Immune and Richard Hughes, the creator of the Linux Vendor Firmware Service (LVFS). The talk laid out multiple industry problems and proposed a potential solution that could help the industry fix them at scale. With this research, we wanted to raise awareness of the risks in the firmware supply chain and the complexity of fixing known vulnerabilities. A deeper analysis of the firmware ecosystem showed us that the failure patterns are repeatable.

Why is the firmware supply chain broken?

Nowadays, it’s difficult to find a hardware vendor who develops all the components present in its products. Many of these components, including their firmware, are outsourced to Original Design Manufacturers (ODMs). This limits the ability of hardware vendors to have complete control over their products. In addition to creating extra supply chain security risks, it also introduces security gaps in the threat modeling process. As more parties become involved in firmware development, the security risks increase exponentially.

Another visible issue is the asynchronous delivery of firmware security fixes from multiple parties. Firmware patch cycles typically last around 6-9 months (sometimes even longer) due to the complexity of the firmware supply chain and the lack of a uniform patching process. The lack of transparency in vendors’ security advisories makes them a very opaque channel for telling customers how critical the released security fixes are. As a result, when dealing with enterprise infrastructure, it is difficult to estimate the impact and measure it against a specific threat model. Because of that, already known vulnerabilities can remain unpatched for longer than they should. This is why 1-day and N-day vulnerabilities (known vulnerabilities with a CVE number assigned) often have a large impact on enterprises: either the latest firmware update wasn’t installed, or the device vendor hasn’t released a patch yet.

Each vendor follows its own patch cycle, so even known issues may not be patched until the next firmware update is available.

In our study, we found that the asynchronous nature of firmware delivery across the supply chain can produce side effects whose security impact is equivalent to a race condition attack. The classical race condition attack, time-of-check to time-of-use (TOCTOU), takes advantage of systems that must execute tasks in a certain order. After the system carries out one task, such as a check, some time passes before it starts the next one, such as using the resource it just checked. An attacker can exploit this gap to change the state or disrupt the original order of the tasks, leading to successful exploitation.
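The gap between check and use can be made concrete with a small Python sketch. Everything here is hypothetical and for illustration only: the `store` dictionary stands in for a staging area, and the `between` callback models whatever happens in the time gap. It is not a model of any real update mechanism.

```python
import hashlib

# Hypothetical allowlist holding the digest of the one image we trust.
TRUSTED = {hashlib.sha256(b"signed update v2").hexdigest()}


def check(store, name):
    """Time of check: is the staged image on the trusted list?"""
    return hashlib.sha256(store[name]).hexdigest() in TRUSTED


def use(store, name):
    """Time of use: 'flash' whatever is staged right now."""
    return store[name]


def install(store, name, between=None):
    """Check, then use. `between` models activity in the time gap."""
    if not check(store, name):
        raise ValueError("untrusted image")
    if between is not None:
        between(store)
    return use(store, name)


def attacker(store):
    # Swap the staged image after it was checked, before it is used.
    store["update.bin"] = b"malicious image"


store = {"update.bin": b"signed update v2"}
flashed = install(store, "update.bin", between=attacker)
print(flashed)  # the check passed, yet a different image was used
```

The check itself is correct; the failure comes purely from the state changing between the two steps, which is the property the supply chain analogy relies on.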

The same type of problem exists in the firmware security supply chain when the firmware development process includes multiple parties. It is necessary for the device vendor to be in sync with the patches of multiple vendors or contributors across the hardware platform or firmware components.

In the figure below, we show different points of failure where asynchronous patch cycles can lead to firmware supply chain race condition failures and the release of the latest firmware update can lead to the distribution of 1-day or N-day unpatched vulnerabilities.

Figure 1

The Firmware Supply Chain Race Condition - the race (asynchronous activities) between the patched vulnerabilities and upcoming fixes from third parties against the device vendor's firmware update schedule.

The Firmware Supply Chain Race Condition
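To illustrate the race, here is a minimal Python sketch under stated assumptions: a device vendor snapshots third-party code at a "freeze" date and ships the update later, and an upstream fix published after the freeze misses that release. The dates, the freeze/ship model, and the function names are all hypothetical and not taken from any vendor's real process.

```python
from datetime import date


def ships_known_vulnerable(fix_date, freeze, ship):
    """True if the fix was public before the release shipped,
    but too late to be included in it (published after the freeze)."""
    return freeze < fix_date <= ship


def exposure_days(fix_date, releases):
    """Days between the upstream fix and the first vendor release that
    contains it. `releases` is a list of (freeze_date, ship_date)."""
    for freeze, ship in sorted(releases, key=lambda r: r[1]):
        if freeze >= fix_date:
            return (ship - fix_date).days
    return None  # no scheduled release picks up the fix


# Hypothetical vendor schedule: two updates, roughly six months apart.
RELEASES = [
    (date(2021, 1, 10), date(2021, 3, 1)),
    (date(2021, 7, 10), date(2021, 9, 1)),
]
FIX_PUBLISHED = date(2021, 1, 20)  # hypothetical upstream fix date

for freeze, ship in RELEASES:
    print(ship, "ships N-day:", ships_known_vulnerable(FIX_PUBLISHED, freeze, ship))
print("exposure (days):", exposure_days(FIX_PUBLISHED, RELEASES))
```

In this made-up schedule, a fix that lands ten days after the freeze turns the "latest" March update into a carrier of an already public vulnerability, and the device stays exposed until the following release.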

The impact of a particular vulnerability, in many cases, differs based on the perspective of the affected parties. As an example, a Denial of Service (DoS) vulnerability in firmware (CVE-2021-21557) can lead to the disruption of the affected device or, even worse, to Permanent DoS (PDoS). In many cases, such an issue is rated as medium or low severity. Nevertheless, if the device is part of an Industrial Control System (ICS), belongs to the Industrial Internet of Things (IIoT), or is a medical device, its disruption can have a critical impact.

The same vulnerability can have a different severity for different parties, depending on the threat model.

In practice, we can only see the firmware vendor's severity score and determine the criticality based on that. It is common for the same vulnerability (literally the same) to receive a different severity from different vendors. As a result, the end customer frequently gets a distorted picture of the real risk.
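One way to make such divergence visible is to group advisory scores per vulnerability and flag the cases where vendors land in different severity bands. The sketch below uses the CVSS v3.x qualitative rating scale for the bands; the identifiers, vendor names, and scores are invented placeholders, not data from real advisories.

```python
from collections import defaultdict


def rating(score):
    """Map a CVSS v3.x base score to its qualitative severity rating."""
    if score == 0.0:
        return "None"
    if score < 4.0:
        return "Low"
    if score < 7.0:
        return "Medium"
    if score < 9.0:
        return "High"
    return "Critical"


def divergent(advisories):
    """Return vulnerabilities that vendors rated in different bands.
    `advisories` is an iterable of (vuln_id, vendor, base_score)."""
    by_id = defaultdict(dict)
    for vuln_id, vendor, score in advisories:
        by_id[vuln_id][vendor] = score
    report = {}
    for vuln_id, scores in by_id.items():
        ratings = {rating(s) for s in scores.values()}
        if len(ratings) > 1:
            report[vuln_id] = sorted(ratings)
    return report


# Hypothetical advisories for illustration only.
ADVISORIES = [
    ("VULN-A", "vendor-1", 5.3),
    ("VULN-A", "vendor-2", 8.2),
    ("VULN-B", "vendor-1", 7.5),
    ("VULN-B", "vendor-3", 7.8),
]
print(divergent(ADVISORIES))
```

A report like this does not say which vendor is right; it only shows where a customer reading a single advisory would come away with a different risk picture than a customer reading another.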

What does the Firmware Supply Chain Race Condition look like in practice? A combination of vulnerabilities or exploitation primitives can form a successful attack vector for compromising firmware. The various sources in the firmware supply chain can introduce known vulnerabilities at different points in time.

Figure 2

As a result, potential attackers can discover the same vulnerabilities on different devices multiple times.

There is a significant decrease in the cost of deploying a firmware attack with a 1/N-day vulnerability compared to leveraging a new vulnerability (0-day).

There are many examples in the wild that confirm this theory - Lojax and CVE-2014-8273 are one such example. Another example was provided in Alex Matrosov's talk entitled “Betraying the BIOS: Where the Guardians of the BIOS are Failing” at Black Hat 2017 in Vegas. He highlighted multiple vulnerabilities, such as CVE-2017-3197 and CVE-2017-11312, that were found on multiple devices from different vendors (including enterprise devices).

Figure 3

Some of the aforementioned examples represent rediscovered vulnerabilities from previously published advisories. We're talking about vulnerabilities reported in 2017 and their constant rediscovery, but, unfortunately, not much has changed in the state of the firmware supply chain security since then.

Why is the responsible disclosure process broken?

Behind the scenes of a vulnerability disclosure, researchers often try to do the right thing by reporting vulnerabilities, while the affected vendors frequently respond by protecting their own interests. These vendors usually try to downplay the impact of the reported vulnerabilities; we see this behavior frequently in our own disclosures. This can lead to very strange situations in which vendors update old advisories with new information that was not previously present. Then, they try to minimize the impact of the reported issue by claiming it was previously reported and patched (linking newly reported issues to old CVEs).

How can so many devices be affected by the same vulnerability if those issues were already patched at the time of the initial report?

So far, the weirdest behavior we've seen is combining multiple vulnerabilities into one CVE, even when the vulnerabilities are completely unrelated and were discovered in different modules.

Not all firmware vendors are the same, and many of them aim to do their best to protect their customers. Unfortunately, the disclosure experience we described earlier was not unique. This complicates the disclosure process, but the worst part is that it obscures the risks and impacts to the end users. When vulnerabilities in third-party firmware code are mapped to a single CVE affecting multiple modules, the device vendor is at risk of misinterpreting the fixes and not patching all the vulnerabilities in their firmware. The American Megatrends (AMI) UsbRt issue is one of the many examples of failed disclosures.

Disclosing the vulnerabilities and informing the customers of the affected firmware or device is the most important step in making sure devices are not left unpatched for years. Understanding the impact and scope of the affected parties at scale is the most challenging part of each vulnerability disclosure.

Why do firmware 1-day/N-day vulnerabilities persist for so long?

The main difference between a 0-day and a 1/N-day vulnerability is that, for a 1/N-day issue, an attacker can use public sources to learn technical details or find pointers for writing an exploit. Binary diffing of firmware images is a pretty straightforward way to learn what changed in the code between patched and unpatched modules. Because they are already publicly disclosed, often with technical details available to an attacker, 1/N-day vulnerabilities are frequently even more dangerous than 0-day vulnerabilities.
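The first triage step of such diffing can be sketched in a few lines of Python. This is a simplified illustration: it assumes the firmware images have already been unpacked into named modules, and the module contents below are made-up bytes. Real binary diffing then compares the disassembled functions of the changed modules, which this sketch does not attempt.

```python
import hashlib


def changed_modules(old, new):
    """Names of modules present in both images whose bytes differ.
    `old` and `new` map module name -> raw module bytes."""
    def digest(data):
        return hashlib.sha256(data).hexdigest()
    common = old.keys() & new.keys()
    return sorted(n for n in common if digest(old[n]) != digest(new[n]))


def first_difference(a, b):
    """Offset of the first differing byte, or None if identical."""
    for offset, (x, y) in enumerate(zip(a, b)):
        if x != y:
            return offset
    return None if len(a) == len(b) else min(len(a), len(b))


# Hypothetical module contents; the bytes are placeholders.
OLD = {"UsbRt": b"\x90\x90\x48\x8b\x01", "Setup": b"\x00\x01"}
NEW = {"UsbRt": b"\x90\x90\x48\x85\xc9\x74\x03\x48\x8b\x01", "Setup": b"\x00\x01"}

for name in changed_modules(OLD, NEW):
    print(name, "first differs at offset", first_difference(OLD[name], NEW[name]))
```

Even this crude comparison narrows an entire firmware image down to the handful of modules a patch touched, which is where an attacker would start looking.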

One good example is the AMI UsbRt vulnerability, which is already five years old but is still present in newer devices. The initial UsbRt vulnerability was discovered in 2016. However, due to the complexity of the code, more variants of the initial vulnerability were discovered later. The Binarly efiXplorer team recently discovered and reported some of those variants on fairly new enterprise-grade devices.

Figure 4

The UEFI system firmware was intended to replace the legacy BIOS and bring more consistency and transparency to the firmware development ecosystem. However, the significant growth of the code base in modern system firmware, combined with the lack of transparency, has brought other issues. The legacy BIOS firmware has been phased out over more than a decade, and UEFI has now become the new legacy firmware in many cases. The legacy of supply chain failures now lives inside the UEFI firmware ecosystem.

In our OSFC talk, we presented our vision for building an open-source framework that identifies known vulnerabilities in the context of UEFI specifics, classifies them based on their impact, and detects them at scale across the whole firmware ecosystem with the help of the LVFS project and the Binarly SaaS Platform.

How can we fix firmware supply chain failures?

Recently, there has been a lot of positive activity to create a more transparent and verifiable supply chain process. The Software Bill of Materials (SBOM) helps to build a dependency graph and identify the scope for upcoming fixes. In the firmware layer, the same type of activities is taking place to develop a traceable firmware Bill of Materials. However, no silver bullet exists at this time. Most components of the UEFI firmware are part of proprietary frameworks that come from Independent BIOS Vendors (IBVs). In this case all traceable artifacts can only be built based on integrity verification, which doesn't provide enough transparency at the code level. We have previously discussed in our blog the issues with blindly trusting black box firmware even when it's properly signed (“Why Firmware Integrity is Insufficient for Effective Threat Detection and Hunting”).

Integrity doesn't provide code level visibility, and scoping dependencies based on a binary module's hash is difficult. That's especially true when dependencies are indirect and buried in code abstraction layers.

It is also important to remember that attestation based on TPM has a lot of limitations in terms of runtime visibility. The result of runtime exploitation (payload execution) of any firmware vulnerability discussed in this and our previous blog posts can’t be measured, and the TPM PCRs will not be extended to detect such threats.

Remote health attestation will not detect active exploitation on affected systems, and the runtime and device health attestation mechanisms have vulnerabilities of their own.

As an example, a recent discovery describes how an attacker can manipulate blank PCR values to break device health attestation on Microsoft Surface devices (CVE-2021-42299: TPM Carte Blanche).
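Both limitations follow from how a PCR works: its value only changes through an extend operation, which hashes the old value together with a new measurement digest. The Python sketch below is a heavily simplified model of a single SHA-256 PCR. The measurement names and the `replay` helper are hypothetical, and a real TPM adds PCR banks, event log formats, and signed quotes that are not modeled here.

```python
import hashlib

BLANK = bytes(32)  # a SHA-256 PCR that nothing has extended yet


def extend(pcr, measurement):
    """TPM-style extend: new = SHA-256(old || SHA-256(measurement))."""
    return hashlib.sha256(pcr + hashlib.sha256(measurement).digest()).digest()


def replay(log, pcr=BLANK):
    """Fold a list of measurements into a PCR value, in order."""
    for measurement in log:
        pcr = extend(pcr, measurement)
    return pcr


# Hypothetical boot measurements of a healthy device.
healthy_log = [b"firmware volume", b"boot loader", b"os kernel"]
expected = replay(healthy_log)

# 1) Runtime exploitation: firmware state changes after boot, but nothing
#    calls extend(), so the PCR still shows the healthy value.
pcr_after_boot = replay(healthy_log)
runtime_state = b"firmware volume + injected payload"  # never measured
assert pcr_after_boot == expected

# 2) Blank PCR: if the platform never extended the PCR, whoever gets to
#    it first can replay the healthy log and reach the expected value.
attacker_pcr = replay(healthy_log, pcr=BLANK)
print(attacker_pcr == expected)
```

The second case is the intuition behind a blank-PCR attack as we understand it: the extend operation proves that certain digests were fed in, not who fed them or whether they describe the code that actually ran.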

This year alone, the Binarly efiXplorer team reported over fifty high and critical severity vulnerabilities through our collaboration with CERT/CC (https://www.kb.cert.org/vuls/).

Most of those vulnerabilities affect more than a single vendor, which turns each report into an industry-wide disclosure process. CERT/CC and their VINCE system allow multiple parties to be notified at the same time, significantly shortening the initial engagement with the impacted parties.

Figure 5

VINCE should become the industry standard for disclosing vulnerabilities, especially when multiple parties are involved in an industry-wide disclosure process. In most cases, an issue reported to a single vendor impacts multiple parties. The vendor that receives the initial disclosure very rarely does a good job of coordinating with the other parties without missing timelines. That’s exactly why the problems we named firmware supply chain race conditions exist. There should be an independent oracle, such as VINCE, that keeps the disclosed information consistent for all impacted vendors and makes disclosure and coordination more effective.

As a result of the partnership between LVFS and Binarly, we are able to identify the impact and scope of the affected parties at scale.

Scoping impacted vendors is one of the most important and challenging parts of each vulnerability disclosure process. That's another reason why we developed the FwHunt rule format, which detects vulnerable code patterns and supports semantic annotations. We’ve previously discussed in our blog the reasoning behind the FwHunt rule format and its advantages over a YARA-based approach (Detecting Firmware vulnerabilities at scale: Intel BSSA DFT case study).

The Binarly Firmware Hunt (FwHunt) rule format was designed to scan for known vulnerabilities and verify that an affected OEM or vendor has patched the issue in its latest update.
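The sketch below is not the FwHunt rule format and does not use the FwHunt scanner; the actual specification lives in Binarly's repository. It is a toy Python illustration of the underlying idea, with invented byte patterns and module contents: a rule that describes a vulnerable code pattern can scope multiple vendor builds and confirm a patch, where a list of known-bad hashes cannot.

```python
import hashlib

# Hypothetical "vulnerable code" bytes and their patched replacement.
VULNERABLE = b"\x48\x8b\x01\xff\xd0"
PATCHED = b"\x48\x85\xc9\x74\x05\x48\x8b\x01\xff\xd0"

# Toy rule: a module matches if it contains every listed pattern
# and none of the excluded ones.
RULE = {"name": "toy-rule", "all_of": [VULNERABLE], "none_of": [PATCHED]}


def rule_match(rule, module):
    """True if the module still contains the vulnerable pattern."""
    return (all(p in module for p in rule["all_of"])
            and not any(p in module for p in rule["none_of"]))


def hash_match(known_bad, module):
    """True only for byte-identical copies of a known-bad module."""
    return hashlib.sha256(module).hexdigest() in known_bad


# Two vendors ship the same vulnerable code in different builds.
build_a = b"vendorA\x00" + VULNERABLE + b"\xcc" * 4
build_b = b"vendorB-build-17\x00" + VULNERABLE
build_a_fixed = b"vendorA\x00" + PATCHED + b"\xcc" * 4

known_bad = {hashlib.sha256(build_a).hexdigest()}

for name, module in [("A", build_a), ("B", build_b), ("A fixed", build_a_fixed)]:
    print(name, "hash:", hash_match(known_bad, module),
          "rule:", rule_match(RULE, module))
```

The hash list only recognizes the exact build it was created from, while the pattern rule flags both vendor builds and stops matching once the fix is in place, which is the verification step described above.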

Binarly FwHunt rules are based on an open specification and stored in our GitHub repository. Binarly disclosures help vendors identify all vulnerable devices using the FwHunt technology. After the advisory becomes public and the vendor releases the patch, we will publish the FwHunt rule. Customers of the Binarly SaaS platform can be notified about the impact and the vulnerable devices earlier, so they can prepare in advance for upcoming fixes.

Disclosing and fixing a vulnerability is one side of the problem. The other is to deliver this patch at scale to as many systems as possible in the field.

Here is the result of the successful detection of CVE-2017-5721 (the original UsbRt vulnerability rule) with a FwHunt rule and the uefi_r2 scanner:

Figure 6

We will continue our mission with our partners to help the industry recover from the constant supply chain failures described above. Additionally, the Binarly team is contributing to the Open Source Firmware Foundation, particularly by leading the Firmware Security Workstream.

Figure 7
