At Black Hat USA 2021, Binarly CEO Alex Matrosov, together with Nvidia security researchers Alex Tereshkin and Adam 'pi3' Zabrocki, presented their findings in the talk “Safeguarding UEFI Ecosystem: Firmware Supply Chain is Hard(coded)”, highlighting five high-severity vulnerabilities that affected the whole UEFI ecosystem.
In our previous blog post “Firmware Supply Chain is Hard(coded)”, we discussed the problems of the UEFI firmware ecosystem, using an attack on Intel reference code as an example. In this second post of the series, we continue introducing new attacks on the (pre)EFI ecosystem. Historically, both vendors and attackers have overlooked (pre)EFI boot process (in)security, yet the Pre-EFI Initialization (PEI) boot stage opens many doors and offers flexibility to attackers.
In terms of impact, an attacker who exploits Pre-EFI vulnerabilities can execute arbitrary code during the PEI stage and influence the subsequent boot stages. This can expose physical memory contents (including the Virtual Machine Control Structure (VMCS)), reveal secrets from any Virtual Machine (VM), and bypass memory isolation and confidential computing boundaries. Additionally, an attacker can build a malicious payload that can be injected into SMRAM, the memory of System Management Mode (SMM).
As a reference code vulnerability, Intel BSSA DFT affects the whole industry, not just a single vendor. The original code developed by a hardware vendor typically represents around 5-7% of the entire code base. The rest consists of Intel/AMD reference code and third-party UEFI firmware frameworks (e.g., AMI/Phoenix/Insyde).
This is why vulnerabilities such as INTEL-SA-00525 are so dangerous and why their patch cycle is so long.
The timeline for releasing a fix for such vulnerabilities usually looks like this:
- Researchers report the issue to Intel or device vendors.
- Intel analyzes the issue and works on the fix.
- Intel releases the fix for its reference code under NDA to IBV, ODM and OEM vendors.
- ODMs and OEMs schedule the fix for the next patch cycle window.
- The end customer needs to apply the fix in their infrastructure.
The entire window from reporting the issue to the fix usually takes around 6-8 months and can sometimes extend to 12+ months, depending on the issue. Another problem is delivering the fix to end users. The firmware update may take even longer to release because the hardware vendors responsible for shipping it may face delays on their end. Even for critical vulnerabilities, different vendors react with varying urgency. In the end, the time frame for the fix to reach users is more than one year, and applying the fixes to the infrastructure can take even longer. Very frequently, we see enterprise networks where the firmware is never updated once the devices are deployed. The scary part is that a 1-day issue can turn into a partial 0-day on many devices because of the delayed security fix.
Intel's BSSA DFT vulnerability affects the entire industry, not just one vendor.
Impact: 7.5 HIGH (CVSS:3.1/AV:L/AC:H/PR:H/UI:N/S:C/C:H/I:H/A:H)
Scope: Industry-wide
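As a sanity check, the 7.5 score can be reproduced from the vector string with the CVSS v3.1 base score equations. The sketch below hard-codes the metric weights that the CVSS v3.1 specification assigns to this one vector (AV:L = 0.55, AC:H = 0.44, PR:H with changed scope = 0.50, UI:N = 0.85, High impact = 0.56). The function names are ours, and this is not a general-purpose calculator.

```c
/* CVSS v3.1 "Roundup": smallest value with one decimal that is >= x,
   computed on integers to avoid floating-point artifacts. */
static double cvss_roundup(double x)
{
    long scaled = (long)(x * 100000.0 + 0.5);
    if (scaled % 10000 == 0)
        return (double)scaled / 100000.0;
    return (double)(scaled / 10000 + 1) / 10.0;
}

/* x raised to a small non-negative integer power (avoids linking libm). */
static double ipow(double x, int n)
{
    double r = 1.0;
    while (n-- > 0)
        r *= x;
    return r;
}

/* Base score for AV:L/AC:H/PR:H/UI:N/S:C/C:H/I:H/A:H. */
double bssa_dft_base_score(void)
{
    const double av = 0.55, ac = 0.44, pr = 0.50, ui = 0.85;
    const double c = 0.56, i = 0.56, a = 0.56;

    double iss = 1.0 - (1.0 - c) * (1.0 - i) * (1.0 - a);
    /* Scope: Changed uses the modified impact formula and the 1.08 factor. */
    double impact = 7.52 * (iss - 0.029) - 3.25 * ipow(iss - 0.02, 15);
    double exploitability = 8.22 * av * ac * pr * ui;
    double raw = 1.08 * (impact + exploitability);

    if (impact <= 0.0)
        return 0.0;
    return cvss_roundup(raw < 10.0 ? raw : 10.0);
}
```

The unrounded value is roughly 7.44, which the specification's round-up rule turns into 7.5.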
Unfortunately, not all of the affected vendors have issued a security advisory to notify their customers about the need for an update. One of our previous blogs, "Why Firmware Integrity is insufficient for Effective Threat Detection and Hunting", discusses the limitations of legacy approaches focused on integrity monitoring and attestation, and their lack of visibility into the root cause of the problems. In our next blog, we will introduce a method for detecting such vulnerabilities at scale and discuss the industry challenges related to patching such issues. Let's now dive into the technical details of the Intel BSSA DFT vulnerability.
While reverse engineering the targeted firmware image, we discovered a feature in the Intel reference code that loads and runs arbitrary unsigned code during the very early boot stage, without taking secure boot settings into account. Such functionality allows an attacker to install a persistent implant in the firmware. The feature is intended to be used by developers for debugging or hotfixing purposes. The Intel BIOS Shared SW Architecture (BSSA) Design for Test (DFT) feature can be used to set up software hooks for various POST codes.
While reverse engineering the UncoreInitPeim module and INTEL-SA-00463 (CVE-2020-24486), one of the vulnerabilities described in our previous blog post “Firmware Supply Chain is Hard(coded)”, we found a reference to the string “BIOS Shared SW Architecture (BSSA) Module Loader”. Seeing the words "Module" and "Loader" next to each other in that name, we were intrigued and eager to investigate what this feature is.
While researching the EFI variables dumped from the target machine, we discovered some unusual names. These NVRAM variables are parsed during the PEI boot stage, which got our attention.
In comparison to the DXE phase, PEI NVRAM access, by design, is considered read-only. The figure below shows the PEI code that reads configuration data from one of those NVRAM variables.
Because of this read-only nature, the attack surface that EFI variables expose during the PEI phase is frequently overlooked or considered not exploitable by firmware developers. But the attack surface coming from EFI variables in PEI has the same critical impact as the EFI variable attack vector in the DXE phase.
In many cases, arbitrary code execution during the PEI boot phase can cross security boundaries and cause significant damage. Let's go back to the code that reads the "syscg" EFI variable.
int __cdecl ExecuteTargetOnlyCmd(int host)
{
    ...
    char syscg_stack[2048]; // [esp+Ch] [ebp-81Ch] BYREF
    UINTN DataSize; // [esp+80Ch] [ebp-1Ch] MAPDST BYREF
    ...
    // DataSize may become > 2048 bytes if GetVariable() call fails with EFI_BUFFER_TOO_SMALL.
    // That is the case if "syscg" NVRAM variable is longer than 2048 bytes.
    DataSize = 2048;
    ...
    dbgprint_0(host, "Inside ExecuteTargetOnlyCmd() \n");
    PeiServices = *(host + 142696);
    (*PeiServices)->LocatePpi(PeiServices, &gSsaGuid, 0, 0, &Ppi);
    (*PeiServices)->LocatePpi(PeiServices, &gReadOnlyVariable2Guid, 0, 0, &ReadOnlyPpi);
    ZeroMem(syscg_stack, 2048);
    // no status check after GetVariable()
    ReadOnlyPpi->GetVariable(ReadOnlyPpi, L"syscg", &gSsaBiosVariablesGuid, 0, &DataSize, syscg_stack);
    // no check for a return value of AllocatePool()
    syscg = AllocatePool(DataSize);
    // memcpy() will not overflow syscg, but may copy stack memory into it
    memcpy(syscg, syscg_stack, DataSize);
    ...
}
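For contrast, a defensive version of such a read could look like the sketch below. It does not use the real PEI services: mock_get_variable() is a hypothetical stand-in that imitates only the behavior that matters here, namely that GetVariable() returns EFI_BUFFER_TOO_SMALL and writes the required size back into DataSize when the caller's buffer is too short. The status enum and all function names are ours.

```c
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

typedef enum { ST_SUCCESS, ST_BUFFER_TOO_SMALL, ST_OUT_OF_RESOURCES } status_t;

/* Hypothetical stand-in for GetVariable(): copies var into buf when it fits;
   otherwise copies nothing and reports the required size through *size. */
static status_t mock_get_variable(const unsigned char *var, size_t var_len,
                                  size_t *size, void *buf)
{
    if (var_len > *size) {
        *size = var_len;               /* the caller's size is overwritten */
        return ST_BUFFER_TOO_SMALL;
    }
    memcpy(buf, var, var_len);
    *size = var_len;
    return ST_SUCCESS;
}

/* Reads a variable into a fixed stack buffer and returns a heap copy.
   On any failure *out stays NULL and nothing is copied. */
status_t read_config_checked(const unsigned char *var, size_t var_len,
                             unsigned char **out, size_t *out_len)
{
    unsigned char stack_buf[2048];
    size_t size = sizeof(stack_buf);
    status_t st;

    *out = NULL;
    *out_len = 0;

    st = mock_get_variable(var, var_len, &size, stack_buf);
    if (st != ST_SUCCESS)              /* status is checked before size is used */
        return st;
    if (size > sizeof(stack_buf))      /* defense in depth */
        return ST_BUFFER_TOO_SMALL;

    *out = malloc(size ? size : 1);
    if (*out == NULL)                  /* allocation result is checked */
        return ST_OUT_OF_RESOURCES;

    memcpy(*out, stack_buf, size);
    *out_len = size;
    return ST_SUCCESS;
}
```

With a variable longer than 2048 bytes, the function returns ST_BUFFER_TOO_SMALL before any allocation or copy takes place, so no stale stack contents are exposed.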
From an exploitation point of view, these bugs are not particularly useful but, at a closer look, this code may be full of surprises. Taking a few more hours to review this code, we realized that the GetVariable() API can be even more dangerous when called in a loop.
If the return status of the API call is not checked, an attacker may change DataSize and overflow the buffer on the next iteration with attacker-controlled malformed data.
One such instance of GetVariable() API that is called in a loop was found in the ReadChunkedData() function shown in the figure below.
The code in the figure below reads a chain of EFI variables. Each EFI variable contains, at the beginning of its data, the name of the next variable to be read (10 bytes representing 5 Unicode characters). The DataSize variable, which an attacker can control, is not reinitialized before the GetVariable() call, so an attacker may change the DataSize value and overflow the buffer on the next iteration:
The ChunkName buffer is stored on the stack (6 Unicode characters). When the GetVariable() call encounters a large EFI variable (length > DataSize), DataSize is overwritten with the actual length. On the next iteration, a stack buffer overflow occurs because GetVariable() assumes that the ChunkName buffer can hold DataSize bytes.
This stack buffer overflow is straightforward to exploit: there are no stack cookies and no ASLR in the PEI phase, and memory corruption vulnerabilities are pretty easy to exploit without any runtime mitigations.
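The effect of a stale DataSize in a loop can be reproduced in a small user-space simulation. The sketch below is not the firmware code: the variable store, the 64-byte buffer, the retry of the same variable after a failed read, and the use of 5-byte ASCII names instead of 5 Unicode characters are all simplifying assumptions. With reset_size set, the loop follows the safe pattern (size reinitialized, status checked). With reset_size cleared, it imitates the vulnerable pattern and reports where the oversized copy would happen instead of performing it.

```c
#include <stddef.h>
#include <string.h>

#define NAME_LEN 5    /* chunk names are 5 characters long */
#define BUF_CAP  64   /* illustrative buffer capacity */

struct var { const char *name; const unsigned char *data; size_t len; };

enum { WALK_ERROR = -1, WALK_WOULD_OVERFLOW = -2 };

static const struct var *find_var(const struct var *store, size_t n,
                                  const char *name)
{
    for (size_t i = 0; i < n; i++)
        if (strncmp(store[i].name, name, NAME_LEN) == 0)
            return &store[i];
    return NULL;
}

/* Walks the chain starting at `first` and returns the number of chunks read,
   or a negative WALK_* code. */
int walk_chain(const struct var *store, size_t n, const char *first,
               int reset_size)
{
    unsigned char buf[BUF_CAP];
    char name[NAME_LEN + 1];
    size_t size = sizeof(buf);
    int chunks = 0;

    memcpy(name, first, NAME_LEN);
    name[NAME_LEN] = '\0';

    while (chunks < 16) {                 /* bound the walk */
        const struct var *v = find_var(store, n, name);
        if (v == NULL)
            return chunks;                /* end of chain */
        if (reset_size)
            size = sizeof(buf);           /* safe: reinitialize every time */

        if (v->len > size) {              /* "buffer too small" case */
            size = v->len;                /* required size is written back */
            if (reset_size)
                return WALK_ERROR;        /* safe: status is honored */
            continue;                     /* vulnerable: status is ignored */
        }
        if (v->len > sizeof(buf))
            return WALK_WOULD_OVERFLOW;   /* stale size admits an oversized copy */
        if (v->len < NAME_LEN)
            return WALK_ERROR;

        memcpy(buf, v->data, v->len);
        size = v->len;
        memcpy(name, buf, NAME_LEN);      /* next name sits at the start of the data */
        chunks++;
    }
    return WALK_ERROR;
}
```

A chain that contains one 200-byte variable ends with WALK_ERROR in the safe mode and with WALK_WOULD_OVERFLOW in the vulnerable mode, which is the point where real firmware would write past the end of the stack buffer.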
An updated version of the Binarly efiXplorer plugin includes a more finely tuned vulnerability checker that can detect GetVariable() misuse in both the DXE and PEI boot phases.
We found a few cool vulnerabilities, but we don’t even need to exploit this stack overflow to make our unsigned code execute during PEI.
These hidden features of Intel BSSA were designed to run arbitrary unsigned code blobs stored in EFI variables! To make matters worse, Intel BSSA DFT is a part of the reference code.
Let's explore the Uncore unsigned binary blob loader. In the figure below, EvLoadTool() walks the EFI variable chain starting from “toolh” and builds a contiguous 32-bit PE image. The payload may be 100 KB in size or even more; the available NVRAM space sets the upper limit. The routine Entry() executes the binary blob from the PE entry point.
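To make the chain walk concrete, here is a minimal user-space sketch of how chunked variables can be stitched into one contiguous buffer. It is not the EvLoadTool() code. It assumes that each variable starts with the 10-byte name of the next chunk followed by payload bytes (the layout described for ReadChunkedData()), and every identifier in it is ours.

```c
#include <stddef.h>
#include <string.h>

#define NAME_BYTES 10   /* 5 two-byte characters naming the next chunk */

struct chunk {
    unsigned char name[NAME_BYTES];   /* variable name as raw bytes */
    const unsigned char *data;        /* variable contents */
    size_t len;
};

/* Concatenates the payload parts of a chunk chain into out[0..cap).
   Returns the number of bytes assembled, or 0 on any inconsistency. */
size_t assemble_chain(const struct chunk *store, size_t n,
                      const unsigned char first[NAME_BYTES],
                      unsigned char *out, size_t cap)
{
    unsigned char next[NAME_BYTES];
    size_t total = 0;

    memcpy(next, first, NAME_BYTES);
    for (size_t hops = 0; hops <= n; hops++) {
        const struct chunk *c = NULL;
        for (size_t i = 0; i < n; i++)
            if (memcmp(store[i].name, next, NAME_BYTES) == 0)
                c = &store[i];
        if (c == NULL)
            return total;                       /* no such variable: chain ends */
        if (c->len < NAME_BYTES || c->len - NAME_BYTES > cap - total)
            return 0;                           /* malformed or too large */
        memcpy(out + total, c->data + NAME_BYTES, c->len - NAME_BYTES);
        total += c->len - NAME_BYTES;
        memcpy(next, c->data, NAME_BYTES);      /* follow the link */
    }
    return 0;                                   /* more hops than variables: cycle */
}
```

The sketch caps the total size and rejects cyclic chains. Nothing in it checks a signature, which mirrors the core problem: the assembled bytes are treated as code solely because they were found in NVRAM.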
Intel BSSA DFT was intended to be used for debugging or testing purposes only, and the feature needs to be disabled in production environments. In practice, we have noticed that a lot of vendors either forget to disable it or leave it enabled on purpose in production devices.
The official Intel guidance states that this feature should not be enabled if a physical presence has not been established.
Vendors of UEFI firmware are supposed to implement their own code for establishing physical presence. A physical jumper setting may be used, for example. In reality, a physical jumper has a direct impact on pricing, which is why we are only aware of software-based approaches to controlling this feature.
"Reference implementations" often become the de facto implementation — due diligence with proper (safe) defaults should be the norm.
The key to successful exploitation of Intel BSSA DFT is shown in the following figure.
Intel implemented an inlined stub function for the physical presence check that always returns TRUE(1). Shouldn’t it "return FALSE(0)" by default? This is the result of a mistake by a developer who forgot to set the boolean return value to FALSE(0) by default.
According to Intel, IBVs must implement the code that determines whether the jumpers are physically present. Instead, IBVs simply reused Intel's reference implementation without altering it in any way. For this reason, the presence check on Grantley+ server platforms is effectively disabled.
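A fail-safe default is easy to express in code. The sketch below is not Intel's reference code. It shows one way to structure such a hook so that a platform which never supplies its own check ends up with the debug loader disabled rather than enabled. All names are hypothetical, and example_jumper_asserted() merely stands in for reading a real jumper.

```c
#include <stdbool.h>
#include <stddef.h>

/* Platform hook type; an IBV or OEM would supply a real check here. */
typedef bool (*presence_check_fn)(void);

/* Fail-safe default: without a platform-specific check,
   physical presence is NOT assumed. */
static bool default_physical_presence(void)
{
    return false;
}

static presence_check_fn g_presence_check = default_physical_presence;

/* Hypothetical registration point for a platform-specific implementation.
   Passing NULL restores the fail-safe default. */
void register_presence_check(presence_check_fn fn)
{
    g_presence_check = (fn != NULL) ? fn : default_physical_presence;
}

/* The debug loader may run only if presence was positively established. */
bool debug_loader_allowed(void)
{
    return g_presence_check();
}

/* Example platform check; stands in for sampling a jumper or GPIO. */
bool example_jumper_asserted(void)
{
    return true;
}
```

With this structure, reusing the reference code unmodified leaves the feature off, and enabling it requires a deliberate call to register_presence_check().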
Exploiting arbitrary code execution in the PEI phase is difficult. Although triggering the vulnerability is relatively simple, it is much more challenging when the payload needs to survive into the later DXE phase. One example that has been demonstrated before is the Hyper-V Backdoor PoC by Dmytro Oleksiuk. The goal of this PoC was to inject a payload from a PCIe device using a DMA attack to influence the DXE phase. To control the boot process, one of the methods used is to hook EFI_BOOT_SERVICES once the DXE phase is over.
During PEI, we needed a stable way to control the transition between boot stages. The most stable way is to hook LocateProtocol() to survive the transition between the PEI and DXE boot phases.
After successfully triggering the vulnerability, the payload waits for the end-of-PEI event. By that time, DxeCore is already initialized. The PEI payload searches for EFI_BOOT_SERVICES in the DxeCore image, then allocates and maps memory for the next-stage DXE payload. Following that, the PEI payload hooks DxeCore's LocateProtocol(). The transition to the DXE phase happens successfully, and in the next step the hooked LocateProtocol() is executed.
That triggers the execution of the next-stage DXE payload, which sets a LoadImageProtocol() notification callback and unhooks LocateProtocol(). The LoadImageProtocol() callback monitors SmmCore initialization. When SmmCore is loaded, the DXE payload searches for EFI_SMM_SYSTEM_TABLE in the SmmCore image. After that, the DXE payload allocates and maps memory for the SMM payload. Finally, the DXE payload hooks SmmLocateProtocol() to trigger the SMM payload execution in the next step.
The next boot phase executes the hooked SmmLocateProtocol(), which triggers the SMM payload; the payload installs the software SMI handler and unhooks SmmLocateProtocol(). For the Black Hat presentation demo, we created a SW SMI handler that scans memory pages and looks for string patterns associated with SSH keys on an Ubuntu Server.
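The hook-then-unhook pattern used at each stage boundary can be illustrated with a toy service table in plain C. This is only a model of the control flow: the table, the single entry, and the doubling "service" are invented for illustration and have nothing to do with the real EFI_BOOT_SERVICES layout.

```c
#include <stddef.h>

typedef int (*locate_fn)(int protocol_id);

/* Toy stand-in for a services table with a single entry. */
struct services {
    locate_fn locate_protocol;
};

static struct services *g_table;   /* table that is currently hooked */
static locate_fn g_original;       /* saved original entry */
static int g_stage_runs;           /* how often the "next stage" was started */

/* Stand-in for the genuine service; the doubling is arbitrary. */
static int real_locate(int protocol_id)
{
    return protocol_id * 2;
}

/* One-shot hook: restores the original pointer, starts the next stage once,
   then forwards the call so the caller still sees the normal result. */
static int hooked_locate(int protocol_id)
{
    g_table->locate_protocol = g_original;
    g_stage_runs++;
    return g_original(protocol_id);
}

void services_init(struct services *s)
{
    s->locate_protocol = real_locate;
}

void install_one_shot_hook(struct services *s)
{
    g_table = s;
    g_original = s->locate_protocol;
    s->locate_protocol = hooked_locate;
}

int stage_run_count(void)
{
    return g_stage_runs;
}
```

The first call through the table runs the hook exactly once, and later calls go straight to the original function. That is why the table looks untouched once a stage has handed over to the next one.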
As part of our Black Hat demo, we demonstrated how a payload executed in the PEI phase survives the transition to the DXE phase and executes SMM payloads. That's probably the best example to prove that PEI vulnerabilities can be just as dangerous as DXE ones, or even worse, since PEI is frequently considered a security boundary for enabling some of the platform security features.
It is worth mentioning that we created a full-chain attack that goes from user mode (ring 3), leverages privilege escalation to kernel mode (ring 0), and from there triggers the Intel BSSA DFT vulnerability. Next, we created a full chain of payloads that survives all stages of the boot process of a UEFI system firmware, from PEI through DXE to SMM. This full chain, started by a single shell script, is a stable exploitation tool because it abuses a design-class vulnerability.
The result of exploitation (payload execution) for all vulnerabilities discussed in this blog post can’t be measured, and TPM PCRs will not be extended to detect such threats. Remote health attestation will not detect active exploitation on affected systems.