UEFI Bootkit Hunting: In-Depth Search for Unique Code Behavior

by Takahiro Haruyama, Fabio Pagani, Yegor Vasilenko, Anton Ivanov, and Sam Thomas (Binarly REsearch team)

Firmware threats such as bootkits and implants have become increasingly prevalent because they are more persistent and harder to detect than traditional OS-level malware. Attackers favor these threats because they can remain undetected even when conventional security measures are in place, especially if UEFI Secure Boot is disabled. Detecting unknown bootkits under these circumstances is a critical challenge in cybersecurity. Most publicly known UEFI implants and bootkits were detected only after successful deployment, which points to the limitations of existing security solutions.

In this blog post, the Binarly REsearch team introduces a novel methodology for detecting UEFI bootkits by analyzing their unique code behaviors. Starting from an in-depth analysis of known bootkits, we identify features that can be used to detect bootkits generically and build rules that we then use to hunt for new, unknown bootkits. Finally, we show how these rules can be improved even further by leveraging advanced static analysis techniques, semantic detection and ML-based clustering.

Background

A bootkit is a type of rootkit that runs during the boot process, before the operating system starts up. Once installed, a bootkit is generally harder to detect than OS-level malware and can bypass OS security mechanisms like PatchGuard and Driver Signature Enforcement (DSE), allowing it to patch the OS kernel, run arbitrary kernel shellcode, or install malicious drivers.

Bootkits have been around for decades, and have long since expanded from targeting legacy BIOS to modern UEFI firmware. This evolution also followed the adoption of firmware security features, like Intel Boot Guard and BIOS Guard, which forced bootkits to move from infecting SPI flash memory to targeting the EFI System Partition (ESP).

*Figure 1: Past UEFI bootkits discovered in the wild*

UEFI Secure Boot also played an important role in this evolution. This security feature is designed to ensure that only trusted software is executed during the boot process, helping to protect against malware and unauthorized code. However, the risk of bootkit infection still exists, as attackers can disable UEFI Secure Boot through physical access, exploits, or supply-chain attacks. This has been shown in the past by BlackLotus, which used a vulnerability known as Baton Drop (CVE-2022-21894) to bypass Secure Boot.

Additionally, while historically Windows has been the main target for bootkit attacks, Linux-targeting bootkits such as Bootkitty (Ubuntu) and Pacific Rim (modified Linux kernel?) have been recently discovered. This is one of the reasons that drove us to research generic bootkit hunting methods: bootkits remain a prevalent threat and give powerful options to attackers, so it is highly likely that new bootkit families will continue to emerge in the future.

Hunting Approach Based on Known Bootkit Analysis

To develop the methodology for generic detection techniques, we started with an in-depth analysis of all publicly known bootkits, including Lojax, MosaicRegressor, MoonBounce, CosmicStrand, ESPecter, and BlackLotus. This allowed us to find shared features and differences among various bootkits.

*Table 1: Basic information about analyzed bootkits*
	Lojax	MosaicRegressor	MoonBounce	CosmicStrand	ESPecter	BlackLotus
Year of Discovery	2018	2020	2022	2022	2021	2023
Infection Target	SPI flash	SPI flash	SPI flash	SPI flash	ESP	ESP
Firmware Components	DXE driver	Two DXE drivers and one EFI application	Modified DXE Foundation (CORE_DXE)	Modified CSMCORE DXE driver	Modified Windows Boot Manager binary (bootmgfw.efi)	EFI application disguised as bootloader (grubx64.efi)
Code Reuse	NTFS DXE driver (ntfs-3g)	Hacking Team’s Vector-EDK	BootLoader	—	—	umap, EfiGuard

The table above shows some basic information about the analyzed bootkits. Except for MoonBounce, all bootkits are either DXE drivers or UEFI applications. More interestingly, most bootkits reuse large portions of open-source or leaked implementations of bootkits. For example, most of MosaicRegressor’s code is based on HackingTeam’s leaked Vector-EDK, while BlackLotus borrows some code from umap and EfiGuard.

As mentioned earlier, bootkits run during the boot process with the intent of compromising an operating system. Because firmware operates with high privileges, bootkit authors have several ways to achieve this goal. In the following sections we will discuss three key features that are shared amongst bootkits:

Hook chain: the method by which a bootkit execution is triggered during the boot process.
Additional components and features: any extra component the bootkit includes (such as a filesystem driver), or any mechanism to disable security features in the OS.
OS persistence: how the bootkit spreads into the OS, allowing it to remain active and operate within the OS at runtime.

In particular, we will discuss how each of these features can be leveraged (or not leveraged) to build generic bootkit detections.

As we will see in the next sections, the key takeaway from this analysis is that the hook chain and additional components do not offer strong detection features, and using them would lead to imprecise results with many false positives. On the other hand, the OS persistence techniques are shared by modern bootkits and can be effectively modeled and leveraged for reliable detection.

Bootkit Hook Chain

Using the BootService table as a hooking point is very common among bootkits. Lojax and MosaicRegressor register their malicious callbacks using the legitimate BootService function CreateEventEx(EFI_EVENT_GROUP_READY_TO_BOOT), ensuring their callbacks are executed right before the boot manager is about to load and execute a boot option. MoonBounce and CosmicStrand instead use a more direct hooking strategy, and replace the function pointers stored in the global BootService. In terms of detection, none of these behaviors is very reliable. The first pattern is very common in UEFI firmware and it will lead to many false positives, while the second one would only work for detecting CosmicStrand, as MoonBounce is a DxeCore module that contains an already hooked BootService table.

Another hooking strategy, which is instead shared by many samples, is the more traditional inline code hooking, where code is directly patched in memory. This technique is used by MoonBounce, CosmicStrand, ESPecter and BlackLotus, and is usually implemented in two steps:

Code-like byte signatures are used to scan memory and identify a target function.
The target function is patched with malicious code.

*Table 2: Hook chain used by four bootkits*
MoonBounce	CosmicStrand	ESPecter	BlackLotus
1. Multiple BS function table hooks (CORE_DXE) 2. OslArchTransferToKernel (winload.efi) 3. ExAllocatePool (ntoskrnl.exe)	1. BS.HandleProtocol (CSMCORE) 2. Archpx64TransferTo64BitApplicationAsm (bootmgfw.efi) 3. OslArchTransferToKernel (winload.efi) 4. ZwCreateSection (ntoskrnl.exe)	1. Entrypoint (bootmgfw.efi) 2. Archpx64TransferTo64BitApplicationAsm (bootmgfw.efi) 3. OslArchTransferToKernel (winload.efi) 4. CmGetSystemDriverList (ntoskrnl.exe)	1. ImgArchStartBootApplication (bootmgfw.efi or bootmgr.efi) 2. BlImgAllocateImageBuffer and OslArchTransferToKernel (winload.efi)

From a detection perspective, this inline hooking technique looked promising, since all four bootkits patch the same target function (OslArchTransferToKernel). However, this strategy quickly turned into a dead end: the memory scanning algorithms used by the bootkits were all distinct, and the patched instructions varied across bootkits too, showing no common patterns.

For example, both MoonBounce and CosmicStrand use 4-byte signatures to search for OslArchTransferToKernel, but the signatures differ (0xCB485541 and 0x41106A56).
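The scan step of this two-step pattern can be illustrated with a short Python sketch. It is not taken from any of the bootkits: the function name `find_signature`, the neutral constant names `SIG_A` and `SIG_B`, and the little-endian interpretation of the two values are assumptions made for illustration, and the real implementations run as native code with their own additional checks.

```python
import struct
from typing import Optional

# The two 4-byte signatures quoted above. Comparing them as little-endian
# dwords against memory is an assumption of this sketch.
SIG_A = 0xCB485541
SIG_B = 0x41106A56


def find_signature(memory: bytes, signature: int, start: int = 0,
                   backward: bool = False) -> Optional[int]:
    """Return the offset of a 4-byte signature match, or None if absent."""
    needle = struct.pack("<I", signature)
    if backward:
        # Closest match that begins at or before `start`.
        pos = memory.rfind(needle, 0, start + len(needle))
    else:
        # First match that begins at or after `start`.
        pos = memory.find(needle, start)
    return pos if pos != -1 else None
```

Because every family uses a different signature, size, and search direction, a scanner of this kind only helps when the signature is already known, which is why it does not generalize into a detection feature.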

*Table 3: Memory scan algorithm comparison*
	MoonBounce	CosmicStrand	ESPecter	BlackLotus
Signature size	4 bytes	Combination of 4 bytes and 2 bytes	Combination of 1 byte and 4 bytes	Flexible length
Search direction	forward	forward/backward	forward	forward

Additional Components/Features

Some bootkits include additional components and features to avoid detection, and these can also provide an avenue for detection. For instance, Lojax and MosaicRegressor run a UEFI module using LoadImage and StartImage, but these are once again very common functions and thus don’t provide a reliable indicator. On the other hand, ESPecter and BlackLotus use inline hooks to disable security features, such as DSE and Windows Defender. Since these features are disabled via inline code patching, generic detection is hard for the same reasons described before. VBS can be disabled by setting the NVRAM variable VbsPolicyDisabled, but the variable name can be obfuscated (for example, BlackLotus encrypts the string).

Overall, creating a generic detection rule based on the additional components and features found in bootkits would not yield good results.

*Table 4: Additional components/features*
Lojax / MosaicRegressor	ESPecter	BlackLotus
Load Inline ntfs-3g DXE driver (Lojax) or UEFI application (MosaicRegressor): BS.LoadImage() and BS.StartImage()	Disable verification of the boot manager’s own digital signature: - Patch BmFwVerifySelfIntegrity (bootmgfw.efi) Disable Windows DSE (Driver Signature Enforcement): - Patch SepInitializeCodeIntegrity (ntoskrnl.exe)	Disable VBS (Virtualization Based Security): – SetVariable() – VbsPolicyDisabled (obfuscated) Disable Windows Defender: – Patch the driver's entry point – WdFilter.sys/WdBoot.sys (obfuscated) – Access driver list structure (LOADER_PARAMETER_BLOCK, KLDR_DATA_TABLE_ENTRY)

Operating System Persistence

To achieve OS persistence, Lojax and MosaicRegressor drop an executable in the NTFS filesystem, by using the BootService functions HandleProtocol and OpenProtocol and the Write function exported from the EFI_FILE_PROTOCOL.

These functions are very common in file system drivers, and detecting this persistence technique through the filesystem paths used would not be very effective either, as these paths are obfuscated.

In any case, this technique is very noisy and can be easily detected by security solutions, making it uncommon in the latest bootkits.

*Table 5: OS-level persistence techniques used by four bootkits*
MoonBounce	CosmicStrand	ESPecter	BlackLotus
Load a kernel driver: • Clear the WP bit in the CR0 register • PE parsing – Resolve kernel API address by hash (IMAGE_EXPORT_DIRECTORY) – Change PE section header flag (IMAGE_SECTION_HEADER) – Resolve relocations (IMAGE_BASE_RELOCATION) – Resolve IAT	Run kernel shellcode: • Clear the WP bit in the CR0 register • Copy shellcode to the slack space after .text section of kernel – Resolve kernel API address by hash (IMAGE_EXPORT_DIRECTORY)	Run kernel shellcode to drop driver/config: • Clear the WP bit in the CR0 register • Write file using kernel APIs - Resolve kernel API address by hash (IMAGE_EXPORT_DIRECTORY)	Load Windows kernel driver: • Rootkit Driver (AES-encrypted) • PE parsing - Copy sections (IMAGE_SECTION_HEADER) - Resolve relocations (IMAGE_BASE_RELOCATION) - Backup disk.sys EP (IMAGE_EXPORT_DIRECTORY) - Get BuildNumber from resource section (IMAGE_RESOURCE_DIRECTORY)

On the other hand, as shown in the table above, the remaining four bootkits use two OS-persistence techniques that can be leveraged for creating a generic detection: clearing bits in control registers and shellcode-like PE parsing.

Clearing bits in control registers

Bootkits often clear the Write Protect (WP) bit in the CR0 register to remove write protection from read-only memory pages, with the goal of hooking code inline or modifying PE header values, such as the entry point or the section permissions. This behaviour is relatively uncommon in UEFI applications, so it provides a good avenue for building a detection rule.
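To make the behaviour concrete, here is a minimal Python sketch of the bit manipulation and of a naive byte-pattern search for it. WP is bit 16 of CR0. The regular expression is a simplified stand-in written for this illustration, not the rule used in this research: it only covers the RAX encodings of `mov rax, cr0`, an `and` with the immediate 0xFFFEFFFF, and `mov cr0, rax`, and the names `clear_wp`, `CLEAR_WP_PATTERN`, and `find_wp_clear` are hypothetical.

```python
import re

CR0_WP = 1 << 16  # Write Protect flag in CR0


def clear_wp(cr0: int) -> int:
    """Return the CR0 value with the WP bit cleared."""
    return cr0 & ~CR0_WP


# Simplified pattern (assumption of this sketch): RAX encodings only,
# with gaps of at most 16 bytes between the three instructions.
CLEAR_WP_PATTERN = re.compile(
    rb"\x0f\x20\xc0"                # mov rax, cr0
    rb".{0,16}?"
    rb"\x48?\x25\xff\xff\xfe\xff"   # and eax/rax, 0xFFFEFFFF
    rb".{0,16}?"
    rb"\x0f\x22\xc0",               # mov cr0, rax
    re.DOTALL,
)


def find_wp_clear(code: bytes) -> list[int]:
    """Return the offsets where the read/mask/write sequence starts."""
    return [m.start() for m in CLEAR_WP_PATTERN.finditer(code)]
```

A pattern like this misses code that uses other registers, other instructions to clear the bit, or a helper function that receives the mask as an argument, so it should be read as an illustration of the idea rather than a usable detector.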

Shellcode-like PE parsing

Several bootkits parse PE format structures to execute kernel shellcode or to load kernel drivers, a behavior that is very rare in benign UEFI modules and applications. In particular, we found that multiple bootkits parse the IMAGE_EXPORT_DIRECTORY structure in the PE header for finding kernel API addresses by string hashes, and also the IMAGE_BASE_RELOCATION structure for resolving code relocations.

For this reason, we decided to use these OS-persistence techniques as a base for building our generic hunting and detection methods.
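The hash-based API resolution mentioned above can be sketched in a few lines of Python. The standard 64-bit FNV-1 function is used here purely as a stand-in, since each bootkit uses its own hash function; the export map, its addresses, and the helper name `resolve_by_hash` are hypothetical.

```python
from typing import Optional

FNV64_OFFSET = 0xCBF29CE484222325
FNV64_PRIME = 0x100000001B3


def fnv1_64(data: bytes) -> int:
    """Standard 64-bit FNV-1: multiply by the prime, then XOR each byte."""
    h = FNV64_OFFSET
    for byte in data:
        h = (h * FNV64_PRIME) & 0xFFFFFFFFFFFFFFFF
        h ^= byte
    return h


def resolve_by_hash(exports: dict[str, int], wanted: int) -> Optional[int]:
    """Return the address of the export whose name hashes to `wanted`."""
    for name, address in exports.items():
        if fnv1_64(name.encode("ascii")) == wanted:
            return address
    return None
```

The point of the technique is that the binary only carries hash constants, never the API names themselves, so string-based detection has nothing to match on while the export-table walk remains visible in the code.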

Hunting Rules and Results

Based on the two OS-persistence techniques discussed above, we developed detection rules in YARA, which can be used for hunting on VirusTotal, and in the FwHunt format, which is compatible with our Binarly Risk Hunt scanner. We followed an iterative approach to develop them: suspicious samples were identified using the YARA and FwHunt rules and then statically triaged, and the triage results were used to refine the rules. In the following sections, we present the hunting results related to the YARA rules and to VirusTotal, as those also cover the results from the FwHunt rules and Binarly Risk Hunt. FwHunt itself is discussed in a later section, when we explore more advanced detection methods.

Clearing Bits in Control Registers

WP bit in CR0

The first rule detects the clearing of the WP bit in CR0, and we created it based on the behavior found in MoonBounce, CosmicStrand and ESPecter.

As we can see in the figure above, the code sequence to read/write CR0 is rather simple ($clear_wp_in_cr0), which resulted in a few false positives in some edk2 modules (e.g. OvmfPkg, UefiCpuPkg and EmulatorPkg) and commercial bootloaders that we had to exclude.

Using this rule to hunt for unknown bootkits, we detected two Bootlicker variants with VT detection rates of 1/71 and 2/68. Bootlicker is an open-source bootkit based on DmaBackdoorBoot, but the two samples were only detected as Win/malicious_confidence_70% and MALICIOUS, not as bootkits.

The hook chain used in these variants fully matches the one implemented in Bootlicker: ExitBootServices → OslArchTransferToKernel → ACPI.sys .rsrc shellcode → PsSetCreateThreadNotifyRoutine → shellcode in .text slack space → KeInsertQueueApc → APC callback → KeInsertQueueApc → user-mode shellcode. One of the samples has no user-mode payload (null function), while the other downloads shellcode from a local IP (192.168.1.44). Therefore, we suspect that the developers submitted their PoCs to check the detection rate.

CET bit in CR4

EfiGuard, an open source bootkit implementation, also clears the WP bit in CR0. However, the previous rule was not effective because EfiGuard calls AsmWriteCr0 and passes a bit mask value to clear WP as an argument.

*Figure 3: AsmWriteCr0 call with argument of bit mask value*

For this reason, to detect this bootkit, we created another rule matching the clearing of the CET (Control-Flow Enforcement Technology) bit in CR4, a functionality which is implemented in EfiGuard. Since we directly defined the code sequence bytes of the assembly-written function AsmDisableCet, this rule is specific to EfiGuard. However, we think it’s worth creating it because EfiGuard has been actively abused in the wild and used as a starting point for malicious bootkits.

*Figure 4: YARA rule detecting the clearing of CET bit in CR4*

Hunting with this rule on VirusTotal led to the discovery of two unknown bootkit samples named "Vixen.efi", which were found without detections (0/75, 0/73). We compared the samples with the original EfiGuardDxe binary and found only trivial differences:

Disabling debug information (print() message, pdb path, etc.)
Inline expansion of utility functions through compiler optimizations
Difference in the submodule (older version of Zydis)

The code behavior of these two Vixen samples was the same as EfiGuardDxe: disabling both PatchGuard and DSE. Since the number of EfiGuard detections is usually in the range 10-20 (e.g., the latest release of EfiGuard has 13 positive detections), we were surprised to see that the Vixen samples were not detected at all.

We also tried to identify the purpose of the binary, but we could not find any concrete evidence. We found a related loader for Vixen.efi, but its code didn’t present any notable difference from the Loader.efi of EfiGuard. We also searched for the bundled files reported on VirusTotal using OSINT engines, but no other related sample was found, so the purpose of this sample remains unknown to us.

Shellcode-like PE Parsing

The second set of rules captures how bootkits parse PE headers in memory to extract information, such as addresses, that enables them to spread into the OS.

Resolving Kernel API Address by String Hash

As shown in the following figure, MoonBounce, CosmicStrand and ESPecter access multiple offsets of the OS kernel's export data directory entry (IMAGE_DATA_DIRECTORY) and of the IMAGE_EXPORT_DIRECTORY structure to resolve OS kernel API addresses.

We translated the structure offset accesses into one code sequence using YARA jumps.

*Figure 6: YARA rule resolving kernel API address*
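Separately from the YARA rule, the structure accesses it targets can be illustrated with a short Python sketch that walks the export directory of a PE32+ image. This is not the rule itself: the function name `read_exports` is hypothetical, and the sketch assumes an image laid out as in memory (so that RVAs equal buffer offsets) and performs no bounds checking.

```python
import struct

E_LFANEW = 0x3C          # IMAGE_DOS_HEADER.e_lfanew
OPTIONAL_HEADER = 0x18   # optional header offset inside the NT headers
PE32PLUS_MAGIC = 0x20B
DATA_DIRECTORY = 0x70    # IMAGE_OPTIONAL_HEADER64.DataDirectory[0] (export)


def read_exports(image: bytes) -> dict[str, int]:
    """Map export names to function RVAs for a memory-mapped PE32+ image."""
    nt = struct.unpack_from("<I", image, E_LFANEW)[0]
    if image[nt:nt + 4] != b"PE\x00\x00":
        raise ValueError("missing PE signature")
    opt = nt + OPTIONAL_HEADER
    if struct.unpack_from("<H", image, opt)[0] != PE32PLUS_MAGIC:
        raise ValueError("only PE32+ images are handled")
    export_rva, _size = struct.unpack_from("<II", image, opt + DATA_DIRECTORY)
    # IMAGE_EXPORT_DIRECTORY: NumberOfNames (+0x18), AddressOfFunctions (+0x1C),
    # AddressOfNames (+0x20), AddressOfNameOrdinals (+0x24).
    n_names, funcs, names, ordinals = struct.unpack_from(
        "<IIII", image, export_rva + 0x18)
    exports = {}
    for i in range(n_names):
        name_rva = struct.unpack_from("<I", image, names + 4 * i)[0]
        end = image.index(b"\x00", name_rva)
        ordinal = struct.unpack_from("<H", image, ordinals + 2 * i)[0]
        func_rva = struct.unpack_from("<I", image, funcs + 4 * ordinal)[0]
        exports[image[name_rva:end].decode("ascii")] = func_rva
    return exports
```

The fixed offsets (0x3C, 0x88 from the NT headers, and 0x18 to 0x24 inside the export directory) are what show up as displacement constants in compiled code, which is what makes this behaviour matchable at the byte level.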

Hunting with this rule identified two unknown bootkit samples (1/72, 0/72), which we named “Valkyrie” because both print ASCII art showing the text “Valkyrie” in debug mode. These samples are based on umap, another open source bootkit that allows for manual mapping of kernel drivers, but with a better engineered implementation:

A JSON configuration file “go.cfg” is used
The custom FNV-1-64 hash algorithm is used for string comparison
Additional code signatures for inline hooking supporting more bootloader versions
The BlImgAllocateImageBuffer is called only once, to store the loader binary in the slack space of the legitimate binary, whereas umap calls it twice to map both the legitimate binary and the loader separately.

The remaining behavior, including the hook chain, is the same as umap: ImgArchStartBootApplication → BlImgAllocateImageBuffer → OslFwpKernelSetupPhase1 → ExitBootServices → acpiex.sys entrypoint → injected “loader” entrypoint.

Based on the VirusTotal relation information, we identified the kernel driver sample loaded by the bootkit (“loader”). The driver uses the same string hash algorithm for resolving kernel API addresses. Additionally, the data and strings were highly obfuscated with SSE instructions. One of the decoded strings was an IP address whose hostname was resolved as valkyrie[.]cx.

*Figure 7: Valkyrie loader code (API string hashes and string de-obfuscation)*

This last discovery is what led us to the real purpose of this bootkit: according to the website, the bootkit is part of game cheat software.

*Figure 8: Valkyrie web site description*

Resolving Code Relocations for Kernel Drivers

Unlike kernel API address resolution, which a single code sequence could match, bootkits resolve code relocations in different ways, which prevented us from describing them with a single pattern. We observed two distinct patterns in the instruction bytes when accessing the relocation directory (IMAGE_DATA_DIRECTORY) and IMAGE_BASE_RELOCATION. The resulting rule was created from the MoonBounce and BlackLotus samples.

*Figure 9: YARA rule resolving code relocations*
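Separately from the YARA rule in Figure 9, the relocation processing that it targets can be sketched in Python. The sketch follows the documented IMAGE_BASE_RELOCATION block layout and handles only 64-bit (DIR64) fixups; the function name `apply_relocations` and its parameters are hypothetical, and the image is assumed to be laid out as in memory.

```python
import struct

IMAGE_REL_BASED_ABSOLUTE = 0   # padding entry, skipped
IMAGE_REL_BASED_DIR64 = 10     # 64-bit absolute address fixup


def apply_relocations(image: bytearray, reloc_rva: int, reloc_size: int,
                      delta: int) -> int:
    """Apply DIR64 fixups in place and return the number of patched slots."""
    patched = 0
    offset = reloc_rva
    end = reloc_rva + reloc_size
    while offset + 8 <= end:
        # IMAGE_BASE_RELOCATION header: VirtualAddress, SizeOfBlock.
        page_rva, block_size = struct.unpack_from("<II", image, offset)
        if block_size < 8:
            break  # malformed or terminating block
        for pos in range(offset + 8, offset + block_size, 2):
            entry = struct.unpack_from("<H", image, pos)[0]
            kind, page_offset = entry >> 12, entry & 0x0FFF
            if kind == IMAGE_REL_BASED_DIR64:
                target = page_rva + page_offset
                value = struct.unpack_from("<Q", image, target)[0]
                struct.pack_into("<Q", image, target,
                                 (value + delta) & 0xFFFFFFFFFFFFFFFF)
                patched += 1
        offset += block_size
    return patched
```

The shift by 12, the 0xFFF mask, and the comparison against type 10 are the kind of constants a compiled loader leaves behind, although, as noted above, different bootkits arrange these operations differently.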

Using this rule we found four additional umap variants. The first one had zero detections (0/71), despite its code being the same as the binary downloaded from the GitHub release page. The other three samples (mp.efi/winboot.efi) instead have a different hook chain from umap: ExitBootServices → CreateEvent callback with EVT_SIGNAL_VIRTUAL_ADDRESS_CHANGE (an event notified when SetVirtualAddressMap() is performed) → IoInitSystem in OS kernel. While these findings looked promising at first glance, we concluded that they were probably part of another game cheat after checking their compressed parent files from VirusTotal.

*Figure 10: GUI menu of the embedded executable*

With this rule, we also discovered an unknown bootkit sample named BOOTKIT.efi (4/71), which interestingly saw its detection number drop from 6 to 2 last month.

*Figure 11: VirusTotal detection trend for BOOTKIT.efi*

The sample was a bootkit that disables PatchGuard and DSE like EfiGuard, but the code was not similar (it just reused part of the signatures from EfiGuard). The hook chain was also unique: OpenProtocol → BlImgLoadPEImageEx → Several functions in the OS kernel (KiSwInterrupt, KiMcaDeferredRecoveryService, SeCodeIntegrityQueryInformation, SeValidateImageData, etc.).

Our additional VirusTotal RetroHunt revealed another variant of this bootkit, called SandboxBootkit.efi (3/71). Analysis of its related files (exe/sys in the parent compressed file) showed it was another sample used for game cheat software.

[Updated on 03/17/2025]

After publishing this blog, Duncan Ogilvie informed us that SandboxBootkit is another open-source bootkit implementation. Based on this, we decided to revisit BOOTKIT.efi and SandboxBootkit.efi for further analysis. This closer examination confirmed that both samples are indeed based on the open-source implementation SandboxBootkit. However, SandboxBootkit.efi was submitted to VT along with a user-mode program and a kernel driver implementing game-cheating software, which confirms our previous findings. On the other hand, the true purpose of BOOTKIT.efi remains unknown, since the sample doesn’t have any relations on VT. We have also updated the relevant information in the Results Summary section.

Resolving Code Relocations for PEI Stage Backdoor

Finally, we decided to build a detection rule for PeiBackdoor, a bootkit running during the PEI phase. Unlike the later-stage bootkits that we have investigated so far, it does not include OS-persistence code, but it contains similar code for resolving code relocations of the infected backdoor image.

*Figure 12: YARA rule resolving code relocations for PeiBackdoor*

This rule detected another backdoor by the same author, but no additional samples were discovered.

Results Summary

The results of our hunting on VirusTotal with the presented rules are summarized in the following table. In total, we identified 11 new unknown bootkit samples and 1 old version of umap. Of these samples, 10 were not detected as bootkits and 4 had zero detections on VirusTotal.

*Table 6: Hunting result*
	bootmgr.exe (bootlicker variants)	Vixen.efi (EfiGuard variants)	Valkyrie (game cheat software)	bootx64.efi (old umap binary)	mp.efi / winboot.efi (game cheat software)	BOOTKIT.efi / SandboxBootkit.efi
Number of samples	2	2	2	1	3	2
VT detection rate	1/71, 2/68	0/75, 0/73	1/72, 0/72	0/71	1/73, 1/71, 1/72	3/71, 3/71
VT detection names	Win/malicious_confidence_70%, MALICIOUS	—	W64.AIDetectMalware	—	W64.AIDetectMalware	Boot.Malware.Bootkit or Trojan.EFI64.Agent
Code reuse (similarity in BinDiff)	bootlicker (0.5% and 0.4%, due to infection with bootmgfw.efi)	EfiGuard (85%)	umap (32%, 39%)	—	umap (62%, 61%)	SandboxBootkit (95%, 97%)
Purpose	Shellcode execution	Disabling PatchGuard and DSE	Game cheating	Mapping a kernel driver	Game cheating	Disabling PatchGuard and DSE
Matched YARA rules	bootkit_disable_WP_CR0	bootkit_disable_CET_CR4	bootkit_resolve_api_addr, bootkit_resolve_relocation	bootkit_resolve_api_addr, bootkit_resolve_relocation	bootkit_resolve_relocation	bootkit_resolve_relocation

All samples reused a large portion of open source bootkit code: bootlicker (DmaBackdoorBoot), EfiGuard, umap, and SandboxBootkit. The DmaBackdoorBoot and EfiGuard binaries distributed on GitHub are detected on VirusTotal, with 36 and 13 detections, respectively. On the other hand, umap is not detected even though the code is the same. We believe umap should be detected because the code is reused by several bootkits, including BlackLotus.

More than half of the samples identified during our hunting were related to game cheating software. For the remaining 5 samples (bootmgr.exe, Vixen.efi, and BOOTKIT.efi), we were unable to determine their true purpose and whether the samples (or any enhanced variants) are currently being used in the wild by threat actors. However, we hope that this research brings clarity on the bootkit threat landscape, enabling AV vendors and security teams to integrate our findings into their telemetry systems and improve their ability to detect bootkits.

Going Beyond the Limits of YARA

The YARA rules introduced in the previous sections were refined over a dataset of malicious and non-malicious firmware. In this section, we explore how their detection accuracy can be improved by leveraging more advanced detection capabilities. As shown in Table 7, these improvements come from different perspectives, such as using code analysis techniques that are more advanced than the byte-matching capabilities offered by YARA.

*Table 7: Advanced detection perspectives*
Detection Perspective	Improvements over YARA
Code analysis (e.g., cross-reference, function type/argument, etc.)	Detect YARA’s false negatives
Semantic information (e.g., GUID, protocol usage, etc.)	Generate more readable rules with code context
Sample classification using machine learning	Classify samples without writing individual rules
Anomaly detection based on differential firmware analysis	Catch unknown threats like supply-chain attacks

Static Analysis Automation for detecting OS Kernel/Driver Hooks

As mentioned before, YARA is not effective at detecting code that clears and restores the WP bit in CR0 when the bitmask value is passed as an argument to AsmWriteCr0. However, by using static analysis, this technique and the related code patch can easily be detected with the following algorithm:

Detect the AsmWriteCr0 function
Identify the code range where the WP bit is cleared and restored
Within this code range, find memcpy-like calls and check whether the copied size matches the size of the decoded instructions pointed to by the source argument.
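The control flow of these three steps can be modelled in a few lines of Python. This is a deliberately simplified sketch and not the IDAPython PoC: it assumes a disassembler has already produced an ordered list of calls with resolved callee names and constant arguments, and the `Call` record, the `insn_len_at` callback (which reports how many bytes of whole instructions decode at the source buffer), and the callee names other than AsmWriteCr0 are hypothetical.

```python
from dataclasses import dataclass
from typing import Callable

CR0_WP = 1 << 16  # Write Protect flag in CR0


@dataclass
class Call:
    address: int   # address of the call site
    target: str    # resolved callee name (assumed to be available)
    args: tuple    # constant arguments recovered by the analysis


def find_patches_under_wp_clear(
        calls: list[Call],
        insn_len_at: Callable[[int, int], int]) -> list[Call]:
    """Return memcpy-like calls made while WP is cleared whose copy size
    equals the length of the instructions decoded at the source buffer."""
    hits = []
    wp_cleared = False
    for call in calls:  # assumed to be in execution order
        if call.target == "AsmWriteCr0":
            # Steps 1 and 2: track the range between clearing and restoring WP.
            wp_cleared = not (call.args[0] & CR0_WP)
        elif wp_cleared and call.target in {"CopyMem", "memcpy"}:
            # Step 3: the copied bytes should decode to whole instructions.
            _dst, src, size = call.args
            if insn_len_at(src, size) == size:
                hits.append(call)
    return hits
```

In a real implementation the hard parts are exactly what this model takes as input: identifying AsmWriteCr0 without symbols, recovering argument values, and decoding the source buffer.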

For example, the inline hook by Bootkitty is detected as follows.

*Figure 13: Detection example in Bootkitty*

We also scanned over 500 samples containing the AsmWriteCr0 function, finding no false positives.

Semantic Detection using FwHunt

Our FwHunt format and the related community scanner, fwhunt-scan, allow semantic information such as UEFI GUIDs, protocols, PPIs, and NVRAM variables to be used for detection, which also provides an improvement over YARA rules. For example, when writing the YARA rule to detect code that clears the WP bit in CR0, we had to define some code sequences to exclude potential false positives. However, as shown in the following picture, these exclusions can be easily specified in FwHunt, making the rule simpler to understand and easier to maintain.

*Figure 14: YARA and FwHunt rules for WP bit clearing detection*

ML-based Sample Clustering

Classifying large sets of samples using only YARA requires writing individual rules for each bootkit family. However, UEFI semantic information can provide features for accurate sample clustering without the need to develop rules, similar to how IAT information is used for Windows malware classification.

To create clusters based on UEFI semantic information, we take the following steps:

Extract semantic information (GUIDs/protocols/PPIs/NVRAM variables) using FwHunt.
Calculate the TLSH (Trend Micro Locality Sensitive Hash) value of the extracted information. For example, as shown in Figure 15, the GUID and protocol service names are concatenated into one string in address order, and then the string’s hash value is calculated.
Create clusters using scikit-learn’s DBSCAN algorithm based on the distances calculated by the TLSH values.

*Figure 15: TLSH hash value calculation*
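The final clustering step can be sketched with scikit-learn's DBSCAN in precomputed-distance mode. The sketch assumes the pairwise TLSH distances have already been computed and stored in a symmetric matrix; the distance values below are made up, and the helper name `cluster_by_distance`, the `eps` value, and `min_samples=2` are placeholders chosen for illustration.

```python
import numpy as np
from sklearn.cluster import DBSCAN


def cluster_by_distance(distances: np.ndarray, eps: float,
                        min_samples: int = 2) -> list[int]:
    """Cluster samples from a symmetric pairwise distance matrix.

    Returns one label per sample; -1 marks samples treated as noise.
    """
    model = DBSCAN(eps=eps, min_samples=min_samples, metric="precomputed")
    return model.fit_predict(distances).tolist()


# Made-up distances for six samples: two tight groups and one outlier.
demo = np.array([
    [0, 20, 40, 300, 320, 400],
    [20, 0, 30, 310, 330, 410],
    [40, 30, 0, 305, 315, 405],
    [300, 310, 305, 0, 30, 420],
    [320, 330, 315, 30, 0, 430],
    [400, 410, 405, 420, 430, 0],
], dtype=float)
labels = cluster_by_distance(demo, eps=110.0)
```

DBSCAN is convenient for this kind of triage because it does not need the number of clusters in advance and leaves samples that resemble nothing else unassigned.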

This clustering technique can be used for quickly triaging new suspicious samples. For example, during our research, we identified one unknown sample using VirusTotal Livehunt (MD5: ee7fd78bde28fe707b6847034e0a59fe). This sample had zero detections, so we wanted to analyze it and see whether it matched any of the bootkits we had encountered so far. By using this clustering technique, we were able to quickly check similarities with previously analyzed samples.

The clustering result visualized with Multidimensional Scaling (MDS) is shown in Figure 16. As we can see, the target sample belongs to Cluster 7 (which represents umap variants) and the closest sample is a Valkyrie sample.

*Figure 16: Clustering result visualized by MDS*

Semantic vs Binary Clustering

We also went a step further, to understand how clustering using semantic information compares with clustering based on raw data.

We evaluated the results using Adjusted Rand Index (ARI). Below are results with different ratios applied during distance calculation by TLSH in Step 3.

*Figure 17: Comparison of clustering results with different ratios*

The leftmost result (ARI=0.302) demonstrates that fuzzy hashes of raw binary data are too sensitive for effective clustering, even within the same bootkit family. On the other hand, classification based on only semantic information (second from the left, ARI=0.78) puts samples with little semantic information in the same cluster. Through experimentation, the best ratio of binary:GUID:protocol:NVRAM variable (0.2:0.4:0.3:0.1) yielded the highest ARI score (0.898). Fine-tuning DBSCAN's eps parameter to 110 further improved the ARI to 0.922.

Thus, by taking semantic information into account, we significantly improved the UEFI sample clustering results.
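One plausible reading of the ratios reported above is a weighted sum of per-feature distances between two samples. The Python sketch below illustrates that reading; the weighted-sum combination itself, the feature keys, and the function name `combined_distance` are assumptions of this example rather than a description of the actual implementation.

```python
# binary : GUID : protocol : NVRAM variable ratio reported above.
WEIGHTS = {"binary": 0.2, "guid": 0.4, "protocol": 0.3, "nvram": 0.1}


def combined_distance(per_feature: dict[str, float],
                      weights: dict[str, float] = WEIGHTS) -> float:
    """Weighted sum of the per-feature distances between two samples."""
    if abs(sum(weights.values()) - 1.0) > 1e-9:
        raise ValueError("weights are expected to sum to 1")
    return sum(weights[name] * per_feature[name] for name in weights)
```

With these weights, two samples whose raw binaries differ considerably (distance 200) but whose GUIDs, protocols, and NVRAM variables are close (10, 20, and 0) end up with a combined distance of about 50, which keeps them in the same neighbourhood for clustering.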

Anomaly Detection Based on Differential Firmware Analysis

Semantic-based detection works well in most cases, but detailed analysis may be needed for infection-type bootkits. For example, MoonBounce and CosmicStrand infect UEFI modules, while ESPecter and bootlicker infect bootloaders. Their code is rather small and has no additional semantic information. But what if a new, unknown infection-style bootkit emerges that doesn’t use any of the code patterns found in this research?

To address this issue, we propose anomaly detection based on differential firmware analysis: by comparing different versions of the same firmware over time, we can detect unknown threats, such as those introduced through supply-chain attacks.

We support multiple difference detections in module information (added/changed/removed modules and module dependency expressions) and semantic information. Additionally, modules with identical names and GUIDs are compared, using function representations that allow us to find near-duplicates using Weisfeiler-Lehman LSH. We then calculate the pairwise module similarity based on the percentage of matched duplicates. We also perform capability diffing and its similarity measurement between the modules.
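The first layer of this comparison, finding added, removed, and changed modules between two firmware versions, can be sketched as follows. The input format (a mapping from module GUID to module bytes), the use of SHA-256, and the function name `diff_modules` are assumptions made for illustration; the function-level similarity based on Weisfeiler-Lehman LSH is not shown.

```python
import hashlib


def diff_modules(old: dict[str, bytes],
                 new: dict[str, bytes]) -> dict[str, list[str]]:
    """Compare two firmware versions given as {module GUID: module bytes}."""
    def digest(blob: bytes) -> str:
        return hashlib.sha256(blob).hexdigest()

    return {
        "added": sorted(new.keys() - old.keys()),
        "removed": sorted(old.keys() - new.keys()),
        "changed": sorted(guid for guid in old.keys() & new.keys()
                          if digest(old[guid]) != digest(new[guid])),
    }
```

Only the modules reported as changed or added need the more expensive function-level comparison, which keeps the deeper analysis focused on a small part of the image.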

If the differential analysis finds any changes, we additionally run generic malicious code checkers. For instance, we check UEFI service table function hooks, as well as embedded executables that are later scanned using capa rules.

For example, the figure below demonstrates the detection of a MoonBounce infection. The process consists of three steps: the changed module is detected based on file hashing, then the module function similarity is calculated, and finally the malicious kernel driver embedded in the bootkit is identified. Through differential analysis, our anomaly detection enables the identification of subtle yet malicious changes.

*Figure 18: Anomaly detection case of MoonBounce infection*

Conclusions

The research underscores that traditional bootkit detection technologies are struggling to keep pace with increasingly sophisticated firmware threats. By homing in on OS-persistence techniques, our approach has effectively unearthed previously undetected bootkits, revealing the inadequacies of legacy systems like YARA. Leveraging advanced static analysis, semantic detection, and ML-based clustering, our methodology not only mitigates the risk of false positives but also sets a new standard in proactive firmware security – transforming how we monitor, triage, and counteract these stealthy attacks.

Looking ahead, the work invites a broader rethinking of bootkit detection, pointing to untapped areas such as ARM-based firmware threats and alternative generic detection paradigms. This research lays the groundwork for evolving our defenses against bootkits by integrating diverse detection techniques that capture both known and emergent threats. It is a call to action for the cybersecurity community: to move beyond conventional tools, embrace innovative detection frameworks, and stay ahead in the relentless battle against firmware exploitation.

Rules and Scripts

The YARA rules and IDAPython PoCs created during this research are available here.

Acknowledgments

We appreciate the help of the following researchers with this research:

Martin Smolár and Anton Cherepanov
Aleksandar Milenkoski
Brian Baskin

They shared their telemetry analysis results or more sample information. The Binarly REsearch team will continue to collaborate with external researchers to fight against common threats.