The economic case for migrating on-premises workloads to the cloud is well established and hardly controversial, but the security challenges that stem from third-party ownership of resources remain an issue. Cloud customers want to know how they can be sure that servers managed by their providers have not been tampered with, and they ask for public, transparent, and independently verifiable answers. Options like Intrusion Detection Systems (IDS) and File Integrity Monitoring (FIM) are an important part of the solution, but they operate at the top of the software stack, leaving an entire domain of firmware and system software open to undetectable attacks. This is why a solution like server attestation is needed.
The basic idea of attestation is to rely on a mutually trusted (and trustworthy) piece of hardware with immutable characteristics, the root of trust, and delegate to it the task of verifying the integrity of every single component of the (surprisingly long) initialization and boot process on a server. This way, any deviation in the contents of either the firmware (FW) or the software (SW) can be promptly detected and reported. Such verification is typically done by applying a hash function to the content of each component in order to produce a digest. Reference digests can be extracted from the software at rest under controlled circumstances, or the customer can rely on digests published by trusted manufacturers. SW and FW identity is thus independently verifiable in a public and transparent manner.
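To make the digest comparison concrete, here is a minimal Python sketch. It assumes SHA-256 as the hash function and a reference digest obtained out of band (for example, one published by a manufacturer); the function names are illustrative and do not come from any particular attestation tool.

```python
import hashlib
from pathlib import Path


def sha256_digest(path: Path, chunk_size: int = 65536) -> str:
    """Return the hex SHA-256 digest of a file, reading it in chunks."""
    h = hashlib.sha256()
    with path.open("rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            h.update(chunk)
    return h.hexdigest()


def matches_reference(path: Path, reference_hex: str) -> bool:
    """Compare a component's digest with a published reference digest."""
    return sha256_digest(path) == reference_hex.strip().lower()
```

A single changed byte in the component produces a different digest, so `matches_reference` returns `False` for any modified file.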
On any modern server node in a cloud provider datacenter, the root of trust comes in the form of a hardware Trusted Platform Module (TPM). By leveraging the security properties of the TPM, all the measurements (such as the digest produced by applying a hash function to each component in the boot process) can be stored and signed, which provides integrity guarantees.
Starting with the UEFI BIOS image binaries, and all the way through the boot process, the kernel, and the content of every single executable on the root filesystem on the node, all the collected measurements can be attested against well-known values, giving a customer confidence that the entire FW and SW stack is trustworthy. The key technologies that ensure a provider can satisfactorily show that a customer’s node is attested are: Measured Boot (MB), which measures every component of the boot process up to and including the booted kernel; and Integrity Measurement Architecture (IMA), which measures every executable file on the filesystem. Both leverage the security characteristics of the TPM to store and sign the measurements.
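The way a TPM accumulates measurements can be sketched in a few lines. A Platform Configuration Register (PCR) is never written directly; it is "extended", meaning its new value is the hash of its old value concatenated with the new measurement. The sketch below simulates this in software under simplifying assumptions: a single SHA-256 PCR that starts at all zeros, and byte strings standing in for real firmware, bootloader, and kernel images.

```python
import hashlib
from typing import Iterable

PCR_SIZE = 32  # bytes in a SHA-256 PCR


def measure(component: bytes) -> bytes:
    """Digest of a component's content."""
    return hashlib.sha256(component).digest()


def extend(pcr: bytes, digest: bytes) -> bytes:
    """PCR extend: new value = SHA-256(old value || measurement digest)."""
    return hashlib.sha256(pcr + digest).digest()


def boot_chain_value(components: Iterable[bytes]) -> bytes:
    """Fold an ordered sequence of components into one PCR value."""
    pcr = bytes(PCR_SIZE)  # assumed initial state: all zeros
    for component in components:
        pcr = extend(pcr, measure(component))
    return pcr


# Hypothetical stand-ins for real boot components.
good = [b"firmware image", b"bootloader", b"kernel"]
tampered = [b"firmware image", b"bootloader (modified)", b"kernel"]

print(boot_chain_value(good) == boot_chain_value(tampered))              # False
print(boot_chain_value(good) == boot_chain_value(list(reversed(good))))  # False
```

Because each value depends on everything extended before it, the final PCR value commits to both the content and the order of the whole chain, which is what lets a small register vouch for a long boot process.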
While the ability to answer whether a particular server has been tampered with is useful, complex workloads run on clusters of hardware, rather than single nodes. Keylime, a Cloud Native Computing Foundation (CNCF) sandbox project, attests an entire cluster of nodes by providing Remote Attestation. With a two-tier architecture of Registrar/Verifier and Agent, this framework periodically collects both the MB and IMA measurements stored on remote TPMs and the boot and execution logs provided by the operating system. It also performs a two-step attestation verification process: it first matches the values of the measurements with the logs, ensuring that the logs are trustworthy, and then applies a pre-determined policy to decide if the values are acceptable. For MB, the policy will include information such as allowed UEFI BIOS configurations, while for IMA, the policy will include a list of allowed digests for a given binary (e.g., /bin/ls).
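The two-step verification described above can be sketched as follows. This is a simplified illustration, not Keylime's code or policy format: it assumes a single SHA-256 PCR, a log modeled as a list of (path, digest) pairs, and a policy modeled as a dictionary of allowed hex digests per path. Checking the TPM's signature over the quoted PCR value is omitted.

```python
import hashlib
from typing import Dict, List, Set, Tuple

LogEntry = Tuple[str, bytes]  # (component path, SHA-256 digest of its content)


def replay(log: List[LogEntry]) -> bytes:
    """Recompute the PCR value that the log claims to explain."""
    pcr = bytes(32)
    for _, digest in log:
        pcr = hashlib.sha256(pcr + digest).digest()
    return pcr


def verify(log: List[LogEntry], quoted_pcr: bytes,
           policy: Dict[str, Set[str]]) -> Tuple[bool, str]:
    # Step 1: the log is only trustworthy if replaying it reproduces
    # the value reported by the TPM.
    if replay(log) != quoted_pcr:
        return False, "log does not match the quoted PCR value"
    # Step 2: every logged digest must be allowed by the policy.
    for path, digest in log:
        if digest.hex() not in policy.get(path, set()):
            return False, f"digest not allowed for {path}"
    return True, "ok"
```

Step 1 catches a log that was edited after the fact, while step 2 catches a node that honestly reports running something the policy does not allow.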
Crafting appropriate policies is one of the most challenging and demanding aspects of the whole process, and we’ll cover that in a separate post. In short, it requires strict governance that covers both the relationships with manufacturers and packagers and continuous internal asset management. Keylime's key “server-side” component, the verifier, provides a REST API that can be accessed directly. Through this API, a provider can assure a customer that the cluster it manages hasn’t been tampered with since the last measurement collection.
The need for evidence of attestation is often part of a much larger framework of legal and governmental requirements. Third-party auditors might demand that a customer provide proof that a critical application, running in a cloud environment managed by a provider, was deployed in a cluster that remained attested over a relatively long period of time. In addition, such auditors can require unprocessed raw measurement data so that they can apply their own set of tools to determine whether a component was tampered with. More challenging still, this data might refer to a server that has since been retired or damaged beyond repair.
To fill this gap, Keylime was extended with a new “Durable Attestation” feature. The key aspect of this new feature is the ability to store the full historical series of collected measurements in a relational or schemaless database. To further enhance the trustworthiness of the collected data, the TPM public keys can be used to check the signature of the measurements, which is collected and recorded in a Transparency Log (TL), such as Rekor. With these pieces of information in place, namely the certified TPM public keys and the (potentially long) list of measurements and (MB+IMA) logs, a provider can be certain that the clusters they manage were not tampered with between two arbitrary dates in the past.
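As a rough illustration of the storage side of this idea, the sketch below keeps a history of measurements in SQLite and answers the "between two dates" question. The table layout, column names, and function names are hypothetical and are not Keylime's actual schema; signature checks and the transparency log are left out, and timestamps are assumed to be ISO 8601 UTC strings in one fixed format so that they sort lexicographically.

```python
import sqlite3
from typing import List, Tuple

SCHEMA = """
CREATE TABLE IF NOT EXISTS measurements (
    node_id      TEXT    NOT NULL,
    collected_at TEXT    NOT NULL,  -- ISO 8601 UTC timestamp
    raw_hex      TEXT    NOT NULL,  -- unprocessed measurement, kept for auditors
    passed       INTEGER NOT NULL   -- 1 if verification succeeded at the time
)
"""


def open_store(path: str = ":memory:") -> sqlite3.Connection:
    conn = sqlite3.connect(path)
    conn.execute(SCHEMA)
    return conn


def record(conn: sqlite3.Connection, node_id: str, collected_at: str,
           raw_hex: str, passed: bool) -> None:
    conn.execute("INSERT INTO measurements VALUES (?, ?, ?, ?)",
                 (node_id, collected_at, raw_hex, int(passed)))
    conn.commit()


def raw_measurements(conn: sqlite3.Connection, node_id: str,
                     start: str, end: str) -> List[Tuple[str, str]]:
    """Raw data an auditor could re-check with their own tools."""
    return conn.execute(
        "SELECT collected_at, raw_hex FROM measurements "
        "WHERE node_id = ? AND collected_at BETWEEN ? AND ? "
        "ORDER BY collected_at", (node_id, start, end)).fetchall()


def attested_between(conn: sqlite3.Connection, node_id: str,
                     start: str, end: str) -> bool:
    """True if the node has records in the window and none of them failed."""
    total, failed = conn.execute(
        "SELECT COUNT(*), COALESCE(SUM(passed = 0), 0) FROM measurements "
        "WHERE node_id = ? AND collected_at BETWEEN ? AND ?",
        (node_id, start, end)).fetchone()
    return total > 0 and failed == 0
```

Because the records outlive the node, the same query works for a server that has been retired. A real audit would likely also check that measurements were collected without gaps during the window, which this sketch does not do.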
Keylime is an open-source project whose source code is available online. In addition to being packaged for several community Linux distributions (including Fedora 36, Ubuntu 22.04, and openSUSE 15.4), it is also available on Enterprise Linux distributions like RHEL 9.1 and SUSE SLE-15-SP2. Anyone who wants to try out the features described here in a cloud environment (using Virtual Machines with Virtual TPMs) can try them here.