A few weeks ago, iTWire reported on research from Eclypsium that demonstrated the ability to "brick" a cloud-hosted bare metal server. New research from the company shows that similar techniques can have more subtle (and nefarious) outcomes.
A "bare metal server" is a physical machine provided by a cloud host that can run any workload the client chooses. The intention is that clients can spin up custom workloads for short-term load management without the cloud service having to create an OS instance – the hardware is the only service provided.
Of course, this means that once the load requirement vanishes, the server is "wiped" and made available for the next customer request.
As was noted previously, there is a vulnerability that permits overwriting the BMC firmware with garbage, thereby "bricking" the hardware, effectively permanently. This new research is more subtle and based on the quite reasonable assumption that while a specific server is dedicated to a single customer for some period of time, that is not true in the longer term – servers are re-purposed for subsequent clients as required.
From a security standpoint, this is important because, as Eclypsium notes:
- Attackers can implant malicious backdoors and code into the firmware of the server or its BMC with minimal skills. Attackers could do this by simply becoming a customer and directly modifying the system or BMC firmware.
- Such backdoors can easily survive the reclamation process and be passed on to the next customer. Truly removing a malicious implant could require the service provider to physically connect to chips to reflash the firmware, which is highly impractical at scale.
- Once implanted, an attacker could damage or disrupt the application or steal sensitive data.
To prove this, Eclypsium bought access to a bare metal server on IBM's SoftLayer cloud services. The company points out that this vulnerability is not exclusive to IBM; other providers are equally affected.
They recorded the chassis and product serial numbers in order to be able to re-identify the specific server later. They then made a benign change to the BMC firmware (a simple misspelling of one word in a text string). Obviously, a genuinely malicious attacker would be replacing the entire firmware. However, the simple change proves the technique.
At this point the server was returned to the cloud service provider.
Some time later, the researchers made multiple provisioning requests until they re-acquired the previous server (confirmed by the various serial numbers). The modified firmware was still present, demonstrating that the server reclamation process did not re-flash the BMC firmware, thus proving that a firmware implant could be passed from one customer to another.
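The re-identification test described above can be sketched in a few lines of Python. This is only an illustration of the logic, not Eclypsium's tooling: the `ServerIdentity` structure, the function names, the serial numbers and the firmware bytes are all hypothetical, and the sketch assumes the serial numbers and a firmware image have already been read from each provisioned machine by some other means.

```python
from dataclasses import dataclass
from typing import Iterable, Optional, Tuple


@dataclass(frozen=True)
class ServerIdentity:
    """Serial numbers recorded before the server is returned (hypothetical structure)."""
    chassis_serial: str
    product_serial: str


def marker_survived(firmware_image: bytes, marker: bytes) -> bool:
    # True if the benign marker (e.g. a misspelled string) is still in the firmware.
    return marker in firmware_image


def find_persisted_change(
    recorded: ServerIdentity,
    marker: bytes,
    provisioned: Iterable[Tuple[ServerIdentity, bytes]],
) -> Optional[Tuple[int, bool]]:
    """Walk through provisioning attempts until the recorded server reappears.

    Returns (attempt_number, marker_still_present), or None if the server
    was never re-acquired.
    """
    for attempt, (identity, image) in enumerate(provisioned, start=1):
        if identity == recorded:  # both serial numbers must match
            return attempt, marker_survived(image, marker)
    return None


# Made-up data standing in for three provisioning requests.
original = ServerIdentity("CH-0001", "PR-0001")
marker = b"Managment"  # hypothetical misspelling planted in a text string
attempts = [
    (ServerIdentity("CH-0042", "PR-0042"), b"...Management Controller..."),
    (ServerIdentity("CH-0007", "PR-0007"), b"...Management Controller..."),
    (ServerIdentity("CH-0001", "PR-0001"), b"...Managment Controller..."),
]
result = find_persisted_change(original, marker, attempts)
print(result)
```

In this made-up run the original server comes back on the third request with the marker intact, which is the outcome that would indicate the reclamation process had not re-flashed the firmware.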
Eclypsium noted, "Malicious code in this scenario could allow an attacker to steal data from a new customer, damage data, or bring down the server completely. This is important because it allows an attacker, even with minimal skill, to cause serious damage to the most sensitive applications and services in the cloud. The vulnerability that we identified in our research we are referring to as Cloudborne."
The company published a timeline showing how they had repeatedly contacted IBM (starting in early September last year) in order to report this vulnerability, but had received no meaningful response. However, in the past few hours, IBM published its own advisory indicating that "IBM has responded to this vulnerability by forcing all BMCs, including those that are already reporting up-to-date firmware, to be reflashed with factory firmware before they are re-provisioned to other customers. All logs in the BMC firmware are erased and all passwords to the BMC firmware are regenerated".
Eclypsium responded that a researcher was able to re-connect with a server previously patched on 16 February, finding that the re-provisioning had not wiped the BMC firmware.
"Until this publication, Eclypsium had no indication that IBM had made changes based upon this work. As recently as 16 February, we had not observed these remediations. We are relieved to learn that IBM appears to be mitigating the issue. Eclypsium does not agree with the characterization of this as a 'Low Severity' issue. Using CVSS 3.0, we would classify it as 9.3 (critical) severity with the following details:
"While the hardware specifications of BMC hardware are low as compared with the host server, the capability for security-critical impact is high. By design, the BMC is intended for managing the host system, and as such, it is more privileged than the host. The BMC has continual access to files, memory (using DMA), keyboard/video, and firmware of the host (which is required because it needs the ability to reinstall/reconfigure it).
"Furthermore, the BMC is able to send data to an external network, even potentially reconfiguring the host network interface. This provides an attacker with all the tools necessary for complete and stealthy control of a victim system. The potential impact includes access/modification of any/all user data as well as permanent denial of service ('bricking') of the equipment as we have previously demonstrated."
Eclypsium's full report (including an extensive analysis and suggested remediation) is available here.
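For readers unfamiliar with how a CVSS 3.0 base score such as the 9.3 quoted above is derived, the following Python sketch implements the base score equations from the published CVSS v3.0 specification. The example vector is an assumption chosen purely for illustration (local access, no privileges required, changed scope, high impact throughout); this article does not reproduce the vector Eclypsium used, so it should not be read as the company's own assessment.

```python
import math

# Metric weights from the CVSS v3.0 specification.
AV = {"N": 0.85, "A": 0.62, "L": 0.55, "P": 0.2}
AC = {"L": 0.77, "H": 0.44}
UI = {"N": 0.85, "R": 0.62}
CIA = {"H": 0.56, "L": 0.22, "N": 0.0}
PR_UNCHANGED = {"N": 0.85, "L": 0.62, "H": 0.27}
PR_CHANGED = {"N": 0.85, "L": 0.68, "H": 0.5}


def roundup(value: float) -> float:
    # CVSS rounds up to one decimal place (naive float version;
    # v3.1 defines a more careful procedure for edge cases).
    return math.ceil(value * 10) / 10


def base_score(vector: str) -> float:
    # Skip the "CVSS:3.0" prefix, then parse "AV:L" style pairs.
    m = dict(part.split(":") for part in vector.split("/")[1:])
    changed = m["S"] == "C"

    iss = 1 - (1 - CIA[m["C"]]) * (1 - CIA[m["I"]]) * (1 - CIA[m["A"]])
    if changed:
        impact = 7.52 * (iss - 0.029) - 3.25 * (iss - 0.02) ** 15
    else:
        impact = 6.42 * iss

    pr = (PR_CHANGED if changed else PR_UNCHANGED)[m["PR"]]
    exploitability = 8.22 * AV[m["AV"]] * AC[m["AC"]] * pr * UI[m["UI"]]

    if impact <= 0:
        return 0.0
    total = impact + exploitability
    if changed:
        total *= 1.08
    return roundup(min(total, 10))


# Hypothetical vector for illustration only, not taken from the report.
example = "CVSS:3.0/AV:L/AC:L/PR:N/UI:N/S:C/C:H/I:H/A:H"
score = base_score(example)
print(score)
```

The changed scope metric carries much of the weight here: it reflects a flaw in one component (the BMC) affecting another (the host and its data), which is the point Eclypsium makes about the BMC being more privileged than the host.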