Friday, January 12, 2018

VMware Response to Speculative Execution security issues, CVE-2017-5753, CVE-2017-5715, CVE-2017-5754 (aka Spectre and Meltdown)

Since January 3, 2018, the whole IT industry has been mitigating the impact of the SPECTRE and MELTDOWN vulnerabilities and administrators have been updating their infrastructures.

Three different CVEs have been identified for the issues described in the media:
  • CVE-2017-5753 (Spectre, Variant 1)
  • CVE-2017-5715 (Spectre, Variant 2)
  • CVE-2017-5754 (Meltdown)
All three security CVEs (Common Vulnerabilities and Exposures) abuse the multi-pipeline architecture (the use of multiple CPU pipelines) and a performance optimization (speculative execution) used in all modern CPUs. A simplified but, IMHO, very nice explanation is here. If I understand the issue and resolution correctly, it all boils down to new CPU microcode which must expose special CPU capabilities to operating systems, which then have to incorporate them into their kernels.

VMware infrastructure is, obviously, impacted as well. It is very important to patch the whole infrastructure stack; all subsystems described below must be patched or upgraded for effective mitigation. The virtual infrastructure stack is composed of:
  • Operating systems (OS)
  • Virtual Machines / Virtual Appliances
  • Hypervisors
  • CPU microcode (usually part of BIOS)
GUEST OS
For the guest OS, you should obtain patches from your OS software vendors. However, guest operating systems use new CPU capabilities (IBRS, IBPB, STIBP) to mitigate the Spectre/Meltdown vulnerabilities. These capabilities are exposed by the physical CPU microcode (note: the Intel CPU microcode is currently unstable), therefore patches from top to bottom (OS - hypervisor - CPU microcode) are necessary for successful vulnerability mitigation. To expose these new CPU capabilities to the guest OS, Virtual Machine hardware version 9 or newer must be used, otherwise the new capabilities are masked by the VM hardware and the guest OS kernel would not be able to use them.
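If you are not sure which virtual hardware version your VMs are running, you can list it directly from the ESXi shell. This is just a minimal sketch; the Version column in the output shows values such as vmx-09 or vmx-10, and version 9 or higher is required:

# List all VMs registered on this host together with their virtual hardware version (Version column)
vim-cmd vmsvc/getallvms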

Another consideration should be given to VMware Virtual Appliances. VMware officially released a new Knowledge Base article regarding the investigation of the impact on the VMware Virtual Appliances (KB 52264). 

HYPERVISOR
Hypervisor remediation can be classified into the two following categories:
  • Hypervisor-Specific Remediation (documented in VMSA-2018-0002) - Result of exploitation may allow for information disclosure from one Virtual Machine to another Virtual Machine that is running on the same host. 
  • Hypervisor-Assisted Guest Remediation (documented in VMSA-2018-0004) - This issue may allow for information disclosure between processes within the VM.
For Hypervisor-Specific remediation, all relevant patches already exist and are described in security advisory VMSA-2018-0002. VMware recommends applying described patches.

For Hypervisor-Assisted Guest Remediation, all relevant patches exist and are described in security advisory VMSA-2018-0004. Important note: VMware NO LONGER recommends installing the patches listed in security advisory VMSA-2018-0004. Further details are described in KB 52345. In short, Intel has notified VMware of recent sightings that may affect some of the initial microcode patches that provide the speculative execution control mechanism for a number of Intel Haswell and Broadwell processors. For servers with Intel Haswell and Broadwell processors that have applied ESXi650-201801402-BG, ESXi600-201801402-BG, or ESXi550-201801401-BG, VMware has a workaround described in KB 52345. The workaround hides the speculative-execution control mechanism from virtual machines.

Rogue Data Cache Load or “Meltdown” (CVE-2017-5754) does not affect ESXi, Workstation, and Fusion because ESXi does not run untrusted user mode code, and Workstation and Fusion rely on the protection that the underlying operating system provides.

PERFORMANCE IMPACT
The virtualization-specific performance impact of these mitigations should be negligible, but a workload-dependent performance impact inside the guest OS is expected. There are rumors it might be somewhere between 10% and 30%.

HOW TO UPDATE?
Please note, the update order is important.

To enable hardware support for branch target mitigation in vSphere, you should apply these steps, in the order shown below:
  1. For vCenter 6.x, in case you have an external PSC, update all external PSCs to the latest patches of PSC 5.5 – 6.5 
  2. Update to the latest patches of vCenter 5.5 – 6.5 
  3. Apply the latest ESXi 5.5 – 6.5 patches
  4. Apply the CPU microcode via BIOS or the appropriate ESXi patch mentioned earlier in this article. << Update 2018-01-13: Validate that your CPU microcode version is not one of the faulty ones; some Intel CPUs are currently impacted. Check VMware KB 52345
For server firmware (BIOS) and CPU microcode fixes for CVE-2017-5715, you should obtain them from your hardware vendor or apply them via the latest ESXi patches.
ESXi650-201801402-BG microcode *
ESXi600-201801402-BG microcode *
ESXi550-201801401-BG hypervisor and microcode **
Update 2018-01-13: Important note: VMware NO LONGER recommends installing the patches listed in security advisory VMSA-2018-0004 because of Intel sightings in the ESXi-bundled microcode patches. I still personally believe that a CPU microcode update via an ESXi patch is the best option from the operational perspective, so I will update this blog post when fixed microcode is bundled with ESXi patches again.
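Whichever way you deliver the microcode, it is worth verifying what is actually installed on each ESXi host before and after patching. A minimal sketch using standard ESXi shell commands (compare the reported build numbers against VMSA-2018-0002 / VMSA-2018-0004; the vCenter build can be checked in the vSphere Client About dialog):

# Show the ESXi version and build number of this host
vmware -vl
# Check whether the bundled cpu-microcode VIB (and esx-base) was updated by the ESXi patch
esxcli software vib list | grep -E 'cpu-microcode|esx-base'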

The reason why vCenter Server must be updated before ESXi is to avoid an issue with vMotion and EVC. An ESXi host that is running both a patched CPU microcode and a patched hypervisor will see new CPU features that were not previously available. These new features will be exposed to all Virtual Hardware Version 9+ VMs that are powered on by that host. Because these VMs now see additional CPU features, vMotion to an ESXi host without the microcode or hypervisor patches applied will be prevented. There are three new CPU features:
  • IBRS
  • IBPB
  • STIBP
These three new CPU capabilities are not known to existing EVC baselines, therefore they appear to the cluster as a newer CPU model and prevent vMotion migration. EVC baselines are defined in vCenter, and that's the reason why the vCenter patch must be applied before the ESXi patch and CPU microcode: to add these CPU capabilities to the EVC baselines and allow vMotion between ESXi hosts.

HOW TO VERIFY YOUR UPDATE WAS SUCCESSFUL AND YOU ARE NOT VULNERABLE?

You should verify it in different layers.

Hypervisor level
If you want to know whether the Intel microcode has been updated, you can run the following shell command on your ESXi host:
if [ `vsish -e get /hardware/msr/pcpu/0/addr/0x00000048 > /dev/null 2>&1; echo $?` -eq 0 ]; then echo -e "\nIntel Security Microcode Updated\n"; else echo -e "\nIntel Security Microcode NOT Updated\n"; fi 
On an Intel machine with the microcode update, you get the output "Intel Security Microcode Updated".
On an Intel machine without the microcode update, you get the output "Intel Security Microcode NOT Updated".
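If the one-liner looks cryptic, it simply tests whether MSR 0x48 (IA32_SPEC_CTRL), which only the updated microcode provides, can be read on pCPU 0. The raw query it wraps is:

# Read MSR 0x48 (IA32_SPEC_CTRL) on pCPU 0; the read fails when the updated microcode is not loaded
vsish -e get /hardware/msr/pcpu/0/addr/0x00000048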

If you want to do the same for AMD microcode, the following command should work:
if [ $(($((`vsish -e get /hardware/cpu/cpuList/0 | grep -i EBX | head -5 | tail -1 | awk -F: '{print $2}'` & 0x00001000)) >> 12)) -eq 1 ]; then echo -e "\nAMD Security Microcode Updated\n";else echo -e "\nAMD Security Microcode NOT Updated\n";fi 
Guest OS - Linux
  • A very nice blog post with a shell script is here: "How to check Linux for Spectre and Meltdown vulnerability". Please note that if you run the shell script inside a guest OS on top of an updated ESXi host, the script may report that the CPU microcode is updated even though the physical CPU microcode is vulnerable. However, the information about whether the guest OS itself is patched is still valuable. See also the sysfs check sketched below.
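In addition to the script, newer Linux kernels (and distribution kernels with the fixes backported) expose their own view of the three vulnerabilities through sysfs. A minimal sketch, assuming your guest kernel already provides this interface (it is missing on older, unpatched kernels):

# Print the kernel's mitigation status for Meltdown and both Spectre variants
grep . /sys/devices/system/cpu/vulnerabilities/*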
Guest OS - Microsoft Windows

OTHER VENDORS RESPONSES

Intel official response is available here
https://security-center.intel.com/advisory.aspx?intelid=INTEL-SA-00088&languageid=en-fr

DELL official response: Meltdown and Spectre Vulnerabilities
DELL KB article with technical information: http://www.dell.com/support/article/SLN308588

HP response I have found so far: Resources to help mitigate Speculative Execution vulnerability in Intel and other processors
HP Support Customer Bulletin: a00039267en_us

If you know links to other or better server vendor responses, please leave them in the comments.

Hope this article is helpful. And as always, if you see any wrong information or you just have a question or some other experience, do not hesitate to write a comment. I will try to update this article when something new is available.


Tuesday, January 09, 2018

Admission Control "Dedicated fail-over hosts"

More than three years ago I published a blog post about "vSphere HA Cluster Redundancy". There are three admission control algorithms:
  • Define fail-over capacity by static number of hosts
  • Define fail-over capacity by reserving a percentage of cluster resources
  • Use dedicated fail-over hosts
I discussed the first two algorithms in detail, but the third one, "dedicated fail-over hosts", was described only briefly with the following words ...
This algorithm simply dedicates specified hosts to be unused during normal conditions and used only in case of an ESXi host failure. Multiple dedicated fail-over hosts are supported since vSphere 5.0. This algorithm will keep your capacity and performance absolutely predictable and independent of VM reservations. You'll get exactly what you configure.
I have recently been asked by one of my customers about the detailed behavior of the "dedicated fail-over hosts" method.

Question #1: How will VMs be restarted in case of a single host failure when two hosts are dedicated for fail-over?

Answer: All impacted VMs are restarted and spread across both dedicated fail-over hosts.

Question #2: Is it possible to vMotion VMs to dedicated hosts for fail-over?

Answer: Of course not. These hosts are dedicated just for fail-over and the vSphere cluster is aware of it, therefore it will not allow an administrator or DRS to migrate VMs there.

Question #3: What will happen with VMs when the failed host is back?

Answer: VMs will stay on the hosts dedicated for fail-over unless DRS moves them to other hosts. Based on my testing, DRS will probably do it only in case of a lack of resources, so some VMs can stay on the dedicated fail-over hosts, which is not good. Therefore, the vSphere administrator should check the cluster state after a host failure and move all VMs out of the dedicated fail-over hosts if DRS did not do it already.

Hope this helps the broader VMware community better understand VMware Admission Control. And as always, if you have another question, opinion or different experience, please feel free to leave a comment below.