Tuesday, November 13, 2018

VCSA - This appliance cannot be used or repaired ...

I have just received an email from a customer describing a weird issue with the VMware vCenter Server Appliance (VCSA).

The customer runs weekly native backups of the VCSA manually via the virtual appliance management interface (VAMI). When he tried to log in to VAMI to run another native backup, he got the following error message:

Error message - This appliance cannot be used or repaired because a failure was encountered. You need to deploy a new appliance.
The error message includes a resolution: deploy a new appliance. That is the last thing a typical vSphere admin wants to do to resolve such an issue. Fortunately, there is another solution/workaround.

To resolve this issue, stop and start all the services on the VCSA:
  • Connect to the vCenter Server Appliance via SSH (for example, with PuTTY).
  • Log in to the VCSA using the root credentials.
  • Enable the "shell".
  • Restart the VCSA services.

To restart VCSA services run the following commands:
service-control --stop --all
service-control --start --all

If a simple restart of the services does not help, the issue may be caused by a recent backup job. In that case, there is an additional workaround:
  • Connect to the vCenter Server Appliance via SSH (for example, with PuTTY).
  • Log in to the VCSA using the root credentials.
  • Enable the "shell".
  • Move the /var/vmware/applmgmt/backupRestore-history.json file to /var/tmp/.
  • Restart the vCenter Server Appliance.
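The file move in the workaround above can be sketched as a small Python helper. This is only an illustration, assuming a Python 3 interpreter is available in the appliance shell; the function name move_backup_history is hypothetical and not part of any VMware tooling. It refuses to overwrite an earlier copy in the target directory and does nothing if the history file is missing.

```python
import shutil
from pathlib import Path


def move_backup_history(history_file, target_dir):
    """Move the backup history file into target_dir.

    Returns the new path, or None if the file does not exist.
    (Hypothetical helper, not a VMware-provided function.)
    """
    src = Path(history_file)
    if not src.is_file():
        # Nothing to move, so leave everything untouched.
        return None
    dst = Path(target_dir) / src.name
    if dst.exists():
        # Keep any earlier copy instead of silently overwriting it.
        raise FileExistsError(f"{dst} already exists; refusing to overwrite it")
    shutil.move(str(src), str(dst))
    return dst
```

On the appliance it would be called with the paths from the step above, i.e. move_backup_history("/var/vmware/applmgmt/backupRestore-history.json", "/var/tmp"), followed by the appliance restart.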
Hope this helps other folks in the VMware community.

Sunday, November 11, 2018

Intel Software Guard Extensions (SGX) in VMware VM

Yesterday, I got a very interesting question. A colleague of mine asked me whether Intel SGX can be leveraged within a VMware virtual machine. We both work for VMware as TAMs (Technical Account Managers), so we are the first stop for such technical questions from our customers.

I'm always curious about the business reason behind any technical question. The question comes from my colleague's customer, who is going to run infrastructure for some "Blockchain" applications leveraging the Intel SGX CPU feature set. The customer would like to run these applications virtualized on top of VMware vSphere to
  • simplify infrastructure management and capacity planning
  • increase server high availability 
  • optimize compute resource management

However, SGX support is mandatory for this type of application. If virtual machines do not support it, the customer is forced to run the applications on bare metal.

We, as VMware TAMs, can leverage a lot of internal resources. However, I personally believe that nothing compares to your own testing. After a few hours of testing in the home lab, I feel more confident discussing the subject with other folks internally within VMware or externally with my customers. By the way, that's the reason I have my own vSphere home lab, and this is a very nice example of how I justify to myself and my family why I have invested a pretty decent amount of money into the home lab in our garage. But back to the topic.

Let's start with the terminology and testing method

Intel Software Guard Extensions (SGX) is a set of central processing unit (CPU) instruction codes from Intel that allows user-level code to allocate private regions of memory, called enclaves, that are protected from processes running at higher privilege levels. Intel designed SGX to be useful for implementing secure remote computation, secure web browsing, and digital rights management (DRM). [source]

The CPUID opcode is a processor supplementary instruction (its name derived from CPU IDentification) for the x86 architecture allowing software to discover details of the processor. It was introduced by Intel in 1993 when it introduced the Pentium and SL-enhanced 486 processors. [source]

It is worth reading the document "Properly Detecting Intel® Software Guard Extensions (Intel® SGX) in Your Applications" [source]

The most interesting part is ...
What about CPUID? The CPUID instruction is not sufficient to detect the usability of Intel SGX on a platform. It can report whether or not the processor supports the Intel SGX instructions, but Intel SGX usability depends on both the BIOS settings and the PSW. Applications that make decisions based solely on CPUID enumeration run the risk of generating a #GP or #UD fault at runtime. In addition, VMMs (for example, Hyper-V*) can mask CPUID results, and thus a system may support Intel SGX even though the results of the CPUID report that the Intel SGX feature flag is not set.
For our purpose, CPUID detection should be enough, as we can test it on a bare-metal OS and later in a Guest OS running inside a virtual machine. The rest of the testing is up to the application itself and is outside the scope of this blog post.

Another article worth reading is "CPUID — CPU Identification" [source]. The most interesting part of this document is ...
INPUT EAX = 12H: Returns Intel SGX Enumeration Information. When CPUID executes with EAX set to 12H and ECX = 0H, the processor returns information about Intel SGX capabilities.
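The two CPUID checks quoted above can be sketched in Python as plain bit tests on register values that have already been read (for example, copied from the output of a test program). This is a minimal sketch, not the code of any particular tool; the function names are illustrative. The bit positions follow the Intel CPUID documentation: the SGX flag is bit 2 of EBX in leaf 07H, SGX1 and SGX2 are bits 0 and 1 of EAX in leaf 12H, and the two MaxEnclaveSize fields are the low two bytes of EDX in leaf 12H.

```python
def sgx_flag(leaf7_ebx):
    """CPUID.(EAX=07H,ECX=0):EBX bit 2 reports whether the CPU has SGX."""
    return bool(leaf7_ebx & (1 << 2))


def sgx_capabilities(leaf12_eax, leaf12_edx):
    """Decode CPUID.(EAX=12H,ECX=0); only meaningful when sgx_flag() is True."""
    return {
        "sgx1": bool(leaf12_eax & (1 << 0)),               # EAX bit 0
        "sgx2": bool(leaf12_eax & (1 << 1)),               # EAX bit 1
        "max_enclave_size_not64": leaf12_edx & 0xFF,       # EDX bits 7:0
        "max_enclave_size_64": (leaf12_edx >> 8) & 0xFF,   # EDX bits 15:8
    }
```

For example, sgx_flag(0x29c6fbf) is True, and sgx_capabilities(0x1, 0x241f) reports SGX1 without SGX2 and the size fields 0x1f and 0x24.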
And the most useful resource is https://github.com/ayeks/SGX-hardware
It contains GNU C source code to test SGX support and a clear explanation of how to identify support within the operating system. I used my favorite OS, FreeBSD, and simply downloaded the code from GitHub

fetch https://raw.githubusercontent.com/ayeks/SGX-hardware/master/test-sgx.c

compile it

cc test-sgx.c -o test-sgx

and run the executable

./test-sgx

and you can see the application (test-sgx) output with information about SGX support. The output should be similar to this one.

 root@test-sgx-vmhw4:~/sgx # ./test-sgx  
 eax: 406f0 ebx: 10800 ecx: 2d82203 edx: fabfbff 
 stepping 0 
 model 15 
 family 6 
 processor type 0 
 extended model 4 
 extended family 0 
 smx: 0 
 Extended feature bits (EAX=07H, ECX=0H) 
 eax: 0 ebx: 0 ecx: 0 edx: 0 
 sgx available: 0 
 CPUID Leaf 12H, Sub-Leaf 0 of Intel SGX Capabilities (EAX=12H,ECX=0) 
 eax: 0 ebx: 440 ecx: 0 edx: 0 
 sgx 1 supported: 0 
 sgx 2 supported: 0 
 MaxEnclaveSize_Not64: 0 
 MaxEnclaveSize_64: 0 
 CPUID Leaf 12H, Sub-Leaf 1 of Intel SGX Capabilities (EAX=12H,ECX=1) 
 eax: 0 ebx: 3c0 ecx: 0 edx: 0 
 CPUID Leaf 12H, Sub-Leaf 2 of Intel SGX Capabilities (EAX=12H,ECX=2) 
 eax: 0 ebx: 0 ecx: 0 edx: 0 
 CPUID Leaf 12H, Sub-Leaf 3 of Intel SGX Capabilities (EAX=12H,ECX=3) 
 eax: 0 ebx: 0 ecx: 0 edx: 0 
 CPUID Leaf 12H, Sub-Leaf 4 of Intel SGX Capabilities (EAX=12H,ECX=4) 
 eax: 0 ebx: 0 ecx: 0 edx: 0 
 CPUID Leaf 12H, Sub-Leaf 5 of Intel SGX Capabilities (EAX=12H,ECX=5) 
 eax: 0 ebx: 0 ecx: 0 edx: 0 
 CPUID Leaf 12H, Sub-Leaf 6 of Intel SGX Capabilities (EAX=12H,ECX=6) 
 eax: 0 ebx: 0 ecx: 0 edx: 0 
 CPUID Leaf 12H, Sub-Leaf 7 of Intel SGX Capabilities (EAX=12H,ECX=7) 
 eax: 0 ebx: 0 ecx: 0 edx: 0 
 CPUID Leaf 12H, Sub-Leaf 8 of Intel SGX Capabilities (EAX=12H,ECX=8) 
 eax: 0 ebx: 0 ecx: 0 edx: 0 
 CPUID Leaf 12H, Sub-Leaf 9 of Intel SGX Capabilities (EAX=12H,ECX=9) 
 eax: 0 ebx: 0 ecx: 0 edx: 0 
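The first lines of the output (stepping, model, family and so on) are decoded from the EAX value of CPUID leaf 01H, and smx is bit 6 of ECX in the same leaf. As a cross-check, the decoding can be sketched in Python. This is an illustrative sketch with hypothetical function names, not the code of test-sgx.c; the display_model field is an addition that combines the extended model and model fields the way Intel documents it for family 6 and 15 processors.

```python
def decode_leaf1_eax(eax):
    """Split the CPUID leaf 01H EAX value into its version fields."""
    model = (eax >> 4) & 0xF
    family = (eax >> 8) & 0xF
    extended_model = (eax >> 16) & 0xF
    fields = {
        "stepping": eax & 0xF,                  # bits 3:0
        "model": model,                         # bits 7:4
        "family": family,                       # bits 11:8
        "processor_type": (eax >> 12) & 0x3,    # bits 13:12
        "extended_model": extended_model,       # bits 19:16
        "extended_family": (eax >> 20) & 0xFF,  # bits 27:20
    }
    # For family 6 and 15 the extended model forms the upper nibble of the model.
    if family in (0x6, 0xF):
        fields["display_model"] = (extended_model << 4) + model
    else:
        fields["display_model"] = model
    return fields


def smx_flag(leaf1_ecx):
    """CPUID leaf 01H ECX bit 6 reports Safer Mode Extensions (SMX)."""
    return bool(leaf1_ecx & (1 << 6))
```

Feeding it the value from the output above, decode_leaf1_eax(0x406f0) gives stepping 0, model 15, family 6, processor type 0, extended model 4 and extended family 0, which matches what the program printed.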

Let's continue with testing of various combinations

My home lab is based on the Intel NUC 6i3SYH, which supports SGX. SGX has to be enabled in BIOS. There are three SGX options within BIOS
  • Software Controlled (default)
  • Disabled
  • Enabled

 
BIOS screenshot
First of all, let's do three tests of SGX support on bare metal (Intel NUC 6i3SYH). I installed FreeBSD 11.0 on a USB disk and tested SGX with all three SGX-related BIOS options.


Physical hardware (Software Controlled SGX)

eax: 406e3 ebx: 100800 ecx: 7ffafbbf edx: bfebfbff
stepping 3
model 14
family 6
processor type 0
extended model 4
extended family 0
smx: 0

Extended feature bits (EAX=07H, ECX=0H)
eax: 0 ebx: 29c6fbf ecx: 0 edx: 0
sgx available: 1 (TRUE)

CPUID Leaf 12H, Sub-Leaf 0 of Intel SGX Capabilities (EAX=12H,ECX=0)
eax: 0 ebx: 0 ecx: 0 edx: 0
sgx 1 supported: 0 (FALSE)
sgx 2 supported: 0
MaxEnclaveSize_Not64: 0 (FALSE)
MaxEnclaveSize_64: 0 (FALSE)

Test result: SGX is available for CPU but not enabled in BIOS


Physical hardware (Disabled SGX)

eax: 406e3 ebx: 1100800 ecx: 7ffafbbf edx: bfebfbff
stepping 3
model 14
family 6
processor type 0
extended model 4
extended family 0
smx: 0

Extended feature bits (EAX=07H, ECX=0H)
eax: 0 ebx: 29c6fbf ecx: 0 edx: 0
sgx available: 1 (TRUE)

CPUID Leaf 12H, Sub-Leaf 0 of Intel SGX Capabilities (EAX=12H,ECX=0)
eax: 0 ebx: 0 ecx: 0 edx: 0
sgx 1 supported: 0 (FALSE)
sgx 2 supported: 0
MaxEnclaveSize_Not64: 0 (FALSE)
MaxEnclaveSize_64: 0 (FALSE)

Test result: SGX is available for CPU but not enabled in BIOS


Physical hardware (Enabled SGX)

eax: 406e3 ebx: 100800 ecx: 7ffafbbf edx: bfebfbff
stepping 3
model 14
family 6
processor type 0
extended model 4
extended family 0
smx: 0

Extended feature bits (EAX=07H, ECX=0H)
eax: 0 ebx: 29c6fbf ecx: 0 edx: 0
sgx available: 1 (TRUE)

CPUID Leaf 12H, Sub-Leaf 0 of Intel SGX Capabilities (EAX=12H,ECX=0)
eax: 1 ebx: 0 ecx: 0 edx: 241f
sgx 1 supported: 1 (TRUE)
sgx 2 supported: 0
MaxEnclaveSize_Not64: 1f (OK)
MaxEnclaveSize_64: 24 (OK)

Test result: SGX is available for CPU and enabled in BIOS
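The two MaxEnclaveSize values are exponents, not sizes: per the Intel CPUID documentation, the maximum enclave size is 2 raised to the reported value, in bytes. A small Python sketch (the helper name is illustrative) converts the values printed above into sizes.

```python
def max_enclave_bytes(exponent):
    """Maximum enclave size in bytes for a MaxEnclaveSize exponent."""
    return 1 << exponent


GIB = 1 << 30  # bytes per GiB

# Values from the "Enabled SGX" run: EDX = 0x241f
not64 = max_enclave_bytes(0x1F)  # non-64-bit mode
is64 = max_enclave_bytes(0x24)   # 64-bit mode
print(not64 // GIB, "GiB in non-64-bit mode")  # prints 2 GiB in non-64-bit mode
print(is64 // GIB, "GiB in 64-bit mode")       # prints 64 GiB in 64-bit mode
```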



So, we have validated that SGX capabilities are available on the FreeBSD operating system running on bare metal when SGX is enabled in BIOS.

The next step is to repeat the tests on virtual machines running on top of the VMware hypervisor (ESXi) installed on the same physical hardware (Intel NUC 6i3SYH). At the moment, I have vSphere 6.5 (ESXi build 7388607), which supports VM hardware up to version 13. Let's run the SGX tests on the very old VM hardware 4 and on the fresh VM hardware 13. All tests with VMs were executed on the physical system with SGX explicitly enabled in BIOS.



VM hardware version 4

eax: 406f0 ebx: 10800 ecx: 2d82203 edx: fabfbff
stepping 0
model 15
family 6
processor type 0
extended model 4
extended family 0
smx: 0

Extended feature bits (EAX=07H, ECX=0H)
eax: 0 ebx: 0 ecx: 0 edx: 0
sgx available: 0 (FALSE)

CPUID Leaf 12H, Sub-Leaf 0 of Intel SGX Capabilities (EAX=12H,ECX=0)
eax: 0 ebx: 440 ecx: 0 edx: 0
sgx 1 supported: 0 (FALSE)
sgx 2 supported: 0
MaxEnclaveSize_Not64: 0 (FALSE)
MaxEnclaveSize_64: 0 (FALSE)

Test result: SGX is not available for CPU in VM hardware version 4


VM hardware version 13

eax: 406f0 ebx: 10800 ecx: fffa3203 edx: fabfbff
stepping 0
model 15
family 6
processor type 0
extended model 4
extended family 0
smx: 0

Extended feature bits (EAX=07H, ECX=0H)
eax: 0 ebx: 1c2fbb ecx: 0 edx: 0
sgx available: 0 (FALSE)

CPUID Leaf 12H, Sub-Leaf 0 of Intel SGX Capabilities (EAX=12H,ECX=0)
eax: 7 ebx: 340 ecx: 440 edx: 0
sgx 1 supported: 1 (TRUE)
sgx 2 supported: 1 (TRUE)
MaxEnclaveSize_Not64: 0 (FALSE)
MaxEnclaveSize_64: 0 (FALSE)

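Test result: CPU SGX functions are deactivated or SGX is not supported. The SGX flag in leaf 07H ("sgx available") is 0, and leaf 12H is only defined when that flag is set, so the "sgx 1 supported" and "sgx 2 supported" values above should not be read as real SGX support.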
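The way the five test results were read can be summarized in a short Python sketch: first check the SGX flag in leaf 07H, and only then look at the SGX1 bit in leaf 12H. The function name and the verdict strings are illustrative; the register values are copied from the outputs above.

```python
def classify(leaf7_ebx, leaf12_eax):
    """Return a verdict from EBX of leaf 07H and EAX of leaf 12H (sub-leaf 0)."""
    if not leaf7_ebx & (1 << 2):
        # Without the SGX flag, leaf 12H carries no SGX information.
        return "SGX not reported by CPUID"
    if leaf12_eax & (1 << 0):
        return "SGX available and enabled"
    return "SGX available but not enabled"


# (leaf 07H EBX, leaf 12H EAX) as printed in each test run
RUNS = {
    "bare metal, Software Controlled": (0x29C6FBF, 0x0),
    "bare metal, Disabled": (0x29C6FBF, 0x0),
    "bare metal, Enabled": (0x29C6FBF, 0x1),
    "VM hardware version 4": (0x0, 0x0),
    "VM hardware version 13": (0x1C2FBB, 0x7),
}

for name, (ebx, eax) in RUNS.items():
    print(f"{name}: {classify(ebx, eax)}")
```

Checking leaf 07H first is what keeps the VM hardware 13 run from being misread: its leaf 12H EAX value of 7 has the SGX1 and SGX2 bits set, but the SGX flag itself is clear.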

Conclusion

To leverage Intel SGX CPU capabilities in an application, the physical hardware must support SGX and SGX must be enabled in BIOS.
Note: SGX explicitly enabled in BIOS has been successfully tested on FreeBSD 11 running on bare metal (physical servers). It might also work with the BIOS option "Software Controlled", but that would require software enablement within the Guest OS. I did not test that scenario, so further testing would be required to prove this assumption.
FreeBSD 11 has been tested on bare metal with SGX enabled in BIOS, and in that configuration the SGX CPU capabilities were successfully identified within the operating system.
SGX support in virtual machines on top of the VMware hypervisor (ESXi 6.5) has been tested solely on physical hardware with SGX explicitly enabled in BIOS.
Unfortunately, SGX has NOT been identified even on the latest VM hardware for vSphere 6.5 (VM hardware version 13), even though the CPU capabilities identified by the Guest OS in VM hardware 13 are significantly extended in comparison to VM hardware 4.
I will try to upgrade my home lab to the latest vSphere 6.7 U1 and do additional testing with VM hardware version 14. In the meantime, I will open a discussion inside VMware about SGX support because, at the moment, one large VMware customer cannot virtualize a specific type of application even though they would like to.