Wednesday, April 18, 2018

What's new in vSphere 6.7

VMware vSphere 6.7 has been released and all the famous VMware bloggers have released their blog posts about its new features and capabilities. It is worth reading all of these blog posts, as each blogger focuses on a different area of the SDDC, which gives you broader context for the newly available product features and capabilities. That said, industry veterans should start with the product Release Notes and the official VMware blog posts.

Please note that this blog post is just an aggregation of information published elsewhere. All the sources used are listed below.

Release Notes:
vSphere 6.7 Release Notes

VMware KB:
Important information before upgrading to vSphere 6.7 

VMware official blog posts:
Introducing VMware vSphere 6.7!
Introducing vCenter Server 6.7
Introducing Faster Lifecycle Management Operations in VMware vSphere 6.7
Introducing vSphere 6.7 Security
What’s new with vSphere 6.7 Core Storage
vSphere 6.7 Videos

Community blog posts:
Emad Younis : vCenter Server 6.7 What’s New Rundown
Duncan Epping : vSphere 6.7 announced!
Cormac Hogan : What's new in vSphere and vSAN 6.7 release?
Cody Hosterman : What's new in core storage in vSphere 6.7 part I: in-guest unmap and snapshots
William Lam : All vSphere 6.7 release notes & download links
Florian Grehl (Virten) : VMware vSphere 6.7 introduces Skylake EVC Mode

So, after reading all the resources above, let's aggregate and document the interesting features area by area.

vSphere Management

vCenter Server with an embedded Platform Services Controller now supports Enhanced Linked Mode. This is nice because you can leverage vCenter Server High Availability to achieve higher availability of the PSC without an external load balancer. All benefits are listed below.
  • No load balancer required for high availability and fully supports native vCenter Server High Availability.
  • SSO Site boundary removal provides flexibility of placement.
  • Supports vSphere scale maximums.
  • Allows for 15 deployments in a vSphere Single Sign-On Domain.
  • Reduces the number of nodes to manage and maintain.

vSphere 6.7 introduces vCenter Server Hybrid Linked Mode, which makes it easy and simple for customers to have unified visibility and manageability across an on-premises vSphere environment running on one version and a vSphere-based public cloud environment, such as VMware Cloud on AWS, running on a different version of vSphere.

vSphere 6.7 also introduces Cross-Cloud Cold and Hot Migration, further enhancing the ease of management across clouds and enabling a seamless and non-disruptive hybrid cloud experience for customers.

vSphere 6.7 enables customers to use different vCenter versions while allowing cross-vCenter, mixed-version provisioning operations (vMotion, Full Clone and cold migrate) to continue seamlessly.

vCenter Server Appliance (VCSA) Syslog now supports up to three syslog forwarding targets.

The HTML5-based vSphere Client provides a modern user interface experience that is both responsive and easy to use, and it includes 95% of the functionality available in the Flash-based Web Client. Some of the newer workflows in the updated vSphere Client release include:
  • vSphere Update Manager
  • Content Library
  • vSAN
  • Storage Policies
  • Host Profiles
  • vDS Topology Diagram
  • Licensing
PSC/SSO CLI (cmsso-util) has received some improvements. Repointing an external vCenter Server Appliance across SSO sites within a vSphere SSO domain is supported, and repointing a vCenter Server Appliance across vSphere SSO domains is also supported. This is huge! It means that SSO domain consolidation is now possible. The domain repoint feature only supports external deployments running vSphere 6.7. The repoint tool can migrate licenses, tags, categories, and permissions from one vSphere SSO domain to another.

Brand-new Update Manager interface that is part of the HTML5-based vSphere Client. The new UI provides a much more streamlined remediation process.


A new vROps plugin for the vSphere Client. This plugin is available out of the box and provides some great new functionality. When interacting with this plugin, you will be greeted with six vRealize Operations Manager (vROps) dashboards directly in the vSphere Client!


Compute

vSphere 6.7 delivers a new capability that is key for the hybrid cloud, called Per-VM EVC. Per-VM EVC enables the EVC (Enhanced vMotion Compatibility) mode to become an attribute of the VM rather than the specific processor generation it happens to be booted on in the cluster. This allows for seamless migration across different CPUs by persisting the EVC mode per-VM during migrations across clusters and during power cycles.

A new EVC mode (Intel Skylake Generation) has been introduced. Compared to the Intel "Broadwell" EVC mode, the Skylake EVC mode exposes the following additional CPU features:
  • Advanced Vector Extensions 512
  • Persistent Memory Support Instructions
  • Protection Key Rights
  • Save Processor Extended States with Compaction
  • Save Processor Extended States Supervisor

Single reboot when upgrading ESXi hosts. This reduces maintenance time by eliminating one of the two reboots normally required for major version upgrades.

vSphere Quick Boot is a new innovation that restarts the ESXi hypervisor without rebooting the physical host, skipping time-consuming hardware initialization (aka POST, Power-On Self Tests).
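To find out whether a particular host platform supports Quick Boot, ESXi 6.7 ships with a compatibility check script. The path below is the one documented by VMware for 6.7, so treat it as an assumption and verify it on your build:

    /usr/lib/vmware/loadesx/bin/loadESXCheckCompat.py

If the script reports the host as compatible, Quick Boot can be enabled in the Update Manager remediation settings.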

Storage

Support for 4K native HDDs. Customers may now deploy ESXi on servers with 4Kn HDDs used for local storage (SSD and NVMe drives are currently not supported). ESXi provides a software read-modify-write layer within the storage stack that emulates 512B sector drives, so ESXi continues to expose 512B sector VMDKs to the guest OS. Servers with UEFI BIOS can boot from 4Kn drives.
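To check which sector format your local devices report (512n, 512e, or 4Kn), the esxcli capacity listing should do the job; as far as I remember, this command has been available since ESXi 6.5:

    esxcli storage core device capacity list

The output shows the logical and physical block sizes and the format type of each device.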

XCOPY enhancements. XCOPY is used to offload storage-intensive operations such as copying, cloning, and zeroing to the storage array instead of the ESXi host. With the release of vSphere 6.7, XCOPY now works not only with specific vendor VAAI primitives but with any vendor supporting the SCSI T10 standard. Additionally, XCOPY segments and transfer sizes are now configurable. By default, the maximum transfer size of an XCOPY ranges between 4MB and 16MB. In vSphere 6.7, through the use of PSA claim rules, this functionality is extended to additional storage arrays. Further details should be documented by the particular storage vendor.

Configurable Automatic UNMAP. Automatic UNMAP was released with vSphere 6.5 with a selectable priority of none or low. Storage vendors and customers have requested higher, configurable rates rather than the fixed 25 MBps. vSphere 6.7 adds a new method, "fixed", which allows you to configure an automatic UNMAP rate between 100 MBps and 2000 MBps, configurable both in the UI and the CLI.
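A minimal CLI sketch of the new fixed reclaim rate, assuming a VMFS-6 datastore named Datastore01 (the datastore name is illustrative, and the option names are written from memory, so verify them with esxcli storage vmfs reclaim config set --help on your 6.7 build):

    esxcli storage vmfs reclaim config get -l Datastore01
    esxcli storage vmfs reclaim config set -l Datastore01 --reclaim-method fixed -b 100

The first command shows the current reclaim settings of the datastore; the second one switches it to the fixed method at 100 MBps.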

UNMAP for SESparse. SESparse is a sparse virtual disk format used for snapshots in vSphere and is the default on VMFS-6. This release adds automatic space reclamation for VMs with SESparse snapshots on VMFS-6. This only works when the VM is powered on and only affects the top-most snapshot.

VVols enhancements. As VMware continues the development of Virtual Volumes, this release adds support for IPv6 and SCSI-3 persistent reservations. End-to-end IPv6 support enables organizations, including government, to implement VVols using IPv6. SCSI-3 persistent reservations are a substantial feature that allows shared disks/volumes between virtual machines across nodes/hosts. Often used for Microsoft WSFC clusters, this new enhancement allows for the removal of RDMs!

Increased maximum number of LUNs/paths (1K LUNs / 4K paths). The maximum number of LUNs per host is now 1024 instead of 512, and the maximum number of paths per host is 4096 instead of 2048. Customers may now deploy virtual machines with up to 256 disks using PVSCSI adapters. Each PVSCSI adapter can support up to 64 devices. Devices can be virtual disks or RDMs. A major change in 6.7 is the increased number of LUNs supported for Microsoft WSFC clusters: the number increased from 15 disks to 64 disks per adapter (PVSCSI only). This raises the number of LUNs available to a VM running Microsoft WSFC from 45 to 192.

VMFS-3 EOL. Starting with vSphere 6.7, VMFS-3 is no longer supported. Any volume/datastore still using VMFS-3 will automatically be upgraded to VMFS-5 during the installation of or upgrade to vSphere 6.7. Any new volume/datastore created going forward will use VMFS-6 as the default.
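Before upgrading, it is worth checking which VMFS versions your datastores actually run; the standard filesystem listing is enough for that:

    esxcli storage filesystem list

Any datastore still reported as VMFS-3 should be planned for an upgrade (or migration to a new VMFS-6 datastore) before moving to vSphere 6.7.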

Support for PMem/NVDIMMs. Persistent Memory, or PMem, is a type of non-volatile DRAM (NVDIMM) that has the speed of DRAM but retains contents through power cycles. It is a new layer that sits between NAND flash and DRAM, providing faster performance than flash while being non-volatile, unlike DRAM.

Intel VMD (Volume Management Device). vSphere 6.7 adds native support for Intel VMD technology to enable the management of NVMe drives. This technology was introduced as an installable option in vSphere 6.5. Intel VMD currently enables hot-swap management as well as NVMe drive LED control, allowing similar control as is used for SAS and SATA drives.

RDMA (Remote Direct Memory Access) over Converged Ethernet (RoCE). This release introduces RDMA support for ESXi hosts using RoCE v2. RDMA provides low-latency, higher-throughput interconnects with CPU offload between the endpoints. If a host has RoCE-capable network adapter(s), this feature is automatically enabled.
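To check whether a host already sees RDMA-capable adapters, the esxcli rdma namespace can be queried (to my knowledge it has been available since ESXi 6.5):

    esxcli rdma device list

The output lists the RDMA devices (vmrdma adapters) together with the uplinks they are bound to.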

Para-virtualized RDMA (PV-RDMA). In this release, ESXi introduces PV-RDMA for Linux guest OSes with RoCE v2 support. PV-RDMA enables customers to run RDMA-capable applications in virtualized environments. PV-RDMA-enabled VMs can also be live migrated.

iSER (iSCSI Extensions for RDMA). Customers may now deploy ESXi with external storage systems supporting iSER targets. iSER takes advantage of faster interconnects and CPU offload using RDMA over Converged Ethernet (RoCE). VMware provides the iSER initiator function, which allows the ESXi storage stack to connect with iSER-capable target storage systems.

SW-FCoE (Software Fibre Channel over Ethernet). In this release, ESXi introduces a software-based FCoE (SW-FCoE) initiator that can create FCoE connections over Ethernet controllers. The VMware FCoE initiator works on a lossless Ethernet fabric using Priority-based Flow Control (PFC). It can work in Fabric and VN2VN modes. Please check the VMware Compatibility Guide (VCG) for supported NICs.

Performance

vSphere 6.7 VCSA delivers phenomenal performance improvements (all metrics compared at cluster scale limits, versus vSphere 6.5):
  • 2X faster performance in vCenter operations per second
  • 3X reduction in memory usage
  • 3X faster DRS-related operations (e.g. power-on virtual machine)

Security

vSphere 6.7 adds support for Trusted Platform Module (TPM) 2.0 hardware devices and also introduces Virtual TPM 2.0, significantly enhancing protection and assuring integrity for both the hypervisor and the guest operating system.

vSphere 6.7 introduces support for the entire range of Microsoft's Virtualization Based Security technologies, aka "Credential Guard" support.

Recoverability

vCenter Server Appliance (VCSA) File-Based Backup, introduced in vSphere 6.5, now has a scheduler. Customers can now schedule the backups of their vCenter Server Appliances and select how many backups to retain. Another new section for File-Based Backup is Activities: once a backup job is complete, it is logged in the Activities section with detailed information. The Restore workflow now includes a backup archive browser; the browser displays all your backups without you having to know the entire backup path.

Conclusion

It seems that vSphere 6.7 is the continuous evolution of the best x86 virtualization platform, with a lot of interesting improvements, features, and capabilities. Keep in mind that this is just a list of features and capabilities, which have to be very carefully planned, designed, and tested before implementation in production.

Just FYI, I have not finished reading all the vSphere 6.7 documents yet, so I will update this blog post when I find something interesting.

Wednesday, April 11, 2018

How to disable Spectre and Meltdown mitigations?

Today, I was asked again "How do I disable Spectre and Meltdown mitigations on VMs running on top of ESXi?". I recently wrote about Spectre and Meltdown mitigations for VMware vSphere virtualized workloads here.

So, let's assume you have already applied patches and updates to ...
  • Guest OS (Windows, Linux, etc.)
  • Hypervisor - ESXi host (VMSA-2018-0004.3 and  VMSA-2018-0002)
  • BIOS (version having support for IBRS, IBPB, STIBP capabilities)
... therefore, you should be protected against Spectre and Meltdown vulnerabilities known as CVE-2017-5753 (Spectre - Variant 1), CVE-2017-5715 (Spectre - Variant 2), and CVE-2017-5754 (Meltdown - Variant 3).

These security mitigations do not come for free; they have a significant impact on performance. I did some testing in my lab and some of the results scared me. The biggest impact is on workloads with many system calls (calls from OS userland to the OS kernel), such as memory, network, and storage I/O operations. The performance impact is the reason why some administrators and application owners are willing to disable the security mitigations on systems where inter-process communication is trusted and potential data leaks between processes are not a problem.

So, let's answer the question. Spectre and Meltdown mitigations can be disabled at the guest operating system level. This is the preferred method.

RedHat

You can disable the security mitigations at runtime with the following three commands. The change is effective immediately and does not require a reboot.

    # echo 0 > /sys/kernel/debug/x86/pti_enabled
    # echo 0 > /sys/kernel/debug/x86/ibpb_enabled
    # echo 0 > /sys/kernel/debug/x86/ibrs_enabled

Note that these runtime settings are not persistent across reboots; a sketch of a persistent alternative follows.
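To make the change persistent across reboots, Red Hat documents equivalent kernel boot parameters (nopti, noibrs, noibpb) for its patched kernels. A minimal sketch for a RHEL 7 system using grubby; verify the parameter names against your distribution's documentation before relying on them:

    grubby --update-kernel=ALL --args="nopti noibrs noibpb"
    grubby --info=ALL | grep args

The first command appends the parameters to the boot entries of all installed kernels; the second one lets you verify the resulting kernel command line.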

MS Windows
In the Windows operating system, you can control it via the registry.

To enable the mitigations, you have to change the following registry settings:
  • reg add "HKEY_LOCAL_MACHINE\SYSTEM\CurrentControlSet\Control\Session Manager\Memory Management" /v FeatureSettingsOverride /t REG_DWORD /d 0 /f
  • reg add "HKEY_LOCAL_MACHINE\SYSTEM\CurrentControlSet\Control\Session Manager\Memory Management" /v FeatureSettingsOverrideMask /t REG_DWORD /d 3 /f
  • reg add "HKLM\SOFTWARE\Microsoft\Windows NT\CurrentVersion\Virtualization" /v MinVmVersionForCpuBasedMitigations /t REG_SZ /d "1.0" /f
  • Restart the server for changes to take effect.
and to disable the mitigations, you have to change the registry settings as follows:
  • reg add "HKEY_LOCAL_MACHINE\SYSTEM\CurrentControlSet\Control\Session Manager\Memory Management" /v FeatureSettingsOverride /t REG_DWORD /d 3 /f
  • Restart the server for changes to take effect.
Note: After any change, please test whether your system behaves as expected (secure or not secure). A quick way to check the currently configured values is shown below.
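To confirm which values are currently configured, you can query the same registry keys; interpreting the effective mitigation state is easier with Microsoft's SpeculationControl PowerShell module (Get-SpeculationControlSettings), but a plain registry query is enough to see what is set:
  • reg query "HKEY_LOCAL_MACHINE\SYSTEM\CurrentControlSet\Control\Session Manager\Memory Management" /v FeatureSettingsOverride
  • reg query "HKEY_LOCAL_MACHINE\SYSTEM\CurrentControlSet\Control\Session Manager\Memory Management" /v FeatureSettingsOverrideMask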

ESXi - not recommended method

Another method is to disable the CPU features at the ESXi level. This is not recommended by VMware, but it was recently published as a workaround in VMware KB 52345.

Here is the procedure for masking the CPU capabilities at the ESXi level; a consolidated command sequence follows the steps.

Step 1/ Log in to each ESXi host via SSH.

Step 2/ Add the following line to the /etc/vmware/config file:
cpuid.7.edx = "----:00--:----:----:----:----:----:----"

Step 3/ Run the command /sbin/auto-backup.sh to back up the config file and keep the configuration change persistent across ESXi reboots.

Step 4/ Power-cycle the VMs running on top of the ESXi host.
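The same steps, consolidated into a sequence you can run on each host over SSH (the cpuid mask line is exactly the one from Step 2; as always, test on a non-production host first):

    echo 'cpuid.7.edx = "----:00--:----:----:----:----:----:----"' >> /etc/vmware/config
    /sbin/auto-backup.sh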

This hides the speculative-execution control mechanism from virtual machines that are power-cycled afterward on the ESXi host, so you have to power-cycle the virtual machines on that host. A reboot of the ESXi host is not required. The effect is that the speculative execution control mechanism is no longer available to virtual machines even if the server firmware provides the corresponding microcode.

Conclusion

It is important to mention that the guest operating system inside a VM may or may not use the CPU capabilities IBRS, IBPB, and STIBP provided by the CPU microcode to mitigate the security issues. As far as I'm aware, these instructions are leveraged by guest OSes only to mitigate Spectre Variant 2 (CVE-2017-5715). In some cases, the guest OS can use other mitigation methods even for Spectre Variant 2. For example, the Linux kernel currently tries to leverage "Retpoline" code sequences to decrease the performance impact, but "Retpoline" is not applicable to all CPU models. So, there is no single recommendation which would fit all situations.

That's the reason why performance tuning by disabling security enhancements should always be done at the guest operating system level and not at the ESXi level. The ESXi workaround is just that, a workaround, which can be useful in case some new bug in the CPU microcode is discovered, but performance should always be handled by the guest OS.

Monday, April 09, 2018

What is vCenter PNID?

Today I got a question about what PNID is in vCenter.

Well, PNID (primary network identifier) is a VMware internal term, and it is officially called the "system name".

But my question to the questioner was why he needed to know anything about PNID. I got the expected answer: he had done some research on how to change the vCenter IP address and hostname.

So let's discuss these two requests independently.

First things first, the vCenter hostname cannot be changed, at least not in vCenter 6.0 and 6.5. It may or may not change in the future.

On the other hand, the vCenter IP address can be changed. However, the system name (aka PNID) is very important when you are trying to change the vCenter IP address. The vCenter IP address can be changed only when you entered an FQDN during vCenter installation; in that case, the PNID is the hostname. If you did not enter an FQDN during vCenter installation, the IP address is used as the PNID, which results in the inability to change the vCenter IP address.

Below is the command to check the PNID of your vCenter on the VCSA.

root@vc01 [ ~ ]# /usr/lib/vmware-vmafd/bin/vmafd-cli get-pnid --server-name localhost
vc01.home.uw.cz

root@vc01 [ ~ ]# 

In the case above, the PNID is the hostname (vc01.home.uw.cz), so I would be able to change the IP address.

Sunday, April 08, 2018

Blog.iGICS.com moved to www.vcdx200.com

This Friday, I got a very nice e-mail from one of my long-time readers. I'm not going to publish the e-mail, but the reader mentioned that he was shocked and panicked when he realized that blog.iGICS.com no longer exists. Fortunately, this particular reader found the new address of my blog, which is www.VCDX200.com. The e-mail forced me to stop for a moment and think about my blogging history.

I started blogging back in June 2006 at the address http://davidpasek.blogspot.com.

By the way, this address is still available because I still use Google's blogging platform, which is free of charge and good enough. My first blog post was very short and simple. It was written in my native (Czech) language and translates roughly as follows ...
May 2006: For a long time I have been reading in various magazines and on websites that writing blogs is in and trendy. Everybody says: write blogs! It is quite easy to start, but to write something on a regular basis that actually makes sense will not be that simple. And what do I want to write about at all? It will mainly be IT, which has been my hobby and passion for years, I could even say a decade. So let's see how this blog eventually turns out.
Only the first four blog posts were written in my native language, because I very soon realized that a blog about information technologies must be written in English to address the global IT community. My English is far from perfect, but English is the only global language, therefore there is no other choice.

Since the beginning, I was thinking about the blog name. The first blog name, iGICS, was chosen back in the times when I worked for Dell GICS (Global Infrastructure Consulting Services), and because the domain GICS.com was not available, I registered the domain igics.com and moved the blog to the address
http://blog.iGICS.com

Over time, I realized that the domain igics.com is not easy for my readers to remember, and because I moved from DELL to VMware and currently blog mainly about VMware topics, I decided to change the internet domain to VCDX200.com, as all VMware enthusiasts hopefully know what VCDX is, so it is easier to remember.

So this blog has become available at
http://www.vcdx200.com 

For some time there was an automated redirect from *.igics.com to www.vcdx200.com, but I recently decided to stop paying for the igics.com domain. However, all content is preserved and available under the new domain.

After reading the e-mail from one of my readers and double-checking the blog view statistics, I realized that some of my long-time readers have probably lost access to the content I have produced over the years. Therefore, I have decided to continue paying for the domain iGICS.com and keep the redirect to www.vcdx200.com.

It is fair to say that I'm blogging mainly for myself, to document the knowledge I would like to preserve and share with my customers who are paying my bills. The public content sharing is just a side effect :-) but it is nice to contribute to the community, so this is what I do here as well.

This weekend I stopped for a minute (actually two or three hours) and did some statistics. Over the years I have published 567 posts but, to be honest, for the first few years I was doing "microblogging", publishing interesting links to other resources on the internet just to quickly find them when necessary. Sometime in 2013, I started publishing bigger posts which can be qualified as full articles.

To name some blog posts which are, IMHO, full articles with some value, see the list below.

2013 - High latency on vSphere datastore backed by NFS (9,069 views)
2013 - Two (2) or four (4) socket servers for vSphere infrastructure? (1,949 views)
2014 - Disk queue depth in an ESXi environment (5,660 views)
2014 - vSphere HA Cluster Redundancy (271 views) + 2018 - Admission Control "Dedicated fail-over hosts" (631 views)
2014 ... 2016 - DELL Force10 : Series (more than 27,000 views)
2015 - End to End QoS solution for Vmware vSphere with NSX on top of Cisco UCS (2,959 views)
2015 - VMware Tools 10 and "shared productLocker" (1,380 views)
2016 - PowerCLI script to report VMtools version(s) (2,415 views)
2016 - Leveraging VMware LogInsight for VM hardware inventory (1,796 views)
2016 - ESXi host vCPU/pCPU reporting via PowerCLI to LogInsight (2,594 views)
2017 - Back to the basics - VMware vSphere networking (3,260 views)
2017 - vSphere Switch Independent Teaming or LACP? (950 views)
2018 - No Storage, No vSphere, No Datacenter (1,506 views)
2018 - Storage QoS with vSphere virtual disk IOPS limits (775 views)
2018 - vSphere 6.5 - DRS CPU Over-Commitment (990 views)
2018 - VMware Response to Speculative Execution security issues, CVE-2017-5753, CVE-2017-5715, CVE-2017-5754 (aka Spectre and Meltdown) (2,586 views)

Anyway, the most popular (most visited) posts in the blog's history are

2014 - How to clear all jobs on DELL Lifecycle Controller via iDRAC (10,422 views)
2013 - High latency on vSphere datastore backed by NFS (9,069 views)
2013 - Calculating optimal segment size and stripe size for storage LUN backing vSphere VMFS Datastore (8,981 views)
2014 - DELL Force10 : VLT - Virtual Link Trunking (8,974 views)
2014 - Heads Up! VMware virtual disk IOPS limit bad behavior in VMware ESX 5.5 (5,114 views)
2017 - How to install VMware tools on FreeBSD server - (4,828 views)

The majority of my readers are from the United States. I personally think that the US is the leader in information technologies, so I'm proud my articles are interesting for the US IT community. Almost 10 times fewer readers are from Russia, Germany, Israel, and France, which are the top IT technology countries in the EMEA region.

Pageviews by Countries
So Michael (he is the US reader who sent me the e-mail), thank you again for your nice e-mail, and I hope not only you but also others will appreciate that the address blog.iGICS.com is back. Paying $10 per year for another internet domain will probably not ruin me :-) so I hope it helps the global IT community at least a little bit.

Cheers.

Friday, March 23, 2018

Deploying vCenter High Availability with network addresses in separate subnets

VMware vCenter High Availability is a very interesting feature included in vSphere 6.5. Generally, it provides higher availability of the vCenter service by having three vCenter nodes (active/passive/witness) all serving a single vCenter service.

This is written in the official vCenter HA documentation
vCenter High Availability (vCenter HA) protects vCenter Server Appliance against host and hardware failures. The active-passive architecture of the solution can also help you reduce downtime significantly when you patch vCenter Server Appliance.
After some network configuration, you create a three-node cluster that contains Active, Passive, and Witness nodes. Different configuration paths are available.   
The last sentence is very true. The simplest VCHA deployment is within the same SSO domain and within a single datacenter with two Layer 2 networks, one for management and a second for the heartbeat. Such a design can be deployed in a fully automated manner; you just need to provide a dedicated network (portgroup/VLAN) for the heartbeat and use 3 IP addresses from a separate heartbeat subnet. Easy. But is that what you are expecting from vCenter HA? To be honest, the much more attractive use case is to spread the vCenter HA nodes across three datacenters to keep vSphere management up and running even when one of the two main datacenters experiences an issue. Conceptually, it is depicted in the figure below.

Conceptual vCenter HA Design
In this particular concept, I have used embedded PSC controllers for simplicity, so vCenter HA increases the availability even of the PSC services. The most interesting challenge in this concept is networking, so let's look into the intended network logical design.

vCenter HA - networking logical design
Networking logical design:

  • Each vCenter Server Appliance node has two NICs
  • One NIC is connected to the management network and the second NIC to the heartbeat network.
  • The Layer 2 management network (VLAN 4) is stretched across datacenters A and B because the vCenter IP address must work without human intervention in datacenter B after a VCHA fail-over.
  • In each datacenter we have an independent heartbeat network (VCHA-HB-A, VCHA-HB-B, VCHA-HB-C) with different IP subnets, so that Layer 2 is not stretched across datacenters, especially not to datacenter C where the witness resides. This requires specific static routes in each vCenter Server Appliance node to achieve IP reachability over the heartbeat network; a configuration sketch follows this list.
  • Specific VCHA network TCP/UDP ports must be allowed among the VCHA nodes across the heartbeat network.
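As a minimal sketch of such a static route, assuming the VCSA heartbeat NIC is eth1 managed by systemd-networkd (which is the case on the Photon OS based appliance) and assuming illustrative heartbeat subnets 10.0.1.0/24 (datacenter A), 10.0.2.0/24 (datacenter B) and 10.0.3.0/24 (datacenter C) with 10.0.1.1 as the local heartbeat gateway, the eth1 network file on the node in datacenter A could look like the example below. The exact file name and the full procedure are described in VMware KB 2148442.

    # /etc/systemd/network/10-eth1.network (example values only)
    [Match]
    Name=eth1

    [Network]
    Address=10.0.1.11/24

    [Route]
    Destination=10.0.2.0/24
    Gateway=10.0.1.1

    [Route]
    Destination=10.0.3.0/24
    Gateway=10.0.1.1

After editing the file, restart the networking service (systemctl restart systemd-networkd) on the node and verify that the remote heartbeat subnets are reachable with ping.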
Helpful documents:

Implementation Notes: 

Note 1:
VMware KB 2148442 (Deploying vCenter High Availability with network addresses in separate subnets) is very important for deploying such a design, but one piece of information is missing there. After cloning the vCenter Server Appliances, you have to go to the passive node and configure on eth0 the same IP address you use on the active node. The configuration is in the file /etc/systemd/network/10-eth0.network.manual
Note 2:
In case of a badly destroyed VCHA cluster, use the following commands to destroy VCHA from the command line:

    cd /etc/systemd/network
    mv 10-eth0.network.manual 20-eth0.network
    destroy-vcha
    reboot

The solution was found at https://communities.vmware.com/thread/552084
Link to the official documentation (Resolving Failover Failures) - https://docs.vmware.com/en/VMware-vSphere/6.5/com.vmware.vsphere.avail.doc/GUID-FE5106A8-5FE7-4C38-91AA-D7140944002D.html

Friday, March 09, 2018

How to check I/O device on VMware HCL

VMware's Hardware Compatibility List (HCL) of supported I/O devices is available here:
https://www.vmware.com/resources/compatibility/search.php?deviceCategory=io

VMware HCL for I/O devices

The best identification of an I/O device is by VID (Vendor ID), DID (Device ID), SVID (Sub-Vendor ID), and SSID (Sub-Device ID). VID, DID, SVID, and SSID can simply be entered into the VMware HCL and you will find out whether the device is supported and which capabilities have been tested. You can also find the supported firmware and driver.

To get these identifiers, you have to log in to ESXi via SSH and use the command "vmkchdev -l". This command shows VID:DID SVID:SSID for PCI devices, and you can use grep to filter just the VMware NICs (aka vmnic):
vmkchdev -l | grep vmnic
You should get similar output

[dpasek@esx01:~] vmkchdev -l | grep vmnic
0000:02:00.0 14e4:1657 103c:22be vmkernel vmnic0
0000:02:00.1 14e4:1657 103c:22be vmkernel vmnic1
0000:02:00.2 14e4:1657 103c:22be vmkernel vmnic2
0000:02:00.3 14e4:1657 103c:22be vmkernel vmnic3
0000:05:00.0 14e4:168e 103c:339d vmkernel vmnic4
0000:05:00.1 14e4:168e 103c:339d vmkernel vmnic5
0000:88:00.0 14e4:168e 103c:339d vmkernel vmnic6
0000:88:00.1 14e4:168e 103c:339d vmkernel vmnic7

So, in the case of vmnic4:
  • VID:DID SVID:SSID
  • 14e4:168e 103c:339d
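To cross-check the driver and firmware actually running on the host against what the HCL lists for that VID/DID/SVID/SSID combination, the standard esxcli NIC query can be used (vmnic4 is just the adapter from the example above):

    esxcli network nic get -n vmnic4

The Driver Info section of the output contains the driver name, driver version, and firmware version.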


The same applies to HBAs and disk controllers. For HBAs and local disk controllers, use
vmkchdev -l | grep vmhba
This is the output from my Intel NUC in my home lab

[root@esx02:~] vmkchdev -l | more
0000:00:00.0 8086:0a04 8086:2054 vmkernel
0000:00:02.0 8086:0a26 8086:2054 vmkernel
0000:00:03.0 8086:0a0c 8086:2054 vmkernel
0000:00:14.0 8086:9c31 8086:2054 vmkernel vmhba32
0000:00:16.0 8086:9c3a 8086:2054 vmkernel
0000:00:19.0 8086:1559 8086:2054 vmkernel vmnic0
0000:00:1b.0 8086:9c20 8086:2054 vmkernel
0000:00:1d.0 8086:9c26 8086:2054 vmkernel
0000:00:1f.0 8086:9c43 8086:2054 vmkernel
0000:00:1f.2 8086:9c03 8086:2054 vmkernel vmhba0
0000:00:1f.3 8086:9c22 8086:2054 vmkernel

vmhba0 is the local disk controller
vmhba32 is the USB storage controller
vmnic0 is a network interface

Hope this helps.

Wednesday, February 07, 2018

Storage QoS with vSphere virtual disk IOPS limits

I'm a long-time protagonist of storage QoS applied per VM virtual disk (aka vDisk). In the past, vSphere virtual disk shares and IOPS limits were the only solutions. Nowadays, there are new architectural options - vSphere virtual disk reservations and VVols QoS. Anyway, whatever option you decide to use, the reason to use QoS (IOPS limits) lies in the architecture of all modern shared storage systems. Let's dig a little bit deeper. In modern storage systems, you usually have a single pool of a large number of disks, with volumes (aka LUNs) created on top of this pool. This modern storage array approach has a lot of advantages, and the key advantage is the positive impact on performance, because a single LUN is spread across a large number of disks, which multiplies the available performance. However, you usually have multiple LUNs, and all these LUNs reside on the same disk pool, so they sit on the same disks; therefore, all these LUNs interfere with each other from a performance point of view. Even if you had a single LUN, you would have multiple VMs per LUN (aka VMFS datastore), so the VMs would affect each other.

Design consideration

What does it all mean? Well, if you have a typical vSphere and storage configuration as depicted in the figure below, a single VM can use all the performance of the underlying storage system (disk pool), and the I/O workload from one VM impacts other VMs. We call this situation a "noisy neighbor", where one VM has significantly higher storage traffic than the others.

Modern Storage System Architecture - Disk Pool
The problem with such a design is not only performance interference but also the fact that performance is unpredictable. VMs get excellent performance when the storage is NOT overloaded but very poor performance during peak times. From a user experience point of view, it is better to limit VMs and so decrease the difference between the BEST and WORST performance.

How do you design your infrastructure to keep your storage performance predictable? The short answer is to implement QoS. Of course, there is also a longer answer.

Storage QoS

vSphere storage QoS is called Storage I/O Control, or simply SIOC. Storage-based per-LUN QoS will not help you on VMFS because multiple VM virtual disks (vDisks) are accommodated on a single LUN (datastore). However, all modern storage systems support Virtual Volumes (aka VVols), which is a VMware framework for managing vDisks directly on the storage system; the implementation is always specific to the particular storage vendor. Nevertheless, in general, VVols can do QoS per vDisk (aka VMDK), as each vDisk is represented on the storage system as an independent volume (aka LUN, or better said sub-LUN or micro-LUN).

vSphere Storage QoS (SIOC) supports:

• IOPS Limits
• IOPS Reservations
• IOPS Shares

Storage I/O Control (SIOC) was initially introduced in vSphere 4.1; nowadays it is called SIOC V1. vSphere 6.5 introduced SIOC V2.

vSphere SIOC V1

In case you want to use vSphere SIOC V1 storage QoS, it is good to know at least some of the SIOC V1 "IOPS limit" implementation details:
• I/Os are normalized at 32KB, so a 128KB I/O is counted as 4 I/Os. This is the vSphere default, which can be changed via the advanced parameter Disk.SchedCostUnit (default value 32768 bytes) to satisfy specific customer requirements. Disk.SchedCostUnit is configurable, and the allowed values range between 4K and 256K; see the example after this list.
• In the latest ESXi version (ESXi 6.5 build 7526125), the IOPS limit does not interfere with Storage vMotion. However, this is different behavior than in previous ESXi versions, where Storage vMotion I/O was billed to the VM.
• If a VM has multiple disks with IOPS limits on the same datastore (LUN), then all limits are aggregated and enforced on that datastore for the whole VM. If the vDisks of a single VM are located on different datastores, then the limits are enforced independently per datastore. The behavior is different on NFS datastores. All this behavior is explained in the VMware KB "Limiting disk I/O from a specific virtual machine" - https://kb.vmware.com/kb/1038241
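As a small sketch of the Disk.SchedCostUnit tuning mentioned in the first bullet (standard esxcli advanced settings commands; change the value only if you really have a reason to):

    esxcli system settings advanced list -o /Disk/SchedCostUnit
    esxcli system settings advanced set -o /Disk/SchedCostUnit -i 65536

The first command shows the current value (32768 bytes by default); the second one changes the I/O normalization size to 64 KB.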

vSphere SIOC V2

It is worth mentioning that SIOC V1 and SIOC V2 can co-exist. SIOC V2 is very different compared to V1.

SIOC V2 implementation details

• I/Os are NOT normalized at a static I/O size like in SIOC V1. In other words, SIOC V2 does not have Disk.SchedCostUnit implemented.
• SIOC V2 is implemented using the IO Filter framework and is managed by using SPBM policies.

VVols QoS

VVols QoS is out of the scope of this article.

Hope it is informative, but as always, do not hesitate to contact me for further details or discussions.