Wednesday, March 24, 2021

What's new in vSAN 7 Update 2

vSAN 7 Update 2 has been released. Let's look at what's new in vSAN. It is a very interesting release if you ask me.

HCI Mesh 

HCI Mesh was available in the previous release as well, but Update 2 brings significant technical and licensing enhancements. Let's go through these enhancements one by one.

  • Mount a remote vSAN datastore from a non-vSAN vSphere cluster (aka HCI Mesh Compute Cluster). We call it Disaggregated HCI. This is huge because it does not require vSAN licensing in the vSphere cluster which mounts the remote vSAN datastore.
  • Improved scalability. Up to 128 hosts can connect to a remote vSAN datastore.
  • Improved storage policies and their integration with vSAN datastores. Storage policies now support
    • Dedup & Compression or Compression-only
    • Data-at-rest encryption
    • Hybrid or all-flash
Disaggregated HCI

Cloud Native Applications

Based on my experience, Kubernetes and the DevOps madness are the biggest use case for vSAN 7 nowadays. Do not get me wrong, I like agile and DevOps in general, as I have been using these principles for the last 25 years. However, enterprise DevOps adoption is a little bit "specific" :-) Nevertheless, time pressure drives DevOps and Agile methodologies even into inflexible enterprises, and the VMware hyper-converged stack supports them very nicely from a technical point of view.

  • vSAN 7 U2 supports expansion of persistent volumes without the need to take them offline, eliminating any interruption. This has a positive impact on the flexibility of container-powered workloads.

Native File Services

  • Support for vSAN stretched clusters and 2-node topologies
  • Support of Data-in-Transit Encryption and UNMAP
  • Snapshots for file services volumes (via API only), including the ability to extract the differences (delta) between two snapshots. This is a new snapshotting mechanism for point-in-time recovery of files. It allows our backup partners to build applications to protect file shares in new and interesting ways.
  • Improved scale, performance and efficiency
    • Optimized metadata handling and data path for more efficient transactions, especially with small files

vSAN File Services in stretched and 2-node clusters

Performance : vSAN over RDMA

Distributed storage systems like vSAN rely heavily on resilient network connectivity, performance, and efficiency. vSAN 7 U2 now supports clusters configured for RDMA-based networking, RoCE v2 specifically. Transmitting native vSAN protocols directly over RDMA can offer a level of efficiency that is difficult to achieve with traditional TCP-based connectivity over Ethernet. The support for RDMA also means that vSAN hosts have the intelligence to fall back to TCP connectivity in the event that RDMA is not supported on one of the hosts in a cluster. vSAN over RDMA is a great step toward supporting topologies that use the latest fast, efficient delivery of east-west traffic over a commodity network. Below is the list of new RDMA-related features:

  • Efficient connectivity for vSAN clusters using RDMA
  • Improved CPU utilization and app performance for certain workloads
    • Sequential reads
    • Random mixed reads/writes
  • Supports RDMA over Converged Ethernet v2 (RoCE v2)
  • Automatic detection and handling of RDMA adapters
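
The fallback behaviour described above can be pictured with a tiny toy model. This is only a sketch of the rule "use RDMA when every host supports it, otherwise fall back to TCP"; the `Host` class, the `rdma_capable` flag and the `cluster_transport` function are hypothetical names for illustration and are not part of any vSAN API.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass
class Host:
    name: str
    # Hypothetical flag: True when the host has a supported RoCE v2 adapter.
    rdma_capable: bool


def cluster_transport(hosts: list[Host]) -> str:
    """Return "RDMA" only when every host supports it, otherwise "TCP"."""
    if hosts and all(h.rdma_capable for h in hosts):
        return "RDMA"
    # A single host without RDMA support pushes the whole cluster to TCP.
    return "TCP"


if __name__ == "__main__":
    cluster = [Host("esx01", True), Host("esx02", True), Host("esx03", False)]
    print(cluster_transport(cluster))
```

In this model, replacing the adapter in `esx03` and flipping its flag to `True` is enough for the function to report RDMA for the cluster.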


vSAN over RDMA

vSAN over RDMA improves performance and efficiency. Performance numbers are always tricky, but the slide below shows some of them.

vSAN over RDMA performance and efficiency improvements

Performance : Optimizations in vSAN 7 U2 

VMware continues to drive better performance with each new version of vSAN. Many of these gains are achieved by optimizing the hypervisor and storage stack for the very latest and greatest hardware, which is precisely how VMware was able to deliver improved performance with vSAN 7 U2. First, changes were made in the hypervisor to better accommodate the architecture of AMD-based chipsets. Improvements were also made in vSAN (regardless of chipset used) to help reduce CPU resources during I/O activity for objects using the RAID-5 or RAID-6 data placement scheme. This will be especially beneficial for workloads issuing large sequential writes. vSAN 7 U2 also includes enhancements to help I/O commit to the buffer tier with higher degrees of parallelization. All of these add up to fewer impediments as I/O traverses the storage stack, and a reduction of CPU utilization and storage latency. Let's go through them one by one:

  • Improved NUMA awareness for AMD EPYC  Rome processors
  • Improved performance when using RAID-5/6 erasure codes
    • Improved large sequential writes
    • Reduced CPU usage
  • Improved CPU efficiency writing data to cache/buffer tier
    • Improved small random I/O
Performance optimization for AMD platform

Security: vSphere Native Key Provider in vSAN

I like this one, as it is another operational and architectural simplification leading to better infrastructure security. Security through data encryption is top of mind for many VMware customers. Encrypting data at rest is often a part of this security effort. With vSphere and vSAN 7 U2, VMware introduces support for the “Native Key Provider” feature, which can simplify key management for environments using encryption. For vSAN, the embedded KMS is ideal for Edge or 2-Node topologies, and is a great example of VMware’s approach to intrinsic security. So what is inside?

  • vSphere Native Key Provider enables “out of the box” data-at-rest protections 
  • Key provider integrated in vCenter Server & clustered ESXi hosts
  • Works with ESXi Key Persistence to eliminate dependencies
  • Adds flexible and easy-to-use options for advanced data-at-rest security

vSphere Native Key Provider in vSAN

Manageability: Skyline Health Diagnostic (SHD) tool for VMware customers

This is another great evolution of VMware Skyline. The Skyline Health Diagnostics tool is a self-service tool that brings some of the benefits of Skyline Health directly to an isolated environment. The tool is run by an administrator at whatever frequency they desire. It scans critical log bundles to detect issues, and provides notifications and recommendations for important issues along with their related KB articles. In a nutshell, it is:

  • Ideal for isolated environments unable to use online Skyline Health
  • Self-service tool for customers
    • Uses latest signature library from VMware
    • Scans log bundles
    • Detects issues and gives KB recommendations
  • Proactive support of issues without contacting GSS
  • Improved support experience for opened cases with GSS

Skyline Health Diagnostic (SHD) tool


Manageability: Extending vLCM compatibility

The vSphere Lifecycle Manager (vLCM) is VMware’s new lifecycle management platform built into the hypervisor that was first introduced in vSphere 7. vSphere 7 U2 also includes support for environments running vSphere with Tanzu that use NSX-T for their network overlay. These enhancements are a great example of how vLCM is maturing into a robust platform for hypervisor management that benefits our customers and our OEM partners. So what's new in vSAN 7 Update 2?

  • New vendor Plugin* for select Hitachi UCP ReadyNode models
  • Update recommendations automatically refreshed after common change events
    • VMware image depot
    • Change in desired image
  • Support for vSphere with Tanzu using NSX-T

Extending vLCM compatibility

vCenter Server and cluster deployment wizards with vLCM integration

Deploying a new vSAN cluster has never been easier thanks to the improvements introduced in this release. vSphere 7 U2 now integrates vLCM into the “Easy Install” and “Cluster Quick Start” deployment wizards for deployments of new hosts and clusters from OEMs that support vLCM. OEM vendors now have all of the capabilities in place to help their users deploy their new systems in a fast and fully compliant manner. The workflows used by administrators for both new cluster creation and greenfield environments, where a vCenter Server appliance is bootstrapped onto a single host, have been updated so that a host can easily be referenced for compliance through vLCM.

  • New vLCM image options found in multiple wizards:
    • Easy Install
    • Cluster Quick Start
  • Seed vLCM desired image through a reference host for new deployments
  • OEM Servers include:
    • OEM hypervisor ISO
    • vLCM desired state specs
    • vLCM image depot contents
    • VCSA installer
  • Supports VCSA CLI installer
Cluster deployment wizards with vLCM

Manageability: Fast Restarts of vSAN Hosts during Upgrades

vSAN 7 U2 provides better integration and coordination for hosts using Quick Boot to speed up the host update process. By introducing a new “suspend VMs to memory” option, and better integration with the Quick Boot workflow, the amount of data moved during a rolling upgrade is drastically reduced thanks to fewer VM migrations and a smaller amount of resynchronization data.

  • Minimizes VM migrations during rolling upgrades using  ‘Suspend to memory’ in Quick Boot
  • Restart hypervisor and vSAN while avoiding time-consuming hardware initialization
  • Faster restarts reduce resynchronization efforts
  • Complements host restart optimizations made in vSAN 7 U1

Fast Restarts of vSAN Hosts during Upgrades


Recoverability : Enhanced Data Durability During Unplanned Events

vSAN 7 U2 significantly improves how the latest written data is saved redundantly in the event of an unplanned transient error or outage. When an unplanned outage occurs on a host, vSAN 7 U2 will immediately write all incremental updates to another host in addition to the other host holding the active object replica. This helps ensure the durability of the changed data in the event that an additional outage occurs on the other host holding the active object replica. This builds on the capability first introduced in vSAN 7 U1 that used this technique for planned maintenance events. These data durability improvements also have an additional benefit: improving the time in which data is resynchronized to a stale object.

  • Maintains latest data redundantly in the event of an unplanned transient error or outage
  • Latest writes quickly committed to additional host ensuring durability of new data
  • Efficient and fast resyncs to stale components on recovered or new host
  • More frequent checks for silent disk errors
Enhanced Data Durability During Unplanned Events

Availability : vSAN support of vSphere Proactive High Availability (HA)

vSAN 7 U2 now supports vSphere Proactive HA, where the application state and any potential data stored can be proactively migrated to another host. 

  • Proactive response when vSAN host detects impending failure
    • Evacuates VMs
    • Migrates object data
  • Uses plug-in provided by participating OEM server vendors
  • Supports quarantine mode and maintenance mode
  • Increased application up-time
vSAN support of vSphere Proactive High Availability (HA)

Availability : Integrated DRS awareness of Stretched Cluster configurations

Stretched clusters provide higher availability across two availability zones within metro distance (response time less than 5 ms). A stretched cluster configuration must account not only for a variety of failure scenarios, but also for recovery conditions. vSAN 7 U2 introduces integration between data placement and DRS so that, after a recovered failure condition, DRS will keep the VM state at the same site until data is fully resynchronized, which ensures that read operations do not traverse the inter-site link (ISL). Once data is fully resynchronized, DRS will move the VM state to the desired site in accordance with DRS rules. This improvement can dramatically reduce unnecessary read operations occurring across the ISL, and free up ISL resources to complete any resynchronizations after site recovery. And finally, vSAN 7 U2 also increases the maximum host count for a stretched cluster configuration. Stretched cluster maximums are increased from 30 data hosts spread across two sites to 40 data hosts spread across two sites (not including the witness host appliance).

  • Prioritizes I/O read locality over any VM site affinity rules
  • Instructs DRS not to migrate VMs to desired site until resyncs complete
  • Reduces I/O across ISL in recovery conditions
    • Improve read performance
    • Free up ISL for resyncs to regain compliance
  • Support for larger stretched clusters:  20+20+1
Integrated DRS awareness of Stretched Cluster configurations
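
To make the recovery rule above concrete, here is a minimal sketch of the decision as I read it: the VM stays where it currently runs until resynchronization is complete, and only then follows its site affinity. The function name `placement_site` and its parameters are hypothetical and only illustrate the logic; this is not how DRS is actually implemented.

```python
def placement_site(current_site: str, desired_site: str, resync_complete: bool) -> str:
    """Return the site where the VM should run right now.

    current_site    -- site where the VM is running after the failure recovered
    desired_site    -- site requested by the VM/host affinity (DRS) rules
    resync_complete -- True once the object data is fully resynchronized
    """
    if not resync_complete:
        # Read locality wins: stay put so reads do not cross the ISL.
        return current_site
    # Data is in sync again, so the affinity rules can be honoured.
    return desired_site


if __name__ == "__main__":
    print(placement_site("site-B", "site-A", resync_complete=False))  # site-B
    print(placement_site("site-B", "site-A", resync_complete=True))   # site-A
```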

Manageability : Proactive capacity monitoring

Capacity monitoring and alerting see some great improvements in vSAN 7 U2 that make it easier for administrators to understand capacity limits and oversubscription ratios. In this release, vSAN 7 U2 introduces the ability for the administrator to see how oversubscribed capacity is for the cluster. vSAN is inherently thin provisioned, meaning that only the used space of an object is counted against the capacity usage. Oversubscription visibility helps the administrator understand how much storage has been allocated, so they can easily see the capacity required in a worst-case scenario and adhere to their own sizing practices. vSAN 7 U2 also provides customizable warning and error alert thresholds directly in the Capacity Management UI in vCenter Server. Redundant alerting for capacity thresholds has also been eliminated to help clarify and simplify the condition reported to the administrator.

  • Capacity estimations for fully allocated capacity of thin provisioned objects
    • Easily see oversubscription
    • Factors in storage policy used
  • Customizable alarm thresholds for vSAN cluster capacity
  • Eliminated redundant alerting
Proactive capacity monitoring - Thin provisioning awareness

Proactive capacity monitoring - Reserved Capacity Alerting
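
A rough back-of-the-envelope sketch of what "oversubscription that factors in the storage policy" means. The overhead multipliers below are the commonly quoted values for vSAN policies (2x for RAID-1 with FTT=1, 1.33x for RAID-5, 1.5x for RAID-6). The function, the dictionary and the sample numbers are my own illustration, not the formula used by the vSAN UI.

```python
# Assumed raw-capacity multipliers per storage policy (illustrative values).
POLICY_OVERHEAD = {
    "RAID-1 (FTT=1)": 2.0,
    "RAID-5 (FTT=1)": 4 / 3,
    "RAID-6 (FTT=2)": 1.5,
}


def oversubscription_ratio(disks, raw_capacity_gb):
    """Worst-case raw capacity needed divided by raw capacity available.

    disks           -- iterable of (provisioned_gb, policy_name) tuples
    raw_capacity_gb -- raw capacity of the vSAN datastore in GB
    """
    worst_case_gb = sum(size * POLICY_OVERHEAD[policy] for size, policy in disks)
    return worst_case_gb / raw_capacity_gb


if __name__ == "__main__":
    disks = [(1000, "RAID-1 (FTT=1)"), (3000, "RAID-6 (FTT=2)")]
    # Fully allocated: 1000 * 2.0 + 3000 * 1.5 = 6500 GB against 5000 GB raw.
    print(round(oversubscription_ratio(disks, 5000), 2))  # 1.3
```

A ratio above 1.0 means the datastore could not hold all thin provisioned objects if every one of them were fully written.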

Manageability : Enhanced network monitoring

vSAN 7 U2 introduces several new metrics and health checks to provide better visibility into the switch fabric that connects the vSAN hosts.

  • Advanced networking metrics integrated into vSAN performance monitoring and alerts
  • New network statistics with customizable alerts
    • TCP/IP layer
    • Physical network layer
  • Augments existing networking metrics
  • Integrated into vSAN Support Insight

Enhanced network monitoring

Manageability : Health check history and correlation

vSAN 7 U2 introduces new enhancements to help provide context and insight into the sophisticated collection of alerts found in vSAN.
  • View a timeline of discrete error conditions
  • Gain insight into transient conditions that are difficult to track
  • Easily enable/disable based on need
Health check history and correlation

Manageability : vSAN performance ‘top contributors’

vSAN 7 U2 introduces an easy way to identify the most heavily utilized VMs, sometimes referred to as noisy neighbors.
  • Easily determine top contributors when experiencing performance issues
    • VMs
    • Disk groups
  • Quickly find potential noisy neighbors and their impact on resources
  • Set time period, and view by latency, throughput, or IOPS
vSAN performance ‘top contributors’
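
Conceptually, a "top contributors" view is a ranking of entities by a chosen metric over a time window. The sketch below shows that idea on made-up sample data. The function name, the dictionary keys and the numbers are hypothetical and have nothing to do with the real vSAN performance service API.

```python
def top_contributors(samples, metric, n=3):
    """Return the n entries with the highest value of the given metric.

    samples -- list of dicts, one per VM (or disk group), already aggregated
               over the time period of interest
    metric  -- key to rank by, e.g. "iops", "throughput_mbps" or "latency_ms"
    """
    return sorted(samples, key=lambda s: s[metric], reverse=True)[:n]


if __name__ == "__main__":
    # Made-up numbers purely for illustration.
    vms = [
        {"vm": "db01", "iops": 9000, "throughput_mbps": 350, "latency_ms": 2.1},
        {"vm": "web01", "iops": 1200, "throughput_mbps": 40, "latency_ms": 0.9},
        {"vm": "backup01", "iops": 4000, "throughput_mbps": 800, "latency_ms": 5.4},
    ]
    for entry in top_contributors(vms, "throughput_mbps", n=2):
        print(entry["vm"], entry["throughput_mbps"])
```

Ranking the same data by `iops` instead of `throughput_mbps` puts `db01` first, which is why being able to switch the metric matters when hunting noisy neighbors.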


As you can see, vSAN 7 Update 2 brings significant improvements which are very useful for modern datacenters. I personally like it and am looking forward to seeing these enhancements, at least in our labs and pre-production environments, to get hands-on experience.
