Monday, July 06, 2020

vCenter Server Appliance - Update installation is in progress

A few days ago, I updated my home-lab VCSA from vCenter Server 7.0 GA (build 15952498) to vCenter Server 7.0.0a (build 16189094). Everything seemed OK from the vCenter (vSphere Client) perspective. The vSphere Client showed vCenter build 16189207, which corresponds to VCSA build 16189094.

The only problem was that I was not able to log in to the VCSA VAMI.


After authenticating to VAMI, I was getting the message "Update installation is in progress". During problem isolation, I tested the Appliance REST API and got the same message there.
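
For reference, this is roughly how such a check can be done through the Appliance REST API with Python. It is a quick sketch: the hostname and credentials are placeholders, and the endpoint paths are the ones I believe the vSphere Automation REST API exposes, so verify them against your appliance's API documentation.

import requests

VCSA = "https://vcsa.lab.local"                            # placeholder hostname
AUTH = ("administrator@vsphere.local", "VMware1!")         # placeholder credentials

s = requests.Session()
s.verify = False                                           # lab only, self-signed certificate

# Create an API session and pass the returned token in subsequent calls
sid = s.post(f"{VCSA}/rest/com/vmware/cis/session", auth=AUTH).json()["value"]
s.headers["vmware-api-session-id"] = sid

# Ask the appliance for its update state (this is where the
# "Update installation is in progress" message came back)
print(s.get(f"{VCSA}/rest/appliance/update").json())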

To be honest, I had no idea how to check the actual VCSA update status or how to move it forward, so I started troubleshooting in the log files.

In /var/log/vmware/applmgmt/vami.log I saw a regularly repeating message: "Executing VMware vCenter Server SNMP post upgrade actions..."

The next step was parsing PatchRunner.log. There was nothing remarkable except info about the patch stageDir located at "/storage/core/software-update/updates/7.0.0.10300/patch_runner", where I had two empty directories and the file "patch_phase_context.json".

Normally, the directory /storage/core/software-update/updates/ is empty; however, I still had the patch subdirectory /storage/core/software-update/updates/7.0.0.10300 there.

I tried to delete the whole directory /storage/core/software-update/updates/7.0.0.10300 but it did not help.

It seemed that the post-upgrade actions (specifically VMware vCenter Server SNMP) could not finish.

I HAD NO IDEA ... I WAS STUCK ... The next step was research on the internal VMware Slack channel. I did not find anything, so I asked for help. A few days later, somebody else hit the same issue in his home lab and got a tip from the VMware Engineering guys.

The trick was in the file /etc/applmgmt/appliance/software_update_state.conf

It contained the following content:

{
    "state": "INSTALL_IN_PROGRESS",
    "version": "7.0.0.10300",
    "latest_query_time": "2020-06-12T17:30:46Z",
    "operation_id": "/storage/core/software-update/install_operation"
}

so the trick was to change the content to:

{
    "state": "UP_TO_DATE"
}
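
If you prefer to script the same change (again, home lab only; see the disclaimer at the end of this post), a minimal Python sketch could look like this. The file path is the one mentioned above; the backup file name is my own choice.

import json, shutil

# Run inside the VCSA shell as root
CONF = "/etc/applmgmt/appliance/software_update_state.conf"

# Keep a backup of the original state file before touching it
shutil.copy2(CONF, CONF + ".bak")

# Rewrite the state so VAMI no longer thinks an update is in progress
with open(CONF, "w") as f:
    json.dump({"state": "UP_TO_DATE"}, f, indent=4)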

After the change, I was able to log in to VAMI and use it as usual. Well, almost as usual.
I realized that I was not able to start another update, as the [STAGE ONLY] and [STAGE AND INSTALL] buttons were greyed out.



Nevertheless, this is only a GUI problem and the VCSA CLI update procedure works like a charm, so the update can be done through an SSH session with the command software-packages install --url or software-packages install --iso.

Another positive thing is that I was able to perform a vCenter native backup, so if needed, I have the option to do a VCSA backup and restore, which would probably solve this cosmetic GUI issue.

Disclaimer: This exercise was done in my home lab. If you experience a similar issue in a production system, please contact VMware support before making any changes to the VCSA.

Monday, June 22, 2020

What's new in VxRail 7.0

This is a very short blog post about VxRail 7.0, which was launched today. First of all, VxRail naming has been aligned with vSphere versioning, hence VxRail 7.0. Here is a summary of the announcement:

  • VxRail 7.0 includes vSphere 7.0 and vSAN 7.0
  • Customers can now run vSphere Kubernetes on the Dell Tech Cloud Platform, VMware Cloud Foundation 4.0 on VxRail 7.0.
  • With a more accessible Consolidated Architecture, Dell Technologies Cloud Platform can now be deployed starting with a 4-node configuration.
  • Brand-new Dell EMC VxRail D Series – the most extreme yet. The D560/D560F is a ruggedized, durable platform that delivers the full power of VxRail for workloads at the edge, in challenging environments, or for space-constrained areas.
  • Even more platform flexibility with the new VxRail E Series model based on, for the first time, AMD EPYC processors.
  • These single-socket, 1U nodes offer dual-socket performance, making them ideal platforms for desktop VDI, analytics, and computer-aided design.
  • Enhanced operational benefits with new automation and self-service features enabling customers to schedule and run upgrade health checks in advance of upgrades with VxRail HCI System Software.
  • The addition of Intel® Optane™ DC Persistent Memory to the E560 and P570 platforms offers high performance and significantly increased memory capacity with data persistence.
  • The latest NVIDIA® Quadro RTX™ 6000 and 8000 GPUs to the V570F bringing the most significant advancement in computer graphics in over a decade to professional workflows.
If you were unable to attend the event, you can always visit the VxRail event page, where you can watch it OnDemand!

Monday, May 11, 2020

Undocumented HA Advanced Option - das.restartVmsWithoutResourceChecks

Some time ago, a colleague of mine (@stan_jurena) was challenged by a VMware customer who experienced an APD (All Paths Down) storage situation across the whole HA cluster and expected that the VMs would be killed by the VMware hypervisor (ESXi) because of the HA cluster APD response setting "Power off and restart VMs - Aggressive restart policy". To be honest, I had the same expectation. However, after a discussion with VMware engineering, we were told that the primary role of an HA cluster is to keep VMs up and running, so the "Aggressive restart policy" will restart a VM only under certain conditions, which are much better described in the vSphere Client 7 UI. See the screenshot below.


APD Aggressive restart policy
A VM will be powered off, if HA determines the VM can be restarted on a different host, or if HA cannot detect the resources on other hosts because of network connectivity loss (network partition).

So, what does it mean? The Aggressive restart policy is the same as the Conservative one, but extended for the situation when there is a network partition. This can be helpful when you have IP storage and experience IP network issues, but it does not help when you have a dedicated Fibre Channel SAN and the storage is not available for the whole vSphere cluster.

We explained to VMware engineering that there are situations when it is much better to kill all VMs than to keep compute (VMs) running without available storage. Based on these discussions, a Feature Request was created, internally named the "super aggressive" APD option. I'm happy to see that it was implemented and released in vSphere 7 as the vSphere advanced option
das.restartVmsWithoutResourceChecks = false (default) / true (super aggressive)
I think this advanced option will be very useful for infrastructure architects / technical designers who have a good justification to use it. Here are my typical justifications:
  • When the storage subsystem is unavailable for some time, the Linux operating system switches the file system to read-only mode, which has a negative impact on running applications. Such a situation typically leads to a server restart anyway.
  • When you have an OS/application clustering solution (for example MSCS) on top of vSphere clustering, with one application node on one vSphere cluster and another application node on a different vSphere cluster, you prefer to kill the VM (app node) on the problematic cluster (without available storage) and fail over to the app node (VM) running on the healthy cluster.
Hope this makes sense.
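
For completeness, here is a minimal pyVmomi sketch of how this advanced option could be set on a cluster programmatically. Treat it as a sketch: the vCenter hostname, credentials, and the cluster name "Cluster01" are placeholders from my lab, not anything prescribed by the option itself.

import ssl
from pyVim.connect import SmartConnect, Disconnect
from pyVmomi import vim

ctx = ssl._create_unverified_context()          # lab only - skip certificate checks
si = SmartConnect(host="vcsa.lab.local",
                  user="administrator@vsphere.local",
                  pwd="VMware1!", sslContext=ctx)
content = si.RetrieveContent()

# Find the cluster by name (hypothetical name "Cluster01")
view = content.viewManager.CreateContainerView(
    content.rootFolder, [vim.ClusterComputeResource], True)
cluster = next(c for c in view.view if c.name == "Cluster01")
view.DestroyView()

# Add the HA advanced option das.restartVmsWithoutResourceChecks = true
spec = vim.cluster.ConfigSpecEx()
spec.dasConfig = vim.cluster.DasConfigInfo()
spec.dasConfig.option = [vim.option.OptionValue(
    key="das.restartVmsWithoutResourceChecks", value="true")]
cluster.ReconfigureComputeResource_Task(spec, modify=True)

Disconnect(si)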

Please leave a comment if you find this advanced option useful. VMware Engineering might consider adding this option to the GUI based on feedback from vSphere architects / technical designers.

References
  • Duncan Epping wrote a blog post about it here.
  • For other "Advanced configuration options for VMware High Availability in vSphere 5.x and 6.x" check VMware KB 2033250.


Friday, May 08, 2020

CPU capacity planning and sizing

During infrastructure capacity planning and sizing, the technical designer has to calculate CPU, RAM, Storage, and Network resource requirements. Recently, I had an interesting discussion with my colleagues on how to estimate CPU requirements for application workload.

Each computer application requires some CPU resources for computational tasks and additional resources for I/O tasks. It is obvious that computational tasks require CPU cycles; however, it is not so obvious that there are also CPU cycles associated with I/O. In other words, each I/O requires some CPU resources. It does not matter whether it is memory, storage, or network I/O.

For example, a generally accepted rule of thumb in networking is that
1 Hertz of CPU processing is required to send or receive 1 bit/s of TCP/IP.
[Source: VMware vSphere 6.5 Host Resources Deep Dive]

This would mean 2.5 Gb/s would require ~2.5 GHz of CPU, thus ~100% of one CPU core @ 2.5 GHz.

It would be nice to have a similar rule of thumb for storage I/O. I did quick research (googling) but was not able to find any information about CPU requirements for storage I/O. I did a quick test in my home lab and started a synthetic random workload (4KB I/O) on a 4 vCPU VM on an ESXi host with CPUs at 2 GHz, where I was able to see 5,000 IOPS at 8.5% CPU utilization. This would mean one 4KB I/O requires roughly 136 kHz (0.085 x 4 x 2 GHz / 5,000 IOPS).

4KB I/O on 4x vCPU VM with pCPU @ 2 GHz 
I did another test with 512 B I/Os, where 1 IOPS required 114 kHz.
And for a 64KB I/O size, 1 IOPS required 161 kHz.

0.5 KB I/O => 512 Bytes I/O (4,096 bits) = 114 kHz
4 KB I/O => 4,096 Bytes I/O (32,768 bits) = 136 kHz
64 KB I/O => 65,536 Bytes I/O (524,288 bits) = 161 kHz
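
To make the arithmetic behind these numbers explicit, here is a tiny Python sketch. The vCPU count, clock speed, utilization, and IOPS are simply the values from my lab test above.

# Per-I/O CPU cost derived from the lab measurements above
def cpu_per_io_khz(vcpus, clock_ghz, utilization, iops):
    """CPU frequency consumed per I/O per second, expressed in kHz."""
    used_hz = vcpus * clock_ghz * 1e9 * utilization
    return used_hz / iops / 1e3

# 4 KB random I/O test: 5,000 IOPS at 8.5% utilization of a 4 vCPU VM @ 2 GHz
print(cpu_per_io_khz(vcpus=4, clock_ghz=2.0, utilization=0.085, iops=5000))  # ~136 kHz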

Based on my observations, it is difficult to define a rule of thumb per bit/s or byte/s; instead, I would define the CPU requirement per storage I/O.

Based on multiple assessments of real datacenter environments, I would say that the typical average storage I/O size is around 40-50 KB, therefore here is my rule of thumb:
1 Storage I/O requires ~ 150 kHz of CPU processing
This would mean 10,000 IOPS would require ~1.5 GHz, thus 60% of one CPU core @ 2.5 GHz.
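
And the same rule of thumb applied to this sizing example as a short Python sketch (the 150 kHz per I/O figure is my own assumption from above):

# Sizing a 10,000 IOPS workload with the ~150 kHz per storage I/O rule of thumb
iops = 10_000
cost_per_io_khz = 150
core_ghz = 2.5

required_ghz = iops * cost_per_io_khz * 1e3 / 1e9            # 1.5 GHz
print(f"{required_ghz} GHz = {required_ghz / core_ghz:.0%} of one {core_ghz} GHz core")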

Please be aware that this is a very simplified calculation, but it clearly shows that storage workload always comes with CPU requirements, and it can help with capacity planning and infrastructure sizing.

What do you think about this calculation?
Do you observe different numbers?
Would you calculate it differently?
You can leave a comment below the article.

Sunday, May 03, 2020

vSphere 7 - Storage Requirements for the vCenter Server Appliance

I have upgraded vSphere in my home lab and realized that VCSA 7.0 storage requirements increased significantly.

Here are the requirements of vCenter Server Appliance 6.7


Here are the requirements of vCenter Server Appliance 7.0


You can see the difference for yourself. VCSA 7.0 requires roughly 30%-60% more storage than VCSA 6.7. This is good to know, especially for home labs where hardware resources are limited, or during logical designs of new environments where you do the math to plan hardware requirements.

Monday, April 20, 2020

What's New in vSAN 7

vSAN 7.0 introduces the following new features and enhancements.

vSphere Lifecycle Manager (vLCM).

vLCM enables simplified, consistent lifecycle management for your ESXi hosts. It uses a desired-state model that provides lifecycle management for the hypervisor and the full stack of drivers and firmware. vLCM reduces the effort to monitor compliance for individual components and helps maintain a consistent state for the entire cluster. In vSAN 7.0, this solution supports Dell and HPE ReadyNodes.

Integrated File Services. 

vSAN native File Service delivers the ability to leverage vSAN clusters to create and present NFS (v4.1 and v3) file shares. vSAN File Service extends vSAN capabilities to files, including availability, security, storage efficiency, and operations management.

Native support for NVMe hotplug.

This feature delivers a consistent way of servicing NVMe devices and provides operational efficiency for select OEM drives.

I/O redirect based on capacity imbalance with stretched clusters.

vSAN redirects all VM I/O from a capacity-strained site to the other site, until the capacity is freed up. This feature improves the uptime of your VMs.

Skyline integration with vSphere health and vSAN health. 

Joining forces under the Skyline brand, Skyline Health for vSphere and vSAN are available in the vSphere Client, enabling a native, in-product experience with consistent proactive analytics.

Remove EZT for a shared disk. 

vSAN 7.0 eliminates the prerequisite that shared virtual disks using the multi-writer flag must also use the eager zero thick format.

Support vSAN memory as a metric in performance service. 

vSAN memory usage is now available within the vSphere Client and through the API.

Visibility of vSphere Replication objects in vSAN capacity view. 

vSphere replication objects are visible in vSAN capacity view. Objects are recognized as vSphere replica type, and space usage is accounted for under the Replication category.

Support for large capacity drives. 

Enhancements extend support for 32TB physical capacity drives and extend the logical capacity to 1PB when deduplication and compression are enabled.

Immediate repair after a new witness is deployed. 

When vSAN performs a replacement witness operation, it immediately invokes a repair object operation after the witness has been added.

vSphere with Kubernetes integration. 

CNS is the default storage platform for vSphere with Kubernetes. This integration enables various stateful containerized workloads to be deployed on vSphere with Kubernetes Supervisor and Guest clusters on vSAN, VMFS and NFS datastores.

File-based persistent volumes. 

Kubernetes developers can dynamically create shared (Read/Write/Many) persistent volumes for applications. Multiple pods can share data. vSAN native File Services is the foundation that enables this capability.

vVol support for modern applications. 

You can deploy modern Kubernetes applications to external storage arrays on vSphere using the CNS support added for vVols. vSphere now enables unified management for Persistent Volumes across vSAN, NFS, VMFS, and vVols.

vSAN VCG notification service.

You can subscribe to vSAN HCL components such as vSAN ReadyNode, I/O controller, drives (NVMe, SSD, HDD) and get notified through email about any changes. The changes include firmware, driver, driver type (async/inbox), and so on. You can track the changes over time with new vSAN releases.

Thursday, April 16, 2020

Logical design - storage performance sizing

Storage performance is always a kind of magic because multiple factors come into play and not all disks are equal. However, in a logical design we have to do some math, because capacity and performance planning is a very important part of it.

How do I do it? I do the math with some performance assumptions.

Here are assumptions about various disk type performance I use for my capacity planning exercises.

The numbers below are estimates for random I/O with a 64KB I/O size.

Mechanical hard drives
SAS 15k - 200 IOPS
SATA 7k - 80 IOPS

Read Intensive Solid-state disks (SSD)
SATA Read Intensive SSD - 5,000 IOPS (read) / 1,500 IOPS (write)
SAS Read Intensive SSD - 10,000 IOPS (read) / 2,000 IOPS (write)
NVMe Read Intensive SSD - 30,000 IOPS (read) / 2,500 IOPS (write)

Mixed Used Solid-state disks (SSD)
SATA Mixed Used SSD - 5,000 IOPS (read) / 1,800 IOPS (write)
SAS Mixed Used SSD - 12,500 IOPS (read) / 5,000 IOPS (write)
NVMe Mixed Used SSD - 45,000 IOPS (read) / 10,000 IOPS (write)

Write Intensive Solid-state disks (SSD)
SAS Write Intensive SSD - 12,500 IOPS (read) / 7,500 IOPS (write)

SSD assumptions are based on hardware vendors' spec sheets. One of these spec sheets is available here https://www.slideshare.net/davidpasek/dell-power-edge-ssd-performance-specifications

So with these assumptions, the performance math is relatively simple.

Let's take, for example, 4x SAS Read Intensive SSDs within a disk group.
Such a disk group should have an aggregated read performance of 4 x 10,000 IOPS = 40,000 IOPS.

As we see in the performance numbers above, there is a significant performance difference between SSD read and write.

For our SAS Read Intensive SSD we have 10,000 IOPS for 100% read but only 2,000 IOPS for 100% write, so we have to normalize these numbers based on the expected read/write ratio. If the planned storage workload is 70% read and 30% write, we can assume a single SSD will give us 7,000 + 600 IOPS, so 7,600 IOPS in total.

Storage is typically protected by some RAID protection, where the write penalty comes into play. The write penalty is the number of backend I/O operations required for a single frontend write operation.

Here are the write penalties for various RAID protections:
RAID 0 (no protection) - write penalty 1
RAID 1 (mirror) - write penalty 2
RAID 5 (erasure coding / single parity) - write penalty 4
RAID 6 (erasure coding / double parity) - write penalty 6

So, let's calculate the write penalty and write overhead.

If the planned storage workload is 70% read and 30% write and we have a total aggregated normalized performance of 30,400 IOPS (4 x 7,600), we have to split the available performance into a READ bucket and a WRITE bucket.

In our example scenario, we have
READ bucket (70%) - 21,280 IOPS
WRITE bucket (30%) - 9,120 IOPS

Now we have to apply the write penalty to the WRITE bucket. Let's say we would like RAID 5 protection; therefore, the 9,120 IOPS available on the backend can handle only 2,280 write IOPS coming from the frontend.

Based on these calculations, a RAID 5 protected disk group of 4 Read Intensive SSDs should be able to handle 23,560 IOPS (21,280 + 2,280) of front-end storage workload. Please note that the considered workload pattern is random, 64KB I/O size with a 70/30 read/write ratio.
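
The whole disk-group calculation can be wrapped into a small Python sketch. The per-disk IOPS numbers, the 70/30 read/write ratio, and the RAID 5 write penalty of 4 are the assumptions stated above, not universal constants.

def usable_frontend_iops(disks, read_iops, write_iops, read_ratio, write_penalty):
    """Estimate the front-end IOPS a RAID-protected disk group can handle."""
    write_ratio = 1.0 - read_ratio
    # Normalize a single disk for the expected read/write mix (7,600 IOPS here)
    per_disk = read_iops * read_ratio + write_iops * write_ratio
    aggregate = disks * per_disk                       # 4 x 7,600 = 30,400 IOPS
    read_bucket = aggregate * read_ratio               # 21,280 IOPS
    write_bucket = aggregate * write_ratio             # 9,120 IOPS
    # Each front-end write costs `write_penalty` back-end I/Os
    return read_bucket + write_bucket / write_penalty

# 4x SAS Read Intensive SSD (10,000 read / 2,000 write IOPS), RAID 5, 70/30 R/W
print(usable_frontend_iops(4, 10_000, 2_000, 0.70, 4))   # ~23,560 IOPS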

Do not forget that this is just logical planning and estimation; every physical system can introduce additional overhead. In real systems, you can have bottlenecks not considered in this simplified calculation. Examples of such bottlenecks can be:
  • storage controller, driver, firmware
  • low queue depth in the storage path (controller, switch, expander, disk), not allowing I/O parallelism
  • network or other bus latency
Therefore, any design should always be tested after implementation, and the performance results validated against the expected numbers.

Are you doing similar design exercises? Any comment or suggestion is always welcome and appreciated.