The customer has vSphere 6.5 U2 (build 9298722) and IBM TSM VE 188.8.131.52. They observed the problem only on VMs whose virtual hardware had been upgraded to version 13. The customer opened a support case with VMware GSS and IBM support.
IBM Support observed that the VADP/VDDK API function QueryChangedDiskAreas was failing, with a TSM log message similar to ...
10/19/2018 12:04:26.230   : ..\..\common\vm\vmvisdk.cpp(2436): ANS9385W Error returned from VMware vStorage API for virtual machine '
"Error caused by file /vmfs/volumes/583eb2d3-4345fd68-0c28-3464a9908b34/
VMware Support (GSS) instructed my customer to reset CBT (https://kb.vmware.com/kb/2139574) or to disable and re-enable CBT (https://kb.vmware.com/kb/1031873), and then to observe whether that solves the problem.
A few days after the CBT reset, the backup problem occurred again, so the reset was not a resolution.
I did some research and found another KB: "CBT reports larger area of changed blocks than expected if guest OS performed unmap on a disk" (59608). We believe that this is the root cause, and the KB contains both a workaround and the final resolution.
The root cause mentioned in VMware KB 59608 ...
When an unmap is triggered in the guest, the OS issues UNMAP requests to underlying storage. However, the requested blocks include not only unmapped blocks but also unallocated blocks. And all those blocks are captured by CBT and considered as changed blocks then returned to backup software upon calling the vSphere API queryChangedDiskAreas(changeId).
Workaround ...
Disable unmap in the guest VM.
In MS Windows operating systems, UNMAP can be disabled with the command
fsutil behavior set DisableDeleteNotify 1
and re-enabled with the command
fsutil behavior set DisableDeleteNotify 0
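Before and after changing the setting, it is worth confirming its current value. The sketch below is one possible way to do that from Python on a Windows guest: it runs fsutil behavior query DisableDeleteNotify and parses the result. The function names are hypothetical, and the assumption that the output contains text of the form "DisableDeleteNotify = 0" or "DisableDeleteNotify = 1" should be verified on your Windows version. Newer releases print separate NTFS and ReFS lines, and this sketch reads only the first one.

```python
import re
import subprocess
from typing import Optional


def parse_delete_notify(output: str) -> Optional[bool]:
    """Return True if UNMAP/TRIM is disabled, False if enabled, None if unknown.

    Assumption: the fsutil output contains 'DisableDeleteNotify = <0|1>'.
    Only the first matching line is considered.
    """
    match = re.search(r"DisableDeleteNotify\s*=\s*([01])", output)
    if match is None:
        return None
    return match.group(1) == "1"


def query_delete_notify() -> Optional[bool]:
    """Run fsutil on a Windows guest and parse its answer (needs admin rights)."""
    completed = subprocess.run(
        ["fsutil", "behavior", "query", "DisableDeleteNotify"],
        capture_output=True,
        text=True,
        check=True,
    )
    return parse_delete_notify(completed.stdout)
```

Calling query_delete_notify() inside the guest should return True once the workaround is in place, meaning the guest no longer sends UNMAP to the virtual disk.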
Anyway, the final problem resolution has to be done by the backup software vendor ...
If you have VDDK 6.7 or later libraries, take the intersection of VixDiskLib_QueryAllocatedBlocks() and queryChangedDiskAreas(changeId) to calculate the actually changed blocks.
In other words, the backup software cannot rely on the QueryChangedDiskAreas API function alone; it also has to call QueryAllocatedBlocks and calculate the disk blocks for incremental backups from both results. According to the VDDK 6.7 Release Notes, this approach can be used even with vSphere 6.5 and 6.0. For more information, see the VDDK 6.7 Release Notes.
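To illustrate what "take the intersection" means, here is a minimal Python sketch. It does not call VDDK or the vSphere API. It assumes the backup software has already converted the results of both calls into lists of (offset, length) extents in bytes, and that the extents within each list do not overlap one another. The function name and the extent representation are hypothetical and are not taken from any vendor's implementation.

```python
from typing import List, Tuple

# (start offset in bytes, length in bytes); a hypothetical representation
Extent = Tuple[int, int]


def intersect_extents(changed: List[Extent], allocated: List[Extent]) -> List[Extent]:
    """Return the parts of the disk that are both reported as changed and allocated.

    Assumption: the extents within each input list do not overlap each other.
    """
    a = sorted(changed)
    b = sorted(allocated)
    result: List[Extent] = []
    i = j = 0
    while i < len(a) and j < len(b):
        a_start, a_end = a[i][0], a[i][0] + a[i][1]
        b_start, b_end = b[j][0], b[j][0] + b[j][1]
        # The overlapping region of the two current extents, if any.
        start = max(a_start, b_start)
        end = min(a_end, b_end)
        if start < end:
            result.append((start, end - start))
        # Advance whichever extent ends first.
        if a_end <= b_end:
            i += 1
        else:
            j += 1
    return result


# Changed areas as CBT might report them after a guest unmap (illustrative numbers)
changed_areas = [(0, 100), (200, 100)]
# Areas that are actually allocated on the thin disk (illustrative numbers)
allocated_areas = [(50, 100), (250, 10)]
print(intersect_extents(changed_areas, allocated_areas))  # [(50, 50), (250, 10)]
```

Only the overlapping regions need to be read for the incremental backup, so the unallocated blocks that CBT reports after a guest unmap are filtered out.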
I believe the problem occurs under the following conditions:
- The virtual disk must be thin-provisioned.
- The VM hardware must be version 11 or later, because older versions do not pass through UNMAP SCSI commands.
- The guest operating system must be able to identify the virtual disk as thin.
- The guest operating system is issuing UNMAP down to the storage system.
My customer is going to test the workaround (disabling UNMAP in the guest operating systems) as a short-term solution and start investigating the resolution with IBM TSM (note: TSM was renamed to IBM Spectrum Protect Data Mover). It seems that IBM Spectrum Protect Data Mover 8.1.6 leverages VDDK 6.7.1, so an upgrade from the current version 8.1.4 to 8.1.6 could solve the issue.