More...
The problem remains. I use the latest GhettoVCB to do thin clone backups to a compressed volume on a Windows 2003 Server, with the volume exported as an NFS share. GhettoVCB deletes the oldest backups, so there is no reason for the volume to run out of space. Indeed, both Windows and ESXi agree on the amount of free space, and that it is many times what the backup requires, even if the backups were not thin. This is what I've observed:
1. After a number of backup cycles, the same VM, which is also the largest, will no longer back up. The error from vmkfstools is that it is out of disk space, even though the volume shows almost a terabyte free. I'm assuming the problem shows up with this VM because it's larger than all the rest put together (80 GB declared, 55 GB thin). I have others declared at 80 GB also, but they are much smaller from a thin standpoint. The backup gets to 90%+ complete before it fails.
2. Once I run into the problem, no matter how many backups I erase, and no matter how much free space the NFS volume shows, I can no longer back up the VMs. The error returned by vmkfstools is that there is not enough disk space, even when there is almost a terabyte free.
3. If I run chkdsk, no switches, from within Windows (read only), it shows no errors.
4. If I run chkdsk /f and reboot, it finds and fixes errors. However, if I repeat the process, I get the same errors on the same files no matter how many times I do it. This behavior is the same on the system drive, which is not exported over NFS, has no thin backups on it, and is not compressed. I also get the same behavior on my other 2003 Server VM. It seems chkdsk doesn't work properly inside a VM.
5. The last step chkdsk /f runs is recalculating the free space. After that, I can back up again.
Thoughts:
- The fact that Windows and ESXi agree on the amount of free space is not impressive, because NFS is not a file system. It is simply a protocol, and ESXi gets its information from Windows through the NFS protocol.
- It seems likely that vmkfstools actually does run out of disk space, even though Windows shows there is far more than enough.
- Since the problem is related to the number of cycles, I would theorize that the space is not being returned for use when a backup is deleted. This is supported by the fact that after chkdsk /f, I can back up again.
- The main thing that has changed recently is moving from ESXi 4.0 to 5.1.
- Taken together with the fact that chkdsk /f does not work properly in the Windows VMs, there appears to be a compatibility issue between Windows and VMware.
Has anybody else had this problem, and if you found a way around it, what was it?
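One way to test the theory that space is not returned when a backup is deleted is a small probe run on the backup volume between cycles. The following is only a rough Python sketch, not something from GhettoVCB or VMware. The function names and the probe size are my own inventions, and it only shows what the operating system reports, so it will not reproduce the vmkfstools error itself. It writes a scratch file of random data (random so that the compressed volume cannot shrink it), deletes it, and returns the reported free space at each step.

```python
import os
import shutil
import tempfile


def free_bytes(path):
    """Free bytes reported by the filesystem that holds `path`."""
    return shutil.disk_usage(path).free


def space_returned_on_delete(directory, size_bytes=8 * 1024 * 1024):
    """Write a scratch file, delete it, and sample free space at each step.

    Returns (before, during, after) in bytes. On a healthy volume,
    `after` should come back close to `before`.
    The default probe size is an arbitrary choice for illustration.
    """
    before = free_bytes(directory)
    fd, scratch = tempfile.mkstemp(dir=directory, suffix=".probe")
    try:
        with os.fdopen(fd, "wb") as f:
            # Random data is incompressible, so compression cannot hide it.
            f.write(os.urandom(size_bytes))
            f.flush()
            os.fsync(f.fileno())  # push the data to disk before measuring
        during = free_bytes(directory)
    finally:
        os.remove(scratch)  # always clean up the scratch file
    after = free_bytes(directory)
    return before, during, after


def report(directory):
    """Print the three samples and how much space came back after deletion."""
    before, during, after = space_returned_on_delete(directory)
    print("free before write:", before)
    print("free after write :", during)
    print("free after delete:", after)
    print("returned by delete:", after - during)
```

Calling report() with the path of the backup folder after each backup cycle, and again after a chkdsk /f, would show whether the reported numbers drift. Other activity on the volume will move the figures too, and a probe this small may not show a leak that only appears with 55 GB files, so treat the output as a hint rather than proof.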
Thanks!