I looked at the server and found it extremely unresponsive. Yet according to vCenter, the load and I/O on the server were nothing out of the ordinary. There was no problematic I/O on our filer either, and there was plenty of free space on the disks.
The explorer process froze in the console session, and Windows offered to end it after the timeout. I grew tired of waiting and tried to shut down the server. It froze again during shutdown, and after waiting 15 minutes I simply powered off the virtual machine.
I booted it again and wanted to find out what had caused the virtual machine to behave like this.
The web application didn't spit out any clues.
Then I checked the Windows event logs for more general OS errors and warnings.
I noticed some interesting messages:
- "Reset to device, \Device\RaidPort0, was issued." This has appeared every minute or so since 22:00 yesterday (not coincidentally, when our backup starts).
- Event ID: 57, NTFS, Warning: "The system failed to flush data to the transaction log. Corruption may occur." This has been appearing on the machine for ages. I remembered checking this a year or so ago, and wanted to write about it as a follow-up to this article. This issue is related to the System Reserved Partition that Windows creates. This partition holds the Boot Manager code and the Boot Configuration Database. It is also required by the BitLocker Drive Encryption feature. It also causes VSS to misbehave.
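To check whether the resets really line up with the backup window, the log entries can be counted per hour. Below is a minimal Python sketch, assuming the System log was exported to CSV with `TimeCreated` and `Message` columns. The column names, the timestamp format, the helper name `count_events_per_hour` and the sample rows are all assumptions made for illustration, not actual data from this server.

```python
import csv
import io
from collections import Counter
from datetime import datetime


def count_events_per_hour(csv_text, needle):
    """Count rows whose Message contains `needle`, grouped by hour of day.

    Assumes (hypothetically) a CSV with 'TimeCreated' and 'Message' columns
    and timestamps formatted as 'YYYY-MM-DD HH:MM:SS'.
    """
    counts = Counter()
    for row in csv.DictReader(io.StringIO(csv_text)):
        if needle in row["Message"]:
            ts = datetime.strptime(row["TimeCreated"], "%Y-%m-%d %H:%M:%S")
            counts[ts.hour] += 1
    return counts


# Made-up sample rows for illustration only, not real log data.
SAMPLE = r"""TimeCreated,Message
2000-01-01 21:58:00,Unrelated service message
2000-01-01 22:01:00,"Reset to device, \Device\RaidPort0, was issued."
2000-01-01 22:02:00,"Reset to device, \Device\RaidPort0, was issued."
2000-01-01 23:15:00,"Reset to device, \Device\RaidPort0, was issued."
"""

counts = count_events_per_hour(SAMPLE, "Reset to device")
for hour, n in sorted(counts.items()):
    print(f"{hour:02d}:00  {n} reset(s)")
```

If the counts start at the hour the backup kicks off and are zero before it, that supports the suspicion that the backup load triggers the resets.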
Relevant VMware KBs are:
We updated the LSI driver to a version newer than 1.32.01 and will now wait for the daily backup to see if the issue is resolved. Strange that the issue pops up only now; the server has been running for more than 2 years.