VMware issues

Start of Observed Problem

So today I noticed my Symantec server having some issues. Since it isn’t a critical production server, I issued a reboot command. About 20 minutes later, I realized it wasn’t back up yet and wasn’t responding to pings. So I logged into my vSphere client and saw that the VMware host seemed to be acting up.

Initial Analysis

Notice how the virtual machines along the left are all gray? They should have different icons to indicate whether they are powered on or off. Earlier, the CPU Usage and Memory Usage fields also didn’t show a Capacity amount, and the General box wasn’t filled in.

Here, you can see that at 10am today my network usage shot up well above the usual daytime average. No backups are running at the moment, no scheduled task would be causing this, and none of the servers that are presently running are experiencing any issues.
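If the MRTG samples were exported as plain numbers, the "well above average" judgment could be made less subjective. Here is a minimal sketch in Python, assuming a list of earlier samples to serve as the baseline and a list of recent samples to check. The function name, the three-standard-deviation cutoff, and the sample values are all my own assumptions for illustration, not data from this incident.

```python
from statistics import mean, stdev


def above_baseline(baseline, recent, k=3.0):
    """Return (index, value) pairs from `recent` that exceed the baseline.

    A sample counts as a spike when it is more than `k` standard
    deviations above the mean of `baseline` (k=3.0 is an assumed cutoff).
    `baseline` needs at least two samples for stdev() to work.
    """
    limit = mean(baseline) + k * stdev(baseline)
    return [(i, value) for i, value in enumerate(recent) if value > limit]


# Hypothetical traffic samples (Mbps), for illustration only
normal_day = [10, 12, 11, 9, 10]
this_morning = [11, 48, 52, 12]
print(above_baseline(normal_day, this_morning))
```

With these made-up numbers, only the two large samples are flagged. A fixed cutoff like this is crude, but it is enough to confirm that a spike really is outside the normal range before chasing it.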

Further Analysis

Strangely, the vSphere client started responding again. I went into the Performance tab for the host and found that the read rate for the LeftHand iSCSI storage was about the same as the rate the MRTG graph showed above. I started going through each server and observing its read rate. Each one was low until I hit my Exchange server, which showed the same high read rate (but extremely low latency).
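Clicking through each server by hand works, but the same comparison could be scripted once the per-VM read rates are written down or exported. The following is a minimal sketch in Python, assuming the rates are already in a dictionary. The function name, the 50% threshold, the VM names, and the numbers are hypothetical and only there to show the idea.

```python
def find_heavy_readers(read_rates_kbps, host_rate_kbps, share=0.5):
    """Return VMs whose read rate is at least `share` of the host's read rate.

    read_rates_kbps: mapping of VM name -> average read rate (KBps)
    host_rate_kbps:  read rate seen on the host over the same interval
    share:           fraction of the host rate treated as "heavy" (assumed)
    """
    if host_rate_kbps <= 0:
        return []
    heavy = [
        (name, rate)
        for name, rate in read_rates_kbps.items()
        if rate / host_rate_kbps >= share
    ]
    # Largest consumer first
    return sorted(heavy, key=lambda item: item[1], reverse=True)


# Hypothetical values, for illustration only
sample_rates = {"symantec": 120.0, "fileserver": 300.0, "exchange": 41000.0}
print(find_heavy_readers(sample_rates, host_rate_kbps=42000.0))
```

In this made-up example only the "exchange" entry accounts for most of the host's read rate, which mirrors what I saw when checking the servers one at a time.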

I remoted into the Exchange server and opened Performance Monitor. The high read rates I observed there matched both what vSphere indicated and the graph MRTG produced. Then the read rates dropped and everything started responding. There is nothing in the Event Viewer to indicate that a backup, defrag, or virus scan was occurring at the time I noticed the high read rates. Processor usage was low at the same time.

Conclusion

I started this blog entry to chronicle my troubleshooting, what I found to be the cause of the problem, and how I fixed it. Unfortunately, I can’t deliver on either of the last two. At the moment, I am stuck and have no clue what was causing the high read rates.

My vSphere client has started responding correctly and my Symantec server has successfully powered on and booted.

I wish I could offer a definitive explanation of what caused the problem and what fixed it. Maybe with some more observation I can figure it out and update this entry with my results.