Whilst doing a little Storage vMotion using VMware this week, I got a little lesson in patience from VMware and thought that I would share my experiences to help a few others who are looking at doing storage migrations of larger VMs. My experiences are with VMware ESX 3.5 and Virtual Center 2.5. Not the most recent, I know, but we haven't had time yet to upgrade to vSphere 4.
The point of my migration was to get a particular vmdk off a slower set of disks and onto a faster set of disks. Because ESX 3.5 doesn't directly support moving only one vmdk at a time, I had to do a little dance to get the one vmdk that I wanted moved. I had to move the virtual machine itself and ended up moving a couple of extra disks in order to get it to move.
Near as I could figure, Storage vMotion (with ESX 3.5) has the following stipulations:
- The virtual machine files must move.
- The virtual machine files must move to a datastore that is large enough to hold the largest vmdk.
- Apparently, the host needs as much free memory as is allocated to the already-running virtual machine (during the move, memory usage spiked, which caused all kinds of problems for me since this was a large virtual machine, but that's not really the point). I haven't verified this requirement yet, but that's my initial impression after seeing the behavior.
Seeing the migration options for vSphere 4, I wanted to cry over how difficult my life was made by these requirements, but that's another story. We'll be scheduling that upgrade shortly. :)
Anyway, I ran the Storage vMotion, at which point I managed to bring the virtual machines on my host to a screeching halt, since I didn't know about the third stipulation beforehand. After killing a few virtual machines and moving a few others away from this host, we were back under way.
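For what it's worth, the stipulations above can be turned into a rough pre-flight check. This is only a sketch in Python: the class and field names are made up for illustration, the numbers have to be gathered by hand (nothing here talks to Virtual Center or ESX), and the memory check encodes the unverified guess from the third item rather than a documented requirement.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class MigrationPlan:
    """Hand-collected numbers for one planned Storage vMotion (hypothetical)."""
    vmdk_sizes_gb: List[float]      # size of every vmdk attached to the VM
    datastore_free_gb: float        # free space on the destination datastore
    vm_memory_gb: float             # memory allocated to the running VM
    host_free_memory_gb: float      # memory currently free on the ESX host


def preflight_problems(plan: MigrationPlan) -> List[str]:
    """Return a list of human-readable problems; empty means nothing obvious."""
    problems = []

    # Stipulation 2: the destination must hold the largest vmdk.
    largest = max(plan.vmdk_sizes_gb, default=0.0)
    if plan.datastore_free_gb < largest:
        problems.append(
            f"datastore has {plan.datastore_free_gb} GB free, "
            f"largest vmdk is {largest} GB"
        )

    # Stipulation 3 (unverified): host free memory >= memory allocated to the VM.
    if plan.host_free_memory_gb < plan.vm_memory_gb:
        problems.append(
            f"host has {plan.host_free_memory_gb} GB free memory, "
            f"VM is allocated {plan.vm_memory_gb} GB"
        )

    return problems
```

Calling `preflight_problems(MigrationPlan([40.0, 200.0], 150.0, 16.0, 8.0))` would report both problems, which is roughly the situation I walked into.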
Near the end of the Storage vMotion (at 90% to be specific), the interface stayed at the same percent for several minutes and I was greeted with this friendly error:
A general system error occurred: failed to reparent/commit disk(s) (vim.fault.Timedout)
Uh-oh. A quick Google search found this VMware knowledgebase article.
Unfortunately, I missed the part saying that it "incorrectly" throws a timeout error message, and I panicked a bit. I started digging around in the destination and original datastores and found tons of "DMotion" files everywhere. While I was desperately looking for solutions around the web, my Virtual Center screen refreshed, showing the disks moved over and associated with the new datastores. Yup, while I was freaking out, the whole thing just took care of itself.
Apparently, when working with Virtual Center, one must always have a bit of patience and remember to double-check timeouts reported by Virtual Center against the ESX hosts directly. I suppose I should have known better, as I've seen this kind of behavior while working with snapshots in the past. If you come across this post while searching for this error, take a few minutes to relax and let VMware do its thing in the background while Virtual Center shows its stupid error.
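If you would rather script the patience than practice it, the idea boils down to polling for the real end state with a generous deadline instead of trusting the first timeout. Here is a minimal Python sketch of that idea. The `wait_until` helper and its parameters are my own invention, and the `check` callable is a stand-in for whatever you trust as proof that the move finished (for example, your own test that the disks now live on the new datastore); it does not call any VMware API.

```python
import time


def wait_until(check, timeout_s=1800.0, interval_s=30.0,
               sleep=time.sleep, clock=time.monotonic):
    """Poll check() until it returns True or timeout_s elapses.

    Returns True if check() succeeded, False if the deadline passed first.
    sleep and clock are injectable so the helper can be tested quickly.
    """
    deadline = clock() + timeout_s
    while True:
        if check():
            return True
        if clock() >= deadline:
            return False
        sleep(interval_s)
```

The defaults (30 minutes, checking every 30 seconds) are arbitrary; for a large VM you would probably want a much longer deadline than any interface timeout.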