Monday, August 30, 2010

ESX Hosts Disconnecting After Upgrade to vSphere 4.1

When you upgrade your hosts to vSphere 4.1, they might start disconnecting from your vCenter Server with the following error message: A general system error occurred: internal error: vmodl.fault.HostCommunication.  Restarting the management agents does not resolve the error, nor does rebooting the host.  This VMware KB points to name resolution issues, but that is not at fault here.  The real issue is that a vCenter Server still running 4.0 cannot manage an ESX 4.1 host.
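
For reference, restarting the management agents from the ESX service console looks like this (these are the standard service names on ESX 4.x classic - it made no difference in my case):

  service mgmt-vmware restart   # restarts hostd, the host management agent
  service vmware-vpxa restart   # restarts vpxa, the vCenter agent on the host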

Workaround / Solution

Currently there are two solutions available:
  1. Upgrade your vCenter Server to version 4.1.  (Once you've upgraded, you'll have to remove the hosts from your inventory and re-add them - simply reconnecting didn't work in my case)
  2. Downgrade your ESX hosts to version 4.0
Strangely enough I could not find this documented anywhere on the VMware Knowledge Base, even though it seems to be a pretty widely reported problem.
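
A quick way to confirm you're hitting the version mismatch is to compare the host build with the vCenter Server version (vmware -v is standard on ESX; the build output below is just illustrative):

  # On the ESX host, via SSH:
  vmware -v
  # e.g. VMware ESX 4.1.0 build-260247

Then check Help > About VMware vSphere in the vSphere Client - if vCenter Server still reports 4.0.x, that's your mismatch.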

Upgrading to vSphere 4.1 via an SSH CLI Session

I was tasked with upgrading a client's vSphere installation from vSphere 4.0 to 4.1.  Due to various external factors the client couldn't make use of vSphere Update Manager, so I had to do it old-school style from the command line.  Here's how to do it:


Download the required updates
  1. Navigate to http://downloads.vmware.com/d/info/datacenter_downloads/vmware_vsphere_4/4
  2. Download pre-upgrade-from-ESX4.0-to-4.1.0-0.0.260247-release.zip
  3. Download upgrade-from-ESX4.0-to-4.1.0-0.0.260247-release.zip
Install the updates
  1. Put your ESX host in maintenance mode with the following command: vimsh -n -e /hostsvc/maintenance_mode_enter
  2. Install the pre-upgrade patch:  esxupdate update --bundle=pre-upgrade-from-ESX4.0-to-4.1.0-0.0.260247-release.zip
  3. Install the actual upgrade patch: esxupdate update --bundle=upgrade-from-ESX4.0-to-4.1.0-0.0.260247-release.zip
  4. Reboot the ESX host via the reboot command
  5. Last step is to exit maintenance mode: vimsh -n -e /hostsvc/maintenance_mode_exit
All of the above can of course be automated with Update Manager, but for those occasions where that's not an option, the commands above will come in handy.
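
If you're doing this on more than one host, the same sequence strings together nicely in a single SSH session. A rough sketch, assuming you've already copied both bundles to /tmp on the host (adjust the path to wherever you uploaded them):

  cd /tmp
  vimsh -n -e /hostsvc/maintenance_mode_enter
  esxupdate update --bundle=pre-upgrade-from-ESX4.0-to-4.1.0-0.0.260247-release.zip
  esxupdate update --bundle=upgrade-from-ESX4.0-to-4.1.0-0.0.260247-release.zip
  reboot
  # Once the host is back up: confirm the build, then exit maintenance mode
  vmware -v
  vimsh -n -e /hostsvc/maintenance_mode_exit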

Friday, August 20, 2010

Truncating SQL 2005 Log Files

Had a fun scenario recently where a customer (a Casino!) ran out of disk space on their SQL 2005 Server.  This in turn caused all their slot / video poker / whatever gambling machines to stop working.  Turns out old Bill Shakespeare had it all wrong - Hell Hath No Fury Like A Gambler Scorned!

Long story short - their SQL log file grew to gargantuan proportions, and yours truly had to whip them back into shape.  Here's how it went down!

  1. Switch to the affected database with: "USE your_db_name"
  2. Followed by: "exec sp_helpfile".  This returns the logical names and physical attributes of the files associated with your DB.  Record the logical database and log file names (no path or extension) - these are what DBCC SHRINKFILE expects below
  3. Enter the following commands
    1. USE your_db_name
    2. GO
    3. BACKUP LOG your_db_name WITH TRUNCATE_ONLY
    4. GO
    5. DBCC SHRINKFILE (your_dblog_filename, 1) 
    6. GO
    7. DBCC SHRINKFILE (your_db_filename, 1)
    8. GO
    9. exec sp_helpfile
Step 9 should output the same info as Step 2; you can now compare the file sizes to see if the process was successful.
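
If you'd rather not eyeball file sizes, DBCC SQLPERF(LOGSPACE) gives you a quick before-and-after view of log usage - run it before step 1 and again at the end:

  -- Returns Database Name, Log Size (MB) and Log Space Used (%) for every database on the instance
  DBCC SQLPERF(LOGSPACE)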

You might get the error "Cannot shrink log file because all logical log files are in use".  In that case you can follow the instructions here to resolve it.  I've detailed the steps below, if you're too lazy to follow the link.
  1. Open SQL Server Management Studio
  2. Right-click on the database you want to shrink and click Properties
  3. From the Database Properties dialog, go to the Options page
  4. Set the Recovery Model to Simple, click OK, and try to shrink the database again
Your database and database log files should now shrink successfully!
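
If you'd rather skip the GUI, the T-SQL equivalent is roughly the following (your_db_name and your_dblog_filename being the same placeholders as before - and if the database is supposed to run in full recovery with log backups, remember to switch it back afterwards and take a fresh full backup):

  -- Switch to the SIMPLE recovery model so the log can be shrunk
  ALTER DATABASE your_db_name SET RECOVERY SIMPLE
  GO
  DBCC SHRINKFILE (your_dblog_filename, 1)
  GO
  -- Optional: return to FULL recovery once you're done
  ALTER DATABASE your_db_name SET RECOVERY FULL
  GO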

Thursday, August 19, 2010

Using Veeam Backup to relocate VMs

Nope, I didn't fudge up the title.  Veeam Backup and Replication is a wonderful product, allowing you to replicate vSphere VMs to an offsite Disaster Recovery location.  When disaster strikes, it's a pretty straightforward process to fail over to your DR site.  It's what I call a forehead procedure - you only have to hit the spacebar with your forehead.  Thanks - I'll be here all night!

What's not so intuitive and well documented is using Veeam Backup to move VMs to a different location, for example a Server Room / Data Center relocation, *and then committing those changes*, i.e. not failing back to Production.  The steps below assume we've already replicated and failed over our VMs to our DR location.

Ruan's Step By Step Guide on using Veeam Backup to relocate VMs

  1. Delete all Veeam VM snapshots using the vSphere Snapshot Manager
  2. By default your DR replica will be named "VMname_replica", rename it back to its original name, i.e. VMname
  3. Remove the VM replica from the list of replicas in the Veeam Backup and Replication Console
  4. Delete the Production -> DR replication job responsible for replicating the VM in question.  Recreate it to reflect the new Source and Target locations
  5. Delete the .vrb files from the VM datastore, as we will no longer be using these restore points (there's a command-line sketch after this list if you'd rather not use the Datastore Browser)
  6. Delete the replica.vrb and running.rbk files
  7. Pat yourself on the back - you've just done the easiest VM move you'll ever do!
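
If you prefer the service console over the Datastore Browser for steps 5 and 6, something along these lines does the trick - the datastore and folder names below are placeholders, so double-check what find returns before deleting anything:

  # List the leftover Veeam restore point / state files in the VM's folder (placeholder path)
  find /vmfs/volumes/your_datastore/VMname -name "*.vrb" -o -name "*.rbk"
  # Happy with the list? Remove them (rm -i prompts before each delete)
  find /vmfs/volumes/your_datastore/VMname \( -name "*.vrb" -o -name "*.rbk" \) -exec rm -i {} \;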