Our first production HA failover

Yesterday brought a quite unexpected event: our very first HA failover in production. Although we had tested it and seen it work a number of times in our test environment, it was something else to see it happen in production. We certainly weren’t looking for an HA failover when all of a sudden 14 servers went down.

After reviewing the logs we found that the servers in question had been moved because of a failover. It turned out that one of the hosts lost its network connection for three seconds (we still don’t know why) and that HA decided to power off and move all the servers as a precaution.
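For readers who have not seen a failover, the behaviour described above can be sketched as a toy model. This is only an illustration and not how VMware HA is actually implemented: the `Host` class, the `fail_over` function, the host and server names, and the two-second timeout are all hypothetical, chosen so that a three-second outage triggers the move.

```python
from dataclasses import dataclass, field


@dataclass
class Host:
    """A hypothetical host with the names of the servers it runs."""
    name: str
    vms: list = field(default_factory=list)


def fail_over(isolated, healthy, outage_seconds, isolation_timeout_seconds):
    """Move every server off an isolated host if the outage hit the timeout.

    Returns a list of (server, new_host_name) pairs. The timeout is an
    assumed, made-up parameter, not a real HA setting.
    """
    if outage_seconds < isolation_timeout_seconds or not healthy:
        return []  # outage too short, or nowhere to move the servers
    moves = []
    for i, vm in enumerate(list(isolated.vms)):
        target = healthy[i % len(healthy)]  # spread servers round-robin
        isolated.vms.remove(vm)             # "power off" on the old host
        target.vms.append(vm)               # "restart" on the new host
        moves.append((vm, target.name))
    return moves


# Usage: 14 servers on one host that loses its network for three seconds.
bad = Host("host-a", [f"server-{n}" for n in range(1, 15)])
others = [Host("host-b"), Host("host-c")]
moved = fail_over(bad, others, outage_seconds=3, isolation_timeout_seconds=2)
print(len(moved), "servers moved")
```

The point of the sketch is the trade-off we ran into: a short timeout reacts quickly to a real host failure, but it also turns a brief network blip into a restart of every server on that host.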

We can safely say that HA works.

3 Responses to “Our first production HA failover”

  1. PC Blade Daily Links 2007-02-14 - PC Blade Daily - Practical News and Views on Centralized Computing Says:

    [...] Documenting a Virtualization Project: Our First Production HA Failover “We weren’t looking for an HA failover when all of a sudden 14 servers went down … We can safely say that HA works.” [...]

  2. Pranav Says:

    What kind of HA system do you guys use? Heartbeat-DRBD? Or is it something else? I too have implemented HA for our product, so I was curious.

  3. martijnl Says:

    We use the VMware HA option. See http://www.vmware.com/products/vi/features.html#c836 for details.
