Our first production HA failover
Posted by martijnl on February 13, 2007
Yesterday brought a quite unexpected event: our very first HA failover in production. Although we had tested it and seen it work a number of times in our test environment, it was something else to see it happen in production. Because we were not expecting it, an HA failover was not the first thing we looked for when all of a sudden 14 servers went down.
After reviewing the logs, we found that the servers in question had been moved because of a failover. It turned out that one of the hosts lost its network connection for three seconds (we still don’t know why) and that HA decided to power off all the servers on that host and move them as a precaution.
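For readers who have not seen this kind of behaviour before, here is a minimal sketch of the general idea: a host that misses heartbeats for longer than some timeout is treated as failed, and its servers are planned onto the remaining hosts. This is a toy model and not how VMware HA is actually implemented. The host names, the `timeout` value and the "least loaded host" placement rule are all assumptions made up for the illustration.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class Host:
    name: str              # hypothetical host name
    last_heartbeat: float  # time (seconds) the last heartbeat was seen
    vms: List[str]         # servers currently running on this host


def plan_failover(hosts: List[Host], now: float, timeout: float) -> Dict[str, str]:
    """Map each VM on an overdue host to a healthy target host.

    A host counts as failed when its last heartbeat is older than
    `timeout` seconds. Placement is an assumed rule: each VM goes to
    the healthy host that currently has the fewest VMs.
    """
    healthy = [h for h in hosts if now - h.last_heartbeat <= timeout]
    failed = [h for h in hosts if now - h.last_heartbeat > timeout]

    plan: Dict[str, str] = {}
    if not healthy:
        return plan  # nowhere to restart anything

    # Running VM count per healthy host, updated as VMs are placed.
    load = {h.name: len(h.vms) for h in healthy}
    for host in failed:
        for vm in host.vms:
            target = min(load, key=load.get)
            plan[vm] = target
            load[target] += 1
    return plan


if __name__ == "__main__":
    cluster = [
        Host("esx1", last_heartbeat=10.0, vms=["a", "b"]),
        Host("esx2", last_heartbeat=6.0, vms=["c", "d", "e"]),
        Host("esx3", last_heartbeat=10.0, vms=["f"]),
    ]
    # The timeout here is illustrative only.
    print(plan_failover(cluster, now=10.0, timeout=3.0))
```

The point of the sketch is that the decision is made purely on missed heartbeats: the cluster cannot tell a short network hiccup from a dead host, so a brief outage on one host can be enough to move everything that was running on it.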
We can safely say that HA works.