July 7, 2007
Our virtualization project came to an end a while ago, but I would like to share the final score and some of the savings it achieved:
Our current VI consists of:
- Six cluster hosts (DL585 G2, 48GB memory per host)
- 120 Virtual Servers (consolidation 20:1)
- Two VDI hosts (DL585 G1, 32GB memory per host)
- 50 desktops (currently, there’s room for more)
- One dedicated ESX3 host for SAP testing and development purposes
- One dedicated ESX3 host for Lotus Notes / Domino testing and development purposes
- The above two servers have lower continuity requirements and are therefore stand-alone machines (not part of the cluster)
- 1 Virtual Center Management Server
Real estate saved (focusing on servers only, not the VDI):
- Total rack units used for VI: 30
- Total racks used for VI: 2
- Total rack units saved: 240
- Total racks saved: 8
- Total sq. meters saved: 50 (we would have had to move into a datacenter suite twice as large to accommodate growth)
- Total real estate cost (OTC) saved: € 30.000 (approx.)
- Total real estate cost (MRC) saved: € 6.000 (approx.)
- OTC: One Time Charge
- MRC: Monthly Recurring Charge
Power savings:
- Extrapolated extra power requirement without virtualization: 10 - 15 kW
- Estimated monthly power savings: € 1500 - 2500
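As a rough sanity check on these two figures, here is a minimal sketch of the electricity rate they imply. It assumes the load would have been drawn around the clock and uses an average month of 730 hours; both are assumptions, as is pairing the low end of one range with the low end of the other. The actual tariff (and whether it includes cooling) is not stated here.

```python
HOURS_PER_MONTH = 730  # assumption: average month, continuous load


def implied_price_per_kwh(kilowatts, euros_per_month, hours=HOURS_PER_MONTH):
    """Return the EUR/kWh rate implied by a constant power draw and a monthly cost."""
    return euros_per_month / (kilowatts * hours)


# Low end: 10 kW and EUR 1500; high end: 15 kW and EUR 2500.
low = implied_price_per_kwh(10, 1500)
high = implied_price_per_kwh(15, 2500)
print(f"Implied rate: {low:.3f} - {high:.3f} EUR per kWh")
```

Under those assumptions the implied rate comes out at roughly 0.21 to 0.23 euro per kWh.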
There are also additional benefits like the massive increase in continuity, the time saved on provisioning new servers and the transparency in costs.
Posted by martijnl
July 4, 2007
Two weekends ago we moved our data center to a new location about 60 kilometers away. While we have experience with moving a data center (we moved it to its previous location only two years ago), this was the first time we had to move 150 servers contained in only 30 physical servers.
Of the 150 servers, about 120 reside on only six physical servers (DL585 G2) that are our VI cluster hosts. We also have two DL585 G1s that hold about 50 Virtual Desktops and a DL585 G2 that holds a dedicated SAP testing environment. So there was quite a risk that servers would not survive the transport phase and we would be left with too few cluster hosts to run all the servers.
Because all the cluster servers have 24/7 HP support contracts, we decided that the best we could do was to split the transport into two runs.
Over the weekend itself the transportation and mounting of the servers went without incident. It was only when powering on the first three 585s that two of them gave critical hardware failures. One server (the dedicated SAP server) failed to boot altogether and needed to have its processor board replaced. On the other server the memory bank of processor 4 failed, and soon after the memory bank of processor 2 failed as well. While we were able to get the server to boot and complete its memtest with memory in banks 1 and 3 only, ESX expects memory in all four banks or else it can’t address it, so ESX refused to boot and stopped with a kernel error.
After trying several things on Saturday I filed the support call with HP on Sunday, around 09:00. Within 15 minutes we had HP support on the line, and soon after a courier was dispatched with the necessary spares (also a new processor board and several memory modules). The HP technician was on site around 13:30 and fixed the machine.
Luckily the other cluster hosts were undamaged and they booted without incident. It was quite surprising, though, that such new servers would fail (they are less than a year old). We were very glad that we had stuck to our performance criterion of expanding the cluster when utilization reaches 60-65 percent per host, so we had plenty of headroom to move the servers around.
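The headroom argument can be made concrete with a small sketch. It is a simplification under stated assumptions: all six cluster hosts are identical, the load spreads evenly over the surviving hosts, and per-VM constraints are ignored. The function name is made up for this illustration.

```python
import math


def max_host_failures(hosts, utilization):
    """Return how many hosts can fail while the survivors still carry the full load.

    Assumes identical hosts and a load that redistributes evenly.
    """
    load = hosts * utilization          # total load in "full host" units
    needed = math.ceil(load - 1e-9)     # small epsilon guards against float noise
    return hosts - needed


for utilization in (0.60, 0.65):
    spare = max_host_failures(6, utilization)
    print(f"{utilization:.0%} per host: cluster tolerates {spare} failed host(s)")
```

With six hosts, both ends of the 60-65 percent range leave room for two failed hosts, although at 65 percent the four survivors would be running at close to full capacity.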
July 4, 2007
Everything on the VDI project works as expected and we are in stable production at the moment.
There have been two problems in the past two months:
- Users complaining of slow screen refresh
  - This is inevitable because of the distance. We get 160ms latency and some slowdowns can be expected. It is nothing critical, however, and the advantages outweigh this minor issue. Maybe in the future we will try to optimize graphics performance so it takes up even less bandwidth.
- Sudden packet loss on the connection
  - We are routing this connection over a trunk port on a Cisco switch to the external datacenter we have just moved to, and we suspect some kind of problem with this.
July 1, 2007
These are the specifications that we have determined to be right for us:
- 1x HP Proliant DL585R01 O\2.4-1MB Model 4-8GB (4 x Opteron 880 Dual Core CPUs / 8GB PC3200 memory)
- 6x 4 GB memory module (PC3200/ECC/DDR/SDRAM/2 x 2 GB Modules)
- 2x 36GB disk option (15K/Ultra320/hot-pluggable/1-inch high/universal)
- 2x FC2143 4GB PCI-X2.0 (Fiber Host Bus Adapter)
- 2x Gigabit PCI-X NC7170 Dualport 1000T Server adapter
- 1x Hot pluggable Redundant Power Supply Unit(PSU) for ProLiant DL585R01
Additional information:
Memory
32GB of PC3200 internal memory is the maximum for the chassis at this memory speed. We chose memory speed over total memory size because we ended up needing four cluster nodes for the number of Virtual Machines we want to run (80 - 100), which means we end up with 128GB of PC3200 memory anyway.
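A quick back-of-the-envelope sketch of what that means per Virtual Machine. It deliberately ignores the memory ESX itself needs, failover reserve and any memory sharing or overcommitment, so treat the result as a rough average rather than a sizing rule.

```python
NODES = 4          # cluster nodes from the text
GB_PER_NODE = 32   # memory per chassis from the text

total_gb = NODES * GB_PER_NODE
print(f"Total cluster memory: {total_gb} GB")

# Average physical memory per VM for the planned 80 - 100 Virtual Machines.
for vm_count in (80, 100):
    print(f"{vm_count} VMs: {total_gb / vm_count:.2f} GB per VM on average")
```

That works out to somewhere between 1.28 and 1.6 GB of physical memory per Virtual Machine.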
Disks
It is a personal choice if you want to use disks for your ESX / VI installation or if you want to boot from SAN. We like to have the OS disks in the chassis.
SAN connection
Because a component failure has a higher impact on a host that runs many Virtual Machines, we use two single port Fiber HBAs coupled to two SAN switches, which are linked to dual controllers in the disk arrays. This way, failure of one component in the chain doesn’t affect production availability.
Network
The same goes for the Network Interface Cards (NICs). A four port Gb NIC makes for higher density, but a failure of the card also means failure of all four ports.
Chassis
The decision on the chassis specs is hardly rocket science. HP has a configuration with 4 Opteron 880s (2.4 GHz Dual Cores) built in from the factory, and the premium for the 2.6 GHz model was a bit too much for my liking. We also liked this model because it doesn’t require a lot of assembly of the server chassis.
The new models:
Have a look at my post for more explanation. The specs of the DL585 G2’s are as follows:
- HP ProLiant DL585R02 O/2,4-2P 2 GB (2 Opteron DC 8216 CPUs / 1 MB cache / SA P400i - 512 MB - BBWC)
- 2x AMD Opteron 8216 2,4 GHz- 1 MB Dual Core (95 Watt) processor option
- 32 GB Dual Rank memory (PC2-5300 / ECC / DDR / SDRAM / 2 x 2 GB)
- 2x 36.4-GB 10K SFF SAS harddisk (10K, SFF SAS, hot pluggable)
- 3x HP NC360T PCI Express Dual Port Gigabit Server Adapter
- Hot Plug Redundant PSU for ProLiant DL580R03/R04, ML570 G3/G4 and DL585 G2
- 2x Fiber channel PCI-X 2.0 4Gb Single HBA