DRS Activity

July 20, 2006

While troubleshooting AutoYaST we also had the opportunity to fiddle with the new Distributed Resource Scheduler (DRS) option in VI3. While the option seems great, we still haven’t got it to a point where we would be comfortable with its performance.

The option itself is great and really has potential, and VMotion has proved itself as well, so on the technical side there are no worries. The issue we are struggling with at the moment is the way DRS decides when to move VMs from one host to another. Even at the most conservative setting we can’t install a VM without DRS starting to move machines. During our SLES installations we regularly ended up with only the newly installing VM on one cluster server and all the other VMs on the second, because DRS had decided to move them all away.

If anybody has similar experiences or tips in this area we would love to hear them.


AutoYaST final

July 20, 2006

Although there is some configuration work with AutoYaST still to be done, we’ve managed to get a working SLES9 SP3 install going. The root cause of the problem was a set of errors in our file repository configuration. We used another AutoYaST repository in which an installation of SLES9 with SP2 worked flawlessly. Updating it to SP3 through the YaST Online Update tool (YOU) also presented no problems, so it’s now just a question of getting the correct files in the correct order into the repository.

Both VMs will be installed with Oracle Application Server. We will get some load off those environments to see how the testing cluster holds up and whether there is any impact on performance (positive or negative).


AutoYaST partial solution

July 17, 2006

Although the cause is still unclear, we have got it working now with SLES9 and Service Pack 2. This is still not what we want, because SP3 is listed as a supported configuration with VMware.

The other odd thing about this problem is that we did manage to get it working with the BusLogic SCSI driver and SP3. However, the BusLogic driver is not supported for VMs with more than 4GB of internal memory, which we will be using in the Oracle database servers (among others).

To rule out a problem with a straight SP3 install we’ll try a new AutoYaST install with SP2 and then upgrade that install over the internet to SP3.


Problems with AutoYaST

July 14, 2006

We are experiencing some problems with our SUSE Linux Enterprise Server 9 installations in the testlab. Our default AutoYaST configuration installs fine, but on reboot the VM can no longer find its hard disks (/dev/sda) and fails to boot. Before the reboot there is no problem at all and the server functions fine.

We haven’t found anything on the VMTN forums or other VM resources. The only thing that comes close is a problem someone described with Workstation 5.5. As both SLES8 and SLES9 are supported operating systems, the problem probably lies with AutoYaST.

We’ll try to fix this next week. I’ll post the cause and solution when we find them.


Application Licensing

July 14, 2006

The reason for making a post about licensing is the possible pitfall that could exist in your software vendor’s licensing terms.

While VMware itself doesn’t make a big deal out of multiple-core processors, we nearly fell into the trap of Oracle’s standpoint on dual-core processors and virtualization. With about half of our servers on Microsoft Windows 2000 or 2003 we will have to keep an accurate license count there as well, but for the sake of the example I’ll stick with Oracle.

Oracle’s general licensing terms are described in the Oracle Software Investment Guide (SIG), which is currently located at: http://www.oracle.com/corporate/pricing/sig.html. This is a 56-page document outlining everything there is to know about licensing the different Oracle products.

To make life easier… Oracle uses different metrics and licensing criteria for virtually every product. We use both Oracle 9i Enterprise database software and Oracle 10g Application Server software (which is version 9 release 4 to make it simple). In our situation there were two important criteria in determining the licensing:

1. Processor Metric

Because of the size of the company we use perpetual per-processor licensing. We are currently using single-core servers but are switching to dual-core Opteron servers, so it is important to keep track of Oracle’s definition of the Processor Metric. The SIG describes this as follows: “… For the purposes of counting the number of processors which require licensing for AMD and Intel multicore chips, “n” cores shall be determined by multiplying the total number of cores by a core processor licensing factor of .50”.

2. Partitioning Servers

Oracle doesn’t mention virtualization as a criterion in the SIG document. The wider term used there is “Partitioning”. The SIG gives the following definition for partitioning in relation to Oracle licensing:

“Oracle recognizes that customers may elect to partition servers for various reasons. These might include achieving lowered costs and simplified management by running multiple operating systems, such as Windows NT and UNIX, on the same server, or improving the workload balancing and distribution by managing processor allocations within applications and among users on the server. While there are two broad categories of partitioning – software partitioning and hardware partitioning – for licensing purposes, Oracle only recognizes hardware partitioning as a means to install and license Oracle on fewer than the total number of processors in the box.”

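In short, because virtualization counts as software partitioning, every core in the physical server has to be licensed. For our DL585s (4-way, dual-core, so 8 cores) the Processor Metric is: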

8 cores * 0.50 = 4

So suppose you are thinking (like we were) about replacing a two-way Oracle server with a single-vCPU virtual machine on a 4-way server, giving that VM extra resources to achieve the same performance, and saving on Oracle licensing in the process. That is not possible given Oracle’s current licensing position on software partitioning / virtualization: the whole host has to be licensed, no matter how few vCPUs the VM uses.
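The comparison can be sketched as a small calculation. This is only an illustration of the reasoning in this post, not Oracle’s official method: the function names are made up for the example, the rounding up of fractional results and the treatment of single-core chips as one processor each are assumptions, and the only figure taken from the SIG is the .50 core factor quoted above.

```python
import math

# Core factor quoted from the SIG for AMD and Intel multicore chips.
CORE_FACTOR = 0.50


def processor_licenses(sockets: int, cores_per_socket: int) -> int:
    """Count processor licenses for a whole physical server (hypothetical helper)."""
    total_cores = sockets * cores_per_socket
    if cores_per_socket == 1:
        # Assumption: a single-core chip counts as one processor.
        return total_cores
    # Assumption: a fractional result is rounded up to a whole license.
    return math.ceil(total_cores * CORE_FACTOR)


def licenses_for_vm(host_sockets: int, host_cores_per_socket: int, vcpus: int) -> int:
    """Licenses needed for a VM under software partitioning (hypothetical helper).

    The vCPU count is deliberately ignored: software partitioning is not
    recognized, so the full physical host has to be licensed.
    """
    return processor_licenses(host_sockets, host_cores_per_socket)


# The existing two-way, single-core Oracle server.
physical = processor_licenses(sockets=2, cores_per_socket=1)

# A single-vCPU VM on a 4-way, dual-core host such as the DL585.
virtual = licenses_for_vm(host_sockets=4, host_cores_per_socket=2, vcpus=1)

print(f"Physical two-way server: {physical} processor licenses")
print(f"Single-vCPU VM on a 4-way dual-core host: {virtual} processor licenses")
```

Under these assumptions the move to a virtual machine would double the number of processor licenses instead of reducing it.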

In preparing for the project I am fortunate that I only have to check this for a limited number of applications, but keep it in mind if you have this form of licensing for other applications.


The testlab

July 13, 2006

The testing lab gave us some headaches because of the capital involved in the hardware. We did the ESX3 beta on one spare IBM xSeries 346 (our standard 2U server), but for all the advanced options in the VI3 release (such as DRS and HA) and for VMotion testing we needed two dual-CPU servers with some extra internal memory.

At first we looked at renting the servers for a few months, but the price tag for two HP DL385s for 10 weeks was a whopping € 3500,-. A one-time rental fee also weighs heavily on our budget compared to just two or three months of depreciation on a server that we own, so this was not an option for us.

After some creative thinking we ended up upgrading the testing server and buying a second x346. We have a use for both servers after the VMware testing is over.

The first server will become our VirtualCenter management server. We got the tip that keeping the management server outside of the cluster of VMware servers is the better option should the cluster fail.

The second server will become our monitoring server (we use NimBUS from NimSoft). Having the monitoring server outside of the cluster seemed a good idea for the same reason.

In the end the testing servers were configured as follows:

  • IBM xSeries 346
  • Dual Intel Xeon 3.6 GHz CPU
  • 7 GB internal memory each
  • 2x 36 GB 15k SCSI disk for VMware ESX installation
  • IBM 2Gb Fiber HBA
  • Intel Dual Gbit NIC

As I wrote earlier, everything is installed and running. We have some VMs online at the moment to get some load going. The biggest applications running now are Exchange 2003 Enterprise (which we use to do restores on; sometimes that’s easier than using restore groups in Exchange) and the server we use for configuring the NimBUS monitoring application. To be added in the near future are testing environments for CODA, Oracle 10g Application Server and Oracle 9i database, Tridion CMS and SuperOffice CRM.


Storage environment

July 12, 2006

Parallel to our virtualization project we are in the process of upgrading our storage environment.

Our current environment is an FC-SAN from HP Storageworks. It consists of three MSA 1000 arrays. Each MSA1000 has 1 additional MSA30 attached to it. The SAN Switch is an HP SAN Switch 2/16 (Brocade internals). Backups are done with an MSL5026 tape robot that is connected to an N1200 Storage Router.

Not all physical servers are connected to the SAN at the moment. Older servers and non-critical machines have not been migrated. Most of the new servers were equipped with a Fiber HBA and connected to the SAN. The server environment is a mixed batch of Linux and Windows (roughly 50/50 split).

With virtualization we will migrate all servers and their storage to the SAN. Because the current components are nearly fully depreciated (some have already been declared end-of-life by HP) we decided to bring our planned SAN upgrade forward by about 4 to 5 months rather than invest in three-year-old equipment.

The choices for the new environment are EMC or IBM. At the moment the following configurations have been proposed:

EMC:

IBM:

These will probably not be the definitive solutions. For example, we’ll drop the tape library in favor of disk, because the equipment will be housed in an external location where we have to pay extra for tape handling.

The new setup will be fully redundant, with dual fiber HBAs in the servers, dual SAN switches, etc. Our current setup doesn’t even have dual controllers and we only have a single SAN switch.


More in a week

July 2, 2006

I’m off for a short holiday so more will follow when I get back to work (July 11th).

The series of posts concerning the budget will be rounded off with a description of the testing setup (we didn’t need to budget for this specifically, but as we have had some expenses on the test servers I will include it).

After that we will have more or less caught up with where we are now: a working testing setup with all the necessary VMware and Platespin software installed. From then on I will keep you updated about everything that we run into during the project.


The budget: hardware is half the fun

July 1, 2006

In our particular business case it certainly isn’t, as the software licenses we need add up to roughly one third of the total cost.

Firstly there is the VMware Virtual Infrastructure licensing. Compared to the different ESX Server licenses of the past, it has become much easier to select the right package for your needs. The different VI3 editions described on the VMware website should give all the information you need.

For our particular case the choice was easy. Because of our recently renewed focus on Business Continuity Management (partly because of an ongoing effort to get ISO 27001 certified), the new High Availability and VMotion options were “must-haves”. This meant that we needed the “Enterprise” edition of VI3. Retail pricing for our setup (8 x 2 CPU seats; VMware is licensed per CPU seat and not per core, kudos to them) will end up around € 40000.
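As a rough back-of-the-envelope check, the figures above can be turned into an average price per CPU seat. This sketch assumes that “8 x 2 CPU seats” means 16 CPU seats in total and treats the € 40000 as a round estimate rather than a quote; the variable names are only illustrative.

```python
# Figures taken from the post; the reading of "8 x 2" as 16 seats is an assumption.
license_units = 8
cpu_seats_per_unit = 2
estimated_total_eur = 40000  # approximate retail price for the whole setup

cpu_seats = license_units * cpu_seats_per_unit
price_per_seat_eur = estimated_total_eur / cpu_seats

print(f"{cpu_seats} CPU seats at roughly EUR {price_per_seat_eur:.0f} per seat")
```

Because the count is per CPU seat, moving from single-core to dual-core processors in the same sockets would not change this number.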

There are a number of other software tools that can make your life and the migration to a Virtual Infrastructure a lot easier. During our preparations for this project we had a look at two vendors:

  • Platespin (see links in sidebar)
    • PowerConvert, which streams servers from physical machines to virtual machines and image archives over the network (and vice versa and a lot of other directions, depending on product version).
    • PowerRecon, which can be used for consolidation planning. It scans your existing network, gathers data on machine utilization and workload levels, and stores it in a database. After enough data has been gathered (preferably one month or longer) PowerRecon can provide a detailed report on the consolidation options for the different machines in the network. Depending on the VMware experience available to you or your company, this product competes with VMware’s own planning and sizing programme. The consultant running that programme has to be certified to do so, however (VMware Certified Professional with specific additional training, if my information is correct). Again, it will depend on your preference: do you want to keep it in house, hire external expertise, or combine the two?
    • Platespin offers a (paid) Proof of Concept Package that is good for 10 Physical to Virtual (P2V) conversions and PowerRecon for 10 servers. We are in the middle of testing this package so I will write more about this in later posts. Should we decide to buy the application we will probably go for the 100 server package (about € 10000-15000) and not the Universal. For us the extra flexibility isn’t worth the price difference. Maybe this will change as we get more hands-on experience, and we’ll upgrade at a later date.
  • Vizioncore
    • Of the products that Vizioncore offers we have had a look at ESXRanger, which can make hot backups of a running VM. It is unlikely that we will be buying it, as VI3 also offers a consolidated backup solution and we have a completely different backup and DR strategy at the moment, but it deserves a look if you are planning your VI.