Category Archives: ESX 4

vShield Zones – Some Serious Gotchas

OK..I’ll admit it: I am spoiled by the capabilities of vSphere. What other platform lets you schedule system updates that will occur unattended and without outages of the applications being used? I don’t mean the winders patches, they require a monthly reboot. I am talking about the hypervisor updates. VMware Update Manager coordinates all of this for you. Then along comes vShield Zones to break it all.

First, let me explain what I am trying to do. To simplify things, vShield Zones is a firewall for vSphere Virtual Machines. Rather than regurgitate how it works, take a look at Rodney’s excellent post. A customer has decided to use vShield Zones to help with PCI Compliance. The desire is that only certain VMs will be allowed to communicate with certain other VMs using specific network ports, and to audit that traffic. ’nuff said.

vShield Zones seems to be the perfect solution for this. It works almost seamlessly with vCenter and the underlying ESXi hosts. It provides hardened Linux Virtual Appliances (vShield Agents) to provide the firewalling. It provides a fairly nice management interface to create the firewall rules and distribute them to the vShield Agents. Best of all, IT’S FREE! At least for vSphere Advanced versions and above. Keep in mind, that this is still considered a 1.x release and some things need to be worked out.

Now, on to the gotchas.

Gotcha #1 – Networking

When it comes to networking, the vShield Agent is designed to sit between a vSwitch that is externally connected via physical NICs (pNICs) and a vSwitch that is isolated from the outside world. The vShield Agent installation wizard will prompt you to select a vSwitch to protect. This is illustrated below. The red line indicates network traffic flow.

Click the Image to Enlarge

Click the Image to Enlarge

This works like a champ in this configuration, using a vSwitch for management, which is naturally on an isolated network to begin with, using a vSwitch for VMs to connect to the vShield Agent and using a vSwitch to connect everything to the outside world.  This can also be deployed with limited down time. If you are lucky enough to have the Enterprise Plus version, you may want to use a vNetwork Distributed Switch or even a Cisco 1000v. You will need to make some manual configurations to make this work as outlined in the admin guide.

The gotcha is with blade servers or “pizza box” servers that have limited I/O slots. If all of the VM traffic must flow through the same physical NICs and you use a vSwitch, then you need the vShield Agent to protect a port group rather than an entire vSwitch. You will need to create a vSwitch with a protected port group and connect it to the pNICs. Then you you can install the vShield Agent. Once the vShield Agent is installed, you will need to go back to the vSwitch attached to the pNICs and add an unprotected port group. This is illustrated below. The red line is the protected traffic and the blue line is the unprotected traffic.

Click on Image to Enlarge

Click on Image to Enlarge

As you can see, there is an unprotected Port Group (ORIGINAL Network). This needs to be added to the vSwitch AFTER the vShield Agent is installed. If the ORIGINAL Network is already a part of the vSwitch, it will need to be removed BEFORE installing the vShield Agent. In order to avoid an outage, you will need to disable DRS and manually vMotion all VMs off of the ESX/ESXi host before installing the vShield Agent and modifying the port groups.

Gotcha #2 – DRS/HA Settings

The vShield Agents attach to isolated vSwitches with no pNIC connection. As you should already know, using DRS and vMotion on an isolated vSwitch could cause inter-connectivity between VMs to fail. By default, you cannot vMotion a VM that is attached to an isolated vSwitch. You will need to enable this by editing the vpxd.cfg file. You will also need to disable HA and DRS for the vShield Agents so they stay on the hosts where they are  installed. Both are well documented. Obviously, you will need to install a vShield Agent on every ESX/ESXi host in the cluster.

The Gotcha here is that, with HA disabled for the vShield Agent, there is no facility for automatic startup. There is an automatic startup setting in the startup/shutdown section of the configuration settings. First, this is an all-or-nothing setting. Second, according to the Availability Guide:

“NOTE The Virtual Machine Startup and Shutdown (automatic startup) feature is disabled for all virtual machines residing on hosts that are in (or moved into) a VMware HA cluster. VMware recommends that you do not manually re-enable this setting for any of the virtual machines. Doing so could interfere with the actions of cluster features such as VMware HA or Fault Tolerance.”

So, if a host fails, HA will restart all protected VMs on different hosts. If the host comes back on line, you risk having DRS migrate protected VMs back to that host. This will cause those VMs to become disconnected because the vShield Agent will not automatically start. If a host fails, hope that it fails good enough so it won’t restart.

Gotcha #3 – Maintenance Mode

At the beginning of this post, I mentioned how VMware Update Manager has spoiled me. VUM can be scheduled to patch VMs and hosts. When host patching is scheduled, VUM will place one host in Maintenance Mode, which will evacuate all VMs. Then, it will apply whatever patches are scheduled to be applied, reboot and then exit Maintenance Mode. It will repeat this for each host in a cluster. This works great unless there are running VMs that have DRS disabled, like the vShield Agent.

In the test environment, when a host was manually set to enter Maintenance Mode, it would stall at 2% without moving the test VMs. I am not sure the order that VMs are migrated off, but none were migrated in the test environment. This could vary in different installations. Here’s the gotcha: you cannot power the vShield Agent off because the protected VMs would become disconnected. You cannot migrate it to a different host because it would cause a serious conflict and cause protected VMs to become disconnected. The only thing you can do is place the host in Maintenance Mode, then MANUALLY (*GASP*) migrate all of the protected VMs and then power the vShield Agent off. So much for automated patch management. We’re back to the “oughts.”


I said already that vShield Zones is a 1.x product. It’s a great firewall, but it has a few gotchas that you need to consider. The benefits may outweigh the negatives. But vSphere is a 4.0 product.Some of this should be able to be addressed by tweaking vCenter or host settings.

vShield Zones should be smart enough to allow us to select specific port groups to protect rather than an entire vSwitch. I guess whatever scripting is being done in the background will need to be changed for this. Maybe we need a Ghetto vShield?

One of the REALLY smart people at VMware should be able to tell us the “order of migration” when a host is placed in Maintenance Mode. Once that is determined, there is probably a configuration file somewhere that we could tweak to change it.

There should be a way to set up automatic startup and shutdown of individual VMs. The Startup/Shutdown settings sort of deprecated once DRS was introduced. The only time it is useful is with a stand-alone server or in a NON-DRS cluster. I guess the only thing that could be done is to add a script somewhere in rc.d or rc.local to start up these VMs, but how can that be done in a “supported” fashion with ESXi and is it supported in either ESX or ESXi?

I brought these issues up with some VMware engineers and they assure me that they are working on this. Hopefully they will figure it out soon. I hate doing things manually. It seems like it is anti-cloud.

Creating an Automated ESXi Installer

Back in the summer, I saw Stu’s Post about automating the installation of ESXi. I was reminded again by Duncan’s Post. Then, I found myself in a situation when a customer bought 160 blades for VMware ESXi. With this many systems, it would be almost impossible to do this without mistakes. I took the ideas from Stu and Duncan and created an ESXi automated installer that works from a PXE deployment server, like the Ultimate Deployment Appliance. I took it a step further and added the ability to use a USB stick or a CD for those times when PXE is not allowed. The document below is a result of it.

This is a little different than the idea of a stateless ESXi server, where the hypervisor actually boots from PXE. This is the installer booting from PXE so that the hypervisor can be installed on local disk, an internal USB stick or SD card. You could also use it for a “boot from SAN” situation, but extreme care should be taken so you don’t accidentally format a VMFS disk.

As always, if anyone has comments, corrections, etc., please feel free to post a comment below.

The document can be found here -> Creating an Automated ESXi Installer


The ability to use an automated, unattended installation routine for a hypervisor is necessary whenever it is deployed to multiple systems or is done frequently. Automated installations help avoid a misconfiguration caused by human error, which become common when repetitive tasks are performed.  Because the “traditional” version of VMware ESX Server contains a Red Hat Linux based console operating system, we have been able to leverage kickstart scripts for automated installation. With the ESXi hypervisor, much of this functionality is not available because of the smaller footprint.

This document explains how to set up ESXi with little intervention. The modifications explained here can be used to deploy ESXi using a PXE server. In our examples we will use the Ultimate Deployment Appliance, but these methods will also transfer to such commercial packages as HP Rapid Deployment Pack, Altiris, or even a home grown PXE server. The modifications can also be used for deploying ESXi using a USB stick or a customized CD.


  • ESXi Server Installable The ESXi CD image can be downloaded from the VMware site, however using a systems management and monitoring server, such as HP SIM or Dell OpenManage is highly recommended. Since there are usually vendor specific CIM providers to enhance the monitoring capabilities, some vendors will provide a customized CD image with the CIM providers. These additional CIM providers will also allow for more information to be displayed in the hardware sections of the vSphere Client. A search for “ESXi” on the HP and Dell sites produced links to the latest customized images.
  • Deployment Server A deployment server will allow for a controlled, automated installation of the ESXi Server software. The ability to handle multiple operating system installations is also desired. The ability to provide PXE and DHCP services is required as well. Most times, the deployment server will be running PXE services and TFTP. The DHCP services may be running on a different server in an enterprise. This document does not explain how to set up a separate DHCP server. For this document, we will be using the Ultimate Deployment Appliance (UDA) version 2.0 (beta).
  • Virtualization Software The UDA runs as a “Virtual Appliance,” which is a pre-configured virtual machine. It will run under VMware ESXi (available as a free or licensed instance), VMware Workstation (available for purchase), VMware Player (free) or VMware Server (free). In this document, VMware Workstation is used.
  • Optional software Although no additional software is required when using the UDA, you will need additional software if you plan on using a USB stick or if you plan on creating a customized CD image:
    • VMware Converter If you plan on using ESXi or Server to host the UDA, VMware Converter can be used to import the virtual appliance.
    • Syslinux In order to make a bootable USB stick, you will need the syslinux utility. This utility is available for Linux and Windows. The UDA does not include it. As an alternative, you can use the unetbootin utility.
    • CD Imaging and Burning In order to create a bootable CD image, you will need software to create the CD image (mkisofs) and then software to burn the image to the CD media (cdrecord). The cdrtools project includes versions for Linux and Windows. Most Debian versions of Linux, such as Ubuntu, come with the cdrkit, which uses genisoimage for imaging and wodim for burning.
    • Linux Desktop If you look at the contents of the ESXi CDs using Windows (Windows 7 was used), you may see all of the files listed using all capital letters. Since the ESXi software is based on Linux, all file operations are case sensitive and expect the files to be all lower case. This may cause errors when attempting to create the automated installer. For this reason, a Linux desktop is recommended. For most of the operations, UDS may be used. The only missing software on the UDA is syslinux. For a feature rich Linux desktop, Ubuntu is recommended. A few pre-configured Ubuntu Desktop virtual appliances are also available.


Once you have a hypervisor installed you will need to configure the server and add it to vCenter in an automated fashion. Look for a future doc covering this. For now, check out these resources for post install configurations:

ESX vs. ESXi Which is Better? Revisited.

For over a year now, I have started off telling customers in Plan and Design engagements that they would be using ESXi unless we uncovered a compelling reason to NOT use it. The “which do I use” argument is still going strong. Our blog post “ESX vs. ESXi which is better?“  was posted in April and is still the most popular. It seems to be a struggle for many people to let go of the service console. VMware is trying to go in the direction of the thinner ESXi hypervisor. They are working to provide alternatives to using the service console.

VMware has provided a comparison of ESX vs. ESXi for version 3.5 for a while. Well, VMware posted a comparison for ESX vs. ESXi for version 4 last night. It’s a great reference.

Changes to the ESX Service Console and ESX vs. ESXi…again

A whitpaper was posted in the VMTN communities Thursday outlining the differences between the ESX 3.x and ESX 4.x service console. It further offers resources for transitioning COS based apps and scripts to ESXi via the vSphere Management Assistant and the vSphere CLI. Also mentioned briefly was the vSphere PowerCLI. If you are a developer or write scripts for VMware environments, also check out the Communities Developer section.

I hear it time and time again…The full ESX console is going away. ESXi is the way to go. I know there are valid arguments for keeping ESX around, but they are few. Failing USB keys may be a valid argument, but I have not heard of this happening. If that is the case, use boot from SAN. You need SAN anyway. As for hung VM processes, there are a few ways to address this in ESXi.

If the techie wonks at VMware are publishing articles about how to transition to ESXi, then resistance is futile…you WILL be assimilated…

Setting up a Splunk Server to Monitor a VMware Environment

In a previous article, I compared syslog servers and decided to use Splunk. Splunk is easy to set up as a generic Syslog server, but it can be a pain in the ass getting the winders machines to send to it. There is a home brewed java based app on the Splunk repository of user submitted solutions, but I have heard complaints about its stability and decided that I was going to set out to find a different way to do it.

During my search, I discovered some decent (free!) agents on sourceforge. One will send event logs to a syslog server (SNARE) and one will send text based files to a syslog server (Epilog). Using the SNARE agents appear to be more stable than using the Java App and does a pretty good job. So I basically came up with a free way to set up a great Syslog server using Ubuntu Server, Splunk, SNARE and Epilog.

I created a “Proven Practice Guide” for VI:OPS and posted it there, but it seems that it is stuck in the approval process. I usually psot the doc on VI:OPS and then link to it in my blog post, and follow up later with a copy on our downloads area. To hurry things along, I also posted it in both places:

VMTN: I/O Performance in vSphere, Block Sizes and Disk Alignment

Yes folks, it rears its ugly head again…Disk Alignment… If you have not read it yet, check out the whitepaper on disk alignment from VMware.

First, chethan from VMware posted a great thread on VMTN about I/O performance in vSphere. The start of the thread talks about I/O, then leads into anice discussion about block size. A couple of weeks ago, Duncan Epping posted a very informative article about block sizes. It convinced me to use 8MB blocks in VMFS designs.

Finally, the thread kicked into a discussion about disk alignment. As you know, the VMFS partitions created using the VI Client will aoutmatically be aligned. This is why I advocate NOT putting VMFS partitioning into a kcikstart script. The whitepaper demonstrates how to create aligned patrtitions on winders and Linux guests as well. The process is highly recommended for any intensive app. But I have always questioned the need to do this for system drives (C:) on guests. To do it requires a multi step process or the use of a tool, like mbrscan and mbralign, And I have wondered if it was worth the effort. Well, Jason Boche gave me a reason why it should be done across the board. And it makes sense: “This is an example of where the value of the savings is greater than the sum of all of its parts.”

Jas also outlined a very nice process for aligning Linux VMs and fixing a common Grub issue. Thanks for the tip Jas!

I should also thank everyone else involved: Chethan, Duncan and Gabe!

vSphere Install and Upgrade Best Practices KB Articles and Links

So, I use NewsGator to aggregate a BAZILLION feeds from several sources, blogs, like this one, actual news feeds and a bunch of VMware feeds. The VMware feeds are from the VI:OPS and VMTN forums. The VMTN forums allow you to create a custom feed by selecting the RSS link at the bottom right of each page or you can get a feed from a specific section of the forum by clicking the link on the bottom left of a list. On of the custom feed options is to get a feed of the new KB articles.

VMware has released quite a lot of new KB articles surrounding vSphere. They just released nice best practice guidelines for installing or upgrading to ESX 4 and vCenter 4. They are short and to the point. There is also a nice article covering best practices for upgrading an ESX 3.x virtual machine to ESX 4.0. One thing I noticed, but never thought about is this :

“Note: If you are using dynamic DNS, some Windows versions require ipconfig/reregister to be run.”

Eric Seibert over at vSphere-Land posted a nice set of “missing links” for everything vSphere. This is a nice, comprehensive set of links to evetrything you need for vSphere upgrades or installs.So, go check that out as well.

Installing VMware ESX 4 in Text Mode

There are many reasons to install VMware ESX in text mode. The main reasons I use text mode are that it seems quicker for me and text mode responds better when using remote console connections, such as iLo, DRAC or console over IP. Previous versions of VMware used a text mode that incorporated Anaconda and was very similar to the text mode for RPM based Linux distributions. The new text mode in ESX 4 is VERY rudimentary when compared to the earlier versions. Hoever, it performs very well and is fairly straight forward to use.

The text mode installer uses simple lists of choices. Usually, 1 is for continue or to answer yes. Some items will have more than one choice. Here is a screenshot:


The console OS truly appears as a VM in this version. You must create a datastore and then a VMDK that represents the COS. A disk of sufficient size will be required for this. My first attempt, using an 8GB disk failed. My second attempt, using a 10GB disk was successful.

You can download a doc outlining text mode installation HERE.