I received an invitation to take the VCAP5-DCD BETA Exam a couple of weeks ago and took the exam at Partner Exchange. Like Jason Boche’s experience, the exam consisted of 131 questions and I was given 225 Minutes to complete. The experience was similar to my VCAP4-DCD experience over a year ago.
I’ve been meaning to write this post since May 2010! Never got around to it but I am starting to see and hear about it again. The costs of Windows Server systems become important if you are renewing licenses or you are in a consolidation project. I find that many businesses will purchase server hardware with a Windows OEM license. The problem with an OEM license is that it CANNOT be transferred to a different server. Some people find this out the hard way during P2V conversions.
Interesting that I stumbled on this right after I posted a How-To for the free vSphere Hypervisor.
Mike Adams over at VMware wants to know if you are using the free vSphere Hypervisor. If so, he would like you to complete a very short survey so he can understand how you are using it. Check out his post on the VMware blog.
One of the things that some of the Microsoft Hyper-V users tout is that Hyper-V is free. Sometimes, like in smaller offices or branch locations, it may even make sense to use Hyper-V where there are four or fewer VMs on a host. The Windows 2008 Server Enterprise Edition license will allow for the base Hyper-V installation and up to four Windows 2008 Server VMs on the hardware. But there are many valid reasons to run VMware’s vSphere Hypervisor, the free version of ESXi, in your datacenter.
This is one of those situations where I really start to hate computers! I was working with vCloud Director with a goal of having a winders VM run through guest customization, change the name, get a fixed IP from the network pool, join an Active Directory Domain and move to a specific OU in the AD.
There is a spot in the VM properties to specify a domain to join. You can use the settings specified in the organization or enter the domain information directly.
This post includes an important security “gotcha” that I recently uncovered with vCloud Director 1.5 running on vSphere 5. If you are using vCloud Director, you should check your settings.
The BIG Security Issue
The PAVMUG session on Sept 22nd, 2011 that seemed to have the second most active audience was the session where I discussed vSphere 5 licensing and some of the design considerations. There were several good questions that I would like to re-address here and share some helpful links that I promised during the session. There is a great PowerCLI script and a tool that VMware themselves offer.
For the September 2011 PAVMUG all day meeting, I participated in four sessions. To me, the session with the most audience participation was about virtualizing business critical applications. My session really dug deep into Microsoft Exchange but also covered some basics around SQL and Oracle. I wanted to expand some of the ideas that were discussed during the session and post the presentation slides.
Welcome to Tech-Tap, where we adjust things with a ball peen hammer. Thanks for checking us out!
If you have ever wanted to give something a “technical tap” with a ball peen hammer, then you have come to the right place.
Below, please find archives of posts from my previous blogs. New posts are coming soon, so check back in or subscribe to our RSS feed.
I was doing some research for a session I am presenting at an upcoming PAVMUG meeting about vSphere remote management when I came across an apology by one of the PowerGUI guys. Essentially, he was apologizing for something that VMware changed in the functionality of PowerCLI that affects how the PowerGUI Virtualization Powerpack interacts with it.
If you didn’t know it yet, VMware announced a while back that future releases of VMware will not include the “traditional” ESX Server. From their site: “VMware vSphere 4.1 and its subsequent update and patch releases are the last releases to include both ESX and ESXi hypervisor architectures. Future major releases of VMware vSphere will include only the ESXi architecture.”
If you are in a “24/7/365” shop then the applications running in your private cloud should currently be in virtual data centers (vDC) that are contained in DRS/HA clusters and the migration can be completed with no downtime to the applications. However, there are still other systems, such as development and test systems or possibly some minor infrastructure services applications that may not benefit from vSphere’s availability features. I know many people have scheduled outages, shutdowns, etc. during the upcoming holidays. It may be the best time to migrate to ESXi…
A Little Bit About ESXi 4.1
I have actually been pushing ESXi since version 3.5. In every plan and design engagement where I have been involved, I have always started with ESXi as the version to be used unless there was a compelling reason NOT to use ESXi. I think the only thing right now that requires ESX is HP’s Matrix. I have yet to find anyone that uses that behemoth. There have been many improvements to the features of ESXi since version 4.0. VMware has a nice “Why ESXi” web page to explain these new features. The biggest thing is that ESXi has a smaller footprint and has fewer security updates needed than the “traditional” ESX Server. For instance, the latest round of patches has 11 for ESX and only two for ESXi. The small footprint allows ESXi to be installed on a USB stick, an SDHC or just on less disk space. There is also a nice VMware KB article explaining all of the differences between the current version of ESX and ESXi.
Here are some other things that I like about ESXi:
Treat the ESXi Server like it is an appliance
The ESXi installation should be treated as firmware. It is even called firmware in Update Manager. This means that there is no actual upgrade. The firmware is installed fresh every time. This also brings the idea of having stateless servers at some point in time. More on that later.
Really, there are three methods for a mass deployment. One is to use Host Profiles, but this requires manual steps and the extra costs associated with Enterprise Plus licenses. A second method is to install a base install of ESXi using the method I outlined in a previous post and then use PowerCLI or the vSphere CLI to customize the server. The new preferred method is good ol’ Kickstart.
Several Boot Options
Now, you can boot from local disk, SAN Disk, a USB Key or an SDHC Card. One of the arguments some have against booting from USB or SDHC is the dreaded single point of failure (SPOF). My answer has always been that HA will cover this. And, if you think about it, an internal array controller is a SPOF as well. But, now you can boot from SAN if you wish. Just remember to create a space for logs and core dumps if you go the route of USB or SDHC.
Active Directory Integration
ESXi servers can now become members of a Micro$oft Active Directory Domain so that administrators can authenticate to the directory. Setting this up is done through the vSphere Client under the configuration tab. It appears that you can call the administrators group anything you want in AD as long as it is “ESX Admins”. Maybe that will change in future versions.
You can just do away with SNMP monitoring and use the Common Information Model (CIM) providers. The stock version comes with some basic CIM monitoring, but the major hardware companies provide custom baked versions of ESXi with OEM-specific CIM providers for more granular monitoring. Rather than setting up your monitoring software for SNMP, you just point it to the ESXi server and set it up for CIM or WBEM. If you are not using a monitoring server, you can use the vCenter server to handle alerts and alarms.
Enhanced Tech Support Mode and vCLI Operations
Using Tech Support Mode, formerly known as “Unsupported Mode,” is now supported. You can log in as an administrative user and all commands run in TSM are logged. The biggest reason many people went into TSM was to kill a stuck VM. There is now a vCLI operation that will allow this.
The Migration Path to ESXi
Here is the “Super Dave Method” for migrating to ESXi. Like I mentioned before, if you have at least ESX Clusters with vMotion enabled, this will be a relatively painless process, with no downtime. Obviously, if you are also upgrading from a previous version, there will be reboots required for VMware Tools and possibly to upgrade the VM Hardware to version 7.
TEST TEST TEST
Yeah, you heard me. Take some time to familiarize yourself with the differences between ESX and ESXi by setting up at least one test system so you can hammer it before changing your production systems. The nice thing is that you don’t need hardware to do this. You can set up a VM and run ESXi as a VM.
vMA, vCLI and PowerCLI – Oh My
If you are already familiar with doing things from the ESX console, the vCLI will be very familiar to you. If you are a Winders guy, you should already know PowersHell, so the PowerCLI will be familiar. Also, get to know the vSphere Management Appliance (vMA). The vMA is a RHEL5 based VM Appliance that is pre-configured with the Perl SDK and vCLI. It also includes the vi-fastpass command, which allows you to pass authentication through to the hosts and to vCenter. It also includes vilogger, which allows you to create a (very) poor man’s syslog server. I think you are better served using Splunk! or phplogcon. Either will allow you to parse the syslog entries more easily if you need to do any troubleshooting or forensics. If you want a nice guide to the vMA, head over to the vGhetto for some nice tips and tricks. If you are already a Linux shop and RHEL is not your cup of tea, then you can bake your own vMA with the Perl SDK and vCLI, but you will lose the vi-fastpass and vilogger capabilities. vi-fastpass is not a huge loss and you can set up your home baked vMA to include syslogd and phplogcon.
Create a Kickstart Script
The most efficient way of performing an installation on more than one server is to script it. VMware now supports using Kickstart. Test your Kickstart script on the test ESXi server. Whether it is physical or virtual, you should be able to test most of the functionality. Check out the nifty “DryRun” setting too. I am not going to rehash what has already been done or said regarding Kickstart, so here are some decent links:
- The VMware ESXi Kickstart Documentation Page
- The VMware in SMB Blog
- The vGhetto
- Kendrick Coleman’s Guide
- vMA and AD Troubleshooting Guide
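If you have more than a handful of hosts, it can help to generate one Kickstart file per server from a template instead of hand-editing each one. Here is a minimal Python sketch of that idea. The host names, IP addresses, URL and password are made up, and the directives in the template are written from my memory of the ESXi 4.1 scripted install format, so treat them as assumptions and check every line against the Kickstart documentation listed above before you use it.

```python
from string import Template

# Directive names below are an assumption based on the ESXi 4.1 scripted
# install format; verify them against the VMware documentation.
KS_TEMPLATE = Template("""\
vmaccepteula
rootpw $root_password
autopart --firstdisk --overwritevmfs
install url $install_url
network --bootproto=static --device=vmnic0 --ip=$ip --netmask=$netmask --gateway=$gateway --nameserver=$dns --hostname=$hostname
""")


def render_kickstart(host):
    """Fill the template with one host's settings (hypothetical helper)."""
    # substitute() raises KeyError if a setting is missing, which is what
    # we want: a half-filled Kickstart file is worse than no file.
    return KS_TEMPLATE.substitute(host)


if __name__ == "__main__":
    # Illustrative values only.
    example_host = {
        "root_password": "ChangeMe123",
        "install_url": "http://deploy.example.local/esxi41/",
        "ip": "192.168.10.21",
        "netmask": "255.255.255.0",
        "gateway": "192.168.10.1",
        "dns": "192.168.10.5",
        "hostname": "esxi01.example.local",
    }
    print(render_kickstart(example_host))
```

Keep in mind that a clear text root password in a file on a web server is its own security gotcha, so lock down wherever you store the generated files.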
Take a look at this VMware Labs Fling if you want to create a nice deployment server that automates the ENTIRE process. This brings up the idea of having stateless ESXi servers. As each one is inserted into the vDC, ESXi is automatically set up for you. All the “important stuff” is stored on shared disk.
If you decide to go the route of using PowerCLI, take a look at this nice post.
Back up EVERYTHING!
Back up EVERYTHING! ‘Nuff said? There is no better feeling when the shit hits the fan than to have a good backup. Make sure you back up the vCenter database and capture the configuration settings of all of the ESX servers. If you have Enterprise Plus, capture a host profile for each server. Take screen shots of the configuration settings in the vSphere Client. Include networking and all of the tabs in the vSwitches. Include storage and all of the tabs of any iSCSI initiators. Don’t forget the IQNs! Check the advanced settings for any tweaks. Document the whole thing.
Pick the First Victim and Move Your VMs
If you are using DRS and HA, disable HA. Then place the first victim host in Maintenance Mode. This should automagically move the VMs to other hosts in the cluster if DRS is working. Or you can manually (*GASP*) vMotion your VMs. If they are on local storage, set up an iSCSI server – OpenFiler is free. You can then use Storage vMotion to move the VMs to the iSCSI storage.
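Before you put the victim into Maintenance Mode, it is worth a quick sanity check that the remaining hosts can actually hold its VMs. Here is a minimal Python sketch of that arithmetic. It only looks at memory, it adds up the spare capacity across the other hosts (so it ignores whether one big VM fits on any single host), and the host names and numbers are made up. It does not talk to vCenter; you feed it numbers you collected yourself.

```python
def can_evacuate(hosts, victim):
    """Rough check that the other hosts have room for the victim's VMs.

    hosts maps a host name to (ram_capacity_gb, ram_used_gb).
    This is a hypothetical helper, not part of any VMware tool.
    """
    victim_used = hosts[victim][1]
    # Total free memory on every host except the one being evacuated.
    spare = sum(capacity - used
                for name, (capacity, used) in hosts.items()
                if name != victim)
    return spare >= victim_used


if __name__ == "__main__":
    cluster = {
        "esx01": (64, 40),
        "esx02": (64, 30),
        "esx03": (64, 50),
    }
    print(can_evacuate(cluster, "esx01"))
```

If the answer is no, shut down some test VMs or move them elsewhere before you start, because DRS cannot move what will not fit.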
Remove from inventory
Yep! If the server is a part of a cluster, remove it from the cluster. Before removing it from inventory, unassign the license. Then, remove it from inventory. Once it has been disassociated from the vCenter Server, shut it down.
Now is a good time to update the BIOS, device firmware, etc. Blow out the dust. Perform some hardware TLC. If you do not trust yourself, disconnect it from the SAN. If you are using software iSCSI initiators or NFS, no worries because you will need to reconfigure this stuff after the install.
Now is the time to actually perform your first ESXi installation. Go ahead, we’ll wait.
Once ESXi is installed, you can add it back to the vCenter Server, add the license and perform any post configuration steps like apply the host profile or run your PowersHell script for post configuration. Once the post configuration is completed, confirm that all of the settings are correct and match what you documented previously. If all is good, add it back to the original cluster. If you want to set up Enhanced vMotion Compatibility (EVC) and it’s not currently set up in the original cluster, you can create a new cluster for the ESXi Servers with EVC enabled.
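Comparing the rebuilt host against your documentation is easier if both are written down as simple key and value pairs. Here is a minimal Python sketch of that comparison. The setting names and values are made up for illustration, and how you collect the "actual" values (PowerCLI, vCLI or by hand) is up to you.

```python
def diff_settings(documented, actual):
    """Return {setting: (documented_value, actual_value)} for mismatches.

    A setting that is missing on one side shows up with None on that side.
    Hypothetical helper for illustration only.
    """
    all_keys = set(documented) | set(actual)
    return {key: (documented.get(key), actual.get(key))
            for key in sorted(all_keys)
            if documented.get(key) != actual.get(key)}


if __name__ == "__main__":
    # Illustrative values, not taken from a real host.
    before = {
        "vSwitch1.mtu": 9000,
        "iscsi.iqn": "iqn.1998-01.com.vmware:esx01",
        "ntp.server": "192.168.10.5",
    }
    after = {
        "vSwitch1.mtu": 1500,
        "iscsi.iqn": "iqn.1998-01.com.vmware:esx01",
    }
    for setting, (want, got) in diff_settings(before, after).items():
        print(f"{setting}: documented={want!r} actual={got!r}")
```

An empty result means the host matches what you documented, and anything else is your to-do list before the host goes back into the cluster.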
Repeat the process for each server. If you decided to create a new cluster, start moving VMs to the new cluster.
If you are installing ESXi to a USB Stick or an SDHC card:
- Make sure the stick or card is supported by the hardware manufacturer.
- Set up a syslog server. This is very important.
- If you don’t set up a syslog server, make sure you have a swap partition on the centralized storage. If you do not do this, the logs are deleted on reboot. Then set up a syslog server.
Make sure that during the test phase you understand how to set up authentication, syslog settings and any required custom settings. Also make sure you know your way around the DCUI and the Tech Support Mode console.
Yesterday, I took the VCAP4-DCD beta exam (VDCD410), like many others have done this week. There is a thread started already in the communities forum. Just like previous beta exams with VMware, I had to take the exam at a Pearson-owned facility. There are a few within an hour of me, so I didn’t need to fly anywhere. It’s funny, Jason Boche mentions that he couldn’t take coffee into the testing center. The person at the facility where I took the exam said no food, drink, candy, etc. The center where I usually take exams is more laid back and is OK with coffee as long as there is a lid. In fact, they offer to sell you bottled water to take in with you. VMware only requires a fingerprint scan and a photo with the two forms of ID. I noticed some people were getting their hand vein patterns recorded. Crazy.
Passing the VCAP4-DCD exam is one of the requirements for anyone, including a VCDX3, to achieve the VCDX4 certification. So passing this exam is very important to me.
The beta exam was 131 questions/tasks with four hours to complete. (There was a guy before me that was taking a test that lasts 630 minutes!) I would think that some of these may get cast off as improper, too easy or too hard. If all of the questions prove to be OK, then VMware has a nice, fair pool of design questions. I would also think that the “GA” exam will only be a portion of these questions.
There are three types of questions or tasks: The standard multiple guess questions, a few “match the object to a category” drag and drop tasks and a few diagramming tasks.
Exam Content – What You Need to Know!
I am under countless NDAs on this, so there will be no “scoops” here. I can say that the exam is true to the Blueprint. Rather than giving a direct link to the PDF, which could change, I will tell you to go to www.vmware.com/go/vcap. Click on the “Datacenter Design” tab. There is a link to the current blueprint there. It was just changed to fix an issue with broken hyperlinks. There are also links to a FAQ, the VCAP Communities landing page and a link to a demo of the diagramming tool.
Make sure you read and understand all of the documents and web pages that are linked in the blueprint. VMware leaves no stone unturned. I would not advise trying to memorize all of the content listed in the blueprint; your head would explode. Just comprehend what you read. Much of this is conceptual and revolves around the methodology and best practices GUIDANCE that VMware chooses to publish.
Make sure you take a look at the VCAP4-DCD Exam UI Demo. This is the exam version of a Visio tool. One of my complaints about the VI3 Design Exam was the quality of the diagramming tool. It is greatly improved in this exam. In the VCAP4-DCD beta exam, there was more than one diagramming task. I don’t think VMware is looking for the Mona Lisa. This is more of a “show your work” kind of thing. The diagramming tasks are not that complex and will only cover a few design criteria in each task.
Since this is a DESIGN exam, there are plenty of scenarios that involve capacity planning. Since you will not have any tools available to do your work, you will need to understand the math involved in capacity planning. There is a simple calculator available via a link at the top of the screen. You will also need to understand the math involved with calculating HA, DRS, reservations, shares and limits.
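To give a feel for the kind of math I mean (this is NOT exam content), here is a minimal Python sketch of a basic host count calculation. It assumes identical hosts, a utilization ceiling of 80% and one spare host for HA failover (N+1). The function name, the 80% figure and the sample numbers are all my own assumptions for illustration.

```python
import math


def hosts_needed(total_vm_ghz, total_vm_ram_gb, host_ghz, host_ram_gb,
                 max_utilization=0.8, failover_hosts=1):
    """Hosts required to carry the load, plus spare hosts for HA.

    Hypothetical helper: sizes by whichever resource (CPU or RAM)
    runs out first, then adds the failover hosts on top.
    """
    usable_ghz = host_ghz * max_utilization
    usable_ram_gb = host_ram_gb * max_utilization
    by_cpu = math.ceil(total_vm_ghz / usable_ghz)
    by_ram = math.ceil(total_vm_ram_gb / usable_ram_gb)
    return max(by_cpu, by_ram) + failover_hosts


if __name__ == "__main__":
    # 120 GHz and 600 GB of total VM demand on hosts with 24 GHz and 128 GB.
    print(hosts_needed(120, 600, 24, 128))
```

In this example CPU is the constraint: 120 GHz over 19.2 usable GHz per host rounds up to 7 hosts, RAM would only need 6, and the failover host brings the total to 8. On the exam you would be doing the same steps with the on-screen calculator.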
Finally, with the beta, there was a time constraint. I think I had about 10 or 15 minutes left when I was done. Make sure you manage your time. There was no “back” button and there was no way to mark questions or tasks for review in the beta exam. This may or may not hold true for the “GA” exam. Remember: If there is no “back” button or way to mark a question, make sure you are OK with your answer before clicking the “next” button. I clicked it a few times as I was thinking “Maybe I should read that again….”
My Soapbox Moment
I don’t want to “toot my own horn” or sound arrogant here, but I purposely did not “study” for this exam. I did read the blueprint and skim some of the documents. The hyperlinks were broken in the 1.2 version and I didn’t try to find them too hard. I didn’t study for the VCAP4-DCA exam either (I passed by the skin of my teeth!). In my (humble) opinion, the exams require that you have EXPERIENCE in the subject of the exam. I don’t think VMware intends to have “paper VCAPs” although I am sure there will eventually be some out there.
If you want to pass the VCAP4-DCA exam, you should have experience managing a vSphere environment. If you want to pass the VCAP4-DCD exam, you should have experience in designing at least one vSphere environment. You need to go through the thought processes involved in the ASSESS – DESIGN – IMPLEMENT – MANAGE cycle. I am sure that the design workshop will assist you in gaining the knowledge and some experience in designing a vSphere environment, but it won’t give you everything you need for passing this exam. Certainly, if you want to progress to the final step and submit and defend a design, you will need EXPERIENCE. This is why there is such a high fail rate for the design defense.
Since everyone else in the world is heralding the release of vSphere 4.1, I figured I would post some bad news. The stuff you may want to know BEFORE you jump into upgrading to vSphere 4.1. Before I start, I want to make it clear that vSphere 4.1 is a great product overall. And I have already been leaning to ESXi, so the announcement that this will be the last release with the “traditional” ESX has been expected. I will talk about ESXi and its improvements in a later post. I just want you to be aware of these rather significant Gotchas.
Gotcha #1 – Read Only Role allows members to add VMKernel NICs
From the release notes (You actually READ these, right?):
- Newly added users with read-only role can add VMkernel NICs to ESX/ESXi hosts
Newly added users with a read-only role cannot make changes to the ESX/ESXi host setup with the exception of adding VMkernel NICs, which is currently possible.
Workaround: None. Do not rely on this behavior because read-only users will not be able to add VMkernel NICs in the future.
This is a fairly big security issue. I just LOVE the workaround notes. To be fair, I have found only one installation in my experience that uses the Read-Only Role. In my opinion, if they don’t have access to the physical data center, they don’t need any access to vCenter. But this is just something that should have been corrected before release.
Gotcha #2 – ESX/ESXi installations on HP systems require the HP NMI driver
- ESX installations on HP systems require the HP NMI driver
ESX 4.1 instances on HP systems require the HP NMI driver to ensure proper handling of non-maskable interrupts (NMIs). The NMI driver ensures that NMIs are properly detected and logged. Without this driver, NMIs, which signal hardware faults, are ignored on HP systems with ESX.
CAUTION: Failure to install this driver might result in silent data corruption.
Workaround: Download and install the NMI driver. The driver is available as an offline bundle from the HP Web site. Also, see KB 1021609.
It seems that every time HP releases a new set of SIM agents for ESX, something breaks. Is this VMware’s way of putting it on HP? Or was this an “OOPS”? If you search for “HP VMware NMI Driver” you come up with nothing. No download. It was nowhere to be found on Monday, but I did find it today on the HP support site.
Gotcha #3 – VMware View Composer 2.0.x is not supported in a vSphere vCenter Server 4.1 managed environment
The basic issue here is that vCenter 4.1 only works on a 64-bit system. View Composer only works on a 32-bit system. From the KB Article:
Don’t these guys talk to each other? Didn’t they learn their lesson with the PCoIP issues? And why can’t you just admit it in the release notes instead of putting a link to the KB article? I completely missed this Monday morning.
Gotcha #4 – vCenter Installer SILENTLY Changes SQL Server Settings to Allow Named Pipes
- vCenter Server installation or upgrade silently changes Microsoft SQL Server settings to enable named pipes
When you install vCenter Server 4.1 or upgrade vCenter Server 4.0.x to vCenter Server 4.1 on a host that uses Microsoft SQL Server with a setting of “Using TCP/IP only,” the installer changes that setting to “Using TCP/IP and named pipes” and does not present a notification of the change. Workaround: The change in setting to “Using TCP/IP and named pipes” does not interfere with the correct operation of vCenter Server. However, you can use the following steps to restore the setting to the default of “Using TCP/IP only.”
- Select Start > Programs > Microsoft SQL Server 2005 > Configuration Tools > SQL Server Surface Area Configuration.
- Select Surface Area Configuration for Services and Connections.
- Under the SQL Server instance you are using for vCenter Server, select Remote Connections.
- Change the option under Local and Remote Connections and click Apply.
Can you hear the DBAs pissing and moaning?
Gotcha #4a – SQL Database is changed to Bulk-Logged Recovery Model (updated 10/27)
This one is funny. I just found out about it on 10/27/2010. When it comes to SQL for the vCenter database, VMware recommends using a simple recovery model. So, with their attention to detail, the upgrade process changes the database to a bulk-logged recovery model. In this model, the logs keep growing until a backup purges them. No good.
Transaction log for vCenter Server database grows large after upgrading to vCenter Server 4.1 – http://kb.vmware.com/kb/1026430
Again, vSphere 4.1 brings some great improvements and some welcome changes. As the product matures and more vendors work with the APIs, we will see some nice features that will help you in your journey to the private cloud. The Gotchas listed above may not exist if quality assurance is tightened. I think I would rather hear that a release is delayed because of pending bug fixes. How long will we need to wait to fix these? In any case, if the Read-Only Role or the View Composer gotchas don’t apply, then jump right in and install or upgrade to vSphere 4.1. Just make sure you install the NMI drivers and fix the SQL settings.
I got a tweet from William Lam last night. It looks like versions are hard-coded in Capacity-IQ making it incompatible with vSphere 4.1. Will also explains two ways to make it work.
In case you have been living under a rock and haven’t heard, VMware is getting ready to release a new set of advanced certification exams that will take you along the path to become a VMware Certified Design Expert on vSphere 4 (VCDX4). Just like VCDX3, it starts with the requirement of being a VMware Certified Professional on vSphere 4 (VCP). You will then need to pass two exams before being able to submit and defend your design. VMware has decided to award new certification statuses for passing these exams. The exam to become a VMware Certified Advanced Professional on vSphere 4 – Datacenter Administration (VCAP-DCA) is currently finishing up its beta run. The exam to become a VMware Certified Advanced Professional on vSphere 4 – Datacenter Design (VCAP-DCD) is not yet in beta. The path to achieve VCDX4 status is laid out on VMware’s site and is illustrated below:
Just like Jason Boche, William Lam and Duncan Epping, I had the privilege of taking the beta version exam. As you can see from the upgrade path, I am not required to take the exam to obtain the VCDX4, but I am a glutton for punishment I guess. Also, not having it as a requirement took some of the pre-test jitters off of me. At first, scheduling conflicts prevented me from being able to sit for the exam within VMware’s original deadline. However, I got a call on June 17th that I could take it on July 2nd. Wow…a two week notice, and on my only scheduled day off since April. But I eagerly accepted the invite. Because of the limited notice and the fact that I was juggling a few projects at the same time I debated even studying for the exam. An unscientific survey on twitter showed that 4 out of 4 followers recommended that I study for the exam. I don’t want to come across as arrogant or as a “know-it-all.” My argument here is that I am already a VCDX, I should know this stuff. My schedule and my severe procrastination tendencies made me decide to do a little bit of review the night before.
Before I begin with my thoughts on the exam content, I want to express that I only had two “issues” with the exam experience itself. First a little bit of background: The exam consists of 41 “questions”, which are actually multifaceted problems that you need to solve with the tools that are presented to you. You have 4.5 hours to complete the exam. The problems are presented in a familiar Vue test engine. You click a button to switch to a desktop session with a few of the typical tools used to administer a vSphere environment. The first issue was with the screen refresh for the GUI based tools. When I clicked on an item, sometimes not all of the tabs were presented properly or the content was incomplete. This was pretty annoying and sometimes a hindrance. When I participated in the beta exam for the VI3 Advanced Administration Exam, I did not experience this. Hopefully, this will be cleared up before the exam becomes GA. I would think that a leader in desktop virtualization would have a method to avoid this type of thing. The second issue is the provision for breaks. You can take “unscheduled breaks” but I think the clock keeps ticking. It would be nice to actually have a scheduled break without a time penalty. As you get older, you NEED the breaks…
Now, on to the content. Forget about me actually telling you the actual content of the exam. The NDA prevents this and I want to participate in future beta exams. I got my VCDX3 via beta exams and I hope to get my VCDX4 this way!
I’ll admit it. Working primarily in the SMB market limits your skills a bit. I am not as exposed to some of the more advanced features of vSphere 4 as I used to be when I worked in an “enterprise” market. I skipped a couple of problems because of this. I intended to return to them, but the clock ran out before I could. The problems were a very good compendium of the advanced skills required of a more senior VMware Administrator. It was the toughest exam that I have ever taken. The second toughest was the VI3 Advanced Administration Exam. I thought the questions were very fair and there was nothing in the content that caused me any objections.
I was pretty relaxed when I started the exam, but started to PANIC during the last 30 minutes.
The one (personal) issue I have with this type of exam is that it measures you at a point in time on how much you have memorized. Since I don’t want to use an example of a problem that may be on a VMware exam, I will use one of my cars as an example here. Say, for instance that I am sitting in on the 1972 Ford Gran Torino Advanced Administration Exam…
Let’s say a question on the exam is to set the Ignition Points gap. This is something I did a few times on several cars. I know where to find the ignition points. I know how to set the gap. I have the proper tools to do it. But I don’t know what that setting should be. In the REAL world, I would look it up in a manual or on Google. And I looked up the setting every time I did it. Would I fail the test because I know HOW to do it, but don’t know the proper setting? Probably. My teenie brain can’t hold all of this information – especially with all of the Monty Python references in there, not to mention the words for almost every song by Rush and Iron Maiden…
Back on track… Echoing Duncan, Jason and William, I have a few tips to offer for this exam:
- Read the Exam Blueprint. Perform each task listed in the blueprint a few times, so you know HOW to do it. You DO have access to “–help” and man pages during the exam if you are stumped. However, refer to item #3.
- Build a LAB! You will need it for item #1. You don’t have to go out and buy servers and storage. All you need is a reasonably fast 64-bit PC or laptop with a decent amount of RAM. Some things may be slow, but you will get through it. You can make an ESX server in a VM. Use VMware Player or VMware Workstation to host your lab VMs. Every VMware product in the blueprint is either free or has an evaluation period. Didn’t you get a free VMware Workstation license with your VCP?
- Manage your time! I ran out of it. You have the opportunity to go back. Skip questions if you don’t know how to do it or think it will take a while. The other thing I noticed was that, since the exam is using a live lab environment, the tasks happen in real time. During my panic state, I started to multitask and work on more than one problem at a time. Instead of clicking “Next” and waiting for the task to complete, click “Next” and start on the next problem. Juggle two or three problems. Use your dry erase board to keep track of skipped problems and multitasking. I am not very fast with my typing and I am constantly mixing up letters in words. I call it “typing dyslexia” and it doesn’t help me in these situations!
I don’t know if I passed this one. I am a little bit pessimistic at this time. I will find out in “4-6 weeks”, but that is VMware Time… Good luck to all that have or are planning to take this exam.
OK..I’ll admit it: I am spoiled by the capabilities of vSphere. What other platform lets you schedule system updates that will occur unattended and without outages of the applications being used? I don’t mean the winders patches, they require a monthly reboot. I am talking about the hypervisor updates. VMware Update Manager coordinates all of this for you. Then along comes vShield Zones to break it all.
First, let me explain what I am trying to do. To simplify things, vShield Zones is a firewall for vSphere Virtual Machines. Rather than regurgitate how it works, take a look at Rodney’s excellent post. A customer has decided to use vShield Zones to help with PCI Compliance. The desire is that only certain VMs will be allowed to communicate with certain other VMs using specific network ports, and to audit that traffic. ’nuff said.
vShield Zones seems to be the perfect solution for this. It works almost seamlessly with vCenter and the underlying ESXi hosts. It provides hardened Linux Virtual Appliances (vShield Agents) to provide the firewalling. It provides a fairly nice management interface to create the firewall rules and distribute them to the vShield Agents. Best of all, IT’S FREE! At least for vSphere Advanced versions and above. Keep in mind that this is still considered a 1.x release and some things need to be worked out.
Now, on to the gotchas.
Gotcha #1 – Networking
When it comes to networking, the vShield Agent is designed to sit between a vSwitch that is externally connected via physical NICs (pNICs) and a vSwitch that is isolated from the outside world. The vShield Agent installation wizard will prompt you to select a vSwitch to protect. This is illustrated below. The red line indicates network traffic flow.
This works like a champ in this configuration, using a vSwitch for management, which is naturally on an isolated network to begin with, using a vSwitch for VMs to connect to the vShield Agent and using a vSwitch to connect everything to the outside world. This can also be deployed with limited down time. If you are lucky enough to have the Enterprise Plus version, you may want to use a vNetwork Distributed Switch or even a Cisco 1000v. You will need to make some manual configurations to make this work as outlined in the admin guide.
The gotcha is with blade servers or “pizza box” servers that have limited I/O slots. If all of the VM traffic must flow through the same physical NICs and you use a vSwitch, then you need the vShield Agent to protect a port group rather than an entire vSwitch. You will need to create a vSwitch with a protected port group and connect it to the pNICs. Then you can install the vShield Agent. Once the vShield Agent is installed, you will need to go back to the vSwitch attached to the pNICs and add an unprotected port group. This is illustrated below. The red line is the protected traffic and the blue line is the unprotected traffic.
As you can see, there is an unprotected Port Group (ORIGINAL Network). This needs to be added to the vSwitch AFTER the vShield Agent is installed. If the ORIGINAL Network is already a part of the vSwitch, it will need to be removed BEFORE installing the vShield Agent. In order to avoid an outage, you will need to disable DRS and manually vMotion all VMs off of the ESX/ESXi host before installing the vShield Agent and modifying the port groups.
Gotcha #2 – DRS/HA Settings
The vShield Agents attach to isolated vSwitches with no pNIC connection. As you should already know, using DRS and vMotion on an isolated vSwitch could cause inter-connectivity between VMs to fail. By default, you cannot vMotion a VM that is attached to an isolated vSwitch. You will need to enable this by editing the vpxd.cfg file. You will also need to disable HA and DRS for the vShield Agents so they stay on the hosts where they are installed. Both are well documented. Obviously, you will need to install a vShield Agent on every ESX/ESXi host in the cluster.
The Gotcha here is that, with HA disabled for the vShield Agent, there is no facility for automatic startup. There is an automatic startup setting in the startup/shutdown section of the configuration settings. First, this is an all-or-nothing setting. Second, according to the Availability Guide:
“NOTE The Virtual Machine Startup and Shutdown (automatic startup) feature is disabled for all virtual machines residing on hosts that are in (or moved into) a VMware HA cluster. VMware recommends that you do not manually re-enable this setting for any of the virtual machines. Doing so could interfere with the actions of cluster features such as VMware HA or Fault Tolerance.”
So, if a host fails, HA will restart all protected VMs on different hosts. If the host comes back on line, you risk having DRS migrate protected VMs back to that host. This will cause those VMs to become disconnected because the vShield Agent will not automatically start. If a host fails, hope that it fails good enough so it won’t restart.
Gotcha #3 – Maintenance Mode
At the beginning of this post, I mentioned how VMware Update Manager has spoiled me. VUM can be scheduled to patch VMs and hosts. When host patching is scheduled, VUM will place one host in Maintenance Mode, which will evacuate all VMs. Then, it will apply whatever patches are scheduled to be applied, reboot and then exit Maintenance Mode. It will repeat this for each host in a cluster. This works great unless there are running VMs that have DRS disabled, like the vShield Agent.
In the test environment, when a host was manually set to enter Maintenance Mode, it would stall at 2% without moving the test VMs. I am not sure of the order in which VMs are migrated off, but none were migrated in the test environment. This could vary in different installations. Here’s the gotcha: you cannot power the vShield Agent off because the protected VMs would become disconnected. You cannot migrate it to a different host because it would cause a serious conflict and cause protected VMs to become disconnected. The only thing you can do is place the host in Maintenance Mode, then MANUALLY (*GASP*) migrate all of the protected VMs and then power the vShield Agent off. So much for automated patch management. We’re back to the “aughts.”
I said already that vShield Zones is a 1.x product. It’s a great firewall, but it has a few gotchas that you need to consider. The benefits may outweigh the negatives. But vSphere is a 4.0 product. Some of this should be able to be addressed by tweaking vCenter or host settings.
vShield Zones should be smart enough to allow us to select specific port groups to protect rather than an entire vSwitch. I guess whatever scripting is being done in the background will need to be changed for this. Maybe we need a Ghetto vShield?
One of the REALLY smart people at VMware should be able to tell us the “order of migration” when a host is placed in Maintenance Mode. Once that is determined, there is probably a configuration file somewhere that we could tweak to change it.
There should be a way to set up automatic startup and shutdown of individual VMs. The Startup/Shutdown settings were sort of deprecated once DRS was introduced. The only time they are useful is with a stand-alone server or in a NON-DRS cluster. I guess the only thing that could be done is to add a script somewhere in rc.d or rc.local to start up these VMs, but how can that be done in a “supported” fashion with ESXi, and is it supported in either ESX or ESXi?
I brought these issues up with some VMware engineers and they assure me that they are working on this. Hopefully they will figure it out soon. I hate doing things manually. It seems like it is anti-cloud.
Back in the summer, I saw Stu’s Post about automating the installation of ESXi. I was reminded again by Duncan’s Post. Then, I found myself in a situation when a customer bought 160 blades for VMware ESXi. With this many systems, it would be almost impossible to do this without mistakes. I took the ideas from Stu and Duncan and created an ESXi automated installer that works from a PXE deployment server, like the Ultimate Deployment Appliance. I took it a step further and added the ability to use a USB stick or a CD for those times when PXE is not allowed. The document below is a result of it.
This is a little different than the idea of a stateless ESXi server, where the hypervisor actually boots from PXE. This is the installer booting from PXE so that the hypervisor can be installed on local disk, an internal USB stick or SD card. You could also use it for a “boot from SAN” situation, but extreme care should be taken so you don’t accidentally format a VMFS disk.
As always, if anyone has comments, corrections, etc., please feel free to post a comment below.
The document can be found here -> Creating an Automated ESXi Installer
The ability to use an automated, unattended installation routine for a hypervisor is necessary whenever it is deployed to multiple systems or is done frequently. Automated installations help avoid misconfigurations caused by human error, which become common when repetitive tasks are performed. Because the “traditional” version of VMware ESX Server contains a Red Hat Linux based console operating system, we have been able to leverage kickstart scripts for automated installation. With the ESXi hypervisor, much of this functionality is not available because of the smaller footprint.
This document explains how to set up ESXi with little intervention. The modifications explained here can be used to deploy ESXi using a PXE server. In our examples we will use the Ultimate Deployment Appliance, but these methods will also transfer to such commercial packages as HP Rapid Deployment Pack, Altiris, or even a home grown PXE server. The modifications can also be used for deploying ESXi using a USB stick or a customized CD.
- ESXi Server Installable The ESXi CD image can be downloaded from the VMware site. However, using a systems management and monitoring server, such as HP SIM or Dell OpenManage, is highly recommended. Since there are usually vendor specific CIM providers to enhance the monitoring capabilities, some vendors will provide a customized CD image with the CIM providers. These additional CIM providers will also allow for more information to be displayed in the hardware sections of the vSphere Client. A search for “ESXi” on the HP and Dell sites produced links to the latest customized images.
- Deployment Server A deployment server will allow for a controlled, automated installation of the ESXi Server software. The ability to handle multiple operating system installations is also desired. The ability to provide PXE and DHCP services is required as well. Most times, the deployment server will be running PXE services and TFTP. The DHCP services may be running on a different server in an enterprise. This document does not explain how to set up a separate DHCP server. For this document, we will be using the Ultimate Deployment Appliance (UDA) version 2.0 (beta).
- Virtualization Software The UDA runs as a “Virtual Appliance,” which is a pre-configured virtual machine. It will run under VMware ESXi (available as a free or licensed instance), VMware Workstation (available for purchase), VMware Player (free) or VMware Server (free). In this document, VMware Workstation is used.
- Optional software Although no additional software is required when using the UDA, you will need additional software if you plan on using a USB stick or if you plan on creating a customized CD image:
- VMware Converter If you plan on using ESXi or Server to host the UDA, VMware Converter can be used to import the virtual appliance.
- Syslinux In order to make a bootable USB stick, you will need the syslinux utility. This utility is available for Linux and Windows. The UDA does not include it. As an alternative, you can use the unetbootin utility.
- CD Imaging and Burning In order to create a bootable CD image, you will need software to create the CD image (mkisofs) and then software to burn the image to the CD media (cdrecord). The cdrtools project includes versions for Linux and Windows. Most Debian versions of Linux, such as Ubuntu, come with the cdrkit, which uses genisoimage for imaging and wodim for burning.
- Linux Desktop If you look at the contents of the ESXi CDs using Windows (Windows 7 was used), you may see all of the files listed using all capital letters. Since the ESXi software is based on Linux, all file operations are case sensitive and expect the files to be all lower case. This may cause errors when attempting to create the automated installer. For this reason, a Linux desktop is recommended. For most of the operations, the UDA may be used. The only missing software on the UDA is syslinux. For a feature rich Linux desktop, Ubuntu is recommended. A few pre-configured Ubuntu Desktop virtual appliances are also available.
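If the CD contents were copied through Windows and ended up in upper case, the names can be normalized before building the installer. The following is only a minimal Python sketch of that idea, not part of the original procedure: the function name `lowercase_tree` and the example path are made up for illustration, and it assumes you are working on a copy of the CD contents on a Linux filesystem.

```python
import os


def lowercase_tree(root):
    """Rename every file and directory under root to its lower-case name.

    Walks bottom-up so the contents of a directory are handled before the
    directory itself is renamed. The root directory itself is left alone.
    Returns a list of (old_path, new_path) pairs for everything renamed.
    """
    renamed = []
    for dirpath, dirnames, filenames in os.walk(root, topdown=False):
        for name in dirnames + filenames:
            lower = name.lower()
            if name == lower:
                continue
            src = os.path.join(dirpath, name)
            dst = os.path.join(dirpath, lower)
            # Refuse to clobber a different file that already has the
            # lower-case name (two names differing only by case).
            if os.path.exists(dst) and not os.path.samefile(src, dst):
                raise FileExistsError(dst + " already exists")
            os.rename(src, dst)
            renamed.append((src, dst))
    return renamed


# Hypothetical usage on a working copy of the CD contents:
# for old, new in lowercase_tree("/tmp/esxi-cd-copy"):
#     print(old, "->", new)
```

Run it against a copy rather than the only set of files you have, since the renames are done in place.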
Once you have a hypervisor installed you will need to configure the server and add it to vCenter in an automated fashion. Look for a future doc covering this. For now, check out these resources for post install configurations:
VI:OPS was a VMware forum that dedicated itself to providing information related to operations surrounding a VMware Infrastructure. The “Proven Practice” documents were submitted and reviewed by moderators before they were published. The published documents allowed peers to comment on them.
I made it a point to meet Stevie Chambers because he used to be the driving force behind VI:OPS. When he took his helmet with the big red plume and his sword and armored kilt over to Cisco, everything seemed to just freeze at VI:OPS. It took a week to have my last post approved. PMs were not returned quickly. It just died. No gladiator to defend it.
This morning, I was trying to answer a VCB question on the forums. The person posting had a simple question about the operation of VCB with BackupExec. I have not been very active on the communities lately, but I still scan through them and try to post answers when I can. Most of the time, my response to VCB questions includes a reference to a “Proven Practice” guide that I posted on the VI:OPS communities:
It’s GONE! I suspected something was up when someone posted that they could not find the Visio stencils that were on VI:OPS. What happened? Stwike them woughly centurions!
I hastily posted the PDF here for the forum response because I was trying to hurry out the door. I am working to update the doc and will post it soon. Check out the posted copy and use the comments section or DM me on twitter with any corrections.
Since VI:OPS seems to have died and the content was gobbled up and reindexed by the main VMware communities site (They miss you Stevie!), I am posting my VCB proven practice here. It is dated, since its last version covered ESX 3.5, but most of it still applies in ESX4. If you have comments or changes that you wish to see, please comment here.
You can get it here -> http://www.dailyhypervisor.com/?file_id=8
OK, so my last post brought on a blizzard of remarks questioning some of the validity of the data presented. I used what I was told during a presentation was a “Gartner recommended” configuration for a VM. My error was that I could not find this recommendation anywhere, but the sizing seems fairly valid, so I went with it. I went back to some of the assessments I have done and took data from about 2,000 servers to come up with some more real-world averages. I wanted to post these averages tonight. Remember what I said previously: This is just a set of numbers. You must ASSESS and DESIGN your virtual infrastructure properly. This is only a small piece of it.
I apologize for the images instead of tables, but I spent way too long trying to get tables to lay out properly in WordPress. Click on the images for larger views. I can post the raw data if someone wants to look at it, but I have to work on stripping away proprietary data first. So, here we go:
If you have ever done a Virtualization Assessment, you will recognize this from the summary page of the workbook. We are going to look at data from 1956 servers. Average RAM usage is about 2069MB. Average CPU utilization is about 5.2%. Average network is about 31KB/s.
From the same page in the workbook. From this chart, we see that the average ALLOCATED RAM is about 4342MB and the average FREE RAM is about 2273MB. This is where we get the average RAM usage from above.
These are the averages calculated for each row in the raw data summary.
This final chart is from a storage summary report. Average disk read bytes per sec (442,00) + average write bytes per sec (200,000) is about 600,000 bytes. So, total I/O bytes is about 632,000 (600,000 storage + 32,000 network). I used Google to convert this to gigabits: 632 000 bytes = 0.00470876694 gigabits. This is WAY less than the 0.3Gb recommended. So, here is my calculated AVERAGE VM sizing:
- RAM = 2GB
- I/O = 0.005Gb
- Network I/O = 0.0002 Gb
- Storage I/O = 0.004 Gb
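For anyone who wants to check the arithmetic, here is a small Python sketch using only the figures quoted above. One assumption is labeled in the code: the conversion uses the binary definition of "giga" (2^30), because that is what reproduces the 0.0047 figure from the Google conversion. The function and variable names are mine, for illustration.

```python
# Assumption: binary "giga" (2**30 bits), which reproduces the
# 0.00470876694 figure quoted from the Google conversion above.
BITS_PER_GIGABIT = 2 ** 30


def bytes_to_gigabits(n_bytes):
    """Convert a byte count to gigabits using the binary definition."""
    return n_bytes * 8 / BITS_PER_GIGABIT


# Average RAM in use = average allocated RAM minus average free RAM (MB).
avg_used_ram_mb = 4342 - 2273

# Rounded per-second averages taken from the post.
storage_bytes = 600_000
network_bytes = 32_000
total_bytes = storage_bytes + network_bytes

print("Average RAM used (MB):", avg_used_ram_mb)
print("Total I/O (Gb):   %.4f" % bytes_to_gigabits(total_bytes))
print("Storage I/O (Gb): %.4f" % bytes_to_gigabits(storage_bytes))
print("Network I/O (Gb): %.4f" % bytes_to_gigabits(network_bytes))
```

Swapping `2 ** 30` for `10 ** 9` gives the decimal gigabit figures instead, which come out slightly higher but still far below 0.3Gb.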
I am not going to claim that this is my recommendation for a VM configuration, because it isn’t. My recommendation is still and will always be to ASSESS YOUR UNIQUE ENVIRONMENT and come up with your own data. I am not going to redo my previous post with these numbers because it is pointless. The intent of the previous post was to come up with a number of VMs in a chassis or rack based on a set of criteria. I also wanted to show a comparison of capabilities of each blade. If I use the numbers from this post, it will only show that each blade in question is capable of hosting even more VMs.
I attended the second day of the HP Converged Infrastructure Roadshow in NYC last week. Most of the day was spent watching PowerPoints and demos for the HP Matrix stuff and Virtual Connect. Then came lunch. I finished my appetizer and realized that the buffet being set up was for someone else. My appetizer was actually lunch! Thank God there was cheesecake on the way…
There was a session on unified storage, which mostly covered the LeftHand line. At one point, I asked if the data de-dupe was source based or destination based. The “engineer” looked like a deer in the headlights and promptly answered “It’s hash based.” ‘Nuff said… The session covering the G6 servers was OK, but “been there done that.”
Other than the cheesecake, the best part of the day was the final presentation. The last session covered the differences in the various blade servers from several manufacturers. Even though I work for a company that sells HP, EMC and Cisco gear, I believe that x64 servers, from a hardware perspective, are really generic for the most part. Many will argue why their choice is the best, but most people choose a brand based on relationships with their supplier, the manufacturer or the dreaded “preferred vendor” status. Obviously, this was an HP-biased presentation, but some of the math the Bladesystem engineer (I forgot to get his name) presented really makes you think.
Let’s start with a typical configuration for VMs. He mentioned that this was a “Gartner recommended” configuration for VMs, but I could not find anything about this anywhere online. Even so, it’s a pretty fair portrayal of a typical VM.
Typical Virtual Machine Configuration:
- 3-4 GB Memory
- 300 Mbps I/O
- 100 Mbps Ethernet (0.1Gb)
- 200 Mbps Storage (0.2Gb)
Processor count was not discussed, but you will see that may not be a big deal since most processors are overpowered for today’s applications (I said MOST). IOps is not a factor in these comparisons either; that would be a factor of the storage system.
So, let’s take a look at the typical server configuration. In this article, we are comparing blade servers. But this is even typical for a “2U” rack server. He called this an “eightieth percentile” server, meaning it will meet 80% of the requirements for a server.
Typical Server Configuration:
- 2 Sockets
- 4-6 cores per socket
- 12 DIMM slots
- 2 Hot-plug Drives
- 2 Lan on Motherboard (LOM)
- 2 Mezzanine Slots (Or PCI-e slots)
Now, say we take this typical server and load it with 4GB or 8GB DIMMs. This is not a real stretch of the imagination. With 12 DIMM slots, it gives us 48GB or 96GB of RAM. Now it’s time for some math:
Calculations for a server with 4GB DIMMs:
- 48GB Total RAM ÷ 3GB Memory per VM = 16 VMs
- 16 VMs ÷ 8 cores = 2 VMs per core
- 16 VMs * 0.3Gb per VM = 4.8 Gb I/O needed (x2 for redundancy)
- 16 VMs * 0.1Gb per VM = 1.6Gb Ethernet needed (x2 for redundancy)
- 16 VMs * 0.2Gb per VM = 3.2Gb Storage needed (x2 for redundancy)
Calculations for a server with 8GB DIMMs:
- 96GB Total RAM ÷ 3GB Memory per VM = 32 VMs
- 32 VMs ÷ 8 cores = 4 VMs per core
- 32 VMs * 0.3Gb per VM = 9.6Gb I/O needed (x2 for redundancy)
- 32 VMs * 0.1Gb per VM = 3.2Gb Ethernet needed (x2 for redundancy)
- 32 VMs * 0.2Gb per VM = 6.4Gb Storage needed (x2 for redundancy)
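The arithmetic above is easy to reproduce. This is just a sketch in Python using the per-VM figures from the “typical VM” list (3GB of RAM, 300 Mbps total I/O, 100 Mbps Ethernet, 200 Mbps storage) and the 8-core “typical server”; the function name and parameter names are mine.

```python
def size_server(total_ram_gb, cores=8, ram_per_vm_gb=3,
                io_mbps=300, ethernet_mbps=100, storage_mbps=200):
    """Return VM count and bandwidth needs for one server.

    Bandwidth figures are for a single path; double them for redundancy.
    """
    vms = total_ram_gb // ram_per_vm_gb
    return {
        "vms": vms,
        "vms_per_core": vms / cores,
        "io_gb": vms * io_mbps / 1000,
        "ethernet_gb": vms * ethernet_mbps / 1000,
        "storage_gb": vms * storage_mbps / 1000,
    }


# 12 DIMM slots loaded with 4GB or 8GB DIMMs.
for dimm_gb in (4, 8):
    total_ram = 12 * dimm_gb
    print(total_ram, "GB:", size_server(total_ram))
```

Changing `ram_per_vm_gb` to 4 shows how quickly the VM count drops at the upper end of the 3-4 GB range.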
Are you with me so far? I see nothing wrong with any of these yet.
Now, we need to look at the different attributes of the blades:
* The IBM LS42 and HP BL490c each have 2 internal non-hot-plug drive slots
The “dings” against each:
- Cisco B200M1 has no LOM and only 1 mezzanine slot
- Cisco B250M1 has no LOM
- Cisco chassis only has one pair of I/O modules
- Cisco chassis only has four power supplies – may cause issues using 3-phase power
- Dell M710 and M905 have only 1GbE LOMs (Allegedly, the chassis midplane connecting the LOMs cannot support 10GbE because they lack a “back drill.”)
- IBM LS42 has only 1GbE LOMs
- IBM chassis only has four power supplies – may cause issues using 3-phase power
Now, from here, the engineer made comparisons based on loading each blade with 4GB or 8GB DIMMs. Basically, some of the blades would not support a full complement of VMs based on a full load of DIMMs. What does this mean? Don’t rush out and buy blades loaded with DIMMs or your memory utilization could be lower than expected. What it really means is that you need to ASSESS your needs and DESIGN an infrastructure based on those needs. What I will do is give you a maximum VMs per blade and per chassis. It seems to me that it would make more sense to consider this in the design stage so that you can come up with some TCO numbers based on vendors.
So, we will take a look at the maximum number of VMs for each blade based on total RAM capability and total I/O capability. The lower number becomes the total possible VMs per blade based on overall configuration. To simplify things, I took the total possible RAM, subtracted 6GB for hypervisor and overhead, then divided by 3 to come up with the number of 3GB VMs I could host. I also took the size specs for each chassis, calculated the maximum possible chassis per rack and then calculated the number of VMs per rack. The number of chassis per rack does not account for top of rack switches. If these are needed, you may lose one chassis per rack, although most of the systems will allow for an end of row or core switching configuration.
One thing to remember is this is a quick calculation. It estimates the amount of RAM required for overhead and the hypervisor to be 6GB. It is by no means based on any calculations coming from a real assessment. The reason why the Cisco B250M1 blade is capped at 66 VMs is because of the amount of I/O it is capable of supporting. 20Gb redundant I/O ÷ 0.3Gb I/O per VM = 66 VMs (rounded down).
I set out on this journey to take the ideas from an HP engineer and be as fair as I could in my version of this presentation. I did not even know what the outcome would be, but I am pleased to find that HP blades offer the highest VM per rack numbers.
The final part of the HP presentation dealt with cooling and power comparisons. One thing that I was surprised to hear, but have not confirmed, is that the Cisco blades want to draw more air (in CFM) than one perforated tile will allow. I will not even get into the “CFM per VM” or “Watt per VM” numbers, but they also favored HP blades.
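The method boils down to “take the lower of the RAM limit and the I/O limit.” Below is a rough Python sketch of that rule. The 6GB overhead, 3GB per VM, 0.3Gb (300 Mbps) per VM and the 20Gb redundant I/O for the B250M1 come from this post. The 384GB RAM figure and the blades-per-chassis and chassis-per-rack values in the example are placeholders I made up for illustration, so plug in the real specs for whatever you are comparing.

```python
def max_vms_per_blade(total_ram_gb, redundant_io_gb,
                      overhead_gb=6, ram_per_vm_gb=3, io_mbps_per_vm=300):
    """Lower of the RAM-bound and I/O-bound VM counts for one blade."""
    by_ram = (total_ram_gb - overhead_gb) // ram_per_vm_gb
    by_io = (redundant_io_gb * 1000) // io_mbps_per_vm
    return min(by_ram, by_io)


def vms_per_rack(vms_per_blade, blades_per_chassis, chassis_per_rack):
    """Scale the per-blade number up to a full rack."""
    return vms_per_blade * blades_per_chassis * chassis_per_rack


# 20Gb redundant I/O is from the post; 384GB of RAM is an assumed value.
b250 = max_vms_per_blade(total_ram_gb=384, redundant_io_gb=20)
print("VMs per blade:", b250)

# Placeholder chassis numbers, NOT vendor specs.
print("VMs per rack:", vms_per_rack(b250, blades_per_chassis=4,
                                    chassis_per_rack=6))
```

With that much RAM the blade is I/O bound, which is why the I/O limit, not the memory, sets the cap.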
Please, by all means challenge my numbers. But back them up with numbers yourself.