The most significant part of VMware PEX, for me this year, was the Solutions Exchange floor and the rather small number of vendors. My focus was on convergence of compute and storage resources. This appears to be a popular path. There were a few things along the way, other than cheap swag, that caught my eye. One interesting conversation involved FusionIO. They validated that many customers concentrate more on storage space instead of performance and that this is not good. Some more progressive enterprises are very focused on performance. For instance, eBay actually measures costs based on url per kilowatt.
I have been attending various VMware Partner Exchange (PEX) and VMworld events since around 1996. Typically, I prefer to attend PEX over VMworld. The number of attendees is significantly smaller and access to the VMware brain trust is easier. There is usually a good mix of NDA roadmaps and decent technical information. The Solutions Exchange floor is less crowded and the vendors are able to spend more time with you. The hands-on labs are the same top quality as at VMworld, but typically with no lines.
I did not attend VMworld 2013, but I did attend PEX 2014 last week. Sadly, I was a little bit underwhelmed again this year, as I was last year too. There was no real feeling of innovation. No buzz. Just ho-hum. There seemed to be fewer exhibitors on the Solutions Exchange floor as compared to previous years. I did have some very educational conversations with some vendors that I will detail later.
Despite all of the drama around Veeam and Nutanix being missing in action, there was no mention of VMTurbo. The people at the Cisco booth had nothing about Whiptail or Insieme.
The push is bundled suites of software, which will offer a single management point and a common interface. But many parts of these suites will likely end up as shelf-ware to many. Let’s face it, many enterprises will need to change significantly in order to fully utilize the vCloud suite. If it is not a top down directive, either the networking silo or storage silo is going to protest over lack of control. Try explaining to network engineers that you need a pool of VLANs and associated IP addresses that will be out of their control and probably will be difficult to integrate with the many manual processes that are used. Try telling a storage administrator that we can now do self-service provisioning of a storage array, including all of the parts in the middle, like zoning and masking. Like VDI, they see that ROI is heavy on “soft costs.” Try explaining how you save on labor costs without reducing workforce in this economy where IT shops are already understaffed. Many people don’t get it when it comes to Cloud, which suddenly became Software Defined Data Center (SDDC) and now Software Defined Enterprise (SDE). But that can be the subject of a future post.
There was a little video before the keynote on Tuesday morning that hammered it home. Think about VMware in its first ten years. No one “got” virtualization. But VMware persisted and stuck to its guns. Now it has become standard issue in a data center. VMware is taking the same approach to the Software Defined Data Center. They are even starting to call it Software Defined Enterprise instead. Their stance is that, with persistence, the SDDC message will be heard and will become the norm. There is a constant struggle between the people that deliver IT and the people that consume IT. The fundamental ideas of SDDC help calm that struggle.
Right now, Microsoft appears to be hot on the tail of VMware and I see many people seriously reconsidering their method of delivery. I think the biggest things that VMware has going for it right now are vCenter Operations Manager (vCOPS) and Site Recovery Manager (SRM). I think that SRM is possibly the only thing keeping some enterprises on VMware for business critical applications. VMware is trying to display an air of non concern. Possibly, they are ignoring the “Evil Empire.” When many enterprises are already paying for Windows Datacenter Edition and System Center, there needs to be justifications to keep vSphere. I see Hyper-V taking a foothold, especially in SMB and branch offices and in places where basic server virtualization is good enough.
It is interesting to see some of the visions play out over the course of time. I remember back in 2007 or 2008, when PEX was still Technical Solutions Exchange (TSX). I remember Carl Eschenbach (http://www.vmware.com/company/leadership/carl-eschenbach.html) announcing the VCDX program. I remember he said that there would be about 200 of us by the end of 2008. Back then, customers needed someone to design their greenfield environment and assist with migration from physical to virtual. I don’t find myself needing to prove that vMotion works any more. Although there are likely many greenfield opportunities out there, most of my design expertise is now spent assisting with creating higher consolidation ratios and helping customers deliver a more optimized datacenter that may not always have vSphere at the top of the list. I have seen VMware go from a standalone hypervisor to a centrally managed solution to the de facto standard, then to what I describe as “meh.” There is no pop anymore. No excitement. Maybe I am getting too pessimistic in my old age.
Stay tooned! I have more to come on the interesting finds on the Solutions Exchange floor.
A few weeks ago, I had the privilege of returning to the customer that was the subject of my VCDX Design Defense.
First, let’s travel back in time. It was mid-December, 2008. I was a VMware Delivery Engineer for my company at the time. The customer engagement was a VMware Plan and Design delivery. I had the audacity to design a virtual datacenter running ESXi 3.5.0 Update 2 on HP BladeSystem servers booting from USB sticks. The Virtual Center server was a VM. This was my fourth or fifth Plan and Design engagement with my company, so it was a piece of cake to me.
I’ve been meaning to write this post since May 2010! Never got around to it but I am starting to see and hear about it again. The costs of Windows Server systems become important if you are renewing licenses or you are in a consolidation project. I find that many businesses will purchase server hardware with a Windows OEM license. The problem with an OEM license is that it CANNOT be transferred to a different server. Some people find this out the hard way during P2V conversions.
Interesting that I stumbled on this right after I posted a How-To for the free vSphere Hypervisor.
Mike Adams over at VMware wants to know if you are using the free vSphere Hypervisor. If so, he would like you to complete a very short survey so he can understand how you are using it. Check out his post on the VMware blog.
One of the things that some of the Microsoft Hyper-V users tout is that Hyper-V is free. Sometimes, like in smaller offices or branch locations, it may even make sense to use Hyper-V where there are four or fewer VMs on a host. The Windows 2008 Server Enterprise Edition license will allow for the base Hyper-V installation and up to four Windows 2008 Server VMs on the hardware. But there are many valid reasons to run VMware’s vSphere Hypervisor, the free version of ESXi, in your datacenter.
NOTE: This is no longer required in vCD 5.1 & above!
This is one of those situations where I really start to hate computers! I was working with vCloud Director with a goal of having a winders VM run through guest customization, change the name, get a fixed IP from the network pool, join an Active Directory Domain and move to a specific OU in the AD.
There is a spot in the VM properties to specify a domain to join. You can use the settings specified in the organization or enter the domain information directly.
This post includes an important security “gotcha” that I recently uncovered with vCloud Director 1.5 running on vSphere 5. If you are using vCloud Director, you should check your settings.
The BIG Security Issue
The PAVMUG session on Sept 22nd, 2011 that seemed to have the second most active audience was the session where I discussed vSphere 5 licensing and some of the design considerations. There were several good questions that I would like to re-address here and share some helpful links that I promised during the session. There is a great PowerCLI script and a tool that VMware themselves offer.
For the September 2011 PAVMUG all day meeting, I participated in four sessions. To me, the session with the most audience participation was about virtualizing business critical applications. My session really dug deep into Microsoft Exchange but also covered some basics around SQL and Oracle. I wanted to expand some of the ideas that were discussed during the session and post the presentation slides.
Welcome to Tech-Tap, where we adjust things with a ball peen hammer. Thanks for checking us out!
If you have ever wanted to give something a “technical tap” with a ball peen hammer, then you have come to the right place.
Below, please find archives of posts from my previous blogs. New posts are coming soon, so check back in or subscribe to our RSS feed.
I was doing some research for a session I am presenting at an upcoming PAVMUG meeting about vSphere remote management when I came across an apology by one of the PowerGUI guys. Essentially, he was apologizing for something that VMware changed in the functionality of PowerCLI that affects how the PowerGUI Virtualization Powerpack interacts with it.
If you didn’t know it yet, VMware announced a while back that future releases of VMware will not include the “traditional” ESX Server. From their site: “VMware vSphere 4.1 and its subsequent update and patch releases are the last releases to include both ESX and ESXi hypervisor architectures. Future major releases of VMware vSphere will include only the ESXi architecture.”
If you are in a “24/7/365” shop then the applications running in your private cloud should currently be in virtual data centers (vDC) that are contained in DRS/HA clusters and the migration can be completed with no downtime to the applications. However, there are still other systems, such as development and test systems or possibly some minor infrastructure services applications that may not benefit from vSphere’s availability features. I know many people have scheduled outages, shutdowns, etc. during the upcoming holidays. It may be the best time to migrate to ESXi…
A Little Bit About ESXi 4.1
I have actually been pushing ESXi since version 3.5. In every plan and design engagement where I have been involved, I have always started with ESXi as the version to be used unless there was a compelling reason NOT to use ESXi. I think the only thing right now that requires ESX is HP’s Matrix. I have yet to find anyone that uses that behemoth. There have been many improvements to the features of ESXi since version 4.0. VMware has a nice “Why ESXi” web page to explain these new features. The biggest thing is that ESXi has a smaller footprint and has fewer security updates needed than the “traditional” ESX Server. For instance, the latest round of patches has 11 for ESX and only two for ESXi. The small footprint allows ESXi to be installed on a USB stick, an SDHC or just on less disk space. There is also a nice VMware KB article explaining all of the differences between the current version of ESX and ESXi.
Here are some other things that I like about ESXi:
Treat the ESXi Server like it is an appliance
The ESXi installation should be treated as firmware. It is even called firmware in Update Manager. This means that there is no actual upgrade. The firmware is installed fresh every time. This also brings the idea of having stateless servers at some point in time. More on that later.
Really, there are three methods for a mass deployment. One is to use Host Profiles, but this requires manual steps and the extra costs associated with Enterprise Plus licenses. A second method is to install a base install of ESXi using the method I outlined in a previous post and then use PowerCLI or the vSphereCLI to customize the server. The new preferred method is good ‘ol Kickstart.
Several Boot Options
Now, you can boot from local disk, SAN Disk, a USB Key or an SDHC Card. One of the arguments some have against booting from USB or SDHC is the dreaded single point of failure (SPOF). My answer has always been that HA will cover this. And, if you think about it, an internal array controller is a SPOF as well. But, now you can boot from SAN if you wish. Just remember to create a space for logs and core dumps if you go the route of USB or SDHC.
Active Directory Integration
ESXi servers can now become members of a Micro$oft Active Directory Domain so that administrators can authenticate to the directory. Setting this up is done through the vSphere Client under the configuration tab. It appears that you can call the administrators group anything you want in AD as long as it is “ESX Admins”. Maybe that will change in future versions.
You can just do away with SNMP monitoring and use the Common Information Model (CIM) providers. The stock version comes with some basic CIM monitoring, but the major hardware companies provide custom baked versions of ESXi with OEM-specific CIM providers for more granular monitoring. Rather than setting up your monitoring software for SNMP, you just point it to the ESXi server and set it up for CIM or WBEM. If you are not using a monitoring server, you can use the vCenter server to handle alerts and alarms.
Enhanced Tech Support Mode and vCLI Operations
Using Tech Support Mode, formerly known as “Unsupported Mode,” is now supported. You can log in as an administrative user and all commands run in TSM are logged. The biggest reason many people went into TSM was to kill a stuck VM. There is now a vCLI operation that will allow this.
The Migration Path to ESXi
Here is the “Super Dave Method” for migrating to ESXi. Like I mentioned before, if you have at least ESX Clusters with vMotion enabled, this will be a relatively painless process, with no downtime. Obviously, if you are also upgrading from a previous version, there will be reboots required for VMware Tools and possibly to upgrade the VM Hardware to version 7.
TEST TEST TEST
Yeah, you heard me. Take some time to familiarize yourself with the differences between ESX and ESXi by setting up at least one test system so you can hammer it before changing your production systems. The nice thing is that you don’t need hardware to do this. You can set up a VM and run ESXi as a VM.
vMA, vCLI and PowerCLI – Oh My
If you are already familiar with doing things from the ESX console, the vCLI will be very familiar to you. If you are a Winders guy, you should already know PowersHell, so the PowerCLI will be familiar. Also, get to know the vSphere Management Appliance (vMA). The vMA is a RHEL5 based VM Appliance that is pre-configured with the Perl SDK and vCLI. It also includes the vi-fastpass command, which allows you to pass authentication through to the hosts and to vCenter. It also includes vilogger, which allows you to create a (very) poor man’s syslog server. I think you are better served using Splunk! or phplogcon. Either will allow you to parse the syslog entries more easily if you need to do any troubleshooting or forensics. If you want a nice guide to the vMA, head over to the vGhetto for some nice tips and tricks. If you are already a Linux shop and RHEL is not your cup of tea, then you can bake your own vMA with the Perl SDK and vCLI, but you will lose the vi-fastpass and vilogger capabilities. vi-fastpass is not a huge loss and you can set up your home baked vMA to include syslogd and phplogcon.
Create a Kickstart Script
The most efficient way of performing an installation on more than one server is to script it. VMware now supports using Kickstart. Test your Kickstart script on the test ESXi server. Whether it is physical or virtual, you should be able to test most of the functionality. Check out the nifty “DryRun” setting too. I am not going to rehash what has already been done or said regarding Kickstart, so here are some decent links:
- The VMware ESXi Kickstart Documentation Page
- The VMware in SMB Blog
- The vGhetto
- Kendrick Coleman’s Guide
- vMA and AD Troubleshooting Guide
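To give a feel for how small a scripted install file really is, here is a rough sketch that renders one Kickstart file per host from Python. Treat all of it as an assumption to verify: the directive names are the ones I recall from the ESXi 4.1 scripted installation documentation, and the host name, password and media URL are made-up placeholders. Check every directive against the documentation page linked above before you point it at real hardware.

```python
# Minimal sketch: render an ESXi 4.1 style Kickstart file for one host.
# The directives below are recalled from the 4.1 scripted-install docs
# (an assumption); host name, password and URL are hypothetical.

KS_TEMPLATE = """\
# Kickstart for {hostname} (illustrative only)
vmaccepteula
rootpw {root_password}
# WARNING: the next line wipes the first disk, including any VMFS on it
autopart --firstdisk --overwritevmfs
install url {media_url}
network --bootproto=dhcp --device=vmnic0
{last_line}"""


def render_kickstart(hostname, root_password, media_url, dryrun=True):
    """Return Kickstart text for one host.

    With dryrun=True the file ends with the 'dryrun' directive, which is
    meant to parse and check the script without performing the install.
    """
    last_line = "dryrun\n" if dryrun else ""
    return KS_TEMPLATE.format(
        hostname=hostname,
        root_password=root_password,
        media_url=media_url,
        last_line=last_line,
    )


def write_kickstart(path, text):
    """Write the rendered Kickstart text to a file such as ks.cfg."""
    with open(path, "w") as handle:
        handle.write(text)
```

Leaving dryrun switched on by default is deliberate. You have to opt in to the version that actually repartitions a disk.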
Take a look at this VMware Labs Fling if you want to create a nice deployment server that automates the ENTIRE process. This brings up the idea of having stateless ESXi servers. As each one is inserted into the vDC, ESXi is automatically set up for you. All the “important stuff” is stored on shared disk.
If you decide to go the route of using PowerCLI, take a look at this nice post.
Back up EVERYTHING!
Back up EVERYTHING! ‘Nuff said? There is no better feeling when the shit hits the fan than to have a good backup. Make sure you back up the vCenter database and capture the configuration settings of all of the ESX servers. If you have Enterprise Plus, capture a host profile for each server. Take screen shots of the configuration settings in the vSphere Client. Include networking and all of the tabs in the vSwitches. Include storage and all of the tabs of any iSCSI initiators. Don’t forget the IQNs! Check the advanced settings for any tweaks. Document the whole thing.
Pick the First Victim and Move Your VMs
If you are using DRS and HA, disable HA. Then place the first victim host in Maintenance Mode. This should automagically move the VMs to other hosts in the cluster if DRS is working. Or you can manually (*GASP*) vMotion your VMs. If they are on local storage, set up an iSCSI server – OpenFiler is free. You can then use Storage vMotion to move the VMs to the iSCSI storage.
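Before you put the first victim into Maintenance Mode, it is worth a quick sanity check that the rest of the cluster can actually absorb its VMs. The sketch below is back-of-the-napkin arithmetic only. The host names and numbers are hypothetical, the 10% headroom is my own assumption, and it looks at aggregate memory rather than whether each individual VM fits on some host.

```python
# Minimal sketch: can the remaining hosts absorb the victim's VMs?
# Aggregate memory only; per-VM placement, CPU and reservations are ignored.

def can_evacuate(hosts, victim, headroom=0.10):
    """Return True if the other hosts have enough spare memory.

    hosts    -- dict of host name -> (capacity_gb, used_gb)
    victim   -- name of the host going into Maintenance Mode
    headroom -- fraction of each host's capacity to keep free (assumed 10%)
    """
    needed = hosts[victim][1]
    spare = 0.0
    for name, (capacity_gb, used_gb) in hosts.items():
        if name == victim:
            continue
        # Spare memory on this host after keeping the headroom free
        spare += max(0.0, capacity_gb * (1 - headroom) - used_gb)
    return spare >= needed


# Hypothetical three-host cluster
cluster = {
    "esx01": (64, 40),
    "esx02": (64, 30),
    "esx03": (64, 35),
}
print(can_evacuate(cluster, "esx01"))
```

If this comes back False, shut down some test VMs or borrow a host before you start, rather than finding out halfway through the evacuation.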
Remove from inventory
Yep! If the server is a part of a cluster, remove it from the cluster. Before removing it from inventory, unassign the license. Then, remove it from inventory. Once it has been disassociated from the vCenter Server, shut it down.
Now is a good time to update the BIOS, device firmware, etc. Blow out the dust. Perform some hardware TLC. If you do not trust yourself, disconnect it from the SAN. If you are using software iSCSI initiators or NFS, no worries because you will need to reconfigure this stuff after the install.
Now is the time to actually perform your first ESXi installation. Go ahead, we’ll wait.
Once ESXi is installed, you can add it back to the vCenter Server, add the license and perform any post configuration steps like applying the host profile or running your PowersHell script for post configuration. Once the post configuration is completed, confirm that all of the settings are correct and match what you documented previously. If all is good, add it back to the original cluster. If you want to set up Enhanced vMotion Compatibility (EVC) and it’s not currently set up in the original cluster, you can create a new cluster for the ESXi Servers with EVC enabled.
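If you captured your documentation in some structured form instead of only screen shots, the “does it match what I documented” check can be as simple as a dictionary comparison. This is just a sketch: the setting names are hypothetical, and how you collect the values (PowerCLI export, vCLI output, typed in by hand) is up to you.

```python
# Minimal sketch: compare documented settings with what the rebuilt host reports.
# Setting names and values are hypothetical examples.

def diff_settings(documented, actual):
    """Return {setting: (documented_value, actual_value)} for every mismatch.

    A setting missing on either side shows up with None in that position.
    """
    keys = set(documented) | set(actual)
    return {
        key: (documented.get(key), actual.get(key))
        for key in sorted(keys)
        if documented.get(key) != actual.get(key)
    }


documented = {
    "vSwitch0.mtu": 1500,
    "iscsi.iqn": "iqn.1998-01.com.vmware:esx01",
    "ntp.server": "ntp.example.com",
}
actual = {
    "vSwitch0.mtu": 9000,
    "iscsi.iqn": "iqn.1998-01.com.vmware:esx01",
}

for setting, (was, now) in diff_settings(documented, actual).items():
    print(f"{setting}: documented={was} actual={now}")
```

An empty result means the host matches your notes and can go back into the cluster.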
Repeat the process for each server. If you decided to create a new cluster, start moving VMs to the new cluster.
If you are installing ESXi to a USB Stick or an SDHC card:
- Make sure the stick or card is supported by the hardware manufacturer.
- Set up a syslog server. This is very important.
- If you don’t set up a syslog server, make sure you have a swap partition on the centralized storage. If you do not do this, the logs are deleted on reboot. Then set up a syslog server.
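If all you need during the test phase is somewhere for the logs to land, even a throwaway collector will do. Here is a minimal sketch in Python. It assumes the host sends BSD-style syslog over UDP to port 514 (check your own syslog settings), and the log file name is a placeholder. The only real logic is decoding the `<PRI>` prefix, where the number is facility times 8 plus severity. Use Splunk or a real syslogd for anything you care about.

```python
# Minimal sketch: a throwaway UDP syslog collector for testing.
# Assumes BSD-style messages ("<PRI>text") arriving on UDP port 514.
import re
import socket

_PRI = re.compile(r"^<(\d{1,3})>")


def parse_pri(line):
    """Split a syslog line into (facility, severity, message).

    PRI = facility * 8 + severity. Lines without a <PRI> prefix are
    returned as (None, None, line).
    """
    match = _PRI.match(line)
    if not match:
        return None, None, line
    pri = int(match.group(1))
    return pri // 8, pri % 8, line[match.end():]


def listen(bind_addr="0.0.0.0", port=514, logfile="esxi-syslog.log"):
    """Append every received datagram to logfile until interrupted.

    Binding to port 514 normally requires root/administrator rights.
    """
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock, \
            open(logfile, "a") as out:
        sock.bind((bind_addr, port))
        while True:
            data, (source, _source_port) = sock.recvfrom(8192)
            facility, severity, message = parse_pri(data.decode("utf-8", "replace"))
            out.write(f"{source} facility={facility} severity={severity} {message}\n")
            out.flush()
```

Call `listen()` from a console on the test box, reboot the ESXi host, and confirm the entries survive the reboot on the collector side.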
Make sure that during the test phase you understand how to set up authentication, syslog settings and any required custom settings. Also make sure you know your way around the DCUI and the Tech Support Mode console.
Yesterday, I took the VCAP4-DCD beta exam (VDCD410), like many others have done this week. There is a thread started already in the communities forum. Just like previous beta exams with VMware, I had to take the exam at a Pearson-owned facility. There are a few within an hour of me, so I didn’t need to fly anywhere. It’s funny, Jason Boche mentions that he couldn’t take coffee into the testing center. The person at the facility where I took the exam said no food, drink, candy, etc. The center where I usually take exams is more laid back and is OK with coffee as long as there is a lid. In fact, they offer to sell you bottled water to take in with you. VMware only requires a fingerprint scan and a photo with the two forms of ID. I noticed some people were getting their hand vein patterns recorded. Crazy.
Passing the VCAP4-DCD exam is one of the requirements for anyone, including a VCDX3, to achieve the VCDX4 certification. So passing this exam is very important to me.
The beta exam was 131 questions/tasks with four hours to complete. (There was a guy before me that was taking a test that lasts 630 minutes!) I would think that some of these may get cast off as improper, too easy or too hard. If all of the questions prove to be OK, then VMware has a nice, fair pool of design questions. I would also think that the “GA” exam will only be a portion of these questions.
There are three types of questions or tasks: The standard multiple guess questions, a few “match the object to a category” drag and drop tasks and a few diagramming tasks.
Exam Content – What You Need to Know!
I am under countless NDAs on this, so there will be no “scoops” here. I can say that the exam is true to the Blueprint. Rather than giving a direct link to the PDF, which could change, I will tell you to go to www.vmware.com/go/vcap. Click on the “Datacenter Design” tab. There is a link to the current blueprint there. It was just changed to fix an issue with broken hyperlinks. There are also links to a FAQ, the VCAP Communities landing page and a link to a demo of the diagramming tool.
Make sure you read and understand all of the documents and web pages that are linked in the blueprint. VMware leaves no stone unturned. I would not advise trying to memorize all of the content listed in the blueprint, your head would explode. Just comprehend what you read. Much of this is conceptual and revolves around the methodology and best practices GUIDANCE that VMware chooses to publish.
Make sure you take a look at the VCAP4-DCD Exam UI Demo. This is the exam version of a Visio tool. One of my complaints about the VI3 Design Exam was the quality of the diagramming tool. It is greatly improved in this exam. In the VCAP4-DCD beta exam, there was more than one diagramming task. I don’t think VMware is looking for the Mona Lisa. This is more of a “show your work” kind of thing. The diagramming tasks are not that complex and will only cover a few design criteria in each task.
Since this is a DESIGN exam, there are plenty of scenarios that involve capacity planning. Since you will not have any tools available to do your work, you will need to understand the math involved in capacity planning. There is a simple calculator available via a link at the top of the screen. You will also need to understand the math involved with calculating HA, DRS, reservations, shares and limits.
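To be clear, nothing below comes from the exam. It is only a generic illustration of the kind of arithmetic I mean, using made-up numbers: size the cluster for the workload at a target utilization, then add spare hosts for HA failover. The 80% target and the N+1 spare are my own assumptions for the example.

```python
# Minimal sketch: host count for a memory-bound cluster with N+1 failover.
# All numbers are hypothetical; CPU, overhead and reservations are ignored.
import math


def hosts_required(total_vm_ram_gb, host_ram_gb,
                   target_utilization=0.8, failover_hosts=1):
    """Hosts needed to carry the load, plus spare hosts for HA failover."""
    usable_per_host = host_ram_gb * target_utilization
    return math.ceil(total_vm_ram_gb / usable_per_host) + failover_hosts


def failover_share(total_hosts, failover_hosts=1):
    """Fraction of cluster capacity held back for failover."""
    return failover_hosts / total_hosts


# 600 GB of VM memory on 128 GB hosts:
# 128 * 0.8 = 102.4 GB usable per host, 600 / 102.4 = 5.86, round up to 6,
# plus one failover host.
count = hosts_required(600, 128)
print(count, round(failover_share(count) * 100, 1))
```

If you can do that on the simple on-screen calculator without thinking about it, you are in good shape for the capacity planning scenarios.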
Finally, with the beta, there was a time constraint. I think I had about 10 or 15 minutes left when I was done. Make sure you manage your time. There was no “back” button and there was no way to mark questions or tasks for review in the beta exam. This may or may not hold true for the “GA” exam. Remember: If there is no “back” button or way to mark a question, make sure you are OK with your answer before clicking the “next” button. I clicked it a few times as I was thinking “Maybe I should read that again….”
My Soapbox Moment
I don’t want to “toot my own horn” or sound arrogant here, but I purposely did not “study” for this exam. I did read the blueprint and skim some of the documents. The hyperlinks were broken in the 1.2 version and I didn’t try to find them too hard. I didn’t study for the VCAP4-DCA exam either (I passed by the skin of my teeth!). In my (humble) opinion, the exams require that you have EXPERIENCE in the subject of the exam. I don’t think VMware intends to have “paper VCAPs” although I am sure there will eventually be some out there.
If you want to pass the VCAP4-DCA exam, you should have experience managing a vSphere environment. If you want to pass the VCAP4-DCD exam, you should have experience in designing at least one vSphere environment. You need to go through the thought processes involved in the ASSESS – DESIGN – IMPLEMENT – MANAGE cycle. I am sure that the design workshop will assist you in gaining the knowledge and some experience in designing a vSphere environment, but it won’t give you everything you need for passing this exam. Certainly, if you want to progress to the final step and submit and defend a design, you will need EXPERIENCE. This is why there is such a high fail rate for the design defense.
Since everyone else in the world is heralding the release of vSphere 4.1, I figured I would post some bad news. The stuff you may want to know BEFORE you jump into upgrading to vSphere 4.1. Before I start, I want to make it clear that vSphere 4.1 is a great product overall. And I have already been leaning to ESXi, so the announcement that this will be the last release with the “traditional” ESX has been expected. I will talk about ESXi and its improvements in a later post. I just want you to be aware of these rather significant Gotchas.
Gotcha #1 – Read Only Role allows members to add VMKernel NICs
From the release notes (You actually READ these, right?):
- Newly added users with read-only role can add VMkernel NICs to ESX/ESXi hosts
Newly added users with a read-only role cannot make changes to the ESX/ESXi host setup with the exception of adding VMkernel NICs, which is currently possible.
Workaround: None. Do not rely on this behavior because read-only users will not be able to add VMkernel NICs in the future.
This is a fairly big security issue. I just LOVE the workaround notes. To be fair, I have found only one installation in my experience that uses the Read-Only Role. In my opinion, if they don’t have access to the physical data center, they don’t need any access to vCenter. But this is just something that should have been corrected before release.
Gotcha #2 – ESX/ESXi installations on HP systems require the HP NMI driver
- ESX installations on HP systems require the HP NMI driver
ESX 4.1 instances on HP systems require the HP NMI driver to ensure proper handling of non-maskable interrupts (NMIs). The NMI driver ensures that NMIs are properly detected and logged. Without this driver, NMIs, which signal hardware faults, are ignored on HP systems with ESX.
CAUTION: Failure to install this driver might result in silent data corruption.
Workaround: Download and install the NMI driver. The driver is available as an offline bundle from the HP Web site. Also, see KB 1021609.
It seems that every time HP releases a new set of SIM agents for ESX, something breaks. Is this VMware’s way of putting it on HP? Or was this an “OOPS”? If you search for “HP VMware NMI Driver” you come up with nothing. No download. It was nowhere to be found on Monday, but I did find it today on the HP support site.
Gotcha #3 – VMware View Composer 2.0.x is not supported in a vSphere vCenter Server 4.1 managed environment
The basic issue here is that vCenter 4.1 only works on a 64-bit system. View Composer only works on a 32-bit system. From the KB Article:
Don’t these guys talk to each other? Didn’t they learn their lesson with the PCoIP issues? And why can’t you just admit it in the release notes instead of putting a link to the KB article? I completely missed this Monday morning.
Gotcha #4 – vCenter Installer SILENTLY Changes SQL Server Settings to Allow Named Pipes
- vCenter Server installation or upgrade silently changes Microsoft SQL Server settings to enable named pipes
When you install vCenter Server 4.1 or upgrade vCenter Server 4.0.x to vCenter Server 4.1 on a host that uses Microsoft SQL Server with a setting of “Using TCP/IP only,” the installer changes that setting to “Using TCP/IP and named pipes” and does not present a notification of the change.
Workaround: The change in setting to “Using TCP/IP and named pipes” does not interfere with the correct operation of vCenter Server. However, you can use the following steps to restore the setting to the default of “Using TCP/IP only.”
- Select Start > Programs > Microsoft SQL Server 2005 > Configuration Tools > SQL Server Surface Area Configuration.
- Select Surface Area Configuration for Services and Connections.
- Under the SQL Server instance you are using for vCenter Server, select Remote Connections.
- Change the option under Local and Remote Connections and click Apply.
Can you hear the DBAs pissing and moaning?
Gotcha #4a – SQL Database is changed to Bulk-Logged Recovery Model (updated 10/27)
This one is funny. I just found out about it on 10/27/2010. When it comes to SQL for the vCenter database, VMware recommends using a simple recovery model. So, with their attention to detail, the upgrade process changes the database to a bulk-logged recovery model. In this model, the transaction log keeps growing until a log backup truncates it. No good.
Transaction log for vCenter Server database grows large after upgrading to vCenter Server 4.1 – http://kb.vmware.com/kb/1026430
Again vSphere 4.1 brings some great improvements and some welcome changes. As the product matures and more vendors work with the APIs, we will see some nice features that will help you in your journey to the private cloud. The Gotchas listed above may not exist if quality assurance is tightened. I think I would rather hear that a release is delayed because of pending bug fixes. How long will we need to wait to fix these? In any case, if the Read-Only Role or the View Composer gotchas don’t apply, then jump right in and install or upgrade to vSphere 4.1. Just make sure you install the NMI drivers and fix the SQL settings.
I got a tweet from William Lam last night. It looks like versions are hard-coded in Capacity-IQ making it incompatible with vSphere 4.1. Will also explains two ways to make it work.
In case you have been living under a rock and haven’t heard, VMware is getting ready to release a new set of advanced certification exams that will take you along the path to become a VMware Certified Design Expert on vSphere 4 (VCDX4). Just like VCDX3, it starts with the requirement of being a VMware Certified Professional on vSphere 4 (VCP). You will then need to pass two exams before being able to submit and defend your design. VMware has decided to award new certification statuses for passing these exams. The exam to become a VMware Certified Advanced Professional on vSphere 4 – Datacenter Administration (VCAP-DCA) is currently finishing up its beta run. The exam to become a VMware Certified Advanced Professional on vSphere 4 – Datacenter Design (VCAP-DCD) is not yet in beta. The path to achieve VCDX4 status is laid out on VMware’s site and is illustrated below:
Just like Jason Boche, William Lam and Duncan Epping, I had the privilege of taking the beta version exam. As you can see from the upgrade path, I am not required to take the exam to obtain the VCDX4, but I am a glutton for punishment I guess. Also, not having it as a requirement took some of the pre-test jitters off of me. At first, scheduling conflicts prevented me from being able to sit for the exam within VMware’s original deadline. However, I got a call on June 17th that I could take it on July 2nd. Wow…a two week notice, and on my only scheduled day off since April. But I eagerly accepted the invite. Because of the limited notice and the fact that I was juggling a few projects at the same time I debated even studying for the exam. An unscientific survey on twitter showed that 4 out of 4 followers recommended that I study for the exam. I don’t want to come across as arrogant or as a “know-it-all.” My argument here is that I am already a VCDX, I should know this stuff. My schedule and my severe procrastination tendencies made me decide to do a little bit of review the night before.
Before I begin with my thoughts on the exam content, I want to express that I only had two “issues” with the exam experience itself. First, a little bit of background: The exam consists of 41 “questions”, which are actually multifaceted problems that you need to solve with the tools that are presented to you. You have 4.5 hours to complete the exam. The problems are presented in a familiar Vue test engine. You click a button to switch to a desktop session with a few of the typical tools used to administer a vSphere environment. The first issue was with the screen refresh for the GUI-based tools. When I clicked on an item, sometimes not all of the tabs were presented properly or the content was incomplete. This was pretty annoying and sometimes a hindrance. When I participated in the beta exam for the VI3 Advanced Administration Exam, I did not experience this. Hopefully, this will be cleared up before the exam becomes GA. I would think that a leader in desktop virtualization would have a method to avoid this type of thing. The second issue is the lack of a provision for breaks. You can take “unscheduled breaks” but I think the clock keeps ticking. It would be nice to actually have a scheduled break without a time penalty. As you get older, you NEED the breaks…
Now, on to the content. Forget about me actually telling you the actual content of the exam. The NDA prevents this and I want to participate in future beta exams. I got my VCDX3 via beta exams and I hope to get my VCDX4 this way!
I’ll admit it. Working primarily in the SMB market limits your skills a bit. I am not as exposed to some of the more advanced features of vSphere 4 as I used to be when I worked in an “enterprise” market. I skipped a couple of problems because of this. I intended to return to them, but the clock ran out before I could. The problems were a very good compendium of the advanced skills required of a more senior VMware Administrator. It was the toughest exam that I have ever taken. The second toughest was the VI3 Advanced Administration Exam. I thought the questions were very fair and there was nothing in the content that caused me any objections.
I was pretty relaxed when I started the exam, but started to PANIC during the last 30 minutes.
The one (personal) issue I have with this type of exam is that it measures you at a point in time on how much you have memorized. Since I don’t want to use an example of a problem that may be on a VMware exam, I will use one of my cars as an example here. Say, for instance that I am sitting in on the 1972 Ford Gran Torino Advanced Administration Exam…
Let’s say a question on the exam is to set the Ignition Points gap. This is something I did a few times on several cars. I know where to find the ignition points. I know how to set the gap. I have the proper tools to do it. But I don’t know what that setting should be. In the REAL world, I would look it up in a manual or on Google. And I looked up the setting every time I did it. Would I fail the test because I know HOW to do it, but don’t know the proper setting? Probably. My teeny brain can’t hold all of this information – especially with all of the Monty Python references in there, not to mention the words for almost every song by Rush and Iron Maiden…
Back on track… Echoing Duncan, Jason and William, I have a few tips to offer for this exam:
- Read the Exam Blueprint. Perform each task listed in the blueprint a few times, so you know HOW to do it. You DO have access to “–help” and man pages during the exam if you are stumped. However, refer to item #3.
- Build a LAB! You will need it for item #1. You don’t have to go out and buy servers and storage. All you need is a reasonably fast 64-bit PC or laptop with a decent amount of RAM. Some things may be slow, but you will get through it. You can make an ESX server in a VM. Use VMware Player or VMware Workstation to host your lab VMs. Every VMware product in the blueprint is either free or has an evaluation period. Didn’t you get a free VMware Workstation license with your VCP?
- Manage your time! I ran out of it. You have the opportunity to go back. Skip questions if you don’t know how to do it or think it will take a while. The other thing I noticed was that, since the exam is using a live lab environment, the tasks happen in real time. During my panic state, I started to multitask and work on more than one problem at a time. Instead of clicking “Next” and waiting for the task to complete, click “Next” and start on the next problem. Juggle two or three problems. Use your dry erase board to keep track of skipped problems and multitasking. I am not very fast with my typing and I am constantly mixing up letters in words. I call it “typing dyslexia” and it doesn’t help me in these situations!
I don’t know if I passed this one. I am a little bit pessimistic at this time. I will find out in “4-6 weeks”, but that is VMware Time… Good luck to all that have or are planning to take this exam.
OK..I’ll admit it: I am spoiled by the capabilities of vSphere. What other platform lets you schedule system updates that will occur unattended and without outages of the applications being used? I don’t mean the winders patches, they require a monthly reboot. I am talking about the hypervisor updates. VMware Update Manager coordinates all of this for you. Then along comes vShield Zones to break it all.
First, let me explain what I am trying to do. To simplify things, vShield Zones is a firewall for vSphere Virtual Machines. Rather than regurgitate how it works, take a look at Rodney’s excellent post. A customer has decided to use vShield Zones to help with PCI Compliance. The desire is that only certain VMs will be allowed to communicate with certain other VMs using specific network ports, and to audit that traffic. ’nuff said.
vShield Zones seems to be the perfect solution for this. It works almost seamlessly with vCenter and the underlying ESXi hosts. It provides hardened Linux Virtual Appliances (vShield Agents) to provide the firewalling. It provides a fairly nice management interface to create the firewall rules and distribute them to the vShield Agents. Best of all, IT’S FREE! At least for vSphere Advanced versions and above. Keep in mind, that this is still considered a 1.x release and some things need to be worked out.
Now, on to the gotchas.
Gotcha #1 – Networking
When it comes to networking, the vShield Agent is designed to sit between a vSwitch that is externally connected via physical NICs (pNICs) and a vSwitch that is isolated from the outside world. The vShield Agent installation wizard will prompt you to select a vSwitch to protect. This is illustrated below. The red line indicates network traffic flow.
This works like a champ in this configuration, using a vSwitch for management, which is naturally on an isolated network to begin with, using a vSwitch for VMs to connect to the vShield Agent and using a vSwitch to connect everything to the outside world. This can also be deployed with limited down time. If you are lucky enough to have the Enterprise Plus version, you may want to use a vNetwork Distributed Switch or even a Cisco 1000v. You will need to make some manual configurations to make this work as outlined in the admin guide.
The gotcha is with blade servers or “pizza box” servers that have limited I/O slots. If all of the VM traffic must flow through the same physical NICs and you use a vSwitch, then you need the vShield Agent to protect a port group rather than an entire vSwitch. You will need to create a vSwitch with a protected port group and connect it to the pNICs. Then you can install the vShield Agent. Once the vShield Agent is installed, you will need to go back to the vSwitch attached to the pNICs and add an unprotected port group. This is illustrated below. The red line is the protected traffic and the blue line is the unprotected traffic.
As you can see, there is an unprotected Port Group (ORIGINAL Network). This needs to be added to the vSwitch AFTER the vShield Agent is installed. If the ORIGINAL Network is already a part of the vSwitch, it will need to be removed BEFORE installing the vShield Agent. In order to avoid an outage, you will need to disable DRS and manually vMotion all VMs off of the ESX/ESXi host before installing the vShield Agent and modifying the port groups.
Gotcha #2 – DRS/HA Settings
The vShield Agents attach to isolated vSwitches with no pNIC connection. As you should already know, using DRS and vMotion on an isolated vSwitch could cause inter-connectivity between VMs to fail. By default, you cannot vMotion a VM that is attached to an isolated vSwitch. You will need to enable this by editing the vpxd.cfg file. You will also need to disable HA and DRS for the vShield Agents so they stay on the hosts where they are installed. Both are well documented. Obviously, you will need to install a vShield Agent on every ESX/ESXi host in the cluster.
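For the vpxd.cfg edit, here is a minimal sketch in Python of what the change amounts to. The element path (migrate/test/CompatibleNetworks/VMOnVirtualIntranet set to false) is how I recall it from VMware's knowledge base, so treat it as an assumption and check it against the current documentation before touching a production vCenter. The SAMPLE string is a hypothetical stand-in, not a real vpxd.cfg. Back up the real file first; vCenter will most likely need a service restart before it picks up the edit.

```python
import xml.etree.ElementTree as ET

# Assumption: this element path is recalled from VMware's knowledge base.
# Verify it against current documentation before using it for real.
SETTING_PATH = ["migrate", "test", "CompatibleNetworks", "VMOnVirtualIntranet"]


def allow_intranet_vmotion(vpxd_cfg_xml: str) -> str:
    """Return the config text with the isolated-vSwitch vMotion check disabled."""
    root = ET.fromstring(vpxd_cfg_xml)  # the root element is expected to be <config>
    node = root
    for tag in SETTING_PATH:
        child = node.find(tag)
        if child is None:
            # Create any missing level of the path; reuse it if it already exists.
            child = ET.SubElement(node, tag)
        node = child
    node.text = "false"
    return ET.tostring(root, encoding="unicode")


# Hypothetical stand-in for the contents of vpxd.cfg.
SAMPLE = "<config><vpxd><das/></vpxd></config>"

updated = allow_intranet_vmotion(SAMPLE)
print(updated)
```

Running the function a second time on its own output changes nothing, so it is safe to re-apply.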
The Gotcha here is that, with HA disabled for the vShield Agent, there is no facility for automatic startup. There is an automatic startup setting in the startup/shutdown section of the configuration settings. First, this is an all-or-nothing setting. Second, according to the Availability Guide:
“NOTE The Virtual Machine Startup and Shutdown (automatic startup) feature is disabled for all virtual machines residing on hosts that are in (or moved into) a VMware HA cluster. VMware recommends that you do not manually re-enable this setting for any of the virtual machines. Doing so could interfere with the actions of cluster features such as VMware HA or Fault Tolerance.”
So, if a host fails, HA will restart all protected VMs on different hosts. If the host comes back on line, you risk having DRS migrate protected VMs back to that host. This will cause those VMs to become disconnected because the vShield Agent will not automatically start. If a host fails, hope that it fails good enough so it won’t restart.
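To make that failure scenario concrete, here is a small Python model of it. It does not talk to vCenter at all; the host names, VM names and the Host class are made up for illustration. It only encodes the rule described above: a protected VM loses connectivity whenever it sits on a host whose vShield Agent is not running.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Host:
    name: str
    agent_running: bool  # is the vShield Agent on this host powered on?
    protected_vms: List[str] = field(default_factory=list)


def disconnected_vms(hosts: List[Host]) -> List[str]:
    """Protected VMs that sit on a host whose vShield Agent is not running."""
    return [vm for host in hosts if not host.agent_running for vm in host.protected_vms]


# Scenario from the text: esx01 failed and came back, its agent did not
# start automatically, and DRS has moved one protected VM back onto it.
cluster = [
    Host("esx01", agent_running=False, protected_vms=["pci-db01"]),
    Host("esx02", agent_running=True, protected_vms=["pci-web01", "pci-web02"]),
]

print(disconnected_vms(cluster))
```

In a real environment the same check would have to be fed from vCenter inventory data, which is left out here.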
Gotcha #3 – Maintenance Mode
At the beginning of this post, I mentioned how VMware Update Manager has spoiled me. VUM can be scheduled to patch VMs and hosts. When host patching is scheduled, VUM will place one host in Maintenance Mode, which will evacuate all VMs. Then, it will apply whatever patches are scheduled to be applied, reboot and then exit Maintenance Mode. It will repeat this for each host in a cluster. This works great unless there are running VMs that have DRS disabled, like the vShield Agent.
In the test environment, when a host was manually set to enter Maintenance Mode, it would stall at 2% without moving the test VMs. I am not sure of the order in which VMs are migrated off, but none were migrated in the test environment. This could vary in different installations. Here’s the gotcha: you cannot power the vShield Agent off because the protected VMs would become disconnected. You cannot migrate it to a different host because it would cause a serious conflict and cause protected VMs to become disconnected. The only thing you can do is place the host in Maintenance Mode, then MANUALLY (*GASP*) migrate all of the protected VMs and then power the vShield Agent off. So much for automated patch management. We’re back to the “aughts.”
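Since the order matters, here is the manual workaround written down as a tiny Python checklist generator. It is only a sketch: it prints the steps and does not drive vCenter, and the host and VM names are hypothetical. The one thing it enforces is that the vShield Agent is powered off last, after every protected VM has been moved.

```python
def maintenance_runbook(host, target_host, protected_vms, agent_vm):
    """Return the manual steps, in order, for taking one host down for patching."""
    if host == target_host:
        raise ValueError("target host must differ from the host being patched")
    steps = [f"Place {host} in Maintenance Mode"]
    steps += [
        f"Manually vMotion {vm} from {host} to {target_host}" for vm in protected_vms
    ]
    # The agent goes last: powering it off earlier would disconnect protected VMs.
    steps.append(f"Power off {agent_vm} on {host}")
    return steps


runbook = maintenance_runbook(
    "esx01", "esx02", ["pci-web01", "pci-db01"], "vshield-agent-esx01"
)
for number, step in enumerate(runbook, start=1):
    print(f"{number}. {step}")
```

The target host needs its own running vShield Agent, which should already be the case if an agent is installed on every host in the cluster.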
I said already that vShield Zones is a 1.x product. It’s a great firewall, but it has a few gotchas that you need to consider. The benefits may outweigh the negatives. But vSphere is a 4.0 product. Some of this should be able to be addressed by tweaking vCenter or host settings.
vShield Zones should be smart enough to allow us to select specific port groups to protect rather than an entire vSwitch. I guess whatever scripting is being done in the background will need to be changed for this. Maybe we need a Ghetto vShield?
One of the REALLY smart people at VMware should be able to tell us the “order of migration” when a host is placed in Maintenance Mode. Once that is determined, there is probably a configuration file somewhere that we could tweak to change it.
There should be a way to set up automatic startup and shutdown of individual VMs. The Startup/Shutdown settings were sort of deprecated once DRS was introduced. The only time they are useful is with a stand-alone server or in a NON-DRS cluster. I guess the only thing that could be done is to add a script somewhere in rc.d or rc.local to start up these VMs, but how can that be done in a “supported” fashion with ESXi, and is it supported in either ESX or ESXi?
I brought these issues up with some VMware engineers and they assure me that they are working on this. Hopefully they will figure it out soon. I hate doing things manually. It seems like it is anti-cloud.