A question regarding VMware ESX Server

Discussion in 'Windows Desktop Systems' started by gballard, Apr 30, 2004.

  1. gballard

    gballard Moderator

    Messages:
    549
    We have started to implement some systems here at work using VMware ESX Server 2. The problem is...the team leader of the Unix group here has convinced or bullied our supervisor into giving him control of all servers running VMware. He justifies this by saying that Linux controls all the operations of the VMware ESX servers and therefore he should have total control of the boxes. This leaves us Intel Server Group folks in a bit of a quandary because we have to depend on him to install ESX and configure it correctly, and then he gives us access to install the guest Windows 2000 and 2003 operating systems. So far we have one of these implementations in production, and the base VMware ESX server is nowhere near right as far as how it was set up and configured. I have been trying to find concrete evidence that the Linux kernel is not controlling the entire VMware ESX server. I found a white paper on VMware's site that gave a little information about the inner workings of the product. In the section titled definitions, the only reference to Linux I see is in the Service Console, where it says it is based on Red Hat 7.2...as for the kernel that acts as the host for the virtual machines...it refers to the VMkernel as a proprietary microkernel that acts as a host for the virtual machines. If this is the case, the Linux kernel isn't acting as a host for the virtual machines at all. Can anyone direct me to more detailed information so I can have bulletproof data supporting my group's position that we should have control of the VMware ESX Servers, since they will be hosting Win2k and Win2k3 guest operating systems?
     
  2. fitz

    fitz Just Floating Along Staff Member Political User Folding Team

    Messages:
    4,076
    Location:
    Chicagoland
    In my opinion, there is no easy answer for this (there rarely is an easy answer for a mostly political issue).

    Again, in my opinion, the best management solution is one of teamwork and cooperation between the administrator of the host and the administrator of the guests. I believe that they are two distinct and separate entities that should be managed separately.

    However, the administrator of the host ESX server needs to realize that the administrator of the guests requires some degree of control over the properties of the virtual guests. If the administrator of the ESX host creates all his virtual machines owned by the root account, that is a bad thing. VMware's own best practices for ESX v2 (http://www.vmware.com/pdf/esx2_best_practices.pdf) state the following:

    "Avoid having virtual machines owned by root... Design a user and group structure that allows your IT staff to manage the virtual machines they own. Those personnel should have individual logins so that there is an audit trail in the event of problems."
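One way to check for this is a short script that walks the directory tree holding the virtual machine configuration files and lists any that are owned by root. What follows is only a minimal Python 3 sketch: it assumes the configuration files end in .vmx and sit under a directory you name yourself, and the function name find_root_owned and the owner_of parameter are made up for illustration, not anything from VMware's documentation.

```python
import os
from typing import Callable, List


def _uid_of(path: str) -> int:
    """Return the numeric owner uid of a file (0 means root on Unix-like systems)."""
    return os.stat(path).st_uid


def find_root_owned(top: str, suffix: str = ".vmx",
                    owner_of: Callable[[str], int] = _uid_of) -> List[str]:
    """Return sorted paths under `top` that end in `suffix` and are owned by uid 0.

    `suffix` is an assumption about how the VM config files are named;
    `owner_of` can be swapped out, e.g. to try the function without real root-owned files.
    """
    hits = []
    for dirpath, _dirnames, filenames in os.walk(top):
        for name in filenames:
            if name.endswith(suffix):
                path = os.path.join(dirpath, name)
                if owner_of(path) == 0:
                    hits.append(path)
    return sorted(hits)


# Example use (the directory is hypothetical; point it at wherever the VMs live):
#   for path in find_root_owned("/home"):
#       print("owned by root:", path)
```

Anything the script reports is a candidate for being handed over to an individual login, which is what gives you the audit trail the best practices document talks about.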

    Now, should your group be allowed to create your own virtual machines? Honestly, I don't think that makes sense. If you allow people to create their own virtual machines, that opens a whole can of worms and too many hands are in the administration pot.

    My advice is to let the Unix group create the virtual machines and grant your group access to manage those machines. Keep a paper/email trail of every request you make, stating the requirements for the virtual machines you need created as specifically as you can. If/when they don't meet your requirements, go back to them with the email/paperwork and ask them to correct the problems. If they are unable to meet your requirements, go back to your supervisor with the paper/email trail that shows what you need and what need is not being met, and either ask them for more direct input or ask for a more knowledgeable VMware administrator.

    Our own implementation of ESX here follows this model. I don't have root access on the ESX server, and I request that specific machines be set up. I have also used the paper trail to make my case that someone more competent needs to manage the host environment, and I have gotten changes made. You may not be as successful in your environment (again, this gets back to a political issue), but I wish you the best of luck.
     
  3. gballard

    Well, it gets even better...I found out that the Unix group isn't going to be doing the initial install of ESX on these servers. One of our vendors is going to be doing the install. So maybe they will be set up correctly, but who knows about the permissions and so forth...
     
  4. fitz

    That sounds like a *good* thing to have the vendors come down and handle the ESX install. Try to get involved with the installation process and have them consult with your group to gather your needs and expectations of the ESX environment. It's nice to have the "experts" handle the install and configuration, but make sure they know what you need in order to make the right configuration choices for your particular purposes.
     
  5. gballard

    Well, the problem is...once the system is installed and in production and something goes wrong with the ESX server box...who will troubleshoot it? In an ideal situation...the group responsible for the box would, but in our environment here...the Unix group will more than likely pawn it off on my group...the Intel Server group...and we are technically just responsible for the guest OSes...
     
  6. fitz

    Who Supports What?

    Yes, support can be an issue. While we haven't run into that problem in supporting our ESX environment, we have run into that problem with other products that blur the support lines between groups.

    For our ESX virtual machine environment, we are lucky in that our Unix operations folks have taken ownership of the physical host. They are responsible for the initial creation and any host-level configuration, maintenance, monitoring, etc. For example: they are responsible for making sure the host machine is running and stable. Unless the host is grossly misconfigured (i.e., running more virtual machines than the host hardware can support), there is nothing a virtual machine can do that will affect the stability of the host.

    Simply put, if there is a problem within any given guest OS, it shouldn't affect the host (that is one of the main purposes of virtualization of hardware!). Thus, it is our problem. If all the guests are having problems and nothing is misconfigured, it becomes a host issue that they (having taken ownership) need to resolve.

    I understand your situation might be different. The only other advice I will try to give you is to spend some serious time talking with them, outlining any and all possible scenarios that you can think of, and coming to an agreement with them on how to proceed. Get these things in writing! I cannot stress that enough. Get a document, an SLA, a commitment, whatever you want to call it, that details these agreements. This should include not only WHAT they support (hardware, software, troubleshooting roles) but also HOW and, more importantly, WHEN they will commit to fixing it. Granted, you can never guarantee a problem will be fixed in any certain amount of time, so what we have been doing is committing to 1) responding to a problem within a certain time frame and 2) providing timely status updates. For example: in a total system failure, someone is committed to respond and start working on a solution within 2 hours, regardless of time or day (24/7 support). If the system cannot be repaired within 2 hours, they shall notify the necessary parties every 2 hours with a status update. If there is a lesser fault, say, a redundant power supply fails, they will respond within 4 hours during normal business hours and commit to a resolution within 1 business day.
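The example commitments above can also be written down as data, so both groups are reading the same numbers. This is only a rough Python sketch: the class, field, and severity names are made up for illustration, only the hour figures come from the example above, and it assumes the clock for status updates starts when the problem is reported. It does not try to work out business-hours deadlines, since that needs a calendar both groups have agreed on.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from typing import List, Optional


@dataclass(frozen=True)
class Commitment:
    respond_within: timedelta           # time allowed before someone starts working
    update_every: Optional[timedelta]   # None = no periodic status update promised
    around_the_clock: bool              # True = 24/7, False = business hours only
    resolve_within: Optional[str]       # free text; None = no resolution time promised


# Hypothetical severity names; the figures are the ones from the example above.
SLA = {
    "total_failure": Commitment(timedelta(hours=2), timedelta(hours=2), True, None),
    "lesser_fault": Commitment(timedelta(hours=4), None, False, "1 business day"),
}


def response_deadline(severity: str, reported: datetime) -> Optional[datetime]:
    """Latest time a response is due, or None when it depends on business hours."""
    c = SLA[severity]
    if not c.around_the_clock:
        return None  # needs an agreed business calendar, not modelled here
    return reported + c.respond_within


def status_update_times(severity: str, reported: datetime, count: int) -> List[datetime]:
    """First `count` status update times, counted from the time of the report."""
    c = SLA[severity]
    if c.update_every is None:
        return []
    return [reported + c.update_every * i for i in range(1, count + 1)]
```

For a total failure reported at 3:00am, response_deadline gives 5:00am and status_update_times lists 5:00am, 7:00am, 9:00am, and so on, which is the kind of thing you can point to later when someone asks what was agreed.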

    Once an agreement is reached, let your group sign it and get it signed by their groups.. then get your respective managers to sign and acknowledge it.
    This document is your backup if or, more likely, when something goes wrong.

    My final note: if something goes wrong and you know how to fix it, but it is not your responsibility according to the agreement to fix it - do NOT go off and fix it yourself. Go and find the one who IS responsible and work with them to fix it. It's slower and more frustrating, but better in the long run - trust me. I've learned this many times through personal experience. Oh, the stories I could tell about that...

    I don't know how feasible this solution is and it does make life a little more political.. how far and how formal you take this process is up to you. We have over 8000 internal users that we support in Canada and in the US spread out from coast to coast. These 8000 users are only our small division of a larger holding company that totals over 300,000 users world-wide. We are forced to do things bigger and, by nature, deal with more politics than you might...

    Anyway, it's late for me and I need to go to bed. I hope I've typed my thoughts clearly.. it's about 3:00am and sometimes this late I tend to ramble..
     
  7. gballard

    Well, our problem here is that although we are one group...there are 3 distinct groups within our group, and they are basically little kingdoms with their own rulers. So basically it comes down to political power plays, and unfortunately middle management doesn't seem to have the stones to stand up and make the groups work together...such is life though, I guess.