VMware Service Expectations & Agreements
The OIT VMware ESX service is maintained 24 x 7 x 365 and is expected to be available to the Customer and Users except during regularly scheduled maintenance windows or emergency updates. Because the ESX environment allows VMs to be migrated online to other ESX hosts, planned outages are expected to be rare: they would be limited to patches/upgrades that require the entire cluster to be down at once and/or other infrastructure outages (network, power, etc.) that affect the entire VMware cluster. Any outages (planned or unplanned) will be posted to our IT Alerts page.
The OIT VMware ESX infrastructure has been designed to provide redundancy through the following features:
- Automatic load balancing across the VMware cluster using VMotion. When an ESX host becomes overloaded, the system is configured to automatically migrate VMs to other ESX hosts within the cluster. During the VMotion of the VM from one ESX host to another, the VM remains online at all times and is not impacted.
- Automatic failover after an ESX host failure using VMware’s HA feature. In the event of an ESX hardware failure, all VMs that were running on that ESX host are restarted on another ESX host within the cluster. This is the equivalent of a reboot of the server, so each affected VM will see a brief outage for the length of the restart of that server/service.
- The capacity of all OIT ESX clusters is maintained at an n-1 level. This means that if a single ESX host fails, there is enough capacity to run all VMs in that cluster on the remaining n-1 ESX hosts without any expected performance impact. If a second ESX host within a cluster were to fail, we would evaluate the capacity and determine whether we need to begin shutting down copper VMs until the host issues are resolved.
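The n-1 rule above can be illustrated with a small capacity check. This is only a sketch under simplifying assumptions: it compares one aggregate resource figure (labelled GHz here), ignores memory and per-host placement limits, and uses a hypothetical function name and made-up numbers. It is not the method OIT uses to size clusters.

```python
from typing import Sequence


def survives_single_host_failure(
    host_capacity_ghz: Sequence[float],
    vm_demand_ghz: Sequence[float],
) -> bool:
    """Return True if total VM demand still fits after losing any one host."""
    if len(host_capacity_ghz) < 2:
        # A single-host cluster has no n-1 headroom at all.
        return False
    # Worst case: the host that fails is the one with the most capacity.
    remaining = sum(host_capacity_ghz) - max(host_capacity_ghz)
    return sum(vm_demand_ghz) <= remaining


# Illustrative numbers only (assumed, not taken from any real cluster).
hosts = [40.0, 40.0, 40.0, 40.0]
vms = [10.0] * 11
print(survives_single_host_failure(hosts, vms))
```

With four equal hosts, the check passes as long as total VM demand stays within the capacity of three of them.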
As discussed above, the environment is designed, and its capacity maintained, at an n-1 configuration. Combined with the HA and VMotion tools in use, this allows us to maintain a consistent performance expectation. However, if a specific VM is undersized, it is very likely to see performance issues, just as an undersized physical server would.
Enterprise backups are configured and maintained at the VM level, as if the VM were a physical server. It is the responsibility of the assigned Sys Admin to configure and manage these backups.
As discussed above, the HA functionality keeps the failure of any single ESX host from causing a prolonged service outage. However, this failover functionality is limited to a single ESX cluster. At this time, any failover between clusters must be designed at the application level, so that the application can either load balance across, or fail over to, a VM in another cluster.
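As a rough illustration of application-level failover between clusters, the sketch below picks the first endpoint that passes a health probe. The hostnames, the `choose_endpoint` function, and the stand-in probe are all hypothetical; a real design would supply its own health check (for example a TCP connect or a status page request) and might use a load balancer instead.

```python
from typing import Callable, Iterable, Optional


def choose_endpoint(
    endpoints: Iterable[str],
    is_healthy: Callable[[str], bool],
) -> Optional[str]:
    """Return the first endpoint that passes the health probe, else None."""
    for endpoint in endpoints:
        try:
            if is_healthy(endpoint):
                return endpoint
        except Exception:
            # A probe that raises is treated the same as an unhealthy endpoint.
            continue
    return None


# Hypothetical hostnames: one VM per cluster, listed in order of preference.
ENDPOINTS = ["app-cluster-a.example.edu", "app-cluster-b.example.edu"]


def fake_probe(host: str) -> bool:
    # Stand-in for a real check; pretends the VM in cluster A is down.
    return host != "app-cluster-a.example.edu"


print(choose_endpoint(ENDPOINTS, fake_probe))
```

Passing the probe in as a parameter keeps the selection logic independent of how health is actually measured.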
For a Self Supported VM, OIT provides the VM itself (i.e. the replacement for a physical server) and the associated Standard VMware System Administration. All OS and application installation, patching, configuration, support, etc. are expected to be provided by the VM owner.
VM Owner is responsible for:
- Full OS-level system administration
- Installation and all patching of the OS
- Troubleshooting and correcting any OS-level issues
- Installation and management of any OS-level configurations required by the application
- Registering the hostname with DNS
- Requesting and managing enterprise backups
- Any and all application-level support, configuration, or patching
VMWARE OS SYSTEM ADMINISTRATION
Scope of services
OIT-Systems Infrastructure supports the day-to-day operations of the server and OS through the maintenance and support of the operating system (OS) and OS-related components. There are 2 levels of OS System Administration offered:
- 8-5 (Monday-Friday 8 AM to 5 PM) – This offering is intended for services and applications that are not expected to be supported outside of regular business hours. Typically these VMs are for tolerant services and/or test/dev VMs.
- 24/7 – This offering is intended for production services and applications that are expected to be supported 24/7. Outside of business hours, staff will be paged as required to resolve emergency issues.
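The two support levels can be expressed as a simple coverage check, for instance when deciding whether an alert should page staff. This is a minimal sketch: the function name is hypothetical, times are assumed to be local, the 5 PM boundary is treated as exclusive, and holidays are not considered because the text above does not address them.

```python
from datetime import datetime, time

BUSINESS_START = time(8, 0)   # 8 AM
BUSINESS_END = time(17, 0)    # 5 PM, treated as exclusive (an assumption)


def is_supported(when: datetime, level: str) -> bool:
    """Return True if `when` falls inside the support window for `level`."""
    if level == "24/7":
        return True
    if level == "8-5":
        is_weekday = when.weekday() < 5  # Monday=0 .. Friday=4
        return is_weekday and BUSINESS_START <= when.time() < BUSINESS_END
    raise ValueError(f"unknown support level: {level!r}")


# A Saturday morning is outside the 8-5 window but inside 24/7 coverage.
saturday = datetime(2024, 1, 6, 9, 0)
print(is_supported(saturday, "8-5"), is_supported(saturday, "24/7"))
```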
For VMs that are going to be used for webhosting, see the Webhosting Sys Admin option which covers the additional requirements for this unique environment.
OIT-SI Sys Admin responsibilities:
- Define/Design/Provision the VM
- Install the Operating System and any OS-related options or services
- Maintain current security patch levels
- Configure and maintain enterprise backups (nightly unless otherwise agreed to with the VM Owner)
- Monitor the health of systems
- Provide responses to production-level issues during defined support hours (24/7 or 8-5)
- Provide 8-5 responses to test/development issues, upgrades, and/or configuration changes
- Document all server OS-related changes
- Maintain a strong working relationship with the Service/Application prime
WEBHOSTING SYS ADMIN EXPECTATIONS
Scope of services
In addition to the Standard VM administration and Standard OS System Administration, Linux web hosting VMs require an additional level of System Administration for the configuration and software support required for these environments. There are 2 levels of Linux OS System Administration offered:
- 8-5 (Monday-Friday 8 AM to 5 PM) – This offering is intended for Linux web hosts that are not expected to be supported outside of regular business hours. Typically these Linux VMs are for tolerant-level sites and/or test/dev sites.
- 24/7 – This offering is intended for production sites that are expected to be supported 24/7. Outside of business hours, staff will be paged as required to resolve emergency issues.
Webhosting Sys Admin responsibilities (in addition to Linux OS Sys Admin):
- Configure and maintain the Linux web server environment (Apache), including monitoring and alerting
- Configure and maintain MySQL database servers as needed for the web hosting environment
- Create and configure the Linux virtual host/website environments
- Install and set up Shibboleth
- Register DNS
- Purchase and install SSL Certificates
- Assist VM Owner and/or Duke Web Services to troubleshoot site issues
Site content, including any site-specific code and CMS/application patching or upgrades, is expected to be provided and maintained by the VM Owner and/or Duke Web Services. This work is not included in the cost of the Webhosting System Administration service.
Storage, Backups & Hosting