OIT VM Hosting - Monitoring and Alarm Remediation

OIT offers monitoring for any of its hosted VMs upon request. This article lists what is included in the basic, default monitoring that is configured when monitoring is requested during a VM's creation in Clockworks.

Default Linux Monitoring

The default Linux monitoring checks include Up/Down, File System Usage, and SSH.

Default Windows Monitoring

The default Windows monitoring checks include Up/Down, Disk, and Remote Desktop.
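For readers who want a feel for what checks of this kind measure, here is a minimal Python sketch. It is not OIT's implementation and does not use Icinga, SCOM, or Prometheus. It assumes that an SSH or Remote Desktop check can be approximated by a TCP connection to the standard ports (22 and 3389), and the warning and critical thresholds are hypothetical values chosen only for illustration.

```python
import shutil
import socket


def port_is_open(host: str, port: int, timeout: float = 3.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and name resolution failures.
        return False


def disk_usage_percent(path: str) -> float:
    """Return the percentage of the file system containing path that is in use."""
    usage = shutil.disk_usage(path)
    return 100.0 * usage.used / usage.total


def classify_usage(percent: float, warn: float = 80.0, crit: float = 90.0) -> str:
    """Map a usage percentage to a status. The thresholds are hypothetical."""
    if percent >= crit:
        return "CRITICAL"
    if percent >= warn:
        return "WARNING"
    return "OK"
```

For example, port_is_open("vm-example-01", 22) would approximate an SSH check against a hypothetical Linux host, port_is_open("vm-example-01", 3389) a Remote Desktop check against a Windows host, and classify_usage(disk_usage_percent("/")) a file system usage check.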

Monitoring is set up through a combination of Icinga, SCOM, and Prometheus. Additional monitoring of specific processes and transactions can be put in place if requested. To request additional monitoring (or to remove monitoring), use the OIT – Monitoring Request form, which can be found in the ServiceNow Service Request Catalog.

Log in to Support@Duke > On the left under Self Service, click Service Request Catalog > Under the section Enterprise Monitoring, click Request For - OIT Monitoring > Complete the form and click Request Now on the top right.

 

Maintenance Mode

Placing a host in Maintenance Mode suppresses any alarms for that host.

If work is being done on a host, it should be placed in Maintenance Mode. This can be done via Enso or by contacting Operations-Service Ops Center-OIT (either by a ServiceNow request or by phone).

If a host requires a recurring maintenance schedule, submit a request to Operations Management-OIT to have that configured.
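The effect of Maintenance Mode on alarms can be pictured with a short Python sketch. This is only a conceptual illustration under stated assumptions: the MaintenanceWindow class, the should_notify function, and the host name are hypothetical, and nothing here reflects how Enso or the monitoring tools implement suppression.

```python
from dataclasses import dataclass
from datetime import datetime
from typing import Iterable


@dataclass
class MaintenanceWindow:
    """Hypothetical record of a period during which a host is in Maintenance Mode."""
    host: str
    start: datetime
    end: datetime

    def covers(self, host: str, when: datetime) -> bool:
        # The window applies only to its own host, from start up to (not including) end.
        return host == self.host and self.start <= when < self.end


def should_notify(host: str, when: datetime, windows: Iterable[MaintenanceWindow]) -> bool:
    """Return False when an alarm for host at time when falls inside any window."""
    return not any(w.covers(host, when) for w in windows)
```

With a window from 01:00 to 03:00 on a given host, an alarm raised at 02:00 for that host would be suppressed, while an alarm at 04:00, or an alarm for a different host, would still produce a notification.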

 

Alarm Remediation

The Service Operations Center (SOC) responds to alarms from monitored hosts. To have notifications adjusted, submit a ServiceNow request to Monitoring-OIT. If there are particular steps that the SOC should take for a given alarm, the Service Management-OIT team will document them.

 

Performance Monitoring Graphs

Performance monitoring is provided via Grafana, which can be accessed at https://graphs.oit.duke.edu.

Article number: KB0023678

Valid to: July 27, 2024