Service Level Agreement for Enterprise Monitoring and Metrics

Article Number : KB0016979
Published on : 2020-08-06
Last modified : 2020-08-06 14:14:50
Knowledge Base : IT Public Self Help

Purpose of Agreement

This Service Level Agreement ("SLA") defines the services and service levels between ITS Systems ("Provider") and the users ("Customer") of Enterprise Monitoring and Metrics services ("Service"). Eligible customers are the IT units within the Financial and Administrative Services (“FAS”) portfolio at the University of Texas at Austin. This SLA covers service-related terms and conditions, costs, and roles and responsibilities, and provides a framework for communication, problem escalation, and service resolution. The SLA will be reviewed annually and updated for accuracy.

Service Scope

Enterprise Monitoring and Metrics services provide monitoring and metrics tied to actionable alerts for data center devices, web applications, and campus services.

Term of the Agreement

The SLA will be reviewed and modified, where appropriate, on an annual basis. The SLA limits services to those specifically described in this document.

Assumptions

  1. Under normal circumstances, Provider staff are available Monday - Friday, 8 a.m. – 5 p.m., excluding all holidays and university closures.
  2. Operational support with no reasonable risk of causing disruption of a service will take place during normal business hours, Monday - Friday, 8 a.m. – 5 p.m.
  3. After-Hours requests for support and emergency support will be fulfilled on a best-effort basis. Priorities will be determined based on urgency and level of impact.
  4. Occasionally, it is necessary for Provider to escalate a problem to another entity. In these instances, Provider cannot guarantee the response time of the other entities. Provider will continue to act as the point-of-contact for cases that require support from an outside entity.
  5. Provider is not responsible for downtime or unplanned outages caused by an outside entity.
  6. Provider response to priority requests or incidents may delay response to other requests. Provider will prioritize and process an incoming incident request within covered service hours if it meets any one of the following criteria:
    • Number of people affected.
    • Percentage of total tasks that can no longer be performed by individuals.
    • Academic and Administrative Calendar deadlines.
    • Impact on course delivery.
    • Risk to safety, law, rule, or policy compliance.
  7. Should there be a dispute about service rendered, escalations can be requested by the Customer to the direct management of Provider.

 

Cost

Service is provided at no cost to IT units within the Financial and Administrative Services (“FAS”) portfolio at the University of Texas at Austin.

 

Requests

A Service Request means any request made by the Customer to the Provider for routine operational support. To make a Service Request, the Customer must create a ticket in the UT Service Desk ticketing system.

Note: Customers requesting new monitoring and metrics services must submit a request online using the appropriate intake form on the Service Catalog page.

During normal business hours, Service Requests will be responded to within one (1) business day after notification to the Provider. Service Request changes will be made during normal business hours. Requests made after normal business hours will be responded to the following business day. If a Service Request is not responded to within the response times outlined above, the Customer may escalate by directly contacting ITS Service Desk. Please refer to the ticket number when escalating.

 

Incident Reporting

An Incident means any interruption of the normal function of EMM services. One of the following methods may be used for reporting an incident:

During normal business hours, Incidents will be responded to within four (4) hours after notification to the Provider. Reports made after normal business hours may not be processed until the following business day. After-hours requests for support and emergency support will be fulfilled on a best-effort basis. If an Incident is not responded to within the response times outlined above, the Customer may escalate by directly contacting ITS Service Desk. Please refer to the ticket number when escalating.

Note: When calling outside of regular business hours, you can leave a voicemail or, where applicable, follow the phone prompts to be connected to an after-hours operator. Be sure to provide your name and department, and indicate that you are a customer of Zenoss, Splunk, or ThousandEyes services. This information helps the operators more quickly identify the correct person to escalate the call to.

 

Routine Maintenance

Because the central IT environment is regularly updated to allow for growth and change in the use of information technology, the Customer must expect routine maintenance to be scheduled periodically to comply with new standards and upgrades. Provider will schedule with Customer in advance when such work is needed, with recurring maintenance windows established for such work. Growth or change initiated by the Customer may warrant a Service Review and new Consultation of their current environment.

Zenoss:

  • The monitoring team will notify customers about both scheduled and unscheduled maintenance via zenoss-notice@utlists.utexas.edu.  Services may not be available during maintenance periods.
  • Scheduled maintenance occurs every Tuesday from 12 pm – 2 pm. To the maximum extent possible, installation of service, application, and security updates will be performed during scheduled maintenance.

Splunk:

  • Scheduled maintenance occurs on Thursdays 10 am - 12 pm.
  • Service announcements will be sent to splunk-notice@utlists.utexas.edu. This includes disruptive maintenance, major version upgrades, announcements of new product features, and service incidents.

ThousandEyes:

  • Scheduled maintenance occurs on alternating Tuesdays at 6 pm PST, except for emergency updates. This maintenance window is set by the third party “ThousandEyes, Inc.” and cannot be canceled or moved.
  • The Enterprise Agent will have an ongoing maintenance window each Monday at 10 am.
  • Customers can stay apprised of the ThousandEyes service via the ThousandEyes status page or via Twitter (following tweets from the @ThousandEyesOps handle). Scheduled and unscheduled UT-specific maintenance will be communicated via thousandeyes-notice@utlists.utexas.edu.
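
For customers who want to suppress their own alerting or reporting during the fixed weekly windows listed above, the following is a minimal sketch of how the Zenoss and Splunk windows could be encoded and checked. It is an illustration only, not a Provider tool: the names WEEKLY_WINDOWS and in_scheduled_maintenance are hypothetical, times are assumed to be campus local time, and ThousandEyes is omitted because this SLA does not state how long its windows last.

```python
from datetime import datetime, time

# Weekly windows as stated in this SLA: (weekday, start, end), with Monday = 0.
# Assumption: times are campus local time; the SLA does not name a time zone
# for the Zenoss and Splunk windows.
WEEKLY_WINDOWS = {
    "zenoss": (1, time(12, 0), time(14, 0)),  # Tuesday, 12 pm - 2 pm
    "splunk": (3, time(10, 0), time(12, 0)),  # Thursday, 10 am - 12 pm
}


def in_scheduled_maintenance(service: str, when: datetime) -> bool:
    """Return True if `when` falls inside the service's weekly window.

    The start of the window is inclusive and the end is exclusive.
    """
    weekday, start, end = WEEKLY_WINDOWS[service.lower()]
    return when.weekday() == weekday and start <= when.time() < end
```

For example, in_scheduled_maintenance("zenoss", datetime(2020, 8, 11, 12, 30)) checks a Tuesday at 12:30 pm. Unscheduled maintenance is announced by email and cannot be derived from a table like this.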

 

Business Recovery and Continuity
Provider does not provide disaster recovery or business continuity planning for Customers. The Information Security Office has developed a comprehensive risk management program that focuses on proactive risk reduction in compliance with university rules and policies as well as all relevant state and federal laws. This program helps units identify and monitor risk to information resources on campus and develop strategies to manage that risk over time to include business recovery and continuation. For more information, see Risk Management Services.

 

Security
Provider will make recommendations based on data classification and risk assessment as determined by the Customer. Provider will implement Customer-approved solutions to protect the data. All security controls should be proportional to the confidentiality, integrity, and availability requirements of the data processed by the system. Provider will not be held liable for loss or compromise of data due to improper data security controls. Please see the Campus IT Policies website for more information on minimum security standards.

 

Provider Responsibilities

Splunk

  1. Communicate service-related announcements.
  2. Ensure service remains on a consistent update cycle to provide the latest features and capabilities to customers.
  3. Provide two (2) hours of consultation to new customers to gather requirements and provide an overview of the service offering.
  4. Perform planned maintenance on a scheduled basis during the designated maintenance window.
  5. Assist customer with troubleshooting any issues encountered while using the Splunk web console.
  6. Provide common good resources within Splunk such as dashboards and reports for general use.
  7. Evaluate app and add-on requests from customers to determine their value and need.
  8. Monitor Splunk environment to ensure reliability and stability of service.

Zenoss

  1. Communicate service-related announcements to customers.
  2. Ensure service remains on a consistent update cycle to help stay within vendor supported versions.
  3. Provide one (1) hour of consultation to new customers to gather requirements and provide an overview of the service offering.
  4. Perform planned maintenance on a scheduled basis during the designated maintenance window.
  5. Set up customer devices for monitoring and alerting.
  6. Monitor Zenoss service to ensure reliability and stability of service.

ThousandEyes

  1. Ensure that updates and communication provided by “ThousandEyes, Inc.” are communicated to all customers on a timely basis.
  2. Provide one (1) hour of consultation to new customers to gather requirements and provide an overview of the service offering.
  3. Perform planned maintenance on a scheduled basis during the designated maintenance window.
  4. Assist customer with troubleshooting any issues encountered while using the ThousandEyes web console.
  5. Create and maintain tests and alerts for customers as requested.

 

Customer Responsibilities

Splunk

  1. Respond to Provider inquiries in a professional and timely manner.
  2. Adhere to relevant University acceptable use and security policies and standards related to the acquisition, development, testing, implementation, and production usage of servers, software, networking, related systems, or data stored on their respective systems.
  3. Customers are responsible for sending data to Splunk. This includes installing, configuring, and maintaining the Splunk Universal Forwarder.
  4. Customers are responsible for writing searches that run efficiently and do not consume excessive resources within Splunk.

Zenoss

  1. Assign and maintain a current on-site departmental technical contact for Provider.
  2. Respond to Provider inquiries in a professional and timely manner.
  3. Agree to a maintenance window for scheduled maintenance.
  4. Adhere to relevant University acceptable use and security policies and standards related to the acquisition, development, testing, implementation, and production usage of servers, software, networking, related systems, or data stored on their respective systems.
  5. Regularly check email for service announcements.
  6. Provide and maintain contact information for support purposes.
  7. Ensure firewalls and protocols which allow device monitoring are functioning properly.
  8. Notify the provider team if devices are decommissioned or have new IP addresses.
  9. Notify the provider team if users leave and should no longer have access.
  10. Assign and provide contact information for a secondary TSC that will assume all responsibilities of the primary TSC when the primary TSC is unavailable.
  11. Use supported client software.
  12. For purposes of resolving customer issues, be willing and available to provide information in a timely manner when requested.

ThousandEyes

  1. Assign and maintain a current on-site departmental technical contact for Provider.
  2. Respond to Provider inquiries in a professional and timely manner.
  3. Agree to a maintenance window for scheduled maintenance.
  4. Provide timely notification when employees have left the University and need to be removed from the ThousandEyes service.
  5. Adhere to relevant University acceptable use and security policies and standards related to the acquisition, development, testing, implementation, and production usage of servers, software, networking, related systems, or data stored on their respective systems.
  6. Provide timely notification when tests and alerts are no longer needed or a service no longer needs monitoring.
  7. Customer should ensure that test activity is being monitored and alerts are being acted upon.

Appendix A: Definitions

  • Customer – individual users or units that use the Enterprise Monitoring and Metrics services.
  • CSU – Colleges, Schools, and Units
  • Provider – the provider team for all Enterprise Monitoring and Metrics services.
  • Service Account Manager – designated individual on the Provider team that will act as the business service liaison with the Customer.
