Getting Started with Zenoss Cloud
Questions not answered here? Read the Zenoss Cloud FAQ
Overview
Table of Contents
What are the features of Zenoss Cloud?
What is it for?
Zenoss Cloud is an optional infrastructure monitoring service managed by the ITS Campus Solutions Monitoring Team. The service provides basic monitoring of CPU, disk, memory, and network utilization for networked devices, including hosts. Many advanced and non-standard services (such as web monitoring and metrics generation) are not officially supported by the Monitoring Team.
Looking for something else?
Getting Started
Enrolling a GROUP into Zenoss Cloud
If your academic unit or group is not yet set up in Zenoss Cloud and you would like it to be, a few steps must be completed before users and hosts can be added. If you just need to add a new user to an existing group, skip ahead.
Customers are responsible for informing the Monitoring Team if the group's name changes, if the group moves under a different sponsor, or if team members need to be added or removed. Although Groups use AD for authentication and access, users and groups are layered across the platform, and admins must maintain those layers for multi-tenancy to work with ADFS. For this reason, users cannot supply their own AD group.
Group Name:
Members:
Second, we need to set up alerting for your devices. As a courtesy, the Monitoring Team will add devices for newly enrolled Groups, but users are expected to add, remove, and maintain their own monitors.
Once you have created an account and the test monitor has been verified as working, log in to the Cloud instance at https://ut.zenoss.io/cz0/zport/dmd/itinfrastructure with your UT email. You'll then see the familiar interface of the Collection Zone, analogous to the Resource Manager in the on-prem Zenoss instance.
Click the "+" sign in the Infrastructure view and, in the pulldown, select Add a Single Device.
Hostname or IP Address: Always use the FQDN. Never use an IP address, as it creates serious problems down the road.
Collector: AUSTIN
Production Status: Keep this as Pre-Production until you are ready to begin monitoring, or leave it as Production if you want monitoring to begin immediately. TEST and DEV hosts should be kept in Out-of-Service except when testing code changes as part of the development cycle. Users are on the honor system to keep them in this state to ensure there are ample resources for everyone.
Device Priority: Typically Normal. High and Highest will prompt UDC Operator calls when alerts trigger, but only if the hosts are added to a service within Sysmod.
Systems: This is essential, as alerting is mapped to this location. Choose the directory in which you want your hosts to be found within Zenoss.
Leave everything else as default and click Add.
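For groups that add many hosts, the same fields can be prepared programmatically. The sketch below only validates the hostname and builds a request body; it sends nothing. It assumes the Collection Zone exposes the standard Zenoss JSON API method `DeviceRouter.addDevice`, and that the numeric production state and priority codes are the stock Zenoss defaults, which should be confirmed against this instance. The hostname, device class, and system path are hypothetical placeholders.

```python
import ipaddress
import json

# Stock Zenoss numeric codes (assumption: this instance has not changed them).
PRODUCTION_STATES = {"Production": 1000, "Pre-Production": 500, "Maintenance": 300}
PRIORITIES = {"Normal": 3, "High": 4, "Highest": 5}


def is_fqdn(name: str) -> bool:
    """Return True for a dotted hostname, False for a bare name or an IP address."""
    try:
        ipaddress.ip_address(name)
        return False  # IP addresses must not be used as device names
    except ValueError:
        pass
    labels = name.rstrip(".").split(".")
    return len(labels) >= 2 and all(
        label and all(c.isalnum() or c == "-" for c in label) for label in labels
    )


def build_add_device_payload(hostname, device_class, system_path,
                             state="Pre-Production", priority="Normal",
                             collector="AUSTIN"):
    """Build the JSON body for an addDevice call; raises ValueError on a non-FQDN."""
    if not is_fqdn(hostname):
        raise ValueError(f"{hostname!r} is not a fully qualified domain name")
    return {
        "action": "DeviceRouter",
        "method": "addDevice",
        "data": [{
            "deviceName": hostname,
            "deviceClass": device_class,
            "collector": collector,
            "productionState": PRODUCTION_STATES[state],
            "priority": PRIORITIES[priority],
            "systemPaths": [system_path],
        }],
        "tid": 1,
    }


# Hypothetical values for illustration only.
payload = build_add_device_payload("myhost.example.edu", "/Server/Linux", "/MyGroup")
print(json.dumps(payload, indent=2))
```

Rejecting IP addresses before the request is built mirrors the rule above that devices should always be added by FQDN.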
To remove a monitor, click to highlight it in the Infrastructure view and click the "-" button.
Click "Remove".
Day-to-day management of monitors is the responsibility of the user. While the topic can be quite involved, most users should be familiar with the basics.
Production states are for specific situations:
Pre-Production: Monitor is being set up. It does not actively alert, but it will collect metrics and monitor. After two weeks, it should be set to either Production or Out-of-Service. Hosts left in Pre-Production for more than two weeks will be set to Production by policy.
Production: Actively monitoring, collecting metrics, and alerting.
Maintenance: Monitor is in a designated Maintenance Window. It will continue to monitor and collect metrics but will not alert. Like Pre-Production, it will revert to Production after two weeks.
Out-of-Service: Monitor is not in Production, Pre-Production, or Maintenance, but is not intended to be Decommissioned. It will not collect metrics, monitor, or alert. When in doubt, use this state. This is the default for DEV and TEST hosts, except when testing code changes as part of the development cycle.
Decommissioned: Monitor is at end-of-life. It will not collect metrics, monitor, or alert.
Monitors left in Maintenance or Pre-Production for more than two weeks will be put into Production. TEST and DEV hosts left in Production will be set to Out-of-Service as resources demand.
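The two-week rule can be expressed as a small check, which may help when auditing a list of monitors. This is a sketch of the policy as described here, not of how Zenoss or the Monitoring Team enforces it; the function name is hypothetical, and it assumes you track the date each monitor entered its current state yourself.

```python
from datetime import date, timedelta

GRACE_PERIOD = timedelta(weeks=2)
# States that revert to Production once the grace period has passed.
TEMPORARY_STATES = {"Pre-Production", "Maintenance"}


def state_after_policy(state: str, entered_on: date, today: date) -> str:
    """Return the state a monitor would have once the two-week rule is applied."""
    if state in TEMPORARY_STATES and today - entered_on > GRACE_PERIOD:
        return "Production"
    return state


today = date(2024, 3, 20)
print(state_after_policy("Maintenance", date(2024, 3, 1), today))      # 19 days in state
print(state_after_policy("Pre-Production", date(2024, 3, 10), today))  # 10 days in state
print(state_after_policy("Out-of-Service", date(2024, 1, 1), today))   # not a temporary state
```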
Priority can be changed from the default Normal to High or Highest, which can prompt UDC Operators to call the registrants for the service in Sysmod. This can be done per device, using the drop-down menu under the production state.
From the individual view of a device, click on Device Administration in the left-hand column.
If a host changes names or is repurposed with a new OS, REMOVE the monitor and create a new one. If the only change is the IP address, however, select the "*" in the bottom left and choose Reset/Change IP Address. Leave the field blank and click Save.
Changing Device Type
Sometimes you might have a Windows or Linux monitor that is pingable but otherwise isn't working due to some configuration on the server having recently changed. You'll often want to set it to ping-only in order to maintain monitoring until the issue is addressed. Conversely, you might have a monitor that is Ping-only that you may want to change to Windows or Linux.
That should open a box to select the specific device class you want to change the device to. It should only allow you to select the device classes your group owns.
Once you have changed the device class, you can test that everything works as it should by clearing all alerts associated with the device and then remodeling it by choosing Modeling --> Model Device.
What to do when you get an email?
Zenoss Cloud monitoring alerts are actionable items. Red and yellow Events signify something that is broken and in need of attention. It is your responsibility as a consumer of the service to triage, troubleshoot, and resolve the device conditions that led to the event. The Zenoss Cloud monitoring team does not provide device support.
Acknowledging Events
Acknowledging an event is a way to tell others on your team, "I am responding to this alert and am working on it." It also suppresses further alerts while you work on the issue.
You may also acknowledge events by clicking the respective link within the text of the alert email. Click on the Event and then click the CHECKMARK to let your team know that you ACKNOWLEDGE the alert and to suppress further email notifications.
Closing an Event
Users with just a few monitors might not need more, but for users managing a service with 20, 30, or 100 hosts, it becomes crucial to automate these tasks.
To create an API key, log into ut.zenoss.io (aka the Cloud Layer).
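Once a key exists, it can be attached to requests against the Collection Zone. The sketch below builds such a request without sending it. It assumes the key is passed in a `z-api-key` header and that the Collection Zone serves the standard Zenoss JSON router endpoints (such as `device_router`) under the same base path as the login URL; both should be checked against the vendor's documentation. The helper name and the key value are placeholders.

```python
import json
import urllib.request

# Base path taken from the Collection Zone login URL; router endpoint names are assumptions.
CZ_BASE = "https://ut.zenoss.io/cz0/zport/dmd"


def build_router_request(api_key, endpoint, action, method, data):
    """Build (but do not send) a POST request for a Zenoss JSON router call."""
    body = json.dumps(
        {"action": action, "method": method, "data": [data], "tid": 1}
    ).encode("utf-8")
    return urllib.request.Request(
        f"{CZ_BASE}/{endpoint}",
        data=body,
        headers={"Content-Type": "application/json", "z-api-key": api_key},
        method="POST",
    )


# Example: ask for the first ten devices. "YOUR-API-KEY" is a placeholder.
req = build_router_request(
    "YOUR-API-KEY", "device_router", "DeviceRouter", "getDevices", {"limit": 10}
)
print(req.full_url)
# To send it: urllib.request.urlopen(req).read()
```

Keeping request construction separate from sending makes it easy to inspect or log the body before anything touches a production monitor.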
Other Functions
Zenoss Cloud is a very powerful tool with a wide range of uses. With multi-tenancy, users are empowered to customize and experiment with solutions for their individual needs and use cases without concern that doing so will impact other users. That does not mean, however, that they cannot break their own monitors in the process, and users should be mindful of this. They should also be mindful that the Monitoring Team has limited resources and officially supports only basic monitoring of Linux and Windows servers and ping-only endpoints. Users looking for custom solutions are encouraged to read the vendor's documentation and look at the work of other UT customers.