Getting Started with Zenoss Cloud

Article Number : KB0018901

Published on : 2023-12-11

Last modified : 2023-12-11 17:34:54

Knowledge Base : ESM External

Questions not answered here? Read the Zenoss Cloud FAQ

Overview

What are the features of Zenoss Cloud?

Basic monitoring for CPU, Disk, Memory and Network Utilization
Monitored devices polled every five minutes
A web console to see the status of all your Windows and Linux servers in one place, including WARNINGS and ALERTS.
Email alerting to a group email for ALERTS.

What is it for?

Zenoss Cloud is an optional infrastructure monitoring service managed by the ITS Campus Solutions Monitoring Team. The service provides basic monitoring of CPU, Disk, Memory and Network utilization for networked devices (including hosts). Many advanced and non-standard services (such as web monitoring and metrics generation) are not officially supported by the Monitoring Team.

Looking for something else?

ThousandEyes Web Application Monitoring for robust monitoring of on-prem and cloud-based web application services.
Splunk> Metrics Collection and Monitoring for robust collection of hardware and OS environment metrics, which are sent to Splunk> for dashboards and alerting.
Troubleshooting and Ongoing Issues with Zenoss Cloud

Enrolling a new USER into an existing group in Zenoss Cloud

If you already have devices being monitored in Zenoss Cloud and only need to add a new user, send a request to monitoring@its.utexas.edu specifying your group and the person's name, EID, and UT email.

Getting Started

Enrolling a GROUP into Zenoss Cloud

If your academic unit or group is not yet set up in Zenoss Cloud and you would like it to be, a few steps need to be completed before users and hosts can be added. If you just need to add a new user to an existing group, see "Enrolling a new USER into an existing group in Zenoss Cloud" above.

It is the responsibility of customers to inform the Monitoring Team if the group changes its name or moves under a different sponsor, or if team members need to be added or removed. While Groups do use AD for authentication and access, users and groups are also defined in several layers across the platform, and admins must maintain those layers for multi-tenancy to work with ADFS. For this reason, users cannot supply their own AD group.

Group Name:
EXAMPLE: FAS → ITS → Campus Solutions → ESS → Application & Delivery Development

Members:
Standard UT emails for all users of all services within the group

Setting up monitoring and alerting for Services in a Group

Next, alerting needs to be set up for your devices. As a courtesy, the Monitoring Team will add devices for newly enrolled Groups, but users are expected to add, remove, and maintain their own monitors.

Under any given Group, there could be any number of SERVICES. These Services are defined separately from each other. All members of the Group a service belongs to will have console access to the related monitors, but the recipients of alerts will be specified by a single group email maintained by the customer.

Users are expected to maintain their own monitors, including adding and deleting servers. For new services, however, the Monitoring Team will add the initial hosts in one go as a courtesy, because users can only batch-add hosts via the API. Hosts added this way are set to Pre-Production by default, or to Out-of-Service if they are TEST or DEV hosts (*d0* or *t0* hostnames).
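
The naming rule above can be sketched in a few lines of Python. This is only an illustration, not the Monitoring Team's actual tooling: the function names are hypothetical, and matching the patterns against the short hostname (the part before the first dot) is an assumption.

```python
from fnmatch import fnmatchcase

# Patterns quoted in this article for DEV and TEST hosts.
NON_PROD_PATTERNS = ("*d0*", "*t0*")


def is_test_or_dev(fqdn: str) -> bool:
    """Return True if the short hostname matches a DEV or TEST pattern.

    Assumption: only the part before the first dot is checked.
    """
    short_name = fqdn.split(".", 1)[0].lower()
    return any(fnmatchcase(short_name, pattern) for pattern in NON_PROD_PATTERNS)


def default_state(fqdn: str) -> str:
    """Suggest the initial production state for a newly added host."""
    return "Out-of-Service" if is_test_or_dev(fqdn) else "Pre-Production"
```

For example, `default_state("webt01.example.utexas.edu")` suggests Out-of-Service, while `default_state("webp01.example.utexas.edu")` suggests Pre-Production (both hostnames are made up). Note that these glob patterns are broad: a short name such as `prod01` also contains `d0` and would be classified as DEV.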

Zenoss Cloud has limited resources, so TEST and DEV monitors are to be used only for testing changes to services within the application lifecycle and must then be set back to Out-of-Service to free up resources for others. If users abuse this, or if resources can no longer sustain it, TEST and DEV hosts will be removed by admins.

For those needing full-time monitoring of TEST and DEV hosts, consider the Monitoring Team's sister service, ThousandEyes.

Service Name	EXAMPLE: FAS → ITS → Campus Solutions → ESS → Application & Delivery Development → DEM
Official Contact	A group email address (such as a UT List), maintained by the sponsoring unit, that will receive all alerts regarding the service and the Zenoss service. EXAMPLE: dem-team@utlists.utexas.edu
Windows Hostnames	A comma-delimited list of all FQDNs of Windows servers to be monitored.
Linux Hostnames	A comma-delimited list of all FQDNs of Linux servers to be monitored.
Ping-Only Hostnames	A comma-delimited list of all FQDNs of servers to be monitored ONLY by ping (up/down).

Accessing Zenoss Cloud

Once you have created an account, and the test monitor has been verified working, log into the Cloud instance at https://ut.zenoss.io/cz0/zport/dmd/itinfrastructure with your UT email.

You'll then see the familiar interface of the Collection Zone, analogous to the Resource Manager in the on-prem Zenoss instance.

Adding Monitors

Don't forget, the campus-based collector range is 10.157.27.164/30. Make sure you whitelist this network range on any Access Control Lists or firewalls for every host you want to monitor.
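
To see exactly which addresses the /30 above covers, the range can be expanded with Python's standard `ipaddress` module. The helper names are illustrative only.

```python
import ipaddress

# Collector range quoted in this article.
COLLECTOR_RANGE = ipaddress.ip_network("10.157.27.164/30")


def collector_addresses() -> list[str]:
    """List every address inside the collector range."""
    return [str(address) for address in COLLECTOR_RANGE]


def is_collector(ip: str) -> bool:
    """Return True if the given IP address falls inside the collector range."""
    return ipaddress.ip_address(ip) in COLLECTOR_RANGE


print(collector_addresses())
# ['10.157.27.164', '10.157.27.165', '10.157.27.166', '10.157.27.167']
```

A check such as `is_collector("10.157.27.166")` can help when reading firewall logs to confirm that a blocked connection came from a collector.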

To set up monitoring on Linux (SNMP) or Windows (WinRM), please refer to the FAQ and scroll down to Windows Specific Issues or Linux Specific Issues.

Click on the "+" sign in the Infrastructure view and, in the pulldown, select Add a Single Device.

Hostname or IP Address: Always use the full FQDN. Never use IP addresses, as they create serious problems later on.
Device Class: Choose between:

/Server/Linux/UT/$GROUPNAME
/Server/Microsoft/Windows/UT/$GROUPNAME
/Ping/UT/$GROUPNAME

Collector: AUSTIN
Never choose localhost.

Production Status: A common practice is to keep this as Pre-Production until you are ready to begin monitoring, but you can leave it as Production if you want monitoring to begin immediately. TEST and DEV hosts should be kept in Out-of-Service except when testing code changes as part of the development cycle. Users are on the honor system to keep them in this state to ensure there are ample resources for everyone.

Device Priority: Typically Normal. High and Highest will prompt UDC Operator calls when alerts trigger, but only if the hosts are added to a service within Sysmod.

Systems: This is essential, as alerting is mapped to this location. Choose the directory under which you want your hosts to be found within Zenoss.

Leave everything else as default and click Add.
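
The rules above (FQDN only, a device class under one of the three UT paths, never localhost as the collector) lend themselves to a quick pre-check before filling in the form. The following is a minimal sketch with a hypothetical function name; it is not part of Zenoss, and the "contains a dot" test is only a rough stand-in for a real FQDN check.

```python
import ipaddress

# Device class prefixes listed in this article; the group name follows the prefix.
ALLOWED_CLASS_PREFIXES = (
    "/Server/Linux/UT/",
    "/Server/Microsoft/Windows/UT/",
    "/Ping/UT/",
)


def check_device_entry(hostname: str, device_class: str, collector: str) -> list[str]:
    """Return a list of problems with a planned device entry (empty if none)."""
    problems = []
    try:
        ipaddress.ip_address(hostname)
        problems.append("use an FQDN, not an IP address")
    except ValueError:
        # Not an IP address; make sure it at least looks fully qualified.
        if "." not in hostname:
            problems.append("hostname is not fully qualified")
    has_prefix = device_class.startswith(ALLOWED_CLASS_PREFIXES)
    has_group = any(
        len(device_class) > len(prefix)
        for prefix in ALLOWED_CLASS_PREFIXES
        if device_class.startswith(prefix)
    )
    if not (has_prefix and has_group):
        problems.append("device class must be one of the UT paths plus a group name")
    if collector.lower() == "localhost":
        problems.append("never choose localhost as the collector")
    return problems
```

For instance, `check_device_entry("10.1.2.3", "/Ping/UT/EXAMPLEGROUP", "localhost")` reports two problems, while a made-up entry such as `("web01.example.utexas.edu", "/Server/Linux/UT/EXAMPLEGROUP", "AUSTIN")` returns an empty list.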

Removing Monitors

To remove a monitor, select (highlight) it in the Infrastructure view and click the "-" button.

Click "Remove"

Managing Monitors

Day to day management of monitors is the responsibility of the user. While the topic can be quite involved, most users should be familiar with the basics.

Changing Production State

Production states are for specific situations:

Pre-Production – Monitor is being set up. It will collect metrics and monitor but will not alert. Within two weeks, it should be set to either Production or Out-of-Service. Hosts left in Pre-Production for more than two weeks will be set to Production by policy.

Production – Actively monitoring, collecting metrics and alerting.

Maintenance – Monitor is in a designated Maintenance Window. It will continue to monitor and collect metrics but will not alert. Like Pre-Production, it will be set to Production after two weeks.

Out-of-Service – Monitor is not in Production, Pre-Production, or Maintenance, but is not intended to be Decommissioned. It will not collect metrics, monitor, or alert. When in doubt, use this state. This is the default for DEV and TEST hosts, except while testing code changes as part of the development cycle.

Decommissioned – Monitor is at end-of-life. It will not collect metrics, monitor, or alert.

Monitors left in Maintenance or Pre-Production for more than 2 weeks will be put into Production. TEST and DEV hosts left in Production will be set to Out-of-Service as resources demand.
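
The states and the two-week policy described above can be summarized as a small lookup table and one rule. This is a sketch of the behavior as this article describes it, not of how Zenoss implements it; the names are hypothetical, and "more than two weeks" is interpreted as strictly more than 14 days.

```python
from dataclasses import dataclass
from datetime import date, timedelta


@dataclass(frozen=True)
class StateBehavior:
    collects: bool  # monitors and collects metrics
    alerts: bool    # sends alerts


# Behavior of each production state as described in this article.
STATES = {
    "Pre-Production": StateBehavior(collects=True, alerts=False),
    "Production": StateBehavior(collects=True, alerts=True),
    "Maintenance": StateBehavior(collects=True, alerts=False),
    "Out-of-Service": StateBehavior(collects=False, alerts=False),
    "Decommissioned": StateBehavior(collects=False, alerts=False),
}

GRACE_PERIOD = timedelta(weeks=2)


def state_after_policy(state: str, since: date, today: date) -> str:
    """Apply the two-week rule to Pre-Production and Maintenance monitors."""
    if state in ("Pre-Production", "Maintenance") and today - since > GRACE_PERIOD:
        return "Production"
    return state
```

Under these assumptions, a monitor placed in Maintenance on 1 January is still in Maintenance on 15 January (exactly 14 days) and is treated as Production from 16 January.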

Changing Priority

Priority can be changed from the default Normal to High or Highest, which can prompt UDC Operators to call the registrants for the service in Sysmod.

This can be done per device, using the drop-down menu under the production state.

Maintenance Windows

From the individual view of a device, click on Device Administration in the left-hand column.

At the top of the next page, under Maintenance Windows, click the "+" to create a maintenance window.

From here you can create a maintenance window of your choosing, and you can create as many as you like. Users should also look up the weekly Maintenance Window set by ITS in SCCM, verify it, and add it here.

Changing IP, Renaming Hostnames

If a host is renamed or is repurposed with a new OS, REMOVE the monitor and create a new one.

If the only change is the IP address, however, select the "*" menu in the bottom left and choose Reset/Change IP Address. Leave the field blank and click Save.

Always create monitors with the FQDN. Never, ever enter IP addresses. It will work, but it is very difficult to support.

Changing Device Type

Sometimes you might have a Windows or Linux monitor that is pingable but otherwise isn't working due to some configuration on the server having recently changed.

You'll often want to set it to ping-only in order to maintain monitoring until the issue is addressed.

Conversely, you might have a monitor that is Ping-only that you may want to change to Windows or Linux.

Go to the lower left of the detail page of the specific monitor, and select the star "*" menu and choose Change Device Class.

That should open a box to select the specific device class you want to change the device to. It should only allow you to select the device classes your group owns.

Once you have changed the device class, you can test that everything works as it should by clearing all alerts associated with the device and then remodeling it by choosing Modeling > Model Device.

Managing Alerts

What to do when you get an email?

Zenoss Cloud monitoring alerts are actionable items. Red and yellow Events signify something that is broken and is in need of attention. It is your responsibility as a consumer of the service to triage, troubleshoot, and resolve the device conditions that may have led to the event. The Zenoss Cloud monitoring team does not provide device support.

When you receive an email alert from Zenoss, log into the console at https://ut.zenoss.io/cz0/zport/dmd/itinfrastructure

Acknowledging Events

Acknowledging an event is a way to tell others in your team, "I am responding to this alert and am working on it." It also will suppress alerts while you work on the issue.

Select one or more events in the event console view.
Click the checkmark icon. A check mark will appear next to the acknowledged event.

You may also acknowledge events by clicking on the respective link within the text of the alert email.

Click on the Event and then click the CHECKMARK to let your team know that you ACKNOWLEDGE the alert and suppress more email notifications.

Closing an Event

When you have remedied the issue, you can close the event and move it to the event archive. To do this:

Select one or more events in the event console view.
Click the "X" icon. The selected events are closed and moved to the archive. To view events in the event archive, select EVENTS > Event Archive.
Click the Refresh icon to update the event list. The closed events are removed from the display.

Click on the Event and then click the "X" button above it to CLOSE an event. If the issue still persists, Zenoss will open it as a new event and send a new alert email.

Advanced Topics

Users with just a few monitors might not need more, but for users managing a service with 20, 30, or 100 hosts, it becomes crucial to find a solution to automate these tasks.

Creating an API Key

To create an API key, log into ut.zenoss.io (aka the Cloud Layer).

Look in the upper right for the pulldown with your name.
Click on it.
Choose "Settings."
On the left side, click "Collection Zone API Keys"
On the right side, click the "Add Key" button.

This will bring up the API ADD CLIENT page. Enter the name associated with the account and click SAVE. This will create an API user with the same permissions of the account creating it.

To see what others have done with the API:

Michael Kaczmarczik from the UT Web team has written up a deep dive into the API, including using it with Ansible.
Rabindra Kar wrote up how to do REST API calls.
For other questions regarding the API, we recommend reading official documentation published by the vendor and otherwise contacting the Monitoring Team.
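
As a rough illustration of what batch adding via the API might look like, the sketch below builds one request body per host and shows how it could be sent. Everything API-specific here is an assumption to verify against the vendor's documentation before use: the `device_router` endpoint path, the `DeviceRouter`/`addDevice` action and method, the field names, the `z-api-key` header, and the numeric production state (500 is the stock Zenoss value for Pre-Production; the value for this instance's Out-of-Service state is not given in this article). The group name and system path in the usage note are placeholders.

```python
import json
import urllib.request

# Assumption: the JSON API lives under the same base path as the console URL.
BASE_URL = "https://ut.zenoss.io/cz0/zport/dmd"


def build_add_device_request(fqdn, device_class, system_path, tid=1, production_state=500):
    """Build the JSON body for adding one device (field names are assumptions)."""
    return {
        "action": "DeviceRouter",
        "method": "addDevice",
        "data": [
            {
                "deviceName": fqdn,            # always an FQDN, never an IP address
                "deviceClass": device_class,   # e.g. a path under /Server/Linux/UT/
                "collector": "AUSTIN",         # never localhost
                "productionState": production_state,
                "systemPaths": [system_path],  # alerting is mapped to this location
            }
        ],
        "tid": tid,
    }


def send(payload, api_key):
    """POST one request body; the header name is an assumption."""
    request = urllib.request.Request(
        BASE_URL + "/device_router",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json", "z-api-key": api_key},
        method="POST",
    )
    with urllib.request.urlopen(request) as response:
        return json.load(response)
```

A batch could then be prepared with something like `[build_add_device_request(host, "/Server/Linux/UT/EXAMPLEGROUP", "/EXAMPLE/SYSTEM/PATH", tid=i) for i, host in enumerate(hosts, 1)]` and each body passed to `send` along with the API key created above. Building the bodies separately from sending them makes it easy to review the batch before anything reaches the server.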

Creating and Using Service Accounts With the API

Users will need to request an AD account from the AD team to create a service account to use with Zenoss Cloud, for the singular purpose of generating an API key not tied to a specific user.
That account will be added as a member of their Group and will have the same access as any other user, but will be unable to access the console as it does not have the means to authenticate via MFA.
Users will need to make a request to the monitoring team to grant temporary access for a brief window in order to disable MFA for a service account so a user can access the console with the account and generate an API key.

Other Functions

Zenoss Cloud is a very powerful tool with a wide range of uses. With multi-tenancy, users are empowered to customize and experiment with custom solutions for their individual needs and use cases without concern that it will impact other users.

That does not mean, however, that users can't break their own monitors in the process, and they should be mindful of this. They should also be mindful that the Monitoring Team has limited resources and officially supports only basic monitoring of Linux and Windows servers and ping-only endpoints.

Users are encouraged to read the vendor's documentation and look at the work of other UT customers if they are looking for custom solutions that suit their use case.
