Zenoss Core & Resource Manager 5 Beta 2 Evaluation Guide

Warning: This is a beta release and has several significant defects that you, as the evaluator, should be aware of. Please read through this guide carefully before starting the evaluation.

Known Issues
● Not allocating adequate CPU and RAM resources will result in a poor user experience
● Deployments may take a long time (10-20 min), especially multi-host deployments
● Control Center will not start automatically; you will need to start it manually (cmd: start serviced)

Note: For every UI action there is a CLI equivalent. Zenoss can be fully scripted, from initial service deployment through backup to retirement. CLI options are called out periodically throughout the guide and should be used in tandem with the UI options.

Copyright © 2014, Zenoss Inc. All rights reserved.

Table of Contents
Getting Started
Workload Distribution & Load Balancing
Application Service Model View
Control Center Performance Monitoring
Editing Services
Application Backup & Restore
Advanced Control Center Features
Attaching to a running container
ZenDMD Access
Service Actions
Service Deletion
Service Scaling
Service Scale Out
Service Scale In
Service Recovery
Core & Resource Manager Feature Evaluation
Self-Monitoring Feature
Time Zone Support
Refactored Event Class and Transform UI with Syntax highlighting
OpenTSDB Interface
Asymmetrical KPI Collection & Visualization
Component Chart Rollup
Integration with Zenoss Control Center
Key Functions
Services may be started, stopped and restarted
Distributed Collectors
Collector Scaling
Scalable Collector Configuration
Troubleshooting
I can’t see my application status in Control Center
Daemons page doesn’t look like it is working
Can’t Access Control Center UI
I can’t access the Core/RM UI
Static Port Mapping
My system is hosed, what do I do?
Reinstall Process

Getting Started
Warning: Before commencing the beta evaluation, take a snapshot of your virtual machine so you can quickly roll back if required.

The Beta 2 program uses a virtual appliance and also allows manual deployment of the product.
Core 5 Deployment Instructions -> http://wiki.zenoss.org/Zenoss_Core_5_Beta/Download
RM 5 Deployment Instructions -> http://support.zenoss.com

Workload Distribution & Load Balancing
Zenoss Control Center automatically distributes services, based on their RAM configuration, across as many hosts as are available. The current distribution algorithm is random and uses the available memory on each host to decide whether to deploy a service there. In Beta 2, the job scheduler does not assess a service's memory needs beyond this basic host memory check. To load balance or redistribute Zenoss, restart services or add additional ones; they will be distributed across all available hosts automatically.

Application Service Model View
For a visual topology of how the Zenoss application is constructed and deployed, use the “Service Map” feature located under the “Deployed Applications” menu option. The “Service Map” model is maintained in real time; as services are spun up and down, the model is updated. The resulting service model is then propagated to Service Impact, which is an add-on module to Resource Manager and is not currently available for beta testing.

An example of a multi-host deployment of Zenoss is shown below:

Note: The Service Model view is available even when an application has not been deployed. In
that case, there will be no host reference since the application is ‘scheduled’ to run via a given
resource pool/host mapping. Thus only at run time is the host assigned.

Control Center Performance Monitoring
Basic CPU/Memory metrics are available via the Hosts menu. Simply drill down on a given host and you should see performance information over time.

Note: Resource Manager 5 will ship with the Control Center ZenPack, which will enable declarative service monitoring of the Zenoss monitoring application and will include Service Impact support via the Control Center API.

Editing Services
Note: Adding and removing instances via the service edit command is currently not supported
but will be supported in the generally available release.

1. SSH into the Control Center master host (the first host set up)
2. su - zenoss (or an account that has wheel group membership)
3. Find the service to edit, e.g. “Zenoss.resmgr”

    zenoss@zenosscc:~# sudo serviced service list

Note down the ServiceID, or simply grep the output for the service name. Copy and paste the ServiceID into the following command:

    zenoss@zenosscc:~# sudo serviced service edit 2jeghyagie9m96ytiseme0pt0

The following command will save you time by storing the appropriate ServiceID in an environment variable:

    zenoss@zenosscc:~# Zope=$(sudo serviced service list | grep Zope | awk '{print $2}');

You can then use that variable, i.e. Zope, instead of pasting the ID:

    zenoss@zenosscc:~# sudo serviced service edit $Zope

The service definition will now appear in your default text editor, e.g. vi:

    {
      "Id": "apj8yamhpvdmzsosu1f3xtpq9",
      "Name": "Zope_Beta",
      "Context": "null",
      "Startup": "su - zenoss -c \"/opt/zenoss/bin/zopectl fg\" ",
      "Description": "Zope Process",
      "Tags": [ "daemon" ],
      "OriginalConfigs": {
        "/opt/zenoss/etc/zope.conf": {
          "Filename": "/opt/zenoss/etc/zope.conf",
          "Owner": "zenoss:zenoss",
          "Permissions": "660",

Make the requisite edits and then save the file. The service template is updated automatically.

Application Backup & Restore

User Interface
1. Click on Create Backup
2. An online backup will be done across all Zenoss services, including distributed services on remote hosts.
3. Once the backup is complete, you will see the backup file name, timestamp and the option to restore.

Command Line Interface
The process can also be done with a single command, as shown below:

    zenoss@zenosscc:~# sudo serviced backup .

The backup process may take several minutes to complete.
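Because each backup is written to its own directory, a script that runs after the backup command often needs to locate the newest one. The following is a minimal sketch and not part of the product: latest_backup_dir is a hypothetical helper, and it assumes backup directories are named backup-YYYY-MM-DD-ID directly under a base directory such as /opt/serviced, so that name order follows date order. Ordering within a single day depends on the ID format and is not guaranteed.

```shell
#!/bin/sh
# latest_backup_dir BASE
# Hypothetical helper: prints the last directory (in sorted name order)
# matching backup-* directly under BASE, or returns 1 if there is none.
# Assumption: names embed the date as YYYY-MM-DD, so name order is date order.
latest_backup_dir() {
    base=${1:-/opt/serviced}
    latest=""
    for dir in "$base"/backup-*/; do
        [ -d "$dir" ] || continue   # skips the unexpanded pattern when nothing matches
        latest=${dir%/}             # globs expand in sorted order, so the last match wins
    done
    [ -n "$latest" ] || return 1
    printf '%s\n' "$latest"
}

# Example call on a Control Center host (not executed here):
#   latest_backup_dir /opt/serviced
```

The trailing slash in the pattern restricts matches to directories, so a stray file whose name starts with backup- is ignored.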
The default location of backups can be found under:

    /opt/serviced/backup-[YYYY]-[MM]-[DD]-[BACKUPID]

    zenoss@zenosscc:/opt/serviced/backup-2014-06-28-205423$ ls -la
    total 252
    drwxr-xr-x 4 root root   4096 Jun 28 20:54 .
    drwxr-xr-x 9 1000 1000   4096 Jun 28 20:54 ..
    drwxr-xr-x 2 root root   4096 Jun 28 21:01 images
    drwxr-xr-x 2 root root   4096 Jun 28 20:54 snapshots
    -rw-r--r-- 1 root root 239835 Jun 28 20:54 templates.json

Note: Restore from backup currently does not work correctly and will be addressed prior to the release.

Advanced Control Center Features

Attaching to a running container

    zenoss@zenosscc:~# sudo serviced service attach Zope
    NAME  SERVICEID                  DOCKERID
    Zope  1hxn8nwo1gu15zd1cw98199pt  4c840522e32d5a23a2256ec506b8aa4b7efba30adab5361fe3e5ecef65034275
    Zope  1hxn8nwo1gu15zd1cw98199pt  e194719ca6abb82ff3dba08bd7fefc64b2f9a1552d3257bbe12e98df8712da84
    Zope  1hxn8nwo1gu15zd1cw98199pt  dc15c9376006379cc702585c15ec42a3ee96d7105e0ae49afac0fe5144a2a404
    Zope  1hxn8nwo1gu15zd1cw98199pt  8eb8467aa524a925edd337d6daf91a9525e0002ee438c7fa9c2aed95e3670eac
    multiple results found; select one from list

Note: If more than one instance exists, you will need to attach to a unique instance via the DOCKERID. In addition, you must be on the local Control Center host in order to attach to a running container.

    zenoss@zenosscc:~# sudo serviced service attach 8eb8467aa524a925edd337d6daf91a9525e0002ee438c7fa9c2aed95e3670eac
    [root@8eb8467aa524 /]# ps -ef
    UID  PID PPID C STIME TTY TIME     CMD
    root   1    0 0 Jun27 ?   00:05:30 /serviced/serviced service proxy 1hxn8nwo1gu15zd1cw98199pt 0 su - zenoss -c "/opt/zenoss/bin/
    root  18    1 0 Jun27 ?   00:00:45 /usr/local/serviced/resources/logstash/logstash-forwarder -old-files-hours=26280 -config /etc
    root  29    1 0 Jun27 ?   00:00:00 su - zenoss -c /opt/zenoss/bin/zopectl fg
    [root@8eb8467aa524 /]# su - zenoss
    [zenoss@a7a997835909 /]# cd /opt/zenoss/etc
    [zenoss@a7a997835909 etc]$ ls -la zo*.conf
    -rw------- 1 zenoss zenoss    96 Jun 20 20:57 zodb_db_main.conf
    -rw------- 1 zenoss zenoss   104 Jun 20 20:57 zodb_db_session.conf
    -rw-rw---- 1 zenoss zenoss 31933 Jun 29 16:07 zope.conf

ZenDMD Access

Inside Container

    [zenoss@a7a997835909 etc]$ zendmd
    Welcome to the Zenoss dmd command shell!
    'dmd' is bound to the DataRoot. 'zhelp()' to get a list of commands.
    Use TAB-TAB to see a list of zendmd related commands.
    Tab completion also works for objects -- hit tab after an object name and '.'
    (eg dmd. + tab-key).
    >>> listFacades()
    ['applications', 'device', 'devicedumpload', 'devicemanagement', 'eventclasses', 'jobs', 'manufacturers', 'metric', 'mibs', 'monitors', 'network', 'network6', 'process', 'properties', 'reports', 'search', 'service', 'template', 'triggers', 'zenpack', 'zep']
    >>>
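The grep-based ServiceID lookup used in this guide matches any line that contains the text "Zope", so it can return more than one ID when several service names share that text. A stricter lookup can be sketched as follows. This is only an illustration: service_id_by_name is a hypothetical helper, and it assumes the listing has the service name in the first whitespace-separated column and the ServiceID in the second, which is what the awk '{print $2}' step in this guide implies. The demonstration runs on canned text built from names and IDs that appear in this guide, not on a live Control Center.

```shell
#!/bin/sh
# service_id_by_name NAME
# Hypothetical helper: reads a service listing on stdin and prints the
# second column of every row whose first column is exactly NAME.
# Assumption: column 1 is the service name and column 2 is the ServiceID.
service_id_by_name() {
    awk -v name="$1" '$1 == name { print $2 }'
}

# Canned listing for demonstration purposes only.
sample_listing='NAME       SERVICEID
Zope       1hxn8nwo1gu15zd1cw98199pt
Zope_Beta  apj8yamhpvdmzsosu1f3xtpq9'

# Prints only the ID of the row named exactly "Zope".
printf '%s\n' "$sample_listing" | service_id_by_name Zope
```

On a live host the function could be fed from the real listing instead, for example ZOPE=$(sudo serviced service list | service_id_by_name Zope), provided the real output follows the assumed column layout.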
Outside Container

    zenoss@zenosscc:~# ZOPE=$(sudo serviced service list | grep Zope | awk '{print $2}');
    zenoss@zenosscc:~# sudo serviced service run $ZOPE zendmd

Note: Changes made to the service container by any means other than persistent service commands, such as ZenDMD and/or ZenPack commands, will not be committed unless you snapshot the service. The snapshot feature is currently not working correctly in this beta.

Service Actions
Service Actions are effectively macros that map labels to commands, as shown in the example below. By using the metadata contained in the service template definition, any given action can be reused for multiple services, depending on the nature of the command.

Excerpt from the default service template located under /opt/serviced/templates/

    "Actions": {
      "debug": "su - zenoss -c '${ZENHOME:-/opt/zenoss}/bin/{{.Name}} debug'",
      "stats": "su - zenoss -c '${ZENHOME:-/opt/zenoss}/bin/{{.Name}} stats'"
    },

Actions are completely customizable in the service template, provided that the specific action command is available in the context of the service.

    zenoss@zenosscc:~$ serviced service action zencommand debug

If no error is returned, the zencommand collector process will toggle debug mode on.

Service Deletion
This can be done through the user interface or the CLI.

User Interface Instructions
1. Navigate to the Deployed Applications UI
2. Click on “Delete” on the target service

The process may take several minutes and can be verified by looking at the following log:

    zenoss@zenosscc:/opt/serviced$ sudo grep delete /var/log/upstart/serviced.log
    I0628 21:43:55.989933 19364 leader.go:206] Shutting down due to node delete 2jeghyagie9m96ytiseme0pt0

Service Scaling

Service Scale Out
User Interface
1. Navigate to Deployed Apps
2. Click on target Zenoss application
3. Click on target service
4. Increment the number of instances, e.g. from 2 to 4 for Zope
5. Click on Save Changes
6. Wait a few seconds and there should be one or more new service instances running

Note: If you have set up multiple Control Center hosts, the hosts chosen may differ between instances due to the job scheduler algorithm.

Service Scale In
1. Navigate to Deployed Apps
2. Click on target Zenoss application
3. Click on target service
4. Decrement the number of instances, e.g. from 4 to 2 for Zope

Note: A stack (LIFO) process is used to remove additional running instances. This means the last instance created will be removed first.

Tip: The context for all things manageable in Control Center is normally exposed as a query string reference, e.g. the Zope ServiceID can be observed in the URL bar and used as a shortcut.

Service Recovery
If you kill a service process or multiple processes, the service restarts automatically. Control Center maintains a tally of running service instances; the expected number is defined in the configuration. The correct way to stop a service is either through the UI or by running the CLI command:

    sudo serviced service stop [SERVICEID]

The service structure is hierarchical: if the top-level service is stopped, all other services will be stopped. To understand the hierarchy, look at the underlying template configuration or use the following command:

    sudo serviced service list

Core & Resource Manager Feature Evaluation
Note: This section assumes the user is familiar with the Zenoss User Interface. Please refer to the Zenoss administration guide if additional context is needed.

1. Open a browser and navigate to https://Zenoss5x.[HOST]

Note: If you don’t see Zenoss come up after navigating to the appliance web address, you may need to check a couple of things in the appliance, which are described in the troubleshooting section at the end of this document.

The following Zenoss setup screen will appear in your browser:

2. Click on Get Started and enter the Control Center host details in the following screen:

    Hostname: [HOST]
    Device Type: Linux Server (SSH)

Note: Do not use localhost as the hostname/IP address.

    SSH Credentials
    User: zenoss
    Password: zenoss

3. Click on the save button and then skip the rest of the setup
4. Navigate to the device you just added in the Infrastructure view (tip: use the search feature to quickly find the device)
8. The following components should now appear on the left hand pane of the device:

Data can also be exported to CSV through each graph widget via the cog icon embedded in the title bar:

The underlying JSON data structure can also be viewed by clicking on “Definition”.

If you look carefully, you can see where the metric and tag data is stated. This information can be used for querying and graphing time series data in the TSD interface described in the OpenTSDB Interface section below.

Self-Monitoring Feature
Note: This feature is only available in Resource Manager and currently does not work with non-localhost collectors

1. Add the Control Center master agent IP to the ControlCenter device class
2. The output below shows a successful modeling run of a Control Center master instance.
3. The basic model is shown below.

Time Zone Support
1. Under the user configuration, select your time zone of choice.
2. All UI charts will be updated automatically with the new time zone offset.

Refactored Event Class and Transform UI with Syntax highlighting

OpenTSDB Interface
For those who are curious about how we are leveraging TSD under the covers, you can take a look at the TSD UI via https://OpenTSDB.[HOST]. The TSD UI has built-in autocomplete, so you can quickly find interesting metrics to graph. Try a combination of uppercase and lowercase letters when searching for metrics. For tags, you can use wildcard expressions, so any metric with one or more tags can be rendered on the same graph. You can even plot different metrics on the same graph across multiple targets by clicking on the + tab. In the future, API documentation will be provided so you can build your own graphs and import/export data as needed.

Note: The current version of OpenTSDB has no authentication functionality, so it will most likely only be available for debugging/development in the final release.

Asymmetrical KPI Collection & Visualization
Since RRD is no longer used, we can now collect at arbitrary rates without concern for RRD step rates. To test, change the poll rate for any collector daemon from 300s (default) to 15s or less, observe for several minutes, then revert to the default. The example below is based on the zencommand collector and shows how transient performance spikes can be caught. Theoretically it is possible to script a pseudo-dynamic threshold/poll rate based on events.
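To see why a shorter poll interval catches transient spikes, it helps to count the samples collected during an observation window. The sketch below is plain arithmetic and does not touch Zenoss; samples_per_window is a hypothetical helper name, and the 10-minute window is just an example value.

```shell
#!/bin/sh
# samples_per_window WINDOW_SECONDS INTERVAL_SECONDS
# Prints how many whole poll cycles fit into the window (integer division).
samples_per_window() {
    echo $(( $1 / $2 ))
}

# A 10-minute (600 s) observation window:
samples_per_window 600 300   # default 300 s poll rate -> 2
samples_per_window 600 15    # 15 s poll rate          -> 40
```

A spike that lasts 30 seconds can fall entirely between two 300 s polls, whereas at a 15 s poll rate it spans about two samples.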
Component Chart Rollup

Integration with Zenoss Control Center
1. Log into the Zenoss application as an admin user
2. Navigate to the Advanced menu
3. Click on the Daemons menu

Key Functions
● Services may be started, stopped and restarted
● Service configuration may be edited directly
● Service logs are pre-filtered
● Service performance templates can be edited

Distributed Collectors
Note: This feature is only available in Resource Manager

Adding collectors is very similar to the process in 4.x; however, you can now associate resource pools with a given collector distribution.

1. From the Daemons menu, click on the left-hand cog and select the Add Collector option.

Collector Scaling
Note: This feature is only available in Resource Manager

This feature enables automatic distribution of the device monitoring load across multiple collector workers. For example, if ZenCommand has 10 workers configured and 100 devices are mapped to the ZenCommand collector, you may see roughly 10 devices on each ZenCommand worker. If you add more workers, you’ll see fewer devices per worker. The load balancing feature is based on a simple hashing algorithm, so in certain situations the load may not be spread evenly.

Scalable Collector Configuration
1. Navigate to the daemons page in Zenoss Resource Manager
2. Click on the target Zenoss collector, e.g. zencommand
3. Set the display drop-down to “Configuration Files”
4. Change the number of workers from 1 to as many cores as you have available on the host
5. Restart the collector; all associated devices will be automatically load balanced across the number of workers specified.

Troubleshooting

I can’t see my application status in Control Center
It may take a long time to deploy for the first time, especially with a multi-host configuration.
1. Check /var/log/upstart/serviced.log for any error messages
2. Run your browser in developer mode (if you are a Chrome user) and look at the console output for any error messages/status
3. You should see something like the following under the JS console view:

    Refreshing pools
    Using cached pools
    Retrieved list of app templates
    Retrieved list of services
    refresh services called            controlplane.min.js:1286
    Refreshing pools                   controlplane.min.js:1371
    Using cached pools                 controlplane.min.js:361
    Retrieved list of app templates    controlplane.min.js:258
    Retrieved list of services

Daemons page doesn’t look like it is working
Be patient; it may take a while to load due to the integration with Control Center.
1. Check /var/log/upstart/serviced.log for any error messages
2. Run your browser in developer mode (if you are a Chrome user) and look at the console output for any error messages/status

Can’t Access Control Center UI
1. Restart Control Center: sudo restart serviced
2. Verify TCP 443 is open: netstat -an | grep [:]443 (may take a few minutes)
3. Check /var/log/upstart/serviced.log for any error messages

I can’t access the Core/RM UI
1. Check to make sure Zenoss is running
   a. Review the Control Center Application health status
   b. Check that the Zauth/Zope/Zenevent* containers are running without any failed health checks
   c. Check that port 443 is exported by running: docker ps | grep [:]443
   d. Restart Zenoss via Control Center
   e. Reboot the server if nothing else works
2. Make sure your local client hosts file has the correct DNS config. Review the installation documentation on how to set this correctly.
3. If you are still having issues after going through steps 1 and 2, try the following steps:
   a. Define the Application port and protocol in the zproxy export section (8080 and tcp) in the service template (refer to the static port mapping example below if you are unsure how to do this)
   b. Disable HTTPS CGI support for Zope in the Zope service configuration or in the source template, and restart or redeploy the app. This will disable the HTTPS absolute URI feature, allowing you to use ports other than 443; any high port will work.

Static Port Mapping
This example shows how to add a static port mapping for external application access to a given service. Here we expose the Zope application port used for Zenoss UI access. Detailed instructions on how to edit a service can be found earlier in this guide.

Static Port Mapping Steps
1. serviced service edit Zenoss.resmgr (or Zenoss.core)
2. Find the AddressConfig reference as shown below
3. Update the Port and Protocol key-value pairs with the appropriate settings, e.g. TCP 8080 for Zope.
4. Disable HTTPS CGI support (can be done through the Control Center UI)
5. Save and restart the Application

<Sample Service Definition>

    "Name": "zproxy",
    "Purpose": "export",
    "Protocol": "tcp",
    "PortNumber": 8080,
    "PortTemplate": "",
    "VirtualAddress": "",
    "Application": "zproxy",
    "ApplicationTemplate": "zproxy",
    "AddressConfig": {
      "Port": 8080,
      "Protocol": "TCP"
    },
    "VHosts": [
      "zenoss5x"

My system is hosed, what do I do?
You have three options, in order of preference:
1. Restore from a snapshot
2. Reinstall
3. Try to figure out what happened by digging into log files

Reinstall Process

    sudo stop serviced
    sudo sh -c "echo manual > /etc/init/docker.override"
    sudo reboot
    sudo dpkg -r zenoss-[resmgr|core]-service
    sudo dpkg -r serviced
    sudo rm -rf /var/lib/docker
    sudo rm /etc/init/docker.override
    sudo rm -rf /opt/serviced
    sudo start docker

Now that the system is clean again, go ahead and install Zenoss by following the installation instructions referenced at the start of the guide.

Bugs
We gladly accept bugs as a token of appreciation from our Zenoss community. The filing process is the same as it is for available Core releases. The main difference is that the affected version is code named ‘europa’.

For Zenoss folks evaluating Core/RM, please use your Zenoss Jira account to file defects. If you don’t have a Jira account already, please go ahead and create one by navigating to jira.zenoss.com.
Sign-up link: https://jira.zenoss.com/secure/Signup!default.jspa

Once you have created an account, the process for filing a bug is straightforward.
1. Click on the Create Issue button at the top of the UI
2. Fill out the following fields and attach any relevant screenshots/log files as required
3. Please use the defect priority guide below when filing bugs.

Defect/Bug Priorities

    Priority  Description
    1         System crash, loss of data, loss of monitoring. This bug will take top priority with development and stop the release of the software until fixed.
    2         A feature is not functioning and no workaround exists. Next highest priority for development; will not hold up release.
    3         A feature is not functioning but a workaround exists.
    4         Lowest severity, minor issues, cosmetic issues.