Monitoring (& Management) software

Oleksandr Motsak edited this page Oct 5, 2016 · 18 revisions

(Free Open Source?) Host & Apps (& Network?) monitoring solutions with WEB interface

Reqs?:

  • Customizable web interface (Dashboard? Status-board)

  • Host & Application & Service monitoring

  • Alerting/Notification

  • Display Monitoring?

  • Custom Heartbeat

LowLevel monitoring & management:

Overviews (concerning OMD and Check_MK):

Nagios

  • monitors systems, networks and infrastructure.

  • offers monitoring and alerting services for servers, switches, applications and services.

  • open source software licensed under the GNU GPL V2.

It provides:

  • Monitoring of network services (SMTP, POP3, HTTP, NNTP, ICMP, SNMP, FTP, SSH)

  • Monitoring of host resources (processor load, disk usage, system logs) on a majority of network operating systems, including Microsoft Windows with the NSClient++ plugin or Check_MK.

  • Monitoring of anything else, such as probes (temperature, alarms, etc.), that can send collected data over a network to specifically written plugins

  • Monitoring through remotely run scripts via the Nagios Remote Plugin Executor (NRPE)

  • Remote monitoring supported through SSH or SSL encrypted tunnels.

  • A simple plugin design that allows users to easily develop their own service checks depending on needs, by using their tools of choice (shell scripts, C++, Perl, Ruby, Python, PHP, C#, etc.)

  • Available data graphing plugins

  • Parallelized service checks

  • The ability to define a network host hierarchy using 'parent' hosts, allowing detection of and distinction between hosts that are down and hosts that are unreachable

  • Contact notifications when service or host problems occur and get resolved (via e-mail, pager, SMS, or any user-defined method through plugin system)

  • The ability to define event handlers to be run during service or host events for proactive problem resolution

  • Automatic log file rotation

  • Support for implementing redundant monitoring hosts

  • An optional web-interface for viewing current network status, notifications, problem history, log files, etc.

  • Data storage via text files rather than database
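The "simple plugin design" mentioned in the list can be illustrated with a minimal sketch. It assumes the usual Nagios plugin convention: one status line on standard output, optional performance data after a `|`, and an exit status of 0 (OK), 1 (WARNING), 2 (CRITICAL) or 3 (UNKNOWN). The function name `check_percent`, the `DISK` label and the choice of the root filesystem are illustrative assumptions, not part of Nagios or of the images described on this page.

```shell
#!/bin/sh
# Sketch of a Nagios-style check. Convention: print one status line
# (optionally followed by "| perfdata") and exit with
# 0=OK, 1=WARNING, 2=CRITICAL, 3=UNKNOWN.

# check_percent VALUE WARN CRIT  (hypothetical helper, integer percentages)
check_percent() {
    for n in "${1:-}" "${2:-}" "${3:-}"; do
        case "$n" in
            ''|*[!0-9]*) echo "DISK UNKNOWN - expected three integers"; return 3 ;;
        esac
    done
    value=$1 warn=$2 crit=$3
    perf="usage=${value}%;${warn};${crit}"
    if [ "$value" -ge "$crit" ]; then
        echo "DISK CRITICAL - ${value}% used | $perf"; return 2
    elif [ "$value" -ge "$warn" ]; then
        echo "DISK WARNING - ${value}% used | $perf"; return 1
    fi
    echo "DISK OK - ${value}% used | $perf"; return 0
}

# Act as a plugin only when both thresholds are given, e.g.: ./check_root_disk 80 90
if [ "$#" -eq 2 ]; then
    # df -P prints the capacity ("42%") in the fifth column of the second line.
    used=$(df -P / | awk 'NR==2 { sub("%", "", $5); print $5 }')
    check_percent "$used" "$1" "$2"
    exit $?
fi
```

Because the contract is only "text on stdout plus an exit status", the same check could be written in any of the languages listed above.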

Links:

How to use our images with OMD and Check_MK:

  • Get and run OMD:

# download everything locally:

git clone https://github.com/hilbert/hilbert-docker-images.git DIR
cd DIR/images/omd
make pull

# Run OMD
make check CMD=omd_entrypoint.sh

Note: don’t forget to take care of ptmx (e.g. cd ptmx && make check)

  • Install Check_mk Agent:

Log in to http://localhost/default/omd/ as omdadmin / omd using any web browser (note: localhost is the system where OMD is running). Download check-mk-agent_1.2.6p12-1_all.deb (or newer) from http://localhost:5000/default/check_mk/agents/

sudo dpkg -i check-mk-agent_1.2.6p12-1_all.deb && sudo apt-get install -fy
sudo apt-get install check-mk-agent-logwatch

Now follow the "Start Monitoring Target Hosts" section of http://blog.unicsolution.com/2014/02/how-to-setup-omd-in-1-hour.html

NOTE1: do not forget to go to services and activate missing services (if you have any)

NOTE2: do not forget to Activate Changes!

  • Don’t forget to take a snapshot via Backup & Restore (WATO / Check_MK)

Xymon

Xymon offers graphical monitoring, listing the various services of each machine, as well as listing the number of mail messages queued after a defined level of downtime. Statistics are shown graphically for all monitored services.

Monitored hosts require installation of a client, which is also free software, and which forwards monitoring information to a Xymon server. Clients are available for Unix and Linux (in formats including source tarball, RPM and Debian package) from the Xymon download site at Sourceforge. Windows hosts can use the Big Brother and Xymon-compatible BBWin client. Plugins extend monitoring to new types of applications and services, and many extension scripts for Big Brother will run unchanged on Xymon.

Zabbix

  • enterprise open source monitoring solution for networks and applications. It is designed to monitor and track the status of various network services, servers, and other network hardware.

  • A Zabbix agent can also be installed on UNIX and Windows hosts to monitor statistics such as CPU load, network utilization, disk space, etc.

  • As an alternative to installing an agent on hosts, Zabbix includes support for monitoring via SNMP, TCP and ICMP checks, as well as over IPMI, JMX, SSH, Telnet and using custom parameters. Zabbix supports a variety of real-time notification mechanisms, including XMPP.

  • Links: http://www.zabbix.com/ & https://en.wikipedia.org/wiki/Zabbix

Munin

Munin - computer system monitoring, network monitoring and infrastructure monitoring software application. Munin offers monitoring and alerting services for servers, switches, applications, and services. It alerts the users when things go wrong and alerts them a second time when the problem has been resolved.

Munin is written in Perl and uses the RRDtool to create graphs, which are accessible over a web interface. Its emphasis is on plug and play capabilities. About 500 monitoring plugins are currently available. Using Munin you can monitor the performance of your computers, networks, SANs, and applications. It is intended to make it easy to determine "what’s different today" when a performance problem crops up and to provide visibility into capacity and utilization of resources.

Munin has a master/node architecture in which the master connects to all the nodes at regular intervals and asks them for data. It then stores the data in RRD files, and (if needed) updates the graphs. One of the main goals has been ease of creating new plugins (graphs).
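To illustrate why new plugins are easy to create, here is a minimal sketch of a Munin plugin. It assumes the usual plugin protocol: called with the argument `config` the plugin describes its graph, and called without arguments it prints the current values as `<field>.value <number>` (or `U` when the value is unknown). The field name `load`, the graph labels and the use of /proc/loadavg (Linux only) are illustrative assumptions.

```shell
#!/bin/sh
# Sketch of a Munin plugin that graphs the 1-minute load average.

# Graph description, printed when the node calls the plugin with "config".
print_config() {
    cat <<'EOF'
graph_title Load average (sketch)
graph_vlabel load
graph_category system
load.label 1-minute load
EOF
}

# Current value; "U" is reported when the value cannot be read.
print_values() {
    if [ -r /proc/loadavg ]; then
        read -r one _rest < /proc/loadavg
        echo "load.value $one"
    else
        echo "load.value U"
    fi
}

case "${1:-}" in
    config) print_config ;;
    *)      print_values ;;
esac
```

On a typical installation the script would be made executable and linked into the node's plugin directory; the exact path depends on the distribution.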

Collectd

Collectd - Unix daemon that collects, transfers and stores performance data of computers and network equipment. The acquired data is meant to help system administrators maintain an overview over available resources to detect existing or looming bottlenecks.

collectd uses a modular design: the daemon itself only implements infrastructure for filtering and relaying data as well as auxiliary functions, and it requires very few resources; it even runs on OpenWrt-powered embedded devices. Data acquisition and storage are handled by plug-ins in the form of shared objects. This way code specific to one operating system is mostly kept out of the actual daemon. Plug-ins may have their own dependencies, for example a specific operating system or software libraries. Other tasks performed by the plug-ins include processing of “notifications” and log messages.

Data acquisition plug-ins, called "read plug-ins" in collectd’s documentation, can be roughly put into three categories:

  • Operating system plug-ins collect information such as CPU utilization, memory usage, or number of users logged into a system. These plug-ins usually need to be ported to each operating system. Not all such plug-ins are available for all operating systems.

  • Application plug-ins collect performance data from or about an application running on the same or a remote computer, for example the Apache HTTP Server. These plug-ins often use software libraries but are usually otherwise operating system independent.

  • Generic plug-ins offer basic functions that the user can employ to perform specific tasks. Examples are querying of network equipment using SNMP or execution of custom programs or scripts.

So called "write plug-ins" offer the possibility to store the collected data on disk using RRD- or CSV-files, or to send data over the network to a remote instance of the daemon.
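The "execution of custom programs or scripts" mentioned among the generic plug-ins can be sketched as follows. This assumes collectd's exec plugin, which reads `PUTVAL` lines from the script's standard output and exports `COLLECTD_HOSTNAME` and `COLLECTD_INTERVAL` to the script. The helper `emit_putval`, the instance name `demo` and the metric (number of login sessions) are hypothetical choices made for illustration.

```shell
#!/bin/sh
# Sketch of a script for collectd's exec plugin. The plugin reads lines like
#   PUTVAL "<host>/<plugin>-<instance>/<type>-<instance>" interval=<s> N:<value>
# from the script's standard output. All names below are illustrative.

# emit_putval HOST INSTANCE NAME INTERVAL VALUE  (hypothetical helper)
emit_putval() {
    # "N" asks collectd to timestamp the value with the current time.
    printf 'PUTVAL "%s/exec-%s/gauge-%s" interval=%s N:%s\n' "$1" "$2" "$3" "$4" "$5"
}

# collectd exports COLLECTD_HOSTNAME and COLLECTD_INTERVAL to exec scripts,
# so the reporting loop only starts when the daemon launched this script.
if [ -n "${COLLECTD_HOSTNAME:-}" ]; then
    interval=${COLLECTD_INTERVAL:-10}
    while :; do
        sessions=$(who | wc -l | tr -d ' ')
        emit_putval "$COLLECTD_HOSTNAME" demo sessions "$interval" "$sessions"
        sleep "$interval"
    done
fi
```

The script would be registered with an `Exec` line inside the exec plugin's configuration block; the plugin runs scripts as an unprivileged user, so the script should not rely on root access.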

Argus

Argus is a systems and network monitoring application. It is designed to monitor the status of network services, servers, and other network hardware. It will send alerts when it detects problems.

It is open-source software written entirely in Perl, and provides a web based interface.

  • Can monitor most network services.

  • Supports both IPv4 and IPv6

  • Includes graphing.

  • Web based front end.

  • Can monitor tens-of-thousands of services on common PC hardware.

  • Supports distributed and redundant configurations.

  • Configured using simple text files.

  • Monitoring of server resources: CPU load, network load, and disk usage, using an agent.

  • Monitoring of the results of any command or script.

  • Monitoring is easily extendable through user-written scripts.

MRTG - Multi Router Traffic Grapher

As an alternative to SNMP, MRTG can be configured to run a script or command and parse its output for counter values. The MRTG website contains a large library of external scripts that enable monitoring of SQL database statistics, firewall rules, CPU fan RPMs, or virtually any integer-valued data.
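A minimal sketch of such an external script is shown here. It assumes MRTG's convention for external programs: four output lines holding the first value, the second value, an uptime string and the target name, with both values being integers. The helper names and the choice of metric (1-minute and 5-minute load, multiplied by 100, read from the Linux-only /proc/loadavg) are illustrative assumptions.

```shell
#!/bin/sh
# Sketch of an external MRTG data source.

# mrtg_report FIRST SECOND UPTIME NAME: print the four lines MRTG parses.
mrtg_report() {
    printf '%s\n%s\n%s\n%s\n' "$1" "$2" "$3" "$4"
}

# load_x100 FIELD: load average field 1 or 2 as an integer, or 0 if unavailable.
load_x100() {
    awk -v f="$1" '{ printf "%.0f", $f * 100 }' /proc/loadavg 2>/dev/null || echo 0
}

mrtg_report "$(load_x100 1)" "$(load_x100 2)" \
    "$(uptime 2>/dev/null || echo unknown)" "$(uname -n)"
```

In the MRTG configuration the script is referenced by putting its path in backticks in a `Target[...]` line. Since load is a current reading rather than an ever-increasing counter, the target would also need the `gauge` option.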

VNC Server for monitoring:

  • x11vnc: shows the actual content of a real screen including hardware accelerated material like OpenGL and video

Power management options

  • Wake on LAN (BIOS)

  • Wake on RTC (BIOS)

  • rtcwake (software to adjust Wake on RTC settings and then put system to sleep or power off)
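As a sketch of how rtcwake might be used, the helper below only builds and prints a command line; it does not suspend anything. It assumes rtcwake's `-m` (mode) and `-s` (seconds until wake-up) options. The function name `build_rtcwake_cmd` and the eight-hour interval are hypothetical.

```shell
#!/bin/sh
# Sketch: compose an rtcwake command line without running it.

# build_rtcwake_cmd MODE SECONDS  (hypothetical helper)
build_rtcwake_cmd() {
    case "${1:-}" in
        standby|mem|disk|off) ;;
        *) echo "unsupported mode: ${1:-}" >&2; return 1 ;;
    esac
    case "${2:-}" in
        ''|*[!0-9]*) echo "seconds must be a non-negative integer" >&2; return 1 ;;
    esac
    echo "rtcwake -m $1 -s $2"
}

# Suspend to RAM and wake up again after 8 hours (28800 seconds).
build_rtcwake_cmd mem 28800
```

The printed command has to be run as root to take effect. Adding rtcwake's dry-run option (`-n`) first is a safe way to check that the RTC alarm would be accepted on a given machine.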