Network monitoring from the cloud

Lookout

Lead Image © damedeeso, 123RF.com

Author(s): Mayank Sharma

Netdata helps you monitor your network with ease through a cloud dashboard.

Netdata [1] is a distributed, real-time performance and health monitoring tool for machines running Linux, FreeBSD, and macOS, as well as for Kubernetes and Docker. Available for free under a GPLv3 license, Netdata runs on physical machines, virtual machines, containers, and even Internet of Things (IoT) devices, thanks to its minimal resource footprint.

Netdata's dashboard balances form and function and does a nice job of visualizing a computer's processes and services. You can use Netdata to monitor CPU and RAM usage, disk I/O, and network traffic, along with several other aspects of the systems on which it runs. In addition to hardware, it can also keep an eye on web servers, databases, and applications. Netdata's interactive dashboard can also display long-term historical metrics covering days, weeks, or months, all at one-second granularity.

Designed by system administrators, DevOps engineers, and developers, Netdata not only visualizes the collected metrics but also helps you identify and troubleshoot complex performance problems quickly. Netdata complements its monitoring features with an alarm notification system that detects performance and availability issues.

One of the best things about Netdata is that the tool ships with sensible defaults, letting you put it into active service immediately after installation. That said, once you are more familiar with Netdata, you can rely on its extensive customization options and tune it to better align with your requirements.

Client Rollout

To use Netdata, you'll have to install the Netdata Agent on all your computers. The agent also installs its own custom database engine to store all the collected metrics. These will then be visualized on the cloud-based dashboard.

Netdata offers several installation options [2], although the recommended way is to use its one-line installation script. The script will fetch all the components necessary to compile the Netdata Agent on your computer. Fire up a terminal on the machine you want to monitor and enter the following command as a regular Linux user:

$ bash <(curl -Ss https://my-netdata.io/kickstart.sh)

That's all there is to it. The script will refresh the package management repositories and install all the dependencies before compiling the Netdata Agent. It'll also ensure that the agent keeps itself updated with nightly releases.
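If you would rather read the installer before letting it run, one option is to save it to a file first. The following is a minimal sketch: The URL is the one given above, while the function name and the default path are arbitrary choices made for illustration.

```shell
#!/usr/bin/env bash
# Sketch: save the installer to a file so it can be reviewed before it runs.
# The URL comes from the article; the default path is an arbitrary choice.
KICKSTART_URL="https://my-netdata.io/kickstart.sh"

fetch_kickstart() {
  local dest="${1:-/tmp/netdata-kickstart.sh}"
  # -S shows errors, -s hides the progress meter, -o writes to a file
  curl -Ss "$KICKSTART_URL" -o "$dest" && printf '%s\n' "$dest"
}

# Example (run manually):
#   script=$(fetch_kickstart) && less "$script" && bash "$script"
```

The function only downloads the file and prints where it saved it; reviewing and running the script remain separate, deliberate steps.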

Another popular option is to install the Netdata Agent inside a Docker container [3]. This is useful for a one-off analysis of a host, since it makes no permanent changes to the host computer and can be easily removed.
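As a rough sketch of the container route, the functions below start and later remove an agent container. The image name and flags are modeled on the project's Docker instructions [3] as I understand them, so check that page for the currently recommended set; the function names and the default container name are made up for this example.

```shell
#!/usr/bin/env bash
# Sketch: run the Netdata Agent in a throwaway container.
# Flags follow the project's Docker instructions as an assumption;
# verify them against the current documentation before relying on them.
start_netdata_container() {
  local name="${1:-netdata}"
  docker run -d --name="$name" \
    -p 19999:19999 \
    -v /proc:/host/proc:ro \
    -v /sys:/host/sys:ro \
    --cap-add SYS_PTRACE \
    --security-opt apparmor=unconfined \
    netdata/netdata
}

# Stop and delete the container once the analysis is done.
remove_netdata_container() {
  docker rm -f "${1:-netdata}"
}

# Example (run manually):
#   start_netdata_container
#   remove_netdata_container
```

The read-only mounts of /proc and /sys are what let the containerized agent see the host's metrics rather than only its own.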

After installing the agent, fire up a web browser on the computer and head to http://localhost:19999. This will bring up Netdata's dashboard (Figure 1) and show you the live metrics from the system. You can also view the metrics from any other computer on the network. Just replace localhost with the IP address of the computer on which you've installed the agent.

Figure 1: Netdata claims it collects thousands of metrics per server every second, using just about one percent of a single core's CPU.

Monitor Multiple Machines

To monitor multiple machines, use the one-line installation script to install the agent on all the computers you want to monitor. The agents are distributed by design, which means they operate independently of each other and only collect and chart the metrics for the system on which they are installed.
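Once the agent is on several machines, a quick way to confirm that each one is answering is to probe port 19999 on every host. The sketch below assumes the agent's /api/v1/info endpoint is reachable from your workstation; the function name and the example addresses are hypothetical.

```shell
#!/usr/bin/env bash
# Sketch: report which hosts have an agent answering on port 19999.
# Assumes /api/v1/info is reachable from where you run this.
check_agents() {
  local host
  for host in "$@"; do
    if curl -s -o /dev/null --max-time 3 "http://${host}:19999/api/v1/info"; then
      echo "${host}: agent reachable"
    else
      echo "${host}: no response"
    fi
  done
}

# Example (run manually, with your own addresses):
#   check_agents 192.168.1.10 192.168.1.11
```

A host that reports no response may simply have a firewall blocking the port, so treat the result as a hint rather than proof that the installation failed.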

To string together the various agents into a single interface, you'll have to sign up for a free account with the Netdata Cloud service, which will collate and display metrics from all the agents deployed on your network (see the "Netdata Cloud" box).

Netdata Cloud

Netdata Cloud shows you all the computers on your network in a single-screen view. To get started, head to https://app.netdata.cloud and use any of the options to create an account with the service.

Once your account has been created, Netdata Cloud will take you through a brief onboarding process. During this process, you'll be asked to create at least one Space and one War Room. The Spaces and War Rooms help you organize the computers in your network. Think of them as virtual spaces that help you customize how you and your team use Netdata Cloud to monitor your infrastructure.

You can read about the Netdata Cloud's organizational benefits to familiarize yourself with the concept of Spaces and War Rooms [4]. For now, just create one of each as requested. You'll use them later to monitor your computers.

Once you have the login credentials, bring up the dashboard on any of the monitored computers. Click the Sign In button at the top of the dashboard and then enter your Netdata Cloud account credentials when prompted. You'll be redirected back to the Netdata dashboard, which now sports a new menu on the left. The computer will be listed under the Visited Nodes section.

To add more computers, navigate to their respective dashboards and use the Sign In button to connect them to your Netdata Cloud account. As you connect more computers, they'll start populating the Visited Nodes section. Once you've added more than one node, you can switch between them from the menu.

Claim Systems

When you sign into the Netdata Cloud service from the dashboard, your computers will be listed under the Visited Nodes section. You should also take a moment to add computers to the Space you created when signing up for the Netdata Cloud account.

The process of adding a node to a Space is called claiming. Claiming makes sure you have administrative access to the computer and the Netdata Agent running on it. You'll be given a chance to claim a computer during the signup process for your Netdata Cloud account, but you can also do this later from within the Space.

To add a computer to a Space, log into the Netdata Cloud account and select a Space from the list on the left. Now click the green + icon within the Space, which will reveal a rather long command (Figure 2). Copy the command, and paste it in a terminal on the computer you want to claim.

Figure 2: Netdata uses the Agent-Cloud link (ACLK) to securely transmit the metrics from the agent to the Cloud.
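To give a sense of what you are pasting, the sketch below prints the general shape of a claiming command. It assumes the agent's netdata-claim.sh script and its -token, -rooms, and -url options as described in the project's documentation; CLAIM_TOKEN and ROOM_ID are placeholders, and the function name is invented for this example. Always copy the real command from Netdata Cloud, because it carries the token generated for your Space.

```shell
#!/usr/bin/env bash
# Sketch only: Netdata Cloud generates the real command, token included.
# The arguments passed in are placeholders, not real values.
build_claim_command() {
  local token="$1" rooms="$2" url="${3:-https://app.netdata.cloud}"
  printf 'sudo netdata-claim.sh -token=%s -rooms=%s -url=%s\n' \
    "$token" "$rooms" "$url"
}

# Example: print the command shape without running anything.
#   build_claim_command CLAIM_TOKEN ROOM_ID
```

The function deliberately prints the command instead of executing it, so you can compare it with what the Cloud shows before running anything as root.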

After verifying your credentials, the computer will be added to the Space (Figure 3). Now you can repeat the claiming process on every computer you want to add to Netdata Cloud. Remember that you can only claim a computer inside a single Space. Once claimed, however, you can add that computer to multiple War Rooms within the Space.

Figure 3: In addition to the metrics, Netdata also transmits information regarding all configured alarms and their current status to the Cloud.

Dashboard Tour

Now that you've installed the agents on multiple computers and can access them from the Cloud dashboard, it's time to familiarize yourself with the dashboard.

Netdata collects monitoring data from dozens of hardware and software components, such as CPU, memory, disks, networking, filesystems, and more. What makes it even more useful, though, is that Netdata can also collect metrics from hundreds of popular services and applications.

Netdata deploys collectors to gather the metrics. It includes collectors for collating performance data from some of the popular services and apps, such as Apache, NGINX, Tomcat, MySQL, Postgres, MongoDB, Ceph, OpenLDAP, Tor, Docker, and more [5].

All the collected metrics are exposed via the Netdata dashboard as interactive charts. Netdata shows all its charts on a single scrollable page. You can also navigate between the various elements using the menu placed on the dashboard's right-hand side. Note, however, that if you run Netdata on multiple computers that run different operating systems or different versions, the menus might look a little different for each one.

Using the mouse, you can drag the charts to the left or right to move forward and backward through the different time intervals. Similarly, you can change the time markers by holding down the Shift key as you scroll within a chart. To reset a chart to its default view, simply double-click inside it.

The good thing about Netdata's visualization is that when you change the view on one chart, it automatically replicates the same view on the other charts as well. Thanks to this feature, you'll always get a synchronized view of the metrics.

The charts themselves are self-explanatory. At the top, you get an overview of the computer's resources. This is followed by a summary of the computer's CPUs, including their utilization and information about the interrupts handled by each, in addition to other aspects. Similarly, you get real-time information about the system's memory utilization, and so on (Figure 4).

Figure 4: In addition to the built-in collectors, you can pull in additional ones via plugins.

Most of the charts have a brief description that explains the metric they display and why it matters. Unless you're well versed in monitoring Linux/BSD systems, you should spend some time exploring the individual metrics and how they can be used to monitor your systems' health.

Get Alerts

In addition to the active performance monitoring, the Netdata Agent can also help you ensure your systems and applications are healthy by alerting you about possible issues. The Netdata Agent includes dozens of preconfigured alarms that trigger alerts when a monitoring component requires your attention.

As mentioned earlier, these alarms are preconfigured with sensible defaults. Just like Netdata itself, they have been designed by the tool's system administrator community, and they are activated automatically when the agent is installed. That said, while you don't need to edit them, the alarms can be customized to meet your needs.

You can access Netdata's alarm notifications system by clicking the alarms button (the bell icon) at the top of the dashboard. This will bring up a screen that shows the currently raised alarms, along with tabs to view all running alarms, as well as the alarms log (Figure 5).

Figure 5: You can view details about the raised alarms, as well as go to the chart where the alarms were raised for further analysis.
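The same alarm information can be pulled from a terminal. The sketch below assumes the agent's /api/v1/alarms endpoint with its active and all query options, which return JSON; the function name and the example address are hypothetical.

```shell
#!/usr/bin/env bash
# Sketch: fetch alarm status as JSON from an agent.
# Assumes the /api/v1/alarms endpoint; "active" is assumed to list
# raised alarms and "all" to list every configured alarm.
fetch_alarms() {
  local host="${1:-localhost}" scope="${2:-active}"
  curl -s "http://${host}:19999/api/v1/alarms?${scope}"
}

# Examples (run manually):
#   fetch_alarms                    # raised alarms on this machine
#   fetch_alarms 192.168.1.10 all   # every alarm on another machine
```

Piping the output through a JSON formatter of your choice makes it easier to read.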

To tune a default alarm, switch to the All tab. This page will list the various alarms along with their preconfigured settings. The source row in the tab points to the configuration file that controls the settings for a particular alarm (Figure 6). You'll need to edit the file and adjust the settings as per your requirements.

Figure 6: Below every alarm's name is a badge that updates automatically to show the chart's current value.

For instance, the /usr/lib/netdata/conf.d/health.d/ram.conf file controls the alarms related to a computer's physical RAM. By default, Netdata will warn you when the amount of used RAM crosses the 80 percent threshold. You can change this behavior by editing the value in the warn line.
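Before changing a threshold, it helps to see which lines define it. The sketch below prints the alarm names together with their warn and crit lines from the file named above; the function name is made up for illustration. Netdata's documentation also describes an edit-config helper in /etc/netdata that copies a stock file there for editing, so that updates don't overwrite your changes. Treat the invocation in the comment as an assumption to verify on your installation.

```shell
#!/usr/bin/env bash
# Sketch: list alarm names and their warning/critical expressions.
# The default path is the one mentioned in the article.
show_thresholds() {
  local conf="${1:-/usr/lib/netdata/conf.d/health.d/ram.conf}"
  # -n prefixes each match with its line number in the file
  grep -nE '^[[:space:]]*(alarm|template|warn|crit):' "$conf"
}

# Examples (run manually):
#   show_thresholds
#   cd /etc/netdata && sudo ./edit-config health.d/ram.conf   # assumed helper
```

The line numbers in the output tell you where to jump in your editor when you open the file.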

After you've saved the file, you can reload the health monitoring settings with:

$ sudo netdatacli reload-health

See the project's documentation on the health monitoring system [6] to understand the other lines in an alarm's configuration file.

Going Further

Once you've become accustomed to Netdata, it's time to explore the various settings and configure it to meet your requirements. The project has excellent documentation (as referenced throughout the article).

While I've covered most of Netdata's basic features, there's a lot more you can do with it. You can, for instance, export and import snapshots [7] of your dashboard's contents, which helps diagnose major errors and anomalies. You can also create custom dashboards that do a better job of visualizing the metrics in which you are interested.

Despite server monitoring being an already crowded space, Netdata has managed to carve out a niche for itself thanks to its ease of use and customizability. No wonder then that it is one of the most starred projects in the Cloud Native Computing Foundation landscape.

The Author

Mayank Sharma has been writing and reporting on open source software from all over the globe for almost two decades.