Removing a Stale Node in Self-hosted Streamed Netdata

Netdata is a lightweight and user-friendly platform for monitoring infrastructure. This tutorial shows how to remove stale nodes from a self-hosted installation.

Photo by Sam Pak / Unsplash

Netdata is a great platform for monitoring your infrastructure. It is lightweight, feature-complete, and easy to set up and use. I have used it personally and at work for a long time and love it.

One day, I decommissioned an unused server that had Netdata installed in a parent-child streaming configuration. Afterward, I was stuck with a stale node that could not be removed from the UI.

An unused, stale node in Netdata

In Netdata Cloud, you can remove a stale node from the UI: open the Space settings page and click the remove node button next to the stale node.

Node setting page, only visible on Netdata Cloud

Unfortunately, due to cost and compliance requirements, we decided on a self-hosted Netdata configuration with one Netdata parent node and multiple child nodes installed on an internal network with limited external access.

Method #1: netdatacli remove-stale-node

This method was introduced in a recent version of Netdata (the latest at the time of writing) and removes a stale node with a single command. To use it, you need the node ID of the stale node, which you can find in the UI under the Node Information tab.

After you click the node information (i) button, a panel opens on the right. Click the "View node info in JSON" button. A notification stating "JSON copied to clipboard" appears, and you can paste the JSON into a text editor to find the node ID.

Netdata node JSON value

On your Netdata parent server, execute the command below:

netdatacli remove-stale-node [your_node_id]
Response of remove-stale-node command
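Because the node ID is pasted by hand from the copied JSON, it can help to check its shape before running the command. The sketch below assumes the node ID is a UUID in the usual 8-4-4-4-12 hexadecimal form; the function names is_uuid and remove_stale_node_cmd are hypothetical helpers, not part of Netdata. It only prints the command so you can review it first.

```shell
# Sketch: validate a node ID before building the netdatacli command.
# Assumption: node IDs are UUIDs (8-4-4-4-12 hexadecimal digits).
# is_uuid and remove_stale_node_cmd are hypothetical helper names.

# Succeeds (exit status 0) if $1 looks like a UUID.
is_uuid() {
  printf '%s' "$1" | grep -Eq \
    '^[0-9a-fA-F]{8}-[0-9a-fA-F]{4}-[0-9a-fA-F]{4}-[0-9a-fA-F]{4}-[0-9a-fA-F]{12}$'
}

# Prints the removal command for a valid node ID; fails otherwise.
# Nothing is executed here, the command is only printed for review.
remove_stale_node_cmd() {
  if ! is_uuid "$1"; then
    echo "error: '$1' does not look like a node ID" >&2
    return 1
  fi
  echo "netdatacli remove-stale-node $1"
}
```

Calling remove_stale_node_cmd with the ID from the JSON prints the full command, which you can then run on the parent server.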

After running the command and restarting the Netdata service, the stale node should be removed from the UI. If this method does not work for you, because of an older version or an unexpected issue, you can proceed with the second method.

Method #2: Remove the node via the internal database

This method edits the internal SQLite database in the Netdata cache folder. Open the metadata database with the command below, then list the known hosts.

# access the internal database
sudo sqlite3 /var/cache/netdata/netdata-meta.db

-- get host list from metadata (run at the sqlite> prompt)
SELECT quote(host_id), hostname FROM host;
Host list data from SQLite

After running the query, you will see the list of hosts known to the parent, including the internal host_id value of each. Note the host_id of the stale node. To remove it, execute the queries below in the SQLite prompt, replacing the example host_id with your own.

delete from host where host_id = X'A13A087E55D511EE8769577D69007174';
delete from host_label where host_id = X'A13A087E55D511EE8769577D69007174';
delete from host_info where host_id = X'A13A087E55D511EE8769577D69007174';
delete from node_instance where host_id = X'A13A087E55D511EE8769577D69007174';
delete from chart where host_id = X'A13A087E55D511EE8769577D69007174';

Query for node removal; in this case, the staging-2 node will be removed
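The five statements above differ only in the table name, so retyping the host_id five times is an easy place to make a mistake. As a rough sketch, the hypothetical helper host_delete_sql below takes one host_id (with or without the X'...' wrapper), checks that it is 32 hexadecimal digits, and prints the same five statements. The table names are the ones used in this post; the function name and the validation rule are my own assumptions.

```shell
# Sketch: print the DELETE statements for one host_id.
# host_delete_sql is a hypothetical helper name, not a Netdata tool.
# Assumption: host_id is a 16-byte value, written as 32 hex digits.
host_delete_sql() {
  # Strip an optional X'...' wrapper as printed by quote(host_id).
  hex=$(printf '%s' "$1" | sed -e "s/^[Xx]'//" -e "s/'\$//")
  if ! printf '%s' "$hex" | grep -Eq '^[0-9A-Fa-f]{32}$'; then
    echo "error: expected 32 hexadecimal digits, got '$1'" >&2
    return 1
  fi
  # Same tables as the manual queries in this post.
  for table in host host_label host_info node_instance chart; do
    printf "DELETE FROM %s WHERE host_id = X'%s';\n" "$table" "$hex"
  done
}
```

You can review the printed statements and paste them into the SQLite prompt, or, once you are happy with them, pipe the output into sudo sqlite3 /var/cache/netdata/netdata-meta.db.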

sudo service netdata restart

Restart Netdata service after removing node from meta database

That's it: you've successfully removed the stale node from the parent server. I hope this is useful if you're running the same configuration as I am.
