Removing a Stale Node in Self-hosted Streamed Netdata
Netdata is a lightweight and user-friendly platform for monitoring infrastructure. This tutorial shows how to remove stale nodes from a self-hosted setup.
Netdata is a great platform for monitoring your infrastructure. It is lightweight, feature-complete, and easy to set up and use. I have used it for personal and work use for a long time and love it.
One day, I decommissioned an unused server that ran Netdata as a child in a parent-child streaming configuration. Afterwards, I was stuck with a single stale node that could not be removed from the UI.
In Netdata Cloud, you can remove a stale node from the UI by opening the Space settings page and clicking the remove node button next to the stale node.
Unfortunately, due to cost and compliance requirements, we had decided on a self-hosted Netdata configuration, with one Netdata parent node and multiple child nodes installed on an internal network with limited external access, so that option was not available to us.
Method #1: netdatacli remove-stale-node
This method was introduced in a recent version of Netdata and removes a stale node with a single command. To use it, you need the node ID of the stale node, which you can get from the UI through the node information (i) button.
After you click the node information (i) button, a panel opens on the right. Click the "View node info in JSON" button. A notification stating "JSON copied to clipboard" will appear, and you can paste the JSON into a text editor to look up the node ID.
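If the pasted JSON is long, a small script can help narrow down where the ID lives. The sketch below is only an illustration: it assumes the node ID is a UUID-shaped string, and the sample document with its `example_id` key is hypothetical, not the real layout of Netdata's node info JSON. It lists every UUID-shaped value together with its key path so you can pick the right one.

```python
import json
import re

# Matches the canonical 8-4-4-4-12 hex layout of a UUID.
UUID_RE = re.compile(
    r"^[0-9a-f]{8}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{12}$",
    re.IGNORECASE,
)


def find_uuid_fields(data, path=""):
    """Return (key_path, value) pairs for every UUID-shaped string in the JSON."""
    found = []
    if isinstance(data, dict):
        for key, value in data.items():
            child_path = f"{path}.{key}" if path else key
            found.extend(find_uuid_fields(value, child_path))
    elif isinstance(data, list):
        for index, value in enumerate(data):
            found.extend(find_uuid_fields(value, f"{path}[{index}]"))
    elif isinstance(data, str) and UUID_RE.match(data):
        found.append((path, data))
    return found


# Hypothetical sample; replace it with the JSON you pasted from the UI.
sample = json.loads(
    '{"info": {"example_id": "123e4567-e89b-12d3-a456-426614174000",'
    ' "name": "child-01"}, "tags": ["a"]}'
)

for key_path, value in find_uuid_fields(sample):
    print(key_path, value)
```

If several UUID-shaped values show up, the key name printed next to each one should tell you which is the node ID.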
On your Netdata parent server, execute the command below:
netdatacli remove-stale-node [your_node_id]
After you run the command and restart the Netdata service, the stale node should be gone from the UI. If this method does not work for you, because of an older Netdata version or an unexpected issue, proceed with the second method.
Method #2: Remove the node via the internal database
This method requires access to the internal SQLite metadata database in the Netdata cache folder. Open the database and list the known hosts with the commands below.
# open the internal metadata database (run in your shell)
sudo sqlite3 /var/cache/netdata/netdata-meta.db
-- list the known hosts (run at the sqlite> prompt)
SELECT quote(host_id), hostname FROM host;
After running the query, you will see the list of known nodes, including each node's internal host_id value. Note the host_id of the stale node. To begin removing the stale node, execute the query below in the SQLite prompt.
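As a minimal sketch of what a delete by host_id can look like, the example below works on a toy in-memory database, not on the real netdata-meta.db. It assumes only what the SELECT above shows: a `host` table with `host_id` and `hostname` columns. Storing `host_id` as a BLOB is an assumption, suggested by the use of `quote()`, which prints a BLOB as an `X'...'` hex literal. The `remove_host` helper and the sample rows are hypothetical.

```python
import sqlite3


def remove_host(conn, host_id_hex):
    """Delete one row from the host table by hex-encoded host_id.

    Returns the number of rows deleted.
    """
    host_id = bytes.fromhex(host_id_hex)
    with conn:  # commits on success, rolls back on error
        cursor = conn.execute("DELETE FROM host WHERE host_id = ?", (host_id,))
    return cursor.rowcount


# Toy stand-in for the metadata database (assumed, simplified schema).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE host (host_id BLOB PRIMARY KEY, hostname TEXT)")
conn.executemany(
    "INSERT INTO host VALUES (?, ?)",
    [
        (bytes.fromhex("aa" * 16), "parent"),
        (bytes.fromhex("bb" * 16), "stale-child"),
    ],
)
conn.commit()

# Same listing query as in the article: quote() renders each BLOB as X'...'.
for quoted, hostname in conn.execute("SELECT quote(host_id), hostname FROM host"):
    print(quoted, hostname)

# Remove the stale entry using the hex digits between the quotes.
deleted = remove_host(conn, "bb" * 16)
remaining = [row[0] for row in conn.execute("SELECT hostname FROM host")]
print(deleted, remaining)
```

At the sqlite> prompt, the equivalent statement under the same assumptions would take the form `DELETE FROM host WHERE host_id = X'...';` with the literal copied from the SELECT output. The real database may hold other tables that reference the same host_id, so copy the file somewhere safe before changing it.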
That's it; you've successfully removed the stale node from the parent server. I hope this is useful if you're running the same configuration as I am.