Storytime.

Our Prometheus node, one of our oldest systems (somehow fits the Titan reference..), is about to be relieved of its duties after several years of loyal service to the crew.

We decided to add another Prometheus node to the ring, running side by side with the old one, so that the new one can start collecting the metrics we need for alerting (some historic metrics are needed too..). Sort of a Prometheus cluster, without the cluster fun and with 2 different Prometheus versions.

The problems with this? Well, it's not the new node or the latest shit versions of Prometheus per se.

1: The node exporter.
Those dudes decided to make some breaking changes in a minor update, so you need to run some magic bullshittery before the latest Prometheus can make sense of the old metrics provided by the old node exporters.
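
To give an idea of what that kind of translation looks like, here is a minimal Python sketch. It assumes the breaking change in question is the metric renaming that came with node exporter 0.16 (the post itself doesn't say which release it was), and the table is only a small, hand-picked subset of renames. The helper names `RENAMES`, `normalize_metric_name` and `normalize_sample` are made up for illustration. In a real setup you would more likely do this with `metric_relabel_configs` or recording rules on the Prometheus side.

```python
# Hypothetical subset of old -> new metric names, ASSUMING the breaking
# change is the node exporter 0.16 renaming. Not a complete list.
RENAMES = {
    "node_cpu": "node_cpu_seconds_total",
    "node_boot_time": "node_boot_time_seconds",
    "node_memory_MemTotal": "node_memory_MemTotal_bytes",
    "node_filesystem_free": "node_filesystem_free_bytes",
    "node_network_receive_bytes": "node_network_receive_bytes_total",
}


def normalize_metric_name(name: str) -> str:
    """Return the new-style name for an old metric, or the name unchanged."""
    return RENAMES.get(name, name)


def normalize_sample(line: str) -> str:
    """Rewrite the metric name at the start of one exposition-format line.

    Comment lines (# HELP / # TYPE) and empty lines are passed through as-is.
    Labels and values are not touched.
    """
    if not line or line.startswith("#"):
        return line
    # The metric name ends at the first '{' (labels) or ' ' (value).
    end = len(line)
    for sep in ("{", " "):
        idx = line.find(sep)
        if idx != -1:
            end = min(end, idx)
    return normalize_metric_name(line[:end]) + line[end:]
```

Feeding it a line such as `node_boot_time 1.5e9` gives back the same sample under the name `node_boot_time_seconds`, while metrics that are not in the table pass through untouched.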

2: The related Puppet code.
The node definitions for Prometheus were built via exported resources on the target nodes.
The code worked like a charm with only one Prometheus node, but try doing the same thing with two instances.

Still WIP, but some targets are already included in the new Prometheus instance.
Alerting works so far.

Can't wait to close this ticket for good..
