Add data deduplication from HA Prometheus pair based on `--query.replica-label` arg similar to Thanos Query


VM seems great, and we'd like to replace our Thanos setup (for 6 K8s clusters) with it. We understand how to label series with different labels to separate per-cluster data, but I still struggle with data deduplication. To drastically reduce infrastructure costs, we run multiple Prometheus instances on preemptible hosts. With Thanos Query's dedup capabilities this works fine: we don't need to take any extra action to get each metric only once from all Prometheus instances for a given cluster. In addition, dedup handles partial responses for us.

Is there any chance dedup could be included in VM in the foreseeable future, or is this feature irrelevant for this case?


Answer by valyala

> What if two Prometheus instances remote_write the same metrics with the same labels to a VictoriaMetrics server? Does VictoriaMetrics return the latest value or both? And if one of the Prometheus instances goes down, do the metrics simply arrive half as often?

VictoriaMetrics stores all the data points it receives for the same time series, regardless of which client sent them. It doesn't do any deduplication on the received data. This may be OK in certain cases, but it usually hurts the compression ratio and may break queries that use functions such as `rate`, `count_over_time`, and `sum_over_time`.
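To make the effect on such queries concrete, here is a minimal Python sketch. It is a simplified model, not VictoriaMetrics or Prometheus code: the helper functions `count_over_time` and `sum_over_time` are hypothetical stand-ins for the query functions of the same name, and the sample values, timestamps, and the 5-second offset between replicas are assumptions chosen for illustration.

```python
from typing import List, Tuple

# A sample is (timestamp in seconds, value). All data below is made up.
Sample = Tuple[int, float]


def count_over_time(samples: List[Sample], start: int, end: int) -> int:
    """Count samples whose timestamp falls in the window (start, end]."""
    return sum(1 for ts, _ in samples if start < ts <= end)


def sum_over_time(samples: List[Sample], start: int, end: int) -> float:
    """Sum sample values whose timestamp falls in the window (start, end]."""
    return sum(v for ts, v in samples if start < ts <= end)


# Two HA replicas scrape the same gauge every 15s, 5s apart from each other.
replica_a: List[Sample] = [(15, 2.0), (30, 2.0), (45, 2.0), (60, 2.0)]
replica_b: List[Sample] = [(ts + 5, v) for ts, v in replica_a]

# Both replicas write the series with identical labels, so without
# deduplication the storage holds the union of their samples.
merged: List[Sample] = sorted(replica_a + replica_b)

print(count_over_time(replica_a, 0, 70))  # 4
print(count_over_time(merged, 0, 70))     # 8
print(sum_over_time(replica_a, 0, 70))    # 8.0
print(sum_over_time(merged, 0, 70))       # 16.0
```

With both replicas up, the merged series reports twice the count and twice the sum of a single replica. If one replica goes down, the results fall back to the single-replica values, so the query output changes even though the underlying metric did not.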

