Björn Rabenstein beorn7 @grafana Berlin @prometheus developer working at @grafana. Alumnus of @soundcloud and @google. Hope is not a strategy – but it springs eternal.

beorn7/perks 55

Effective Computation of Things

beorn7/talks 45

Public speaking since 2015

beorn7/concurrentcount 10

Experiments to benchmark implementations of a concurrent counter.

beorn7/histogram_experiments 8

Simulated Prometheus histograms from real-world datasets.

beorn7/promhacks 4

Quick & dirty stuff for working with @Prometheus

beorn7/midgard-szenarien 1

Scenarios for the Midgard RPG (in German)

beorn7/rrsim 1

Simulate rolling restarts of many tasks exposing a Prometheus request counter.

beorn7/rsmod 1

Skeleton of moderator code for Rolling Stock

beorn7/thanos 1

Highly available Prometheus setup with long term storage capabilities.

created repository caddy-dns/alidns

Caddy module: dns.providers.alidns

created time in 10 minutes

created repository libdns/alidns

AliDNS provider for libdns

created time in 13 minutes

PR opened prometheus/prometheus

UI: Remove useless else-if

That else-if can never be reached.

Signed-off-by: Julien Pivotto roidelapluie@inuits.eu


+0 -2

0 comment

1 changed file

pr created time in an hour

started tateru-io/tateru

started time in 2 hours

started sharkdp/hyperfine

started time in 2 hours

started danistefanovic/build-your-own-x

started time in 2 hours


issue opened prometheus/alertmanager

How to disable Watchdog initial notification in Alertmanager


I installed Alertmanager in my EKS cluster along with Prometheus and set up some alerts. They all work fine except one annoying alert that fires every time: the Watchdog notification, which signals that the entire pipeline is working. I know it's an important alert, but we have one receiver that accepts all kinds of alerts, and it's really annoying to get notified at 12pm only to see that one alert. I tried to get rid of it by redirecting it to a null receiver, but it doesn't work.

Expected behavior: be able to disable the Watchdog alert.

Observed behavior: the Watchdog alert keeps firing all the time.

Environment

  • System information:

    EKS cluster v1.16

  • Alertmanager version:

    v0.21.0

  • Prometheus version:

    v2.21.0

  • Alertmanager configuration file:

config:
    global:
      resolve_timeout: 5m
    route:
      group_by: ['job']
      group_wait: 30s
      group_interval: 5m
      repeat_interval: 4h
      receiver: prometheus-msteams
      routes:
      - match:
          alertname: Watchdog
        receiver: prometheus-msteams
    receivers:
    - name: prometheus-msteams
      webhook_configs:
      - url: "http://prometheus-msteams:2000/alertmanager"
        send_resolved: true
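Note that the Watchdog route in the config above still points at prometheus-msteams, so the alert keeps being delivered. A sketch of a route that actually silences it, assuming the standard Alertmanager semantics that a receiver with no notification configurations sends nothing (the name "null" is only a convention, not special syntax):

```yaml
route:
  receiver: prometheus-msteams
  routes:
  - match:
      alertname: Watchdog
    receiver: "null"          # send Watchdog here instead of the real receiver
receivers:
- name: prometheus-msteams
  webhook_configs:
  - url: "http://prometheus-msteams:2000/alertmanager"
    send_resolved: true
- name: "null"                # no notification configs: matching alerts go nowhere
```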

created time in 3 hours

PR opened prometheus/docs

Add Vector to tools that export Prometheus metrics

Vector has both a Prometheus source and sink and a remote_write source and sink.

@brian-brazil

+1 -0

0 comment

1 changed file

pr created time in 3 hours

created repository roidelapluie/eira8OoK

created time in 3 hours

issue opened prometheus/haproxy_exporter

crashes in http.ListenAndServe (while accepting new connections)

Version: 0.11.0

It crashes for me every few hours during the ListenAndServe call, apparently while trying to accept new incoming connections:

runtime: checkdead: find g 491671 in status 1
fatal error: checkdead: runnable g

runtime stack:
runtime.throw(0x9f811e, 0x15)
	/usr/local/go/src/runtime/panic.go:1116 +0x72
runtime.checkdead()
	/usr/local/go/src/runtime/proc.go:4407 +0x390
runtime.mput(...)
	/usr/local/go/src/runtime/proc.go:4824
runtime.stopm()
	/usr/local/go/src/runtime/proc.go:1832 +0x95
runtime.exitsyscall0(0xc000001500)
	/usr/local/go/src/runtime/proc.go:3268 +0x111
runtime.mcall(0x0)
	/usr/local/go/src/runtime/asm_amd64.s:318 +0x5b

goroutine 1 [IO wait, 25002 minutes]:
internal/poll.runtime_pollWait(0x7fb060191f18, 0x72, 0x0)
	/usr/local/go/src/runtime/netpoll.go:203 +0x55
internal/poll.(*pollDesc).wait(0xc0000b8918, 0x72, 0x0, 0x0, 0x9ef93b)
	/usr/local/go/src/internal/poll/fd_poll_runtime.go:87 +0x45
internal/poll.(*pollDesc).waitRead(...)
	/usr/local/go/src/internal/poll/fd_poll_runtime.go:92
internal/poll.(*FD).Accept(0xc0000b8900, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0)
	/usr/local/go/src/internal/poll/fd_unix.go:384 +0x1d4
net.(*netFD).accept(0xc0000b8900, 0xf0146b39697dc950, 0x1000000000000, 0xf0146b39697dc950)
	/usr/local/go/src/net/fd_unix.go:238 +0x42
net.(*TCPListener).accept(0xc000130ac0, 0x5fb30338, 0xc00004bb60, 0x4cf116)
	/usr/local/go/src/net/tcpsock_posix.go:139 +0x32
net.(*TCPListener).Accept(0xc000130ac0, 0xc00004bbb0, 0x18, 0xc000000180, 0x6c79fc)
	/usr/local/go/src/net/tcpsock.go:261 +0x64
net/http.(*Server).Serve(0xc00014a000, 0xaba1a0, 0xc000130ac0, 0x0, 0x0)
	/usr/local/go/src/net/http/server.go:2901 +0x25d
net/http.(*Server).ListenAndServe(0xc00014a000, 0xc00014a000, 0x1)
	/usr/local/go/src/net/http/server.go:2830 +0xb7
net/http.ListenAndServe(...)
	/usr/local/go/src/net/http/server.go:3086
main.main()
	/app/haproxy_exporter.go:616 +0x1788

goroutine 491668 [IO wait, 1 minutes]:
internal/poll.runtime_pollWait(0x7fb060191e38, 0x72, 0xffffffffffffffff)
	/usr/local/go/src/runtime/netpoll.go:203 +0x55
internal/poll.(*pollDesc).wait(0xc0006b1298, 0x72, 0x0, 0x1, 0xffffffffffffffff)
	/usr/local/go/src/internal/poll/fd_poll_runtime.go:87 +0x45

created time in 3 hours

issue opened prometheus/statsd_exporter

reliability: option to safeguard against cardinality explosion

Prometheus doesn't guard against cardinality explosions by default, so when applications send a lot of metrics by mistake, both the statsd exporter and Prometheus can get overwhelmed: the statsd exporter crashes due to memory pressure, and Prometheus, ingesting thousands of series, either crashes or suffers slow ingestion and slow queries.

So essentially I'm proposing a CLI option to keep only the first N metric series received and simply drop the rest. Any thoughts on this?
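The proposed safeguard amounts to a capped series map. A minimal sketch in Go (the type and method names here are hypothetical, not part of statsd_exporter):

```go
package main

import "fmt"

// boundedSeries tracks at most maxSeries distinct series keys;
// samples for any new series beyond the cap are silently dropped.
type boundedSeries struct {
	maxSeries int
	counts    map[string]float64
}

func newBoundedSeries(max int) *boundedSeries {
	return &boundedSeries{maxSeries: max, counts: make(map[string]float64)}
}

// Observe adds v to the series identified by key, creating the series
// only if the cap has not been reached. It reports whether the sample
// was accepted.
func (b *boundedSeries) Observe(key string, v float64) bool {
	if _, ok := b.counts[key]; !ok {
		if len(b.counts) >= b.maxSeries {
			return false // cap reached: drop the new series
		}
	}
	b.counts[key] += v
	return true
}

func main() {
	b := newBoundedSeries(2)
	fmt.Println(b.Observe("a", 1)) // true
	fmt.Println(b.Observe("b", 1)) // true
	fmt.Println(b.Observe("c", 1)) // false: third distinct series is dropped
	fmt.Println(b.Observe("a", 1)) // true: existing series still accepted
}
```

Existing series keep working after the cap is hit; only previously unseen label sets are rejected, which keeps dashboards for established metrics intact during an explosion.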

created time in 4 hours

created repository tateru-io/tateru

Tateru Project

created time in 4 hours

started kitech/qt.go

started time in 5 hours

started dastergon/kubectl-diagnose

started time in 6 hours

pull request comment prometheus/client_java

Make DropwizardExports methods protected

Each of our service instances normally exports around 5K metrics. The extra 4 quantiles make that 20K, and with 5 running instances it reaches a hundred thousand series, which is not a problem per se, but it requires a slightly bigger Prometheus server instance to avoid crashing.

It would be nice to have a way to control how Dropwizard's histograms are presented in Prometheus, a protected method of some sort.

Alexey1Gavrilov

comment created time in 6 hours


issue comment prometheus/client_java

Split package in simpleclient.servlet and simpleclient.pushgateway

Basically, you cannot build a modular Java application with a module that imports classes from both the simpleclient.pushgateway and simpleclient.servlet modules.

Here is a repo reproducing the problem: https://github.com/Alexey1Gavrilov/java-module-issues/tree/master/promethues. Compilation fails with the following error:

> mvn compile
...
[ERROR] COMPILATION ERROR :
[INFO] -------------------------------------------------------------
[ERROR] the unnamed module reads package io.prometheus.client.exporter from both simpleclient.servlet and simpleclient.pushgateway
[ERROR] module simpleclient.pushgateway reads package io.prometheus.client.exporter from both simpleclient.servlet and simpleclient.pushgateway
[ERROR] module simpleclient.servlet reads package io.prometheus.client.exporter from both simpleclient.pushgateway and simpleclient.servlet
[ERROR] module simpleclient reads package io.prometheus.client.exporter from both simpleclient.servlet and simpleclient.pushgateway
[ERROR] /Users/agavrilov/Work/git/java-module-issues/promethues/src/main/java/module-info.java:[1,1] module javamodule.promethues reads package io.prometheus.client.exporter from both simpleclient.servlet and simpleclient.pushgateway
[INFO] 5 errors
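For illustration, the failing setup boils down to a module descriptor like this sketch (the module name `javamodule.promethues` is taken from the error output above; the `requires` names are the automatic module names JPMS derives from the jars):

```java
module javamodule.promethues {
    requires simpleclient;
    // Both of these automatic modules contain the package
    // io.prometheus.client.exporter, so JPMS rejects reading
    // them from the same module (a "split package"):
    requires simpleclient.servlet;
    requires simpleclient.pushgateway;
}
```

JPMS requires every package to be readable from exactly one module, which is why the error is reported for each module on the path that can see both jars.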

A possible workaround for a modular application is to avoid importing from both in one application module, i.e. have two separate submodules, each requiring only one of simpleclient.pushgateway or simpleclient.servlet.

Generally speaking, splitting packages across .jar files should be avoided. This SO question is a good source of info.
I think it makes sense to rename the package in a major version update. It should not cause much trouble for users.

Alexey1Gavrilov

comment created time in 6 hours

started tomnomnom/gron

started time in 6 hours

started stretchr/testify

started time in 7 hours

started google/go-intervals

started time in 10 hours

started google/minions

started time in 10 hours

started google/monologue

started time in 10 hours

started google/triemap

started time in 10 hours

started google/gnostic-grpc

started time in 10 hours

started google/martian

started time in 10 hours

started google/jsonapi

started time in 10 hours

started google/addlicense

started time in 10 hours

started google/gnostic-go-generator

started time in 10 hours

started google/alertmanager-irc-relay

started time in 10 hours
