profile
Matthias Loibl (metalmatze) · Red Hat · Berlin · https://matthiasloibl.com
Software Engineer working on monitoring with Prometheus and Kubernetes at Red Hat CoreOS. Interested in web development, distributed systems and metal.

gopasspw/gopass 3714

The slightly more awesome standard unix password manager for teams

metalmatze/alertmanager-bot 406

Bot for Prometheus' Alertmanager

conprof/conprof 389

Continuous profiling for pprof-compatible profiles.

brancz/kube-rbac-proxy 222

Kubernetes RBAC authorizing HTTP proxy for a single upstream.

brancz/kubernetes-grafana 172

The future of Grafana on Kubernetes with Prometheus.

justwatchcom/github-releases-notifier 114

Receive Slack notifications for new releases of your favorite software on GitHub.

go-pluto/pluto 53

A distributed IMAP server based on Conflict-free Replicated Data Types.

drone/drone-kubernetes-runtime 51

Moved to drone/drone-runtime

metalmatze/awesome-jsonnet 20

A curated list of awesome Jsonnet projects and mixins

started mssun/passforios

started time in 2 hours

fork tboerger/easy-novnc

Single-binary noVNC instance, web UI, and multi-host proxy.

fork time in 4 hours

created repository rolehippie/novnc

created time in 4 hours

issue comment thanos-io/thanos

Lock for compactors

Hello 👋 Looks like there was no activity on this issue for the last two months. Do you mind updating us on the status? Is this still reproducible or needed? If yes, just comment on this PR or push a commit. Thanks! 🤗 If there is no activity in the next two weeks, this issue will be closed (we can always reopen an issue if we need!). Alternatively, use the `remind` command if you wish to be reminded at some point in the future.

sylr

comment created time in 4 hours

started photopea/photopea

started time in 6 hours

push event thanos-io/thanos

Giedrius Statkevičius

commit sha d57813b2bc9b349842e1f9a06313731b005c6e00

compact: do not cleanup blocks on boot (#3532)

Do not cleanup blocks on boot, because in some very bad cases there could be thousands of blocks ready to be deleted, and doing that makes Thanos Compact exceed `initialDelaySeconds` on k8s.

Signed-off-by: Giedrius Statkevičius <giedriuswork@gmail.com>


push time in 6 hours

PR merged thanos-io/thanos

compact: do not cleanup blocks on boot

Labels: component: compact

Do not clean up blocks on boot, because in some very bad cases there could be thousands of blocks ready to be deleted, and doing that makes Thanos Compact exceed initialDelaySeconds on k8s.

Signed-off-by: Giedrius Statkevičius <giedriuswork@gmail.com>

  • [x] I added CHANGELOG entry for this change.
  • [ ] Change is not relevant to the end user.

Changes

  • Removed cleanup on boot; added a metric that shows how many cleanup loops were performed, and used that metric in tests.

Verification

Updated the e2e tests; they pass.

Fixes #3395.
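For context, a minimal jsonnet sketch (not part of the PR) of the kind of Kubernetes probe this change guards against; the health path and port assume Thanos defaults:

{
  livenessProbe: {
    httpGet: {
      path: '/-/healthy',  // Thanos health endpoint
      port: 10902,         // default Thanos HTTP port
    },
    // If boot-time cleanup has to walk thousands of deletable blocks, the
    // process may still be starting when this delay expires, and Kubernetes
    // will restart the pod before it ever becomes healthy.
    initialDelaySeconds: 10,
    periodSeconds: 30,
    failureThreshold: 4,
  },
}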

+23 -25

5 comments

3 changed files

GiedriusS

pr closed time in 6 hours

started metalmatze/slo-libsonnet

started time in 8 hours

pull request comment openshift/telemeter

Bug 1803106: Backport cam_app_workload_migrations metric to release-4.3

@djwhatle: This pull request references Bugzilla bug 1803106. The bug has been updated to no longer refer to the pull request using the external bug tracker. All external bug links have been closed. The bug has been moved to the NEW state.


In response to this:

Bug 1803106: Backport cam_app_workload_migrations metric to release-4.3

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

djwhatle

comment created time in 9 hours

pull request commentopenshift/telemeter

Bug 1803106: Backport cam_app_workload_migrations metric to release-4.3

@openshift-bot: Closed this PR.


In response to this:

Rotten issues close after 30d of inactivity.

Reopen the issue by commenting /reopen. Mark the issue as fresh by commenting /remove-lifecycle rotten. Exclude this issue from closing again by commenting /lifecycle frozen.

/close

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

djwhatle

comment created time in 9 hours

PR closed openshift/telemeter

Bug 1803106: Backport cam_app_workload_migrations metric to release-4.3

Labels: bugzilla/valid-bug lifecycle/rotten size/S

Follow-up to https://github.com/openshift/telemeter/pull/298; a backport of the cam_app_workload_migrations metric was requested by Clayton per https://github.com/openshift/telemeter/pull/298#issuecomment-577158997.

+14 -6

20 comments

7 changed files

djwhatle

pr closed time in 9 hours

pull request comment openshift/telemeter

Bug 1803106: Backport cam_app_workload_migrations metric to release-4.3

Rotten issues close after 30d of inactivity.

Reopen the issue by commenting /reopen. Mark the issue as fresh by commenting /remove-lifecycle rotten. Exclude this issue from closing again by commenting /lifecycle frozen.

/close

djwhatle

comment created time in 9 hours

started google/go-intervals

started time in 9 hours

started google/minions

started time in 9 hours

started google/monologue

started time in 9 hours

started google/triemap

started time in 9 hours

push event thanos-io/thanos

Paweł Krupa

commit sha 28af4b6003450b31b2a1ab80a54d8bacbc0637ff

mixin/alerts: remove .rules suffix from alert groups (#3542)

* mixin/alerts: remove .rules suffix from alert groups
  Signed-off-by: paulfantom <pawel@krupa.net.pl>
* examples/alerts: regenerate
  Signed-off-by: paulfantom <pawel@krupa.net.pl>
* pkg/rules: adjust rule groups in tests
  Signed-off-by: paulfantom <pawel@krupa.net.pl>


push time in 9 hours

PR merged thanos-io/thanos

mixin/alerts: remove .rules suffix from alert groups


  • [ ] I added CHANGELOG entry for this change.
  • [ ] Change is not relevant to the end user.

Changes

Removed the .rules suffix from alert groups in the Thanos mixin. This is similar to how node_exporter handles alert groups.

Verification

Run `make examples` and look at the output.

Fixes #3347

/cc @kakkoyun
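For illustration, the same renaming can also be sketched as a consumer-side jsonnet overlay (a minimal sketch; the 'thanos-mixin' import path is an assumption, matching the issue below):

local alerts = import 'thanos-mixin/alerts.jsonnet';

alerts {
  // Strip the '.rules' suffix from each alert group name so it no longer
  // collides with the recording-rule group of the same name.
  groups: [
    g { name: std.strReplace(g.name, '.rules', '') }
    for g in super.groups
  ],
}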

+40 -40

0 comments

11 changed files

paulfantom

pr closed time in 9 hours

issue closed thanos-io/thanos

mixin: alerts and rules using same groupname


Thanos, Prometheus and Golang version used:

mixin folder coming from the master branch.

What happened:

Prometheus complains that the thanos-query.rules, thanos-receive.rules, thanos-store.rules and thanos-bucket-replicate.rules group names are repeated in the same file.

What you expected to happen:

That the alerts and rules mixins can be imported with prometheus-operator/kube-prometheus.

How to reproduce it (as minimally and precisely as possible):

Import the Thanos rules and alerts mixins with a prometheus-operator/kube-prometheus deployment.

jsonnetfile.json:

{
  "version": 1,
  "dependencies": [
    {
      "source": {
        "git": {
          "remote": "https://github.com/prometheus-operator/kube-prometheus.git",
          "subdir": "jsonnet/kube-prometheus"
        }
      },
      "version": "master"
    },
    {
      "source": {
        "git": {
          "remote": "https://github.com/thanos-io/thanos.git",
          "subdir": "mixin"
        }
      },
      "version": "master",
      "name": "thanos-mixin"
    }
  ],
  "legacyImports": true
}

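Dependencies declared in a jsonnetfile.json like this are typically fetched with jsonnet-bundler, e.g. by running jb install.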

kube-prometheus.jsonnet:

local kp =
  (import 'kube-prometheus/kube-prometheus.libsonnet') +
  (import 'kube-prometheus/kube-prometheus-all-namespaces.libsonnet') +
  // Uncomment the following imports to enable its patches
  (import 'kube-prometheus/kube-prometheus-anti-affinity.libsonnet') +
  (import 'kube-prometheus/kube-prometheus-managed-cluster.libsonnet') +
  // (import 'kube-prometheus/kube-prometheus-node-ports.libsonnet') +
  // (import 'kube-prometheus/kube-prometheus-static-etcd.libsonnet') +
  (import 'kube-prometheus/kube-prometheus-thanos-sidecar.libsonnet') +
  (import 'kube-prometheus/kube-prometheus-custom-metrics.libsonnet') +
  // prometheusAlertsFilter + prometheusAlertsUpdate +
  {
    _config+:: {
      namespace: 'monitoring',

      versions+:: {
        alertmanager: "v0.21.0",
        nodeExporter: "v1.0.1",
        kubeStateMetrics: "1.9.7",
        prometheusOperator: "v0.42.1",
        prometheus: "v2.22.0",
        grafana: '7.2.1',
        thanos: 'v0.16.0-rc.1',
      },
      ...
    },
    ...
    prometheusRules+:: 
      (import 'thanos-mixin/rules.jsonnet') +
      {
        // groups+: [],
      },
    prometheusAlerts+:: 
      (import 'thanos-mixin/alerts.jsonnet') +
      {
        // cat existingrule.yaml | gojsontoyaml -yamltojson > existingrule.json
        // groups+: (import 'existingrule.json').groups,
      },
  };


Full logs to relevant components:

Logs:

level=error ts=2020-10-21T10:50:54.930Z caller=manager.go:946 component="rule manager" msg="loading groups failed" err="/etc/prometheus/rules/prometheus-k8s-rulefiles-0/monitoring-prometheus-k8s-rules.yaml: 2142:3: groupname: \"thanos-query.rules\" is repeated in the same file"
level=error ts=2020-10-21T10:50:54.930Z caller=manager.go:946 component="rule manager" msg="loading groups failed" err="/etc/prometheus/rules/prometheus-k8s-rulefiles-0/monitoring-prometheus-k8s-rules.yaml: 2236:3: groupname: \"thanos-receive.rules\" is repeated in the same file"
level=error ts=2020-10-21T10:50:54.930Z caller=manager.go:946 component="rule manager" msg="loading groups failed" err="/etc/prometheus/rules/prometheus-k8s-rulefiles-0/monitoring-prometheus-k8s-rules.yaml: 2352:3: groupname: \"thanos-store.rules\" is repeated in the same file"
level=error ts=2020-10-21T10:50:54.930Z caller=manager.go:946 component="rule manager" msg="loading groups failed" err="/etc/prometheus/rules/prometheus-k8s-rulefiles-0/monitoring-prometheus-k8s-rules.yaml: 2592:3: groupname: \"thanos-bucket-replicate.rules\" is repeated in the same file"
level=error ts=2020-10-21T10:50:54.930Z caller=main.go:881 msg="Failed to apply configuration" err="error loading rules, previous rule set restored"


Anything else we need to know:
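For illustration, a minimal jsonnet sketch (simplified; group names taken from the logs above) of how the collision arises: the recording rules and the alerts for a component carried the same group name, so concatenating the two mixins into one rule file repeats it.

local rules = { groups: [{ name: 'thanos-query.rules', rules: [/* recording rules */] }] };
local alerts = { groups: [{ name: 'thanos-query.rules', rules: [/* alerts */] }] };

// Merging the two, as kube-prometheus does, yields two groups with the same
// name in one file, which Prometheus rejects when loading its rules.
{ groups: rules.groups + alerts.groups }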

closed time in 9 hours

sdurrheimer

started google/gnostic-grpc

started time in 9 hours

started google/martian

started time in 9 hours

started google/jsonapi

started time in 9 hours

started google/addlicense

started time in 9 hours

started google/gnostic-go-generator

started time in 9 hours

started google/alertmanager-irc-relay

started time in 9 hours

started google/go-licenses

started time in 9 hours

started google/ts-bridge

started time in 9 hours

started google/nixery

started time in 9 hours

started google/readahead

started time in 9 hours

started google/truestreet

started time in 10 hours
