brancz/ambench 5

Tool to perform load tests on the Prometheus Alertmanager project.

brancz/base-app 2

Rails base app, Rspec, Devise, Dynamic Role System with CanCan and AngularJS

brancz/coredns-jsonnet 1

Jsonnet code to render Kubernetes manifests for coredns.

brancz/coverageanalysis 1

Analyzing coverage reports with Go

adracus/node-chat 0

Node chat (for fun, without security concern)

adracus/pickmeup 0

Pick me up app for codefest8

brancz/alertmanager 0

Prometheus Alertmanager

brancz/alpine-rails 0

A lightweight Rails image based on Alpine Linux

PR opened prometheus/prometheus

UI: Remove useless else-if

That else-if can never be reached.

Signed-off-by: Julien Pivotto <roidelapluie@inuits.eu>

<!-- Don't forget!

- If the PR adds or changes a behaviour or fixes a bug of an exported API it would need a unit/e2e test.

- Where possible use only exported APIs for tests to simplify the review and make it as close as possible to an actual library usage.

- No tests are needed for internal implementation changes.

- Performance improvements would need a benchmark test to prove it.

- All exposed objects should have a comment.

- All comments should start with a capital letter and end with a full stop.

-->

+0 -2

0 comment

1 changed file

pr created time in an hour

issue opened prometheus/alertmanager

How to disable Watchdog initial notification in Alertmanager

<!--

Please do *NOT* ask usage questions in Github issues.

If your issue is not a feature request or bug report use:
https://groups.google.com/forum/#!forum/prometheus-users. If
you are unsure whether you hit a bug, search and ask in the
mailing list first.

You can find more information at: https://prometheus.io/community/

-->

I installed Alertmanager in my EKS cluster along with Prometheus and set up some alerts. They all work fine except for one annoying alert that fires every time: the Watchdog notification, which indicates that the entire alerting pipeline is functional. I know it's an important alert, but we have one receiver that accepts all kinds of alerts, and it's really annoying to be notified at 12 p.m. only to see that one alert. I tried to get rid of it by routing it to a null receiver, but it doesn't work.

Disable the Watchdog alert

The Watchdog alert keeps firing all the time

Environment

  • System information:

    EKS cluster v1.16

  • Alertmanager version:

    v0.21.0

  • Prometheus version:

    v2.21.0

  • Alertmanager configuration file:

config:
    global:
      resolve_timeout: 5m
    route:
      group_by: ['job']
      group_wait: 30s
      group_interval: 5m
      repeat_interval: 4h
      receiver: prometheus-msteams
      routes:
      - match:
          alertname: Watchdog
        receiver: prometheus-msteams
    receivers:
    - name: prometheus-msteams
      webhook_configs:
      - url: "http://prometheus-msteams:2000/alertmanager"
        send_resolved: true
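Note that the route above still sends Watchdog to the prometheus-msteams receiver, which is why the notifications keep coming. A sketch of the null-receiver approach the reporter mentions, assuming the same Helm-style values layout (illustrative, not tested against this deployment):

```yaml
config:
    global:
      resolve_timeout: 5m
    route:
      group_by: ['job']
      group_wait: 30s
      group_interval: 5m
      repeat_interval: 4h
      receiver: prometheus-msteams
      routes:
      - match:
          alertname: Watchdog
        receiver: "null"        # route Watchdog here instead of prometheus-msteams
    receivers:
    - name: prometheus-msteams
      webhook_configs:
      - url: "http://prometheus-msteams:2000/alertmanager"
        send_resolved: true
    - name: "null"              # a receiver with no notification configs: alerts are dropped
```

The name "null" has no special meaning to Alertmanager; any receiver without notification configs silently discards the alerts routed to it (it must be quoted in YAML so it is not parsed as a null value).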

created time in 2 hours

PR opened prometheus/docs

Add Vector to tools that export Prometheus metrics

Vector has both a Prometheus source and sink and a remote_write source and sink.

@brian-brazil

+1 -0

0 comment

1 changed file

pr created time in 2 hours

started mssun/passforios

started time in 2 hours

issue comment thanos-io/thanos

Lock for compactors

Hello 👋 Looks like there was no activity on this issue for the last two months. Do you mind updating us on the status? Is this still reproducible or needed? If yes, just comment on this PR or push a commit. Thanks! 🤗 If there will be no activity in the next two weeks, this issue will be closed (we can always reopen an issue if we need!). Alternatively, use remind command if you wish to be reminded at some point in future.

sylr

comment created time in 4 hours

push event thanos-io/thanos

Giedrius Statkevičius

commit sha d57813b2bc9b349842e1f9a06313731b005c6e00

compact: do not cleanup blocks on boot (#3532)

Do not cleanup blocks on boot because in some very bad cases there could be thousands of blocks ready-to-be deleted and doing that makes Thanos Compact exceed `initialDelaySeconds` on k8s.

Signed-off-by: Giedrius Statkevičius <giedriuswork@gmail.com>

view details

push time in 6 hours

PR merged thanos-io/thanos

compact: do not cleanup blocks on boot component: compact

Do not clean up blocks on boot because in some very bad cases there could be thousands of blocks ready-to-be deleted and doing that makes Thanos Compact exceed initialDelaySeconds on k8s.

Signed-off-by: Giedrius Statkevičius <giedriuswork@gmail.com>

  • [x] I added CHANGELOG entry for this change.
  • [ ] Change is not relevant to the end user.

Changes

  • Removed cleanup on boot; added another metric which shows how many loops were performed. Used that metric in tests.

Verification

Updated e2e tests that pass.

Fixes #3395.
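For context, `initialDelaySeconds` is the Kubernetes probe setting the PR refers to. A hypothetical probe block for a Thanos Compact container (path and port are the component's defaults; the numbers are illustrative):

```yaml
# If boot-time block cleanup ran longer than the probe allowances, the kubelet
# would restart the container before it ever became healthy.
livenessProbe:
  httpGet:
    path: /-/healthy      # Thanos health endpoint
    port: 10902           # default Thanos HTTP port
  initialDelaySeconds: 30
  periodSeconds: 30
  failureThreshold: 4
```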

+23 -25

5 comments

3 changed files

GiedriusS

pr closed time in 6 hours

push event thanos-io/thanos

Paweł Krupa

commit sha 28af4b6003450b31b2a1ab80a54d8bacbc0637ff

mixin/alerts: remove .rules suffix from alert groups (#3542)

* mixin/alerts: remove .rules suffix from alert groups
Signed-off-by: paulfantom <pawel@krupa.net.pl>
* examples/alerts: regenerate
Signed-off-by: paulfantom <pawel@krupa.net.pl>
* pkg/rules: adjust rule groups in tests
Signed-off-by: paulfantom <pawel@krupa.net.pl>

view details

push time in 9 hours

PR merged thanos-io/thanos

mixin/alerts: remove .rules suffix from alert groups

<!-- Keep PR title verbose enough and add prefix telling about what components it touches e.g "query:" or ".*:" -->

<!-- Don't forget about CHANGELOG!

Changelog entry format:
- [#<PR-id>](<PR-URL>) Thanos <Component> ...

<PR-id> Id of your pull request.
<PR-URL> URL of your PR such as https://github.com/thanos-io/thanos/pull/<PR-id>
<Component> Component affected by your changes such as Query, Store, Receive.

-->

  • [ ] I added CHANGELOG entry for this change.
  • [ ] Change is not relevant to the end user.

Changes

<!-- Enumerate changes you made --> Removed .rules suffix from alert groups in thanos mixin. This is similar to how node_exporter handles alert groups.

Verification

<!-- How you tested it? How do you know it works? --> Run make examples and look at output.

Fixes #3347

/cc @kakkoyun

+40 -40

0 comment

11 changed files

paulfantom

pr closed time in 9 hours

issue closed thanos-io/thanos

mixin: alerts and rules using same groupname

<!-- In case of issues related to exact bucket implementation, please ping corresponded maintainer from list here: https://github.com/thanos-io/thanos/blob/master/docs/storage.md -->

Thanos, Prometheus and Golang version used:

mixin folder coming from the master branch.

What happened:

Prometheus is complaining that the thanos-query.rules, thanos-receive.rules, thanos-store.rules and thanos-bucket-replicate.rules group names are repeated in the same file.

What you expected to happen:

That alerts and rules mixins can be imported with prometheus-operator/kube-prometheus.

How to reproduce it (as minimally and precisely as possible):

Import thanos rules and alerts mixins with prometheus-operator/kube-prometheus deployment.

<details><summary>jsonnetfile.json</summary> <p>

{
  "version": 1,
  "dependencies": [
    {
      "source": {
        "git": {
          "remote": "https://github.com/prometheus-operator/kube-prometheus.git",
          "subdir": "jsonnet/kube-prometheus"
        }
      },
      "version": "master"
    },
    {
      "source": {
        "git": {
          "remote": "https://github.com/thanos-io/thanos.git",
          "subdir": "mixin"
        }
      },
      "version": "master",
      "name": "thanos-mixin"
    }
  ],
  "legacyImports": true
}

</p> </details>

<details><summary>kube-prometheus.jsonnet</summary> <p>

local kp =
  (import 'kube-prometheus/kube-prometheus.libsonnet') +
  (import 'kube-prometheus/kube-prometheus-all-namespaces.libsonnet') +
  // Uncomment the following imports to enable its patches
  (import 'kube-prometheus/kube-prometheus-anti-affinity.libsonnet') +
  (import 'kube-prometheus/kube-prometheus-managed-cluster.libsonnet') +
  // (import 'kube-prometheus/kube-prometheus-node-ports.libsonnet') +
  // (import 'kube-prometheus/kube-prometheus-static-etcd.libsonnet') +
  (import 'kube-prometheus/kube-prometheus-thanos-sidecar.libsonnet') +
  (import 'kube-prometheus/kube-prometheus-custom-metrics.libsonnet') +
  // prometheusAlertsFilter + prometheusAlertsUpdate +
  {
    _config+:: {
      namespace: 'monitoring',

      versions+:: {
        alertmanager: "v0.21.0",
        nodeExporter: "v1.0.1",
        kubeStateMetrics: "1.9.7",
        prometheusOperator: "v0.42.1",
        prometheus: "v2.22.0",
        grafana: '7.2.1',
        thanos: 'v0.16.0-rc.1',
      },
      ...
    },
    ...
    prometheusRules+:: 
      (import 'thanos-mixin/rules.jsonnet') +
      {
        // groups+: [],
      },
    prometheusAlerts+:: 
      (import 'thanos-mixin/alerts.jsonnet') +
      {
        // cat existingrule.yaml | gojsontoyaml -yamltojson > existingrule.json
        // groups+: (import 'existingrule.json').groups,
      },
  };

</p> </details>

Full logs to relevant components:

<details><summary>Logs</summary> <p>

level=error ts=2020-10-21T10:50:54.930Z caller=manager.go:946 component="rule manager" msg="loading groups failed" err="/etc/prometheus/rules/prometheus-k8s-rulefiles-0/monitoring-prometheus-k8s-rules.yaml: 2142:3: groupname: \"thanos-query.rules\" is repeated in the same file"
level=error ts=2020-10-21T10:50:54.930Z caller=manager.go:946 component="rule manager" msg="loading groups failed" err="/etc/prometheus/rules/prometheus-k8s-rulefiles-0/monitoring-prometheus-k8s-rules.yaml: 2236:3: groupname: \"thanos-receive.rules\" is repeated in the same file"
level=error ts=2020-10-21T10:50:54.930Z caller=manager.go:946 component="rule manager" msg="loading groups failed" err="/etc/prometheus/rules/prometheus-k8s-rulefiles-0/monitoring-prometheus-k8s-rules.yaml: 2352:3: groupname: \"thanos-store.rules\" is repeated in the same file"
level=error ts=2020-10-21T10:50:54.930Z caller=manager.go:946 component="rule manager" msg="loading groups failed" err="/etc/prometheus/rules/prometheus-k8s-rulefiles-0/monitoring-prometheus-k8s-rules.yaml: 2592:3: groupname: \"thanos-bucket-replicate.rules\" is repeated in the same file"
level=error ts=2020-10-21T10:50:54.930Z caller=main.go:881 msg="Failed to apply configuration" err="error loading rules, previous rule set restored"

</p> </details>
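The mixin ships alert groups and recording-rule groups that both end in .rules, so concatenating the rendered alerts and rules into one file yields the duplicate names shown in the logs; removing the suffix from the alert groups (as in PR #3542) disambiguates them. A hypothetical minimal rule file, not taken from the report, showing the shape Prometheus rejects:

```yaml
# Hypothetical example: Prometheus refuses to load this file because the
# group name "thanos-query.rules" appears twice in the same file.
groups:
- name: thanos-query.rules   # alert group rendered from alerts.jsonnet
  rules:
  - alert: ThanosQueryDown
    expr: up{job="thanos-query"} == 0
- name: thanos-query.rules   # recording-rule group rendered from rules.jsonnet
  rules:
  - record: job:up:count
    expr: count(up{job="thanos-query"})
```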

Anything else we need to know:

closed time in 9 hours

sdurrheimer

issue comment prometheus-operator/kube-prometheus

Errors using `kube-prometheus-thanos-sidecar.libsonnet`

It prevents the user from including other prometheusAlerts and prometheusRules from the thanos-mixin/mixin.libsonnet entirely as Prometheus will complain that the thanos-sidecar.rules group name is declared twice

This should be fixed upstream in https://github.com/thanos-io/thanos/pull/3542

sdurrheimer

comment created time in 11 hours

PR opened thanos-io/thanos

mixin/alerts: remove .rules suffix from alert groups

<!-- Keep PR title verbose enough and add prefix telling about what components it touches e.g "query:" or ".*:" -->

<!-- Don't forget about CHANGELOG!

Changelog entry format:
- [#<PR-id>](<PR-URL>) Thanos <Component> ...

<PR-id> Id of your pull request.
<PR-URL> URL of your PR such as https://github.com/thanos-io/thanos/pull/<PR-id>
<Component> Component affected by your changes such as Query, Store, Receive.

-->

  • [ ] I added CHANGELOG entry for this change.
  • [ ] Change is not relevant to the end user.

Changes

<!-- Enumerate changes you made --> Removed .rules suffix from alert groups in thanos mixin. This is similar to how node_exporter handles alert groups.

Verification

<!-- How you tested it? How do you know it works? --> Run make examples and look at output.

Fixes #3347

/cc @kakkoyun

+32 -32

0 comment

10 changed files

pr created time in 12 hours

push event observatorium/deployments

Kemal Akkoyun

commit sha d7ac1b35172882088ee978800b2501645961c385

Fix issues discovered after integration Signed-off-by: Kemal Akkoyun <kakkoyun@gmail.com>

view details

push time in 12 hours

push event observatorium/deployments

Kemal Akkoyun

commit sha 376249d5846e672c6776b817e50f2ec9b1d23987

Fix issues discovered after integration Signed-off-by: Kemal Akkoyun <kakkoyun@gmail.com>

view details

push time in 13 hours

push event observatorium/deployments

Kemal Akkoyun

commit sha e8e58a936bab49b2751b0a373e5439a609190f67

Fix tests Signed-off-by: Kemal Akkoyun <kakkoyun@gmail.com>

view details

push time in 13 hours

issue opened thanos-io/thanos

Incorporate new improvements from upstream in the React UI

In the past few months there has been a sizable amount of UI-related improvements in the upstream Prometheus code. A lot of those improvements make sense for Thanos as well.

Here's a list of PRs (in no specific order) that I think we should take a look at:

  • [ ] Fix styling bug for target labels with special names. prometheus/prometheus#7902
  • [ ] Remove the need of pathPrefix in the React UI. prometheus/prometheus#7979
  • [ ] Add key to StatusWithStatusIndicator component in loop. prometheus/prometheus#6879
  • [ ] Fix react UI bug with series going on and off. prometheus/prometheus#7804
  • [ ] Support new duration format in graph range input. prometheus/prometheus#7833
  • [ ] Fix detail swatch glitch. prometheus/prometheus#7805
  • [ ] Fix button display when there is no panels. prometheus/prometheus#8155
  • [ ] Make React UI the default, keep old UI under /classic. prometheus/prometheus#8142 (Related to #3111)
  • [ ] React UI: Change "metrics autocomplete" with "autocomplete". prometheus/prometheus#8174

Most of these are pretty straightforward and small changes and should be easy for first time contributors.

created time in 13 hours

issue comment kubernetes/kube-state-metrics

Pod metrics reported for deleted pod

@fejta-bot: Closing this issue.

<details>

In response to this:

Rotten issues close after 30d of inactivity. Reopen the issue with /reopen. Mark the issue as fresh with /remove-lifecycle rotten.

Send feedback to sig-testing, kubernetes/test-infra and/or fejta. /close

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository. </details>

csmarchbanks

comment created time in 14 hours

issue closed kubernetes/kube-state-metrics

Pod metrics reported for deleted pod

<!-- This form is for bug reports and feature requests ONLY!

If you're looking for help check KUBE-STATE-METRICS and the troubleshooting guide. -->

Is this a BUG REPORT or FEATURE REQUEST?:

/kind bug

What happened: Running kube-state-metrics, and a different pod is deleted. After the other pod is removed kube-state-metrics continues to report metrics for that pod indefinitely.

What you expected to happen: Metrics pertaining to a deleted pod to no longer be updated soon after deletion.

How to reproduce it (as minimally and precisely as possible): I don't know how to reproduce this deterministically. It has shown up in a couple environments, and restarting the kube-state-metrics pod will fix it.

It is possibly related to kube-state-metrics startup: when we have seen this behavior, the pod whose metrics were reported indefinitely was usually deleted shortly after the kube-state-metrics pod started.

Anything else we need to know?: I am running with auto-sharding turned on. I don't know if this is related or not.

Environment:

  • Kubernetes version (use kubectl version): 1.14.10
  • Kube-state-metrics image version: v1.9.5

closed time in 14 hours

csmarchbanks

issue comment kubernetes/kube-state-metrics

Pod metrics reported for deleted pod

Rotten issues close after 30d of inactivity. Reopen the issue with /reopen. Mark the issue as fresh with /remove-lifecycle rotten.

Send feedback to sig-testing, kubernetes/test-infra and/or fejta. /close

csmarchbanks

comment created time in 14 hours

issue comment thanos-io/thanos

thanos-compact should exit on critical error

Hello 👋 Looks like there was no activity on this issue for the last two months. Do you mind updating us on the status? Is this still reproducible or needed? If yes, just comment on this PR or push a commit. Thanks! 🤗 If there will be no activity in the next two weeks, this issue will be closed (we can always reopen an issue if we need!). Alternatively, use remind command if you wish to be reminded at some point in future.

belm0

comment created time in 15 hours

PR closed thanos-io/thanos

React-ui: Display statistics in Block viewer stale

<!-- Keep PR title verbose enough and add prefix telling about what components it touches e.g "query:" or ".*:" -->

Related to #3112

<!-- Don't forget about CHANGELOG!

Changelog entry format:
- [#<PR-id>](<PR-URL>) Thanos <Component> ...

<PR-id> Id of your pull request.
<PR-URL> URL of your PR such as https://github.com/thanos-io/thanos/pull/<PR-id>
<Component> Component affected by your changes such as Query, Store, Receive.

-->

  • [ ] I added CHANGELOG entry for this change.
  • [ ] Change is not relevant to the end user.

Changes

<!-- Enumerate changes you made --> Display refreshedAt, total blocks, and blocks in each group in the Block viewer

Verification

<!-- How you tested it? How do you know it works? --> Checked it in the browser

+188 -162

3 comments

5 changed files

daksh-sagar

pr closed time in 15 hours

issue comment thanos-io/thanos

Incorrect comment in DNS provider

The comment is still incorrect.

bboreham

comment created time in 15 hours

Pull request review comment thanos-io/thanos

Added stats for Blocks #3112

 import React, { FC } from 'react';
 import { Block, BlocksPool } from './block';
 import { BlockSpan } from './BlockSpan';
 import styles from './blocks.module.css';
+import { Popup } from 'semantic-ui-react';

Use https://reactstrap.github.io/components/tooltips/ as we already use Reactstrap all over.

kunal-kushwaha

comment created time in 15 hours

Pull request review comment thanos-io/thanos

Added stats for Blocks #3112

     "tempusdominus-bootstrap-4": "^5.1.2",
     "tempusdominus-core": "^5.0.3",
     "typescript": "^3.3.3",
-    "use-query-params": "^1.1.6"
+    "use-query-params": "^1.1.6",
+    "semantic-ui-react": "^2.0.1"

Using a high level UI library just for tooltips is a bit too much. It will increase the bundle size by a lot too.

We are already using Reactstrap all over. Use https://reactstrap.github.io/components/tooltips/

kunal-kushwaha

comment created time in 15 hours

issue comment thanos-io/thanos

Shipping sourcemaps for the React UI in production

I checked, and that will reduce the size difference by quite a lot (to just under 500 KB). But I am not sure it will give us much value in return: if we don't include sourcemaps for the vast majority of the code, we will not have proper stack traces for the vast majority of crashes. Any thoughts on this?

prmsrswt

comment created time in 15 hours

started Kethku/neovide

started time in 17 hours

pull request comment prometheus/prometheus

Consider status code 429 as recoverable errors to avoid resharding

Ah, any suggestions on how we should move ahead here?

Harkishen-Singh

comment created time in 18 hours

pull request comment thanos-io/thanos

fix index out of bound bug when comparing ZLabelSets

When will v0.17.2 be released? I hit this bug, so I had to go back to v0.16.0.

yeya24

comment created time in 19 hours

issue comment thanos-io/thanos

receive: Cannot use optimized multi part upload.

Same here, but the compactor seems to be working regardless. Is it just a warning, or does it actually stop something from working?

bwplotka

comment created time in 19 hours
