Marco Pracucci pracucci Grafana Labs Italy https://pracucci.com Software Engineer at Grafana Labs

grafana/cortex-jsonnet 37

This repo has the jsonnet for deploying and also the mixin for monitoring Cortex

pracucci/node-cidr-matcher 20

Fast CIDR matcher. Given an input IPv4 or IPv6 address, it checks if it's inside a set of IP ranges, expressed in CIDR notation.

pracucci/php-on-kubernetes 6

Lessons learned running PHP on Kubernetes in production

grafana/puppet-promtail 5

Deploy and configure Grafana's Promtail with Puppet

pracucci/elasticsearch-playstore 3

Google Play Store App Analytics importer for ElasticSearch

pracucci/etcd 1

Distributed reliable key-value store for the most critical data of a distributed system

pracucci/lokitool 1

Tooling for Grafana Loki

pracucci/phoneid-php-sdk 1

Phone.id PHP SDK

pracucci/pracucci 1

GitHub profile description

pracucci/alertmanager 0

Prometheus Alertmanager

issue closed thanos-io/thanos

Idea: use caching bucket for caching index

Caching bucket is currently configured to ignore "index" files, because for index we use an IndexCache instead.

Instead of using IndexCache, we could configure the caching bucket to cache ranges of "index" objects, similar to how we cache ranges of segment objects.

That would have some benefits:

  • it would be easier to control size of cached items (eg. for chunks we use 16000 bytes)
  • it would unify and simplify the caching code

On the other side:

  • it would break existing index-cache configuration and metrics
  • it wouldn't be possible to apply postings compression anymore (however this was mostly done to reduce size of cached items)

(We delayed this work because it's not clear that pros outweigh the cons here, and existing index cache works just fine.)
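To illustrate the "cache ranges of objects" idea above, here is a minimal sketch of splitting a byte-range request into fixed-size, aligned subranges that could each be cached under its own key. This is not the Thanos caching bucket code: the `subrange` type, the `alignedSubranges` function and the key format are hypothetical, and the 16000-byte size is simply the figure mentioned in the issue for chunks.

```go
package main

import "fmt"

// subrange is a half-open byte interval [Start, End) within an object.
// Hypothetical type, used only for this illustration.
type subrange struct{ Start, End int64 }

// alignedSubranges splits the request [off, off+length) into subranges whose
// start offsets are multiples of size. The last subrange is clamped to
// objectSize. It returns nil for empty or out-of-bounds requests.
func alignedSubranges(off, length, size, objectSize int64) []subrange {
	if length <= 0 || size <= 0 || off < 0 || off >= objectSize {
		return nil
	}
	end := off + length
	if end > objectSize {
		end = objectSize
	}
	var out []subrange
	// Round the start down to the nearest multiple of size.
	for start := (off / size) * size; start < end; start += size {
		e := start + size
		if e > objectSize {
			e = objectSize
		}
		out = append(out, subrange{Start: start, End: e})
	}
	return out
}

func main() {
	// A 15000-byte read at offset 20000 of a 40000-byte object,
	// using 16000-byte subranges.
	for _, r := range alignedSubranges(20000, 15000, 16000, 40000) {
		fmt.Printf("index:%d:%d\n", r.Start, r.End) // hypothetical cache key
	}
}
```

Because every subrange has the same size (except possibly the last one in an object), the size of each cached item is bounded by the configured subrange size, which is the first benefit listed in the issue.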

closed time in 4 hours

pstibrany

issue comment thanos-io/thanos

Idea: use caching bucket for caching index

Closing for now as promised, let us know if you need this to be reopened! 🤗

pstibrany

comment created time in 4 hours

issue comment thanos-io/thanos

store panics with "slice bounds out of range"

Hello 👋 Looks like there was no activity on this issue for the last two months. Do you mind updating us on the status? Is this still reproducible or needed? If yes, just comment on this PR or push a commit. Thanks! 🤗 If there will be no activity in the next two weeks, this issue will be closed (we can always reopen an issue if we need!). Alternatively, use remind command if you wish to be reminded at some point in future.

csonp

comment created time in 4 hours

issue comment thanos-io/thanos

Compact: Offline deduplication

Hello 👋 Looks like there was no activity on this issue for the last two months. Do you mind updating us on the status? Is this still reproducible or needed? If yes, just comment on this PR or push a commit. Thanks! 🤗 If there will be no activity in the next two weeks, this issue will be closed (we can always reopen an issue if we need!). Alternatively, use remind command if you wish to be reminded at some point in future.

smalldirector

comment created time in 4 hours

issue comment thanos-io/thanos

Block Viewer: Allow deleting a block from Block Viewer itself

Hello 👋 Looks like there was no activity on this issue for the last two months. Do you mind updating us on the status? Is this still reproducible or needed? If yes, just comment on this PR or push a commit. Thanks! 🤗 If there will be no activity in the next two weeks, this issue will be closed (we can always reopen an issue if we need!). Alternatively, use remind command if you wish to be reminded at some point in future.

prmsrswt

comment created time in 4 hours

Pull request review comment cortexproject/cortex

WIP: Multi tenant query federation

 store_gateway_client:
 # (ingesters shuffle sharding on read path is disabled).
 # CLI flag: -querier.shuffle-sharding-ingesters-lookback-period
 [shuffle_sharding_ingesters_lookback_period: <duration> | default = 0s]
+
+tenant_federation:
+  # If enabled, multi tenant query federation can be used by supplying multiple
+  # tenant IDs in the read path (experimental).
+  # CLI flag: -querier.tenant-federation.enabled
+  [enabled: <boolean> | default = false]

If some cluster enables multi tenant querying, the flag should be set across all the instances and not only on the query frontend/scheduler, so that e.g. the distributor rejects multi tenant ingestion. I feel the prefix -querier. is misleading

simonswine

comment created time in 5 hours

issue comment thanos-io/thanos

s3/aliyunOSS: Support multipart upload without pre known object size.

Hi @wujinhu, it seems that this issue hasn't been fixed yet. Do we have any plans for that? Is there some work that I can do for solving it?

We use aliyun OSS in our production environment as the Thanos bucket store, but our downsampling jobs fail because we always get the error "unsupported type of io.Reader".

I also found a TODO comment referring to this issue at line 72 of the source file oss.go.

bwplotka

comment created time in 6 hours

push event thanos-io/thanos

Ben Kochie

commit sha f5a2399c6337110850119a67f15e620dcf90b111

Update CircleCI build (#3513)
  • Update to CircleCI 2.1 config.
  • Use new executors feature.
  • Use new CircleCI images.
  • Update docs.

Signed-off-by: Ben Kochie <superq@gmail.com>

push time in 7 hours

PR merged thanos-io/thanos

Update CircleCI build
  • Update to CircleCI 2.1 config.
  • Use new executors feature.
  • Use new CircleCI images.
  • Update docs.

Signed-off-by: Ben Kochie superq@gmail.com

<!-- Keep PR title verbose enough and add prefix telling about what components it touches e.g "query:" or ".*:" -->

<!-- Don't forget about CHANGELOG!

Changelog entry format:
- [#<PR-id>](<PR-URL>) Thanos <Component> ...

<PR-id> Id of your pull request.
<PR-URL> URL of your PR such as https://github.com/thanos-io/thanos/pull/<PR-id>
<Component> Component affected by your changes such as Query, Store, Receive.

-->

  • [ ] I added CHANGELOG entry for this change.
  • [x] Change is not relevant to the end user.

Changes

<!-- Enumerate changes you made -->

Verification

<!-- How you tested it? How do you know it works? -->

+15 -43

0 comment

5 changed files

SuperQ

pr closed time in 7 hours

Pull request review comment thanos-io/thanos

Update CircleCI build

 # NOTE: Current plan gives 1500 build minutes per month.
-version: 2
-# https://circleci.com/blog/circleci-hacks-reuse-yaml-in-your-circleci-config-with-yaml/
-defaults: &defaults
-  docker:
-    # Built by Thanos make docker-ci
-    - image: &default-docker-image quay.io/thanos/thanos-ci:v1.2-go1.15-node

Hm.. for 15m worth of build cutting, 1/3 is IMO super worth it, but yes, we need some automation and more work on this. In fact, what would be helpful is to leverage CircleCI build cache instead. Let's merge this for now :+1:

SuperQ

comment created time in 7 hours

issue comment cortexproject/cortex

HA tracker shows incorrect userid<>cluster associations after some time

The new configuration has been running for ~20 hours now without any issues. It looks like the default ha_cluster_label value of cluster was causing all of the problems for me.

zdykstra

comment created time in 7 hours

issue comment cortexproject/cortex

Add test for tablemanager when no scaling is enabled

This issue has been automatically marked as stale because it has not had any activity in the past 60 days. It will be closed in 15 days if no further activity occurs. Thank you for your contributions.

gouthamve

comment created time in 7 hours

issue comment cortexproject/cortex

Improve limits documentation

This issue has been automatically marked as stale because it has not had any activity in the past 60 days. It will be closed in 15 days if no further activity occurs. Thank you for your contributions.

pracucci

comment created time in 7 hours

issue comment cortexproject/cortex

Store-gateway blocks resharding during rollout

This issue has been automatically marked as stale because it has not had any activity in the past 60 days. It will be closed in 15 days if no further activity occurs. Thank you for your contributions.

pracucci

comment created time in 7 hours

Pull request review comment thanos-io/thanos

Update CircleCI build

 # NOTE: Current plan gives 1500 build minutes per month.
-version: 2
-# https://circleci.com/blog/circleci-hacks-reuse-yaml-in-your-circleci-config-with-yaml/
-defaults: &defaults
-  docker:
-    # Built by Thanos make docker-ci
-    - image: &default-docker-image quay.io/thanos/thanos-ci:v1.2-go1.15-node

IMO, 5 minutes is not worth it, as you need to constantly keep up with Go releases. We have the same problem with Prometheus golang-builder, but we have some cron jobs to bump the Go versions.

SuperQ

comment created time in 8 hours

Pull request review comment thanos-io/thanos

Update CircleCI build

 # NOTE: Current plan gives 1500 build minutes per month.
-version: 2
-# https://circleci.com/blog/circleci-hacks-reuse-yaml-in-your-circleci-config-with-yaml/
-defaults: &defaults
-  docker:
-    # Built by Thanos make docker-ci
-    - image: &default-docker-image quay.io/thanos/thanos-ci:v1.2-go1.15-node

True, but this is because we did not update this image lately, OR we somehow put those tools in the wrong dir, or simply the Makefile mod times are not matching, so the Makefile reinstalls them. From looking at the current pending build, it takes at least 5min to reinstall all tools, so prebaking makes a lot of sense, WDYT?

SuperQ

comment created time in 8 hours

Pull request review comment thanos-io/thanos

Update CircleCI build

 # NOTE: Current plan gives 1500 build minutes per month.
-version: 2
-# https://circleci.com/blog/circleci-hacks-reuse-yaml-in-your-circleci-config-with-yaml/
-defaults: &defaults
-  docker:
-    # Built by Thanos make docker-ci
-    - image: &default-docker-image quay.io/thanos/thanos-ci:v1.2-go1.15-node

Looking at recent test pipeline runs, it's not really helping at all. Still about 14min to run the "test" pipeline.

It also looks like the current .promu.yml and other steps are missing -mod=vendor, so the vendor dir isn't being used at all.

SuperQ

comment created time in 8 hours

Pull request review comment thanos-io/thanos

Update CircleCI build

 # NOTE: Current plan gives 1500 build minutes per month.
-version: 2
-# https://circleci.com/blog/circleci-hacks-reuse-yaml-in-your-circleci-config-with-yaml/
-defaults: &defaults
-  docker:
-    # Built by Thanos make docker-ci
-    - image: &default-docker-image quay.io/thanos/thanos-ci:v1.2-go1.15-node

I mean we ended up reinstalling everything anyway because the thanos-ci image was not updated, but that's a separate story ;p https://app.circleci.com/pipelines/github/thanos-io/thanos/4397/workflows/f06871b1-d847-4332-95c0-bbd23f60c4fc

SuperQ

comment created time in 8 hours

Pull request review comment thanos-io/thanos

Update CircleCI build

 # NOTE: Current plan gives 1500 build minutes per month.
-version: 2
-# https://circleci.com/blog/circleci-hacks-reuse-yaml-in-your-circleci-config-with-yaml/
-defaults: &defaults
-  docker:
-    # Built by Thanos make docker-ci
-    - image: &default-docker-image quay.io/thanos/thanos-ci:v1.2-go1.15-node

This was actually improving our build times quite a lot, since we did not need to build our deps (many Prometheus versions, alertmanager, etc.) on every run. Are we sure we can get rid of this? :thinking:

SuperQ

comment created time in 8 hours

PR opened thanos-io/thanos

Update CircleCI build
  • Update to CircleCI 2.1 config.
  • Use new executors feature.
  • Use new CircleCI images.

Signed-off-by: Ben Kochie superq@gmail.com

<!-- Keep PR title verbose enough and add prefix telling about what components it touches e.g "query:" or ".*:" -->

<!-- Don't forget about CHANGELOG!

Changelog entry format:
- [#<PR-id>](<PR-URL>) Thanos <Component> ...

<PR-id> Id of your pull request.
<PR-URL> URL of your PR such as https://github.com/thanos-io/thanos/pull/<PR-id>
<Component> Component affected by your changes such as Query, Store, Receive.

-->

  • [ ] I added CHANGELOG entry for this change.
  • [ ] Change is not relevant to the end user.

Changes

<!-- Enumerate changes you made -->

Verification

<!-- How you tested it? How do you know it works? -->

+13 -19

0 comment

1 changed file

pr created time in 9 hours

pull request comment thanos-io/thanos

[Docs] Add indent in remote write example

@bwplotka we can close this :)

kubadawczynski

comment created time in 10 hours

issue comment thanos-io/thanos

store: Query failure on Seg fault

Unfortunately, I do not know how to reproduce it. So far it has happened only 2 times since the upgrade to v0.17.

bwplotka

comment created time in 10 hours

PR opened cortexproject/cortex

typo

<!-- Thanks for sending a pull request! Before submitting:

  1. Read our CONTRIBUTING.md guide
  2. Rebase your PR if it gets out of sync with master -->

What this PR does:

Which issue(s) this PR fixes: Fixes #<issue number>

Checklist

  • [ ] Tests updated
  • [ ] Documentation added
  • [ ] CHANGELOG.md updated - the order of entries should be [CHANGE], [FEATURE], [ENHANCEMENT], [BUGFIX]
+1 -1

0 comment

1 changed file

pr created time in 11 hours

Pull request review comment cortexproject/cortex

fix panic in inverted index delete operation when expected fp is not present

 func (shard *indexShard) delete(labels labels.Labels, fp model.Fingerprint) {
 		j := sort.Search(len(fingerprints.fps), func(i int) bool {
 			return fingerprints.fps[i] >= fp
 		})
+
+		// see if search didn't find fp which matches the condition which means we don't have to do anything.
+		if j == len(fingerprints.fps) {

That's a valid point, we need to verify that returned value is < len(fps) and value at returned index is what we're looking for.

sandeepsukhani

comment created time in 11 hours

Pull request review comment cortexproject/cortex

fix panic in inverted index delete operation when expected fp is not present

 func (shard *indexShard) delete(labels labels.Labels, fp model.Fingerprint) {
 		j := sort.Search(len(fingerprints.fps), func(i int) bool {
 			return fingerprints.fps[i] >= fp
 		})
+
+		// see if search didn't find fp which matches the condition which means we don't have to do anything.
+		if j == len(fingerprints.fps) {

What if the desired fingerprint is not in the slice, but there are other fingerprints with a greater value? The search will return the position in the slice where the fp should have been.
E.g. if fingerprints.fps=[0,1,3] and you search for the fingerprint with a value of "2", j will be set to 2. This will then result in the fingerprint with a value of "3" being deleted.

Looks like the checks used in the docs, https://golang.org/pkg/sort/#Search, are probably what is needed here.
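The case described above can be reproduced in isolation. The following is a minimal sketch, not the Cortex code: `removeSorted` is a hypothetical helper working on a plain `[]uint64` rather than on `model.Fingerprint` values, and it applies both checks from the `sort.Search` documentation (index in bounds, and element at that index equal to the target) before deleting.

```go
package main

import (
	"fmt"
	"sort"
)

// removeSorted removes fp from the ascending slice fps if it is present.
// It returns the (possibly shortened) slice and whether fp was found.
// Hypothetical helper, written only to illustrate the two checks.
func removeSorted(fps []uint64, fp uint64) ([]uint64, bool) {
	// sort.Search returns the smallest index j for which fps[j] >= fp,
	// or len(fps) if there is no such index.
	j := sort.Search(len(fps), func(i int) bool { return fps[i] >= fp })

	// Both conditions are needed: j may be in bounds while pointing at a
	// larger fingerprint, i.e. the position where fp would be inserted.
	if j == len(fps) || fps[j] != fp {
		return fps, false
	}
	return append(fps[:j], fps[j+1:]...), true
}

func main() {
	fps := []uint64{0, 1, 3}

	fps, found := removeSorted(fps, 2)
	fmt.Println(fps, found) // [0 1 3] false: 3 is left untouched

	fps, found = removeSorted(fps, 3)
	fmt.Println(fps, found) // [0 1] true
}
```

With only the `j == len(fps)` check, the first call would delete the element at index 2 (the value 3), which is exactly the scenario raised in the review.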

sandeepsukhani

comment created time in 11 hours

issue opened cortexproject/cortex

Two ingester updates in quick succession can fail

We do CI of Cortex builds into our staging area; about once a month I'm alerted to an ingester rollout which has stuck.

I think what happens is that Kubernetes kills one newly-arrived ingester before it asserts its place in the ring, and the replacement doesn't manage to take over in its place.

I guess the killed ingester should take care to reset state before exiting, or the leaving ingester should detect that it died and go back to looking for someone to hand over to.

I will attempt to find the right logs from a recent occurrence, or post them the next time it happens.

created time in 12 hours

issue comment grafana/cortex-jsonnet

Mixtool generates empty alerts and rules

Good catch!

muecs

comment created time in 12 hours

pull request comment thanos-io/thanos

store add touch series limit

cc @yeya24 as we touch this code

lisuo3389

comment created time in 12 hours

Pull request review comment thanos-io/thanos

store add touch series limit

 func registerStore(app *extkingpin.App) {
 	maxSampleCount := cmd.Flag("store.grpc.series-sample-limit",
 		"Maximum amount of samples returned via a single Series call. The Series call fails if this limit is exceeded. 0 means no limit. NOTE: For efficiency the limit is internally implemented as 'chunks limit' considering each chunk contains 120 samples (it's the max number of samples each chunk can contain), so the actual number of samples might be lower, even though the maximum could be hit.").
 		Default("0").Uint()
+	maxTouchSeriesCount := cmd.Flag("store.grpc.touch-series-limit",

Amazing. What I miss here is the definition of "touched" (:

Is this applied before fetching Series ID and chunk metas? Is this applied while fetching chunk metas and only if there is a chunk matching the time range?

Answering this would help users understand it better (: I see from the code it's the former and that's ok for now, however, hopefully, we can improve and touch less series here with this work: https://github.com/thanos-io/thanos/issues/3512
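For context on the existing series-sample-limit flag shown in the diff, its help text says the sample limit is implemented internally as a chunks limit, assuming 120 samples per chunk. A minimal sketch of that arithmetic follows. The function names are hypothetical and this is not the Thanos limiter code; only the constant 120 and the "0 means no limit" rule come from the help text.

```go
package main

import "fmt"

// maxSamplesPerChunk is the per-chunk maximum quoted in the flag help text.
const maxSamplesPerChunk = 120

// chunksLimit converts a sample limit into a chunk limit by integer division.
// A result of 0 is treated as "no limit" in this sketch.
func chunksLimit(sampleLimit uint64) uint64 {
	return sampleLimit / maxSamplesPerChunk
}

// exceeds reports whether fetching the given number of chunks would go over
// the limit derived from sampleLimit.
func exceeds(chunks, sampleLimit uint64) bool {
	limit := chunksLimit(sampleLimit)
	return limit != 0 && chunks > limit
}

func main() {
	fmt.Println(chunksLimit(1200)) // 10
	// 11 half-empty chunks hold far fewer than 1200 samples,
	// yet they exceed the derived limit of 10 chunks.
	fmt.Println(exceeds(11, 1200)) // true
}
```

This is why the help text warns that the limit can be hit even though the actual number of samples is lower: the check counts chunks, not the samples inside them. In this sketch a non-zero sample limit below 120 also rounds down to "no limit", which a real implementation may handle differently.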

lisuo3389

comment created time in 12 hours

Pull request review comment thanos-io/thanos

store add touch series limit

 func blockSeries(
 		return storepb.EmptySeriesSet(), indexr.stats, nil
 	}
+	// Reserve seriesLimiter
	// Reserve series seriesLimiter.
lisuo3389

comment created time in 12 hours
