Add loki to sourcegraph.com

Related: https://github.com/sourcegraph/sourcegraph/issues/3175

For reasons similar to what's stated in https://github.com/sourcegraph/sourcegraph/issues/3175, we should also consider adding a standard logging solution for our deployments.

We should try using https://grafana.com/loki on sourcegraph.com to see if it fits our needs.

--

One thing that jumps out at me:

Unlike most logging solutions, Loki does not parse incoming logs, or do full text indexing.

Instead, we index and group log streams using the same labels you’re already using with Prometheus. This makes it significantly more efficient to scale and operate.
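To make that concrete, here is a hedged illustration of the kind of label set promtail would attach to a single log stream under the relabel rules shown in the config further down (the label values here are made up). Loki indexes and groups streams by these Prometheus-style labels; the log text itself is not indexed:

# Hypothetical label set for one log stream (example values only).
namespace: prod
job: prod/sourcegraph-frontend
instance: sourcegraph-frontend-5c9f6b8d7c-abcde
container_name: frontend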

Also:

Loki is currently in alpha. You should not depend on it for production use.


An ELK stack could be an alternative solution.


Answer from ggilmore:

We're getting a lot of OOM issues with some benign queries on sourcegraph.com

[Screenshot: Screen Shot 2019-04-08 at 12.55.58 PM]

These memory spikes don't really show up on the resource charts. This suggests that these queries cause sudden memory spikes that are too fast to be captured by the charts' sampling rate.
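To illustrate the sampling-rate point, here is a sketch with hypothetical Prometheus settings (not our actual config): if a pod allocates memory and gets OOM-killed within a few seconds, the spike can fall entirely between two scrapes and never show up on the chart.

# Hypothetical Prometheus settings, for illustration only.
global:
  scrape_interval: 30s      # memory usage is only sampled every 30s
  evaluation_interval: 30s
# A query that balloons memory and OOMs in under 30s can leave no
# sample showing the spike, so the resource chart looks flat.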


Since Loki is a new project, there doesn't seem to be much scaling information besides this document. However, I was able to glean from their blog post that Loki has a few different roles (Distributor, Ingester, Querier), while their Helm chart only runs a single monolithic version of Loki.
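For contrast with the per-role Deployments in the generated YAML below, here is a rough sketch (my assumption, not taken from their chart) of what that monolithic mode looks like: a single Loki container without an explicit -target flag, so one process serves the distributor, ingester, and querier roles at once.

# Sketch only: single-process ("monolith") Loki, roughly what the Helm
# chart appears to run. Omitting -target runs all roles in one container.
apiVersion: apps/v1beta1
kind: Deployment
metadata:
  name: loki
spec:
  replicas: 1
  template:
    metadata:
      labels:
        name: loki
    spec:
      containers:
      - name: loki
        image: grafana/loki:latest
        args:
        - -config.file=/etc/loki/config.yaml
        ports:
        - containerPort: 80
          name: http-metrics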

There isn't any good documentation about what a multi-role setup for Loki looks like, but I was able to use their ksonnet setup to generate a rough idea of what all the Kubernetes YAML for that would look like.

Jsonnet config file (environments/loki/main.jsonnet):

local gateway = import 'loki/gateway.libsonnet';
local loki = import 'loki/loki.libsonnet';
local promtail = import 'promtail/promtail.libsonnet';

loki + promtail + gateway {
  _config+:: {
    namespace: 'loki',
    htpasswd_contents: 'loki:$apr1$H4yGiGNg$ssl5/NymaGFRUvxIV1Nyr.',

    bigtable_instance: 'TEST_BIGTABLE_INSTANCE',
    bigtable_project: 'TEST_BIGTABLE_PROJECT',
    gcs_bucket_name: 'TEST_GCS_BUCKET_NAME',
    storage_backend: 'TEST_STORAGE_BACKEND',
    promtail_config: {
      external_labels: {},
      entry_parser: 'docker',

      scheme: 'http',
      hostname: 'gateway.%(namespace)s.svc' % $._config,
      username: 'loki',
      password: 'password',
      container_root_path: '/var/lib/docker',
    },
    replication_factor: 3,
    consul_replicas: 1,
  },
}

Generated YAML (from ks show loki -o=yaml):

---
apiVersion: v1
data:
  config.yaml: |
    ingester:
      chunk_idle_period: 15m
      lifecycler:
        claim_on_rollout: false
        heartbeat_period: 5s
        interface_names:
        - eth0
        join_after: 10s
        num_tokens: 512
        ring:
          consul:
            consistentreads: true
            host: consul.loki.svc.cluster.local:8500
            httpclienttimeout: 20s
            prefix: ""
          heartbeat_timeout: 1m
          replication_factor: 3
          store: consul
    limits_config:
      enforce_metric_name: false
    schema_config:
      configs:
      - from: "0"
        index:
          period: 168h
          prefix: loki_index_
        object_store: gcs
        schema: v9
        store: bigtable
    server:
      graceful_shutdown_timeout: 5s
      grpc_server_max_recv_msg_size: 67108864
      http_server_idle_timeout: 120s
    storage_config:
      bigtable:
        instance: TEST_BIGTABLE_INSTANCE
        project: TEST_BIGTABLE_PROJECT
      gcs:
        bucket_name: TEST_GCS_BUCKET_NAME
kind: ConfigMap
metadata:
  name: loki
---
apiVersion: v1
kind: Service
metadata:
  name: ingester
spec:
  ports:
  - name: ingester-http-metrics
    port: 80
    targetPort: 80
  - name: ingester-grpc
    port: 9095
    targetPort: 9095
  selector:
    name: ingester
---
apiVersion: extensions/v1beta1
kind: DaemonSet
metadata:
  name: promtail
spec:
  minReadySeconds: 10
  template:
    metadata:
      labels:
        name: promtail
    spec:
      containers:
      - args:
        - -client.url=http://loki:password@gateway.loki.svc/api/prom/push
        - -config.file=/etc/promtail/promtail.yml
        env:
        - name: HOSTNAME
          valueFrom:
            fieldRef:
              fieldPath: spec.nodeName
        image: grafana/promtail:latest
        imagePullPolicy: IfNotPresent
        name: promtail
        ports:
        - containerPort: 80
          name: http-metrics
        securityContext:
          privileged: true
          runAsUser: 0
        volumeMounts:
        - mountPath: /etc/promtail
          name: promtail
        - mountPath: /var/log
          name: varlog
        - mountPath: /var/lib/docker/containers
          name: varlibdockercontainers
          readOnly: true
      serviceAccount: promtail
      tolerations:
      - effect: NoSchedule
        operator: Exists
      volumes:
      - configMap:
          name: promtail
        name: promtail
      - hostPath:
          path: /var/log
        name: varlog
      - hostPath:
          path: /var/lib/docker/containers
        name: varlibdockercontainers
  updateStrategy:
    type: RollingUpdate
---
apiVersion: v1
data:
  promtail.yml: |
    client:
      external_labels: {}
    scrape_configs:
    - entry_parser: docker
      job_name: kubernetes-pods-name
      kubernetes_sd_configs:
      - role: pod
      relabel_configs:
      - source_labels:
        - __meta_kubernetes_pod_label_name
        target_label: __service__
      - source_labels:
        - __meta_kubernetes_pod_node_name
        target_label: __host__
      - action: drop
        regex: ^$
        source_labels:
        - __service__
      - action: replace
        replacement: $1
        separator: /
        source_labels:
        - __meta_kubernetes_namespace
        - __service__
        target_label: job
      - action: replace
        source_labels:
        - __meta_kubernetes_namespace
        target_label: namespace
      - action: replace
        source_labels:
        - __meta_kubernetes_pod_name
        target_label: instance
      - action: replace
        source_labels:
        - __meta_kubernetes_pod_container_name
        target_label: container_name
      - action: labelmap
        regex: __meta_kubernetes_pod_label_(.+)
      - replacement: /var/log/pods/$1/*.log
        separator: /
        source_labels:
        - __meta_kubernetes_pod_uid
        - __meta_kubernetes_pod_container_name
        target_label: __path__
    - entry_parser: docker
      job_name: kubernetes-pods-app
      kubernetes_sd_configs:
      - role: pod
      relabel_configs:
      - action: drop
        regex: .+
        source_labels:
        - __meta_kubernetes_pod_label_name
      - source_labels:
        - __meta_kubernetes_pod_label_app
        target_label: __service__
      - source_labels:
        - __meta_kubernetes_pod_node_name
        target_label: __host__
      - action: drop
        regex: ^$
        source_labels:
        - __service__
      - action: replace
        replacement: $1
        separator: /
        source_labels:
        - __meta_kubernetes_namespace
        - __service__
        target_label: job
      - action: replace
        source_labels:
        - __meta_kubernetes_namespace
        target_label: namespace
      - action: replace
        source_labels:
        - __meta_kubernetes_pod_name
        target_label: instance
      - action: replace
        source_labels:
        - __meta_kubernetes_pod_container_name
        target_label: container_name
      - action: labelmap
        regex: __meta_kubernetes_pod_label_(.+)
      - replacement: /var/log/pods/$1/*.log
        separator: /
        source_labels:
        - __meta_kubernetes_pod_uid
        - __meta_kubernetes_pod_container_name
        target_label: __path__
    - entry_parser: docker
      job_name: kubernetes-pods-direct-controllers
      kubernetes_sd_configs:
      - role: pod
      relabel_configs:
      - action: drop
        regex: .+
        separator: ""
        source_labels:
        - __meta_kubernetes_pod_label_name
        - __meta_kubernetes_pod_label_app
      - action: drop
        regex: ^([0-9a-z-.]+)(-[0-9a-f]{8,10})$
        source_labels:
        - __meta_kubernetes_pod_controller_name
      - source_labels:
        - __meta_kubernetes_pod_controller_name
        target_label: __service__
      - source_labels:
        - __meta_kubernetes_pod_node_name
        target_label: __host__
      - action: drop
        regex: ^$
        source_labels:
        - __service__
      - action: replace
        replacement: $1
        separator: /
        source_labels:
        - __meta_kubernetes_namespace
        - __service__
        target_label: job
      - action: replace
        source_labels:
        - __meta_kubernetes_namespace
        target_label: namespace
      - action: replace
        source_labels:
        - __meta_kubernetes_pod_name
        target_label: instance
      - action: replace
        source_labels:
        - __meta_kubernetes_pod_container_name
        target_label: container_name
      - action: labelmap
        regex: __meta_kubernetes_pod_label_(.+)
      - replacement: /var/log/pods/$1/*.log
        separator: /
        source_labels:
        - __meta_kubernetes_pod_uid
        - __meta_kubernetes_pod_container_name
        target_label: __path__
    - entry_parser: docker
      job_name: kubernetes-pods-indirect-controller
      kubernetes_sd_configs:
      - role: pod
      relabel_configs:
      - action: drop
        regex: .+
        separator: ""
        source_labels:
        - __meta_kubernetes_pod_label_name
        - __meta_kubernetes_pod_label_app
      - action: keep
        regex: ^([0-9a-z-.]+)(-[0-9a-f]{8,10})$
        source_labels:
        - __meta_kubernetes_pod_controller_name
      - action: replace
        regex: ^([0-9a-z-.]+)(-[0-9a-f]{8,10})$
        source_labels:
        - __meta_kubernetes_pod_controller_name
        target_label: __service__
      - source_labels:
        - __meta_kubernetes_pod_node_name
        target_label: __host__
      - action: drop
        regex: ^$
        source_labels:
        - __service__
      - action: replace
        replacement: $1
        separator: /
        source_labels:
        - __meta_kubernetes_namespace
        - __service__
        target_label: job
      - action: replace
        source_labels:
        - __meta_kubernetes_namespace
        target_label: namespace
      - action: replace
        source_labels:
        - __meta_kubernetes_pod_name
        target_label: instance
      - action: replace
        source_labels:
        - __meta_kubernetes_pod_container_name
        target_label: container_name
      - action: labelmap
        regex: __meta_kubernetes_pod_label_(.+)
      - replacement: /var/log/pods/$1/*.log
        separator: /
        source_labels:
        - __meta_kubernetes_pod_uid
        - __meta_kubernetes_pod_container_name
        target_label: __path__
kind: ConfigMap
metadata:
  name: promtail
---
apiVersion: rbac.authorization.k8s.io/v1beta1
kind: ClusterRoleBinding
metadata:
  name: promtail
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: ClusterRole
  name: promtail
subjects:
- kind: ServiceAccount
  name: promtail
  namespace: loki
---
apiVersion: v1
kind: ServiceAccount
metadata:
  name: promtail
---
apiVersion: rbac.authorization.k8s.io/v1beta1
kind: ClusterRole
metadata:
  name: promtail
rules:
- apiGroups:
  - ""
  resources:
  - nodes
  - nodes/proxy
  - services
  - endpoints
  - pods
  verbs:
  - get
  - list
  - watch
---
apiVersion: apps/v1beta1
kind: Deployment
metadata:
  name: table-manager
spec:
  minReadySeconds: 10
  replicas: 1
  revisionHistoryLimit: 10
  template:
    metadata:
      labels:
        name: table-manager
    spec:
      containers:
      - args:
        - -bigtable.instance=TEST_BIGTABLE_INSTANCE
        - -bigtable.project=TEST_BIGTABLE_PROJECT
        - -chunk.storage-client=TEST_STORAGE_BACKEND
        - -dynamodb.chunk-table.from=2018-07-11
        - -dynamodb.chunk-table.inactive-read-throughput=0
        - -dynamodb.chunk-table.inactive-write-throughput=0
        - -dynamodb.chunk-table.prefix=loki_chunks_
        - -dynamodb.chunk-table.read-throughput=0
        - -dynamodb.chunk-table.write-throughput=0
        - -dynamodb.original-table-name=loki_index
        - -dynamodb.periodic-table.from=2018-07-11
        - -dynamodb.periodic-table.inactive-read-throughput=0
        - -dynamodb.periodic-table.inactive-write-throughput=0
        - -dynamodb.periodic-table.prefix=loki_index_
        - -dynamodb.periodic-table.read-throughput=0
        - -dynamodb.periodic-table.write-throughput=0
        - -dynamodb.use-periodic-tables=true
        - -dynamodb.v9-schema-from=2018-07-11
        image: grafana/cortex-table-manager:r47-06f3294e
        imagePullPolicy: IfNotPresent
        name: table-manager
        ports:
        - containerPort: 80
          name: http-metrics
        - containerPort: 9095
          name: grpc
        resources:
          limits:
            cpu: 200m
            memory: 200Mi
          requests:
            cpu: 100m
            memory: 100Mi
---
apiVersion: apps/v1beta1
kind: Deployment
metadata:
  name: consul
spec:
  minReadySeconds: 10
  replicas: 1
  revisionHistoryLimit: 10
  template:
    metadata:
      labels:
        name: consul
    spec:
      affinity:
        podAntiAffinity:
          requiredDuringSchedulingIgnoredDuringExecution:
          - labelSelector:
              matchLabels:
                name: consul
            topologyKey: kubernetes.io/hostname
      containers:
      - args:
        - agent
        - -ui
        - -server
        - -client=0.0.0.0
        - -config-file=/etc/config/consul-config.json
        - -bootstrap-expect=1
        env:
        - name: CHECKPOINT_DISABLE
          value: "1"
        image: consul:1.4.0
        imagePullPolicy: IfNotPresent
        name: consul
        ports:
        - containerPort: 8300
          name: server
        - containerPort: 8301
          name: serf
        - containerPort: 8400
          name: client
        - containerPort: 8500
          name: api
        resources:
          requests:
            cpu: 100m
            memory: 500Mi
        volumeMounts:
        - mountPath: /etc/config
          name: consul
      - args:
        - --namespace=$(POD_NAMESPACE)
        - --pod-name=$(POD_NAME)
        env:
        - name: POD_NAMESPACE
          valueFrom:
            fieldRef:
              fieldPath: metadata.namespace
        - name: POD_NAME
          valueFrom:
            fieldRef:
              fieldPath: metadata.name
        image: quay.io/weaveworks/consul-sidekick:master-f18ad13
        imagePullPolicy: IfNotPresent
        name: sidekick
        volumeMounts:
        - mountPath: /etc/config
          name: consul
      - args:
        - --web.listen-address=:8000
        - --statsd.mapping-config=/etc/config/mapping
        image: prom/statsd-exporter:v0.8.1
        imagePullPolicy: IfNotPresent
        name: statsd-exporter
        ports:
        - containerPort: 8000
          name: http-metrics
        volumeMounts:
        - mountPath: /etc/config
          name: consul
      - args:
        - --consul.server=localhost:8500
        - --web.listen-address=:9107
        image: prom/consul-exporter:v0.4.0
        imagePullPolicy: IfNotPresent
        name: consul-exporter
        ports:
        - containerPort: 9107
          name: http-metrics
        volumeMounts:
        - mountPath: /etc/config
          name: consul
      serviceAccount: consul-sidekick
      volumes:
      - configMap:
          name: consul
        name: consul
---
apiVersion: v1
data:
  .htpasswd: bG9raTokYXByMSRINHlHaUdOZyRzc2w1L055bWFHRlJVdnhJVjFOeXIu
kind: Secret
metadata:
  name: gateway-secret
type: Opaque
---
apiVersion: apps/v1beta1
kind: Deployment
metadata:
  name: ingester
spec:
  minReadySeconds: 60
  replicas: 3
  revisionHistoryLimit: 10
  strategy:
    rollingUpdate:
      maxSurge: 0
      maxUnavailable: 1
  template:
    metadata:
      annotations:
        config_hash: 1c32bcf2f7a7db8819894e2b932df63d
      labels:
        name: ingester
    spec:
      affinity:
        podAntiAffinity:
          requiredDuringSchedulingIgnoredDuringExecution:
          - labelSelector:
              matchLabels:
                name: ingester
            topologyKey: kubernetes.io/hostname
      containers:
      - args:
        - -config.file=/etc/loki/config.yaml
        - -target=ingester
        image: grafana/loki:latest
        imagePullPolicy: IfNotPresent
        name: ingester
        ports:
        - containerPort: 80
          name: http-metrics
        - containerPort: 9095
          name: grpc
        readinessProbe:
          httpGet:
            path: /ready
            port: 80
          initialDelaySeconds: 15
          timeoutSeconds: 1
        resources:
          limits:
            cpu: "2"
            memory: 10Gi
          requests:
            cpu: "1"
            memory: 5Gi
        volumeMounts:
        - mountPath: /etc/loki
          name: loki
      terminationGracePeriodSeconds: 4800
      volumes:
      - configMap:
          name: loki
        name: loki
---
apiVersion: v1
kind: Namespace
metadata:
  name: loki
---
apiVersion: v1
kind: Service
metadata:
  name: gateway
spec:
  ports:
  - name: nginx-http
    port: 80
    targetPort: 80
  selector:
    name: gateway
---
apiVersion: rbac.authorization.k8s.io/v1beta1
kind: RoleBinding
metadata:
  name: consul-sidekick
  namespace: loki
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: Role
  name: consul-sidekick
subjects:
- kind: ServiceAccount
  name: consul-sidekick
  namespace: loki
---
apiVersion: rbac.authorization.k8s.io/v1beta1
kind: Role
metadata:
  name: consul-sidekick
  namespace: loki
rules:
- apiGroups:
  - ""
  - extensions
  - apps
  resources:
  - pods
  - replicasets
  verbs:
  - get
  - list
  - watch
---
apiVersion: v1
kind: ServiceAccount
metadata:
  name: consul-sidekick
  namespace: loki
---
apiVersion: v1
kind: Service
metadata:
  name: distributor
spec:
  ports:
  - name: distributor-http-metrics
    port: 80
    targetPort: 80
  - name: distributor-grpc
    port: 9095
    targetPort: 9095
  selector:
    name: distributor
---
apiVersion: v1
data:
  nginx.conf: |
    worker_processes  5;  ## Default: 1
    error_log  /dev/stderr;
    pid        /tmp/nginx.pid;
    worker_rlimit_nofile 8192;

    events {
      worker_connections  4096;  ## Default: 1024
    }

    http {
      default_type application/octet-stream;
      log_format   main '$remote_addr - $remote_user [$time_local]  $status '
        '"$request" $body_bytes_sent "$http_referer" '
        '"$http_user_agent" "$http_x_forwarded_for"';
      access_log   /dev/stderr  main;
      sendfile     on;
      tcp_nopush   on;
      resolver kube-dns.kube-system.svc.cluster.local;

      server {
        listen               80;
        auth_basic           "Prometheus";
        auth_basic_user_file /etc/nginx/secrets/.htpasswd;
        proxy_set_header     X-Scope-OrgID 1;

        location = /api/prom/push {
          proxy_pass      http://distributor.loki.svc.cluster.local$request_uri;
        }

        location ~ /api/prom/.* {
          proxy_pass      http://querier.loki.svc.cluster.local$request_uri;
        }
      }
    }
kind: ConfigMap
metadata:
  name: gateway-config
---
apiVersion: apps/v1beta1
kind: Deployment
metadata:
  name: gateway
spec:
  minReadySeconds: 10
  replicas: 3
  revisionHistoryLimit: 10
  template:
    metadata:
      labels:
        name: gateway
    spec:
      affinity:
        podAntiAffinity:
          requiredDuringSchedulingIgnoredDuringExecution:
          - labelSelector:
              matchLabels:
                name: gateway
            topologyKey: kubernetes.io/hostname
      containers:
      - image: nginx:1.15.1-alpine
        imagePullPolicy: IfNotPresent
        name: nginx
        ports:
        - containerPort: 80
          name: http
        resources:
          requests:
            cpu: 50m
            memory: 100Mi
        volumeMounts:
        - mountPath: /etc/nginx
          name: gateway-config
        - mountPath: /etc/nginx/secrets
          name: gateway-secret
      volumes:
      - configMap:
          name: gateway-config
        name: gateway-config
      - name: gateway-secret
        secret:
          defaultMode: 420
          secretName: gateway-secret
---
apiVersion: v1
kind: Service
metadata:
  name: querier
spec:
  ports:
  - name: querier-http-metrics
    port: 80
    targetPort: 80
  - name: querier-grpc
    port: 9095
    targetPort: 9095
  selector:
    name: querier
---
apiVersion: v1
kind: Service
metadata:
  name: table-manager
spec:
  ports:
  - name: table-manager-http-metrics
    port: 80
    targetPort: 80
  - name: table-manager-grpc
    port: 9095
    targetPort: 9095
  selector:
    name: table-manager
---
apiVersion: v1
data:
  consul-config.json: '{"leave_on_terminate": true, "telemetry": {"dogstatsd_addr":
    "127.0.0.1:9125"}}'
  mapping: |
    mappings:
    - match: consul.*.runtime.*
      name: consul_runtime
      labels:
        type: $2
    - match: consul.runtime.total_gc_pause_ns
      name: consul_runtime_total_gc_pause_ns
      labels:
        type: $2
    - match: consul.consul.health.service.query-tag.*.*.*
      name: consul_health_service_query_tag
      labels:
        query: $1.$2.$3
    - match: consul.consul.health.service.query-tag.*.*.*.*
      name: consul_health_service_query_tag
      labels:
        query: $1.$2.$3.$4
    - match: consul.consul.health.service.query-tag.*.*.*.*.*
      name: consul_health_service_query_tag
      labels:
        query: $1.$2.$3.$4.$5
    - match: consul.consul.health.service.query-tag.*.*.*.*.*.*
      name: consul_health_service_query_tag
      labels:
        query: $1.$2.$3.$4.$5.$6
    - match: consul.consul.health.service.query-tag.*.*.*.*.*.*.*
      name: consul_health_service_query_tag
      labels:
        query: $1.$2.$3.$4.$5.$6.$7
    - match: consul.consul.health.service.query-tag.*.*.*.*.*.*.*.*
      name: consul_health_service_query_tag
      labels:
        query: $1.$2.$3.$4.$5.$6.$7.$8
    - match: consul.consul.health.service.query-tag.*.*.*.*.*.*.*.*.*
      name: consul_health_service_query_tag
      labels:
        query: $1.$2.$3.$4.$5.$6.$7.$8.$9
    - match: consul.consul.health.service.query-tag.*.*.*.*.*.*.*.*.*.*
      name: consul_health_service_query_tag
      labels:
        query: $1.$2.$3.$4.$5.$6.$7.$8.$9.$10
    - match: consul.consul.health.service.query-tag.*.*.*.*.*.*.*.*.*.*.*
      name: consul_health_service_query_tag
      labels:
        query: $1.$2.$3.$4.$5.$6.$7.$8.$9.$10.$11
    - match: consul.consul.health.service.query-tag.*.*.*.*.*.*.*.*.*.*.*.*
      name: consul_health_service_query_tag
      labels:
        query: $1.$2.$3.$4.$5.$6.$7.$8.$9.$10.$11.$12
    - match: consul.consul.catalog.deregister
      name: consul_catalog_deregister
      labels: {}
    - match: consul.consul.dns.domain_query.*.*.*.*.*
      name: consul_dns_domain_query
      labels:
        query: $1.$2.$3.$4.$5
    - match: consul.consul.health.service.not-found.*
      name: consul_health_service_not_found
      labels:
        query: $1
    - match: consul.consul.health.service.query.*
      name: consul_health_service_query
      labels:
        query: $1
    - match: consul.*.memberlist.health.score
      name: consul_memberlist_health_score
      labels: {}
    - match: consul.serf.queue.*
      name: consul_serf_events
      labels:
        type: $1
    - match: consul.serf.snapshot.appendLine
      name: consul_serf_snapshot_appendLine
      labels:
        type: $1
    - match: consul.serf.coordinate.adjustment-ms
      name: consul_serf_coordinate_adjustment_ms
      labels: {}
    - match: consul.consul.rpc.query
      name: consul_rpc_query
      labels: {}
    - match: consul.*.consul.session_ttl.active
      name: consul_session_ttl_active
      labels: {}
    - match: consul.raft.rpc.*
      name: consul_raft_rpc
      labels:
        type: $1
    - match: consul.raft.rpc.appendEntries.storeLogs
      name: consul_raft_rpc_appendEntries_storeLogs
      labels:
        type: $1
    - match: consul.consul.fsm.persist
      name: consul_fsm_persist
      labels: {}
    - match: consul.raft.fsm.apply
      name: consul_raft_fsm_apply
      labels: {}
    - match: consul.raft.leader.lastContact
      name: consul_raft_leader_lastcontact
      labels: {}
    - match: consul.raft.leader.dispatchLog
      name: consul_raft_leader_dispatchLog
      labels: {}
    - match: consul.raft.commitTime
      name: consul_raft_commitTime
      labels: {}
    - match: consul.raft.replication.appendEntries.logs.*.*.*.*
      name: consul_raft_replication_appendEntries_logs
      labels:
        query: ${1}.${2}.${3}.${4}
    - match: consul.raft.replication.appendEntries.rpc.*.*.*.*
      name: consul_raft_replication_appendEntries_rpc
      labels:
        query: ${1}.${2}.${3}.${4}
    - match: consul.raft.replication.heartbeat.*.*.*.*
      name: consul_raft_replication_heartbeat
      labels:
        query: ${1}.${2}.${3}.${4}
    - match: consul.consul.rpc.request
      name: consul_rpc_requests
      labels: {}
    - match: consul.consul.rpc.accept_conn
      name: consul_rpc_accept_conn
      labels: {}
    - match: consul.memberlist.udp.*
      name: consul_memberlist_udp
      labels:
        type: $1
    - match: consul.memberlist.tcp.*
      name: consul_memberlist_tcp
      labels:
        type: $1
    - match: consul.memberlist.gossip
      name: consul_memberlist_gossip
      labels: {}
    - match: consul.memberlist.probeNode
      name: consul_memberlist_probenode
      labels: {}
    - match: consul.memberlist.pushPullNode
      name: consul_memberlist_pushpullnode
      labels: {}
    - match: consul.http.*
      name: consul_http_request
      labels:
        method: $1
        path: /
    - match: consul.http.*.*
      name: consul_http_request
      labels:
        method: $1
        path: /$2
    - match: consul.http.*.*.*
      name: consul_http_request
      labels:
        method: $1
        path: /$2/$3
    - match: consul.http.*.*.*.*
      name: consul_http_request
      labels:
        method: $1
        path: /$2/$3/$4
    - match: consul.http.*.*.*.*.*
      name: consul_http_request
      labels:
        method: $1
        path: /$2/$3/$4/$5
    - match: consul.consul.leader.barrier
      name: consul_leader_barrier
      labels: {}
    - match: consul.consul.leader.reconcileMember
      name: consul_leader_reconcileMember
      labels: {}
    - match: consul.consul.leader.reconcile
      name: consul_leader_reconcile
      labels: {}
    - match: consul.consul.fsm.coordinate.batch-update
      name: consul_fsm_coordinate_batch_update
      labels: {}
    - match: consul.consul.fsm.autopilot
      name: consul_fsm_autopilot
      labels: {}
    - match: consul.consul.fsm.kvs.cas
      name: consul_fsm_kvs_cas
      labels: {}
    - match: consul.consul.fsm.register
      name: consul_fsm_register
      labels: {}
    - match: consul.consul.fsm.deregister
      name: consul_fsm_deregister
      labels: {}
    - match: consul.consul.fsm.tombstone.reap
      name: consul_fsm_tombstone_reap
      labels: {}
    - match: consul.consul.catalog.register
      name: consul_catalog_register
      labels: {}
    - match: consul.consul.catalog.deregister
      name: consul_catalog_deregister
      labels: {}
    - match: consul.consul.leader.reapTombstones
      name: consul_leader_reapTombstones
      labels: {}
kind: ConfigMap
metadata:
  name: consul
---
apiVersion: v1
kind: Service
metadata:
  name: consul
spec:
  ports:
  - name: consul-server
    port: 8300
    targetPort: 8300
  - name: consul-serf
    port: 8301
    targetPort: 8301
  - name: consul-client
    port: 8400
    targetPort: 8400
  - name: consul-api
    port: 8500
    targetPort: 8500
  - name: statsd-exporter-http-metrics
    port: 8000
    targetPort: 8000
  - name: consul-exporter-http-metrics
    port: 9107
    targetPort: 9107
  selector:
    name: consul
---
apiVersion: apps/v1beta1
kind: Deployment
metadata:
  name: distributor
spec:
  minReadySeconds: 10
  replicas: 3
  revisionHistoryLimit: 10
  template:
    metadata:
      annotations:
        config_hash: 1c32bcf2f7a7db8819894e2b932df63d
      labels:
        name: distributor
    spec:
      affinity:
        podAntiAffinity:
          requiredDuringSchedulingIgnoredDuringExecution:
          - labelSelector:
              matchLabels:
                name: distributor
            topologyKey: kubernetes.io/hostname
      containers:
      - args:
        - -config.file=/etc/loki/config.yaml
        - -target=distributor
        image: grafana/loki:latest
        imagePullPolicy: IfNotPresent
        name: distributor
        ports:
        - containerPort: 80
          name: http-metrics
        - containerPort: 9095
          name: grpc
        resources:
          limits:
            cpu: "1"
            memory: 200Mi
          requests:
            cpu: 500m
            memory: 100Mi
        volumeMounts:
        - mountPath: /etc/loki
          name: loki
      volumes:
      - configMap:
          name: loki
        name: loki
---
apiVersion: apps/v1beta1
kind: Deployment
metadata:
  name: querier
spec:
  minReadySeconds: 10
  replicas: 3
  revisionHistoryLimit: 10
  template:
    metadata:
      annotations:
        config_hash: 1c32bcf2f7a7db8819894e2b932df63d
      labels:
        name: querier
    spec:
      affinity:
        podAntiAffinity:
          requiredDuringSchedulingIgnoredDuringExecution:
          - labelSelector:
              matchLabels:
                name: querier
            topologyKey: kubernetes.io/hostname
      containers:
      - args:
        - -config.file=/etc/loki/config.yaml
        - -target=querier
        image: grafana/loki:latest
        imagePullPolicy: IfNotPresent
        name: querier
        ports:
        - containerPort: 80
          name: http-metrics
        - containerPort: 9095
          name: grpc
        volumeMounts:
        - mountPath: /etc/loki
          name: loki
      volumes:
      - configMap:
          name: loki
        name: loki

It looks like they're using Bigtable and GCS buckets for storage. I honestly don't know whether our OOM issues would actually be alleviated by this setup.
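If we did want to trial Loki without the Bigtable/GCS dependency, my understanding (an assumption based on Loki's local/single-binary examples, not verified against their docs) is that the storage_config can point at local BoltDB and filesystem backends instead:

# Sketch of a local-storage Loki config (assumption; paths are examples).
schema_config:
  configs:
  - from: "0"
    store: boltdb             # index in local BoltDB instead of Bigtable
    object_store: filesystem  # chunks on local disk instead of GCS
    schema: v9
    index:
      prefix: loki_index_
      period: 168h
storage_config:
  boltdb:
    directory: /data/loki/index
  filesystem:
    directory: /data/loki/chunks

That wouldn't address horizontal scaling, but it would let us evaluate the label-based query model without standing up the extra infrastructure.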

Since most of this isn't well documented and looks complex, I'm tempted to just start looking into an ELK stack instead. Thoughts @keegancsmith @tsenart @slimsag @beyang?
