cilium/cilium 6992

eBPF-based Networking, Security, and Observability

nirmoy/ansible 0

Ansible is a radically simple IT automation platform that makes your applications and systems easier to deploy. Avoid writing scripts or custom code to deploy and update your applications— automate in a language that approaches plain English, using SSH, with no agents to install on remote systems.

nirmoy/bazel 0

a fast, scalable, multi-language and extensible build system

nirmoy/bcc 0

BCC - Tools for BPF-based Linux IO analysis, networking, monitoring, and more

nirmoy/boxed 0

Put command output in a box

nirmoy/bzcl 0

commandline interface for bugzilla.

nirmoy/cilium 0

API-aware Networking and Security for Containers based on BPF

nirmoy/cilium-etcd-operator 0

Operator to manage Cilium's etcd cluster

issue comment kubernetes/autoscaler

pods scheduled during taint DeletionCandidateOfClusterAutoscaler crash when node deleted

@fejta-bot: Closing this issue.

In response to this:

Rotten issues close after 30d of inactivity. Reopen the issue with /reopen. Mark the issue as fresh with /remove-lifecycle rotten.

Send feedback to sig-testing, kubernetes/test-infra and/or fejta. /close

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

mvz27

comment created time in 2 hours

issue closed kubernetes/autoscaler

pods scheduled during taint DeletionCandidateOfClusterAutoscaler crash when node deleted

GitLab runner (Kubernetes executor) pods with the annotation safe-to-evict: false are STARTED on nodes after those nodes have been marked for scale-down (during the 10-minute graceful period). When the nodes are actually deleted (after 10 minutes), these pods (gitlab-runner jobs) crash (are deleted). Reproduced on AWS EKS 1.14 with cluster-autoscaler 1.14.5.

closed time in 2 hours

mvz27

issue comment kubernetes/autoscaler

pods scheduled during taint DeletionCandidateOfClusterAutoscaler crash when node deleted

Rotten issues close after 30d of inactivity. Reopen the issue with /reopen. Mark the issue as fresh with /remove-lifecycle rotten.

Send feedback to sig-testing, kubernetes/test-infra and/or fejta. /close

mvz27

comment created time in 2 hours

issue comment cilium/cilium

Cilium not shared the policy between nodes

@myugan with Kubernetes you can install Cilium Network Policies, which are distributed to all nodes.
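
For illustration, a minimal CiliumNetworkPolicy of this kind could look as follows (the app labels here are made up for the example):

apiVersion: "cilium.io/v2"
kind: CiliumNetworkPolicy
metadata:
  name: allow-frontend-to-backend
spec:
  endpointSelector:
    matchLabels:
      app: backend
  ingress:
    - fromEndpoints:
        - matchLabels:
            app: frontend

Applied with kubectl, the policy is stored in the Kubernetes API, and every Cilium agent watches the resource and enforces it on its own node.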

myugan

comment created time in 5 hours

pull request comment cilium/cilium

bpf: Don't compile unused BPF sections

Runtime-4.9 failed with known flake https://github.com/cilium/cilium/issues/14125. All other builds are green. Marking as ready-to-merge.

pchaigno

comment created time in 5 hours

issue closed GoogleContainerTools/kaniko

How can I pass the secret to Image with Kaniko?

Hi all!

I'm trying to build an image inside the GitLab CI process with Kaniko. The problem is that during the build process I need access to a private GitLab repository. With docker:stable-dind I could easily pass the secret

   - docker build
      ...
      --secret id=gitconfig,src=$HOME/.gitconfig
      ...

and then just mount it inside the Dockerfile

RUN --mount=type=secret,id=gitconfig,dst=/root/.gitconfig --mount=type=ssh composer install

How can I achieve the same behavior with Kaniko?

closed time in 8 hours

Zlob

issue comment GoogleContainerTools/kaniko

How can I pass the secret to Image with Kaniko?

@LaurentTrk thanks a lot, you've really saved my day.

In the end, I just put my git config into the /kaniko/git/config file in GitLab CI and then used it in the Dockerfile with ARG XDG_CONFIG_HOME="/kaniko".
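
For reference, a sketch of this workaround (the registry host and the insteadOf rewrite rule below are placeholders, not taken from the thread):

.gitlab-ci.yml:

build:
  image:
    name: gcr.io/kaniko-project/executor:debug
    entrypoint: [""]
  script:
    # /kaniko is the executor's own directory and is excluded from snapshots,
    # so a git config written here is visible during RUN steps but never ends
    # up in an image layer.
    - mkdir -p /kaniko/git
    - printf '[url "https://gitlab-ci-token:%s@gitlab.example.com/"]\n\tinsteadOf = git@gitlab.example.com:\n' "$CI_JOB_TOKEN" > /kaniko/git/config
    - /kaniko/executor --context "$CI_PROJECT_DIR" --dockerfile Dockerfile --destination "$CI_REGISTRY_IMAGE:latest"

Dockerfile:

FROM composer:2
# A build ARG is exported into the environment of RUN steps, so git resolves
# its config from $XDG_CONFIG_HOME/git/config, i.e. /kaniko/git/config.
ARG XDG_CONFIG_HOME="/kaniko"
RUN composer install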

Hope it will help others too.

Zlob

comment created time in 8 hours

issue comment cilium/cilium

Cilium not shared the policy between nodes

Hi @aanm, as you said this can only be done through Kubernetes. Does it use etcd to store those policies?

myugan

comment created time in 13 hours

pull request comment cilium/cilium

Pr/jrajahalme/test envoy ARM64

test-me-please

jrajahalme

comment created time in 15 hours

PR opened cilium/cilium

Pr/jrajahalme/test envoy amd64
kind/enhancement release-note/major

Switch to multi-arch Envoy build, allowing Cilium to be locally installed on both x86-64 and ARM64.

Most of our tests likely still depend on non-multi-arch images. Update the local Envoy smoke test (tests/envoy-smoke-test.sh) to use multi-arch images for client (curlimages/curl) and server (httpd). To make this work the test site data is now in tests/testsite, and the client container is started with a local query that takes forever so that we can issue docker exec commands on a running container.

Amend the policies in test/envoy-smoke-test.sh to never completely remove the policy from the endpoints. This avoids bpf compilations/loads that otherwise took place due to policy enforcement mode changing when policy was completely removed. This change halves the execution time of the test.

Cilium now builds and installs on ARM64 machines.
+63 -15

0 comments

6 changed files

pr created time in 15 hours

push event cilium/cilium

Paul Chaignon

commit sha 8470528592a036d2248f393d1af669a53471c712

test: Disable K8sVerifier on 4.19 and net-next CI pipelines

K8sVerifier was mistakenly enabled on 4.19 and net-next in eeecf15 ("test: Collect bpf_*.o artifacts on K8sVerifier failures"). This commit reverts it.

Fixes: eeecf15 ("test: Collect bpf_*.o artifacts on K8sVerifier failures")
Signed-off-by: Paul Chaignon <paul@cilium.io>

Maciej Kwiek

commit sha f35478b4fbeb85ff6b28fd79a4fca7b8f6bedce1

ci: Add quarantine capabilities to k8s-all jenkinsfile Signed-off-by: Maciej Kwiek <maciej@isovalent.com>

Tobias Klauser

commit sha 37a41daeff6c9256df5c9fafd5b0f8e5d522694f

hubble/observer/types: fix comment for AgentEvent.Message

It might contain a monitorAPI.AgentNotifyMessage as emitted by the *Message constructor funcs.

Signed-off-by: Tobias Klauser <tklauser@distanz.ch>

Tobias Klauser

commit sha 513ae0a9dbaeeee7eda04d660c6724b6fad0147f

monitor/api: fix godoc comments

Correct godoc comments for type AgentNotifyMessage and func StartMessage to state the proper name.

Signed-off-by: Tobias Klauser <tklauser@distanz.ch>

Tobias Klauser

commit sha 76e0cfe374f9d2c9751a8099f4268e5a0990c7a0

monitor/api: format agent start timestamp in RFC3339Nano format

time.Time.String() may include a monotonic clock reading, e.g. when t is time.Now(), which is e.g. the case for the agent start timestamp. The godoc for time.Time.String [1] states:

  If the time has a monotonic clock reading, the returned string includes a final field "m=±<value>", where value is the monotonic clock reading formatted as a decimal number of seconds.

[1] https://golang.org/pkg/time/#Time.String

The format including the monotonic clock reading is hard to decode because there is no predefined format string in the stdlib time package. Also, the monotonic clock reading isn't really useful for the agent start timestamp; the walltime clock should be enough. Thus, format the timestamp string in RFC3339Nano format, which can easily be decoded using time.Parse(time.RFC3339Nano, t), e.g. in the hubble API parser.

Signed-off-by: Tobias Klauser <tklauser@distanz.ch>
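
The difference is easy to see in a few lines of Go (an illustrative snippet, not from the commit):

package main

import (
	"fmt"
	"time"
)

func main() {
	t := time.Now()
	// String() keeps the monotonic reading, ending in e.g. "m=+0.000012345",
	// for which the time package offers no parse layout.
	fmt.Println(t.String())

	// RFC3339Nano drops the monotonic part and round-trips cleanly.
	s := t.Format(time.RFC3339Nano)
	parsed, err := time.Parse(time.RFC3339Nano, s)
	if err != nil {
		panic(err)
	}
	fmt.Println(parsed)
}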

Joe Stringer

commit sha baf84ad0d4c9acb1a285ef69a61a4df206ded357

bugtool: Add lsmod

Module listings can allow figuring out the availability of certain functionality like iptables or aes modules which can be useful when debugging certain types of problems.

Signed-off-by: Joe Stringer <joe@cilium.io>

Paul Chaignon

commit sha 759dd49f1eb66d35b32f2545281324b37a295485

test: Disable the host firewall in endpoint routes tests

The host firewall cannot work in combination with per-endpoint routes yet. When opening a PR with label ci/host-firewall, the host firewall is enabled by default in all tests. It must be explicitly disabled in tests with per-endpoint routes to avoid those tests failing.

Signed-off-by: Paul Chaignon <paul@cilium.io>

Paul Chaignon

commit sha dbc1c72da0c90c9e455fc1b5022a34533f9bb3ca

test: Disable the host firewall in Maglev tests

Support for the host firewall + Maglev is currently broken due to an excessive BPF program size. This commit explicitly disables the host firewall to avoid tests failing when running with label ci/host-firewall or with env. variable HOST_FIREWALL=1.

Related: https://github.com/cilium/cilium/issues/14047
Signed-off-by: Paul Chaignon <paul@cilium.io>

Joe Stringer

commit sha 1eedfb3af646eda3abbbbe4df15b896c3a41eaea

Makefile: Remove microk8s prepull script

The prepull script was a handy way to force microk8s to pull the new image into the container runtime, but we can also just directly pull it in from microk8s.ctr which simplifies the deployment and prevents issues where some kubernetes image pull problem prevents the image from being imported.

Signed-off-by: Joe Stringer <joe@cilium.io>

Joe Stringer

commit sha 546b4645b9e45343f0ab66352f833be66b36f3a9

docs: Improve DNS port documentation

Some users had expressed confusion when using non-standard ports in conjunction with DNS policy. Clarify that when there is a k8s service, the CoreDNS / kube-dns port must be the backend port.

Signed-off-by: Joe Stringer <joe@cilium.io>

Maciej Kwiek

commit sha ff897f7960b5f9f4f908dc53402a15303d7f4b7c

ci: fix nightly image

hubble-perf-test docker repo no longer exists

Signed-off-by: Maciej Kwiek <maciej@isovalent.com>

Paul Chaignon

commit sha 7570d08b8bf3498d9b3f04ff8d29f27ee102b429

test: Quarantine flakes from k8s-all CI pipeline

"Check vxlan connectivity with per-endpoint routes" and "Check iptables masquerading with random-fully" are currently failing on the kubernetes-all CI pipeline for most K8s versions. This commit quarantines those tests. The list of K8s versions to exclude was retrieved using the CI dashboard [1].

1 - https://datastudio.google.com/s/iCx91Z2LNH8

Signed-off-by: Paul Chaignon <paul@cilium.io>

Joe Stringer

commit sha 679f9132e3a856151cc6ed4d3c9883094ae387ed

kvstore: Fix event watcher serialization

When using the watcher in log messages with JSON-based logging, logrus would give up on trying to generate the log message and print this to the logs instead:

  Failed to obtain reader, failed to marshal fields to JSON, json: unsupported type: kvstore.EventChan

Fix it by fixing the JSON serialization tags to the structure to avoid serializing fields that don't make sense to be serialized, and to export the fields that do make sense to be serialized.

Manually tested by applying this diff:

  diff --git a/pkg/kvstore/base_test.go b/pkg/kvstore/base_test.go
  index e9ee7da296bf..eb5a3548039b 100644
  --- a/pkg/kvstore/base_test.go
  +++ b/pkg/kvstore/base_test.go
  @@ -292,3 +292,10 @@ func (s *BaseTests) TestListAndWatch(c *C) {
       w.Stop()
   }
  +
  +func (s *BaseTests) TestFoo(c *C) {
  +    w := ListAndWatch(context.TODO(), "testWatcher2", "foo2/", 100)
  +    c.Assert(c, Not(IsNil))
  +
  +    log.WithField(fieldWatcher, w).Fatal("Stopped watcher")
  +}
  diff --git a/pkg/logging/logging.go b/pkg/logging/logging.go
  index 9989e8db0280..6a651c0c87f4 100644
  --- a/pkg/logging/logging.go
  +++ b/pkg/logging/logging.go
  @@ -50,7 +50,7 @@ const (
       // DefaultLogFormat is the string representation of the default logrus.Formatter
       // we want to use (possible values: text or json)
  -    DefaultLogFormat LogFormat = LogFormatText
  +    DefaultLogFormat LogFormat = LogFormatJSON
   )
   var (

Fixes: #14028
Signed-off-by: Joe Stringer <joe@cilium.io>
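
The gist of such a fix, reduced to a standalone Go example (the struct below is illustrative, not Cilium's actual watcher type):

package main

import (
	"encoding/json"
	"fmt"
)

type watcher struct {
	// Exported and tagged: these fields show up in JSON log output.
	Name   string `json:"name"`
	Prefix string `json:"prefix"`
	// Channels are unsupported by encoding/json; `json:"-"` skips the
	// field instead of failing the whole Marshal call.
	Events chan string `json:"-"`
}

func main() {
	w := watcher{Name: "testWatcher", Prefix: "foo/", Events: make(chan string)}
	b, err := json.Marshal(w)
	if err != nil {
		panic(err)
	}
	fmt.Println(string(b)) // {"name":"testWatcher","prefix":"foo/"}
}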

Tam Mach

commit sha 11e38d616725de18cdf3c0289425b7b993af4248

fix/helm: Correct nodeSelector values

This commit is to use the correct nodeSelectors in etcd, operator and preflight templates. Add deprecated note for .Values.nodeSelector option.

Closes #14005
Signed-off-by: Tam Mach <sayboras@yahoo.com>

Joe Stringer

commit sha e38fd961baf798ab70ea87447c9651cb022f100e

helm: Fix description for clustermesh

With the `disableEnvoyVersionCheck` option commented out and no subsequent comment for the `clustermesh` option, the autogeneration script was pulling the description for `disableEnvoyVersionCheck` in for `clustermesh`. Fix it by removing the dashes so no description is generated for this particular option.

Signed-off-by: Joe Stringer <joe@cilium.io>

Tom Payne

commit sha 30841812cb3c8b29eb9ad638b701e207a00c6a98

docs: Fix cilium typos Signed-off-by: Tom Payne <tom@isovalent.com>

Tom Payne

commit sha e0e941589beb4f673f19e60cfa5fc4ae7609956c

pkg/monitor/agent: Fix cilium typos Signed-off-by: Tom Payne <tom@isovalent.com>

Tobias Klauser

commit sha 97f3b485b48f7b0b5419352eab51c4235fc06b80

monitor: merge EndpointCreateNotification and EndpointDeleteNotification

The types EndpointCreateNotification and EndpointDeleteNotification contain the same fields. Thus, merge them into a single type named EndpointNotification, which is used by func EndpointCreateMessage and EndpointDeleteMessage. Because the type is embedded into AgentNotifyMessage, the consumer can still determine whether it was a create or delete event based on AgentNotifyMessage.Type. This change will simplify parsing of endpoint create/delete notifications when exposing agent events for Hubble.

Signed-off-by: Tobias Klauser <tklauser@distanz.ch>
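
The shape of the change, as a generic Go sketch (field and constant names are invented for illustration, not the actual Cilium types):

package main

import "fmt"

// One type replaces the two former, field-identical notification types.
type EndpointNotification struct {
	ID     uint64
	Labels []string
}

type AgentNotifyMessage struct {
	Type         string // e.g. "EndpointCreated" or "EndpointDeleted"
	Notification EndpointNotification
}

func main() {
	msg := AgentNotifyMessage{
		Type:         "EndpointDeleted",
		Notification: EndpointNotification{ID: 42, Labels: []string{"k8s:app=demo"}},
	}
	// Consumers discriminate on the message type, not on the Go type.
	fmt.Println(msg.Type, msg.Notification.ID)
}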

Didier Durand

commit sha 05ac4acda0740e9f7f4aa6cbd897a720914722a9

fixing 1 typo in terminology.rst Signed-off-by: Didier Durand <durand.didier@gmail.com>

Daniel Borkmann

commit sha 2a3e5d43b6a5e450c6d6949ff057253d45c035ce

cilium: disable bind-protection in kube-proxy free probe mode

The probe mode is expected to only run alongside kube-proxy as hybrid. There was confusion because kube-proxy was throwing (harmless) warnings to its log that it could not bind sockets to service ports in the hostns. This is due to Cilium performing bind protection right out of the bind(2) syscall with eBPF. To avoid this confusion, defer to kube-proxy to bind sockets instead. This is less efficient and consumes more resources, but if users want to avoid the overhead, they would run kube-proxy free in strict mode anyway, where Cilium does the bind protection by default.

Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>

push time in 15 hours

push event cilium/proxy

Jarno Rajahalme

commit sha 6c980c60fb96e27faae370af792217316482c2ed

Docker: Use the new multi-arch builder Signed-off-by: Jarno Rajahalme <jarno@covalent.io>

push time in 16 hours

push event cilium/proxy

Jarno Rajahalme

commit sha 646429b830971c5db07dfcb46525c2705466d9bb

Docker: Use the new multi-arch builder Signed-off-by: Jarno Rajahalme <jarno@covalent.io>

push time in 16 hours

push event cilium/proxy

Jarno Rajahalme

commit sha bd076b0fb860fdd7d5fc4c1877c9d0cd603c3096

Docker: Use the new multi-arch builder Signed-off-by: Jarno Rajahalme <jarno@covalent.io>

push time in 16 hours

pull request comment cilium/cilium

Fix for IPVLAN after init.sh partial rewrite in Go

test-me-please

mrostecki

comment created time in 16 hours

push event cilium/cilium

Daniel Borkmann

commit sha 2a3e5d43b6a5e450c6d6949ff057253d45c035ce

cilium: disable bind-protection in kube-proxy free probe mode

The probe mode is expected to only run alongside kube-proxy as hybrid. There was confusion because kube-proxy was throwing (harmless) warnings to its log that it could not bind sockets to service ports in the hostns. This is due to Cilium performing bind protection right out of the bind(2) syscall with eBPF. To avoid this confusion, defer to kube-proxy to bind sockets instead. This is less efficient and consumes more resources, but if users want to avoid the overhead, they would run kube-proxy free in strict mode anyway, where Cilium does the bind protection by default.

Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>

Paul Chaignon

commit sha 6d0a4319718c10fb0671e312cc05277e51247e20

.travis: Run race detection builds on master commits only

We had to temporarily subscribe to Travis CI because we consumed our 10000 free credits. Our current plan however only allows for two concurrent builds. With four builds per commit, we are constantly running behind, with Travis CI builds now taking longer to be scheduled than it takes our Jenkins tests to finish. Long gone are the days when we considered Travis CI a viable smoke test... This commit attempts to alleviate the issue by running our race detection builds only on master commits.

Signed-off-by: Paul Chaignon <paul@cilium.io>

Paul Chaignon

commit sha 816b3231cdbc39f4bcdd3e6f5b40a056459a478c

vagrant: Bump all Vagrant box versions

These new images include the updated, pre-pulled Docker images: https://github.com/cilium/packer-ci-build/pull/245

Signed-off-by: Paul Chaignon <paul@cilium.io>

Vlad Ungureanu

commit sha 9dc81301af222cc9713b0941a85102ddf361bc44

Consolidate ec2 client create call Signed-off-by: Vlad Ungureanu <vladu@palantir.com>

Paul Chaignon

commit sha acb2daae88ca0363dd98a20ccc484f13cb4d2578

test: Use NFS by default for test VMs

The new K8sVerifier test compiles some Cilium binaries inside the VM, which can lead to 'interrupted system call' errors. Using NFS should fix it by speeding up the filesystem accesses. This commit switches the test VMs to use NFS by default, thereby enabling NFS in our CI. NFS remains disabled in the CI's Runtime tests because it leads to permission errors [1].

1 - https://jenkins.cilium.io/job/Cilium-PR-Runtime-4.9/2739/consoleFull

Signed-off-by: Paul Chaignon <paul@cilium.io>

Sebastian Wicki

commit sha 1b2904410a73058ee81c92c7027f65bc3a5410b2

hubble/parser: Always preserve datapath numeric identity

This introduces a check that we do not overwrite the numeric security identity provided by the datapath trace point. Only if the datapath did not provide an identity (i.e. in `FROM_LXC` trace points) do we want to fall back on the identity from the user-space IP cache or endpoint manager. The numeric identity from the datapath can differ from the one we obtain from user-space (e.g. the endpoint manager or the IP cache), because the identity could have changed between the time the datapath event was created and the time the event reaches the Hubble parser. To aid in troubleshooting, we want to preserve what the datapath observed when it made the policy decision.

Signed-off-by: Sebastian Wicki <sebastian@isovalent.com>

Manuel Buil

commit sha d50075de642c913e59e65a58201edf7337420dba

Complete kube-router documentation

BUG: #14152

Kube-router fetches the CIDRs from Kubernetes, and thus the ipam: cluster-pool configuration does not really work well. This patch clarifies this in the kube-router documentation.

Signed-off-by: Manuel Buil <mbuil@suse.com>

Robin Hahling

commit sha 386964b6d6ce9a1d965256ea5ff2fc9dbf057396

api/observer: add version field to ServerStatusResponse

Knowing about the running version is useful, notably during a cluster upgrade.

Signed-off-by: Robin Hahling <robin.hahling@gw-computing.net>

Robin Hahling

commit sha b1885343e999785e94b48f1b7a6333f1643c6d02

api/observer: re-generate protobuf code Signed-off-by: Robin Hahling <robin.hahling@gw-computing.net>

Robin Hahling

commit sha 3f82e21f8270721dd165c317a97fce016588c1af

hubble: add build package to provide hubble server and relay version Signed-off-by: Robin Hahling <robin.hahling@gw-computing.net>

Robin Hahling

commit sha b07b21aa29f4dccd17ac445f38ed747ade51ac6e

hubble/observer: add version information to status command Signed-off-by: Robin Hahling <robin.hahling@gw-computing.net>

Robin Hahling

commit sha b614d18a54762c10d605e955c75e4a07ee04ef7c

hubble/relay: add version information to status command Signed-off-by: Robin Hahling <robin.hahling@gw-computing.net>

Robin Hahling

commit sha 95ff38bc2c633e6c64fa7574038fd03b1cca1b9c

api/observer: add GetNodes rpc endpoint

This endpoint is intended to be implemented by Hubble Relay to provide information about nodes and their status.

Signed-off-by: Robin Hahling <robin.hahling@gw-computing.net>

Robin Hahling

commit sha f0a5ce9c9cca43431181277b977ce51b5b8599c6

api/observer: re-generate protobuf code + add stub

The new generated code breaks implementations of observer server because they are missing the new GetNodes method. To ensure that every commit compiles on its own, add stubs to implementations of observer server.

Signed-off-by: Robin Hahling <robin.hahling@gw-computing.net>

Robin Hahling

commit sha ae069dcbc51fae055a91035ff2e508362f5f2472

hubble/relay: implement observer.GetNodes rpc endpoint Signed-off-by: Robin Hahling <robin.hahling@gw-computing.net>

Paul Chaignon

commit sha f380dd3ff740a3d169e3f24208aa34fffb7c967d

test: Avoid installing Cilium for K8sBandwidth if tests are skipped

The overall structure for test K8sBandwidth looks to have been extracted from K8sServices. It works fine but is more complex than necessary and leads to unintended behavior when tests are skipped. This commit simplifies the structure to have a single conditional Context (conditioned on net-next kernel) inside which the three It tests are run.

Cilium was also installed with the bandwidth manager enabled *before* the conditional Context. That installation would therefore happen regardless of whether bandwidth tests should actually be skipped, sometimes even leading to flakes on 4.9 kernels [1]. Removing this initial installation of Cilium implies that the test pods are now deployed (once for all tests) before Cilium is installed. We therefore need to wait for the test pods, with a new helper waitForTestPods(), after each re-installation of Cilium.

1 - https://jenkins.cilium.io/job/Cilium-PR-Ginkgo-Tests-K8s/3740/testReport/junit/Suite-k8s-1/16/K8sBandwidthTest_Checks_Bandwidth_Rate_Limiting/

Signed-off-by: Paul Chaignon <paul@cilium.io>

Alexandre Perrin

commit sha 1eec075404ba0d2508d08ae34974d7478850fa24

docs: Add missing Jobs to the Jenkins Trigger Phrases table Signed-off-by: Alexandre Perrin <alex@kaworu.ch>

Martynas Pumputis

commit sha 885a319876321cb05fbe7f24fbac0394d0990f8e

daemon: Fix netns usage in kpr privileged unit tests

Previously, the SetUpSuite() routine called netns.New(). It expected that the latter only creates a new netns without setting it. However, according to the docs it's not the case:

  package netns // import "github.com/vishvananda/netns"

  func New() (ns NsHandle, err error)
      New creates a new network namespace, sets it as current and returns a handle to it.

This meant that we changed the netns before locking the OS thread which could result in other Go runtime threads running in the test netns.

Fixes: b059c3185c ("daemon: Add unit tests for device detection")
Signed-off-by: Martynas Pumputis <m@lambda.lt>
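
The safe pattern the fix implies, as a self-contained sketch against the vishvananda/netns API quoted above (the helper name is made up):

package main

import (
	"runtime"

	"github.com/vishvananda/netns"
)

// inTestNetns runs fn inside a fresh network namespace and restores the
// original namespace afterwards.
func inTestNetns(fn func() error) error {
	// Lock the OS thread *before* switching namespaces, so the Go
	// scheduler cannot run other goroutines on a thread that is still
	// in the test netns.
	runtime.LockOSThread()
	defer runtime.UnlockOSThread()

	oldNs, err := netns.Get()
	if err != nil {
		return err
	}
	defer oldNs.Close()

	// netns.New() creates a new netns AND sets it as current.
	newNs, err := netns.New()
	if err != nil {
		return err
	}
	defer newNs.Close()
	defer netns.Set(oldNs) // runs before the thread is unlocked

	return fn()
}

func main() {
	_ = inTestNetns(func() error { return nil })
}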

Maciej Kwiek

commit sha 263421aae67a87eac76cf9ce2b047300dc3aa602

test: quarantine flaking datapathconfig tests on 1.17

This change extends the quarantine for the k8s-all job to k8s version 1.17, which will help us check whether the 1.18 job actually fails due to these flakes.

Signed-off-by: Maciej Kwiek <maciej@isovalent.com>

Jarno Rajahalme

commit sha 60bd47fe1b5897be76602acbe133cd28218994f5

fqdn: Delay ipcache upserts until policies have been updated

Add a map for newly allocated identities to the ipcache.AllocateCIDR functions that the caller can use to upsert the IPs to ipcache later, after affected endpoint policy maps have been updated. Use this new functionality on the DNS proxy code path, which makes sure that new policy map entries are in place before an IP received from a DNS server is placed in ipcache. This is really straightforward, as the logic for waiting was already in place for delaying the forwarding of the DNS response.

The policy update path still allows ipcache upserts at policy ingestion time rather than waiting for the policy maps to be updated. This means that new, more specific CIDRs (e.g., 10.0.0/24) in policies can still cause momentary drops on traffic currently using a less specific CIDR (e.g., 10.0/16).

Signed-off-by: Jarno Rajahalme <jarno@covalent.io>

push time in 16 hours

issue opened cilium/cilium

bpf: force svc/be ports to be same on agent side

https://github.com/cilium/cilium/pull/14193#pullrequestreview-540094098

created time in 17 hours

issue comment kubernetes/autoscaler

Release notes for patch release

Stale issues rot after 30d of inactivity. Mark the issue as fresh with /remove-lifecycle rotten. Rotten issues close after an additional 30d of inactivity.

If this issue is safe to close now please do so with /close.

Send feedback to sig-testing, kubernetes/test-infra and/or fejta. /lifecycle rotten

vivekbagade

comment created time in 17 hours

push event cilium/cilium

Daniel Borkmann

commit sha 1fd0457e042400a649aed5d2722b8ba417f620db

bpf: do not create CT entry for forwarding DSR services

Not needed here given the reply won't ever be seen on this node, so spare this expensive fast-path overhead (which needs to lock the map) when under DSR. We really only need to track the CT_SERVICE ones to pick an established backend.

Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>

Daniel Borkmann

commit sha 89c0f08cc13125deb2a4fe409d465b48301ba34c

bpf, cilium: add IPIP for DSR under XDP in LB-only mode

Add a new agent flag for the lb-only load-balancer which is able to select a DSR dispatch method (--bpf-lb-dsr-dispatch). This is used in direct routing for forwarding the original request IPIP encapsulated (v4v4 or v6v6) to the related remote service backend. This is an alternative to the IP option based dispatch which is the current default in the agent. Example invocation:

  # ./daemon/cilium-agent --enable-ipv4=true --enable-ipv6=true \
      --datapath-mode=lb-only --bpf-lb-algorithm=maglev \
      --bpf-lb-maglev-table-size=65521 --bpf-lb-mode=dsr \
      --bpf-lb-acceleration=native --bpf-lb-dsr-dispatch=ipip \
      --devices=enp2s0np0

Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>

push time in 17 hours

delete branch cilium/cilium

delete branch : pr/xdp-dsr-ipip

delete time in 17 hours

PR merged cilium/cilium

Reviewers
bpf: optimize dsr and add ipip support for lb-only
area/datapath lb-only release-note/minor

See commit msgs for details.

+317 -157

3 comments

13 changed files

borkmann

pr closed time in 17 hours

push event cilium/cilium

Daniel Borkmann

commit sha 3c8565d0ab27bbbc766fd42c831372cecfdb4c37

bpf, cilium: add IPIP for DSR under XDP in LB-only mode

Add a new agent flag for the lb-only load-balancer which is able to select a DSR dispatch method (--bpf-lb-dsr-dispatch). This is used in direct routing for forwarding the original request IPIP encapsulated (v4v4 or v6v6) to the related remote service backend. This is an alternative to the IP option based dispatch which is the current default in the agent. Example invocation:

  # ./daemon/cilium-agent --enable-ipv4=true --enable-ipv6=true \
      --datapath-mode=lb-only --bpf-lb-algorithm=maglev \
      --bpf-lb-maglev-table-size=65521 --bpf-lb-mode=dsr \
      --bpf-lb-acceleration=native --bpf-lb-dsr-dispatch=ipip \
      --devices=enp2s0np0

Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>

push time in 18 hours

pull request comment cilium/cilium

bpf: optimize dsr and add ipip support for lb-only

(all-green from CI side, only updated a help text in latest change)

borkmann

comment created time in 18 hours

Pull request review comment cilium/cilium

bpf: optimize dsr and add ipip support for lb-only

 func initKubeProxyReplacementOptions() (strict bool) {
 				option.NodePortAcceleration, option.NodePortAccelerationDisabled, option.TunnelName, option.TunnelDisabled)
 			}
 		}
+
+		if option.Config.NodePortMode == option.NodePortModeDSR &&
+			option.Config.LoadBalancerDSRDispatch == option.DSRDispatchIPIP {
+			if option.Config.DatapathMode != datapathOption.DatapathModeLBOnly {
+				log.Fatalf("DSR dispatch mode %s only supported for standalone load balancer", option.Config.LoadBalancerDSRDispatch)

Ok, will add once CI is through and green, then update and have smoke tests run.

borkmann

comment created time in 20 hours

Pull request review comment cilium/cilium

bpf: optimize dsr and add ipip support for lb-only

 cilium-agent [flags]
       --bpf-fragments-map-max int                            Maximum number of entries in fragments tracking map (default 8192)
       --bpf-lb-acceleration string                           BPF load balancing acceleration via XDP ("native", "disabled") (default "disabled")
       --bpf-lb-algorithm string                              BPF load balancing algorithm ("random", "maglev") (default "random")
+      --bpf-lb-dsr-dispatch string                           BPF load balancing DSR dispatch method ("opt", "ipip") (default "opt")

Given we need to handle this in Cilium in the mid-term as well, maybe it's not necessarily needed till next release.

borkmann

comment created time in 20 hours

pull request comment cilium/cilium

v1.7 backports 2020-11-23

test-backport-1.7

christarazi

comment created time in 21 hours

issue opened cilium/hubble

Hubble 0.7.3 is missing http and dns metrics

Description

I have deployed Cilium 1.9.0 with Hubble 0.7.3. To verify network connectivity and generate some network traffic, I am using a simple example service.

Querying the Hubble metrics endpoint returns the following content:

# HELP hubble_drop_total Number of drops
# TYPE hubble_drop_total counter
hubble_drop_total{protocol="ICMPv4",reason="Stale or unroutable IP"} 1
hubble_drop_total{protocol="ICMPv4",reason="Unsupported protocol for NAT masquerade"} 2
hubble_drop_total{protocol="TCP",reason="Stale or unroutable IP"} 7
# HELP hubble_flows_processed_total Total number of flows processed
# TYPE hubble_flows_processed_total counter
hubble_flows_processed_total{protocol="ICMPv4",subtype="",type="Drop",verdict="DROPPED"} 3
hubble_flows_processed_total{protocol="ICMPv4",subtype="to-endpoint",type="Trace",verdict="FORWARDED"} 39
hubble_flows_processed_total{protocol="ICMPv4",subtype="to-overlay",type="Trace",verdict="FORWARDED"} 43
hubble_flows_processed_total{protocol="ICMPv4",subtype="to-stack",type="Trace",verdict="FORWARDED"} 19
hubble_flows_processed_total{protocol="TCP",subtype="",type="Drop",verdict="DROPPED"} 7
hubble_flows_processed_total{protocol="TCP",subtype="to-endpoint",type="Trace",verdict="FORWARDED"} 17144
hubble_flows_processed_total{protocol="TCP",subtype="to-overlay",type="Trace",verdict="FORWARDED"} 12558
hubble_flows_processed_total{protocol="TCP",subtype="to-stack",type="Trace",verdict="FORWARDED"} 6985
hubble_flows_processed_total{protocol="UDP",subtype="to-endpoint",type="Trace",verdict="FORWARDED"} 10901
hubble_flows_processed_total{protocol="UDP",subtype="to-overlay",type="Trace",verdict="FORWARDED"} 6471
# HELP hubble_icmp_total Number of ICMP messages
# TYPE hubble_icmp_total counter
hubble_icmp_total{family="IPv4",type="EchoReply"} 39
hubble_icmp_total{family="IPv4",type="EchoRequest"} 61
hubble_icmp_total{family="IPv4",type="TimeExceeded(TTLExceeded)"} 4
# HELP hubble_port_distribution_total Numbers of packets distributed by destination port
# TYPE hubble_port_distribution_total counter
hubble_port_distribution_total{port="0",protocol="ICMPv4"} 39
hubble_port_distribution_total{port="2379",protocol="TCP"} 178
hubble_port_distribution_total{port="2380",protocol="TCP"} 3970
hubble_port_distribution_total{port="4240",protocol="TCP"} 180
hubble_port_distribution_total{port="4244",protocol="TCP"} 1264
hubble_port_distribution_total{port="4245",protocol="TCP"} 919
hubble_port_distribution_total{port="53",protocol="UDP"} 6398
hubble_port_distribution_total{port="6443",protocol="TCP"} 430
hubble_port_distribution_total{port="80",protocol="TCP"} 6109
hubble_port_distribution_total{port="8080",protocol="TCP"} 519
hubble_port_distribution_total{port="8181",protocol="TCP"} 493
# HELP hubble_tcp_flags_total TCP flag occurrences
# TYPE hubble_tcp_flags_total counter
hubble_tcp_flags_total{family="IPv4",flag="FIN"} 5707
hubble_tcp_flags_total{family="IPv4",flag="RST"} 2634
hubble_tcp_flags_total{family="IPv4",flag="SYN"} 3415
hubble_tcp_flags_total{family="IPv4",flag="SYN-ACK"} 3379

According to the documentation, http and dns metrics should also be returned.

Am I missing something in the configuration?

Expected behavior

The following metrics should be returned:

  • dns
    • dns_queries_total
    • dns_responses_total
    • dns_response_types_total
  • http
    • http_requests_total
    • http_responses_total
    • http_request_duration_seconds

Additional context

  • Self-hosted Kubernetes v1.18.2 (Centos 8)
  • Cilium 1.9.0
  • Hubble 0.7.3

To reproduce

helm install cilium cilium/cilium --version 1.9.0 \
   --namespace kube-system \
   --set etcd.enabled=true \
   --set etcd.managed=true \
   --set identityAllocationMode=kvstore \
   --set hubble.enabled=true \
   --set hubble.listenAddress=":4244" \
   --set hubble.metrics.enabled="{dns,drop,tcp,flow,port-distribution,icmp,http}" \
   --set hubble.relay.enabled=true \
   --set hubble.ui.enabled=true \
   --set prometheus.enabled=true \
   --set operator.prometheus.enabled=true

kubectl apply -f https://raw.githubusercontent.com/cilium/cilium/v1.9/examples/kubernetes/clustermesh/global-service-example/cluster1.yaml

kubectl -n kube-system port-forward svc/hubble-metrics 9091

created time in 21 hours

pull request comment kubernetes/autoscaler

Update vendor dependencies

[APPROVALNOTIFIER] This PR is NOT APPROVED

This pull-request has been approved by: BigDarkClown (author self-approved). To complete the pull request process, please assign feiskyer after the PR has been reviewed. You can assign the PR to them by writing /assign @feiskyer in a comment when ready.

The full list of commands accepted by this bot can be found here.

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment. Approvers can cancel approval by writing /approve cancel in a comment.

BigDarkClown

comment created time in a day

PR opened kubernetes/autoscaler

Update vendor dependencies
+190047 -69956

0 comments

1831 changed files

pr created time in a day

pull request comment cilium/cilium

make Cilium's CNI conf the only one available

test-me-please

aanm

comment created time in a day
