
File perms are not preserved in `fly execute`

Bug Report

Locally executed builds often fail with:

Backend error: Exit status: 500, message: {"Type":"","Message":"runc exec: exit status 1: exec failed: container_linux.go:345: starting container process caused \"exec: \\\"cbug/dummy\\\": permission denied\"\n","Handle":"","ProcessID":"","Binary":""}

On my machine:

 cbug > ll ci
total 16
drwxr-xr-x  4 pivotal  staff  128 Apr 29 16:15 .
drwxr-xr-x  4 pivotal  staff  128 Apr 29 16:15 ..
-rwxr-xr-x  1 pivotal  staff   21 Apr 29 15:58 dummy
-rw-r--r--  1 pivotal  staff  173 Apr 29 16:15 dummy.yml

Inspection of the bind-mounted volume after a privileged build shows that permissions and ownership were not preserved*, so the run.path executable could not be executed.

On the worker both uploaded files have 0666 perms and are owned by root:

worker/28fe13ea-1333-497c-ac89-b6cb2eb51ab2:/var/vcap/data/worker/work/volumes# ll ./live/f6301c44-0ae1-479d-6eeb-4fb1d544aa1f/volume/ci/
total 8
drwxrwxrwx 1 root root  28 Apr 29 15:16 ./
drwxrwxrwx 1 root root  24 Apr 29 15:16 ../
-rw-rw-rw- 1 root root  21 Apr 29 14:58 dummy
-rw-rw-rw- 1 root root 173 Apr 29 15:15 dummy.yml
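
To make the link between the listing above and the "permission denied" error explicit, here is a minimal Python sketch. It renders the two modes seen in this report the way `ls -l` does and checks whether any execute bit is set. The helper names `describe` and `any_execute_bit` are hypothetical and are not part of Concourse or fly. On Linux, executing a file generally requires at least one execute bit even for root, which would be consistent with the privileged build failing as well.

```python
import stat

def describe(mode: int) -> str:
    """Render the permission bits of a regular file in `ls -l` style."""
    return stat.filemode(stat.S_IFREG | mode)

def any_execute_bit(mode: int) -> bool:
    """Return True if owner, group, or other has the execute bit set."""
    return bool(mode & (stat.S_IXUSR | stat.S_IXGRP | stat.S_IXOTH))

# 0o755 is the mode of `dummy` on the local machine,
# 0o666 is the mode observed on the worker after the failed build.
for label, mode in [("local", 0o755), ("worker (failed run)", 0o666)]:
    print(label, describe(mode), "has execute bit:", any_execute_bit(mode))
```

With 0666 no execute bit survives, so the `run.path` script cannot be started regardless of who owns it.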

The parent directory, ci, is 0777:

worker/28fe13ea-1333-497c-ac89-b6cb2eb51ab2:/var/vcap/data/worker/work/volumes# ll ./live/f6301c44-0ae1-479d-6eeb-4fb1d544aa1f/volume/
total 4
drwxrwxrwx 1 root root  24 Apr 29 15:16 ./
drwxr-xr-x 1 root root  72 Apr 29 15:16 ../
drwxrwxrwx 1 root root  28 Apr 29 15:16 ci/

For a successful run, by contrast, the volume has been laid down with 0755 and the correct owner:

worker/28fe13ea-1333-497c-ac89-b6cb2eb51ab2:/var/vcap/data/worker/work/volumes# ll live/c6477ce5-a3f6-48a0-5dfa-c8bea76123ec/volume/ci
total 8
drwxr-xr-x 1 pivotal staff  28 Apr 29 15:16 ./
drwxr-xr-x 1 pivotal staff  24 Apr 29 15:15 ../
-rwxr-xr-x 1 pivotal staff  21 Apr 29 14:58 dummy*
-rw-r--r-- 1 pivotal staff 173 Apr 29 15:15 dummy.yml

And the parent ci is 0755:

worker/28fe13ea-1333-497c-ac89-b6cb2eb51ab2:/var/vcap/data/worker/work/volumes# ll live/c6477ce5-a3f6-48a0-5dfa-c8bea76123ec/volume/
total 4
drwxr-xr-x 1 pivotal staff  24 Apr 29 15:15 ./
drwxr-xr-x 1 root    root   72 Apr 29 15:16 ../
drwxr-xr-x 1 pivotal staff  28 Apr 29 15:16 ci/
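
The comparison done by eye in the listings above could also be automated. The following is only a sketch, assuming both the local input directory and a copy of the worker volume are reachable as ordinary paths on one machine. The function name `mode_mismatches` is hypothetical. It compares permission bits only, not ownership, since ownership is expected to differ for unprivileged containers.

```python
import stat
from pathlib import Path

def mode_mismatches(src: Path, dst: Path) -> list:
    """List (relative path, src mode, dst mode) for entries whose
    permission bits differ between the two directory trees."""
    mismatches = []
    for path in sorted(src.rglob("*")):
        rel = path.relative_to(src)
        other = dst / rel
        if not other.exists():
            continue  # only compare entries present in both trees
        src_mode = stat.S_IMODE(path.stat().st_mode)
        dst_mode = stat.S_IMODE(other.stat().st_mode)
        if src_mode != dst_mode:
            mismatches.append((str(rel), oct(src_mode), oct(dst_mode)))
    return mismatches
```

For the failing case in this report, a call such as `mode_mismatches(Path("ci"), Path("volume/ci"))` would be expected to flag `dummy` (0755 locally, 0666 on the worker) and `dummy.yml` (0644 locally, 0666 on the worker).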

We see this happen in approximately 1 in 6 `fly execute` runs, on both privileged and unprivileged executions. We have not (yet) seen it happen through a pipeline build.

 

* I am aware that ownership would change in the case of unprivileged containers.

Steps to Reproduce

Repro Gist

Expected Results

The file should have the correct permissions and the execute should succeed.

Actual Results

The above permission denied failure.

Version Info

  • Concourse version: 5.1.0
  • Deployment type (BOSH/Docker/binary): BOSH
  • Infrastructure/IaaS: GCP
  • Did this used to work? Yes
concourse/concourse

Answer questions vito

I had it running all morning and none of them failed. 😕 This is also something our CI would probably have caught by now if it were as simple as the issue describes. All I can suggest at this point is poking around your deployment to see if anything looks suspicious, and/or trying to reproduce in a smaller deployment (e.g. docker-compose).

source:https://uonfu.com/
answerer
Alex Suraci vito @vmware Toronto, ON @concourse co-creator, pm, engineer