Bug Report

I restarted all of my Concourse nodes to apply v5.0.1, but did it badly and ended up just SIGKILL'ing everything (I'm sorry!). Now that everything has rebooted, almost every resource produces this error in Concourse:

Mar 25 17:53:54 concourse[12561]: {"timestamp":"2019-03-25T21:53:54.670443499Z","level":"error","source":"atc","message":"atc.pipelines.radar.failed-to-run-scan-resource","data":{"error":"Backend error: Exit status: 500, message: {\"Type\":\"\",\"Message\":\"exit status 2\",\"Handle\":\"\",\"ProcessID\":\"\",\"Binary\":\"\"}\n","pipeline":"api-server","session":"18.5","team":"isoscribe"}}
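
For anyone triaging a lot of these entries, here is a minimal Python sketch that pulls the relevant fields out of a log line like the one above. It assumes the journal prefix ends at the first `{` and that the rest of the line is a single JSON object. The function name `parse_atc_log_line` is made up for illustration and is not part of Concourse.

```python
import json


def parse_atc_log_line(line):
    """Strip the syslog-style prefix and decode the JSON payload."""
    # Assumption: everything from the first "{" onward is one JSON object.
    start = line.index("{")
    entry = json.loads(line[start:])
    data = entry.get("data", {})
    return {
        "timestamp": entry.get("timestamp"),
        "level": entry.get("level"),
        "message": entry.get("message"),
        "pipeline": data.get("pipeline"),
        "team": data.get("team"),
        "error": data.get("error"),
    }


# The log line from this report, kept as a raw string so the JSON escapes survive.
sample = r'''Mar 25 17:53:54 concourse[12561]: {"timestamp":"2019-03-25T21:53:54.670443499Z","level":"error","source":"atc","message":"atc.pipelines.radar.failed-to-run-scan-resource","data":{"error":"Backend error: Exit status: 500, message: {\"Type\":\"\",\"Message\":\"exit status 2\",\"Handle\":\"\",\"ProcessID\":\"\",\"Binary\":\"\"}\n","pipeline":"api-server","session":"18.5","team":"isoscribe"}}'''

fields = parse_atc_log_line(sample)
print(fields["pipeline"], fields["team"])
print(fields["error"].strip())
```

Grouping the parsed entries by `pipeline` or `team` would show whether every pipeline on the worker is affected or only some.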

Steps to Reproduce

  1. Create a pipeline
  2. Make it do stuff
  3. SIGKILL everything that has to do with the workers
  4. Start the nodes back up

Expected Results

Not this: resource checks should keep working after the restart, without the error above.

Actual Results


Version Info

  • Concourse version: 5.0.1
  • Deployment type (BOSH/Docker/binary): Binary
  • Infrastructure/IaaS: VMware
  • Browser (if applicable): Chrome
  • Did this used to work? Yes

Answer questions enugentdt

I just wanted to update the reproduction steps, as I'm able to make this happen consistently. Might this be worth reopening, @vito?

Steps to reproduce:

  1. Create a concourse web node and a concourse worker
  2. Create a pipeline, and have it run through at least once (could be just a simple check-put, or could be something existing)
  3. Hard power off the worker VM
  4. Bring the worker VM back online

When the worker VM comes back up, this is the view you'll get:

[Screenshot: Screen Shot 2019-04-04 at 11 27 26 PM]

All the resources show some variation of the following error:

Backend error: Exit status: 500, message: {"Type":"","Message":"exit status 2","Handle":"","ProcessID":"","Binary":""}
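
To compare the variations across resources, the error text can be split into its status code and JSON body. This is only a sketch: the regular expression is inferred from the single message quoted above, and `parse_backend_error` is a hypothetical helper, not something Concourse or Garden provides.

```python
import json
import re

# Assumption: the error always has the shape
# "Backend error: Exit status: <code>, message: <JSON object>".
BACKEND_ERROR = re.compile(
    r"Backend error: Exit status: (\d+), message: (\{.*\})", re.DOTALL
)


def parse_backend_error(text):
    """Return (status, body) for a matching error string, else None."""
    match = BACKEND_ERROR.search(text)
    if match is None:
        return None
    status = int(match.group(1))
    body = json.loads(match.group(2))
    return status, body


error = 'Backend error: Exit status: 500, message: {"Type":"","Message":"exit status 2","Handle":"","ProcessID":"","Binary":""}'
status, body = parse_backend_error(error)
print(status, body["Message"])
```

In this message every field except `Message` is empty, so the body gives no handle or process to trace further.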

But yes, #3079 will (hopefully) fix this. However, the problem has persisted since early v4, so maybe something lower in the stack is upset? Or maybe it's a Garden issue, but I'm not too sure how Garden works with regard to Concourse.

