Missing infrastructure to notify us of panics in prod and other deployments

I notice there are panics in our frontend pod logs, and probably in other services, too.

We don't have anything to notify us of these in prod, nor in other deployments. That is a major oversight on our part. We need something to tell us about these.

I don't know who should tackle this or when the best time would be, so I am backlogging it for now. Any volunteers?

sourcegraph/sourcegraph

bobheadxi commented:

Took a quick look at this. There is defer ... recover, but a deferred recover only catches panics raised in its own goroutine, so it cannot capture panics in spawned goroutines, which kind of renders it useless here.
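To make the limitation concrete, here is a minimal sketch (not code from the Sourcegraph repo; runRecovered and goRecovered are hypothetical names). A recover deferred in main would not see a panic raised in a goroutine that main spawned, so each goroutine needs its own deferred recover:

```go
package main

import (
	"fmt"
	"sync"
)

// runRecovered (hypothetical helper) runs fn in the current goroutine and
// turns a panic into an error. recover only has an effect inside a deferred
// function running in the goroutine that is panicking.
func runRecovered(fn func()) (err error) {
	defer func() {
		if r := recover(); r != nil {
			err = fmt.Errorf("recovered panic: %v", r)
		}
	}()
	fn()
	return nil
}

// goRecovered (hypothetical helper) spawns fn in a new goroutine that has
// its own recover, and hands any recovered panic to report.
func goRecovered(wg *sync.WaitGroup, report func(error), fn func()) {
	wg.Add(1)
	go func() {
		defer wg.Done()
		if err := runRecovered(fn); err != nil {
			report(err)
		}
	}()
}

func main() {
	// A defer/recover placed here would only cover main's own goroutine.
	// A plain `go func() { panic("boom") }()` would still crash the process.
	var wg sync.WaitGroup
	report := func(err error) { fmt.Println("would notify:", err) }
	goRecovered(&wg, report, func() { panic("boom") })
	wg.Wait()
}
```

The catch is that this only helps if every goroutine in every service is started through such a helper, including goroutines started by third-party libraries, which is why recover alone does not solve the notification problem.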

An alternative might be to create some kind of wrapper program that captures each service's output, and to update the entrypoints of the various services to use it: https://sourcegraph.com/search?q=repo:%5Egithub.com/sourcegraph/sourcegraph%24+file:%5Ecmd/.*%3F/Dockerfile+ENTRYPOINT&patternType=literal - I'm not too familiar with processing command output, so this might be nontrivial.

Aside: from the above query I noticed we use tini, and I found this discussion about cleanup steps: https://github.com/krallin/tini/issues/28 - the idea was shut down there, but I imagine a wrapper program would look like some of the scripts people posted in that discussion.
