This is a short but hopefully insightful article on error handling methods that are better than a panic in your Go application.
I am not saying panics are bad, but they should be your last resort.
So the question from those who don’t write Go:
What is a panic?
A panic is similar to an exception: a call that disrupts the normal flow of an application and, unless it is recovered, brings it to a halt.
Similar to the C style of programming, Go considers errors as first-class values, so panics are more like runtime errors, raised either by the runtime itself or voluntarily by the programmer.
A panic might also be thought of as a throw in other languages.
What this entails is that errors in Go do not propagate up the call stack on their own (an error is just a value assigned to a variable), so error handling is very explicit.
From Rob Pike:
Errors are values
Take some time to give his article of that name a read (The Go Blog, 12 January 2015), then head back.
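Hence, errors are handled explicitly at the call site: a function returns an error value next to its result, and the caller checks it with if err != nil before moving on.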
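A minimal sketch of that pattern follows. The parsePort helper and the input values are hypothetical and exist only for illustration.

```go
package main

import (
	"fmt"
	"strconv"
)

// parsePort is a hypothetical helper. It returns the parsed value and an
// error side by side instead of throwing anything.
func parsePort(s string) (int, error) {
	port, err := strconv.Atoi(s)
	if err != nil {
		// Hand the error back to the caller as an ordinary value.
		return 0, fmt.Errorf("parse port %q: %w", s, err)
	}
	return port, nil
}

func main() {
	port, err := parsePort("80a")
	if err != nil {
		// The caller decides what to do; nothing unwinds the stack.
		fmt.Println("could not start:", err)
		return
	}
	fmt.Println("listening on port", port)
}
```

The error is just another return value, so nothing happens to it unless the caller looks at it.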
If we needed some Java-style errors, we would throw a panic and cause mayhem.
Even though this might seem good, it is very hard to handle and not in the best spirit for a Gopher.
For those who write Go, you might be unaware of the recover() function, which can be likened to a catch for a panic, but the fact that it is unknown to some speaks to the ideology of Go being not panic friendly.
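For the curious, here is a small sketch of recover() in action. It only has an effect when called inside a deferred function. The safeDivide function is a hypothetical example, not something from a library.

```go
package main

import "fmt"

// safeDivide is a hypothetical function that turns a runtime panic
// (integer division by zero) into an ordinary error value.
func safeDivide(a, b int) (result int, err error) {
	defer func() {
		// recover returns nil when no panic is in flight.
		if r := recover(); r != nil {
			err = fmt.Errorf("recovered from panic: %v", r)
		}
	}()
	return a / b, nil
}

func main() {
	if _, err := safeDivide(1, 0); err != nil {
		fmt.Println(err)
	}
}
```

Note how much ceremony this needs compared with a plain error return, which is part of why it is rarely the first tool a Gopher reaches for.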
Panics in Go can be thought of as a crying baby: the first cry is welcome and tells you the “program” is functioning correctly (pardon my language, but it is what it is).
At the start it’s nice, but later on it becomes a nuisance that makes you want to give it away; now imagine if there was more than one!
A panic should always be a last resort, and even then consider a better option!
The same goes for code: multiple waves of panics lead to attention-seeking errors that really don’t deserve your attention.
The more panics there are, the less visible the important errors become (not talking about making more babies here, but the idea works).
So how does Go say we handle errors?
Go is a very flexible language and there are several methods, but they all lean toward handling errors by behavior rather than by type.
You are required to handle errors FREQUENTLY and in that manner, so it’s more common to see a lot of if err != nil in any Go codebase than a type check such as if _, ok := err.(AuthException); ok.
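One way to handle an error by behavior, sketched under a few assumptions: the caller asks whether the error can report itself as temporary, instead of asking for its concrete type. The temporary interface, timeoutError, and fetch are all hypothetical names made up for this example.

```go
package main

import (
	"errors"
	"fmt"
)

// temporary describes a behavior that an error may optionally implement.
type temporary interface {
	Temporary() bool
}

// timeoutError is a hypothetical error type used only in this example.
type timeoutError struct{ op string }

func (e timeoutError) Error() string   { return e.op + ": timed out" }
func (e timeoutError) Temporary() bool { return true }

// isTemporary reports whether any error in err's chain says it is temporary.
func isTemporary(err error) bool {
	var t temporary
	return errors.As(err, &t) && t.Temporary()
}

// fetch is a hypothetical operation that fails with a wrapped timeout.
func fetch() error {
	return fmt.Errorf("fetch profile: %w", timeoutError{op: "dial"})
}

func main() {
	if err := fetch(); err != nil {
		if isTemporary(err) {
			fmt.Println("will retry:", err)
			return
		}
		fmt.Println("giving up:", err)
	}
}
```

The caller never names timeoutError, so the concrete type can change without breaking the handling code.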
A few well-supported strategies are:
- Logging errors with context (cause and message)
- Exporting errors as metrics
- Exposing errors as events
Logging errors with context
Logs are the easiest way to inspect the running state of an application, whether for debugging or general inspection.
Since errors in Go are first-class values, there are several ways to do this, but the important thing is CONTEXT.
There is a lot written on this, but a major article from Dave Cheney, “Don't just check errors, handle them gracefully”, sets the standard.
Do not print the error verbatim like fmt.Println(err); rather, do some initial work to understand the error’s cause and its type so that you can describe it, like fmt.Printf("error occurred doing this: %s\n", err), thereby adding valuable context to the error.
Another pattern employed by Cheney is to use errors.Wrap (from his github.com/pkg/errors package) and declare a cause->message relationship for your errors.
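A minimal sketch of the cause->message idea. To keep it dependency-free it uses the standard library's fmt.Errorf with the %w verb rather than errors.Wrap from github.com/pkg/errors; the intent is the same. The names findUser, greet, and errNotFound are hypothetical.

```go
package main

import (
	"errors"
	"fmt"
)

// errNotFound is a hypothetical sentinel error that plays the role of the cause.
var errNotFound = errors.New("record not found")

// findUser is a hypothetical lookup that only knows about user 1.
func findUser(id int) (string, error) {
	if id != 1 {
		return "", errNotFound
	}
	return "ada", nil
}

// greet adds a message describing what was being attempted and keeps the
// original cause reachable through %w.
func greet(id int) (string, error) {
	name, err := findUser(id)
	if err != nil {
		return "", fmt.Errorf("greet user %d: %w", id, err)
	}
	return "hello, " + name, nil
}

// causedByNotFound reports whether errNotFound is anywhere in err's chain.
func causedByNotFound(err error) bool {
	return errors.Is(err, errNotFound)
}

func main() {
	if _, err := greet(7); err != nil {
		fmt.Println("error:", err)
		fmt.Println("caused by missing record:", causedByNotFound(err))
	}
}
```

The log line now says what the program was doing when it failed, and the cause is still available to any code that needs to act on it.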
Exporting errors as metrics
This is one strategy I do employ. It not only supplements the “errors with context” approach, it also makes it easy to build follow-up actions on the exposed metrics (alerts, visualizations, stack trace analysis, etc.), all outside the application.
Take, for example, counting the number of failed image downloads.
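A rough sketch of such a counter, keyed by a reason label. So that it runs without dependencies, it uses the standard library's expvar package as a stand-in; with the Prometheus Go client the same shape would be a counter vector with a reason label. The names downloadImage, pull, reasonFor, the metric name, and the UNKNOWN value are hypothetical.

```go
package main

import (
	"errors"
	"expvar"
	"fmt"
)

// failedDownloads counts failed image downloads, keyed by a reason label.
var failedDownloads = expvar.NewMap("image_download_failures_total")

// errAuthNeeded is a hypothetical error returned for private images.
var errAuthNeeded = errors.New("registry requires authentication")

// downloadImage is a hypothetical stand-in for the real download.
func downloadImage(ref string) error {
	if ref == "private/app:1.0" {
		return errAuthNeeded
	}
	return nil
}

// reasonFor maps an error to a short, stable label value.
func reasonFor(err error) string {
	if errors.Is(err, errAuthNeeded) {
		return "IMAGE_AUTH_NEEDED"
	}
	return "UNKNOWN"
}

// pull downloads an image and records any failure under its reason.
func pull(ref string) error {
	if err := downloadImage(ref); err != nil {
		failedDownloads.Add(reasonFor(err), 1)
		return fmt.Errorf("pull %s: %w", ref, err)
	}
	return nil
}

// failures returns the current count recorded for a reason.
func failures(reason string) int64 {
	if v, ok := failedDownloads.Get(reason).(*expvar.Int); ok {
		return v.Value()
	}
	return 0
}

func main() {
	for _, ref := range []string{"public/app:1.0", "private/app:1.0", "private/app:1.0"} {
		if err := pull(ref); err != nil {
			fmt.Println("error:", err)
		}
	}
	fmt.Println("IMAGE_AUTH_NEEDED failures:", failures("IMAGE_AUTH_NEEDED"))
}
```

The error is still returned to the caller; the counter is only a side channel for whatever sits outside the application.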
From there, I can observe errors as exposed streams and infer the general scope of an error.
This is similar to tagging errors as implemented by errors-go, which summarises a lot of concepts around aggregating errors using types. In this case, I use Prometheus labels along with an
The label could have values like IMAGE_AUTH_NEEDED to express the code -> error relationship and the severity of its occurrence.
Exposing errors as events
One platform I have notably learned this from is Kubernetes. Errors in Kubernetes are typically exposed as aggregated events, where repeated instances are counted and timed, and even managed as distinct states.
Kubernetes is written in Go, and that has some influence on how it manages errors across a distributed control plane; it’s more error reporting than error handling, but the same concepts apply.
Kubernetes does this using both of the approaches discussed earlier: it tags and aggregates error types using labels (a REASON here, or a CAUSE in Dave Cheney’s terms) alongside a descriptive MESSAGE about the state of that error, together with the stack traces and logs needed to resolve the problem.
As opposed to panicking on severe errors, we take a more stateful approach to handling the error and stall further operations whilst reporting the current error.
Similar to the quicksand ideology, this gives you enough time to poke around and find what’s wrong, or to restart the application, as opposed to bringing down the whole cluster due to an index out of range error.
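The reason/message/count idea can be sketched in a few lines. This is not the Kubernetes API; Event, Recorder, the FailedImagePull reason, and errPullDenied are hypothetical names, and the recorder is deliberately not safe for concurrent use.

```go
package main

import (
	"errors"
	"fmt"
	"time"
)

// Event is a hypothetical record loosely modelled on the
// reason/message/count idea. It is not the Kubernetes type.
type Event struct {
	Reason    string
	Message   string
	Count     int
	FirstSeen time.Time
	LastSeen  time.Time
}

// Recorder aggregates repeated errors with the same reason into one event.
// It is not safe for concurrent use.
type Recorder struct {
	events map[string]*Event
}

// NewRecorder returns an empty recorder.
func NewRecorder() *Recorder {
	return &Recorder{events: make(map[string]*Event)}
}

// Record reports err under reason instead of panicking, and returns the
// aggregated event for that reason.
func (r *Recorder) Record(reason string, err error) *Event {
	now := time.Now()
	ev, ok := r.events[reason]
	if !ok {
		ev = &Event{Reason: reason, FirstSeen: now}
		r.events[reason] = ev
	}
	ev.Message = err.Error()
	ev.Count++
	ev.LastSeen = now
	return ev
}

// errPullDenied is a hypothetical error used to drive the example.
var errPullDenied = errors.New("pull access denied for image")

func main() {
	rec := NewRecorder()
	var ev *Event
	for i := 0; i < 3; i++ {
		ev = rec.Record("FailedImagePull", errPullDenied)
	}
	fmt.Printf("%s (x%d): %s\n", ev.Reason, ev.Count, ev.Message)
}
```

Three failures become one event with a count of three, so the signal stays readable while the application keeps running and reporting.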
So after all the long talk, when is it okay to panic?
Panics are somewhat okay when the error state needs attention and there’s no going forward from there.
An example would be starting an application with a missing environment variable or an invalid configuration (although configuration could also be hot reloaded).
No amount of error handling fits a case like this; panic as needed and let the user know their attention is needed. A failed write to a store could also be worth a panic, as an application that cannot write will end up in a fatally inconsistent state.
A lot of the time, panics are needed only when you have hit a fatal end and need to stop to save yourself, rather than shooting yourself in the foot for some fancy stack trace.
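A small sketch of the missing environment variable case. The mustGet helper and the DATABASE_URL variable name are hypothetical; the lookup function is passed in so the helper can be exercised without touching the real environment.

```go
package main

import (
	"fmt"
	"os"
)

// mustGet returns the value for key, or panics when it is missing or empty.
// It is meant for startup only, where there is no sensible way to carry on.
func mustGet(lookup func(string) (string, bool), key string) string {
	v, ok := lookup(key)
	if !ok || v == "" {
		panic(fmt.Sprintf("missing required environment variable %s", key))
	}
	return v
}

func main() {
	// DATABASE_URL is a hypothetical name, set here only so the demo runs.
	if err := os.Setenv("DATABASE_URL", "postgres://localhost/app"); err != nil {
		fmt.Println("could not set demo variable:", err)
		return
	}
	dbURL := mustGet(os.LookupEnv, "DATABASE_URL")
	fmt.Println("starting with database", dbURL)
}
```

The panic message names the variable, so whoever starts the application knows exactly what needs their attention.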
To end this, I say:
With great power comes great responsibility, but even Spider-Man knew better than to panic unless needed.
Further reading
- Types of Kubernetes Events: Kubernetes events are a resource type in Kubernetes that are automatically created when other resources have state…
- Reporting Errors from Control Plane to Applications Using Kubernetes Events: At Box, we manage several large scale Kubernetes clusters that serve as an internal platform as a service (PaaS) for…
- Don't just check errors, handle them gracefully: This post is an extract from my presentation at the recent GoCon spring conference in Tokyo, Japan. I've spent a lot of…
- Error handling in Go: Go does not provide conventional try/catch method to handle the errors, instead, errors are returned as a normal return…