Guideline: Error Handling

“Opposing beams particle accelerator” graffiti in Novosibirsk.

Consider this classic question about Go and API compatibility (it was asked in a Twitter poll that is no longer available):

func Process() error { return ErrFoo } // v1
func Process() error { return ErrBar } // v2

// Did I break API compatibility? Do I need to bump the major version?

The responses split 50/50, but I don’t think they should have, once you are aware of one simple thing:

Errors are part of your API contract.

The API contract is the most important part of software design, and every serious developer should be scrupulous about it. This is because an API is, by definition, a layered stack of agreements between pieces of software. For example, see Linus Torvalds’ reaction to a broken Linux kernel contract.

The case in the tweet is obvious. ErrFoo is a public error, and callers are probably checking for it. You can’t rename it or swap it for ErrBar, because that would break the contract.
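
To see why, here is a minimal Python sketch of the same situation (Python is used only to match the other examples in this article). The names ErrFoo, ErrBar, process_v1, process_v2 and caller are hypothetical stand-ins for the Go snippet above, not a real library.

```python
class ErrFoo(Exception):
    """Error documented by v1 of the hypothetical library."""


class ErrBar(Exception):
    """Error that v2 raises instead."""


def process_v1():
    raise ErrFoo("v1 failure")


def process_v2():
    raise ErrBar("v2 failure")


def caller(process):
    """Caller written against v1: it knows how to recover from ErrFoo only."""
    try:
        process()
    except ErrFoo:
        return "recovered"
    return "ok"
```

With process_v1 the caller recovers. With process_v2 the same caller no longer catches anything, and ErrBar escapes from code that used to work, even though the function signature did not change.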

The question is how to make sure you don’t break the contract. Software changes, and it’s sometimes hard to predict how.

Tactic 0. Minimize the possibility of an error

The best thing you can do to handle an error is to remove the possibility of error.

Compare this:

import http.client

conn = http.client.HTTPSConnection("www.python.org")
conn.request("GET", "/")
res = conn.getresponse()

print(res.status, res.reason)

And this:

import requests

res = requests.get("https://www.python.org/")

print(res.status_code, res.reason)  # requests exposes the code as status_code

They do the same thing, but it’s much easier to make a mistake in the first example. A programmer can type conn.request("CET", "/") and then wonder why the response is so strange.

This is also why some modern languages have literals for durations and other units: a bare number can’t be passed where a quantity with a unit is expected.

change_speed(Speed s);

change_speed(2.3); // doesn't compile, no unit
change_speed(23m / 10s); // meters per second

“A Philosophy of Software Design” by John Ousterhout calls this tactic “Define errors out of existence”. “Effective C++” by Scott Meyers calls this “Make interfaces easy to use correctly and hard to use incorrectly”.
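
Python has no unit literals, but a similar effect can be sketched with small wrapper types. Meters, Speed, speed() and this version of change_speed() are hypothetical names made up for illustration. The isinstance check is only a runtime stand-in for what a compiler, or a static type checker, would reject before the program runs.

```python
from dataclasses import dataclass
from datetime import timedelta


@dataclass(frozen=True)
class Meters:
    value: float


@dataclass(frozen=True)
class Speed:
    meters_per_second: float


def speed(distance: Meters, duration: timedelta) -> Speed:
    """Build a Speed from a distance and a duration, so the unit is explicit."""
    return Speed(distance.value / duration.total_seconds())


def change_speed(s: Speed) -> float:
    """Accept only Speed values; a bare number carries no unit and is rejected."""
    if not isinstance(s, Speed):
        raise TypeError("change_speed() expects a Speed, got " + type(s).__name__)
    return s.meters_per_second
```

Calling change_speed(2.3) fails immediately with a TypeError, while change_speed(speed(Meters(23), timedelta(seconds=10))) is accepted. The caller has to say which units they mean before the call can succeed.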

Tactic 1. Repair what you can — but when you must fail, fail noisily and as soon as possible

This one comes from the basics of the Unix philosophy. Let’s break it into parts:

Repair what you can

Rely on the contract of the API you use. For example, if you call rename() and get ENOENT, it may be totally fine, and you can silently handle it.

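import errno
import shutil

try:
    shutil.move(old_path, new_path)
except IOError as e:  # IOError is an alias of OSError in Python 3
    if e.errno != errno.ENOENT:
        raise  # re-raise anything else with its original traceback
    # Silently skip ENOENT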

You can make IOError a part of your API contract if you think that callers can do something meaningful to handle it.

If callers cannot do anything meaningful with the error you return, don’t make it a part of the contract.

Fail noisily

Failing noisily doesn’t mean that you should log an error and then throw it again. That is a bad decision, and it can be improved.

try:
    f()
except Exception as e:
    logging.error(str(e))
    raise MyException(e)  # propagate the exception

This looks bad in logs and doesn’t help to debug the issue later.

It muddles the stack trace, because the language machinery reports that an exception was thrown during exception handling. In Python, raise ... from must be used to chain exceptions explicitly.

Use your language tools to add context to errors, if they’re expected. If they’re unexpected, don’t handle errors at all. Let the caller handle them.
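
As a minimal sketch of that advice in Python: the expected failure (malformed JSON) gets context and is chained with raise ... from, while anything unexpected is not caught at all. ConfigError and parse_config are hypothetical names, and the sketch assumes ConfigError is a documented part of the module’s contract.

```python
import json


class ConfigError(Exception):
    """Hypothetical error that is part of this module's documented contract."""


def parse_config(text):
    try:
        return json.loads(text)
    except json.JSONDecodeError as e:
        # Expected failure: add context and chain the original cause.
        raise ConfigError(f"invalid config: {e.msg} at line {e.lineno}") from e
    # Anything else (for example a TypeError for non-string input) is
    # unexpected here, so it is left for the caller to handle.
```

There is no logging call here. The chained exception keeps the original JSONDecodeError available as __cause__, so whoever finally handles or logs the error sees the full story once.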

Tactic 2. Wrap errors properly

If you’re wrapping an error like this:

try:
    f()
except SomeException as e:
    raise MyException(e) from e

and MyException is not a part of your contract, you’re wasting your caller’s time. You’re wasting your own time too, because there’s no need to support MyException.

It rarely happens in languages with exceptions, but it’s a big problem in Go. In fact, it’s so big that the language designers had to introduce crutches to deal with it. See a separate article for more details.

Don’t wrap errors just because you can.
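
A small sketch of the cost, under the assumption that the caller only knows the standard library’s contract. MyException, load_wrapped, load_plain and read_or_default are hypothetical names used for illustration.

```python
class MyException(Exception):
    """Hypothetical wrapper that no caller was ever told about."""


def load_wrapped(path):
    try:
        with open(path) as f:
            return f.read()
    except OSError as e:
        # Wrapping hides the precise type behind an undocumented one.
        raise MyException(str(e)) from e


def load_plain(path):
    # No wrapping: FileNotFoundError and other OSErrors reach the caller unchanged.
    with open(path) as f:
        return f.read()


def read_or_default(loader, path, default=""):
    """Caller that handles a missing file the way open() documents it."""
    try:
        return loader(path)
    except FileNotFoundError:
        return default
```

For a missing file, read_or_default(load_plain, path) returns the default, while read_or_default(load_wrapped, path) lets MyException escape. To recover, the caller would have to learn about MyException and dig into its __cause__.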