How to: Structured Logging

Published:
Translations:no translations yet

There’s a lot of structured logging libraries for Go but there’s no good guideline on how to use them. This article explains how to get the most of your logs using those libraries.

If you’re interested in underlying assumptions for this guideline, check out the previous article What is Structured Logging.

How to populate a context

Every logging library has a notion of context. Context groups log entries about a single task. The task may be an incoming HTTP request, a transaction in the database, or removal of a dozen files from disk.

A context must be created before the first mention of the task in the logs. To create context you must give it an identifier (preferably unique), so all related log records could have it as an attribute. Often, a task identifier is suitable for a context identifier. If not, a context identifier can be generated (it’s best not to abuse it).

You don’t need to add anything else to the context. Just context identifier.

The first mention of a task is usually logged with the Info level. For example, there you can write a log record with parameters of the task.

Contexts are nested together. Processing a single “large” task can fall into several independent or concurrent operations with its contexts. Later you can restore the graph of program execution using logs.

Fields must be in snake_case.

Example of a log with contexts

INFO  handle request             [req_id=1]
INFO  load file                  [req_id=1, path=A]
INFO  load file                  [req_id=1, path=B]
WARN  cannot load file, fallback [req_id=1, path=A, error='permission error']
INFO  load file from FTP         [req_id=1, path=A]
DEBUG file loaded                [req_id=1, path=B]
DEBUG file loaded from FTP       [req_id=1, path=A]
DEBUG merge file content         [req_id=1, left=A, right=B]
DEBUG request handled            [req_id=1]

Context creation

// Not good, too much junk in log records produced by subroutines
l := log.FromContext(ctx).WithFields(log.Fields{
        "task_type": task.Type,
        "task_id":         task.ID.String(),
        "request_id":      task.RequestID,
    })
ctx = log.AttachToContext(ctx, l)
 
foo(ctx)
 
----------------------------------------------------------------------
// Better, subroutines log only task_id
l := log.FromContext(ctx).WithFields(log.Fields{
        "task_id": task.ID.String(),
})
l.WithFields(log.Fields{
        "task_type": task.Type,
        "request_id":      task.RequestID,
}).Info("process task")
 
ctx = log.AttachToContext(ctx, l)
foo(ctx)

How to format messages

A common argument against structured logs is that these logs are for machines, not people. But if you write all messages in the same style in structured logs, people can parse them no worse than machines.

Use present simple or past simple, without passive forms. Punctuation should be avoided (for example, “…”, “.” or “!” at the end of messages).

Entering and leaving the context

// Good
logger.Info("create task") // Infinitive + noun.
...
// Here we connect scan service over a network and create a task there.
// We add the id of the task to the context.
...
logger.Debug("task created") // Noun + past participle.

----------------------------------------------------------------------

// Bad, too long
logger.Info("trying to create task"
// Bad, longer than present simple, same meaning
logger.Info("creating task")
// Bad, scanner should be in the context at this point. No need to mention it once again.
logger.Info("scanner: creating task")
// Bad, almost always finishes in 10 ms, ... doesn't mean anything useful.
logger.Info("create task...")
...
...
// Bad, too long
logger.Debug("task has been created")

Messages with dynamic parts

Messages should not be formatted dynamically. Don’t use Infof() and other *f(). Don’t concatenate messages via +.

// Good. If there's only one field, use WithField().
logger.WithFields(log.Fields{
   "counter": counter,
   "duration_ms": totalDuration.Milliseconds(),
}).Debug("transactions handled")

----------------------------------------------------------------------

// Bad
logger.Debugf("handled %d transactions in %d ms", counter, totalDuration.Milliseconds())

How to choose a log level

Typical levels are Error, Warn, Info and Debug. Use Error and Warn for errors and Info and Debug for everything else.

Errors

Errors are logged to Error and Warn.

If the error cannot be suppressed anywhere before, it should be logged only at the highest level of context.

Sometimes errors can have useful details that are impossible to recover higher on the stack. Then you need to log the details on Warn, the error is returned higher up the stack.

err := errors.New("source import task failed")
logger.WithFields(log.Fields{
    "state":  task.State,
    "result": task.EncodedResult,
}).Warn(err)
return err

Sometimes errors do not form Go errors, they’re suppressed in-place. Use Warn for them.

// There's no recovery, we just warn about potential problem.
// The result could be safely used.
if len(scanServiceTask.Result.Errors) > 0 {
        log.FromContext(ctx).WithFields(log.Fields{
            "scan_id":            scanServiceTask.ScanID,
            "scanner_error": scanServiceTask.Result.Errors.ToSingleError().Error(),
        }).Warn("scan result contains errors")
}

If the error cannot be returned (in defer), it should be logged on Warn.

defer func() {
    if err := publish(); err != nil {
        logger.WithError(err).Warn("cannot publish")
    }
}()

Info, Debug or not logging at all?

Do not log any information that can be obtained from the database or raw data stored somewhere.

By default, log records should be written on Debug. Exceptions (need Info):

Minimum for a service

Unix has Rule of silence, but it’s not about logs. A service should use log.Info("ready to process tasks") on startup. For a task a service must write at least something on the Info level. Otherwise, it is not clear whether this process is processing tasks or someone else.

At a minimum, you need to log the basic parameters of incoming tasks (at least id or request id). It is also necessary to write that the task has been processed, preferably with a status.