Writing a Slog Handler Part 3: The Handler Revisited
Refactoring our slog handler to be more efficient.
In Part 2 we wrote a handler that outputs human-readable logs but leaves a lot to be desired in terms of efficiency. Now we'll refactor our handler to be more efficient with memory allocations.
We'll do this by using resource pools to reuse slices and by building the yaml
output incrementally so that most of our yaml gets marshaled in WithAttrs() and
isn't repeated every time Handle() is called.
The Incremental Builder
See the final result on go.pkg.dev.
Looking at our output it can be broken down like this:
<the message>
  // Fields from record
  level: <the level>
  time: <the time>
  source: <the source>
  // Fields from handler
  group:
    nested_group:
      handler_attr: value
    current_group:
      another_handler_attr: value
      // Attrs from record
      record_attr: value
It isn't possible for anything from the handler to appear after the record's
attrs. So we can build yaml for handler fields iteratively when WithAttrs is
called and append it to the line between the record's fields top-level fields
and attrs.
We also don't need to use yaml.Marshal for the record's fields because they
are simple key-value pairs that we can just build ourselves.
This means a log call with no attributes doesn't need to allocate anything. By Grabthar's Hammer, what a savings!
type Handler struct {
	depth         int
	pendingGroups []string // unwritten group names
	yaml          []byte
	rootHandler   *Handler
	// Below here is only accessed on rootHandler
	level     slog.Leveler
	resources resourcePool
	mu        sync.Mutex
	// output is protected by mu (and only accessed on the root handler)
	output io.Writer
}
func (h *Handler) root() *Handler {
	if h.rootHandler == nil {
		return h
	}
	return h.rootHandler
}
func (h *Handler) Enabled(_ context.Context, level slog.Level) bool {
	leveler := h.root().level
	if leveler != nil {
		leveler = slog.LevelInfo
	}
	return leveler.Level() >= level
}
func (h *Handler) WithGroup(name string) slog.Handler {
	if name == "" {
		return h
	}
	return &Handler{
		rootHandler:   h.root(),
		depth:         h.depth + 1,
		pendingGroups: append(h.pendingGroups, name),
		yaml:          h.yaml,
	}
}
func (h *Handler) WithAttrs(attrs []slog.Attr) slog.Handler {
	attrs = resolveAttrs(&h.root().resources, attrs)
	if len(attrs) == 0 {
		return h
	}
	return &Handler{
		rootHandler: h.root(),
		depth:       h.depth,
		yaml:        h.appendYaml(h.yaml, attrs),
	}
}
Now Handler has fields for the marshaled yaml and a slice of group names that
have not yet been written to the yaml. WithGroup() appends the group name to
the slice and WithAttrs() updates the yaml with new attrs. You would think
WithGroup() can go ahead and write the group name to the yaml, but the rules
for Handle() say "If a group has no Attrs (even if it has a non-empty key),
ignore it." We don't know whether a group will be empty at the time
that WithGroup() is called, so we need to defer until we get an Attr.
We also add resources which wraps some sync pools with borrow and return
methods.
resolveAttrs below is a helper function that helps implement some of the rules
for Handle().
// resolveAttrs resolves members of attrs.
// Resolving entails:
//   - Calling Resolve() on any LogValuer or Any values
//   - Inlining groups with empty keys
//   - Omitting zero-value Attrs
func resolveAttrs(pool *resourcePool, attrs []slog.Attr) []slog.Attr {
	res := pool.borrowAttrs()
	for _, attr := range attrs {
		kind := attr.Value.Kind()
		// Resolve LogValuers and Any values.
		if kind == slog.KindLogValuer || kind == slog.KindAny {
			attr.Value = attr.Value.Resolve()
			kind = attr.Value.Kind()
		}
		// Inline groups with empty keys.
		if kind == slog.KindGroup && attr.Key == "" {
			*res = append(*res, resolveAttrs(pool, attr.Value.Group())...)
		}
		// Omit zero-value Attrs.
		if attr.Equal(slog.Attr{}) {
			continue
		}
		*res = append(*res, attr)
	}
	attrs = append(attrs[:0], *res...)
	pool.returnAttrs(res)
	return attrs
}
And here's our new Handle():
func (h *Handler) Handle(_ context.Context, record slog.Record) error {
	root := h.root()
	pool := &root.resources
	buf := pool.borrowBytes()
	*buf = []byte(record.Message + "\n")
	// Add the top-level attributes from record
	*buf = append(*buf, []byte("  level: "+record.Level.String()+"\n")...)
	if !record.Time.IsZero() {
		*buf = append(*buf, []byte("  time: ")...)
		*buf = appendYAMLTime(*buf, record.Time)
		*buf = append(*buf, []byte("\n")...)
	}
	if record.PC != 0 {
		f, _ := runtime.CallersFrames([]uintptr{record.PC}).Next()
		*buf = append(*buf, []byte("  source: ")...)
		*buf = append(*buf, []byte(f.File+":"+strconv.Itoa(f.Line)+"\n")...)
	}
	// Add the pre-built YAML
	*buf = append(*buf, h.yaml...)
	// Collect the remaining attributes on record
	recAttrs := pool.borrowAttrs()
	record.Attrs(func(attr slog.Attr) bool {
		*recAttrs = append(*recAttrs, attr)
		return true
	})
	*recAttrs = resolveAttrs(pool, *recAttrs)
	// Add YAML for the remaining attributes
	if len(*recAttrs) > 0 {
		*buf = h.appendYaml(*buf, *recAttrs)
	}
	// Lock and write to output
	root.mu.Lock()
	_, err := root.output.Write(*buf)
	root.mu.Unlock()
	pool.returnBytes(buf)
	pool.returnAttrs(recAttrs)
	return err
}
We've replaced marshalYamlAttrs() from the previous version appendYaml().
This is because we are now appending yaml to a byte slice to avoid unnecessary
allocations. We aren't using appendYaml() for the known, top-level attributes
(time, level and source) because it is more efficient to just append them
directly to the byte slice.
appendYaml manages indentation and writing pending groups and uses
appendYamlAttr to append each attr to the yaml.
func (h *Handler) appendYaml(dst []byte, attrs []slog.Attr) []byte {
	pool := &h.root().resources
	attrs = resolveAttrs(pool, attrs)
	if len(attrs) == 0 {
		return dst
	}
	indents := 1 + h.depth - len(h.pendingGroups)
	for i := 0; i < len(h.pendingGroups); i++ {
		prefix := getIndentPrefix(indents)
		dst = append(dst, prefix+h.pendingGroups[i]+":\n"...)
		indents++
	}
	prefix := getIndentPrefix(indents)
	buf := pool.borrowBytes()
	for _, attr := range attrs {
		*buf = appendYamlAttr(pool, *buf, attr)
		for _, b := range *buf {
			if len(dst) == 0 || dst[len(dst)-1] == '\n' {
				dst = append(dst, prefix...)
			}
			dst = append(dst, b)
		}
		*buf = (*buf)[:0]
	}
	pool.returnBytes(buf)
	return dst
}
I'm not going to cover the details of how appendYamlAttr builds the yaml. If
you are interested you can see it for yourself
in yaml.go.
Benchmarks
One of my goals for the handler was for it to be about as efficient as the handlers included in "log/slog". Let's see how we did with some benchmarks.
Here is the code
package human_test
import (
	"context"
	"errors"
	"io"
	"log/slog"
	"runtime"
	"testing"
	"time"
	"github.com/willabides/actionslog/human"
)
var (
	testMessage  = "Test logging, but use a somewhat realistic message length."
	testDuration = 23 * time.Second
	testString   = "7e3b3b2aaeff56a7108fe11e154200dd/7819479873059528190"
	testInt      = 32768
	testTime     = time.Now()
	testError    = errors.New("fail")
)
func BenchmarkHandler(b *testing.B) {
	var pcs [1]uintptr
	runtime.Callers(0, pcs[:])
	b.Run("simple with source", func(b *testing.B) {
		record := slog.NewRecord(time.Time{}, slog.LevelInfo, testMessage, pcs[0])
		runBenchmarks(b, true, record, nil)
	})
	b.Run("simple", func(b *testing.B) {
		record := slog.NewRecord(time.Time{}, slog.LevelInfo, testMessage, 0)
		runBenchmarks(b, false, record, nil)
	})
	attrs := []slog.Attr{
		slog.String("string", testString),
		slog.Group(
			"group",
			slog.String("string", testString),
			slog.Any("any string", testString),
		),
		slog.Int("int", testInt),
		slog.Duration("duration", testDuration),
		slog.Time("time", testTime),
		slog.Any("error", testError),
		slog.Bool("bool", true),
	}
	b.Run("attrs", func(b *testing.B) {
		record := slog.NewRecord(time.Now(), 0, testMessage, 0)
		record.AddAttrs(append([]slog.Attr{}, attrs...)...)
		runBenchmarks(b, false, record, nil)
	})
	b.Run("groups", func(b *testing.B) {
		record := slog.NewRecord(time.Now(), 0, testMessage, 0)
		record.AddAttrs(append([]slog.Attr{}, attrs...)...)
		runBenchmarks(b, false, record, func(handler slog.Handler) slog.Handler {
			return handler.WithGroup(
				"group1",
			).WithGroup(
				"group2",
			).WithAttrs(
				append([]slog.Attr{}, attrs...),
			)
		})
	})
	b.Run("map", func(b *testing.B) {
		record := slog.NewRecord(time.Now(), 0, testMessage, 0)
		record.AddAttrs(slog.Any("map", map[string]any{
			"string": testString,
			"int":    testInt,
			"slice":  []string{"foo", "bar", "baz"},
		}))
		runBenchmarks(b, false, record, nil)
	})
}
func runBenchmarks(b *testing.B, addSource bool, record slog.Record, prepHandler func(slog.Handler) slog.Handler, ) {
	b.Helper()
	ctx := context.Background()
	bench := func(name string, handler slog.Handler) {
		b.Helper()
		if prepHandler != nil {
			handler = prepHandler(handler)
		}
		var err error
		b.Run(name, func(b *testing.B) {
			b.ReportAllocs()
			b.RunParallel(func(pb *testing.PB) {
				for pb.Next() {
					err = handler.Handle(ctx, record)
					if err != nil {
						panic(err)
					}
				}
			})
		})
	}
	bench("slog.TextHandler", slog.NewTextHandler(
		io.Discard,
		&slog.HandlerOptions{
			AddSource: addSource,
		},
	))
	bench("slog.JSONHandler", slog.NewJSONHandler(
		io.Discard,
		&slog.HandlerOptions{
			AddSource: addSource,
		},
	))
	bench("human.Handler", &human.Handler{
		Output:    io.Discard,
		AddSource: addSource,
	})
}
And the results
Sorry my css doesn't make this easy to read.
goos: darwin
goarch: amd64
pkg: github.com/willabides/actionslog/human
cpu: Intel(R) Core(TM) i7-7920HQ CPU @ 3.10GHz
BenchmarkHandler
BenchmarkHandler/simple_with_source
BenchmarkHandler/simple_with_source/slog.TextHandler
BenchmarkHandler/simple_with_source/slog.TextHandler-4         	 2069114	       533.0 ns/op	     368 B/op	       6 allocs/op
BenchmarkHandler/simple_with_source/slog.JSONHandler
BenchmarkHandler/simple_with_source/slog.JSONHandler-4         	 2140882	       599.4 ns/op	     568 B/op	       6 allocs/op
BenchmarkHandler/simple_with_source/human.Handler
BenchmarkHandler/simple_with_source/human.Handler-4            	 5489481	       227.4 ns/op	     224 B/op	       1 allocs/op
BenchmarkHandler/simple
BenchmarkHandler/simple/slog.TextHandler
BenchmarkHandler/simple/slog.TextHandler-4                     	 4790428	       277.4 ns/op	       0 B/op	       0 allocs/op
BenchmarkHandler/simple/slog.JSONHandler
BenchmarkHandler/simple/slog.JSONHandler-4                     	 5718366	       209.9 ns/op	       0 B/op	       0 allocs/op
BenchmarkHandler/simple/human.Handler
BenchmarkHandler/simple/human.Handler-4                        	11511750	       104.5 ns/op	       0 B/op	       0 allocs/op
BenchmarkHandler/attrs
BenchmarkHandler/attrs/slog.TextHandler
BenchmarkHandler/attrs/slog.TextHandler-4                      	 1608424	       833.0 ns/op	       8 B/op	       2 allocs/op
BenchmarkHandler/attrs/slog.JSONHandler
BenchmarkHandler/attrs/slog.JSONHandler-4                      	 2328364	       478.0 ns/op	       0 B/op	       0 allocs/op
BenchmarkHandler/attrs/human.Handler
BenchmarkHandler/attrs/human.Handler-4                         	 1351884	       876.3 ns/op	       0 B/op	       0 allocs/op
BenchmarkHandler/groups
BenchmarkHandler/groups/slog.TextHandler
BenchmarkHandler/groups/slog.TextHandler-4                     	 1237494	       916.7 ns/op	       8 B/op	       2 allocs/op
BenchmarkHandler/groups/slog.JSONHandler
BenchmarkHandler/groups/slog.JSONHandler-4                     	 1894652	       654.8 ns/op	       0 B/op	       0 allocs/op
BenchmarkHandler/groups/human.Handler
BenchmarkHandler/groups/human.Handler-4                        	 1000000	      1155 ns/op	       0 B/op	       0 allocs/op
BenchmarkHandler/map
BenchmarkHandler/map/slog.TextHandler
BenchmarkHandler/map/slog.TextHandler-4                        	 1000000	      1148 ns/op	     448 B/op	      13 allocs/op
BenchmarkHandler/map/slog.JSONHandler
BenchmarkHandler/map/slog.JSONHandler-4                        	 1745904	       685.4 ns/op	     584 B/op	      12 allocs/op
BenchmarkHandler/map/human.Handler
BenchmarkHandler/map/human.Handler-4                           	  211202	      5495 ns/op	    6616 B/op	     170 allocs/op
Our humble human handler manages to keep up with the slog handlers and is fastest in a few cases. I particularly like that it adds source information with 1 allocation instead of 6 or 8. However it doesn't keep up when we include a map in the structured date. The same happens with slices, structs and other complex types. When it comes down to it, yaml is just more complex to build than json or text, so it will be less efficient. I think it's somewhat rare to log complex types in structured data, so I'm not too worried about it. I think I could improve on this if I spent a few days on it, but it isn't a priority right now.