Added a blog post
This commit is contained in:
6
content/archives.md
Normal file
6
content/archives.md
Normal file
@@ -0,0 +1,6 @@
|
||||
---
|
||||
title: "Archive"
|
||||
layout: "archives"
|
||||
url: "/archives/"
|
||||
summary: archives
|
||||
---
|
||||
203
content/posts/go_heap_allocation.md
Normal file
203
content/posts/go_heap_allocation.md
Normal file
@@ -0,0 +1,203 @@
|
||||
+++
|
||||
title = 'Go Heap Allocations'
|
||||
date = 2024-01-24T21:45:13-06:00
|
||||
draft = false
|
||||
+++
|
||||
|
||||
Recently I found myself caring about performance. Usually, with go, you don't have to care if something is on the stack or the heap. Since Go is a garbage collected language all those details are taken care of for you. Worring about these things is a pre-optimization, and the community is very happy to tell you so. However, there is the rare event performance takes precidence, in which case the favor of garbage collection turns into a frustrating scavenger hunt.
|
||||
|
||||
Lets write a simple benchmark so we can start looking for allocations.
|
||||
|
||||
<sub>main_test.go</sub>
|
||||
```go
|
||||
type Person struct {
|
||||
name string
|
||||
age int
|
||||
}
|
||||
|
||||
func BenchmarkNew(b *testing.B) {
|
||||
for i := 0; i < b.N; i++ {
|
||||
e := Person{
|
||||
name: "eric",
|
||||
age: 12,
|
||||
}
|
||||
|
||||
_, err := json.Marshal(e)
|
||||
if err != nil {
|
||||
b.FailNow()
|
||||
}
|
||||
}
|
||||
}
|
||||
```
|
||||
|
||||
We want to run our benchmarks and capture memory statistics. To do this we can use the `-memprofile` flag for `go test`. This will generate a memory profile which will help find where our allocations are occuring. We can also get a count of allocations to the heap during each loop in our benchmark by passing the `-benchmem` flag.
|
||||
|
||||
```
|
||||
❯ go test -run='^$' -bench=BenchmarkNew -memprofile=mem.out -benchmem .
|
||||
goos: linux
|
||||
goarch: amd64
|
||||
pkg: github.com/d1ngd0/go-play
|
||||
cpu: AMD FX(tm)-8320 Eight-Core Processor
|
||||
BenchmarkNew-8 3484684 443.3 ns/op 32 B/op 2 allocs/op
|
||||
PASS
|
||||
ok github.com/d1ngd0/go-play 1.906s
|
||||
```
|
||||
|
||||
So our code has 2 memory allocations per run. We can use the `go tool pprof` command to look through our memory profile and figure out where exactly the allocations are occuring.
|
||||
|
||||
```
|
||||
❯ go tool pprof mem.out
|
||||
File: go-play.test
|
||||
Type: alloc_space
|
||||
Time: Jan 24, 2024 at 10:17pm (CST)
|
||||
Entering interactive mode (type "help" for commands, "o" for options)
|
||||
(pprof)
|
||||
```
|
||||
|
||||
Running the command above will place you in a prompt with commands to help you explore the data. The best place to start is with `top`, which shows you the top bytes allocated.
|
||||
|
||||
```
|
||||
(pprof) top
|
||||
Showing nodes accounting for 138.50MB, 100% of 138.50MB total
|
||||
flat flat% sum% cum cum%
|
||||
101.50MB 73.29% 73.29% 138.50MB 100% github.com/d1ngd0/go-play.BenchmarkNew
|
||||
37MB 26.71% 100% 37MB 26.71% encoding/json.Marshal
|
||||
0 0% 100% 138.50MB 100% testing.(*B).launch
|
||||
0 0% 100% 138.50MB 100% testing.(*B).runN
|
||||
```
|
||||
|
||||
We can then view the **exact** location by using the `list` command.
|
||||
|
||||
```
|
||||
(pprof) list encoding/json.Marshal
|
||||
Total: 138.50MB
|
||||
ROUTINE ======================== encoding/json.Marshal in /usr/local/go/src/encoding/json/encode.go
|
||||
37MB 37MB (flat, cum) 26.71% of Total
|
||||
. . 158:func Marshal(v any) ([]byte, error) {
|
||||
. . 159: e := newEncodeState()
|
||||
. . 160: defer encodeStatePool.Put(e)
|
||||
. . 161:
|
||||
. . 162: err := e.marshal(v, encOpts{escapeHTML: true})
|
||||
. . 163: if err != nil {
|
||||
. . 164: return nil, err
|
||||
. . 165: }
|
||||
37MB 37MB 166: buf := append([]byte(nil), e.Bytes()...)
|
||||
. . 167:
|
||||
. . 168: return buf, nil
|
||||
. . 169:}
|
||||
. . 170:
|
||||
. . 171:// MarshalIndent is like Marshal but applies Indent to format the output.
|
||||
```
|
||||
|
||||
Now we see that `buf := append([]byte(nil), e.Bytes()...)` is the line causing the heap allocation. This makes sense, as the code here is effectively copying the bytes from one slice into a new one.
|
||||
|
||||
One of the issues here is we are tracking heap allocations by bytes allocated,not total number of allocations. Each allocation has overhead, as our program has to find a space large enough to hold our value. Disproportionate bytes needed for each allocation to the heap may throw off our numbers, and make a small performance issue look like a huge one.
|
||||
|
||||
For instance lets add a big heap allocation to our benchmark.
|
||||
|
||||
<sub>main_test.go</sub>
|
||||
```go
|
||||
func BenchmarkNew(b *testing.B) {
|
||||
for i := 0; i < b.N; i++ {
|
||||
// make a big heap allocation
|
||||
v := make([]byte, 100000)
|
||||
// make sure we use v so it doesn't optimise out
|
||||
_ = v
|
||||
|
||||
e := Person{
|
||||
name: "eric",
|
||||
age: 12,
|
||||
}
|
||||
|
||||
// cause more heap allocations
|
||||
for x := 0; x < 1000; x++ {
|
||||
_, err := json.Marshal(e)
|
||||
if err != nil {
|
||||
b.FailNow()
|
||||
}
|
||||
}
|
||||
}
|
||||
}
|
||||
```
|
||||
|
||||
We now see a staggering number of allocations, since we have increased the number of times we run json.Marshal
|
||||
|
||||
```
|
||||
❯ go test -run='^$' -bench=BenchmarkNew -memprofile=mem.out -benchmem .
|
||||
goos: linux
|
||||
goarch: amd64
|
||||
pkg: github.com/d1ngd0/go-play
|
||||
cpu: AMD FX(tm)-8320 Eight-Core Processor
|
||||
BenchmarkNew-8 163 8571304 ns/op 138551 B/op 2001 allocs/op
|
||||
PASS
|
||||
ok github.com/d1ngd0/go-play 2.138s
|
||||
```
|
||||
|
||||
After looking at the allocations in `BenchmarkNew` we see this
|
||||
|
||||
```
|
||||
(pprof) list BenchmarkNew
|
||||
Total: 32.34MB
|
||||
ROUTINE ======================== github.com/d1ngd0/go-play.BenchmarkNew in /home/paul/Projects/go-play/main_test.go
|
||||
30.37MB 32.24MB (flat, cum) 99.69% of Total
|
||||
. . 13:func BenchmarkNew(b *testing.B) {
|
||||
. . 14: for i := 0; i < b.N; i++ {
|
||||
. . 15: // make a big heap allocation
|
||||
24.78MB 24.78MB 16: v := make([]byte, 100000)
|
||||
. . 17: // make sure we use v so it doesn't optimise out
|
||||
. . 18: _ = v
|
||||
. . 19:
|
||||
. . 20: e := Person{
|
||||
. . 21: name: "eric",
|
||||
. . 22: age: 12,
|
||||
. . 23: }
|
||||
. . 24:
|
||||
. . 25: // cause more heap allocations
|
||||
. . 26: for x := 0; x < 1000; x++ {
|
||||
5.58MB 7.46MB 27: _, err := json.Marshal(e)
|
||||
. . 28: if err != nil {
|
||||
. . 29: b.FailNow()
|
||||
. . 30: }
|
||||
. . 31: }
|
||||
. . 32: }
|
||||
```
|
||||
|
||||
You can see line 16 has a much larger allocation in bytes, but each run should cause a single allocation. We are running 1000 allocations in the for loop in BenchmemNew and another 1000 inside `json.Marshal`. This will have a much larger performance impact. Though the way we are measuring things would make us look at line 16 first. Let's run our test again to get a count of allocations instead.
|
||||
|
||||
First when we run our command we will set the flag `-memprofilerate=1`. This will count **every** allocation that occurs, though it does this at a massive performance cost. Since we only care to count allocations, we don't really care how long it takes to run.
|
||||
|
||||
```
|
||||
❯ go test -run='^$' -bench=BenchmarkNew -memprofile=mem.out -memprofilerate=1 -benchmem .
|
||||
```
|
||||
|
||||
Now we can run `pprof` again, but this time we will pass in the `-alloc_objects` flag.
|
||||
|
||||
```
|
||||
❯ go tool pprof -alloc_objects mem.out
|
||||
(pprof) list BenchmarkNew
|
||||
Total: 396519
|
||||
ROUTINE ======================== github.com/d1ngd0/go-play.BenchmarkNew in /home/paul/Projects/go-play/main_test.go
|
||||
264264 396342 (flat, cum) 100% of Total
|
||||
. . 13:func BenchmarkNew(b *testing.B) {
|
||||
. . 14: for i := 0; i < b.N; i++ {
|
||||
. . 15: // make a big heap allocation
|
||||
264 264 16: v := make([]byte, 100000)
|
||||
. . 17: // make sure we use v so it doesn't optimise out
|
||||
. . 18: _ = v
|
||||
. . 19:
|
||||
. . 20: e := Person{
|
||||
. . 21: name: "eric",
|
||||
. . 22: age: 12,
|
||||
. . 23: }
|
||||
. . 24:
|
||||
. . 25: // cause more heap allocations
|
||||
. . 26: for x := 0; x < 1000; x++ {
|
||||
264000 396078 27: _, err := json.Marshal(e)
|
||||
. . 28: if err != nil {
|
||||
. . 29: b.FailNow()
|
||||
. . 30: }
|
||||
. . 31: }
|
||||
. . 32: }
|
||||
```
|
||||
|
||||
Now we can see where the allocations are really occuring.
|
||||
Reference in New Issue
Block a user