
Commit 7cdccc0

Merge pull request cch123#42 from wziww/master
pprof allocs section
2 parents 27c9896 + ccbda40 commit 7cdccc0

File tree

2 files changed: +233 -0 lines changed


pprof.md

Lines changed: 232 additions & 0 deletions
@@ -0,0 +1,232 @@
# pprof

Before diving into this chapter, let's look at three questions. They are probably the biggest sources of confusion most people run into when using pprof, so keep them in mind throughout the analysis that follows:

- How much pressure does enabling pprof put on the runtime?
- Can we selectively enable / disable pprof at a suitable time for an application running in production?
- How does pprof actually work?

Go's built-in `pprof API` lives in the `runtime/pprof` package. It gives users a way to interact with the `runtime`, so that while the application is running we can inspect its various metrics to assist with performance tuning and troubleshooting. Alternatively, you can simply import `_ "net/http/pprof"` and use the built-in `http` endpoints; the pprof code in the `net` module is just a set of wrappers that Go provides around `runtime/pprof`, which you can of course also call directly yourself.

```go
// src/runtime/pprof/pprof.go
// the observable profile categories
profiles.m = map[string]*Profile{
	"goroutine":    goroutineProfile,
	"threadcreate": threadcreateProfile,
	"heap":         heapProfile,
	"allocs":       allocsProfile,
	"block":        blockProfile,
	"mutex":        mutexProfile,
}
```
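As a quick illustration of the two usage paths described above, a minimal sketch might look like this (the listen address and output file name are arbitrary choices):

```go
package main

import (
	"log"
	"net/http"
	_ "net/http/pprof" // registers the /debug/pprof/* handlers on http.DefaultServeMux
	"os"
	"runtime/pprof"
)

func main() {
	// Path 1: the wrapped HTTP endpoints, e.g. GET /debug/pprof/allocs
	go func() {
		log.Println(http.ListenAndServe("localhost:6060", nil))
	}()

	// Path 2: call runtime/pprof directly and write the "allocs" profile to a file
	f, err := os.Create("allocs.out") // arbitrary file name
	if err != nil {
		log.Fatal(err)
	}
	defer f.Close()
	if err := pprof.Lookup("allocs").WriteTo(f, 0); err != nil {
		log.Fatal(err)
	}
}
```
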
## allocs
```go
var allocsProfile = &Profile{
	name:  "allocs",
	count: countHeap, // identical to heap profile
	write: writeAlloc,
}
```

- writeAlloc (mainly involves the following APIs)
  - ReadMemStats(m *MemStats)
  - MemProfile(p []MemProfileRecord, inuseZero bool)

```go
// ReadMemStats populates m with memory allocator statistics.
//
// The returned memory allocator statistics are up to date as of the
// call to ReadMemStats. This is in contrast with a heap profile,
// which is a snapshot as of the most recently completed garbage
// collection cycle.
func ReadMemStats(m *MemStats) {
	// stop-the-world (STW)
	stopTheWorld("read mem stats")
	// switch to the system stack
	systemstack(func() {
		// copy memstats into m
		readmemstats_m(m)
	})

	startTheWorld()
}
```

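The exported counterpart that user code calls is `runtime.ReadMemStats`; a minimal usage sketch (the fields printed here are just an arbitrary selection):

```go
package main

import (
	"fmt"
	"runtime"
)

func main() {
	var m runtime.MemStats
	// Note: this triggers the brief stop-the-world described above.
	runtime.ReadMemStats(&m)
	fmt.Printf("HeapAlloc=%d HeapObjects=%d NumGC=%d\n", m.HeapAlloc, m.HeapObjects, m.NumGC)
}
```
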
```go
// MemProfile returns a profile of memory allocated and freed per allocation
// site.
//
// MemProfile returns n, the number of records in the current memory profile.
// If len(p) >= n, MemProfile copies the profile into p and returns n, true.
// If len(p) < n, MemProfile does not change p and returns n, false.
//
// If inuseZero is true, the profile includes allocation records
// where r.AllocBytes > 0 but r.AllocBytes == r.FreeBytes.
// These are sites where memory was allocated, but it has all
// been released back to the runtime.
//
// The returned profile may be up to two garbage collection cycles old.
// This is to avoid skewing the profile toward allocations; because
// allocations happen in real time but frees are delayed until the garbage
// collector performs sweeping, the profile only accounts for allocations
// that have had a chance to be freed by the garbage collector.
//
// Most clients should use the runtime/pprof package or
// the testing package's -test.memprofile flag instead
// of calling MemProfile directly.
func MemProfile(p []MemProfileRecord, inuseZero bool) (n int, ok bool) {
	lock(&proflock)
	// If we're between mProf_NextCycle and mProf_Flush, take care
	// of flushing to the active profile so we only have to look
	// at the active profile below.
	mProf_FlushLocked()
	clear := true
	/*
	 * Keep this mbuckets in mind -- the memory profile buckets.
	 * All allocs samples are recorded in this global variable;
	 * it is analyzed in detail below.
	 * -------------------------------------------------
	 * (gdb) info variables mbuckets
	 * All variables matching regular expression "mbuckets":
	 *
	 * File runtime:
	 * runtime.bucket *runtime.mbuckets;
	 * (gdb)
	 */
	for b := mbuckets; b != nil; b = b.allnext {
		mp := b.mp()
		if inuseZero || mp.active.alloc_bytes != mp.active.free_bytes {
			n++
		}
		if mp.active.allocs != 0 || mp.active.frees != 0 {
			clear = false
		}
	}
	if clear {
		// Absolutely no data, suggesting that a garbage collection
		// has not yet happened. In order to allow profiling when
		// garbage collection is disabled from the beginning of execution,
		// accumulate all of the cycles, and recount buckets.
		n = 0
		for b := mbuckets; b != nil; b = b.allnext {
			mp := b.mp()
			for c := range mp.future {
				mp.active.add(&mp.future[c])
				mp.future[c] = memRecordCycle{}
			}
			if inuseZero || mp.active.alloc_bytes != mp.active.free_bytes {
				n++
			}
		}
	}
	if n <= len(p) {
		ok = true
		idx := 0
		for b := mbuckets; b != nil; b = b.allnext {
			mp := b.mp()
			if inuseZero || mp.active.alloc_bytes != mp.active.free_bytes {
				// copy the mbuckets data into p
				record(&p[idx], b)
				idx++
			}
		}
	}
	unlock(&proflock)
	return
}
```

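Given the `len(p) >= n` contract above, callers of `runtime.MemProfile` typically size the record slice in a retry loop; a minimal sketch of that pattern (the headroom of 50 extra records is an arbitrary choice):

```go
package main

import (
	"fmt"
	"runtime"
)

func main() {
	var records []runtime.MemProfileRecord
	n, ok := runtime.MemProfile(nil, true)
	for !ok {
		// Allocate some headroom: more records may appear between the two calls.
		records = make([]runtime.MemProfileRecord, n+50)
		n, ok = runtime.MemProfile(records, true)
	}
	records = records[:n]
	fmt.Println("profile records:", len(records))
}
```
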
To summarize, the operations involved in `pprof/allocs` are:
- a brief `STW` plus a `systemstack` switch to gather the relevant `runtime` information
- copying the value of the global object `mbuckets` back to the user

### mbuckets

As mentioned above, the core of `pprof/allocs` is its handling of `mbuckets`. The diagram below gives a simple sketch of the operations around `mbuckets`.

```go
var mbuckets *bucket // memory profile buckets
type bucket struct {
	next    *bucket
	allnext *bucket
	typ     bucketType // memBucket or blockBucket (includes mutexProfile)
	hash    uintptr
	size    uintptr
	nstk    uintptr
}
```

```shell
---------------
| user access |
---------------
        |
------------------          |
|  mbuckets list |   copy   |
|    (global)    | ----------
------------------
        |
        |
        | create && insert new bucket into mbuckets
        |
        |
--------------------------------------
| func stkbucket & typ == memProfile |
--------------------------------------
        |
----------------
| mProf_Malloc |   // record the call stack and related info
----------------
        |
----------------
| profilealloc |   // compute next_sample
----------------
        |
        |           /*
        |            * if rate := MemProfileRate; rate > 0 {
        |            *     if rate != 1 && size < c.next_sample {
        |            *         c.next_sample -= size
        | sample     *     } else {
        | & record   *         mp := acquirem()
        |            *         profilealloc(mp, x, size)
        |            *         releasem(mp)
        |            *     }
        |            * }
        |            */
        |
------------    not sampled
| mallocgc |--------------...
------------
```

As the diagram above shows, the `runtime` samples memory allocations according to a certain policy and records the samples into `mbuckets` so the user can analyze them. The sampling algorithm has one important dependency: `MemProfileRate`.

```go
// MemProfileRate controls the fraction of memory allocations
// that are recorded and reported in the memory profile.
// The profiler aims to sample an average of
// one allocation per MemProfileRate bytes allocated.
//
// To include every allocated block in the profile, set MemProfileRate to 1.
// To turn off profiling entirely, set MemProfileRate to 0.
//
// The tools that process the memory profiles assume that the
// profile rate is constant across the lifetime of the program
// and equal to the current value. Programs that change the
// memory profiling rate should do so just once, as early as
// possible in the execution of the program (for example,
// at the beginning of main).
var MemProfileRate int = 512 * 1024
```

218+
默认大小是 512 KB, 可以由用户自行配置.
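If you want to control the sampling yourself, set the variable once, as early as possible; a minimal sketch (the 128 KB value is an arbitrary example):

```go
package main

import "runtime"

func main() {
	// Change the rate once, as early as possible (see the doc comment above).
	// 0 turns memory profiling off entirely; 1 records every allocation.
	runtime.MemProfileRate = 128 * 1024 // sample roughly one allocation per 128 KB allocated

	// ... the rest of the program ...
}
```
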
219+
220+
值的注意的是, 由于开启了 pprof 会产生一些采样的额外压力及开销, go 团队已经在较新的编译器中有选择地进行了这个变量的配置以[改变](https://go-review.googlesource.com/c/go/+/299671/8/src/runtime/mprof.go)默认开启的现状
221+
222+
具体方式为代码未进行相关引用则编译器将初始值配置为 0, 否则则为默认(512 KB)
223+
224+
(本文讨论的基于 1.14.3 版本, 如有差异请进行版本确认)
225+
226+
#### pprof/allocs summary
- Once enabled, it puts extra pressure on the runtime: when a sample is taken, additional information is recorded during `runtime malloc` for later analysis.
- You can choose whether to enable it, and at what sampling rate, by setting `runtime.MemProfileRate`. Go versions differ on whether it is enabled by default, which depends on whether user code references (links in) the relevant modules/variables; the default value is 512 KB.

# References

https://go-review.googlesource.com/c/go/+/299671

readme.md

Lines changed: 1 addition & 0 deletions
@@ -30,6 +30,7 @@
24. [x] [Atomic](atomic.md)
25. [ ] [Generics](generics.md)
26. [x] [IO](io.md)
27. [x] [pprof](pprof.md)

# Authors
