Commit 45c43d4
fix: refactor agent resource monitoring API to avoid excessive calls to DB (coder#20430)
This should resolve coder/internal#728 by
refactoring the ResourceMonitorAPI struct to only require querying the
resource monitor once for memory and once for volumes, then using the
stored monitors on the API struct from that point on. This should
eliminate the vast majority of calls to `GetWorkspaceByAgentID` and
`FetchVolumesResourceMonitorsUpdatedAfter`/`FetchMemoryResourceMonitorsUpdatedAfter`
(millions of calls per week).
Tests passed. I also ran an instance of coder via a workspace with a
template that added resource monitoring every 10s. This is the default
docker container, so there are other sources of `GetWorkspaceByAgentID`
db queries. The workspace had been running for ~15 minutes at the time I
gathered this data.
Over 30s for the `ResourceMonitor` calls:
```
coder@callum-coder-2:~/coder$ curl localhost:19090/metrics | grep ResourceMonitor | grep count
coderd_db_query_latencies_seconds_count{query="FetchMemoryResourceMonitorsByAgentID"} 2
coderd_db_query_latencies_seconds_count{query="FetchMemoryResourceMonitorsUpdatedAfter"} 2
coderd_db_query_latencies_seconds_count{query="FetchVolumesResourceMonitorsByAgentID"} 2
coderd_db_query_latencies_seconds_count{query="FetchVolumesResourceMonitorsUpdatedAfter"} 2
coderd_db_query_latencies_seconds_count{query="UpdateMemoryResourceMonitor"} 155
coderd_db_query_latencies_seconds_count{query="UpdateVolumeResourceMonitor"} 155
coder@callum-coder-2:~/coder$ curl localhost:19090/metrics | grep ResourceMonitor | grep count
coderd_db_query_latencies_seconds_count{query="FetchMemoryResourceMonitorsByAgentID"} 2
coderd_db_query_latencies_seconds_count{query="FetchMemoryResourceMonitorsUpdatedAfter"} 2
coderd_db_query_latencies_seconds_count{query="FetchVolumesResourceMonitorsByAgentID"} 2
coderd_db_query_latencies_seconds_count{query="FetchVolumesResourceMonitorsUpdatedAfter"} 2
coderd_db_query_latencies_seconds_count{query="UpdateMemoryResourceMonitor"} 158
coderd_db_query_latencies_seconds_count{query="UpdateVolumeResourceMonitor"} 158
```
And over 1m for the `GetWorkspaceByAgentID` calls, most of which come
from the workspace metadata stats updates:
```
coder@callum-coder-2:~/coder$ curl localhost:19090/metrics | grep GetWorkspaceByAgentID | grep count
% Total % Received % Xferd Average Speed Time Time Time Current
Dload Upload Total Spent Left Speed
100 284k 0 284k 0 0 42.4M 0 --:--:-- --:--:-- --:--:-- 46.3M
coderd_db_query_latencies_seconds_count{query="GetWorkspaceByAgentID"} 876
coder@callum-coder-2:~/coder$ curl localhost:19090/metrics | grep GetWorkspaceByAgentID | grep count
% Total % Received % Xferd Average Speed Time Time Time Current
Dload Upload Total Spent Left Speed
100 284k 0 284k 0 0 75.4M 0 --:--:-- --:--:-- --:--:-- 92.7M
coderd_db_query_latencies_seconds_count{query="GetWorkspaceByAgentID"} 918
```
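These counters are cumulative, so the number of queries in each window is the difference between the two scrapes. A minimal sketch of that arithmetic follows; `parseCounts` and `delta` are hypothetical helpers written for this illustration, and the sample values are copied from the output above (the `ResourceMonitor` counters were sampled 30s apart, the `GetWorkspaceByAgentID` counter 1m apart).

```go
package main

import (
	"fmt"
	"strconv"
	"strings"
)

// parseCounts extracts query name -> count from lines shaped like
// coderd_db_query_latencies_seconds_count{query="Name"} 123
// and silently skips anything else (for example curl progress output).
func parseCounts(scrape string) map[string]int {
	counts := make(map[string]int)
	for _, line := range strings.Split(scrape, "\n") {
		_, rest, ok := strings.Cut(line, `{query="`)
		if !ok {
			continue
		}
		name, value, ok := strings.Cut(rest, `"} `)
		if !ok {
			continue
		}
		n, err := strconv.Atoi(strings.TrimSpace(value))
		if err != nil {
			continue
		}
		counts[name] = n
	}
	return counts
}

// delta returns after minus before for every query present in after.
func delta(before, after map[string]int) map[string]int {
	out := make(map[string]int, len(after))
	for name, n := range after {
		out[name] = n - before[name]
	}
	return out
}

func main() {
	before := parseCounts(`coderd_db_query_latencies_seconds_count{query="FetchMemoryResourceMonitorsUpdatedAfter"} 2
coderd_db_query_latencies_seconds_count{query="UpdateMemoryResourceMonitor"} 155
coderd_db_query_latencies_seconds_count{query="GetWorkspaceByAgentID"} 876`)
	after := parseCounts(`coderd_db_query_latencies_seconds_count{query="FetchMemoryResourceMonitorsUpdatedAfter"} 2
coderd_db_query_latencies_seconds_count{query="UpdateMemoryResourceMonitor"} 158
coderd_db_query_latencies_seconds_count{query="GetWorkspaceByAgentID"} 918`)

	d := delta(before, after)
	fmt.Println(d["FetchMemoryResourceMonitorsUpdatedAfter"]) // 0 over 30s
	fmt.Println(d["UpdateMemoryResourceMonitor"])             // 3 over 30s
	fmt.Println(d["GetWorkspaceByAgentID"])                   // 42 over 1m
}
```

Three updates in 30s is consistent with the 10s monitoring interval, while the fetch counters do not move at all between scrapes.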
---------
Signed-off-by: Callum Styan <callumstyan@gmail.com>

1 parent a1e7e10 commit 45c43d4
3 files changed in coderd/agentapi: +82 -35 lines changed