The chart supports different storage configurations:
- **Local Development**: Uses hostPath volumes with minikube mount
- **GKE**: Uses standard GKE storage classes (`standard-rwo`, `standard-rwx`)
- **Custom**: Configure via `modelsCachePvc.storageClassName`

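For the custom case, a values override might look like the sketch below. The `modelsCachePvc.storageClassName` key comes from this chart; the `fast-ssd` class name is only an example — use a StorageClass that exists in your cluster:

```yaml
# values-custom-storage.yaml (example override)
modelsCachePvc:
  storageClassName: fast-ssd  # example only; substitute your cluster's StorageClass
```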
### Using GPUs
To use GPUs for AI/ML workloads, add the necessary configuration to the services. Check the dependency charts' documentation for the exact values. For example, the `jupyterhub` config would be:
```yaml
jupyterhub:
  ...
  extraResource:
    limits:
      nvidia.com/gpu: 1
    guarantees:
      nvidia.com/gpu: 1

  nodeSelector:
    cloud.google.com/gke-accelerator: nvidia-l4
```
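On GKE, GPU node pools are tainted with `nvidia.com/gpu=present:NoSchedule`. GKE normally injects the matching toleration automatically for pods that request `nvidia.com/gpu`, but if pods stay unschedulable it can be added explicitly. The `extraTolerations` key below is an assumption — verify the exact key against the JupyterHub chart's documented values:

```yaml
jupyterhub:
  ...
  # hypothetical key; confirm against the dependency chart's values reference
  extraTolerations:
    - key: nvidia.com/gpu
      operator: Exists
      effect: NoSchedule
```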
### Model Servers
#### Ollama
#### Ramalama

Ramalama provides:

- Support for CUDA and Metal (macOS) acceleration
- Lightweight deployment option

You can run either Ollama or Ramalama, but not both simultaneously. Toggle using:
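The exact toggle values are not shown in this excerpt; as a hedged sketch, assuming each model-server subchart exposes the conventional Helm `enabled` flag:

```yaml
# hypothetical flags; confirm the real value names in the chart's values.yaml
ollama:
  enabled: true
ramalama:
  enabled: false
```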