
Commit 7d7df31

update readme
1 parent: a871088

1 file changed

ai/ai-starter-kit/helm-chart/ai-starter-kit/README.md

Lines changed: 22 additions & 87 deletions
@@ -9,7 +9,6 @@ The AI Starter Kit simplifies the deployment of AI infrastructure by providing:
 - **JupyterHub**: Multi-user notebook environment with pre-configured AI/ML libraries
 - **Model Serving**: Support for both Ollama and Ramalama model servers
 - **MLflow**: Experiment tracking and model management
-- **GPU Support**: Configurations for GPU acceleration on GKE and macOS
 - **Model Caching**: Persistent storage for efficient model management
 - **Example Notebooks**: Pre-loaded notebooks to get you started immediately
 
@@ -28,15 +27,6 @@ The AI Starter Kit simplifies the deployment of AI infrastructure by providing:
 - Minimum 4 CPU cores and 16GB RAM available
 - 40GB+ free disk space
 
-#### GKE (Google Kubernetes Engine)
-- Google Cloud CLI (`gcloud`) installed and configured
-- Appropriate GCP permissions to create clusters
-
-#### macOS with GPU (Apple Silicon)
-- macOS with Apple Silicon (M1/M2/M3/M4)
-- minikube with krunkit driver
-- 16GB+ RAM recommended
-
 ## Installation
 
 ### Quick Start (Minikube)
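
A minimal `minikube start` sketch sized to the prerequisites above (the flag values mirror the stated minimums and are illustrative, not taken from the chart docs):

```bash
# Sketch: start a local cluster at the stated minimums.
# Bump --cpus/--memory/--disk-size for heavier models.
minikube start --cpus 4 --memory 16g --disk-size 40g
```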
@@ -65,74 +55,7 @@ helm install ai-starter-kit . \
 ```bash
 kubectl port-forward svc/ai-starter-kit-jupyterhub-proxy-public 8080:80
 ```
-Navigate to http://localhost:8080 and login with any username and password `sneakypass`
-
-### GKE Deployment
-
-1. **Create a GKE Autopilot cluster:**
-```bash
-export REGION=us-central1
-export CLUSTER_NAME="ai-starter-cluster"
-export PROJECT_ID=$(gcloud config get project)
-
-gcloud container clusters create-auto ${CLUSTER_NAME} \
-  --project=${PROJECT_ID} \
-  --region=${REGION} \
-  --release-channel=rapid \
-  --labels=created-by=ai-on-gke,guide=ai-starter-kit
-```
-
-2. **Get cluster credentials:**
-```bash
-gcloud container clusters get-credentials ${CLUSTER_NAME} --location=${REGION}
-```
-
-3. **Install the chart with GKE-specific values:**
-```bash
-helm install ai-starter-kit . \
-  --set huggingface.token="YOUR_HF_TOKEN" \
-  -f values.yaml \
-  -f values-gke.yaml
-```
-
-### GKE with GPU (Ollama)
-
-For GPU-accelerated model serving with Ollama:
-
-```bash
-helm install ai-starter-kit . \
-  --set huggingface.token="YOUR_HF_TOKEN" \
-  -f values-gke.yaml \
-  -f values-ollama-gpu.yaml
-```
-
-### GKE with GPU (Ramalama)
-
-For GPU-accelerated model serving with Ramalama:
-
-```bash
-helm install ai-starter-kit . \
-  --set huggingface.token="YOUR_HF_TOKEN" \
-  -f values-gke.yaml \
-  -f values-ramalama-gpu.yaml
-```
-
-### macOS with Apple Silicon GPU
-
-1. **Start minikube with krunkit driver:**
-```bash
-minikube start --driver krunkit \
-  --cpus 8 --memory 16000 --disk-size 40000mb \
-  --mount --mount-string="/tmp/models-cache:/tmp/models-cache"
-```
-
-2. **Install with macOS GPU support:**
-```bash
-helm install ai-starter-kit . \
-  --set huggingface.token="YOUR_HF_TOKEN" \
-  -f values.yaml \
-  -f values-macos.yaml
-```
+Navigate to http://localhost:8080 and log in with any username and the password `password`
 
 ## Configuration
 
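
Before opening the browser, it can help to confirm the release's pods are ready. A sketch, assuming the conventional `app.kubernetes.io/instance` label that Helm charts typically apply (this chart's actual labels may differ):

```bash
# Sketch: watch the release's pods until the hub and proxy report Ready.
kubectl get pods -l app.kubernetes.io/instance=ai-starter-kit --watch
```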

@@ -152,9 +75,25 @@ helm install ai-starter-kit . \
 The chart supports different storage configurations:
 
 - **Local Development**: Uses hostPath volumes with minikube mount
-- **GKE**: Uses standard GKE storage classes (`standard-rwo`, `standard-rwx`)
 - **Custom**: Configure via `modelsCachePvc.storageClassName`
 
+### Using GPUs
+
+To use GPUs for AI/ML workloads, add the necessary configuration to the services. Check the dependency charts' documentation for the values. For example, the JupyterHub config would be:
+
+```yaml
+jupyterhub:
+  ...
+  extraResource:
+    limits:
+      nvidia.com/gpu: 1
+    guarantees:
+      nvidia.com/gpu: 1
+
+  nodeSelector:
+    cloud.google.com/gke-accelerator: nvidia-l4
+```
+
 ### Model Servers
 
 #### Ollama
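
Once a user server is scheduled with the GPU config above, a quick sanity check is to run `nvidia-smi` inside the pod. A sketch, assuming the upstream JupyterHub chart's `jupyter-<username>` pod-naming convention:

```bash
# Sketch: the L4 should appear in the device list if the GPU request
# and node selector above were honored.
kubectl exec -it jupyter-<username> -- nvidia-smi
```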
@@ -170,13 +109,7 @@ Ramalama provides:
 - Support for CUDA and Metal (macOS) acceleration
 - Lightweight deployment option
 
-You can run either Ollama or Ramalama, but not both simultaneously. Toggle using:
-```yaml
-ollama:
-  enabled: true/false
-ramalama:
-  enabled: true/false
-```
+
 
 ## Usage
 
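
If the chart still honors per-server `enabled` flags like those in the removed snippet, the same switch could be made from the CLI. A hypothetical sketch:

```bash
# Hypothetical: swap the active model server at upgrade time, assuming
# ollama.enabled / ramalama.enabled are still valid chart values.
helm upgrade ai-starter-kit . \
  --set ollama.enabled=false \
  --set ramalama.enabled=true
```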
@@ -209,8 +142,10 @@ kubectl port-forward svc/ai-starter-kit-ramalama 8080:8080
 ### Pre-loaded Example Notebooks
 
 The JupyterHub environment comes with pre-loaded example notebooks:
+- `ray.ipynb`: Simple Ray and MLflow example.
 - `chat_bot.ipynb`: Simple chatbot interface using Ollama for conversational AI.
-- `multi-agent-ollama.ipynb`: Multi-agent workflow demonstration using Ollama.
+- `multi-agent.ipynb`: Multi-agent workflow demonstration using Ray.
+- `multi-agent-ollama.ipynb`: Similar multi-agent workflow demonstration using Ollama.
 - `multi-agent-ramalama.ipynb`: Similar multi-agent workflow using RamaLama runtime for comparison.
 - `welcome.ipynb`: Introduction notebook with embedding model examples using Qwen models.
 
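
With a model server port-forwarded as in the hunk header above (`kubectl port-forward svc/ai-starter-kit-ramalama 8080:8080`), a quick endpoint check before running the notebooks. A sketch, assuming RamaLama's llama.cpp-based server exposes the usual OpenAI-compatible routes:

```bash
# Sketch: list the models the server advertises; the /v1/models path is
# an assumption based on OpenAI-compatible servers.
curl http://localhost:8080/v1/models
```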
