
Commit fc83031

rlorenzo and claude committed
Fix OpenAI/Gemini image generation and add Claude Code MCP support
- Update environment variables to provider-namespaced format (PROVIDERS__*__*)
- Switch Gemini provider from Generative AI API to Vertex AI with service account auth
- Fix gpt-image-1 model capabilities (remove unsupported style parameter)
- Add Claude Code MCP integration with start-mcp.sh script
- Update Docker configuration and documentation with Vertex AI setup instructions

🤖 Generated with [Claude Code](https://claude.ai/code)

Co-Authored-By: Claude <noreply@anthropic.com>
1 parent 68b48a6 commit fc83031

20 files changed, +1051 -173 lines

.env.example
Lines changed: 4 additions & 3 deletions

@@ -15,9 +15,10 @@ PROVIDERS__OPENAI__TIMEOUT=300.0
 PROVIDERS__OPENAI__MAX_RETRIES=3
 PROVIDERS__OPENAI__ENABLED=true
 
-# Gemini Provider (default disabled)
-PROVIDERS__GEMINI__API_KEY=your-gemini-api-key-here
-PROVIDERS__GEMINI__BASE_URL=https://generativelanguage.googleapis.com/v1beta/
+# Gemini Provider (requires Vertex AI setup)
+# For Imagen models, use path to Google Cloud service account JSON file
+PROVIDERS__GEMINI__API_KEY=/path/to/your/vertex-ai-key.json
+PROVIDERS__GEMINI__BASE_URL=https://us-central1-aiplatform.googleapis.com/v1
 PROVIDERS__GEMINI__TIMEOUT=300.0
 PROVIDERS__GEMINI__MAX_RETRIES=3
 PROVIDERS__GEMINI__ENABLED=false
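
Note that PROVIDERS__GEMINI__API_KEY now holds a file path rather than a token, and the new BASE_URL has no project segment. A minimal sketch of how a provider might consume this value, assuming the standard service-account key layout (the provider code itself is not part of this diff):

```python
# Sketch only: detect the key-file path in PROVIDERS__GEMINI__API_KEY and read
# the fields needed to build Vertex AI request URLs. The field names below
# (project_id, client_email) are standard in Google service-account JSON keys.
import json
import os

value = os.environ["PROVIDERS__GEMINI__API_KEY"]

if value.endswith(".json") and os.path.isfile(value):
    with open(value) as fh:
        key = json.load(fh)
    project_id = key["project_id"]        # used in the Vertex AI URL path
    client_email = key["client_email"]    # identity the requests run as
    print(f"Vertex AI calls will run as {client_email} in project {project_id}")
else:
    raise ValueError("PROVIDERS__GEMINI__API_KEY should point to a service account JSON key")
```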

.gitignore
Lines changed: 15 additions & 0 deletions

@@ -10,3 +10,18 @@ desktop.ini
 .coverage
 .cache/
 *.log
+
+# Google Cloud service account keys
+vertex-ai-key.json
+*-key.json
+*-service-account.json
+vertex-ai-*.json
+service-account-*.json
+
+# Generated images and storage
+storage/
+logs/
+
+# Auto-generated monitoring configuration
+monitoring/
+redis.conf

Dockerfile
Lines changed: 4 additions & 3 deletions

@@ -16,8 +16,8 @@ WORKDIR /app
 # Copy dependency files
 COPY pyproject.toml uv.lock ./
 
-# Install dependencies
-RUN uv sync --frozen --no-dev
+# Install dependencies (include all groups for proper functionality)
+RUN uv sync --frozen
 
 # Production stage
 FROM python:3.11-slim
@@ -43,7 +43,8 @@ COPY . .
 RUN mkdir -p /app/storage/images /app/storage/cache /app/storage/logs && \
     chown -R appuser:appuser /app
 
-# Set environment variables
+# Set environment variables for virtual environment activation
+ENV VIRTUAL_ENV="/app/.venv"
 ENV PATH="/app/.venv/bin:$PATH"
 ENV PYTHONPATH="/app"
 ENV PYTHONDONTWRITEBYTECODE=1

README.md
Lines changed: 34 additions & 8 deletions

@@ -112,7 +112,7 @@ The AI ecosystem has evolved to include powerful language models from multiple p
 - Python 3.10+
 - [UV package manager](https://docs.astral.sh/uv/)
 - OpenAI API key (for OpenAI models)
-- Google Gemini API key (for Gemini models, optional)
+- Google Cloud service account with Vertex AI access (for Imagen models, optional)
 
 ### Installation
 
@@ -128,11 +128,18 @@ The AI ecosystem has evolved to include powerful language models from multiple p
 2. **Configure environment**:
    ```bash
    cp .env.example .env
-   # Edit .env and add your API keys:
+   # Edit .env and add your credentials:
    # - PROVIDERS__OPENAI__API_KEY for OpenAI models
-   # - PROVIDERS__GEMINI__API_KEY for Gemini models (optional)
+   # - PROVIDERS__GEMINI__API_KEY for Imagen models (path to service account JSON file)
    ```
 
+   **For Imagen models (Vertex AI setup)**:
+   1. Go to [Google Cloud Console](https://console.cloud.google.com)
+   2. Enable Vertex AI API for your project
+   3. Create a service account with "Vertex AI User" role
+   4. Download the JSON key file to your project directory
+   5. Set `PROVIDERS__GEMINI__API_KEY` to the path of your JSON file
+
 3. **Test the setup**:
    ```bash
    uv run python scripts/dev.py setup
@@ -154,6 +161,9 @@ The AI ecosystem has evolved to include powerful language models from multiple p
 
 # Production deployment with monitoring
 ./run.sh prod
+
+# Stop all services
+./run.sh stop
 ```
 
 #### Manual Execution
@@ -218,13 +228,28 @@ This server works with **any MCP-compatible chatbot client**. Here are configura
         "image-gen-mcp"
       ],
       "env": {
-        "OPENAI_API_KEY": "your-api-key-here"
+        "PROVIDERS__OPENAI__API_KEY": "your-api-key-here"
       }
     }
   }
 }
 ```
 
+##### Claude Code (Anthropic CLI)
+```bash
+# First, create the startup script (one-time setup)
+# This is already included in the repository as start-mcp.sh
+
+# Add MCP server with API key
+claude mcp add image-gen-mcp /path/to/image-gen-mcp/start-mcp.sh -e PROVIDERS__OPENAI__API_KEY=your-api-key-here
+
+# Or add without API key if it's in your .env file
+claude mcp add image-gen-mcp /path/to/image-gen-mcp/start-mcp.sh
+
+# Verify setup
+claude mcp list
+```
+
 ##### Continue.dev (VS Code Extension)
 ```json
 {
@@ -233,7 +258,7 @@ This server works with **any MCP-compatible chatbot client**. Here are configura
       "command": "uv",
       "args": ["--directory", "/path/to/image-gen-mcp", "run", "image-gen-mcp"],
       "env": {
-        "OPENAI_API_KEY": "your-api-key-here"
+        "PROVIDERS__OPENAI__API_KEY": "your-api-key-here"
      }
    }
  }
@@ -360,9 +385,10 @@ PROVIDERS__OPENAI__TIMEOUT=300.0
 PROVIDERS__OPENAI__MAX_RETRIES=3
 PROVIDERS__OPENAI__ENABLED=true
 
-# Gemini Provider (default disabled)
-PROVIDERS__GEMINI__API_KEY=your-gemini-api-key-here
-PROVIDERS__GEMINI__BASE_URL=https://generativelanguage.googleapis.com/v1beta/
+# Gemini Provider (requires Vertex AI setup)
+# For Imagen models, use path to Google Cloud service account JSON file
+PROVIDERS__GEMINI__API_KEY=/path/to/your/vertex-ai-key.json
+PROVIDERS__GEMINI__BASE_URL=https://us-central1-aiplatform.googleapis.com/v1
 PROVIDERS__GEMINI__TIMEOUT=300.0
 PROVIDERS__GEMINI__MAX_RETRIES=3
 PROVIDERS__GEMINI__ENABLED=false
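
The Vertex AI setup steps documented above end with a JSON key file on disk. A short sketch (assuming the google-auth package, which is not shown in this diff) of how that key is typically exchanged for the short-lived access token that Vertex AI expects as a Bearer credential:

```python
# Sketch: mint an OAuth2 access token from the service-account key downloaded
# in step 4 and referenced by PROVIDERS__GEMINI__API_KEY in step 5.
import os

from google.auth.transport.requests import Request
from google.oauth2 import service_account

key_path = os.environ["PROVIDERS__GEMINI__API_KEY"]
creds = service_account.Credentials.from_service_account_file(
    key_path, scopes=["https://www.googleapis.com/auth/cloud-platform"]
)
creds.refresh(Request())           # obtains a short-lived access token
print(creds.project_id)            # project the key belongs to
print(creds.token[:16] + "...")    # value for "Authorization: Bearer <token>"
```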

deploy/vps-setup.sh
Lines changed: 1 addition & 1 deletion

@@ -306,7 +306,7 @@ create_env_template() {
 
     cat > /opt/gpt-image-mcp/.env.example <<EOF
 # OpenAI Configuration
-OPENAI_API_KEY=your-openai-api-key-here
+PROVIDERS__OPENAI__API_KEY=your-openai-api-key-here
 OPENAI_ORGANIZATION=
 OPENAI_BASE_URL=https://api.openai.com/v1

docker-compose.dev.yml
Lines changed: 8 additions & 6 deletions

@@ -1,5 +1,3 @@
-version: '3.8'
-
 services:
   # Main application - Development version
   gpt-image-mcp-dev:
@@ -9,7 +7,8 @@ services:
     ports:
       - "3001:3001"  # Expose directly for development
     environment:
-      - OPENAI_API_KEY=${OPENAI_API_KEY}
+      - PROVIDERS__OPENAI__API_KEY=${PROVIDERS__OPENAI__API_KEY}
+      - PROVIDERS__GEMINI__API_KEY=${PROVIDERS__GEMINI__API_KEY:-}
       - REDIS_URL=redis://redis:6379/0
       - STORAGE_BASE_PATH=/app/storage
       - CACHE_ENABLED=true
@@ -19,7 +18,8 @@ services:
     volumes:
       - ./storage:/app/storage
       - ./logs:/app/logs
-      - .:/app  # Mount source code for development
+      # Mount only the source code, not the entire app directory (preserves .venv)
+      - ./gpt_image_mcp:/app/gpt_image_mcp
     depends_on:
       - redis
     networks:
@@ -39,7 +39,8 @@ services:
     container_name: gpt-image-mcp-stdio
     restart: "no"  # Don't restart automatically for stdio
     environment:
-      - OPENAI_API_KEY=${OPENAI_API_KEY}
+      - PROVIDERS__OPENAI__API_KEY=${PROVIDERS__OPENAI__API_KEY}
+      - PROVIDERS__GEMINI__API_KEY=${PROVIDERS__GEMINI__API_KEY:-}
      - REDIS_URL=redis://redis:6379/0
      - STORAGE_BASE_PATH=/app/storage
      - CACHE_ENABLED=true
@@ -48,7 +49,8 @@ services:
     volumes:
       - ./storage:/app/storage
       - ./logs:/app/logs
-      - .:/app
+      # Mount only the source code, not the entire app directory (preserves .venv)
+      - ./gpt_image_mcp:/app/gpt_image_mcp
     depends_on:
       - redis
     networks:

docker-compose.prod.yml
Lines changed: 2 additions & 3 deletions

@@ -1,5 +1,3 @@
-version: '3.8'
-
 services:
   # Main application
   gpt-image-mcp:
@@ -9,7 +7,8 @@ services:
     ports:
       - "127.0.0.1:3001:3001"  # Bind to localhost only, nginx will proxy
     environment:
-      - OPENAI_API_KEY=${OPENAI_API_KEY}
+      - PROVIDERS__OPENAI__API_KEY=${PROVIDERS__OPENAI__API_KEY}
+      - PROVIDERS__GEMINI__API_KEY=${PROVIDERS__GEMINI__API_KEY:-}
       - REDIS_URL=redis://redis:6379/0
       - STORAGE_BASE_PATH=/app/storage
       - CACHE_ENABLED=true

docs/multi_provider_api_guide.md
Lines changed: 30 additions & 14 deletions

@@ -36,7 +36,7 @@ Creates an image given a prompt.
 ```bash
 curl https://api.openai.com/v1/images/generations \
   -H "Content-Type: application/json" \
-  -H "Authorization: Bearer $OPENAI_API_KEY" \
+  -H "Authorization: Bearer $PROVIDERS__OPENAI__API_KEY" \
   -d '{
     "model": "gpt-image-1",
     "prompt": "A cute baby sea otter",
@@ -65,36 +65,52 @@ Creates an edited or extended image given source images and a prompt. Only suppo
 | `output_format` | string | No | Output format for `gpt-image-1` |
 | `background` | string | No | Background setting for `gpt-image-1` |
 
-## Google Gemini Images API (OpenAI Compatible)
+## Google Vertex AI Images API (Imagen Models)
 
 ### Create Image
 
-**Endpoint**: `POST https://generativelanguage.googleapis.com/v1beta/openai/images/generations`
+**Endpoint**: `POST https://us-central1-aiplatform.googleapis.com/v1/projects/{PROJECT_ID}/locations/us-central1/publishers/google/models/{MODEL_ID}:predict`
 
-Creates an image using Google's Imagen models through OpenAI compatibility mode.
+Creates an image using Google's Imagen models through Vertex AI API with service account authentication.
 
 #### Request Body
 
+**Format**: Vertex AI prediction format with instances and parameters.
+
+**Instances**:
 | Parameter | Type | Required | Description |
 |-----------|------|----------|-------------|
 | `prompt` | string | Yes | Text description of the desired image |
-| `model` | string | No | Model to use: `imagen-4`, `imagen-4-ultra`, or `imagen-3` |
-| `n` | integer | No | Number of images to generate (1-8) |
+
+**Parameters**:
+| Parameter | Type | Required | Description |
+|-----------|------|----------|-------------|
+| `sampleCount` | integer | No | Number of images to generate (1-4) |
 | `aspectRatio` | string | No | Aspect ratio: `1:1`, `9:16`, `16:9`, `3:4`, `4:3` |
-| `outputFormat` | string | No | Output format: `png`, `jpeg` |
-| `safety` | string | No | Safety filter level: `strict`, `moderate`, `permissive` |
+
+**Available Models**:
+- `imagen-4.0-generate-preview-06-06` (Imagen-4)
+- `imagen-3.0-generate-002` (Imagen-3)
 
 #### Example Request
 
+**Note**: Gemini/Imagen models now use Vertex AI API with service account authentication.
+
 ```bash
-curl https://generativelanguage.googleapis.com/v1beta/openai/images/generations \
+# First, get access token from service account
+ACCESS_TOKEN=$(gcloud auth application-default print-access-token)
+
+curl https://us-central1-aiplatform.googleapis.com/v1/projects/YOUR_PROJECT_ID/locations/us-central1/publishers/google/models/imagen-4.0-generate-preview-06-06:predict \
   -H "Content-Type: application/json" \
-  -H "Authorization: Bearer $GEMINI_API_KEY" \
+  -H "Authorization: Bearer $ACCESS_TOKEN" \
   -d '{
-    "model": "imagen-4",
-    "prompt": "A beautiful sunset over mountains",
-    "n": 1,
-    "aspectRatio": "16:9"
+    "instances": [{
+      "prompt": "A beautiful sunset over mountains"
+    }],
+    "parameters": {
+      "sampleCount": 1,
+      "aspectRatio": "16:9"
+    }
   }'
 ```
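
For reference, a Python equivalent of the new curl example, written as a sketch: it assumes the requests and google-auth packages, authenticates with the service-account key directly instead of gcloud, and names the response field that Imagen predictions normally carry.

```python
# Sketch: call the Vertex AI :predict endpoint with a service-account token.
import base64

import requests
from google.auth.transport.requests import Request
from google.oauth2 import service_account

creds = service_account.Credentials.from_service_account_file(
    "/path/to/your/vertex-ai-key.json",   # placeholder path
    scopes=["https://www.googleapis.com/auth/cloud-platform"],
)
creds.refresh(Request())

url = (
    "https://us-central1-aiplatform.googleapis.com/v1/projects/"
    f"{creds.project_id}/locations/us-central1/publishers/google/models/"
    "imagen-4.0-generate-preview-06-06:predict"
)
body = {
    "instances": [{"prompt": "A beautiful sunset over mountains"}],
    "parameters": {"sampleCount": 1, "aspectRatio": "16:9"},
}
resp = requests.post(url, json=body, headers={"Authorization": f"Bearer {creds.token}"})
resp.raise_for_status()

# Imagen predictions return base64-encoded image bytes.
image_b64 = resp.json()["predictions"][0]["bytesBase64Encoded"]
with open("sunset.png", "wb") as fh:
    fh.write(base64.b64decode(image_b64))
```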

docs/openai_images_api_guide.md
Lines changed: 2 additions & 2 deletions

@@ -36,7 +36,7 @@ Creates an image given a prompt. [Learn more](https://platform.openai.com/docs/g
 ```bash
 curl https://api.openai.com/v1/images/generations \
   -H "Content-Type: application/json" \
-  -H "Authorization: Bearer $OPENAI_API_KEY" \
+  -H "Authorization: Bearer $PROVIDERS__OPENAI__API_KEY" \
   -d '{
     "model": "gpt-image-1",
     "prompt": "A cute baby sea otter",
@@ -84,7 +84,7 @@ Creates an edited or extended image given source images and a prompt. Only suppo
 
 ```bash
 curl -X POST "https://api.openai.com/v1/images/edits" \
-  -H "Authorization: Bearer $OPENAI_API_KEY" \
+  -H "Authorization: Bearer $PROVIDERS__OPENAI__API_KEY" \
   -F "model=gpt-image-1" \
   -F "image[]=@body-lotion.png" \
   -F "image[]=@bath-bomb.png" \

gpt_image_mcp/config/settings.py
Lines changed: 3 additions & 10 deletions

@@ -272,6 +272,9 @@ class Settings(BaseSettings):
         env_file_alternates=[".env.local", ".env.production"],
     )
 
+    # Provider settings (main structure for environment variables)
+    providers: ProvidersSettings = Field(default_factory=ProvidersSettings)
+
     # Direct settings for backwards compatibility
     openai: OpenAISettings | None = Field(default=None)
     gemini: GeminiSettings | None = Field(default=None)
@@ -285,16 +288,6 @@ def from_env(cls):
         """Load settings from environment variables and .env files."""
         return cls()
 
-    # Backward compatibility property for providers access
-    @property
-    def providers(self) -> ProvidersSettings:
-        """Provide access to providers settings for backwards compatibility."""
-        return ProvidersSettings(
-            openai=self.openai,
-            gemini=self.gemini,
-            enabled_providers=self._get_enabled_providers(),
-            default_provider=self._get_default_provider(),
-        )
 
     def _get_enabled_providers(self) -> list[str]:
         """Get list of enabled providers."""
