Commit a18aa70

Chuseuiti, jcardenes-via, olliestanley, and Rudd-O authored

update inference readme (LAION-AI#2969)

Fixed the inference README with a working version and deleted much of the extra-variants information to simplify the document, since the variants are derivations of the same steps. The motivation: the inference-text-client container is not part of the docker compose file, nor does it exist in the main branch of the repository.

Co-authored-by: jcardenes <jcardenes@solvewithvia.com>
Co-authored-by: Oliver Stanley <olivergestanley@gmail.com>
Co-authored-by: Rudd-O <rudd-o@users.noreply.github.com>
1 parent ece0384 commit a18aa70

2 files changed: +13 −76 lines changed

2 files changed

+13
-76
lines changed

docker-compose.yaml (+2)

```diff
@@ -5,6 +5,8 @@ services:
 
 # Use `docker compose --profile frontend-dev up --build --attach-dependencies` to start the services needed to work on the frontend. If you want to also run the inference, add a second `--profile inference` argument.
 
+# If you update the containers used by the inference profile, please update inference/README.md. Thank you
+
 # The profile ci is used by CI automations. (i.e E2E testing)
 
 # This DB is for the FastAPI Backend.
```
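
Spelled out, the invocation that comment describes looks like this (a sketch built only from the flags the comment itself names):

```bash
# Start the frontend dev services plus the inference stack,
# as the docker-compose.yaml comment describes.
docker compose --profile frontend-dev --profile inference up --build --attach-dependencies
```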

inference/README.md (+11 −76)

```diff
@@ -2,7 +2,10 @@
 
 # OpenAssistant Inference
 
-Preliminary implementation of the inference engine for OpenAssistant.
+Preliminary implementation of the inference engine for OpenAssistant. This is
+strictly for local development, although you might find limited success for your
+self-hosting OA plan. There is no warranty that this will not change in the
+future — in fact, expect it to change.
 
 ## Development Variant 1 (docker compose)
 
```

````diff
@@ -30,99 +33,31 @@ Tail the logs:
 ```shell
 docker compose logs -f \
     inference-server \
-    inference-worker \
-    inference-text-client \
-    inference-text-generation-server
-```
-
-Attach to the text-client, and start chatting:
+    inference-worker
 
-```shell
-docker attach open-assistant-inference-text-client-1
 ```
 
-> **Note:** In the last step, `open-assistant-inference-text-client-1` refers to
-> the name of the `text-client` container started in step 2.
-
 > **Note:** The compose file contains the bind mounts enabling you to develop on
 > the modules of the inference stack, and the `oasst-shared` package, without
 > rebuilding.
 
+> **Note:** You can change the model by editing variable `MODEL_CONFIG_NAME` in
+> the `docker-compose.yaml` file. Valid model names can be found in
+> [model_configs.py](../oasst-shared/oasst_shared/model_configs.py).
+
 > **Note:** You can spin up any number of workers by adjusting the number of
 > replicas of the `inference-worker` service to your liking.
 
 > **Note:** Please wait for the `inference-text-generation-server` service to
 > output `{"message":"Connected"}` before starting to chat.
 
-## Development Variant 2 (tmux terminal multiplexing)
-
-Ensure you have `tmux` installed on you machine and the following packages
-installed into the Python environment;
-
-- `uvicorn`
-- `worker/requirements.txt`
-- `server/requirements.txt`
-- `text-client/requirements.txt`
-- `oasst_shared`
-
-You can run development setup to start the full development setup.
-
-```bash
-cd inference
-./full-dev-setup.sh
-```
-
-> Make sure to wait until the 2nd terminal is ready and says
-> `{"message":"Connected"}` before entering input into the last terminal.
-
-## Development Variant 3 (you'll need multiple terminals)
-
-Run a postgres container:
-
-```bash
-docker run --rm -it -p 5432:5432 -e POSTGRES_PASSWORD=postgres --name postgres postgres
-```
-
-Run a redis container (or use the one of the general docker compose file):
-
-```bash
-docker run --rm -it -p 6379:6379 --name redis redis
-```
-
-Run the inference server:
-
-```bash
-cd server
-pip install -r requirements.txt
-DEBUG_API_KEYS='0000,0001,0002' uvicorn main:app --reload
-```
-
-Run one (or more) workers:
-
-```bash
-cd worker
-pip install -r requirements.txt
-API_KEY=0000 python __main__.py
-
-# to add another worker, simply run
-API_KEY=0001 python __main__.py
-```
-
-For the worker, you'll also want to have the text-generation-inference server
-running:
-
-```bash
-docker run --rm -it -p 8001:80 -e MODEL_ID=distilgpt2 \
-    -v $HOME/.cache/huggingface:/root/.cache/huggingface \
-    --name text-generation-inference ghcr.io/yk/text-generation-inference
-```
-
-Run the text client:
+Run the text client and start chatting:
 
 ```bash
 cd text-client
 pip install -r requirements.txt
 python __main__.py
+# You'll soon see a `User:` prompt, where you can type your prompts.
 ```
 
 ## Distributed Testing
````
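
The added `MODEL_CONFIG_NAME` note is the one setting a reader has to act on. A minimal sketch of how you might locate the setting and the list of valid names, assuming only the paths the note itself gives (the diff does not show the YAML shape of the variable, so edit it by hand):

```bash
# Find where MODEL_CONFIG_NAME is set, then edit it by hand.
grep -n 'MODEL_CONFIG_NAME' docker-compose.yaml

# Skim the valid model names the note points to
# (path taken from the README's link, relative to the repo root).
less oasst-shared/oasst_shared/model_configs.py
```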

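For the replica note kept above, Compose's stock `--scale` flag is one way to run several workers without editing the file; a sketch, assuming the `inference` profile and the `inference-worker` service name used throughout the README:

```bash
# Bring the inference stack up with two worker replicas instead of one.
docker compose --profile inference up --build --scale inference-worker=2
```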