# OpenAssistant Inference

Preliminary implementation of the inference engine for OpenAssistant. This is
strictly for local development, although you might have limited success using it
to self-host OA. There is no guarantee that this will not change in the future;
in fact, expect it to change.

## Development Variant 1 (docker compose)

Tail the logs:

```shell
docker compose logs -f \
    inference-server \
    inference-worker
```

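If the full backlog is noisy, the output can be bounded; `--tail` is a standard
`docker compose logs` option:

```shell
# Follow only the most recent 100 lines from each service.
docker compose logs -f --tail 100 inference-server inference-worker
```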
> **Note:** The compose file contains bind mounts that let you develop on the
> modules of the inference stack, and on the `oasst-shared` package, without
> rebuilding.

> **Note:** You can change the model by editing the `MODEL_CONFIG_NAME` variable
> in the `docker-compose.yaml` file. Valid model names can be found in
> [model_configs.py](../oasst-shared/oasst_shared/model_configs.py).

> **Note:** You can spin up any number of workers by adjusting the number of
> replicas of the `inference-worker` service to your liking.

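For example (a sketch, assuming the service keeps its `inference-worker` name
from the compose file), the replica count can also be set on the command line
with the standard `--scale` option instead of editing the file:

```shell
# Start the stack with three worker replicas; --scale overrides the
# replica count declared in the compose file.
docker compose up --detach --scale inference-worker=3
```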
> **Note:** Please wait for the `inference-text-generation-server` service to
> output `{"message":"Connected"}` before starting to chat.

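Rather than watching for that line by hand, you can filter for it. A small
sketch (the helper name is made up here; the service name is assumed from the
compose file):

```shell
# wait_for_connected: reads log lines from stdin and returns success as
# soon as the readiness message appears.
wait_for_connected() {
  grep -q '"message":"Connected"'
}

# Typical use: pipe the service logs through the helper, e.g.
#   docker compose logs -f inference-text-generation-server | wait_for_connected
```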
Run the text client and start chatting:

```bash
cd text-client
pip install -r requirements.txt
python __main__.py
# You'll soon see a `User:` prompt where you can type your messages.
```

## Distributed Testing