blackwell/README.md (+25 −31 lines)
@@ -10,8 +10,8 @@ The core libs for running unsloth which have dependencies on `CUDA` version are:
 - `bitsandbytes` - already has wheels built with `CUDA 12.8` so `pip install` should work out of the box
 - `triton` - requires `triton>=3.3.1`
 - `torch` - requires installing with `pip install torch --extra-index-url https://download.pytorch.org/whl/cu128`
-- `vllm` - safest is to use the nightly build: `uv pip install -U vllm --torch-backend=cu128 --extra-index-url https://wheels.vllm.ai/nightly`
-- `xformers` - as of 6/26, `xformers` wheels are not yet built with `sm100+` enabled as support was only recently [added](https://github.com/facebookresearch/xformers/commit/d9b3b6e2b38ca485c89507ef8ac1fbef2723cdfa) so will require a source build (see below).
+- `vllm` - vLLM 0.10.0 supports Blackwell now, but use CUDA 12.8: `uv pip install -U vllm --torch-backend=cu128`
+- `xformers` - (Optional) as of 6/26, `xformers` wheels are not yet built with `sm100+` enabled as support was only recently [added](https://github.com/facebookresearch/xformers/commit/d9b3b6e2b38ca485c89507ef8ac1fbef2723cdfa) so will require a source build (see below).
 
 ## Installation
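Taken together, the list above implies an install order. A minimal sketch, assembled only from the commands quoted in this diff (the quoting around the `triton` pin is added here so the shell does not parse `>=` as a redirect):

```bash
# Sketch of the Blackwell install order implied above (cu128 wheels throughout).
pip install torch --extra-index-url https://download.pytorch.org/whl/cu128
uv pip install -U vllm --torch-backend=cu128
uv pip install -U "triton>=3.3.1"   # quoted so >= is not treated as a shell redirect
pip install unsloth unsloth_zoo bitsandbytes
```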
@@ -38,7 +38,7 @@ The installation order is important, since we want the overwrite bundled depende
+Xformers is optional, but it is definitely faster and uses less memory. We'll use PyTorch's native SDPA if you do not want Xformers. Building Xformers from source might be slow, so beware!
 
 ```bash
 # First uninstall xformers installed by previous libraries
@@ -64,23 +74,11 @@ The installation order is important, since we want the overwrite bundled depende
 
 Note that we have to explicitly set `TORCH_CUDA_ARCH_LIST=12.0`.
 
-5) Update `triton`
-
-```bash
-uv pip install -U triton>=3.3.1
-```
-
-`triton>=3.3.1` is required for `Blackwell` support.
-
-6) `transformers`
-`transformers >= 4.53.0` breaks `unsloth` inference. Specifically, `transformers` with `gradient_checkpointing` enabled will automatically [switch off caching](https://github.com/huggingface/transformers/blob/67ddc82fbc7e52c6f42a395b4a6d278c55b77a39/src/transformers/modeling_layers.py#L52-L59).
-
-When using `unsloth`'s `FastLanguageModel` to `generate` directly after training with `use_cache=True`, this results in a mismatch between expected and actual outputs [here](https://github.com/unslothai/unsloth/blob/bfa6a3678e2fb8097c5ece41d095a8051f099db3/unsloth/models/llama.py#L939).
-
-The temporary solution is to switch off `gradient_checkpointing` (e.g., `model.disable_gradient_checkpointing()`) before generation if using `4.53.0`, or stick with `4.52.4` for now:
+5) `transformers`
+Install any transformers version, but best to get the latest.
 
 ```bash
-uv pip install -U transformers==4.52.4
+uv pip install -U transformers
 ```
 
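Since this hunk drops the `4.52.4` pin, a quick post-install sanity check (illustrative, not part of the diff) is to confirm which `transformers` version actually resolved:

```bash
# Illustrative check after the unpinned install; any recent version should load.
python -c "import transformers; print(transformers.__version__)"
```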
@@ -112,7 +110,7 @@ The installation order is important, since we want the overwrite bundled depende
 Make sure you are inside the activated conda/mamba environment. You should see the name of your environment as a prefix in your terminal shell, like this: `(unsloth-blackwell)user@machine:`
 Note that we have to specify `cu128`, otherwise `vllm` will install `torch==2.7.0` but with `cu126`.
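To confirm the `cu128` pin took effect, you can check which CUDA version `torch` was built against (an illustrative check, not from the diff):

```bash
# Prints e.g. "2.7.0+cu128 12.8"; a cu126 build will not target sm_120 Blackwell parts.
python -c "import torch; print(torch.__version__, torch.version.cuda)"
```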
@@ -125,9 +123,11 @@ The installation order is important, since we want the overwrite bundled depende
 pip install unsloth unsloth_zoo bitsandbytes
 ```
 
-4) Download and build `xformers`
+4) Download and build `xformers` (Optional)
 
-Make sure you are inside the activated conda/mamba environment. You should see the name of your environment as a prefix in your terminal shell, like this: `(unsloth-blackwell)user@machine:`
+Xformers is optional, but it is definitely faster and uses less memory. We'll use PyTorch's native SDPA if you do not want Xformers. Building Xformers from source might be slow, so beware!
+
+You should see the name of your environment as a prefix in your terminal shell, like this: `(unsloth-blackwell)user@machine:`
 
 ```bash
 # First uninstall xformers installed by previous libraries
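The build steps themselves are cut off in this hunk. As a rough sketch of a from-source `xformers` build under the constraints this README states (the repo URL comes from the commit linked above; the clone and install flags are assumptions, not quoted from the PR):

```bash
# Rough sketch of the source build; TORCH_CUDA_ARCH_LIST=12.0 is the flag the
# README says must be set explicitly for Blackwell (sm_120).
pip uninstall -y xformers
export TORCH_CUDA_ARCH_LIST=12.0
git clone https://github.com/facebookresearch/xformers.git
cd xformers
git submodule update --init --recursive   # xformers vendors kernel deps as submodules
pip install -v --no-build-isolation .
```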
@@ -153,16 +153,10 @@ The installation order is important, since we want the overwrite bundled depende
 `triton>=3.3.1` is required for `Blackwell` support.
 
 6) `Transformers`
-`transformers >= 4.53.0` breaks `unsloth` inference. Specifically, `transformers` with `gradient_checkpointing` enabled will automatically [switch off caching](https://github.com/huggingface/transformers/blob/67ddc82fbc7e52c6f42a395b4a6d278c55b77a39/src/transformers/modeling_layers.py#L52-L59).
-
-When using `unsloth`'s `FastLanguageModel` to `generate` directly after training with `use_cache=True`, this results in a mismatch between expected and actual outputs [here](https://github.com/unslothai/unsloth/blob/bfa6a3678e2fb8097c5ece41d095a8051f099db3/unsloth/models/llama.py#L939).
-
-The temporary solution is to switch off `gradient_checkpointing` (e.g., `model.disable_gradient_checkpointing()`) before generation if using `4.53.0`, or stick with `4.52.4` for now:
-
-Make sure you are inside the activated conda/mamba environment. You should see the name of your environment as a prefix in your terminal shell, like this: `(unsloth-blackwell)user@machine:`
+Install any transformers version, but best to get the latest.
 
 ```bash
-pip install -U transformers==4.52.4
+uv pip install -U transformers
 ```
 
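As a final sanity check that the environment actually targets Blackwell (illustrative; not part of this diff):

```bash
# (12, 0) corresponds to sm_120 (RTX 50-series); data-center B200 reports (10, 0).
python -c "import torch; print(torch.cuda.get_device_capability())"
```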
@@ -171,7 +165,7 @@ If you are using mamba as your package just replace conda with mamba for all com
 
 ## WSL-Specific Notes
 
-If you're using WSL (Windows Subsystem for Linux) and encounter issues during xformers compilation, follow these additional steps:
+If you're using WSL (Windows Subsystem for Linux) and encounter issues during xformers compilation (reminder: Xformers is optional, but faster for training), follow these additional steps:
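The WSL steps themselves are not shown in this diff. One commonly needed measure (an assumption here, not taken from the PR) is capping build parallelism so the WSL VM is not OOM-killed mid-compile; PyTorch's C++ extension builder, which xformers uses, honors `MAX_JOBS`:

```bash
# Assumed example, not from this PR: limit parallel compile jobs for the
# xformers build so WSL's memory cap is not exceeded.
export MAX_JOBS=4
pip install -v --no-build-isolation .   # run from the xformers checkout
```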