
Commit 34a11b0: add hyper-parameters to README

1 parent 32b43b6

File tree: 1 file changed (+52, -0 lines)

README.md

@@ -74,3 +74,55 @@ Just like for training, you can run `image_sample.py` through MPI to use multiple GPUs.

You can change the number of sampling steps using the `--timestep_respacing` argument. For example, `--timestep_respacing 250` uses 250 steps to sample. Passing `--timestep_respacing ddim250` is similar, but uses the uniform stride from the [DDIM paper](https://arxiv.org/abs/2010.02502) rather than our stride.

To sample using [DDIM](https://arxiv.org/abs/2010.02502), pass `--use_ddim True`.
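For example, a hedged sketch of a 250-step DDIM sampling run (the checkpoint path is a placeholder, and the `scripts/image_sample.py` invocation with `--model_path` is assumed to match the sampling section earlier in this README):

```bash
# Placeholder checkpoint path; MODEL_FLAGS and DIFFUSION_FLAGS must match the trained model.
python scripts/image_sample.py --model_path /path/to/model.pt \
    $MODEL_FLAGS $DIFFUSION_FLAGS --timestep_respacing ddim250 --use_ddim True
```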
## Experiment hyper-parameters
This section includes run flags for training the main models in the paper. Note that the batch sizes are specified for single-GPU training, even though most of these runs will not naturally fit on a single GPU. To address this, either set `--microbatch` to a small value (e.g. 4) to train on one GPU, or run with MPI and divide `--batch_size` by the number of GPUs.
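For example, a hedged sketch of both options for the ImageNet-64 run below (assuming the `scripts/image_train.py` invocation from the training section earlier in this README; the data path and GPU count are placeholders):

```bash
# Option 1: a single GPU, keeping the full batch size but computing it 4 examples at a time.
TRAIN_FLAGS="--lr 1e-4 --batch_size 128 --microbatch 4"
python scripts/image_train.py --data_dir path/to/images $MODEL_FLAGS $DIFFUSION_FLAGS $TRAIN_FLAGS

# Option 2: 4 GPUs via MPI, dividing --batch_size (128 / 4 = 32 per process).
TRAIN_FLAGS="--lr 1e-4 --batch_size 32"
mpiexec -n 4 python scripts/image_train.py --data_dir path/to/images $MODEL_FLAGS $DIFFUSION_FLAGS $TRAIN_FLAGS
```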
Unconditional ImageNet-64 with our `L_hybrid` objective and cosine noise schedule:
```bash
MODEL_FLAGS="--image_size 64 --num_channels 128 --num_res_blocks 3 --learn_sigma True"
DIFFUSION_FLAGS="--diffusion_steps 4000 --noise_schedule cosine"
TRAIN_FLAGS="--lr 1e-4 --batch_size 128"
```
Unconditional CIFAR-10 with our `L_hybrid` objective and cosine noise schedule:
```bash
MODEL_FLAGS="--image_size 32 --num_channels 128 --num_res_blocks 3 --learn_sigma True --dropout 0.3"
DIFFUSION_FLAGS="--diffusion_steps 4000 --noise_schedule cosine"
TRAIN_FLAGS="--lr 1e-4 --batch_size 128"
```
Class-conditional ImageNet-64 model (270M parameters, trained for 250K iterations):
```bash
MODEL_FLAGS="--image_size 64 --num_channels 192 --num_res_blocks 3 --learn_sigma True --class_cond True"
DIFFUSION_FLAGS="--diffusion_steps 4000 --noise_schedule cosine --rescale_learned_sigmas False --rescale_timesteps False"
TRAIN_FLAGS="--lr 3e-4 --batch_size 2048"
```
Upsampling 256x256 model (280M parameters, trained for 500K iterations):
```bash
MODEL_FLAGS="--num_channels 192 --num_res_blocks 3 --learn_sigma True"
DIFFUSION_FLAGS="--diffusion_steps 4000 --noise_schedule linear --rescale_learned_sigmas False --rescale_timesteps False"
TRAIN_FLAGS="--lr 3e-4 --batch_size 256"
```
LSUN bedroom model (lr=1e-4):
```bash
MODEL_FLAGS="--image_size 256 --num_channels 128 --num_res_blocks 2 --num_heads 1 --learn_sigma True"
DIFFUSION_FLAGS="--diffusion_steps 1000 --noise_schedule linear --rescale_learned_sigmas False --rescale_timesteps False"
TRAIN_FLAGS="--lr 1e-4 --batch_size 128"
```
LSUN bedroom model (lr=2e-5):
```bash
MODEL_FLAGS="--image_size 256 --num_channels 128 --num_res_blocks 2 --num_heads 1 --learn_sigma True"
DIFFUSION_FLAGS="--diffusion_steps 1000 --noise_schedule linear --rescale_learned_sigmas False --rescale_timesteps False"
TRAIN_FLAGS="--lr 2e-5 --batch_size 128"
```
