Commit 6051920 (initial commit, "release")

24 files changed: +4,045 −0 lines

.gitignore (+3)

```
.DS_Store
__pycache__/
```
README.md (+76)
# improved-diffusion

This is the codebase for "Improved Denoising Diffusion Probabilistic Models".

# Usage

This section of the README walks through how to train and sample from a model.
## Installation

Clone this repository and navigate to it in your terminal. Then run:

```
pip install -e .
```

This should install the `improved_diffusion` Python package that the scripts depend on.
## Preparing Data

The training code reads images from a directory of image files. In the [datasets](datasets) folder, we have provided instructions/scripts for preparing these directories for ImageNet, LSUN bedrooms, and CIFAR-10.

To create your own dataset, simply dump all of your images into a directory with ".jpg", ".jpeg", or ".png" extensions. If you wish to train a class-conditional model, name the files like "mylabel1_XXX.jpg", "mylabel2_YYY.jpg", etc., so that the data loader knows that "mylabel1" and "mylabel2" are the labels. Subdirectories are enumerated automatically as well, so the images can be organized into a recursive structure (although the directory names are ignored, and only the underscore-delimited filename prefixes are used as labels; see the sketch below).

The images will automatically be scaled and center-cropped by the data-loading pipeline. Simply pass `--data_dir path/to/images` to the training script, and it will take care of the rest.
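As a rough illustration of this naming convention (not the codebase's actual loader, which lives elsewhere in the repository; `label_from_filename` is a hypothetical helper), a label could be recovered from a filename like so:

```python
import os

def label_from_filename(path):
    # "mylabel1_XXX.jpg" -> "mylabel1": the label is everything before
    # the first underscore in the base filename.
    return os.path.basename(path).split("_")[0]

print(label_from_filename("path/to/images/mylabel1_0001.jpg"))  # mylabel1
```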
## Training

To train your model, you should first decide on some hyperparameters. We split the hyperparameters into three groups: model architecture, diffusion process, and training flags. Here are some reasonable defaults for a baseline:

```
MODEL_FLAGS="--image_size 64 --num_channels 128 --num_res_blocks 3"
DIFFUSION_FLAGS="--diffusion_steps 4000 --noise_schedule linear"
TRAIN_FLAGS="--lr 1e-4 --batch_size 128"
```

Here are some changes we experiment with, and how to set them in the flags:

* **Learned sigmas:** add `--learn_sigma True` to `MODEL_FLAGS`.
* **Cosine schedule:** change `--noise_schedule linear` to `--noise_schedule cosine`.
* **Reweighted VLB:** add `--use_kl True` to `DIFFUSION_FLAGS` and add `--schedule_sampler loss-second-moment` to `TRAIN_FLAGS`.
* **Class-conditional:** add `--class_cond True` to `MODEL_FLAGS`.
Once you have set up your hyperparameters, you can run an experiment like so:

```
python scripts/image_train.py --data_dir path/to/images $MODEL_FLAGS $DIFFUSION_FLAGS $TRAIN_FLAGS
```

You may also want to train in a distributed manner. In this case, run the same command with `mpiexec`:

```
mpiexec -n $NUM_GPUS python scripts/image_train.py --data_dir path/to/images $MODEL_FLAGS $DIFFUSION_FLAGS $TRAIN_FLAGS
```

When training in a distributed manner, you must manually divide the `--batch_size` argument by the number of ranks; for example, to keep an effective batch size of 128 across 4 GPUs, pass `--batch_size 32`. In lieu of distributed training, you may use `--microbatch 16` (or `--microbatch 1` in extreme memory-limited cases) to reduce memory usage.

The logs and saved models will be written to a logging directory determined by the `OPENAI_LOGDIR` environment variable. If it is not set, a temporary directory will be created under `/tmp`.
## Sampling

The above training script saves checkpoints to `.pt` files in the logging directory. These checkpoints will have names like `ema_0.9999_200000.pt` and `model200000.pt`. You will likely want to sample from the EMA models, since those produce much better samples.

Once you have a path to your model, you can generate a large batch of samples like so:

```
python scripts/image_sample.py --model_path /path/to/model.pt $MODEL_FLAGS $DIFFUSION_FLAGS
```

Again, this will save results to a logging directory. Samples are saved as a large `npz` file, where `arr_0` in the file is a large batch of samples.
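As a quick sanity check, you can inspect the saved batch with NumPy; the filename below is a placeholder for whatever the sampling script writes to your logging directory:

```python
import numpy as np

data = np.load("samples.npz")  # placeholder path; use the .npz in $OPENAI_LOGDIR
samples = data["arr_0"]        # the batch of sampled images
print(samples.shape, samples.dtype)
```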
Just like for training, you can run `image_sample.py` through MPI to use multiple GPUs and machines.

You can change the number of sampling steps using the `--timestep_respacing` argument. For example, `--timestep_respacing 250` uses 250 steps to sample. Passing `--timestep_respacing ddim250` is similar, but uses the uniform stride from the [DDIM paper](https://arxiv.org/abs/2010.02502) rather than our stride.
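To make the uniform-stride idea concrete, here is a rough arithmetic sketch; this illustrates the concept only and is not the codebase's actual respacing logic:

```python
# Illustrative only: evenly subsample 4000 diffusion steps down to 250
# sampling steps with a uniform stride (the flavor of stride DDIM uses).
num_diffusion_steps = 4000
num_sample_steps = 250

stride = num_diffusion_steps // num_sample_steps  # 16
timesteps = list(range(0, num_diffusion_steps, stride))
assert len(timesteps) == num_sample_steps
```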
To sample using [DDIM](https://arxiv.org/abs/2010.02502), pass `--use_ddim True`.

datasets/README.md (+37)
# Downloading datasets

This directory includes instructions and scripts for downloading ImageNet, LSUN bedrooms, and CIFAR-10 for use in this codebase.

## ImageNet-64

To download unconditional ImageNet-64, go to [this page on image-net.org](http://www.image-net.org/small/download.php) and click on "Train (64x64)". Simply download the file and unzip it, and use the resulting directory as the data directory (the `--data_dir` argument for the training script).
## Class-conditional ImageNet

For our class-conditional models, we use the official ILSVRC2012 dataset with manual center cropping and downsampling. To obtain this dataset, navigate to [this page on image-net.org](http://www.image-net.org/challenges/LSVRC/2012/downloads) and sign in (or create an account if you do not already have one). Then click on the link reading "Training images (Task 1 & 2)". This is a 138GB tar file containing 1000 sub-tar files, one per class.

Once the file is downloaded, extract it and look inside. You should see 1000 `.tar` files, each of which needs to be extracted in turn; doing so by hand is impractical. To automate the process on a Unix-based system, you can `cd` into the directory and run this short shell script:

```
for file in *.tar; do tar xf "$file"; rm "$file"; done
```

This will extract and remove each tar file in turn.

Once all of the images have been extracted, the resulting directory should be usable as a data directory (the `--data_dir` argument for the training script). The filenames should all start with a WNID (class ID) followed by an underscore, like `n01440764_2708.JPEG`. Conveniently (and not by accident), this is exactly how the automated data loader expects to discover class labels.
## CIFAR-10

For CIFAR-10, we created a script, [cifar10.py](cifar10.py), that creates `cifar_train` and `cifar_test` directories. These directories contain files named like `truck_49997.png`, so that the class name is discernible to the data loader.

The `cifar_train` and `cifar_test` directories can be passed directly to the training scripts via the `--data_dir` argument.
## LSUN bedroom
30+
31+
To download and pre-process LSUN bedroom, clone [fyu/lsun](https://github.com/fyu/lsun) on GitHub and run their download script `python3 download.py bedroom`. The result will be an "lmdb" database named like `bedroom_train_lmdb`. You can pass this to our [lsun_bedroom.py](lsun_bedroom.py) script like so:
32+
33+
```
34+
python lsun_bedroom.py bedroom_train_lmdb lsun_train_output_dir
35+
```
36+
37+
This creates a directory called `lsun_train_output_dir`. This directory can be passed to the training scripts via the `--data_dir` argument.

datasets/cifar10.py (+43)
```python
import os
import tempfile

import torchvision
from tqdm.auto import tqdm

# CIFAR-10 class names, indexed by label. These become filename prefixes
# so that the data loader can recover the class of each image.
CLASSES = (
    "plane",
    "car",
    "bird",
    "cat",
    "deer",
    "dog",
    "frog",
    "horse",
    "ship",
    "truck",
)


def main():
    for split in ["train", "test"]:
        out_dir = f"cifar_{split}"
        if os.path.exists(out_dir):
            print(f"skipping split {split} since {out_dir} already exists.")
            continue

        print("downloading...")
        with tempfile.TemporaryDirectory() as tmp_dir:
            # CIFAR-10 is loaded fully into memory, so the temporary
            # download directory can be deleted before dumping images.
            dataset = torchvision.datasets.CIFAR10(
                root=tmp_dir, train=split == "train", download=True
            )

        print("dumping images...")
        os.mkdir(out_dir)
        for i in tqdm(range(len(dataset))):
            image, label = dataset[i]
            filename = os.path.join(out_dir, f"{CLASSES[label]}_{i:05d}.png")
            image.save(filename)


if __name__ == "__main__":
    main()
```

datasets/lsun_bedroom.py (+54)
```python
"""
Convert an LSUN lmdb database into a directory of images.
"""

import argparse
import io
import os

from PIL import Image
import lmdb
import numpy as np


def read_images(lmdb_path, image_size):
    env = lmdb.open(lmdb_path, map_size=1099511627776, max_readers=100, readonly=True)
    with env.begin(write=False) as transaction:
        cursor = transaction.cursor()
        for _, webp_data in cursor:
            img = Image.open(io.BytesIO(webp_data))
            width, height = img.size
            # Resize so that the short side equals image_size...
            scale = image_size / min(width, height)
            img = img.resize(
                (int(round(scale * width)), int(round(scale * height))),
                resample=Image.BOX,
            )
            # ...then center-crop to image_size x image_size.
            arr = np.array(img)
            h, w, _ = arr.shape
            h_off = (h - image_size) // 2
            w_off = (w - image_size) // 2
            arr = arr[h_off : h_off + image_size, w_off : w_off + image_size]
            yield arr


def dump_images(out_dir, images, prefix):
    if not os.path.exists(out_dir):
        os.mkdir(out_dir)
    for i, img in enumerate(images):
        Image.fromarray(img).save(os.path.join(out_dir, f"{prefix}_{i:07d}.png"))


def main():
    parser = argparse.ArgumentParser()
    parser.add_argument("--image-size", help="new image size", type=int, default=256)
    parser.add_argument("--prefix", help="class name", type=str, default="bedroom")
    parser.add_argument("lmdb_path", help="path to an LSUN lmdb database")
    parser.add_argument("out_dir", help="path to output directory")
    args = parser.parse_args()

    images = read_images(args.lmdb_path, args.image_size)
    dump_images(args.out_dir, images, args.prefix)


if __name__ == "__main__":
    main()
```

improved_diffusion/__init__.py (+3)

```python
"""
Codebase for "Improved Denoising Diffusion Probabilistic Models".
"""
```

improved_diffusion/dist_util.py (+82)
```python
"""
Helpers for distributed training.
"""

import io
import os
import socket

import blobfile as bf
from mpi4py import MPI
import torch as th
import torch.distributed as dist

# Change this to reflect your cluster layout.
# The GPU for a given rank is (rank % GPUS_PER_NODE).
GPUS_PER_NODE = 8

SETUP_RETRY_COUNT = 3


def setup_dist():
    """
    Set up a distributed process group.
    """
    if dist.is_initialized():
        return

    comm = MPI.COMM_WORLD
    backend = "gloo" if not th.cuda.is_available() else "nccl"

    if backend == "gloo":
        hostname = "localhost"
    else:
        hostname = socket.gethostbyname(socket.getfqdn())
    os.environ["MASTER_ADDR"] = comm.bcast(hostname, root=0)
    os.environ["RANK"] = str(comm.rank)
    os.environ["WORLD_SIZE"] = str(comm.size)

    port = comm.bcast(_find_free_port(), root=0)
    os.environ["MASTER_PORT"] = str(port)
    dist.init_process_group(backend=backend, init_method="env://")


def dev():
    """
    Get the device to use for torch.distributed.
    """
    if th.cuda.is_available():
        return th.device(f"cuda:{MPI.COMM_WORLD.Get_rank() % GPUS_PER_NODE}")
    return th.device("cpu")


def load_state_dict(path, **kwargs):
    """
    Load a PyTorch file without redundant fetches across MPI ranks.

    Rank 0 reads the file and broadcasts the raw bytes to all other ranks.
    """
    if MPI.COMM_WORLD.Get_rank() == 0:
        with bf.BlobFile(path, "rb") as f:
            data = f.read()
    else:
        data = None
    data = MPI.COMM_WORLD.bcast(data)
    return th.load(io.BytesIO(data), **kwargs)


def sync_params(params):
    """
    Synchronize a sequence of Tensors across ranks from rank 0.
    """
    for p in params:
        with th.no_grad():
            dist.broadcast(p, 0)


def _find_free_port():
    # Create the socket outside the try block so that the finally clause
    # never references an unbound variable if socket creation fails.
    s = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    try:
        s.bind(("", 0))
        s.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
        return s.getsockname()[1]
    finally:
        s.close()
```
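For orientation, here is a hypothetical sketch (not part of the commit) of how these helpers compose in a training script launched under MPI:

```python
# Hypothetical usage sketch, assuming a launch like:
#   mpiexec -n 8 python train.py
import torch.nn as nn

from improved_diffusion import dist_util

dist_util.setup_dist()  # one torch.distributed rank per MPI process
model = nn.Linear(16, 16).to(dist_util.dev())  # rank-appropriate GPU (or CPU)
dist_util.sync_params(model.parameters())  # broadcast rank 0's weights to all ranks
```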

improved_diffusion/fp16_util.py (+76)
```python
"""
Helpers to train with 16-bit precision.
"""

import torch.nn as nn
from torch._utils import _flatten_dense_tensors, _unflatten_dense_tensors


def convert_module_to_f16(l):
    """
    Convert primitive modules to float16.
    """
    if isinstance(l, (nn.Conv1d, nn.Conv2d, nn.Conv3d)):
        l.weight.data = l.weight.data.half()
        l.bias.data = l.bias.data.half()


def convert_module_to_f32(l):
    """
    Convert primitive modules to float32, undoing convert_module_to_f16().
    """
    if isinstance(l, (nn.Conv1d, nn.Conv2d, nn.Conv3d)):
        l.weight.data = l.weight.data.float()
        l.bias.data = l.bias.data.float()


def make_master_params(model_params):
    """
    Copy model parameters into a (differently-shaped) list of full-precision
    parameters.
    """
    master_params = _flatten_dense_tensors(
        [param.detach().float() for param in model_params]
    )
    master_params = nn.Parameter(master_params)
    master_params.requires_grad = True
    return [master_params]


def model_grads_to_master_grads(model_params, master_params):
    """
    Copy the gradients from the model parameters into the master parameters
    from make_master_params().
    """
    master_params[0].grad = _flatten_dense_tensors(
        [param.grad.data.detach().float() for param in model_params]
    )


def master_params_to_model_params(model_params, master_params):
    """
    Copy the master parameter data back into the model parameters.
    """
    # Without copying to a list, if a generator is passed, this will
    # silently not copy any parameters.
    model_params = list(model_params)

    for param, master_param in zip(
        model_params, unflatten_master_params(model_params, master_params)
    ):
        param.detach().copy_(master_param)


def unflatten_master_params(model_params, master_params):
    """
    Unflatten the master parameters to look like model_params.
    """
    return _unflatten_dense_tensors(master_params[0].detach(), model_params)


def zero_grad(model_params):
    """
    Zero the gradients of the given parameters in place.
    """
    for param in model_params:
        # Taken from https://pytorch.org/docs/stable/_modules/torch/optim/optimizer.html#Optimizer.add_param_group
        if param.grad is not None:
            param.grad.detach_()
            param.grad.zero_()
```
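Below is a hypothetical sketch (not part of the commit) of the fp16 master-weights training step these helpers support. It assumes a CUDA device, since fp16 convolutions target the GPU; the toy model and hyperparameters are illustrative only:

```python
# Hypothetical usage sketch of the fp16 master-weights pattern.
import torch as th
import torch.nn as nn

from improved_diffusion import fp16_util

model = nn.Conv2d(3, 8, 3).cuda()
fp16_util.convert_module_to_f16(model)  # weights and bias become fp16
model_params = list(model.parameters())

# The optimizer works on a single flattened fp32 copy of the weights.
master_params = fp16_util.make_master_params(model_params)
opt = th.optim.Adam(master_params, lr=1e-4)

x = th.randn(1, 3, 16, 16, device="cuda").half()
loss = model(x).float().mean()

fp16_util.zero_grad(model_params)
loss.backward()
fp16_util.model_grads_to_master_grads(model_params, master_params)  # fp16 -> fp32
opt.step()
fp16_util.master_params_to_model_params(model_params, master_params)  # fp32 -> fp16
```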
