[2021 ICLR] Denoising Diffusion Implicit Models #202

Jasonlee1995 · 2024-05-30T04:31:11Z

DDPM = Denoising Diffusion Probabilistic Model
DDIM = Denoising Diffusion Implicit Model

DDPM can generate high-quality images without adversarial training, but it has one critical drawback: generation is slow

DDPM is slow because it produces a sample by simulating a Markov chain step by step
(generative process approximates the reverse of the forward diffusion process)

To speed up generation, the paper proposes DDIM

DDIM is trained with exactly the same objective function as DDPM, but unlike DDPM it is an implicit probabilistic model
(implicit generative model : use a latent variable and transform it using a deterministic function)

The paper's three main contributions are as follows

It defines a family of non-Markovian forward processes and shows that the corresponding variational training objective is the same as the DDPM surrogate objective
DDIM : the random noise ($\sigma$) in the non-Markovian process is set to 0, so sampling is deterministic
DDIM is more sample-efficient than DDPM and, unlike DDPM, keeps consistency between the latent and the generated image

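In short, the paper comes down to two things: the finding that a model trained as a DDPM has in fact also learned the generative processes for non-Markovian forward processes, and the observation that the deterministic DDIM is sample-efficient and consistent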

Below is a brief summary of the parts I consider important

1. Variational Inference for Non-Markovian Forward Process

background

The larger T is, the closer the reverse process is to a Gaussian, so choosing a large T value matters in diffusion models
DDPM uses T=1000
However, all T iterations have to be run sequentially to generate a single sample

Intuition
The authors observed that the DDPM objective $L_{\gamma}$ depends only on the marginals $q(\mathbf{x}_{t} | \mathbf{x}_{0})$, which led them to consider non-Markovian forward processes
(only depends on $q(\mathbf{x}_{t} | \mathbf{x}_{0})$, but not directly on $q(\mathbf{x}_{1:T} | \mathbf{x}_{0})$ )
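
A minimal sketch of this observation, assuming a NumPy setting: a Monte Carlo estimate of $L_{\gamma}$ only ever draws $\mathbf{x}_t$ from the marginal $q(\mathbf{x}_{t} | \mathbf{x}_{0})$ and never needs the joint over $\mathbf{x}_{1:T}$. The names `l_gamma`, `eps_model` and `oracle` are hypothetical, and `alphas` follows the DDIM paper's convention where $\alpha_t$ is the cumulative signal level.

```python
import numpy as np

def l_gamma(eps_model, x0, alphas, gammas, rng):
    """Monte Carlo estimate of sum_t gamma_t * E||eps_model(x_t, t) - eps||^2.

    alphas[t-1] is alpha_t (cumulative, i.e. alpha-bar in DDPM notation).
    """
    total = 0.0
    for t, (a_t, g_t) in enumerate(zip(alphas, gammas), start=1):
        eps = rng.standard_normal(x0.shape)
        # Sample directly from the marginal q(x_t | x_0); no chain is simulated.
        x_t = np.sqrt(a_t) * x0 + np.sqrt(1.0 - a_t) * eps
        err = eps_model(x_t, t) - eps
        total += g_t * np.mean(np.sum(err ** 2, axis=-1))
    return total

# Toy check: if the data is fixed at x0 = 0, then x_t = sqrt(1 - alpha_t) * eps,
# so a hypothetical oracle can recover eps exactly and the loss vanishes.
alphas = np.linspace(0.99, 0.01, 10)
oracle = lambda x_t, t: x_t / np.sqrt(1.0 - alphas[t - 1])
x0 = np.zeros((4, 2))
loss_oracle = l_gamma(oracle, x0, alphas, np.ones(10), np.random.default_rng(0))
loss_zero = l_gamma(lambda x_t, t: np.zeros_like(x_t), x0, alphas, np.ones(10),
                    np.random.default_rng(0))
```

Any forward process with the same marginals would give the same estimate, which is the opening the paper uses.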

1.1. Non-Markovian forward processes

mean function detail

What this section does is construct a non-Markovian forward process that leads to the same objective
Apart from the Markov property, it is set up to keep most of the original properties
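
For reference, the construction in the paper can be written as below, as I read it. Here $\alpha_t$ is the paper's notation for the cumulative quantity that DDPM writes as $\bar{\alpha}_t$, and $\sigma \in \mathbb{R}^T_{\ge 0}$ indexes the family.

$$
q_{\sigma}(\mathbf{x}_{1:T} \mid \mathbf{x}_0) := q_{\sigma}(\mathbf{x}_T \mid \mathbf{x}_0) \prod_{t=2}^{T} q_{\sigma}(\mathbf{x}_{t-1} \mid \mathbf{x}_t, \mathbf{x}_0),
\qquad
q_{\sigma}(\mathbf{x}_T \mid \mathbf{x}_0) = \mathcal{N}\left(\sqrt{\alpha_T}\,\mathbf{x}_0,\ (1-\alpha_T)\mathbf{I}\right)
$$

$$
q_{\sigma}(\mathbf{x}_{t-1} \mid \mathbf{x}_t, \mathbf{x}_0) = \mathcal{N}\left(\sqrt{\alpha_{t-1}}\,\mathbf{x}_0 + \sqrt{1-\alpha_{t-1}-\sigma_t^2}\cdot\frac{\mathbf{x}_t - \sqrt{\alpha_t}\,\mathbf{x}_0}{\sqrt{1-\alpha_t}},\ \sigma_t^2\mathbf{I}\right)
$$

The mean is chosen so that the marginal stays $q_{\sigma}(\mathbf{x}_t \mid \mathbf{x}_0) = \mathcal{N}(\sqrt{\alpha_t}\,\mathbf{x}_0, (1-\alpha_t)\mathbf{I})$ for every $t$, which is the property the DDPM objective relies on.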

1.2. Generative process and unified variational inference objective

This is not a complete proof, but the takeaway is that a mean-matching term appears, which gives the same form as DDPM
A more detailed proof is in Appendix B of the paper

What this section is ultimately saying is the following
Theorem 1 states that the variational objective $J_{\sigma}$ of the non-Markovian forward process can be expressed as the DDPM variational objective $L_{\gamma}$ (up to a constant), regardless of the choice of $\sigma$
Because the losses have the same form, DDPM training in effect already covers the non-Markovian forward processes
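
Written out, as I understand the statement of Theorem 1: for every $\sigma > 0$ there exist weights $\gamma \in \mathbb{R}^T_{>0}$ and a constant $C \in \mathbb{R}$ such that

$$
J_{\sigma}(\epsilon_\theta) = L_{\gamma}(\epsilon_\theta) + C,
\qquad
L_{\gamma}(\epsilon_\theta) := \sum_{t=1}^{T} \gamma_t\, \mathbb{E}_{\mathbf{x}_0,\ \epsilon_t \sim \mathcal{N}(\mathbf{0}, \mathbf{I})}\left[ \left\lVert \epsilon_\theta^{(t)}\left(\sqrt{\alpha_t}\,\mathbf{x}_0 + \sqrt{1-\alpha_t}\,\epsilon_t\right) - \epsilon_t \right\rVert_2^2 \right]
$$

The choice of $\sigma$ only changes the weights $\gamma$ and the constant $C$, not the form of the loss.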

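The authors also use this to justify the simplified objective $L_{1}$ used in DDPM, under the assumption that model parameters are not shared across timesteps t
(if parameters are not shared, each term can be optimized separately, so the optimum does not depend on $\gamma$ and $L_{\gamma}$ can be simplified to $L_{1}$)
In practice, however, model parameters are shared across timesteps t, so this is best read as a hypothetical scenario
The authors' OpenReview reply does not add much on this point either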

2. Sampling from Generalized Generative Processes

To restate the point made above:

A model trained with the simplified objective used in DDPM has learned not only the generative process for the Markovian forward process, but also the generative processes for the non-Markovian forward processes parameterized by $\sigma$

Therefore a pre-trained DDPM model can be used as is

2.1. Denoising diffusion implicit models

generation process : $p_{\theta}^{(t)}(\mathbf{x}_{t-1} | \mathbf{x}_t)$
Given $\mathbf{x}_t$, first predict $\mathbf{x}_0$, then use that prediction to obtain $\mathbf{x}_{t-1}$; this gives the expression above

When $\sigma_t = 0$, the forward process is deterministic
(given $\mathbf{x}_0$ and $\mathbf{x}_{t}$, $\mathbf{x}_{t-1}$ is fully determined)
Hence the model with $\sigma_t = 0$ is an implicit probabilistic model
The authors call it the denoising diffusion implicit model (DDIM)
(implicit probabilistic model trained with DDPM objective, despite the forward process no longer being a diffusion)
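
The predict-$\mathbf{x}_0$-then-step procedure can be sketched as below. This is a minimal NumPy illustration rather than the authors' code: `ddim_step` and its arguments are hypothetical names, `eps_pred` stands for the network output $\epsilon_\theta^{(t)}(\mathbf{x}_t)$, and `a_t`, `a_prev` are the cumulative $\alpha_t$, $\alpha_{t-1}$.

```python
import numpy as np

def ddim_step(x_t, eps_pred, a_t, a_prev, sigma_t=0.0, rng=None):
    """One generalized step x_t -> x_{t-1}; sigma_t = 0 is the deterministic DDIM update."""
    # 1) Predict x_0 from x_t and the predicted noise.
    x0_pred = (x_t - np.sqrt(1.0 - a_t) * eps_pred) / np.sqrt(a_t)
    # 2) Direction pointing back towards x_t.
    dir_xt = np.sqrt(1.0 - a_prev - sigma_t ** 2) * eps_pred
    # 3) Fresh Gaussian noise, only when sigma_t > 0.
    noise = 0.0
    if sigma_t > 0.0:
        if rng is None:
            rng = np.random.default_rng()
        noise = sigma_t * rng.standard_normal(x_t.shape)
    return np.sqrt(a_prev) * x0_pred + dir_xt + noise
```

With `sigma_t=0.0` the output is a pure function of `x_t`, which is why the same $\mathbf{x}_T$ always maps to the same image.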

2.2. Accelerated generation process

From the results of the section above, training with the DDPM objective does not depend on the specific form of $q(\mathbf{x}_t | \mathbf{x}_{t-1})$
(it only requires $q(\mathbf{x}_t | \mathbf{x}_{0})$ to be well defined)

To speed up generation, a sub-sequence of the forward process with length smaller than T is selected,
and the reversed sub-sequence is used as the sampling trajectory to generate a sample from the latent

For choosing the sub-sequence, the paper uses two schemes: linear and quadratic
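
A minimal sketch of the two schemes, assuming the forms $\tau_i = \lfloor c\,i \rfloor$ (linear) and $\tau_i = \lfloor c\,i^2 \rfloor$ (quadratic) with $c$ chosen so the last step lands at $T$. The function name `make_tau` and this particular choice of $c$ are my own assumptions, not taken from the authors' code.

```python
import numpy as np

def make_tau(T, S, kind="linear"):
    """Return an increasing sub-sequence of at most S timesteps from {1, ..., T}."""
    i = np.arange(1, S + 1)
    if kind == "linear":
        tau = np.floor(i * (T / S))
    elif kind == "quadratic":
        tau = np.floor(i ** 2 * (T / S ** 2))
    else:
        raise ValueError(f"unknown kind: {kind}")
    # Clip into the valid range and drop duplicates that flooring may create.
    return np.unique(np.clip(tau, 1, T).astype(int))

tau_linear = make_tau(1000, 10, "linear")
tau_quadratic = make_tau(1000, 10, "quadratic")
sampling_trajectory = tau_linear[::-1]  # sampling walks the sub-sequence in reverse
```

The quadratic scheme places more of its steps at small t, where the noise level is low.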

2.3. Relevance to neural ODEs

DDIM can be viewed as Euler integration for solving a particular ordinary differential equation (ODE)
This means that, with enough discretization steps, the generation process can be reversed
To spell it out a bit more: unlike DDPM, DDIM can obtain an encoding $\mathbf{x}_T$ of a given observation
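
A toy sketch of encoding and reconstruction. To keep it self-contained it assumes Gaussian data $\mathbf{x}_0 \sim \mathcal{N}(0, s^2)$, for which the optimal noise prediction has a closed form; `eps_model` is that closed form standing in for a trained network, and `ddim_move`, `encode_decode` and the linear `alphas_full` schedule are hypothetical choices for illustration only.

```python
import numpy as np

S2 = 4.0  # variance of the toy data distribution x_0 ~ N(0, S2)

def eps_model(x, a):
    # Closed-form E[eps | x_t] for Gaussian data; a stand-in for a trained network.
    return np.sqrt(1.0 - a) * x / (a * S2 + 1.0 - a)

def ddim_move(x, a_from, a_to):
    # Deterministic (sigma = 0) update between two noise levels, in either direction.
    eps = eps_model(x, a_from)
    x0_pred = (x - np.sqrt(1.0 - a_from) * eps) / np.sqrt(a_from)
    return np.sqrt(a_to) * x0_pred + np.sqrt(1.0 - a_to) * eps

def encode_decode(x0, alphas):
    x = x0
    for a_from, a_to in zip(alphas[:-1], alphas[1:]):  # encode: x_0 -> x_T
        x = ddim_move(x, a_from, a_to)
    x_T = x
    rev = alphas[::-1]
    for a_from, a_to in zip(rev[:-1], rev[1:]):         # decode: x_T -> x_0
        x = ddim_move(x, a_from, a_to)
    return x_T, x

alphas_full = np.linspace(1.0, 0.01, 1001)  # index 0 means "no noise"
x0 = np.array([1.5, -0.7])
_, rec_coarse = encode_decode(x0, alphas_full[::100])  # 10 steps
_, rec_fine = encode_decode(x0, alphas_full)           # 1000 steps
err_coarse = np.abs(rec_coarse - x0).max()
err_fine = np.abs(rec_fine - x0).max()
```

In this toy setting the finer discretization reconstructs $\mathbf{x}_0$ more accurately, which matches the "enough discretization steps" condition.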

Compared with the earlier work, Score-based generative modeling through stochastic differential equations, the difference is
Earlier work : Euler steps with respect to $dt$
DDIM : Euler steps with respect to $d\sigma(t)$ → depend less directly on the scaling of time $t$

3. Experiments

setup detail

A pre-trained DDPM is used as is, while the sampling trajectory $\tau$ (number of iteration steps) and the stochasticity $\sigma$ are varied
The $\sigma$ values used in DDPM serve as the comparison baseline
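
The stochasticity is controlled in the paper through a scalar $\eta$ that interpolates between DDIM ($\eta = 0$) and the DDPM variance ($\eta = 1$). A minimal sketch of that formula, with the hypothetical name `sigma_eta` and cumulative $\alpha$ values as inputs:

```python
import numpy as np

def sigma_eta(a_t, a_prev, eta):
    """sigma for a step from level alpha_t to alpha_prev; eta=0 -> DDIM, eta=1 -> DDPM."""
    return eta * np.sqrt((1.0 - a_prev) / (1.0 - a_t)) * np.sqrt(1.0 - a_t / a_prev)

sigma_ddim = sigma_eta(0.5, 0.8, eta=0.0)
sigma_ddpm = sigma_eta(0.5, 0.8, eta=1.0)
```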

3.1. Sample quality and efficiency

Table 1
More iteration steps give better sample quality → trade-off between sample quality and computational costs
DDIM performs well even with few iteration steps → sample efficient

Figure 3
The $\sigma$ used in DDPM + few iteration steps → sample quality deteriorates rapidly

Figure 4
As the length of the sample trajectory grows, the time to generate a sample grows linearly

3.2. Sample consistency in DDIMs

Figure 5
Images are generated with various trajectories while $\mathbf{x}_T$ is kept fixed
Regardless of the generative trajectory, most high-level features are similar
This shows that $\mathbf{x}_T$ is an informative latent encoding of the image

3.3. Interpolation in deterministic generative processes

Figure 6
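Interpolating between two $\mathbf{x}_T$ latents (the paper's appendix uses spherical linear interpolation) → semantically meaningful interpolations between two samples
In other words, DDIM allows image generation to be controlled through the latent variables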
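
A minimal sketch of interpolating two latents, assuming spherical linear interpolation (slerp), which is what the paper's appendix describes as far as I can tell. The function name `slerp` and the fallback for nearly parallel vectors are my own additions.

```python
import numpy as np

def slerp(z0, z1, lam):
    """Spherical linear interpolation between two latents, lam in [0, 1]."""
    a, b = z0.ravel(), z1.ravel()
    cos_theta = np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b))
    theta = np.arccos(np.clip(cos_theta, -1.0, 1.0))
    if np.isclose(np.sin(theta), 0.0):
        # Nearly parallel latents: plain linear interpolation is fine here.
        return (1.0 - lam) * z0 + lam * z1
    return (np.sin((1.0 - lam) * theta) * z0 + np.sin(lam * theta) * z1) / np.sin(theta)

z0 = np.array([1.0, 0.0])
z1 = np.array([0.0, 1.0])
z_mid = slerp(z0, z1, 0.5)
```

Each interpolated latent would then be decoded with the deterministic DDIM process. Unlike plain linear interpolation, slerp keeps the norm of the midpoint close to that of the endpoints, which matters for high-dimensional Gaussian latents.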

3.4. Reconstruction from latent space

Table 2
DDIM can be viewed as Euler integration of a particular ODE
To check this, $\mathbf{x}_T$ is encoded from $\mathbf{x}_0$, and the resulting $\mathbf{x}_T$ is then used to reconstruct $\mathbf{x}_0$
DDIM has lower reconstruction error as the number of steps grows, and has properties similar to Neural ODEs and normalizing flows

Jasonlee1995 added the labels Generative (Generative Modeling) and Vision (Related with Computer Vision tasks) on May 30, 2024