Commit 4414d36

Merge branch 'master' into img2img-enhance
2 parents c5f9f7c + 955df77

20 files changed: +391 -141 lines

README.md

+10 -11
@@ -13,9 +13,9 @@ A browser interface based on Gradio library for Stable Diffusion.
 - Prompt Matrix
 - Stable Diffusion Upscale
 - Attention, specify parts of text that the model should pay more attention to
-    - a man in a ((tuxedo)) - will pay more attention to tuxedo
-    - a man in a (tuxedo:1.21) - alternative syntax
-    - select text and press ctrl+up or ctrl+down to automatically adjust attention to selected text (code contributed by anonymous user)
+    - a man in a `((tuxedo))` - will pay more attention to tuxedo
+    - a man in a `(tuxedo:1.21)` - alternative syntax
+    - select text and press `Ctrl+Up` or `Ctrl+Down` to automatically adjust attention to selected text (code contributed by anonymous user)
 - Loopback, run img2img processing multiple times
 - X/Y/Z plot, a way to draw a 3 dimensional plot of images with different parameters
 - Textual Inversion
@@ -28,7 +28,7 @@ A browser interface based on Gradio library for Stable Diffusion.
 - CodeFormer, face restoration tool as an alternative to GFPGAN
 - RealESRGAN, neural network upscaler
 - ESRGAN, neural network upscaler with a lot of third party models
-- SwinIR and Swin2SR([see here](https://github.com/AUTOMATIC1111/stable-diffusion-webui/pull/2092)), neural network upscalers
+- SwinIR and Swin2SR ([see here](https://github.com/AUTOMATIC1111/stable-diffusion-webui/pull/2092)), neural network upscalers
 - LDSR, Latent diffusion super resolution upscaling
 - Resizing aspect ratio options
 - Sampling method selection
@@ -46,7 +46,7 @@ A browser interface based on Gradio library for Stable Diffusion.
 - drag and drop an image/text-parameters to promptbox
 - Read Generation Parameters Button, loads parameters in promptbox to UI
 - Settings page
-- Running arbitrary python code from UI (must run with --allow-code to enable)
+- Running arbitrary python code from UI (must run with `--allow-code` to enable)
 - Mouseover hints for most UI elements
 - Possible to change defaults/mix/max/step values for UI elements via text config
 - Tiling support, a checkbox to create images that can be tiled like textures
@@ -69,7 +69,7 @@ A browser interface based on Gradio library for Stable Diffusion.
 - also supports weights for prompts: `a cat :1.2 AND a dog AND a penguin :2.2`
 - No token limit for prompts (original stable diffusion lets you use up to 75 tokens)
 - DeepDanbooru integration, creates danbooru style tags for anime prompts
-- [xformers](https://github.com/AUTOMATIC1111/stable-diffusion-webui/wiki/Xformers), major speed increase for select cards: (add --xformers to commandline args)
+- [xformers](https://github.com/AUTOMATIC1111/stable-diffusion-webui/wiki/Xformers), major speed increase for select cards: (add `--xformers` to commandline args)
 - via extension: [History tab](https://github.com/yfszzx/stable-diffusion-webui-images-browser): view, direct and delete images conveniently within the UI
 - Generate forever option
 - Training tab
@@ -78,11 +78,11 @@ A browser interface based on Gradio library for Stable Diffusion.
 - Clip skip
 - Hypernetworks
 - Loras (same as Hypernetworks but more pretty)
-- A sparate UI where you can choose, with preview, which embeddings, hypernetworks or Loras to add to your prompt.
+- A sparate UI where you can choose, with preview, which embeddings, hypernetworks or Loras to add to your prompt
 - Can select to load a different VAE from settings screen
 - Estimated completion time in progress bar
 - API
-- Support for dedicated [inpainting model](https://github.com/runwayml/stable-diffusion#inpainting-with-stable-diffusion) by RunwayML.
+- Support for dedicated [inpainting model](https://github.com/runwayml/stable-diffusion#inpainting-with-stable-diffusion) by RunwayML
 - via extension: [Aesthetic Gradients](https://github.com/AUTOMATIC1111/stable-diffusion-webui-aesthetic-gradients), a way to generate images with a specific aesthetic by using clip images embeds (implementation of [https://github.com/vicgalle/stable-diffusion-aesthetic-gradients](https://github.com/vicgalle/stable-diffusion-aesthetic-gradients))
 - [Stable Diffusion 2.0](https://github.com/Stability-AI/stablediffusion) support - see [wiki](https://github.com/AUTOMATIC1111/stable-diffusion-webui/wiki/Features#stable-diffusion-20) for instructions
 - [Alt-Diffusion](https://arxiv.org/abs/2211.06679) support - see [wiki](https://github.com/AUTOMATIC1111/stable-diffusion-webui/wiki/Features#alt-diffusion) for instructions
@@ -91,7 +91,6 @@ A browser interface based on Gradio library for Stable Diffusion.
 - Eased resolution restriction: generated image's domension must be a multiple of 8 rather than 64
 - Now with a license!
 - Reorder elements in the UI from settings screen
-
 
 ## Installation and Running
 Make sure the required [dependencies](https://github.com/AUTOMATIC1111/stable-diffusion-webui/wiki/Dependencies) are met and follow the instructions available for both [NVidia](https://github.com/AUTOMATIC1111/stable-diffusion-webui/wiki/Install-and-Run-on-NVidia-GPUs) (recommended) and [AMD](https://github.com/AUTOMATIC1111/stable-diffusion-webui/wiki/Install-and-Run-on-AMD-GPUs) GPUs.
@@ -101,7 +100,7 @@ Alternatively, use online services (like Google Colab):
 - [List of Online Services](https://github.com/AUTOMATIC1111/stable-diffusion-webui/wiki/Online-Services)
 
 ### Automatic Installation on Windows
-1. Install [Python 3.10.6](https://www.python.org/downloads/windows/), checking "Add Python to PATH"
+1. Install [Python 3.10.6](https://www.python.org/downloads/windows/), checking "Add Python to PATH".
 2. Install [git](https://git-scm.com/download/win).
 3. Download the stable-diffusion-webui repository, for example by running `git clone https://github.com/AUTOMATIC1111/stable-diffusion-webui.git`.
 4. Run `webui-user.bat` from Windows Explorer as normal, non-administrator, user.
@@ -159,4 +158,4 @@ Licenses for borrowed code can be found in `Settings -> Licenses` screen, and al
 - Security advice - RyotaK
 - UniPC sampler - Wenliang Zhao - https://github.com/wl-zhao/UniPC
 - Initial Gradio script - posted on 4chan by an Anonymous user. Thank you Anonymous user.
-- (You)
+- (You)

extensions-builtin/Lora/lora.py

+169 -34
@@ -2,20 +2,34 @@
 import os
 import re
 import torch
+from typing import Union
 
 from modules import shared, devices, sd_models, errors
 
 metadata_tags_order = {"ss_sd_model_name": 1, "ss_resolution": 2, "ss_clip_skip": 3, "ss_num_train_images": 10, "ss_tag_frequency": 20}
 
 re_digits = re.compile(r"\d+")
-re_unet_down_blocks = re.compile(r"lora_unet_down_blocks_(\d+)_attentions_(\d+)_(.+)")
-re_unet_mid_blocks = re.compile(r"lora_unet_mid_block_attentions_(\d+)_(.+)")
-re_unet_up_blocks = re.compile(r"lora_unet_up_blocks_(\d+)_attentions_(\d+)_(.+)")
-re_text_block = re.compile(r"lora_te_text_model_encoder_layers_(\d+)_(.+)")
+re_x_proj = re.compile(r"(.*)_([qkv]_proj)$")
+re_compiled = {}
+
+suffix_conversion = {
+    "attentions": {},
+    "resnets": {
+        "conv1": "in_layers_2",
+        "conv2": "out_layers_3",
+        "time_emb_proj": "emb_layers_1",
+        "conv_shortcut": "skip_connection",
+    }
+}
+
+
+def convert_diffusers_name_to_compvis(key, is_sd2):
+    def match(match_list, regex_text):
+        regex = re_compiled.get(regex_text)
+        if regex is None:
+            regex = re.compile(regex_text)
+            re_compiled[regex_text] = regex
 
-
-def convert_diffusers_name_to_compvis(key):
-    def match(match_list, regex):
         r = re.match(regex, key)
         if not r:
             return False
@@ -26,16 +40,33 @@ def match(match_list, regex):
 
     m = []
 
-    if match(m, re_unet_down_blocks):
-        return f"diffusion_model_input_blocks_{1 + m[0] * 3 + m[1]}_1_{m[2]}"
+    if match(m, r"lora_unet_down_blocks_(\d+)_(attentions|resnets)_(\d+)_(.+)"):
+        suffix = suffix_conversion.get(m[1], {}).get(m[3], m[3])
+        return f"diffusion_model_input_blocks_{1 + m[0] * 3 + m[2]}_{1 if m[1] == 'attentions' else 0}_{suffix}"
+
+    if match(m, r"lora_unet_mid_block_(attentions|resnets)_(\d+)_(.+)"):
+        suffix = suffix_conversion.get(m[0], {}).get(m[2], m[2])
+        return f"diffusion_model_middle_block_{1 if m[0] == 'attentions' else m[1] * 2}_{suffix}"
+
+    if match(m, r"lora_unet_up_blocks_(\d+)_(attentions|resnets)_(\d+)_(.+)"):
+        suffix = suffix_conversion.get(m[1], {}).get(m[3], m[3])
+        return f"diffusion_model_output_blocks_{m[0] * 3 + m[2]}_{1 if m[1] == 'attentions' else 0}_{suffix}"
 
-    if match(m, re_unet_mid_blocks):
-        return f"diffusion_model_middle_block_1_{m[1]}"
+    if match(m, r"lora_unet_down_blocks_(\d+)_downsamplers_0_conv"):
+        return f"diffusion_model_input_blocks_{3 + m[0] * 3}_0_op"
 
-    if match(m, re_unet_up_blocks):
-        return f"diffusion_model_output_blocks_{m[0] * 3 + m[1]}_1_{m[2]}"
+    if match(m, r"lora_unet_up_blocks_(\d+)_upsamplers_0_conv"):
+        return f"diffusion_model_output_blocks_{2 + m[0] * 3}_{2 if m[0]>0 else 1}_conv"
+
+    if match(m, r"lora_te_text_model_encoder_layers_(\d+)_(.+)"):
+        if is_sd2:
+            if 'mlp_fc1' in m[1]:
+                return f"model_transformer_resblocks_{m[0]}_{m[1].replace('mlp_fc1', 'mlp_c_fc')}"
+            elif 'mlp_fc2' in m[1]:
+                return f"model_transformer_resblocks_{m[0]}_{m[1].replace('mlp_fc2', 'mlp_c_proj')}"
+            else:
+                return f"model_transformer_resblocks_{m[0]}_{m[1].replace('self_attn', 'attn')}"
 
-    if match(m, re_text_block):
         return f"transformer_text_model_encoder_layers_{m[0]}_{m[1]}"
 
     return key
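Note on the conversion rewrite: the per-pattern regex globals are replaced by inline patterns compiled lazily into the `re_compiled` cache, and resnet, downsampler and upsampler layers are now mapped in addition to attentions. A self-contained sketch of the `down_blocks` branch, using a hypothetical key:

```python
import re

# Hypothetical diffusers-style key; mirrors only the down_blocks branch
# of convert_diffusers_name_to_compvis above.
key = "lora_unet_down_blocks_1_resnets_0_conv1"
m = re.match(r"lora_unet_down_blocks_(\d+)_(attentions|resnets)_(\d+)_(.+)", key)
block, kind, idx, suffix = int(m.group(1)), m.group(2), int(m.group(3)), m.group(4)
# resnet suffixes are renamed via suffix_conversion
suffix = {"conv1": "in_layers_2", "conv2": "out_layers_3",
          "time_emb_proj": "emb_layers_1", "conv_shortcut": "skip_connection"}.get(suffix, suffix)
print(f"diffusion_model_input_blocks_{1 + block * 3 + idx}_{1 if kind == 'attentions' else 0}_{suffix}")
# -> diffusion_model_input_blocks_4_0_in_layers_2
```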
@@ -101,15 +132,22 @@ def load_lora(name, filename):
 
     sd = sd_models.read_state_dict(filename)
 
-    keys_failed_to_match = []
+    keys_failed_to_match = {}
+    is_sd2 = 'model_transformer_resblocks' in shared.sd_model.lora_layer_mapping
 
     for key_diffusers, weight in sd.items():
-        fullkey = convert_diffusers_name_to_compvis(key_diffusers)
-        key, lora_key = fullkey.split(".", 1)
+        key_diffusers_without_lora_parts, lora_key = key_diffusers.split(".", 1)
+        key = convert_diffusers_name_to_compvis(key_diffusers_without_lora_parts, is_sd2)
 
         sd_module = shared.sd_model.lora_layer_mapping.get(key, None)
+
         if sd_module is None:
-            keys_failed_to_match.append(key_diffusers)
+            m = re_x_proj.match(key)
+            if m:
+                sd_module = shared.sd_model.lora_layer_mapping.get(m.group(1), None)
+
+        if sd_module is None:
+            keys_failed_to_match[key_diffusers] = key
             continue
 
         lora_module = lora.modules.get(key, None)
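Note: when a converted key has no direct entry in `lora_layer_mapping`, the new `re_x_proj` fallback strips a trailing `_q_proj`/`_k_proj`/`_v_proj` and retries with the parent attention layer's name. A minimal sketch with a hypothetical key:

```python
import re

re_x_proj = re.compile(r"(.*)_([qkv]_proj)$")

# Hypothetical converted key for an SD2 text-encoder attention projection
m = re_x_proj.match("model_transformer_resblocks_0_attn_q_proj")
if m:
    print(m.group(1))  # -> model_transformer_resblocks_0_attn (parent layer)
```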
@@ -123,15 +161,21 @@ def load_lora(name, filename):
 
         if type(sd_module) == torch.nn.Linear:
             module = torch.nn.Linear(weight.shape[1], weight.shape[0], bias=False)
+        elif type(sd_module) == torch.nn.modules.linear.NonDynamicallyQuantizableLinear:
+            module = torch.nn.Linear(weight.shape[1], weight.shape[0], bias=False)
+        elif type(sd_module) == torch.nn.MultiheadAttention:
+            module = torch.nn.Linear(weight.shape[1], weight.shape[0], bias=False)
         elif type(sd_module) == torch.nn.Conv2d:
             module = torch.nn.Conv2d(weight.shape[1], weight.shape[0], (1, 1), bias=False)
         else:
+            print(f'Lora layer {key_diffusers} matched a layer with unsupported type: {type(sd_module).__name__}')
+            continue
             assert False, f'Lora layer {key_diffusers} matched a layer with unsupported type: {type(sd_module).__name__}'
 
         with torch.no_grad():
             module.weight.copy_(weight)
 
-        module.to(device=devices.device, dtype=devices.dtype)
+        module.to(device=devices.cpu, dtype=devices.dtype)
 
         if lora_key == "lora_up.weight":
             lora_module.up = module
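Note: `NonDynamicallyQuantizableLinear` shows up here because it is the class PyTorch uses for `MultiheadAttention`'s output projection, so it can be treated exactly like a plain `Linear`. A quick check:

```python
import torch

# MultiheadAttention's out_proj is a NonDynamicallyQuantizableLinear,
# which subclasses torch.nn.Linear
mha = torch.nn.MultiheadAttention(embed_dim=8, num_heads=2)
print(type(mha.out_proj).__name__)  # NonDynamicallyQuantizableLinear
print(isinstance(mha.out_proj, torch.nn.Linear))  # True
```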
@@ -177,29 +221,120 @@ def load_loras(names, multipliers=None):
     loaded_loras.append(lora)
 
 
-def lora_forward(module, input, res):
-    input = devices.cond_cast_unet(input)
-    if len(loaded_loras) == 0:
-        return res
+def lora_calc_updown(lora, module, target):
+    with torch.no_grad():
+        up = module.up.weight.to(target.device, dtype=target.dtype)
+        down = module.down.weight.to(target.device, dtype=target.dtype)
 
-    lora_layer_name = getattr(module, 'lora_layer_name', None)
-    for lora in loaded_loras:
-        module = lora.modules.get(lora_layer_name, None)
-        if module is not None:
-            if shared.opts.lora_apply_to_outputs and res.shape == input.shape:
-                res = res + module.up(module.down(res)) * lora.multiplier * (module.alpha / module.up.weight.shape[1] if module.alpha else 1.0)
+        if up.shape[2:] == (1, 1) and down.shape[2:] == (1, 1):
+            updown = (up.squeeze(2).squeeze(2) @ down.squeeze(2).squeeze(2)).unsqueeze(2).unsqueeze(3)
+        else:
+            updown = up @ down
+
+        updown = updown * lora.multiplier * (module.alpha / module.up.weight.shape[1] if module.alpha else 1.0)
+
+        return updown
+
+
+def lora_apply_weights(self: Union[torch.nn.Conv2d, torch.nn.Linear, torch.nn.MultiheadAttention]):
+    """
+    Applies the currently selected set of Loras to the weights of torch layer self.
+    If weights already have this particular set of loras applied, does nothing.
+    If not, restores orginal weights from backup and alters weights according to loras.
+    """
+
+    lora_layer_name = getattr(self, 'lora_layer_name', None)
+    if lora_layer_name is None:
+        return
+
+    current_names = getattr(self, "lora_current_names", ())
+    wanted_names = tuple((x.name, x.multiplier) for x in loaded_loras)
+
+    weights_backup = getattr(self, "lora_weights_backup", None)
+    if weights_backup is None:
+        if isinstance(self, torch.nn.MultiheadAttention):
+            weights_backup = (self.in_proj_weight.to(devices.cpu, copy=True), self.out_proj.weight.to(devices.cpu, copy=True))
+        else:
+            weights_backup = self.weight.to(devices.cpu, copy=True)
+
+        self.lora_weights_backup = weights_backup
+
+    if current_names != wanted_names:
+        if weights_backup is not None:
+            if isinstance(self, torch.nn.MultiheadAttention):
+                self.in_proj_weight.copy_(weights_backup[0])
+                self.out_proj.weight.copy_(weights_backup[1])
             else:
-                res = res + module.up(module.down(input)) * lora.multiplier * (module.alpha / module.up.weight.shape[1] if module.alpha else 1.0)
+                self.weight.copy_(weights_backup)
 
-    return res
+        for lora in loaded_loras:
+            module = lora.modules.get(lora_layer_name, None)
+            if module is not None and hasattr(self, 'weight'):
+                self.weight += lora_calc_updown(lora, module, self.weight)
+                continue
+
+            module_q = lora.modules.get(lora_layer_name + "_q_proj", None)
+            module_k = lora.modules.get(lora_layer_name + "_k_proj", None)
+            module_v = lora.modules.get(lora_layer_name + "_v_proj", None)
+            module_out = lora.modules.get(lora_layer_name + "_out_proj", None)
+
+            if isinstance(self, torch.nn.MultiheadAttention) and module_q and module_k and module_v and module_out:
+                updown_q = lora_calc_updown(lora, module_q, self.in_proj_weight)
+                updown_k = lora_calc_updown(lora, module_k, self.in_proj_weight)
+                updown_v = lora_calc_updown(lora, module_v, self.in_proj_weight)
+                updown_qkv = torch.vstack([updown_q, updown_k, updown_v])
+
+                self.in_proj_weight += updown_qkv
+                self.out_proj.weight += lora_calc_updown(lora, module_out, self.out_proj.weight)
+                continue
+
+            if module is None:
+                continue
+
+            print(f'failed to calculate lora weights for layer {lora_layer_name}')
+
+        setattr(self, "lora_current_names", wanted_names)
+
+
+def lora_reset_cached_weight(self: Union[torch.nn.Conv2d, torch.nn.Linear]):
+    setattr(self, "lora_current_names", ())
+    setattr(self, "lora_weights_backup", None)
 
 
 def lora_Linear_forward(self, input):
-    return lora_forward(self, input, torch.nn.Linear_forward_before_lora(self, input))
+    lora_apply_weights(self)
+
+    return torch.nn.Linear_forward_before_lora(self, input)
+
+
+def lora_Linear_load_state_dict(self, *args, **kwargs):
+    lora_reset_cached_weight(self)
+
+    return torch.nn.Linear_load_state_dict_before_lora(self, *args, **kwargs)
 
 
 def lora_Conv2d_forward(self, input):
-    return lora_forward(self, input, torch.nn.Conv2d_forward_before_lora(self, input))
+    lora_apply_weights(self)
+
+    return torch.nn.Conv2d_forward_before_lora(self, input)
+
+
+def lora_Conv2d_load_state_dict(self, *args, **kwargs):
+    lora_reset_cached_weight(self)
+
+    return torch.nn.Conv2d_load_state_dict_before_lora(self, *args, **kwargs)
+
+
+def lora_MultiheadAttention_forward(self, *args, **kwargs):
+    lora_apply_weights(self)
+
+    return torch.nn.MultiheadAttention_forward_before_lora(self, *args, **kwargs)
+
+
+def lora_MultiheadAttention_load_state_dict(self, *args, **kwargs):
+    lora_reset_cached_weight(self)
+
+    return torch.nn.MultiheadAttention_load_state_dict_before_lora(self, *args, **kwargs)
 
 
 def list_available_loras():
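Note on the overall design change: the old `lora_forward` added `up(down(x))` on every forward pass, while `lora_apply_weights` bakes the delta `multiplier * (alpha / rank) * (up @ down)` into the layer weights once and keeps a CPU backup to restore when the selected set of loras changes. A toy illustration of that delta for a Linear layer (shapes and values are illustrative only):

```python
import torch

features, rank = 320, 4
up = torch.randn(features, rank)    # module.up.weight, shape (out, rank)
down = torch.randn(rank, features)  # module.down.weight, shape (rank, in)
alpha, multiplier = 4.0, 0.8

# same scaling as lora_calc_updown: up.weight.shape[1] is the rank
updown = (up @ down) * multiplier * (alpha / up.shape[1])

weight = torch.randn(features, features)
weight += updown  # applied once, not recomputed on every forward
```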
@@ -212,7 +347,7 @@ def list_available_loras():
         glob.glob(os.path.join(shared.cmd_opts.lora_dir, '**/*.safetensors'), recursive=True) + \
         glob.glob(os.path.join(shared.cmd_opts.lora_dir, '**/*.ckpt'), recursive=True)
 
-    for filename in sorted(candidates):
+    for filename in sorted(candidates, key=str.lower):
         if os.path.isdir(filename):
             continue
 
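The new `*_forward` and `*_load_state_dict` hooks assume that `torch.nn.Linear.forward` and friends have already been swapped out elsewhere in the extension, with the originals stashed as `Linear_forward_before_lora` and so on. A minimal sketch of that wiring, as an assumption for illustration rather than part of this commit:

```python
import torch

# Assumed wiring (lives outside this file): stash the original forward,
# then route every Linear through the lora hook.
if not hasattr(torch.nn, 'Linear_forward_before_lora'):
    torch.nn.Linear_forward_before_lora = torch.nn.Linear.forward

def lora_Linear_forward(self, input):
    # lora_apply_weights(self) would restore/bake lora deltas here
    return torch.nn.Linear_forward_before_lora(self, input)

torch.nn.Linear.forward = lora_Linear_forward
```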