Conversation

@bogeyturn commented Feb 2, 2025

This is only a draft.

Goals

  • modular library (don't mix frontend & backend)
  • modern Python / TensorFlow 2
  • mask isn't based on color anymore
    • relates to use as a library, so detection software can just hand over a mask as an ndarray (see the sketch below)
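
A minimal sketch of that contract, assuming masks are plain numpy arrays (the shape and bar position below are made up for illustration, not a fixed API):

import numpy as np

# A detector only needs to produce an H x W array marking censored pixels:
# 1 where the censor covers the image, 0 everywhere else.
height, width = 512, 512
mask = np.zeros((height, width), dtype=np.uint8)
mask[200:240, 100:400] = 1  # a hypothetical horizontal censor bar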

Issues/TODOS

  • improvements
    • make model class
    • compute mask on init
  • release action
  • make it pip installable
  • loss & optimizer/training
  • missing model
  • dataset generation/preprocessing for training
  • testing if I broke something
    • SNConv2D has no weights
    • bar model generation
    • mosaic model generation
    • training
      • it variable
      • optimizer variables
  • interface
    • missing server/cli
    • missing ui
      • upload
      • rgb selector
      • mask suffix file
      • change/create mask
      • (detector)(disabled button)
      • cancel tasks
      • better progress view
      • variation selector
      • is_mosaic
      • output location
  • logger

@bogeyturn (Author) commented Feb 2, 2025

#9 #5

@naphteine (Collaborator)

This seems very promising. Thank you for your work. The modular library is exactly what I had in mind. As usual, I will keep this open for improvements, discussion and testing. Take your time!

@bogeyturn (Author)

The release is for x86_64 on Windows & Linux, and aarch64 on Mac. I don't know how good TensorFlow's x86_64 support for Mac is, so I left it out.

@bogeyturn (Author)

@naphteine the code is mostly converted to TF2.

I'm not quite sure what the it variable does or how it should be implemented. I'm also not sure about training; the training code might be incorrect. I was inspired by https://www.tensorflow.org/guide/keras/writing_a_training_loop_from_scratch. I don't know which format the dataset should have or how to generate it. Could you help me out with that?
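
For reference, the pattern from that guide is roughly the following (a generic sketch with placeholder model, optimizer, loss and data, not the library's actual training code):

import tensorflow as tf

# Placeholder model, optimizer, loss and dataset; the real ones live in the lib.
model = tf.keras.Sequential([tf.keras.layers.Dense(1)])
optimizer = tf.keras.optimizers.Adam()
loss_fn = tf.keras.losses.MeanSquaredError()
dataset = tf.data.Dataset.from_tensor_slices(
    (tf.random.normal([8, 4]), tf.random.normal([8, 1]))).batch(4)

for step, (x_batch, y_batch) in enumerate(dataset):
    with tf.GradientTape() as tape:
        predictions = model(x_batch, training=True)
        loss = loss_fn(y_batch, predictions)
    # Compute and apply gradients by hand instead of using model.fit().
    grads = tape.gradient(loss, model.trainable_weights)
    optimizer.apply_gradients(zip(grads, model.trainable_weights))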

@bogeyturn mentioned this pull request Feb 5, 2025
@bogeyturn (Author) commented Feb 5, 2025

Training should be working now, but there isn't any dataset generation/preprocessing code as far as I've seen in the original code. I didn't look at the dead code though.

@bogeyturn (Author)

But it seems like it wasn't the model, but the preprocessing. I tried it again with the old Decensor class (only swapped out InpaintNN) and I got this result:

[image: mermaid_censored 0]

and with Image.Resampling.NEAREST it generates this image:

[image: mermaid_censored 0]
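
For context, the difference is just the resampling filter used when the mask is scaled; a standalone Pillow example (file names and target size are made up, this is not the Decensor code itself):

from PIL import Image

mask = Image.open("mask.png")
# The default filters interpolate pixel values, which blurs a binary mask;
# NEAREST keeps every output pixel an exact copy of some input pixel.
resized = mask.resize((512, 512), resample=Image.Resampling.NEAREST)
resized.save("mask_resized.png")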

@bogeyturn (Author) commented Feb 6, 2025

models
bar.keras.zip
mosaic.keras.zip

@naphteine (Collaborator)

Sorry, I have been sick for the last few days. To be honest, I'm not really an expert on this project or TensorFlow, but I will try to answer some of your questions.

> The release is for x86_64 on Windows & Linux, and aarch64 on Mac. I don't know how good TensorFlow's x86_64 support for Mac is, so I left it out.

That's okay. I believe having Windows, Linux and modern Mac releases is good enough for now. Also, we had some people using Macs, and someone would give a hand supporting other architectures if it's much needed.

> @naphteine the code is mostly converted to TF2.
>
> I'm not quite sure what the it variable does or how it should be implemented. I'm also not sure about training; the training code might be incorrect. I was inspired by https://www.tensorflow.org/guide/keras/writing_a_training_loop_from_scratch. I don't know which format the dataset should have or how to generate it. Could you help me out with that?

Seems like you already solved the training problem, but I'm answering just in case.

According to deeppomf, the original training code was a modified version of the PEPSI code. When I tried it 2 years ago I ran into some problems, as that project is in TensorFlow 1 too. You can also check this and this project, which were referenced way back in the first versions, just to get some ideas.

> Training should be working now, but there isn't any dataset generation/preprocessing code as far as I've seen in the original code. I didn't look at the dead code though.

Good work again, thank you! I believe there isn't anything in the original code either. The original creator, deeppomf, didn't include any training code or the resources they used when training the model, as far as I know. The only reference to it was in the FAQ document.

I had some plans to generate new models from scratch, but it will take time to get the resources (I will need some storage). Also, I believe we can convert and continue using the old models via ONNX for now, but ideally we will need training code, so any work is much appreciated.

> But it seems like it wasn't the model, but the preprocessing. I tried it again with the old Decensor class (only swapped out InpaintNN) and I got this result […]

That seems like a bug from the DCPv2 era, good catch. I will try to test these things out, but it may take some time. Also, I would like to hear your opinion on PyTorch, which was suggested before.

I admire your effort, keep up the good work!

@bogeyturn (Author)

@naphteine the problem isn't the training, but the dataset. I created a train loop, but I don't know if it's working (I might have made a mistake in the discriminator). About training: there was an it variable which calculates the alpha; it's not defined what the optimal value would be. What's missing from training is:

  • testing if the discriminator works
  • images to dataset
  • optimizer variables not converted
  • there are some other TODOs in the code, like "SSIM never used"

As there was no real training loop in the old repo, I can't say whether the old code would allow training or whether it was broken in the first place. I won't implement any more training stuff. Converting an old model to a new Keras model works & the lib part is ready. There are 5 TODOs about stuff that can be improved or that I didn't understand, but they aren't required for merging.

I will finish up the server today. Stuff like a mask editor won't be implemented by me, and the UI might not be as good-looking, as I am not a frontend dev. Also, I don't know how I could design cancel tasks, so this won't have a UI either. I didn't write that much input validation, so if 2 images with the same file name are uploaded & a mask was used, this will cause an issue. Same goes for an unexpected save path. I am assuming that users won't forcefully try to break the UI.

@bogeyturn (Author)

@naphteine I'm done. It's not complete, as the TODOs above suggest, but the last features aren't that important and didn't exist before, so...
Whether the release action works depends on git+https://github.com/Deepshift/DeepCreamPy.git@master#egg=lib-cream-py&subdirectory=lib-cream-py

What do you think?

@naphteine (Collaborator) commented Feb 9, 2025

> @naphteine the problem isn't the training, but the dataset. I created a train loop, but I don't know if it's working (I might have made a mistake in the discriminator). About training: there was an it variable which calculates the alpha; it's not defined what the optimal value would be.

I have no idea how they trained the original model. But don't worry, you probably did everything possible without a dataset. I will look into it when I get the datasets and will publish details. The Gwern Danbooru 2018 dataset is ~2.5 TB and the 2020 one is 3.4 TB, so it will take time (and I'm not that dedicated a person, tbh).

> @naphteine I'm done. It's not complete, as the TODOs above suggest, but the last features aren't that important and didn't exist before, so... Whether the release action works depends on git+https://github.com/Deepshift/DeepCreamPy.git@master#egg=lib-cream-py&subdirectory=lib-cream-py
>
> What do you think?

I'm planning to check the code this week. I'm still not feeling great (I'm still sick), so I can't tell exactly when it will be done, but without looking at the code it seems all set. My only worry was a loss of capability with the new UI, but you even added useful new features. I just need to check it all on my own and test that it all works, then I will gladly merge this into the base.

@bogeyturn (Author)

@naphteine that would be great. I'd say the server is alright. The UI isn't that good: it does what it's supposed to, but it lacks some features & could look better. I am not a web developer, so I didn't put much work into it. Btw, I am thinking about training models with other inpainting projects, but I don't have a GPU that can be used & I am lacking time right now. Would you consider helping with training? I will create an issue about that anyway.

@naphteine (Collaborator) commented Feb 13, 2025

@bogeyturn last time I checked, AMD GPUs were having problems, so mine might not work either, but I can ask my friends. I will try to help you unless someone else shows up. On the code side I'm very bad with both Python and TensorFlow, so I can't promise anything (hence the status of this repo).

I can handle the frontend part. It won't be much, but I will update any part I feel needs changes.

Btw, I had some problems even installing the requirements on my Arch system; can you share the steps you took? (pip requiring versions for libraries, also failing to fetch lib-cream-py; I will try again on Windows tomorrow)

@bogeyturn (Author)

@naphteine neither do I. I am trying to learn machine learning right now; that's why I was asking and was quite unsure if I did it right. I know a bit about the basics and learned a bit of syntax when converting the code. The goal is to build onto existing projects: LaMa & Stable Diffusion only need to be configured, and writing a wrapper around these projects isn't an issue. The biggest problem is getting a good dataset. For training data I would scrape some from https://exhentai.org/?f_cats=705&f_search=other%3Auncensored%24+other%3A%22full+color%24%22 and https://huggingface.co/datasets?sort=trending&search=+danbooru. These need to be cleaned though; bad art and black-and-white images would be an issue. Training itself won't be an issue, as it is described here: https://github.com/advimman/lama or https://github.com/nickyisadog/latent-diffusion-inpainting
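
If we generate the dataset ourselves, the core preprocessing step is just painting synthetic censors onto clean images; a rough sketch of the idea (the paths and bar size are made up, a real pipeline would vary size, angle and count):

import numpy as np
from PIL import Image

def censor_with_bar(image_path):
    """Paint a random black bar onto a clean image; return (censored, mask)."""
    img = np.asarray(Image.open(image_path).convert("RGB")).copy()
    h, w, _ = img.shape
    # Pick a random bar position; sizes here are illustrative only.
    y = np.random.randint(0, h - 32)
    x = np.random.randint(0, w - 128)
    img[y:y + 32, x:x + 128] = 0  # black censor bar
    mask = np.zeros((h, w), dtype=np.uint8)
    mask[y:y + 32, x:x + 128] = 1
    return img, mask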

I built the project on Mac (another reason why I didn't touch training ;D):

❯ python3 -m venv venv
❯ source venv/bin/activate
❯ pip install -e ./lib-cream-py

Installing from git can't work right now, as the package isn't tracked by the master branch; the local package needs to be installed manually.

If you want to change the code, I would recommend writing tests or just using if __name__ == ...
The dependencies it uses are:

"keras",
"tensorflow",
"numpy",
"Pillow"

but if you just install lib-cream-py as a package, it will pull in these dependencies automatically.

I can try it out on my desktop (Arch) and see if it works.

@bogeyturn (Author) commented Feb 14, 2025

I tested it on Arch and I actually did run into an issue, but it's not my fault: the Python version it used was Python 3.13, which has no TensorFlow releases. I got around it by using an older Python version.

Here is the issue tensorflow/tensorflow#78774

I guess you could get around it by building the tensorflow package yourself, but it seems like the newest supported Python version is about 1 minor version behind.

Python releases

How I got it working:

yay -S python312
python3.12 -m venv venv
source venv/bin/activate
pip install -e lib-cream-py
pip install fastapi uvicorn
python server/main.py

Instead of the 2 pip installs, server/requirements.txt can be used in the future. I wrote it manually, because installing from git doesn't work as long as the package isn't on master.
To summarize: don't use the latest Python version. Python 3.14 is only a pre-release & 3.13 support only started on 2024-10-07, so it's understandable that there isn't a TensorFlow release for 3.13.

Also, I got some warnings from TensorFlow. I didn't test if it used my GPU. It inpainted, but maybe we need to check if the GPU is enabled &, if not, how to enable it manually. As I reinstalled Arch recently, it could be a missing driver though ;D
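
Checking whether TensorFlow actually sees a GPU is a one-liner:

import tensorflow as tf

# An empty list means TensorFlow is running on CPU only
# (missing CUDA/ROCm setup or an unsupported build).
print(tf.config.list_physical_devices("GPU"))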

@bogeyturn (Author)

Obviously you need to download & unzip the models first and put them into the models folder; for more info use python server/main.py -h. The files needed to be zipped because GitHub doesn't allow .keras files to be uploaded.
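
If you don't want to unzip by hand, from Python (archive names as uploaded above; the models/ target folder is assumed):

import zipfile

# Extract the uploaded archives into the folder the server expects.
for archive in ("bar.keras.zip", "mosaic.keras.zip"):
    with zipfile.ZipFile(archive) as zf:
        zf.extractall("models")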

@bogeyturn (Author)

I would recommend rebuilding the models for the final release. I am not quite sure if I changed something after building the models, & I don't know if I rebuilt the newer or the older models. GitHub limits uploads in issues to 25 MB, so I can only upload a single model per zip. It's not like it's hard to convert a model: migrate_weights takes the old model path as input and InpaintNN takes the new model path as input.

Using the Python shell:

>>> from lib_cream_py import InpaintNN
>>> model = InpaintNN("./new-07-08-2019-DCPv2-model/bar.keras", create_model=True)
>>> model.train(0, [], "./temp/checkpoints")  # zero-iteration training call so the weights get built
>>> model.migrate_weights("07-08-2019 DCPv2 model/bar/Train_775000")

@naphteine (Collaborator)

I know this has been waiting longer than it should, but I'm on a really tight schedule right now and won't be available next month. I can either merge it directly without checking, or wait for April and do some testing/improvements. Sorry for the wait. What's your opinion, @bogeyturn?

@naphteine (Collaborator)

I'm finally back and will look into this, but no promises; I'm not good with Python & TensorFlow, so I might end up just merging it.

@bogeyturn (Author)

You could do some UI stuff. I think it's acceptable, but something like an online mask creator would be great, plus some general fine-tuning.

@bogeyturn (Author)

Oh, and Docker, probably. Should be straightforward though: just install Python, npm and Node, and forward the GPU.

services:
  app:
    image: your-image
    deploy:
      resources:
        reservations:
          devices:
            - capabilities: [gpu]

@bogeyturn (Author) commented May 19, 2025

I will look into integrating hent-ai into the project, maybe also as a lib which will be used by the server.

I can't say that I am a fan of the implementation, as it seems to downscale the images, upscale them again as preprocessing, and then just use Mask_RCNN for detection.
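
Roughly what that preprocessing amounts to, as far as I can tell (the sizes here are guesses, not hent-ai's actual constants):

from PIL import Image

img = Image.open("input.png")
# Downscale, then upscale back to the original size before detection.
# This is the part I'm not a fan of: detail is thrown away before
# Mask_RCNN ever sees the image.
small = img.resize((img.width // 2, img.height // 2))
restored = small.resize((img.width, img.height))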

@bogeyturn (Author) commented May 31, 2025

@naphteine do you have a running version of hent-ai (not the exe)? Kinda annoying; I don't have an amd64 system available right now, and I need a summary of how each variable is named in which layer. I migrated the code to use the latest versions (the project used a popular NN that was rewritten with TF2), so the only things left to do are cleaning up the mess that calls itself a preprocessor, which isn't a big deal, and migrating the model, which is more or less a nightmare. I think it's quite easy if I can get the old model to run and print out the model structure (each weight). Before I update the preprocessing I want to get the detection to work.
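
For reference, once the old model runs, dumping every variable name per layer is straightforward in Keras (a sketch; the file name is a placeholder and this assumes the old model loads at all):

import tensorflow as tf

model = tf.keras.models.load_model("old_model.h5")
model.summary()  # layer-by-layer overview
# Print every weight's name and shape, for mapping old names to new ones.
for layer in model.layers:
    for w in layer.weights:
        print(layer.name, w.name, tuple(w.shape))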

@naphteine (Collaborator)

Sorry, I don't have a running version right now, but I'm on amd64 and will look into it in my free time, if that's okay.

@bogeyturn (Author)

@naphteine I failed to install the version on Arch. Kinda annoying. I will retry in the next few days with Docker on my amd64 machine, but it's such a pain.
