
Modified-Imagic-Stable-Diffusion-Pipeline

A modified version of the Imagic pipeline with Stable Diffusion that displays training progress. The original version was introduced in the Hugging Face diffusers community examples.

Imagic

Figure 1: Examples of Imagic with Imagen.

How to use

I used a Jupyter Notebook with an NVIDIA RTX A4000. Imagic needs only one image and one prompt: write a prompt, load a single image, then run. With "show_progress = False" (the default) during training, the whole process takes about 1 hour.
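A minimal usage sketch of the flow above. The model id, `custom_pipeline` name, and method arguments are assumptions based on the diffusers community Imagic example, not the exact API of this repository:

```python
# Hypothetical usage sketch: the pipeline id and argument names below are
# assumptions, not this repo's exact API.
def run_imagic(image_path: str, prompt: str):
    import torch
    from diffusers import DiffusionPipeline
    from PIL import Image

    pipe = DiffusionPipeline.from_pretrained(
        "CompVis/stable-diffusion-v1-4",
        custom_pipeline="imagic_stable_diffusion",  # community pipeline
        torch_dtype=torch.float16,
    ).to("cuda")

    image = Image.open(image_path).convert("RGB")
    # Stage 1 + 2: optimize the text embedding, then fine-tune the UNet.
    pipe.train(prompt, image=image)
    # Stage 3: generate the edited image from the learned embedding.
    result = pipe(alpha=1.0)
    return result.images[0]
```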

Modifications

  1. The .py file from the huggingface/diffusers GitHub repository notes that the pipeline is "modeled after the textual_inversion.py / train_dreambooth.py and the work of justinpinkney". However, while some code lines match the referenced sources, others are dummy or missing parts, so I edited them.

  2. Minor efficiency improvements.

  3. Optionally, the pipeline samples and displays intermediate images during training so you can see how things are going.

Note: I did not edit any of the original comments in the .py file.
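The third modification can be sketched as a periodic-sampling hook in the training loop. This is a toy illustration with dummy hooks; `sample_fn` and `display_fn` are placeholders, not this repository's actual code:

```python
def train_with_progress(num_steps, sample_fn, display_fn,
                        show_progress=False, sample_every=100):
    """Toy training loop: optionally sample and display intermediate images."""
    shown = []
    for step in range(1, num_steps + 1):
        # ... one optimization step would go here ...
        if show_progress and step % sample_every == 0:
            image = sample_fn(step)   # run a quick sampling pass
            display_fn(image)         # e.g. an IPython/matplotlib display call
            shown.append(step)
    return shown

# Example with dummy hooks:
steps_shown = train_with_progress(
    num_steps=300,
    sample_fn=lambda step: f"image@{step}",
    display_fn=lambda img: None,
    show_progress=True,
    sample_every=100,
)
# steps_shown == [100, 200, 300]
```

Sampling is expensive, so showing progress trades extra wall-clock time for visibility into the run; with `show_progress=False` the loop skips sampling entirely.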

Performances

Figure 2: Examples.

The original pipeline with default settings produces awful images. The modified one produces better images, though they are still quite disappointing. Apart from trivial changes, the differences between the original and the modified versions are the hyperparameters, the schedulers, and the first modification listed above.

Figure 3: Result of the original pipeline with default settings.

Anyway, the core problem is that the Imagic pipeline does not work well with Stable Diffusion, unlike with Imagen. If you try to fully preserve important characteristics of an image, such as identity or a face, overfitting happens. Meanwhile, if you use a trick such as controlling the diffusion steps to prevent that overfitting, the prompt loses its power and only minor changes occur. e.g.

  1. a ginger cat => the same cat with gray or white fur: relatively easy;
  2. a cat => the same cat dressed as a musketeer or a medieval knight: maybe impossible, or poor quality.
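One knob behind this trade-off, taken from the Imagic method itself, is the linear interpolation between the embedding optimized to reconstruct the input image and the embedding of the target prompt: a low coefficient preserves the image, a high one follows the prompt but risks losing identity. A minimal sketch with plain lists (the embedding values are made up for illustration):

```python
def interpolate_embeddings(e_opt, e_tgt, alpha):
    """Imagic-style interpolation: alpha=0 reconstructs the input image,
    alpha=1 follows the target prompt."""
    return [(1 - alpha) * o + alpha * t for o, t in zip(e_opt, e_tgt)]

e_opt = [0.0, 1.0, 2.0]   # embedding optimized to reconstruct the image
e_tgt = [4.0, 1.0, 0.0]   # embedding of the target prompt
mixed = interpolate_embeddings(e_opt, e_tgt, 0.5)
# mixed == [2.0, 1.0, 1.0]
```

In practice there is often no alpha that both preserves identity and realizes a drastic prompt, which is exactly the failure mode described above.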

The reason is probably that Stable Diffusion operates on the latent of an image in a semantic latent space, while Imagen operates on the image in pixel space.

Perhaps this is why the authors of the Imagic paper presented only inanimate-object examples for their Stable Diffusion applications, like this:

Figure 4: Examples of Imagic with Stable Diffusion.

References

Image sources
