
Meta Movie Gen (model not released)

Blog post: https://ai.meta.com/blog/movie-gen-media-foundation-models-generative-ai-video/

Samples: https://ai.meta.com/research/movie-gen/

Paper: https://ai.meta.com/static-resource/movie-gen-research-paper

  • 30B parameter model capable of generating videos and images
  • Video resolution: 768x768, or a similar pixel count at other aspect ratios
  • Video length: 16 seconds
  • The blog mentions a "potential future release", whatever that means
submitted by /u/rerri
[link] [comments]

CogvideoXfun Pose is insanely powerful

cinematic, beautiful, in the street of a city, a red car is moving towards the camera

cinematic, beautiful, in a park, in the background a samoyedan dog is moving towards the camera

After some initial bad results, I decided to give Cogvideoxfun Pose a second chance, this time using some basic 3D renders as the control input... And oooooh boy, this is impressive. The basic workflow is in the ComfyUI-CogVideoXWrapper folder, and you can also find it here:

https://github.com/kijai/ComfyUI-CogVideoXWrapper/blob/main/examples/cogvideox_fun_pose_example_01.json

These are tests done with Cogvideoxfun-2B at low resolutions and with a low number of steps, just to show how powerful this technique is.

cinematic, beautiful, in a park, a samoyedan dog is moving towards the camera

NOTE: Prompts are very important; poor word order can lead to unexpected results. For example:

cinematic, beautiful, a beautiful red car in a city at morning

submitted by /u/Striking-Long-2960
[link] [comments]

How to train a hyper-inpainting (SD1.5) model?

As you know, when training a vanilla inpainting model on SD1.5 we can use the diffusers pipeline and scripts, which typically need at least 20 steps to achieve good image generation results. Hyper-inpainting models, however, such as this one (link), only need around 4–6 steps to produce comparable results, while consuming roughly the same resources and memory as the vanilla model. Does the training process for hyper models differ from that of vanilla models, and how can I fine-tune a vanilla inpainting model so that it generates images in fewer steps at inference time, similar to the hyper-inpainting model?
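For context on the inference side, here is a rough sketch of what the few-step regime looks like in diffusers once a distilled ("hyper") LoRA and a matching scheduler are in place; the model ID, LoRA path, and step/guidance values below are placeholders, not taken from the linked model:

```python
# Sketch: few-step SD1.5 inpainting with a distilled ("hyper") LoRA.
# Model IDs, the LoRA path, and the parameter values are placeholders.
import torch
from diffusers import StableDiffusionInpaintPipeline, TCDScheduler
from diffusers.utils import load_image

pipe = StableDiffusionInpaintPipeline.from_pretrained(
    "runwayml/stable-diffusion-inpainting", torch_dtype=torch.float16
).to("cuda")

# A few-step LoRA (Hyper-SD / LCM / TCD style) plus a matching scheduler is what
# makes 4-6 steps viable; the vanilla model still needs ~20+ steps.
pipe.load_lora_weights("path/to/hyper_sd15_inpaint_lora.safetensors")  # placeholder path
pipe.scheduler = TCDScheduler.from_config(pipe.scheduler.config)

image = load_image("input.png")
mask = load_image("mask.png")
result = pipe(
    prompt="a wooden bench in a park",
    image=image,
    mask_image=mask,
    num_inference_steps=6,   # few-step (distilled) regime
    guidance_scale=1.0,      # distilled models usually want low or no CFG
).images[0]
result.save("out.png")
```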

submitted by /u/love_ai_
[link] [comments]

Is it possible to preserve an actor's appearance (LoRA) when adding cinematic LoRAs in Flux?

https://preview.redd.it/6abzabqlgpsd1.jpg?width=1280&format=pjpg&auto=webp&s=cc150df0b8bc021db9e2cff65532f51a6239f526

https://preview.redd.it/o2y9agqlgpsd1.png?width=1792&format=png&auto=webp&s=b341a35e30c4506f242d8559b00f38a6d1add2b4

https://preview.redd.it/tctm7dplgpsd1.jpg?width=1280&format=pjpg&auto=webp&s=7ecfe6dc1cce138502a2dc39a438c17aab103bc8

Hi everyone!

I'm facing a challenge while trying to use LoRAs that give a cinematic look to the image (like Anamorphic Lens, Color Grading, Cinematic Lighting).

These are the ones I'm currently using.

https://civitai.com/models/432586/cinematic-shot
https://civitai.com/models/587016/anamorphic-bokeh-special-effect-shallow-depth-of-field-cinematic-style-xl-f1d-sd15

At the same time, I want to use a LoRA with a well-known actor, such as Arnold Schwarzenegger. This is the actor LoRA I’m working with.

https://civitai.com/search/models?sortBy=models_v9&query=arnold

I’m generating images at a resolution of 1536 x 640.

The tricky part is that I want to achieve the highest possible likeness to the actor. I’m looking for a way to do this without creating the "uncanny valley" effect. Any ideas on how to approach this? For example, would upscaling again with just the face LoRA or doing a Face Swap help?
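Not a definitive answer, but if you are stacking these in diffusers (rather than ComfyUI), per-adapter weights are the usual first knob to try; a minimal sketch with placeholder LoRA paths and purely illustrative weights:

```python
# Sketch: stacking a character LoRA with a style LoRA and down-weighting the style
# adapter so it overpowers the likeness less. Paths and weights are assumptions.
import torch
from diffusers import FluxPipeline

pipe = FluxPipeline.from_pretrained(
    "black-forest-labs/FLUX.1-dev", torch_dtype=torch.bfloat16
).to("cuda")

pipe.load_lora_weights("path/to/actor_lora.safetensors", adapter_name="actor")          # placeholder
pipe.load_lora_weights("path/to/cinematic_lora.safetensors", adapter_name="cinematic")  # placeholder

# Keep the identity LoRA at full strength and pull the style LoRA down until the
# face stops drifting; the exact numbers depend on the specific LoRAs.
pipe.set_adapters(["actor", "cinematic"], adapter_weights=[1.0, 0.6])

image = pipe(
    "cinematic film still of Arnold Schwarzenegger walking down a rainy street",
    height=640,
    width=1536,
    num_inference_steps=28,
    guidance_scale=3.5,
).images[0]
image.save("out.png")
```

Lowering the style adapters until the face stops drifting, then doing the kind of pass you describe (an upscale/refine step with only the actor LoRA active, or a face swap) is a common combination.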

Thanks in advance for your help!

submitted by /u/zhigar
[link] [comments]

How do you deal with the object to background scale problem?

In my workflow, I input images of objects, and it's supposed to place them in the correct background according to the prompt. It does that, but the scale is a problem. Example: the input is a milk bottle, and it's supposed to be placed on a kitchen table. In the output the bottle does end up on the kitchen table, but it's as big as the table. How do I solve this issue?

https://preview.redd.it/y32j57hgdqsd1.png?width=896&format=png&auto=webp&s=0bde145eaa3785d3bb828107716f091f555425e2
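One workaround, independent of the specific workflow (which isn't shown here), is to pre-scale and place the object yourself and let the diffusion pass only blend it in; a rough PIL sketch, where the relative size and anchor position are just assumptions:

```python
# Sketch: composite the object at a plausible relative size before the diffusion pass,
# so the model only has to blend lighting/shadows. Sizes and positions are assumptions.
from PIL import Image

def paste_object(background: Image.Image, obj: Image.Image,
                 rel_height: float = 0.25, anchor=(0.5, 0.7)) -> Image.Image:
    """Paste `obj` onto `background`, scaled so its height is `rel_height` of the
    background height, centered at the fractional `anchor` position."""
    bg = background.copy()
    target_h = int(bg.height * rel_height)        # e.g. a bottle ~25% of frame height
    scale = target_h / obj.height
    obj_resized = obj.resize((int(obj.width * scale), target_h))
    x = int(bg.width * anchor[0] - obj_resized.width / 2)
    y = int(bg.height * anchor[1] - obj_resized.height / 2)
    mask = obj_resized.getchannel("A") if obj_resized.mode == "RGBA" else None
    bg.paste(obj_resized, (x, y), mask)
    return bg

# composite = paste_object(Image.open("kitchen.png"), Image.open("bottle.png"), 0.2)
# ...then run an img2img / inpaint pass at low denoise to blend the object in.
```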

submitted by /u/coeus_koalemoss
[link] [comments]

I fixed Prodigy and made a function to modify the loss

Going straight to the point: I fixed Prodigy's main issue. With my fix, you can train the Unet and TEs for as long as you want without frying the TEs or undertraining the Unet. To use it, grab the code from the PR I submitted on Prodigy's GitHub. I don't know if they'll accept it, so you'll probably have to replace the file manually in your venv.

https://github.com/konstmish/prodigy/pull/20

Edit: it's also possible to set a different LR for each network (see the sketch below).
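For anyone wondering what that means at the optimizer level: with the patched prodigy.py it's just standard PyTorch param groups, each with its own lr multiplier. A minimal sketch (the modules and the 1.0 / 0.3 values are only placeholders; in OneTrainer you'd set these in the UI instead):

```python
# Sketch: per-network learning rates via PyTorch param groups with the patched
# prodigy.py. `unet` and `text_encoder` are stand-ins for your actual modules.
import torch
from prodigyopt import Prodigy

unet = torch.nn.Linear(8, 8)           # placeholder module
text_encoder = torch.nn.Linear(8, 8)   # placeholder module

optimizer = Prodigy(
    [
        {"params": unet.parameters(), "lr": 1.0},
        {"params": text_encoder.parameters(), "lr": 0.3},  # weaker LR for the TE
    ],
    weight_decay=0.01,
)
```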

About the loss modifier, I made it based on my limited knowledge of diffusion training and machine learning. It’s not perfect, it’s not the holy grail, but my trainings always turn out better when I use it.

Feel free to suggest ways to improve it.

For convenience, I replaced OneTrainer's min SNR gamma function with my own, so all I need to do is activate MSG and my function takes over.

https://github.com/sangoi-exe/sangoi-loss-function
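For reference, the stock weighting this hook piggybacks on is the published min-SNR-gamma scheme for epsilon prediction; a rough sketch of that baseline (not the author's custom function, which lives in the repo above):

```python
# Reference sketch: standard min-SNR-gamma loss weighting (the stock behavior this
# hook replaces), NOT the author's custom function.
import torch

def min_snr_gamma_weights(snr: torch.Tensor, gamma: float = 5.0) -> torch.Tensor:
    """Per-timestep loss weights for epsilon prediction: min(SNR, gamma) / SNR."""
    return torch.minimum(snr, torch.full_like(snr, gamma)) / snr

# loss = (min_snr_gamma_weights(snr) * per_sample_mse).mean(),
# where `snr` comes from the noise scheduler for the sampled timesteps.
```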

I’m not going to post any examples here, but if anyone’s curious, I uploaded a training I did of my ugly face in the training results channel on the OT discord.

Edit:

To use the prodigy fix, get the prodigy.py here:

https://github.com/sangoi-exe/prodigy/tree/main/prodigyopt

and put it in this folder:

C:\your-trainer-folder\OneTrainer\venv\Lib\site-packages\prodigyopt\

That's it, all the settings in OT stay the same, unless you want to set different LRs for each network, because that's possible now.

To use my custom loss modifier, get the ModelSetupDiffusionLossMixin.py here:

https://github.com/sangoi-exe/sangoi-loss-function

and put it in this folder:

C:\your-trainer-folder\OneTrainer\modules\modelSetup\mixin

Then, in OT's UI, select MIN_SNR_GAMMA as the Loss Weight Function on the training tab and enter any positive value other than 0.

The value itself doesn't matter; it's just there to make OT trigger the conditionals that use the min SNR gamma function, which now runs my function instead.

There was a typo in the function name in the loss modifier file; it was missing an underscore. It's fixed now.

04/10/2024: there was another typo inside the function 😅

submitted by /u/isnaiter
[link] [comments]