
A realistic cave painting lora for all your misinformation needs

29 January 2025 at 15:25

You can try it out on Tensor (or just download it from there). I didn't know Tensor was blocked, but it's there under Cave Paintings.

If you do try it, for best results base your prompts on these: https://www.bradshawfoundation.com/chauvet/chauvet_cave_art/index.php

The best way is to paste one of them into your favorite AI assistant and ask it to change it to what you want.

LoRA weight works best at 1, but you can try +/-0.1: lower makes your new addition look less like cave art, while higher can make it barely recognizable. Same with guidance: 2.5 to 3.5 is best.
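If you want to try those settings outside a web UI, here is a minimal sketch with diffusers. It assumes the LoRA targets FLUX.1-dev (that base model is my assumption), and the file name cave_paintings.safetensors is hypothetical:

import torch
from diffusers import FluxPipeline

# Load the base model (assumed to be FLUX.1-dev) and the cave painting LoRA.
pipe = FluxPipeline.from_pretrained(
    "black-forest-labs/FLUX.1-dev", torch_dtype=torch.bfloat16
).to("cuda")
pipe.load_lora_weights("cave_paintings.safetensors", adapter_name="cave")

# Keep the LoRA weight at 1.0 and the guidance in the 2.5-3.5 range, as suggested above.
pipe.set_adapters(["cave"], adapter_weights=[1.0])
image = pipe(
    "cave painting of a herd of horses, ochre and charcoal on a rock wall",
    guidance_scale=3.0,
    num_inference_steps=28,
).images[0]
image.save("cave_horses.png")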

submitted by /u/soitgoes__again

Multiple RNG Generation

30 January 2025 at 12:43

As always, I'm at the cutting edge of SD info, so I only recently found out you can have random options in a prompt.

Just for Gits & Shiggles, I tried a prompt containing 10 random options. It could be useful if you have prompter's block ;)

Here's the prompt -> one {Happy | Sad | Angry | Scared}, {man | woman | child} with {long | Short | Updo | Permed}, {Ginger | Blond | Black | Brown} hair, wearing {dark | bright}, {casual | formal | Sports | Evening} clothing, looking {up | Down | Left | Right}, in front of a {wall | Bush | Shop | building}, {neon | Bright | Dim} lighting, {day | night}

Example - https://imgur.com/a/SKPuwD9
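If you're curious what the sampler is effectively doing with those groups, here is a minimal sketch in Python of how the {a | b | c} options could be expanded into a concrete prompt. It's just an illustration of the idea, not the actual wildcard/dynamic-prompts implementation:

import random
import re

PROMPT = ("one {Happy | Sad | Angry | Scared}, {man | woman | child} with "
          "{long | Short | Updo | Permed}, {Ginger | Blond | Black | Brown} hair, "
          "wearing {dark | bright}, {casual | formal | Sports | Evening} clothing, "
          "looking {up | Down | Left | Right}, in front of a {wall | Bush | Shop | building}, "
          "{neon | Bright | Dim} lighting, {day | night}")

def expand(prompt: str) -> str:
    # Replace every {a | b | c} group with one randomly chosen option.
    return re.sub(
        r"\{([^{}]+)\}",
        lambda m: random.choice([opt.strip() for opt in m.group(1).split("|")]),
        prompt,
    )

for _ in range(5):
    print(expand(PROMPT))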

submitted by /u/DJSpadge

Need help making video ad

30 January 2025 at 13:26

(Posting this here because people here are more talented than on some other freelance platforms.) I'm looking to hire a freelance AI Video Generation Specialist who's experienced with ComfyUI, ControlNet, and other AI tools to create videos similar to Varun Mayya's style. This is a consultancy-based role (hourly pay), but full-time is also an option if you're interested. If you're up to date with the latest AI image/video generation models and can create stunning video presentations, please DM me or reply with your portfolio and rate. Let's make something awesome!

submitted by /u/Kitchen_Court4783

Need advice on Flux Lora settings or on my Dataset

30 January 2025 at 14:31

I could use some help with finding the right parameters for training my Lora with Flux.
Eventually I need to create a Lora for multiple objects, but for now I'm trying with just one (it does have an open and closed version though).
In this post I'll share images of the object that needs to be reproduced, my results, my training set, config parameters, and captioning.
Any advice is welcome as I'm pretty new to Lora training with Flux.

This is the object that I'm trying to reproduce with a Lora.

Object closed

Object open

And these are the best results so far (I handpicked the best; I also get some unusable results):

https://preview.redd.it/c5mv98zk35ge1.jpg?width=1024&format=pjpg&auto=webp&s=e125f16ee671b5364ac77bff17fcaf5940513ead

https://preview.redd.it/zr7o47zk35ge1.jpg?width=1024&format=pjpg&auto=webp&s=5abf98ecc5c7253ea1ea5889e51205ea4de1bf2c

https://preview.redd.it/mewrp7zk35ge1.jpg?width=1024&format=pjpg&auto=webp&s=d979fb86c15fab0492e6948fba9b07bfad248aeb

https://preview.redd.it/fmood8zk35ge1.jpg?width=1024&format=pjpg&auto=webp&s=d480ceae6031b3eb3cb273a21a49456fb8ac735d

I used the guide from this repo: https://github.com/geocine/flux to have an environment to train my Lora in.

I'm using this for my config.yaml:

trigger: "l3g4m4st3r_g3ck0_w0rksh0ps3t"
prompts:
  - "an open [trigger] hanging on a whiteboard in an office space"
  - "a [trigger] nicely placed on top of a wooden table"
  - "[trigger] being used in a classroom"
  - "a closed [trigger] attached to a whiteboard"
max_step_saves_to_keep: 24

This is the output config.yaml:

job: extension
config:
  name: lora
  process:
    - type: sd_trainer
      training_folder: /workspace/output
      device: cuda:0
      trigger_word: l3g4m4st3r_g3ck0_w0rksh0ps3t
      network:
        type: lora
        linear: 16
        linear_alpha: 16
      save:
        dtype: float16
        save_every: 200
        max_step_saves_to_keep: 24
      datasets:
        - folder_path: /workspace/data
          caption_ext: txt
          caption_dropout_rate: 0.05
          shuffle_tokens: false
          cache_latents_to_disk: true
          resolution:
            - 512
            - 768
            - 1024
      train:
        batch_size: 1
        steps: 3600
        gradient_accumulation_steps: 1
        train_unet: true
        train_text_encoder: false
        content_or_style: balanced
        gradient_checkpointing: true
        noise_scheduler: flowmatch
        optimizer: adamw8bit
        lr: 0.0004
        skip_first_sample: true
        ema_config:
          use_ema: true
          ema_decay: 0.99
        dtype: bf16
      model:
        name_or_path: black-forest-labs/FLUX.1-dev
        is_flux: true
        quantize: true
      sample:
        sampler: flowmatch
        sample_every: 200
        width: 1024
        height: 1024
        prompts:
          - an open [trigger] hanging on a whiteboard in an office space
          - a [trigger] nicely placed on top of a wooden table
          - '[trigger] being used in a classroom'
          - a closed [trigger] attached to a whiteboard
        neg: ''
        seed: 42
        walk_seed: true
        guidance_scale: 4
        sample_steps: 20
meta:
  name: lora
  version: '1.0'

Here are some images that I have used to train with

images dataset

And here are some prompts I have used together with the images

A whiteboard wall with a closed [trigger] attached to it.

A white whiteboard wall with a sign that reads "Think Outside the Box" and an open [trigger] attached to it.

A whiteboard with a red border and an open [trigger] attached to it. Above the [trigger] on the whiteboard is a blue and white sticker that reads "ROOM". To the left of the board is a stand with papers on it, and the floor is covered with a carpet.

A whiteboard with a red border on a ledge leaning against a partially green and orange wall with white text on it. On the whiteboard, there is a closed [trigger] attached to it. In front of the whiteboard, on the ledge, is a bottle of whiteboard eraser fluid. The red border of the whiteboard also has some whiteboard markers placed into it. On the right side there is a smaller whiteboard with a gray border leaning against the bigger whiteboard.

A whiteboard with a red border and a green wall behind it. On the whiteboard, there is an open [trigger], as well as a bottle and another smaller whiteboard with a gray border in front of the board.

A white wall with a variety of items on it, including marker holders, whiteboard markers, a closed [trigger], 4 whiteboard erasers in different colors and 2 wooden objects.

A wooden stand with a closed [trigger] sitting on top of it. The background is a white wall.

A wooden stand with a closed [trigger] on top of it standing in front of a white wall.
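In case it helps when reviewing the dataset: given the config above (folder_path: /workspace/data, caption_ext: txt), the trainer expects each image to be paired with a same-named .txt caption file. Here is a small sketch, under that assumption, to check that every image has a caption and that the trigger word (or the [trigger] placeholder) actually appears in it; the image extensions are my assumption:

from pathlib import Path

DATA_DIR = Path("/workspace/data")
TRIGGER = "l3g4m4st3r_g3ck0_w0rksh0ps3t"

# Check every image for a matching .txt caption containing the trigger word.
images = sorted(p for p in DATA_DIR.iterdir() if p.suffix.lower() in {".jpg", ".jpeg", ".png"})
for img in images:
    caption_file = img.with_suffix(".txt")
    if not caption_file.exists():
        print(f"{img.name}: missing caption file")
        continue
    caption = caption_file.read_text().strip()
    if TRIGGER not in caption and "[trigger]" not in caption:
        print(f"{img.name}: caption does not mention the trigger word")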

Any help is truly appreciated!

submitted by /u/Suspicious_Tutor6015

Is there a way to extract faces from a faceswap safetensors file?

30 January 2025 at 14:22

I have a few different faceswap models that I compiled using FaceSwapLab. In the last few years, I vaguely remember using some extension that allowed me to drop my safetensors file into it and it would display all the training images used to create it. Does this actually exist, or am I hallucinating? Thanks for any help :)
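In the meantime, a small sketch that may help narrow this down: the safetensors library can list the tensors and any embedded metadata in such a file, which at least shows whether image data or source-image names were stored at all (the file name below is hypothetical):

from safetensors import safe_open

PATH = "my_faceswap_model.safetensors"  # hypothetical file name

# List the header metadata and every tensor stored in the file.
with safe_open(PATH, framework="np") as f:
    print("metadata:", f.metadata())
    for key in f.keys():
        tensor = f.get_tensor(key)
        print(key, tensor.shape, tensor.dtype)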

submitted by /u/omg_can_you_not

Looking for an alternative to Viggle Ai

30 January 2025 at 14:19

I'm by no means an expert regarding AI stuff, so please correct me if I get something wrong; that way I can learn.

Abstract, sort of

Recently I've been seeing a lot of videos with a watermark referencing the company ViggleAi.

ViggleAI offers a service which lets you replace characters in videos. To do this, it will ask for two things:

  1. a video: they call it the "motion"

  2. an image: the reference character that replaces one in the video/motion

What is so mind-blowing to me is how smooth and coherent their output is. I mean, sure, it does get jittery sometimes, but generally the motion and even the facial expressions all look incredibly smooth.

I'm curious how this is done and whether there is any open-source solution offering something similar.

For example

I know this example is kind of immature and all of that, but it's the best one that I have found

How do they do this?

I've been looking at https://huggingface.co/models, but I have found no model that could do this out of the box.

I was thinking that perhaps they process the video frame by frame with a model like Leffa or SDXL-inpaint, but if each frame is processed with no context of the surrounding frames, wouldn't that cause a lot of incoherence? I certainly believe so.

Perhaps the video is segmented into masks of objects and then this is used to map the input character to a character in the video? Sounds to me like quite a reasonable way to go about it.
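As a rough sketch of that idea (my own guess at one building block, not how Viggle actually works), per-frame character masks can be pulled from a video with OpenCV and an off-the-shelf segmentation model such as torchvision's DeepLabV3; the input file name is hypothetical:

import cv2
import torch
from torchvision.models.segmentation import DeepLabV3_ResNet50_Weights, deeplabv3_resnet50

weights = DeepLabV3_ResNet50_Weights.DEFAULT
model = deeplabv3_resnet50(weights=weights).eval()
preprocess = weights.transforms()
PERSON = weights.meta["categories"].index("person")

cap = cv2.VideoCapture("input_video.mp4")  # hypothetical input file
frame_idx = 0
while True:
    ok, frame = cap.read()
    if not ok:
        break
    # BGR -> RGB, then run the segmentation model on the frame.
    rgb = cv2.cvtColor(frame, cv2.COLOR_BGR2RGB)
    batch = preprocess(torch.from_numpy(rgb).permute(2, 0, 1)).unsqueeze(0)
    with torch.no_grad():
        out = model(batch)["out"][0]
    # Keep only pixels classified as "person" and scale the mask back to frame size.
    mask = (out.argmax(0) == PERSON).byte().numpy() * 255
    mask = cv2.resize(mask, (frame.shape[1], frame.shape[0]), interpolation=cv2.INTER_NEAREST)
    cv2.imwrite(f"mask_{frame_idx:05d}.png", mask)
    frame_idx += 1
cap.release()

That only gets you per-frame masks, though; the hard part is keeping them temporally consistent and rendering the new character into them.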

Let's say we have a mask of a character that stretches throughout a certain scene. Now what? Well I'm not too sure to be honest.

What strikes me as challenging is:

  • how the consistency is kept with surrounding frames in mind
  • how to go about handling scenarios where, e.g., the character is turned around with their back facing the camera

How can the orientation be predicted from a single image? Is a 3D representation of the input character created and mapped to a character's mask from the video?

Anyone got any advice?

submitted by /u/drippedoutong