A 6-second video in 60 seconds at this quality is mind-blowing!!! LTXV Distilled won my and my graphics card's heart 💖💝
I used this workflow someone posted here and replaced the LLM node with the LTXV prompt enhancer.
I've been testing the new 0.9.6 model that came out today on dozens of images, and honestly feel like 90% of the outputs are usable. With previous versions I'd have to generate 10-20 results to get something decent. Workflow: I'm using the official workflow they've shared on GitHub with some adjustments to the parameters, plus a prompt-enhancement LLM node with ChatGPT (you can replace it with any LLM node, local or API). The workflow is organized in a manner that makes sense to me and feels very comfortable.
New model – new AT-J LoRA: https://civitai.com/models/1483540?modelVersionId=1678127 I think HiDream has a bright future as a potential new base model. Training is very smooth (but a bit expensive or slow... pick one), though that's probably only a temporary problem until the nerds finish their optimization work and my toaster can train LoRAs. It's probably too good of a model, meaning it will also learn the bad properties of your source images pretty well, as you'll probably notice if you look too closely. Images should all include the prompt and the ComfyUI workflow. Currently trying out training the kind of models that would get me banned here, but you will find them on the stable diffusion subs for grown-ups when they're done. Looking promising so far!
With all the new stuff coming out, I've been seeing a lot of posts and error threads about various issues with CUDA/PyTorch/Sage Attention/Triton/Flash Attention. I was tired of digging up links, so I initially made this as a cheat sheet for myself, then expanded it in the hope that it will help some of you get your venvs and systems running smoothly.
To list all installed versions of Python on your system, open cmd and run:
py -0p
You can have multiple versions installed on your system. The version of Python that runs when you type python is determined by the order of Python directories in your PATH variable: the first python.exe found is used as the default.
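To quickly see which python.exe is winning on your PATH right now (a standard Windows command, not specific to any of the tools below), run:

where python

The first path listed is the one that runs when you type python.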
Steps:
1. Search for Environment Variables in the Windows search bar and select Edit the system environment variables.
2. Find the Path variable, then click Edit.
3. Move your preferred Python folder (e.g. C:\Users\<yourname>\AppData\Local\Programs\Python\Python310\ and its Scripts subfolder) to the top of the list, above any other Python versions.
4. Restart your command prompt and run:
python --version
It should now display your chosen Python version.
To see which CUDA version is currently active, run:
nvcc --version
Download and install from the official NVIDIA CUDA Toolkit page:
https://developer.nvidia.com/cuda-toolkit-archive
Install the version that you need; multiple versions can be installed side by side.
To check which CUDA version the CUDA_PATH environment variable points to:
1. Type env in the Windows search bar and open Edit the system environment variables.
2. Look for the CUDA_PATH variable.
3. If it doesn't point to your intended CUDA version, change it. Example value:
C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v12.4
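To quickly confirm what CUDA_PATH is set to in your current cmd session (a plain Windows command, just for checking), run:

echo %CUDA_PATH%

If the output doesn't match the version you expect, update the variable as described above and open a new command prompt.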
From this point on, to install any of these into a virtual environment you first need to activate it. For a system-wide install, skip this part and run the commands as-is.
Open a command prompt in your venv/python folder (folder name might be different) and run:
Scripts\activate
You will now see (venv) in your cmd prompt. You can now run the pip commands as normal.
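If you want to double-check that the activated interpreter really is the one inside your venv (and not a system Python), this one-liner prints the path of the running interpreter:

python -c "import sys; print(sys.executable)"

It should point into your venv's Scripts folder.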
Make a versioncheck.py file, open it with any text/code editor, and paste the code below. Then open a CMD in the same folder and run it with:

python versioncheck.py

This will print the versions of torch, CUDA, torchvision, torchaudio, Triton, SageAttention and FlashAttention. To use this in a venv, activate the venv first, then run the script.
import torch
import torchvision
import torchaudio

print("torch version:", torch.__version__)
print("cuda version (torch):", torch.version.cuda)
print("torchvision version:", torchvision.__version__)
print("torchaudio version:", torchaudio.__version__)
print("cuda available:", torch.cuda.is_available())

try:
    import flash_attn
    print("flash-attention version:", flash_attn.__version__)
except ImportError:
    print("flash-attention is not installed or cannot be imported")

try:
    import triton
    print("triton version:", triton.__version__)
except ImportError:
    print("triton is not installed or cannot be imported")

try:
    import sageattention
    print("sageattention version:", sageattention.__version__)
except ImportError:
    print("sageattention is not installed or cannot be imported")
except AttributeError:
    print("sageattention is installed but has no __version__ attribute")
Example output:
torch version: 2.6.0+cu126
cuda version (torch): 12.6
torchvision version: 0.21.0+cu126
torchaudio version: 2.6.0+cu126
cuda available: True
flash-attention version: 2.7.4
triton version: 3.2.0
sageattention is installed but has no __version__ attribute
Use the official install selector to get the correct command for your system:
Install PyTorch
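As an illustration only (always take the exact command from the selector for your OS, package manager and CUDA version), the pip command it generates for a CUDA 12.6 build looks roughly like this:

pip install torch torchvision torchaudio --index-url https://download.pytorch.org/whl/cu126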
To install Triton for Windows, run:
pip install triton-windows
For a specific version:
pip install triton-windows==3.2.0.post18
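To confirm Triton imports correctly after installation (activate your venv first if you use one):

python -c "import triton; print(triton.__version__)"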
Triton Windows releases and info:
Get the correct prebuilt Sage Attention wheel for your system here:
pip install sageattention "path to downloaded wheel"
Example:
pip install sageattention "D:\sageattention-2.1.1+cu124torch2.5.1-cp310-cp310-win_amd64.whl"
The filename sageattention-2.1.1+cu124torch2.5.1-cp310-cp310-win_amd64.whl translates to: compatible with CUDA 12.4 | PyTorch 2.5.1 | Python 3.10.
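To print the exact versions you need to match when picking a wheel — your Python version, your PyTorch version, and the CUDA build PyTorch was compiled against — you can run this one-liner (it assumes torch is already installed):

python -c "import sys, torch; print(sys.version.split()[0], torch.__version__, torch.version.cuda)"

For the example wheel above, you would want it to report Python 3.10.x, torch 2.5.1 and CUDA 12.4.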
Get the correct prebuilt Flash Attention wheel compatible with your Python version here:
pip install "path to downloaded wheel"
You can create a new Python venv in your root folder with the following command. Change C:\path\to\python310 to match your required version of Python.
"C:\path\to\python310\python.exe" -m venv venv
To activate it and start installing dependencies (replace your_env_name with the folder name you used above, e.g. venv):

your_env_name\Scripts\activate
Most projects come with a requirements.txt. To install it into your venv:
pip install -r requirements.txt
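Putting the venv steps together, a typical end-to-end sequence (paths and folder names are placeholders, adjust to your project) looks like this:

"C:\path\to\python310\python.exe" -m venv venv
venv\Scripts\activate
pip install -r requirements.txt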
The model weights and code are fully open-sourced and available now! Via their README: Run First-Last-Frame-to-Video Generation. First-Last-Frame-to-Video is also divided into processes with and without the prompt extension step. Currently, only 720P is supported. The specific parameters and corresponding settings are as follows:

| Task | 480P | 720P | Model |
|------|------|------|-------|
| flf2v-14B | ❌ | ✔️ | Wan2.1-FLF2V-14B-720P |
It's a work in progress by Kijai. Followed this method and it's working for me on Windows:

git clone https://github.com/kijai/ComfyUI-FramePackWrapper into your Custom Nodes folder, then:
cd ComfyUI-FramePackWrapper

Download the model (BF16 or FP8): https://huggingface.co/Kijai/HunyuanVideo_comfy/tree/main

A workflow is included inside the ComfyUI-FramePackWrapper folder: https://github.com/kijai/ComfyUI-FramePackWrapper/tree/main/example_workflows
I have been doing AI artwork with Stable Diffusion and beyond (Flux and now HiDream) for over 2.5 years, and I am still impressed by the things that can be made with just a prompt. This image was made on an RTX 4070 12GB in ComfyUI with hidream-i1-dev-Q8.gguf. The prompt adherence is pretty amazing: it took me just 4 or 5 tweaks to the prompt to get this, and the tweaks were just to keep adding and being more and more specific about what I wanted. Here is the prompt: "tarot card in the style of alphonse mucha, the card is the death card. the art style is art nouveau, it has death personified as skeleton in armor riding a horse and carrying a banner, there are adults and children on the ground around them, the scene is at night, there is a castle far in the background, a priest and man and women are also on the ground around the feet of the horse, the priest is laying on the ground apparently dead"
https://github.com/MNeMoNiCuZ/FramePack-Batch
FramePack Batch Processor is a command-line tool that processes a folder of images and transforms them into animated videos using the FramePack I2V model. It lets you batch process multiple images without the Gradio web interface, and it can also extract and reuse the prompt stored in your original image, if it's saved in the EXIF metadata (as A1111 and other tools do).
https://github.com/lllyasviel/FramePack
Setup:
1. Run venv_create.bat to set up your environment.
2. Run pip install -r requirements-batch.txt in your virtual environment.

The script will create:
- venv_activate.bat for activating the environment
- venv_update.bat for updating pip

Usage:
1. Place your images in the input folder.
2. Run:
python batch.py [optional input arguments]
3. Generated videos are saved in the outputs folder and alongside the original images.

Options:
--input_dir PATH       Directory containing input images (default: ./input)
--output_dir PATH      Directory to save output videos (default: ./outputs)
--prompt TEXT          Prompt to guide the generation (default: "")
--seed NUMBER          Random seed, -1 for random (default: -1)
--use_teacache         Use TeaCache - faster but may affect hand quality (default: True)
--video_length FLOAT   Total video length in seconds, range 1-120 (default: 1.0)
--steps NUMBER         Number of sampling steps, range 1-100 (default: 5)
--distilled_cfg FLOAT  Distilled CFG scale, range 1.0-32.0 (default: 10.0)
--gpu_memory FLOAT     GPU memory preservation in GB, range 6-128 (default: 6.0)
--use_image_prompt     Use prompt from image metadata if available (default: True)
--overwrite            Overwrite existing output videos (default: False)
Process all images in the input folder with default settings:
python batch.py
Generate longer videos with more sampling steps:
python batch.py --video_length 10 --steps 25
Apply the same prompt to all images:
python batch.py --prompt "A character doing some simple body movements"
Extract and use prompts embedded in image metadata:
python batch.py --use_image_prompt
By default, the processor skips images that already have corresponding videos. To regenerate them:
python batch.py --overwrite
Process images from a different folder:
python batch.py --input_dir "my_images" --output_dir "my_videos"
The script automatically detects your available VRAM and adjusts its operation mode accordingly. You can adjust the amount of preserved memory with the --gpu_memory option if you encounter out-of-memory errors.
Tips:
- Increase --steps for higher quality animations (but slower processing).
- Use --video_length to control the duration of the generated videos.
- Pass --use_teacache false if you notice hand-quality issues (TeaCache is faster but may affect hands).
Hi, I've been trying to recreate this user's image, but it doesn't look right. I'm using the HassakuXL checkpoint and some LoRAs. The images I generate lack that distinctive essence; it feels like the character isn't properly integrated with the background, and their expressions and eyes look mediocre. I'd like to get some advice on how to improve the image to make it look good, including lighting, shadows, background, particles, expressions, etc. Do I need to download a specific LoRA or checkpoint, or is it maybe the prompt?
Created with HiDream dev.
GitHub: https://github.com/Tencent/InstantCharacter The model weights + code are finally open-sourced! InstantCharacter is an innovative, tuning-free method designed to achieve character-preserving generation from a single image, supporting a variety of downstream tasks. This is basically a much better InstantID that operates on Flux.
I know for Flux there's FluxGym, which makes it pretty straightforward to train LoRAs specifically for Flux models.
Is there an equivalent tool or workflow for training LoRAs that are compatible with HiDream AI? Any pointers or resources would be super appreciated. Thanks in advance!