Awesome LTX-2

A curated list of models, text encoders, and tools for the LTX-2 video generation suite.

Intro

▓ Apps & Tools

LTX2.3-Multifunctional

LTX2.3-Multifunctional is a desktop-optimized version of LTX that lowers GPU requirements and simplifies usage. It integrates all features including image-to-video, text-to-video, start/end frames, lip-sync, video enhancement, and image generation into a single application.

Key Features:

  • Lower GPU Requirements: Only needs 24GB VRAM (vs 32GB for standard desktop version)
  • All-in-One Interface: No complex ComfyUI workflows or error-prone nodes
  • Features: T2V, I2V, start/end frames, lip-sync, video enhancement, image generation, LoRA support
  • Multi-Frame Insertion: Two modes for generating long videos
  • Easy Setup: No third-party software required, just install LTX desktop

Downloads & Resources:

▓ Models

LTX-2 models are available in various formats including full weights, transformers-only, and GGUF quantizations for efficient inference.

▣ Checkpoints

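| Ver | Name | Precision | Size | Download |
| --- | --- | --- | --- | --- |
| 2.3 | ltx-2.3-22b dev | bf16 | 46.1 GB | |
| 2.3 | ltx-2.3-22b dev | fp8 | 29.1 GB | |
| 2.3 | ltx-2.3-22b dev | fp8 | 29.9 GB | |
| 2.3 | ltx-2.3-22b dev | int8 | 29.1 GB | |
| 2.3 | ltx-2.3-22b dev | nvfp4 | 21.7 GB | |
| 2.3 | ltx-2.3-22b dev | fp8 | 29.1 GB | |
| 2.3 | ltx-2.3-22b distilled | bf16 | 46.1 GB | |
| 2.3 | ltx-2.3-22b distilled | fp8 | 29.5 GB | |
| 2.3 | ltx-2.3-22b distilled | fp8 | 29.9 GB | |
| 2.3 | ltx-2.3-22b distilled | int8tensormixed | 29.1 GB | |
| 2.3 | ltx-2.3-22b distilled | nvfp4 | 17.6 GB | |
| 2.3 | ltx-2.3-22b distilled | mxfp8mixed | 29.7 GB | |
| 2.3 | ltx-2.3-22b distilled 1.1 | bf16 | 46.1 GB | |
| 2 | ltx-2-19b dev | bf16 | 43.3 GB | |
| 2 | ltx-2-19b dev | fp8 | 27.1 GB | |
| 2 | ltx-2-19b dev | fp4 | 20 GB | |
| 2 | ltx-2-19b distilled | bf16 | 43.3 GB | |
| 2 | ltx-2-19b distilled | fp8 | 27.1 GB | |
| 2 | ltx-2-19b distilled | nvfp4 | 20 GB | |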

Quantized to fp8_e5m2 to support older Triton and PyTorch versions on RTX 30-series GPUs. Intended for WanGP in Pinokio.

| Ver | Name | Precision | Size | Download |
| --- | --- | --- | --- | --- |
| 2 | ltx-2-19b dev | fp8_e5m2 | 27.1 GB | |

silveroxides Quantizations (mxfp8)

Note: The mxfp8mixed quantization requires a custom fork of ComfyUI-Kitchen with mxfp8 support. Standard ComfyUI installations may not support this quantization format.

| Model | Quant | Size | Download |
| --- | --- | --- | --- |
| ltx-2.3-22b-dev | int8mixedtensorwise | 29.2 GB | |
| ltx-2.3-22b-distilled | int8tensormixed | 29.1 GB | |
| ltx-2.3-22b-distilled | int8mixedtensorwise | 29.2 GB | |
| ltx-2.3-22b-distilled | mxfp8mixed | 29.7 GB | |

Distilled LoRA

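| Ver | Rank | Precision | Size | Download |
| --- | --- | --- | --- | --- |
| 2.3 | 384 | bf16 | 7.61 GB | |
| 2.3 | 208 | bf16 | 4.97 GB | |
| 2.3 | 159 | bf16 | 3.83 GB | |
| 2.3 | 111 | bf16 | 2.74 GB | |
| 2.3 | 105 | bf16 | 2.59 GB | |
| 2 | 384 | bf16 | 7.67 GB | |
| 2 | 242 | bf16 | 4.88 GB | |
| 2 | 175 | bf16 | 3.58 GB | |
| 2 | 175 | fp8 | 1.79 GB | |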

▣ TenStrip Distilled LoRA Experiments

Experimental distilled LoRAs optimized for finetunes and I2V workflows. These LoRAs avoid the issues of the massive rank 384 official LoRA which can be counterproductive with conditioned inputs and finetunes.

| LoRA | Rank | Size | Description |
| --- | --- | --- | --- |
| ltx-2.3-22b-distilled-lora-1.1_fro90_ceil36 | 36 | 739 MB | Compact LoRA with dynamic ceiling at 36 |
| ltx-2.3-22b-distilled-lora-1.1_fro90_ceil72_condsafe | 72 | 662 MB | Cond-safe version with cross-attention bridges, adaln/scale-shift tables, gate logits, and prompt scale-shift zeroed. Much better suited for I2V and input-conditioned workflows. Can use 1.0 strength safely on first-pass I2V. |
| ltx-2.3-22b-distilled-lora-fro90_ceil72 | 72 | 1.4 GB | Standard version with higher dynamic ceiling |

Notes:

  • Lower rank LoRAs (72 and below) can be used at 1.0 strength safely for I2V first pass, with upscale pass at 0.4-0.5 strength
  • _ceil suffix indicates the dynamic ceiling during reranking
  • _condsafe suffix indicates cross-attention and other conditioning layers have been zeroed for better I2V compatibility
  • The official rank 384 LoRA can actively dampen conditioning signals in I2V workflows; cond_safe versions work much better

Download All LoRAs

Spatial Upscaler

Required for current two-stage pipeline implementations in this repository. Download to COMFYUI_ROOT_FOLDER/models/latent_upscale_models folder.

| Ver | Name | Size | Download |
| --- | --- | --- | --- |
| 2.3 | spatial-upscaler x2 1.0 | 996 MB | |
| 2.3 | spatial-upscaler x1.5 1.0 | 1.09 GB | |
| 2 | spatial-upscaler x2 1.0 | 1.05 GB | |

Temporal Upscaler

Required for current two-stage pipeline implementations in this repository. Download to COMFYUI_ROOT_FOLDER/models/latent_upscale_models folder.

| Ver | Name | Size | Download |
| --- | --- | --- | --- |
| 2.3 | temporal-upscaler x2 1.0 | 262 MB | |
| 2 | temporal-upscaler x2 1.0 | 262 MB | |
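
Both upscaler sections name the same destination folder. As a small sketch, the path can be resolved like this. Reading `COMFYUI_ROOT_FOLDER` from an environment variable and the helper name `latent_upscale_dir` are assumptions made for illustration; the list above only uses `COMFYUI_ROOT_FOLDER` as a placeholder for your ComfyUI install path.

```python
import os
from pathlib import Path


def latent_upscale_dir(comfy_root):
    """Return the folder where the spatial/temporal upscaler files are expected."""
    return Path(comfy_root) / "models" / "latent_upscale_models"


# Hypothetical convention: take the ComfyUI root from an environment variable,
# falling back to the current directory when it is not set.
root = os.environ.get("COMFYUI_ROOT_FOLDER", ".")
target = latent_upscale_dir(root)
print(f"Place upscaler files in: {target}")
```

The sketch only builds the path; it does not create the folder or download anything.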

▣ Merges

Custom merged models combining multiple control signals or specialized configurations.

| Ver | Name | Description | Download |
| --- | --- | --- | --- |
| 2.3 | ltx-2.3-22b-distilled-1.1-fused-union-control | Merged model combining Canny, Depth, and Pose control signals for unified control | |

══════════════════════════════════

▣ Finetunes

Community finetuned models based on LTX-2.3 with specialized improvements and optimizations.

| Model | Description |
| --- | --- |
| | High-performance LoRA-integrated checkpoint family based on LTX 2.3. Includes both distilled (4-step) and non-distilled variants (20-30 steps). Recommended sampler: Euler + Simple/Normal/Linear_Quadratic. |
| | I2V-optimized merge using layer-scaled merges at different steps. Not a straight weight merge: it behaves much better than standard LoRA loading and respects prompts better. Includes BF16 full checkpoint and fp8_mixed_learned quantized versions. |
| | Uncensored video generation model based on LTX 2.3 supporting T2V and I2V natively. Includes a built-in prompt enhancer. Merge base for 10Eros. Supports GGUF format. |

══════════════════════════════════

▣ GGUF Quantized Models

These models are optimized for lower memory usage. Note that in ComfyUI, these are typically loaded as transformer-only models.

QuantStack

QuantStack LTX-2.3

| Model | Quant | Size | Download |
| --- | --- | --- | --- |
| ltx-2.3-22b | Q2_K | 12.4 GB | dev, distilled, distilled-1.1 |
| ltx-2.3-22b | Q3_K_M | 14.7 GB | dev, distilled, distilled-1.1 |
| ltx-2.3-22b | Q3_K_S | 14 GB | dev, distilled, distilled-1.1 |
| ltx-2.3-22b | Q4_K_M | 17.8 GB | dev, distilled, distilled-1.1 |
| ltx-2.3-22b | Q4_K_S | 16.7 GB | dev, distilled, distilled-1.1 |
| ltx-2.3-22b | Q5_K_M | 19.4 GB | dev, distilled, distilled-1.1 |
| ltx-2.3-22b | Q5_K_S | 18.5 GB | dev, distilled, distilled-1.1 |
| ltx-2.3-22b | Q6_K | 21 GB | dev, distilled, distilled-1.1 |
| ltx-2.3-22b | Q8_0 | 25.5 GB | dev, distilled, distilled-1.1 |
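
When choosing among these quantizations, one rough approach is to take the largest file that fits a size budget. The sketch below uses the QuantStack LTX-2.3 sizes listed above; the function name `pick_quant` and the idea of a single "budget" number are assumptions for illustration. File size is only a proxy: actual VRAM use also depends on the text encoder, VAE, resolution, and offloading settings.

```python
# File sizes in GB, copied from the QuantStack LTX-2.3 table.
QUANT_SIZES_GB = {
    "Q2_K": 12.4,
    "Q3_K_S": 14.0,
    "Q3_K_M": 14.7,
    "Q4_K_S": 16.7,
    "Q4_K_M": 17.8,
    "Q5_K_S": 18.5,
    "Q5_K_M": 19.4,
    "Q6_K": 21.0,
    "Q8_0": 25.5,
}


def pick_quant(budget_gb, sizes=QUANT_SIZES_GB):
    """Return the largest quant whose file size fits the budget, or None."""
    fitting = {name: size for name, size in sizes.items() if size <= budget_gb}
    if not fitting:
        return None
    return max(fitting, key=fitting.get)


for budget in (10.0, 18.0, 30.0):
    print(budget, "GB budget ->", pick_quant(budget))
```

Leave headroom in the budget rather than matching it to total VRAM, since the transformer weights are not the only thing resident on the GPU.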

QuantStack LTX-2

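| Model | Quant | Size | Download |
| --- | --- | --- | --- |
| LTX-2-dev | Q2_K | 8.03 GB | |
| LTX-2-dev | Q3_K_M | 10.3 GB | |
| LTX-2-dev | Q3_K_S | 9.57 GB | |
| LTX-2-dev | Q4_K_M | 13.4 GB | |
| LTX-2-dev | Q4_K_S | 12.3 GB | |
| LTX-2-dev | Q5_K_M | 15 GB | |
| LTX-2-dev | Q5_K_S | 14.2 GB | |
| LTX-2-dev | Q6_K | 16.6 GB | |
| LTX-2-dev | Q8_0 | 21.1 GB | |

Unsloth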

Unsloth LTX-2.3 GGUF

| Model | Quant | Size | Download |
| --- | --- | --- | --- |
| ltx-2.3-22b | BF16 | 42 GB | dev, distilled |
| ltx-2.3-22b | F16 | 42 GB | dev, distilled |
| ltx-2.3-22b | Q2_K | 8.28 GB | dev, distilled |
| ltx-2.3-22b | Q3_K_M | 10.8 GB | dev, distilled |
| ltx-2.3-22b | Q3_K_S | 9.95 GB | dev, distilled |
| ltx-2.3-22b | Q4_0 | 12.7 GB | dev, distilled |
| ltx-2.3-22b | Q4_1 | 13.8 GB | dev, distilled |
| ltx-2.3-22b | Q4_K_M | 14.3 GB | dev, distilled |
| ltx-2.3-22b | Q4_K_S | 13.1 GB | dev, distilled |
| ltx-2.3-22b | Q5_0 | 15.3 GB | dev, distilled |
| ltx-2.3-22b | Q5_1 | 16.3 GB | dev, distilled |
| ltx-2.3-22b | Q5_K_M | 16.1 GB | dev, distilled |
| ltx-2.3-22b | Q5_K_S | 15.2 GB | dev, distilled |
| ltx-2.3-22b | Q6_K | 17.8 GB | dev, distilled |
| ltx-2.3-22b | Q8_0 | 22.8 GB | dev, distilled |
| ltx-2.3-22b | UD-Q2_K | 9.5 GB | dev, distilled |
| ltx-2.3-22b | UD-Q3_K_M | 13.5 GB | dev, distilled |
| ltx-2.3-22b | UD-Q3_K_S | 11.4 GB | dev, distilled |
| ltx-2.3-22b | UD-Q4_K_M | 16.5 GB | dev, distilled |
| ltx-2.3-22b | UD-Q4_K_S | 14.2 GB | dev, distilled |
| ltx-2.3-22b | UD-Q5_K_M | 18.3 GB | dev, distilled |
| ltx-2.3-22b | UD-Q5_K_S | 16.3 GB | dev, distilled |

Unsloth LTX-2.3 GGUF - Distilled 1.1

| Model | Quant | Size | Download |
| --- | --- | --- | --- |
| ltx-2.3-22b | BF16 | 42 GB | distilled-1.1 |
| ltx-2.3-22b | F16 | 42 GB | distilled-1.1 |
| ltx-2.3-22b | Q2_K | 7.94 GB | distilled-1.1 |
| ltx-2.3-22b | Q3_K_M | 10.6 GB | distilled-1.1 |
| ltx-2.3-22b | Q3_K_S | 9.74 GB | distilled-1.1 |
| ltx-2.3-22b | Q4_K_M | 14.2 GB | distilled-1.1 |
| ltx-2.3-22b | Q4_K_S | 13 GB | distilled-1.1 |
| ltx-2.3-22b | Q5_K_M | 15.9 GB | distilled-1.1 |
| ltx-2.3-22b | Q5_K_S | 15 GB | distilled-1.1 |
| ltx-2.3-22b | Q6_K | 17.8 GB | distilled-1.1 |
| ltx-2.3-22b | Q8_0 | 22.8 GB | distilled-1.1 |
| ltx-2.3-22b | UD-Q2_K | 10.9 GB | distilled-1.1 |
| ltx-2.3-22b | UD-Q3_K_M | 13.4 GB | distilled-1.1 |
| ltx-2.3-22b | UD-Q4_K_M | 16.4 GB | distilled-1.1 |
| ltx-2.3-22b | UD-Q4_K_S | 14.1 GB | distilled-1.1 |
| ltx-2.3-22b | UD-Q5_K_M | 18.2 GB | distilled-1.1 |

Unsloth LTX-2 GGUF

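| Model | Quant | Size | Download |
| --- | --- | --- | --- |
| ltx-2-19b-dev | BF16 | 37.8 GB | |
| ltx-2-19b-dev | F16 | 37.8 GB | |
| ltx-2-19b-dev | UD-Q2_K_L | 10.1 GB | |
| ltx-2-19b-dev | UD-Q2_K_XL | 11.6 GB | |
| ltx-2-19b-dev | Q2_K | 8.1 GB | |
| ltx-2-19b-dev | Q3_K_L | 10.7 GB | |
| ltx-2-19b-dev | Q3_K_M | 10.1 GB | |
| ltx-2-19b-dev | Q3_K_S | 9.47 GB | |
| ltx-2-19b-dev | Q4_0 | 11.3 GB | |
| ltx-2-19b-dev | Q4_1 | 12.3 GB | |
| ltx-2-19b-dev | Q4_K_M | 12.8 GB | |
| ltx-2-19b-dev | Q4_K_S | 11.9 GB | |
| ltx-2-19b-dev | Q5_0 | 13.7 GB | |
| ltx-2-19b-dev | Q5_1 | 14.6 GB | |
| ltx-2-19b-dev | Q5_K_M | 14.3 GB | |
| ltx-2-19b-dev | Q5_K_S | 13.6 GB | |
| ltx-2-19b-dev | Q6_K | 16 GB | |
| ltx-2-19b-dev | Q8_0 | 20.4 GB | |

Vantage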

Vantage AI GGUFs

| Model | Quant | Size | Download |
| --- | --- | --- | --- |
| ltx-2-19b-dev | Q3_K_M | 9.96 GB | |
| ltx-2-19b-dev | Q3_K_S | 9.28 GB | |
| ltx-2-19b-dev | Q4_0 | 11.6 GB | |
| ltx-2-19b-dev | Q4_1 | 12.4 GB | |
| ltx-2-19b-dev | Q4_K_M | 12.8 GB | |
| ltx-2-19b-dev | Q4_K_S | 11.8 GB | |
| ltx-2-19b-dev | Q5_0 | 13.6 GB | |
| ltx-2-19b-dev | Q5_1 | 14.5 GB | |
| ltx-2-19b-dev | Q5_K_M | 14.4 GB | |
| ltx-2-19b-dev | Q5_K_S | 13.5 GB | |
| ltx-2-19b-dev | Q6_K | 15.9 GB | |
| ltx-2-19b-dev | Q8_0 | 20.4 GB | |
| ltx-2-19b-distilled | Q3_K_M | 9.96 GB | |
| ltx-2-19b-distilled | Q3_K_S | 9.28 GB | |
| ltx-2-19b-distilled | Q4_0 | 11.6 GB | |
| ltx-2-19b-distilled | Q4_1 | 12.4 GB | |
| ltx-2-19b-distilled | Q4_K_M | 12.8 GB | |
| ltx-2-19b-distilled | Q4_K_S | 11.8 GB | |
| ltx-2-19b-distilled | Q5_0 | 13.6 GB | |
| ltx-2-19b-distilled | Q5_1 | 14.5 GB | |
| ltx-2-19b-distilled | Q5_K_M | 14.4 GB | |
| ltx-2-19b-distilled | Q5_K_S | 13.5 GB | |
| ltx-2-19b-distilled | Q6_K | 15.9 GB | |
| ltx-2-19b-distilled | Q8_0 | 20.4 GB | |

Special Quantization: PolarQuant Q5

LTX-2.3 (22B) — PolarQuant Q5 is a bit-packed quantization method using Hadamard-Rotated Lloyd-Max Quantization. It achieves optimal Gaussian weight quantization via Hadamard rotation, delivering near-lossless quality with significant size reduction.
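
To make the idea of "Hadamard rotation followed by Lloyd-Max quantization" more concrete, here is a toy sketch in pure Python. It is not the PolarQuant implementation: the function names are hypothetical, the data is synthetic Gaussian noise, the codebook is fitted on the data itself, and bit-packing of the 5-bit codes is omitted. It only illustrates the two steps named above, rotating a weight vector with an orthonormal Walsh-Hadamard transform and then fitting a 32-level scalar codebook with Lloyd's algorithm.

```python
import math
import random


def fwht(values):
    """Orthonormal fast Walsh-Hadamard transform (length must be a power of two)."""
    out = list(values)
    n = len(out)
    if n == 0 or n & (n - 1):
        raise ValueError("length must be a power of two")
    h = 1
    while h < n:
        for start in range(0, n, 2 * h):
            for i in range(start, start + h):
                a, b = out[i], out[i + h]
                out[i], out[i + h] = a + b, a - b
        h *= 2
    scale = 1.0 / math.sqrt(n)
    return [v * scale for v in out]


def nearest(codebook, x):
    """Index of the codebook entry closest to x."""
    return min(range(len(codebook)), key=lambda k: abs(codebook[k] - x))


def lloyd_max(samples, levels=32, iters=25):
    """Fit a scalar codebook with Lloyd's algorithm (1-D k-means)."""
    ordered = sorted(samples)
    n = len(ordered)
    # Start from evenly spaced quantiles of the data.
    codebook = [ordered[(2 * k + 1) * n // (2 * levels)] for k in range(levels)]
    for _ in range(iters):
        sums = [0.0] * levels
        counts = [0] * levels
        for x in samples:
            idx = nearest(codebook, x)
            sums[idx] += x
            counts[idx] += 1
        # Move each level to the mean of its cell; keep empty cells unchanged.
        codebook = sorted(
            sums[k] / counts[k] if counts[k] else codebook[k] for k in range(levels)
        )
    return codebook


def quantize(weights, levels=32):
    """Rotate, fit a codebook, and return (codes, codebook); 32 levels = 5 bits."""
    rotated = fwht(weights)
    codebook = lloyd_max(rotated, levels)
    return [nearest(codebook, x) for x in rotated], codebook


def dequantize(codes, codebook):
    """Look up the levels and undo the rotation (the orthonormal FWHT is its own inverse)."""
    return fwht([codebook[c] for c in codes])


def cosine(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    return dot / math.sqrt(sum(x * x for x in a) * sum(y * y for y in b))


random.seed(0)
weights = [random.gauss(0.0, 0.02) for _ in range(1024)]  # synthetic stand-in for a weight row
codes, codebook = quantize(weights)
print("cosine similarity:", cosine(weights, dequantize(codes, codebook)))
```

The rotation is there because an orthonormal Hadamard transform spreads outliers across coordinates, which tends to make the values look more Gaussian, the case a Lloyd-Max codebook handles well. A real implementation would use a fixed or per-block codebook and pack the 5-bit indices.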

| Specification | Value |
| --- | --- |
| Parameters | 22B |
| Transformer Blocks | 48 |
| Hidden Dimension | 4096 |
| Layers Quantized | 1,347 (of 5,947 total tensors) |

Compression Statistics:

| Component | Original Size | PQ5 Packed | Reduction |
| --- | --- | --- | --- |
| Transformer (1,347 layers) | 37 GB | 4.6 GB | -88% |
| VAE + Skip (4,600 layers) | 9.1 GB | 9.1 GB | BF16 kept |
| Upscalers | 1.3 GB | 1.3 GB | BF16 kept |
| Total | 46.2 GB | 15 GB | -68% |
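
The reduction percentages can be reproduced from the listed sizes with a few lines of arithmetic. This is only a check of the table's figures; the helper name `reduction_percent` is made up for the example.

```python
def reduction_percent(original_gb, packed_gb):
    """Size reduction as a whole-number percentage of the original size."""
    return round((1 - packed_gb / original_gb) * 100)


transformer = reduction_percent(37, 4.6)   # transformer row
total = reduction_percent(46.2, 15)        # total row
packed_sum = round(4.6 + 9.1 + 1.3, 1)     # packed components added together

print(f"transformer: -{transformer}%")
print(f"total: -{total}%")
print(f"packed components sum to {packed_sum} GB")
```

The packed components add up to the 15 GB total. The listed original sizes (37 + 9.1 + 1.3) add up to 47.4 GB, slightly above the 46.2 GB shown in the total row; the sketch uses the table's figures as given.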

Quality Metrics:

  • Cosine Similarity: 0.9986 (near-lossless)
  • Download Size: 15 GB
  • Beats torchao INT4 on perplexity (PPL)

Hardware Requirements:

| GPU | VRAM | Status |
| --- | --- | --- |
| A100 (80 GB) | 80 GB | Full speed |
| A100 (40 GB) | 40 GB | Recommended |
| RTX 4090 (24 GB) | 24 GB | With offloading |

Key Features:

  • Mixed precision approach: transformer heavily quantized (-88%) while VAE remains BF16
  • 5-bit bit-packed representation (Q5)
  • 50-65% smaller than the original with near-lossless quality
  • One-command setup with easy generation wrapper

| Model | Size | Download |
| --- | --- | --- |
| LTX-2.3-22B-PolarQuant-Q5 | 15 GB | |

Installation: pip install safetensors huggingface_hub scipy

ArXiv Reference: 2603.29078

◆◇◆◇◆◇◆◇◆◇◆◇◆◇◆◇◆◇◆◇◆◇◆◇◆

▓ Text Encoders

LTX-2 requires Gemma-3-12b variants. LTX-2.3 uses text projection layers.

Comfy-Org Optimized Encoders

Official and optimized versions for ComfyUI.

| Model Name | Size | Download |
| --- | --- | --- |
| gemma_3_12B_it | 24.4 GB | |
| gemma_3_12B_it_fpmixed | 13.7 GB | |
| gemma_3_12B_it_fp8_scaled | 13.2 GB | |
| gemma_3_12B_it_fp4_mixed | 9.5 GB | |
| gemma_3_12B_it-int8tensormixed | 13.2 GB | |
| gemma_3_12B_it-int8mixedblockwise | 13.6 GB | |
| gemma_3_12B_it-int8mixedtensorwise | 14.1 GB | |
| gemma_3_12B_it-int8tensormixed | 13.2 GB | |
| text_projection_fp8 | 1.16 GB | |

  • gemma_3_12B_it_fpmixed: Experimental quant. Should be better than the fp8 scaled version
  • gemma_3_12B_it_fp4_mixed: 90% fp4 layers

Note: The mxfp8mixed quantization requires a custom fork of ComfyUI-Kitchen with mxfp8 support. Standard ComfyUI installations may not support this quantization format.

Gemma-3-12b Abliterated

Why Choose Abliterated Encoders?

Standard Gemma models often incorporate safety alignment that "sanitizes" or weakens specific concepts within prompt embeddings. Even when the model doesn't explicitly refuse a request, this internal filtering can dilute creative intent. For LTX-2 video generation, using a standard encoder often results in:

  • Reduced Prompt Adherence: Key stylistic or descriptive terms may be ignored or weakened.
  • Visual Softening: Visual intensity and fine details are often "muted" to fit generic safety profiles.
  • Concept Dilution: Complex or niche creative requests are subtly altered, leading to less faithful representations of your vision.

Abliteration bypasses these restrictive alignment layers, allowing the encoder to translate your prompts into embeddings with maximum fidelity. This ensures LTX-2 receives the most accurate and un-filtered instructions possible.

Gemma-3-12b-Abliterated

Fixed versions of the abliterated Gemma-3-12b-it model by FusionCow, modified specifically for compatibility with LTX-2. The original model

| Model | Precision | Size | Download |
| --- | --- | --- | --- |
| Gemma ablit fixed | bf16 | 23.5 GB | |
| Gemma ablit fixed | fp8 | 13.8 GB | |

Gemma 3 12B IT Heretic

Models by DreamFast

Safetensors

| Model | Precision | Size | Download |
| --- | --- | --- | --- |
| Gemma_3_12B_it Heretic | bf16 | 23.5 GB | |
| Gemma_3_12B_it Heretic | fp8 | 12.8 GB | |

GGUF

| Quant | Size | Quality | Recommendation | Download |
| --- | --- | --- | --- | --- |
| F16 | 22 GB | Lossless | Reference, same as original | |
| Q8_0 | 12 GB | Excellent | Best quality quantization | |
| Q6_K | 9.0 GB | Very Good | High quality, good compression | |
| Q5_K_M | 7.9 GB | Good | Balanced quality/size | |
| Q5_K_S | 7.7 GB | Good | Slightly smaller Q5 | |
| Q4_K_M | 6.8 GB | Good | Still useful | |
| Q4_K_S | 6.5 GB | Decent | Smaller Q4 variant | |
| Q3_K_M | 5.6 GB | Acceptable | For very low VRAM only | |

Sikaworld1990 Gemma-3-12b Abliterated

NVFP4 quantization variants by Sikaworld1990 optimized for Blackwell GPUs.

| Model | Precision | Size | Download |
| --- | --- | --- | --- |
| Gemma-3-12b QAT Abliterated FP4 | NVFP4-HF | 12.1 GB | |
| Gemma-3-12b QAT Abliterated FP4 | NVFP4-Pure | 8.91 GB | |
| Gemma-3-12b HereticX Abliterated | bf16 | 15 GB | |
| Gemma-3-12b High-Fidelity Abliterated | bf16 | 14.1 GB | |

  • FP4-HF: High-fidelity mixed precision calibration
  • FP4-Pure: Pure FP4 quantization for maximum compression
  • HereticX: Uncensored variant with maximum prompt fidelity
  • High-Fidelity: Optimized for quality with better detail preservation

◆◇◆◇◆◇◆◇◆◇◆◇◆◇◆◇◆◇◆◇◆◇◆◇◆

▓ Separated Components

Separated checkpoints by Kijai for LTX-2 and LTX-2.3, offering an alternative way to load the models in ComfyUI.

▣ Diffusion Models (Transformer Only)

| Ver | Name | Precision | Size | Download |
| --- | --- | --- | --- | --- |
| 2.3 | ltx-2.3-22b dev | bf16 | 42 GB | |
| 2.3 | ltx-2.3-22b dev | fp8 | 23.5 GB | |
| 2.3 | ltx-2.3-22b dev | mxfp8_block32 | 24.1 GB | |
| 2.3 | ltx-2.3-22b dev | fp8_input_scaled | 25 GB | |
| 2.3 | ltx-2.3-22b distilled | bf16 | 42 GB | |
| 2.3 | ltx-2.3-22b distilled | fp8_input_scaled | 23.5 GB | |
| 2.3 | ltx-2.3-22b distilled v2 | fp8_input_scaled v2 | 23.2 GB | |
| 2.3 | ltx-2.3-22b distilled | fp8 | 23.5 GB | |
| 2.3 | ltx-2.3-22b distilled (experimental) | mxfp8 | 24.1 GB | |
| 2.3 | ltx-2.3-22b distilled 1.1 | bf16 | 42 GB | |
| 2.3 | ltx-2.3-22b distilled 1.1 | fp8 | 25.2 GB | |
| 2.3 | ltx-2.3-22b distilled 1.1 (experimental) | mxfp8 | 24.1 GB | |
| 2 | ltx-2-19b dev | bf16 | 37.8 GB | |
| 2 | ltx-2-19b dev | fp8 | 21.6 GB | |
| 2 | ltx-2-19b dev | fp4 | 14.5 GB | |
| 2 | ltx-2-19b distilled | bf16 | 37.8 GB | |
| 2 | ltx-2-19b distilled | fp8 | 21.6 GB | |

Note: The input_scaled variants additionally include activation scaling and are set to run with fp8 matmuls on supported hardware (roughly Nvidia 40xx GPUs and later).

▣ VAE (Video & Audio)

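| Ver | Component | Precision | Size | Download |
| --- | --- | --- | --- | --- |
| 2.3 | Video VAE | BF16 | 1.45 GB | |
| 2.3 | Audio VAE | BF16 | 365 MB | |
| 2 | Video VAE | BF16 | 2.45 GB | |
| 2 | Audio VAE | BF16 | 218 MB | |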

▣ Embedding Connectors & Text Projection

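| Ver | Name | Precision | Size | Download |
| --- | --- | --- | --- | --- |
| 2.3 | Embeddings Connectors dev | bf16 | 2.31 GB | |
| 2.3 | Embeddings Connectors distilled | bf16 | 2.31 GB | |
| 2 | Connector dev | bf16 | 2.86 GB | |
| 2 | Connector distilled | bf16 | 2.86 GB | |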

◆◇◆◇◆◇◆◇◆◇◆◇◆◇◆◇◆◇◆◇◆◇◆◇◆

▓ LoRA

▣ Enhancer, special

  • Lightricks LTX-2.3

    • LipDub IC-LoRA - Enables lip dubbing on top of LTX-2.3 for video dubbing via joint audio-visual diffusion (based on JustDubIt research)
  • OmerHagawa

  • systms

    • SYSTMS FLW IC-LoRA - Seamless shot-to-shot transitions IC-LoRA with trigger word FLW, uses gray frames (RGB 127,127,127) between clips
  • LTX-2.3-IC-LoRA-Colorizer by DoctorDiffusion (331 MB) - Colorize black and white videos

  • JUST-DUB-IT

  • Best-Face-Swap-Video

  • Image-to-Video Adapter LoRA

    • Original by MachineDelusions
    • siraxe variant - Stripped audio layers + rank64 compressed (2.62 GB, 655 MB rank64 bf16)
  • Lightricks LTX-2.3

    • HDR - Enables 16-bit HDR video generation and converts SDR video to HDR using LogC3 transform for extended dynamic range
    • Union Control - Unified IC-LoRA combining Canny + Depth + Pose control signals for multi-signal video generation conditioning
    • Motion Track Control - Guides object motion using sparse point trajectories via colored spline overlays on reference videos
  • vrgamedevgirl84

  • oumoumad

    • IC luminance map
    • LTX-2 IC-LoRA-Ungrade - Removes color grading and contrast from footage, returning neutral ungraded appearance
    • LTX-2.3 IC-LoRA-Ungrade - LTX-2.3 version of color grading removal IC-LoRA
    • IC-LoRA-Outpaint - Extends video canvas by generating new content in black regions (letterbox areas), filling with temporally consistent content
    • IC-LoRA-ReFocus - Removes lens blur and restores focus to out-of-focus footage (lens blur only)
    • IC-LoRA-Uncompress - Removes MP4 compression artifacts (blocking, banding, mosquito noise) and restores clean video
    • IC-LoRA-MotionDeblur - Removes motion blur from footage
    • IC-LoRA-Deinterlace - Removes interlacing artifacts from video
    • FXIC LTX2 IC-LoRA - Flux-inspired IC-LoRA for LTX video transformation with multiple optimizer variants (adamw, prodigy, masked) at various training steps
    • DeArchive LTX-2.3 - In-Context LoRA for restoring archive video (old B&W footage, low-res web rips, sepia-toned silent-era prints) into colored, high-definition modern cinematography (Rank 128, 5,000 steps)
  • Kijai

  • Cseti

    • IC-LoRA-Cameraman v1 - Transfers camera movements (zoom, pan, tilt, orbit) from reference video to generated output
    • IC-LoRA-EditRefVid v1 - Edit reference video IC-LoRA for editing existing videos using reference guidance
  • 100percentrobot

    • Audio-Reactive LORA - Generates audio-reactive videos with motion synchronized to musical elements (beats, rhythm)
  • LiconStudio

    • VBVR-lora-I2V - Enhances video generation for complex reasoning tasks including multi-object interactions, physical causality, and spatial relationships
    • VBVR-lora-I2V Special
  • TheBurgstall

    • LTX-2.3-Skin-Hair - Refines skin texture and hair rendering, reduces plastic skin artifacts, improves specular highlights
    • VR-360-Outpaint IC-LoRA - Outpaints standard widescreen footage into a full 360° equirectangular projection for immersive/VR viewing.
  • Nightfury16

  • siraxe

    • MergeGreen IC-lora - Maintains motion at start/end frames, use middle frames with RGB 0,191,0 (75% green fill) in IC-LoRA workflow
    • TTM IC-lora - Makes cutouts cartoony and adds cartoony characters to video scenes, based on the TTM approach (use with Img To Video bypass + Add Video IC-LoRA Guide node)
  • Lightricks LTX-2

    • Canny Control - Edge detection control for structural guidance
    • Depth Control - Depth map conditioning for 3D spatial control
    • Detailer - Enhances fine details and textures in generated videos
    • Pose Control - Human pose estimation control for motion guidance

Upscaler LoRAs:

  • LTX 2.3 Upscale IC-LoRA by Zlikwid
    • Generative refinement LoRA for upscaling lower-res or soft videos
    • Works by bicubic upscaling first, then running through LTX 2.3 with this LoRA
    • Use prompt: upscale
  • LTX2.3-ICEdit-Insight by JoyFox Lab
    • Task-aware video restoration and editing model family
    • Supports: Video Restoration, HD Enhancement, Watermark Removal, Subtitle Removal
  • Singularity LTX-2.3 OmniCine by WarmBloodAban
    • Comprehensive optimizer for LTX2.3 I2V and First/Last Frame workflows
    • Features: Limb Evolution, Shot Injection, Natural Expression, Physical Integrity, Cross-Style Potential
    • Uses "Singularity" prompting framework with 7-block bilingual structure

▣ Styles

▣ Special

  • Wan2.1 VAE Adapter
    • Latent space adapter for converting between LTX-2 and Wan2.1 VAE representations
    • latent_adapter_final.pt (447 MB)

▣ ID-LoRA (Identity-Driven In-Context LoRA)

ID-LoRA is a method that enables identity-preserving audio-video generation in a single model. It jointly generates a subject's appearance and voice, letting a text prompt, a reference image, and a short audio clip govern both modalities together. Built on top of LTX-2.3 (22B), it is the first method to personalize visual appearance and voice within a single generative pass.

Unlike cascaded pipelines that treat audio and video separately, ID-LoRA operates in a unified latent space where a single text prompt can simultaneously dictate the scene's visual content, environmental acoustics, and speaking style—while preserving the subject's vocal identity and visual likeness.

Key Features:

  • Text prompt controls the scene and content
  • Reference image preserves the subject's visual likeness
  • Short audio clip preserves the subject's vocal identity
  • Single unified generation pass for both appearance and voice

Available LoRAs for LTX-2.3:

| LoRA | LoRA Rank | Size | Download |
| --- | --- | --- | --- |
| ID-LoRA-TalkVid-3K | 128 | 1.1 GB | |
| ID-LoRA-CelebVHQ-3K | 128 | 1.1 GB | |

Resources:

◆◇◆◇◆◇◆◇◆◇◆◇◆◇◆◇◆◇◆◇◆◇◆◇◆

▓ ComfyUI Nodes

▣ Custom Node Collections

  • 10S-Comfy-nodes by TenStrip - Custom ComfyUI nodes for improving motion quality when working with LTX 2.3's combined audio/video latent pipeline. Includes Latent Cross Fade Auto Concat, Audio Latent Stretch, Latent Motion Sharpener, Latent Temporal Upsampler, Latent Motion Retime, and Latent Temporal Inpainter for clean 30fps output from 24fps sampled models.

  • Deno Custom Nodes by Deno2026 - Practical ComfyUI custom nodes focused on fast real-world workflow improvements including (Deno) Resize Box, Multi Image Loader, LTX Sequencer, LTX Model Loader, Easy Model Download Helper, LTX Multi LoRA Loader, and LTX Prompt Guide.

  • PromptRelay by kijai - Enables consistent multilingual lip-sync while maintaining voice consistency across languages. Distributes video latent frames across segments with smart prompt node supporting inline and block syntax styles.

  • WhatDreamsCost ComfyUI by WhatDreamsCost - A variety of custom ComfyUI nodes and workflows for creating AI-generated video content including Multi Image Loader, LTX Sequencer, LTX Keyframer, Speech Length Calculator, Load Video UI, and Load Audio UI.

  • ComfyUI-Sapiens2 by kijai - ComfyUI nodes for Sapiens2 computer vision models from Facebook Research. Supports pose estimation, body-part segmentation, surface normal estimation, and pointmap estimation with model variants from 400M to 5B parameters.

◆◇◆◇◆◇◆◇◆◇◆◇◆◇◆◇◆◇◆◇◆◇◆◇◆

▓ LoRA Training

For training LTX LoRAs, the community uses a variety of official scripts, community-developed forks, and cloud-based platforms.

Primary Local Training Tools

  • Official LTX-2 Trainer: This is the standard Python-based package for training LoRAs, full fine-tuning, and In-Context (IC) LoRAs. It is designed for Linux and requires CUDA and Triton.
  • Musubi-Tuner (AkaneTendo25 Fork): Widely considered the fastest and most efficient local trainer for LTX-2 and 2.3. It features significantly smaller cache sizes (up to 12x smaller than AI Toolkit) and better iteration speeds, reaching up to 2 iterations per second on an RTX 5090.
  • AI Toolkit (by Ostris): A popular third-party tool that supports LTX-2 character and image-to-video LoRAs. While beginner-friendly, some users reported issues with audio training on the main branch.
  • AI Toolkit: BIG-DADDY-VERSION (ArtDesignAwesome Fork): This specific fork was created to fix broken audio and voice training in the original AI Toolkit. It is optimized for hardware like the RTX 5090.
  • rs-nodes (richservo): A collection of nodes that includes a full LTX LoRA trainer directly within ComfyUI. It is designed to be memory-efficient, allowing training on cards with as little as 11-12 GB of VRAM by using ComfyUI's native weight loaders.
  • SimpleTuner: A highly optimized trainer for Linux that supports LTX-2 and is noted for its ability to handle larger datasets on limited VRAM via block swapping.

Cloud Training Platforms

  • Fal.ai: Provides a dedicated cloud trainer for custom styles and effects, though it is primarily limited to image-based training datasets.
  • RunComfy: A cloud service that offers a pre-configured AI Toolkit setup specifically for LTX-2 training.

Essential Dataset & Captioning Tools

  • Taz's Ultimate Captioning Tool: A Hugging Face space frequently used by the community to generate the long, detailed, cinematographic prompts (around 200 words) that LTX-2 requires for high-quality training.
  • AI Video Clipper & LoRA Captioner: A modular pipeline designed to automate local dataset creation using WhisperX and Qwen2-VL, including support for RTX 5090 Blackwell cards.

Training Requirements Summary

  • Dataset: Videos should typically be cut to 121 frames (exactly 4.84 seconds) to align with the model's architectural "8n+1" rule.
  • Hardware: While 16GB VRAM is possible with extreme offloading in tools like rs-nodes, 24GB is the practical minimum for quantized training. For best results and speed, 48GB to 80GB (H100 or RTX 6000) is preferred.
  • Precision: It is now officially recommended to train on the full BF16 model for LTX 2.3 rather than FP8 for superior quality.
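
The "8n+1" rule from the dataset requirement can be checked with a short helper. The sketch below snaps a clip's frame count down to the nearest valid length and converts it to seconds. The function names are hypothetical, and the 25 fps rate is an inference from the figures above (121 frames / 4.84 s), not something the list states explicitly.

```python
def snap_to_8n_plus_1(frame_count):
    """Largest frame count of the form 8n + 1 that does not exceed frame_count."""
    if frame_count < 1:
        raise ValueError("need at least one frame")
    return ((frame_count - 1) // 8) * 8 + 1


def clip_seconds(frames, fps):
    """Duration of a clip in seconds."""
    return frames / fps


ASSUMED_FPS = 25  # inferred: 121 frames / 4.84 s

for raw in (121, 125, 130, 200):
    frames = snap_to_8n_plus_1(raw)
    print(raw, "->", frames, "frames,", clip_seconds(frames, ASSUMED_FPS), "s")
```

Here 121 already satisfies the rule (8 × 15 + 1), so it is left unchanged; longer clips are trimmed down to the closest valid length.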

◆◇◆◇◆◇◆◇◆◇◆◇◆◇◆◇◆◇◆◇◆◇◆◇◆

▓ Workflow & Technical Notes

❖ Lightricks

LTX-2.3:

LTX-2:

❖ vrgamedevgirl84

vrgamedevgirl84 LTX 2.3 Music Video Creator:

  • Music Video Creator Workflow
    • Prompt Creator Workflow - Audio upload, beat detection, scene timing, lyrics analysis, style selection, prompt generation
    • Text-to-Video Workflow - LoRA integration, advanced prompt controls, Remake Mode, video stitching
    • Image-to-Video Workflow - Uses Z-Image Turbo and LTX 2.3
    • Requirements: ComfyUI, LTX 2.3 models, Z-Image Turbo model, FFmpeg, vrgamedevgirl custom nodes

❖ ComfyUI

❖ RuneXX

RuneXX LTX-2.3 Workflows:

Movie-Maker:

Talking-Avatar-TTS:

Video-2-Video:

Music-Video-Creator:

Others:

Custom-Audio:

First-Last-Frame:

Long-Video-Experimental:

3-Pass-Experimental:

Control-reference:

Helper-Workflows:

Other-examples:

RuneXX LTX-2 Workflows old pre_feb2026
