|
|
@@ -16,6 +16,9 @@ A C++ based REST API wrapper for the [stable-diffusion.cpp](https://github.com/l
|
|
|
- [Web UI](#web-ui)
|
|
|
- [Architecture](#architecture)
|
|
|
- [Model Detection and Architecture Support](#model-detection-and-architecture-support)
|
|
|
+- [Model Architecture Requirements](#model-architecture-requirements)
|
|
|
+- [Context Creation Methods per Architecture](#context-creation-methods-per-architecture)
|
|
|
+- [Model Quantization and Conversion](#model-quantization-and-conversion)
|
|
|
- [Technical Requirements](#technical-requirements)
|
|
|
- [Project Structure](#project-structure)
|
|
|
- [Model Types and File Extensions](#model-types-and-file-extensions)
|
|
|
@@ -203,6 +206,131 @@ These models are loaded using the modern `ctxParams.diffusion_model_path` parame
|
|
|
- **Future-Proof**: Easy to add support for new architectures
|
|
|
- **Backward Compatible**: Existing models continue to work without changes
|
|
|
|
|
|
+## Model Architecture Requirements
|
|
|
+
|
|
|
+> **Note:** The following tables contain extensive information and may require horizontal scrolling to view all columns properly.
|
|
|
+
|
|
|
+| Architecture | Extra VAE | Standalone High Noise | T5XXL | CLIP-Vision | CLIP-G | CLIP-L | Model Files | Example Commands |
|
|
|
+|--------------|-----------|----------------------|-------|-------------|--------|--------|-------------|------------------|
|
|
|
+| SD 1.x | No | No | No | No | No | No | sd-v1-4.ckpt, v1-5-pruned-emaonly.safetensors | `./bin/sd -m ../models/sd-v1-4.ckpt -p "a lovely cat"` |
|
|
|
+| SD 2.x | No | No | No | No | No | No | (Similar to SD 1.x) | `./bin/sd -m ../models/sd2-model.ckpt -p "a lovely cat"` |
|
|
|
+| SDXL | Yes | No | No | No | No | No | sd_xl_base_1.0.safetensors, sdxl_vae-fp16-fix.safetensors | `./bin/sd -m ../models/sd_xl_base_1.0.safetensors --vae ../models/sdxl_vae-fp16-fix.safetensors -H 1024 -W 1024 -p "a lovely cat" -v` |
|
|
|
+| SD3 | No | No | Yes | No | No | No | sd3_medium_incl_clips_t5xxlfp16.safetensors | `./bin/sd -m ../models/sd3_medium_incl_clips_t5xxlfp16.safetensors -H 1024 -W 1024 -p 'a lovely cat holding a sign says "Stable Diffusion CPP"' --cfg-scale 4.5 --sampling-method euler -v --clip-on-cpu` |
|
|
|
+| SD3.5 Large | No | No | Yes | No | Yes | Yes | sd3.5_large.safetensors, clip_l.safetensors, clip_g.safetensors, t5xxl_fp16.safetensors | `./bin/sd -m ../models/sd3.5_large.safetensors --clip_l ../models/clip_l.safetensors --clip_g ../models/clip_g.safetensors --t5xxl ../models/t5xxl_fp16.safetensors -H 1024 -W 1024 -p 'a lovely cat holding a sign says "Stable diffusion 3.5 Large"' --cfg-scale 4.5 --sampling-method euler -v --clip-on-cpu` |
|
|
|
+| FLUX Models | Yes | No | Yes | No | No | Yes | flux1-dev-q3_k.gguf, flux1-dev-q8_0.gguf, flux1-schnell-q8_0.gguf, ae.sft, clip_l.safetensors, t5xxl_fp16.safetensors | `./bin/sd --diffusion-model ../models/flux1-dev-q8_0.gguf --vae ../models/ae.sft --clip_l ../models/clip_l.safetensors --t5xxl ../models/t5xxl_fp16.safetensors -p "a lovely cat holding a sign says 'flux.cpp'" --cfg-scale 1.0 --sampling-method euler -v --clip-on-cpu` |
|
|
|
+| Kontext | Yes | No | Yes | No | No | Yes | flux1-kontext-dev-q8_0.gguf, ae.sft, clip_l.safetensors, t5xxl_fp16.safetensors | `./bin/sd -r ./flux1-dev-q8_0.png --diffusion-model ../models/flux1-kontext-dev-q8_0.gguf --vae ../models/ae.sft --clip_l ../models/clip_l.safetensors --t5xxl ../models/t5xxl_fp16.safetensors -p "change 'flux.cpp' to 'kontext.cpp'" --cfg-scale 1.0 --sampling-method euler -v --clip-on-cpu` |
|
|
|
+| Chroma | Yes | No | Yes | No | No | No | chroma-unlocked-v40-q8_0.gguf, ae.sft, t5xxl_fp16.safetensors | `./bin/sd --diffusion-model ../models/chroma-unlocked-v40-q8_0.gguf --vae ../models/ae.sft --t5xxl ../models/t5xxl_fp16.safetensors -p "a lovely cat holding a sign says 'chroma.cpp'" --cfg-scale 4.0 --sampling-method euler -v --chroma-disable-dit-mask --clip-on-cpu` |
|
|
|
+| Wan Models | Yes | No | Yes | Yes | No | No | wan2.1_t2v_1.3B_fp16.safetensors, wan_2.1_vae.safetensors, umt5-xxl-encoder-Q8_0.gguf | `./bin/sd -M vid_gen --diffusion-model ../models/wan2.1_t2v_1.3B_fp16.safetensors --vae ../models/wan_2.1_vae.safetensors --t5xxl ../models/umt5-xxl-encoder-Q8_0.gguf -p "a lovely cat" --cfg-scale 6.0 --sampling-method euler -v -W 832 -H 480 --diffusion-fa --video-frames 33 --flow-shift 3.0` |
|
|
|
+| Wan2.2 T2V A14B | No | Yes | No | No | No | No | Wan2.2-T2V-A14B-LowNoise-Q8_0.gguf, Wan2.2-T2V-A14B-HighNoise-Q8_0.gguf | `./bin/sd -M vid_gen --diffusion-model ../models/Wan2.2-T2V-A14B-LowNoise-Q8_0.gguf -p "a lovely cat" --cfg-scale 6.0 --sampling-method euler -v -W 832 -H 480 --diffusion-fa --video-frames 33 --flow-shift 3.0` |
|
|
|
+| Wan2.2 I2V A14B | No | Yes | No | No | No | No | Wan2.2-I2V-A14B-LowNoise-Q8_0.gguf, Wan2.2-I2V-A14B-HighNoise-Q8_0.gguf | `./bin/sd -M vid_gen --diffusion-model ../models/Wan2.2-I2V-A14B-LowNoise-Q8_0.gguf -r input.png -p "a lovely cat" --cfg-scale 6.0 --sampling-method euler -v -W 832 -H 480 --diffusion-fa --video-frames 33 --flow-shift 3.0` |
|
|
|
+| Qwen Image Models | Yes | No | No | No | No | No | qwen-image-Q8_0.gguf, qwen_image_vae.safetensors, Qwen2.5-VL-7B-Instruct-Q8_0.gguf | `./bin/sd --diffusion-model ../models/qwen-image-Q8_0.gguf --vae ../models/qwen_image_vae.safetensors --qwen2vl ../models/Qwen2.5-VL-7B-Instruct-Q8_0.gguf -p '一个穿着"QWEN"标志的T恤的中国美女正拿着黑色的马克笔面相镜头微笑。' --cfg-scale 2.5 --sampling-method euler -v --offload-to-cpu -H 1024 -W 1024 --diffusion-fa --flow-shift 3` |
|
|
|
+| Qwen Image Edit | Yes | No | No | No | No | No | Qwen_Image_Edit-Q8_0.gguf, qwen_image_vae.safetensors, qwen_2.5_vl_7b.safetensors | `./bin/sd --diffusion-model ../models/Qwen_Image_Edit-Q8_0.gguf --vae ../models/qwen_image_vae.safetensors --qwen2vl ../models/qwen_2.5_vl_7b.safetensors --cfg-scale 2.5 --sampling-method euler -v --offload-to-cpu --diffusion-fa --flow-shift 3 -r ../assets/flux/flux1-dev-q8_0.png -p "change 'flux.cpp' to 'edit.cpp'"` |
|
|
|
+| PhotoMaker | No | No | No | No | No | No | sdxlUnstableDiffusers_v11.safetensors, sdxl_vae.safetensors, photomaker-v1.safetensors | `./bin/sd -m ../models/sdxlUnstableDiffusers_v11.safetensors --vae ../models/sdxl_vae.safetensors --photo-maker ../models/photomaker-v1.safetensors --pm-id-images-dir ../assets/photomaker_examples/scarletthead_woman -p "a girl img, retro futurism" --cfg-scale 5.0 --sampling-method euler -H 1024 -W 1024 --pm-style-strength 10 --vae-on-cpu --steps 50` |
|
|
|
+| LCM | No | No | No | No | No | No | lcm-lora-sdv1-5 | `./bin/sd -m ../models/v1-5-pruned-emaonly.safetensors -p "a lovely cat<lora:lcm-lora-sdv1-5:1>" --steps 4 --lora-model-dir ../models -v --cfg-scale 1` |
|
|
|
+| SSD1B | No | No | No | No | No | No | (Various SSD-1B models) | `./bin/sd -m ../models/ssd-1b.safetensors -p "a lovely cat"` |
|
|
|
+| Tiny SD | No | No | No | No | No | No | (Various Tiny SD models) | `./bin/sd -m ../models/tiny-sd.safetensors -p "a lovely cat"` |
|
|
|
+
|
|
|
+## Context Creation Methods per Architecture
|
|
|
+
|
|
|
+| Architecture | Context Creation Method | Special Parameters | Model Files | Example Commands |
|
|
|
+|--------------|------------------------|-------------------|-------------|------------------|
|
|
|
+| SD 1.x, SD 2.x, SDXL | Standard prompt-based generation | --cfg-scale, --sampling-method, --steps | sd-v1-4.ckpt, v1-5-pruned-emaonly.safetensors, sd_xl_base_1.0.safetensors | `./bin/sd -m ../models/sd-v1-4.ckpt -p "a lovely cat"` |
|
|
|
+| SD3 | Multiple text encoders | --clip-on-cpu recommended | sd3_medium_incl_clips_t5xxlfp16.safetensors | `./bin/sd -m ../models/sd3_medium_incl_clips_t5xxlfp16.safetensors -H 1024 -W 1024 -p 'a lovely cat holding a sign says "Stable Diffusion CPP"' --cfg-scale 4.5 --sampling-method euler -v --clip-on-cpu` |
|
|
|
+| SD3.5 Large | Multiple text encoders | --clip-on-cpu recommended | sd3.5_large.safetensors, clip_l.safetensors, clip_g.safetensors, t5xxl_fp16.safetensors | `./bin/sd -m ../models/sd3.5_large.safetensors --clip_l ../models/clip_l.safetensors --clip_g ../models/clip_g.safetensors --t5xxl ../models/t5xxl_fp16.safetensors -H 1024 -W 1024 -p 'a lovely cat holding a sign says "Stable diffusion 3.5 Large"' --cfg-scale 4.5 --sampling-method euler -v --clip-on-cpu` |
|
|
|
+| FLUX Models | Text-to-image generation | --cfg-scale 1.0 recommended, --clip-on-cpu for memory efficiency | flux1-dev-q3_k.gguf, flux1-dev-q8_0.gguf, flux1-schnell-q8_0.gguf, ae.sft, clip_l.safetensors, t5xxl_fp16.safetensors | `./bin/sd --diffusion-model ../models/flux1-dev-q8_0.gguf --vae ../models/ae.sft --clip_l ../models/clip_l.safetensors --t5xxl ../models/t5xxl_fp16.safetensors -p "a lovely cat holding a sign says 'flux.cpp'" --cfg-scale 1.0 --sampling-method euler -v --clip-on-cpu` |
|
|
|
+| Kontext | Image-to-image transformation | -r for reference image, --cfg-scale 1.0 recommended | flux1-kontext-dev-q8_0.gguf, ae.sft, clip_l.safetensors, t5xxl_fp16.safetensors | `./bin/sd -r ./flux1-dev-q8_0.png --diffusion-model ../models/flux1-kontext-dev-q8_0.gguf --vae ../models/ae.sft --clip_l ../models/clip_l.safetensors --t5xxl ../models/t5xxl_fp16.safetensors -p "change 'flux.cpp' to 'kontext.cpp'" --cfg-scale 1.0 --sampling-method euler -v --clip-on-cpu` |
|
|
|
+| Chroma | Text-to-image generation | --cfg-scale 1.0 recommended, --clip-on-cpu for memory efficiency | chroma-unlocked-v40-q8_0.gguf, ae.sft, t5xxl_fp16.safetensors | `./bin/sd --diffusion-model ../models/chroma-unlocked-v40-q8_0.gguf --vae ../models/ae.sft --t5xxl ../models/t5xxl_fp16.safetensors -p "a lovely cat holding a sign says 'chroma.cpp'" --cfg-scale 4.0 --sampling-method euler -v --chroma-disable-dit-mask --clip-on-cpu` |
|
|
|
+| Chroma1-Radiance | Text-to-image generation | --cfg-scale 4.0 recommended | Chroma1-Radiance-v0.4-Q8_0.gguf, t5xxl_fp16.safetensors | `./bin/sd --diffusion-model ../models/Chroma1-Radiance-v0.4-Q8_0.gguf --t5xxl ../models/clip/t5xxl_fp16.safetensors -p "a lovely cat holding a sign says 'chroma radiance cpp'" --cfg-scale 4.0 --sampling-method euler -v` |
|
|
|
+| Wan Models | Video generation with text prompts | -M vid_gen, --video-frames, --flow-shift, --diffusion-fa | wan2.1_t2v_1.3B_fp16.safetensors, wan_2.1_vae.safetensors, umt5-xxl-encoder-Q8_0.gguf | `./bin/sd -M vid_gen --diffusion-model ../models/wan2.1_t2v_1.3B_fp16.safetensors --vae ../models/wan_2.1_vae.safetensors --t5xxl ../models/umt5-xxl-encoder-Q8_0.gguf -p "a lovely cat" --cfg-scale 6.0 --sampling-method euler -v -W 832 -H 480 --diffusion-fa --video-frames 33 --flow-shift 3.0` |
|
|
|
+| Wan2.1 I2V Models | Image-to-video generation | Requires clip_vision_h.safetensors | wan2.1-i2v-14b-480p-Q8_0.gguf, wan_2.1_vae.safetensors, clip_vision_h.safetensors | `./bin/sd -M vid_gen --diffusion-model ../models/wan2.1-i2v-14b-480p-Q8_0.gguf --vae ../models/wan_2.1_vae.safetensors -r input.png -p "a lovely cat" --cfg-scale 6.0 --sampling-method euler -v -W 832 -H 480 --diffusion-fa --video-frames 33 --flow-shift 3.0` |
|
|
|
+| Wan2.1 FLF2V Models | Flow-to-video generation | Requires clip_vision_h.safetensors | wan2.1-flf2v-14b-720p-Q8_0.gguf, wan_2.1_vae.safetensors, clip_vision_h.safetensors | `./bin/sd -M vid_gen --diffusion-model ../models/wan2.1-flf2v-14b-720p-Q8_0.gguf --vae ../models/wan_2.1_vae.safetensors -r flow.png -p "a lovely cat" --cfg-scale 6.0 --sampling-method euler -v -W 832 -H 480 --diffusion-fa --video-frames 33 --flow-shift 3.0` |
|
|
|
+| Wan2.2 T2V A14B | Text-to-video generation | Uses dual diffusion models | Wan2.2-T2V-A14B-LowNoise-Q8_0.gguf, Wan2.2-T2V-A14B-HighNoise-Q8_0.gguf | `./bin/sd -M vid_gen --diffusion-model ../models/Wan2.2-T2V-A14B-LowNoise-Q8_0.gguf -p "a lovely cat" --cfg-scale 6.0 --sampling-method euler -v -W 832 -H 480 --diffusion-fa --video-frames 33 --flow-shift 3.0` |
|
|
|
+| Wan2.2 I2V A14B | Image-to-video generation | Uses dual diffusion models | Wan2.2-I2V-A14B-LowNoise-Q8_0.gguf, Wan2.2-I2V-A14B-HighNoise-Q8_0.gguf | `./bin/sd -M vid_gen --diffusion-model ../models/Wan2.2-I2V-A14B-LowNoise-Q8_0.gguf -r input.png -p "a lovely cat" --cfg-scale 6.0 --sampling-method euler -v -W 832 -H 480 --diffusion-fa --video-frames 33 --flow-shift 3.0` |
|
|
|
+| Qwen Image Models | Text-to-image generation with Chinese language support | --qwen2vl for the language model, --diffusion-fa, --flow-shift | qwen-image-Q8_0.gguf, qwen_image_vae.safetensors, Qwen2.5-VL-7B-Instruct-Q8_0.gguf | `./bin/sd --diffusion-model ../models/qwen-image-Q8_0.gguf --vae ../models/qwen_image_vae.safetensors --qwen2vl ../models/Qwen2.5-VL-7B-Instruct-Q8_0.gguf -p '一个穿着"QWEN"标志的T恤的中国美女正拿着黑色的马克笔面相镜头微笑。' --cfg-scale 2.5 --sampling-method euler -v --offload-to-cpu -H 1024 -W 1024 --diffusion-fa --flow-shift 3` |
|
|
|
+| Qwen Image Edit | Image editing with reference image | -r for reference image, --qwen2vl_vision for vision model | Qwen_Image_Edit-Q8_0.gguf, qwen_image_vae.safetensors, qwen_2.5_vl_7b.safetensors | `./bin/sd --diffusion-model ../models/Qwen_Image_Edit-Q8_0.gguf --vae ../models/qwen_image_vae.safetensors --qwen2vl ../models/qwen_2.5_vl_7b.safetensors --cfg-scale 2.5 --sampling-method euler -v --offload-to-cpu --diffusion-fa --flow-shift 3 -r ../assets/flux/flux1-dev-q8_0.png -p "change 'flux.cpp' to 'edit.cpp'"` |
|
|
|
+| PhotoMaker | Personalized image generation with ID images | --photo-maker, --pm-id-images-dir, --pm-style-strength | sdxlUnstableDiffusers_v11.safetensors, sdxl_vae.safetensors, photomaker-v1.safetensors | `./bin/sd -m ../models/sdxlUnstableDiffusers_v11.safetensors --vae ../models/sdxl_vae.safetensors --photo-maker ../models/photomaker-v1.safetensors --pm-id-images-dir ../assets/photomaker_examples/scarletthead_woman -p "a girl img, retro futurism" --cfg-scale 5.0 --sampling-method euler -H 1024 -W 1024 --pm-style-strength 10 --vae-on-cpu --steps 50` |
|
|
|
+| LCM | Fast generation with LoRA | --cfg-scale 1.0, --steps 2-8, --sampling-method lcm/euler_a | lcm-lora-sdv1-5 | `./bin/sd -m ../models/v1-5-pruned-emaonly.safetensors -p "a lovely cat<lora:lcm-lora-sdv1-5:1>" --steps 4 --lora-model-dir ../models -v --cfg-scale 1` |
|
|
|
+| SSD1B | Standard prompt-based generation | Standard SD parameters | (Various SSD-1B models) | `./bin/sd -m ../models/ssd-1b.safetensors -p "a lovely cat"` |
|
|
|
+| Tiny SD | Standard prompt-based generation | Standard SD parameters | (Various Tiny SD models) | `./bin/sd -m ../models/tiny-sd.safetensors -p "a lovely cat"` |
|
|
|
+
|
|
|
+## Model Quantization and Conversion
|
|
|
+
|
|
|
+### Quantization Levels Supported
|
|
|
+
|
|
|
+The stable-diffusion.cpp library supports various quantization levels to balance model size and performance:
|
|
|
+
|
|
|
+| Quantization Level | Description | Model Size Reduction | Quality Impact |
|
|
|
+|-------------------|-------------|----------------------|----------------|
|
|
|
+| `f32` | 32-bit floating-point | None (original) | No quality loss |
|
|
|
+| `f16` | 16-bit floating-point | ~50% | Minimal quality loss |
|
|
|
+| `q8_0` | 8-bit integer quantization | ~75% | Slight quality loss |
|
|
|
+| `q5_0`, `q5_1` | 5-bit integer quantization | ~80% | Moderate quality loss |
|
|
|
+| `q4_0`, `q4_1` | 4-bit integer quantization | ~85% | Noticeable quality loss |
|
|
|
+| `q3_k` | 3-bit K-quantization | ~87% | Significant quality loss |
|
|
|
+| `q4_k` | 4-bit K-quantization | ~85% | Good balance of size/quality |
|
|
|
+| `q2_k` | 2-bit K-quantization | ~90% | Major quality loss |
|
|
|
+| `Q4_K_S` | 4-bit K-quantization Small | ~85% | Optimized for smaller models |
|
|
|
+
|
|
|
+### Model Conversion Commands
|
|
|
+
|
|
|
+To convert models from their original format to quantized GGUF format, use the following commands:
|
|
|
+
|
|
|
+#### Stable Diffusion Models
|
|
|
+```bash
|
|
|
+# Convert SD 1.5 model to 8-bit quantization
|
|
|
+./bin/sd -M convert -m ../models/v1-5-pruned-emaonly.safetensors -o ../models/v1-5-pruned-emaonly.q8_0.gguf -v --type q8_0
|
|
|
+
|
|
|
+# Convert SDXL model to 4-bit quantization
|
|
|
+./bin/sd -M convert -m ../models/sd_xl_base_1.0.safetensors -o ../models/sd_xl_base_1.0.q4_0.gguf -v --type q4_0
|
|
|
+```
|
|
|
+
|
|
|
+#### Flux Models
|
|
|
+```bash
|
|
|
+# Convert Flux Dev model to 8-bit quantization
|
|
|
+./bin/sd -M convert -m ../models/flux1-dev.sft -o ../models/flux1-dev-q8_0.gguf -v --type q8_0
|
|
|
+
|
|
|
+# Convert Flux Schnell model to 3-bit K-quantization
|
|
|
+./bin/sd -M convert -m ../models/flux1-schnell.sft -o ../models/flux1-schnell-q3_k.gguf -v --type q3_k
|
|
|
+```
|
|
|
+
|
|
|
+#### Chroma Models
|
|
|
+```bash
|
|
|
+# Convert Chroma model to 8-bit quantization
|
|
|
+./bin/sd -M convert -m ../models/chroma-unlocked-v40.safetensors -o ../models/chroma-unlocked-v40-q8_0.gguf -v --type q8_0
|
|
|
+```
|
|
|
+
|
|
|
+#### Kontext Models
|
|
|
+```bash
|
|
|
+# Convert Kontext model to 8-bit quantization
|
|
|
+./bin/sd -M convert -m ../models/flux1-kontext-dev.safetensors -o ../models/flux1-kontext-dev-q8_0.gguf -v --type q8_0
|
|
|
+```
|
|
|
+
|
|
|
+### LoRA Models
|
|
|
+
|
|
|
+The project supports LoRA (Low-Rank Adaptation) models for fine-tuning and style transfer:
|
|
|
+
|
|
|
+| LoRA Model | Compatible Base Models | Example Usage |
|
|
|
+|------------|----------------------|----------------|
|
|
|
+| `marblesh.safetensors` | SD 1.5, SD 2.1 | `./bin/sd -m ../models/v1-5-pruned-emaonly.safetensors -p "a lovely cat<lora:marblesh:1>" --lora-model-dir ../models` |
|
|
|
+| `lcm-lora-sdv1-5` | SD 1.5 | `./bin/sd -m ../models/v1-5-pruned-emaonly.safetensors -p "a lovely cat<lora:lcm-lora-sdv1-5:1>" --steps 4 --lora-model-dir ../models -v --cfg-scale 1` |
|
|
|
+| `realism_lora_comfy_converted` | FLUX Models | `./bin/sd --diffusion-model ../models/flux1-dev-q8_0.gguf --vae ../models/ae.sft --clip_l ../models/clip_l.safetensors --t5xxl ../models/t5xxl_fp16.safetensors -p "a lovely cat holding a sign says 'flux.cpp'<lora:realism_lora_comfy_converted:1>" --cfg-scale 1.0 --sampling-method euler -v --lora-model-dir ../models --clip-on-cpu` |
|
|
|
+| `wan2.2_t2v_lightx2v_4steps_lora_v1.1_low_noise` | Wan2.2 T2V Models | `./bin/sd -M vid_gen --diffusion-model ../models/Wan2.2-T2V-A14B-LowNoise-Q8_0.gguf --lora-model-dir ../models -p "a lovely cat<lora:wan2.2_t2v_lightx2v_4steps_lora_v1.1_low_noise:1>" --steps 4` |
|
|
|
+| `wan2.2_t2v_lightx2v_4steps_lora_v1.1_high_noise` | Wan2.2 T2V Models | `./bin/sd -M vid_gen --diffusion-model ../models/Wan2.2-T2V-A14B-HighNoise-Q8_0.gguf --lora-model-dir ../models -p "a lovely cat<lora:wan2.2_t2v_lightx2v_4steps_lora_v1.1_high_noise:1>" --steps 4` |
|
|
|
+
|
|
|
+### Additional Model Types
|
|
|
+
|
|
|
+#### Upscaling (ESRGAN)
|
|
|
+```bash
|
|
|
+# Use ESRGAN for upscaling generated images
|
|
|
+./bin/sd -m ../models/v1-5-pruned-emaonly.safetensors -p "a lovely cat" --upscale-model ../models/RealESRGAN_x4plus_anime_6B.pth
|
|
|
+```
|
|
|
+
|
|
|
+#### Fast Decoding (TAESD)
|
|
|
+```bash
|
|
|
+# Use TAESD for faster VAE decoding
|
|
|
+./bin/sd -m ../models/v1-5-pruned-emaonly.safetensors -p "a lovely cat" --taesd ../models/diffusion_pytorch_model.safetensors
|
|
|
+```
|
|
|
+
|
|
|
## Model Types and File Extensions
|
|
|
|
|
|
The project supports various model types, each with specific file extensions:
|