A C++ based REST API wrapper for the stable-diffusion.cpp library, providing HTTP endpoints for image generation with Stable Diffusion models.
The stable-diffusion.cpp-rest project aims to create a high-performance REST API server that wraps the functionality of the stable-diffusion.cpp library. This enables developers to integrate Stable Diffusion image generation capabilities into their applications through standard HTTP requests, rather than directly using the C++ library.
A modern, responsive web interface is included and automatically built with the server!
Quick Start:
```sh
# Build (automatically builds web UI)
mkdir build && cd build
cmake ..
cmake --build .

# Run server with web UI
./src/stable-diffusion-rest-server --models-dir /path/to/models --checkpoints checkpoints --ui-dir ../webui

# Access web UI
open http://localhost:8080/ui/
```
See WEBUI.md for detailed documentation.
The project is designed with a modular architecture consisting of three main components: the HTTP server, the generation queue, and the model manager, which provides type-based model organization.
```
┌─────────────────┐ ┌─────────────────┐ ┌─────────────────┐
│ HTTP Server │───▶│ Generation Queue│───▶│ Model Manager │
│ │ │ │ │ │
│ - Request Parse │ │ - Job Queue │ │ - Model Loading │
│ - Response │ │ - Sequential │ │ - Type Detection│
│ Formatting │ │ Processing │ │ - Memory Mgmt │
└─────────────────┘ └─────────────────┘ └─────────────────┘
```
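The flow in the diagram can be sketched as a single worker thread draining a job queue, which is what gives the server its sequential processing guarantee. The sketch below is illustrative Python, not the server's actual C++ types; the class and field names are assumptions.

```python
import queue
import threading
import uuid

class GenerationQueue:
    """Minimal sketch: one worker drains jobs sequentially (illustrative only)."""

    def __init__(self, generate_fn):
        self._jobs = queue.Queue()
        self._results = {}          # job_id -> status dict
        self._generate = generate_fn
        # Single daemon worker enforces one-at-a-time generation.
        threading.Thread(target=self._run, daemon=True).start()

    def submit(self, params):
        job_id = str(uuid.uuid4())
        self._results[job_id] = {"status": "queued"}
        self._jobs.put((job_id, params))
        return job_id

    def status(self, job_id):
        return self._results[job_id]

    def _run(self):
        while True:
            job_id, params = self._jobs.get()
            self._results[job_id] = {"status": "processing"}
            try:
                image = self._generate(params)
                self._results[job_id] = {"status": "completed", "image": image}
            except Exception as exc:
                self._results[job_id] = {"status": "failed", "error": str(exc)}
            self._jobs.task_done()
```

Because a single worker owns the model context, no locking is needed around the diffusion call itself; only the job bookkeeping is shared.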
```
stable-diffusion.cpp-rest/
├── CMakeLists.txt               # Main CMake configuration
├── README.md                    # This file
├── src/                         # Source code directory
│   ├── main.cpp                 # Application entry point
│   ├── http/                    # HTTP server implementation
│   │   ├── server.h/.cpp        # HTTP server class
│   │   ├── handlers.h/.cpp      # Request handlers
│   │   └── responses.h/.cpp     # Response formatting
│   ├── generation/              # Generation queue implementation
│   │   ├── queue.h/.cpp         # Job queue management
│   │   ├── worker.h/.cpp        # Generation worker thread
│   │   └── job.h/.cpp           # Job definition and status
│   ├── models/                  # Model manager implementation
│   │   ├── manager.h/.cpp       # Model manager class
│   │   ├── loader.h/.cpp        # Model loading logic
│   │   └── types.h/.cpp         # Model type definitions
│   └── utils/                   # Utility functions
│       ├── config.h/.cpp        # Configuration management
│       └── logger.h/.cpp        # Logging utilities
├── include/                     # Public header files
├── external/                    # External dependencies (managed by CMake)
├── models/                      # Default model storage directory
│   ├── lora/                    # LoRA models
│   ├── checkpoints/             # Checkpoint models
│   ├── vae/                     # VAE models
│   ├── presets/                 # Preset files
│   ├── prompts/                 # Prompt templates
│   ├── neg_prompts/             # Negative prompt templates
│   ├── taesd/                   # TAESD models
│   ├── esrgan/                  # ESRGAN models
│   ├── controlnet/              # ControlNet models
│   ├── upscaler/                # Upscaler models
│   └── embeddings/              # Textual embeddings
├── tests/                       # Unit and integration tests
├── examples/                    # Usage examples
└── docs/                        # Additional documentation
```
The stable-diffusion.cpp-rest server includes intelligent model detection that automatically identifies the model architecture and selects the appropriate loading method. This ensures compatibility with both traditional Stable Diffusion models and modern architectures.
These models are loaded using the traditional `ctxParams.model_path` parameter.
These models are loaded using the modern `ctxParams.diffusion_model_path` parameter.
Note: The following tables contain extensive information and may require horizontal scrolling to view all columns.
| Architecture | Extra VAE | Standalone High Noise | T5XXL | CLIP-Vision | CLIP-G | CLIP-L | Model Files | Example Commands |
|---|---|---|---|---|---|---|---|---|
| SD 1.x | No | No | No | No | No | No | sd-v1-4.ckpt, v1-5-pruned-emaonly.safetensors | ./bin/sd -m ../models/sd-v1-4.ckpt -p "a lovely cat" |
| SD 2.x | No | No | No | No | No | No | (Similar to SD 1.x) | ./bin/sd -m ../models/sd2-model.ckpt -p "a lovely cat" |
| SDXL | Yes | No | No | No | No | No | sd_xl_base_1.0.safetensors, sdxl_vae-fp16-fix.safetensors | ./bin/sd -m ../models/sd_xl_base_1.0.safetensors --vae ../models/sdxl_vae-fp16-fix.safetensors -H 1024 -W 1024 -p "a lovely cat" -v |
| SD3 | No | No | Yes | No | No | No | sd3_medium_incl_clips_t5xxlfp16.safetensors | ./bin/sd -m ../models/sd3_medium_incl_clips_t5xxlfp16.safetensors -H 1024 -W 1024 -p 'a lovely cat holding a sign says "Stable Diffusion CPP"' --cfg-scale 4.5 --sampling-method euler -v --clip-on-cpu |
| SD3.5 Large | No | No | Yes | No | Yes | Yes | sd3.5_large.safetensors, clip_l.safetensors, clip_g.safetensors, t5xxl_fp16.safetensors | ./bin/sd -m ../models/sd3.5_large.safetensors --clip_l ../models/clip_l.safetensors --clip_g ../models/clip_g.safetensors --t5xxl ../models/t5xxl_fp16.safetensors -H 1024 -W 1024 -p 'a lovely cat holding a sign says "Stable diffusion 3.5 Large"' --cfg-scale 4.5 --sampling-method euler -v --clip-on-cpu |
| FLUX Models | Yes | No | Yes | No | No | Yes | flux1-dev-q3_k.gguf, flux1-dev-q8_0.gguf, flux1-schnell-q8_0.gguf, ae.sft, clip_l.safetensors, t5xxl_fp16.safetensors | ./bin/sd --diffusion-model ../models/flux1-dev-q8_0.gguf --vae ../models/ae.sft --clip_l ../models/clip_l.safetensors --t5xxl ../models/t5xxl_fp16.safetensors -p "a lovely cat holding a sign says 'flux.cpp'" --cfg-scale 1.0 --sampling-method euler -v --clip-on-cpu |
| Kontext | Yes | No | Yes | No | No | Yes | flux1-kontext-dev-q8_0.gguf, ae.sft, clip_l.safetensors, t5xxl_fp16.safetensors | ./bin/sd -r ./flux1-dev-q8_0.png --diffusion-model ../models/flux1-kontext-dev-q8_0.gguf --vae ../models/ae.sft --clip_l ../models/clip_l.safetensors --t5xxl ../models/t5xxl_fp16.safetensors -p "change 'flux.cpp' to 'kontext.cpp'" --cfg-scale 1.0 --sampling-method euler -v --clip-on-cpu |
| Chroma | Yes | No | Yes | No | No | No | chroma-unlocked-v40-q8_0.gguf, ae.sft, t5xxl_fp16.safetensors | ./bin/sd --diffusion-model ../models/chroma-unlocked-v40-q8_0.gguf --vae ../models/ae.sft --t5xxl ../models/t5xxl_fp16.safetensors -p "a lovely cat holding a sign says 'chroma.cpp'" --cfg-scale 4.0 --sampling-method euler -v --chroma-disable-dit-mask --clip-on-cpu |
| Wan Models | Yes | No | Yes | Yes | No | No | wan2.1_t2v_1.3B_fp16.safetensors, wan_2.1_vae.safetensors, umt5-xxl-encoder-Q8_0.gguf | ./bin/sd -M vid_gen --diffusion-model ../models/wan2.1_t2v_1.3B_fp16.safetensors --vae ../models/wan_2.1_vae.safetensors --t5xxl ../models/umt5-xxl-encoder-Q8_0.gguf -p "a lovely cat" --cfg-scale 6.0 --sampling-method euler -v -W 832 -H 480 --diffusion-fa --video-frames 33 --flow-shift 3.0 |
| Wan2.2 T2V A14B | No | Yes | No | No | No | No | Wan2.2-T2V-A14B-LowNoise-Q8_0.gguf, Wan2.2-T2V-A14B-HighNoise-Q8_0.gguf | ./bin/sd -M vid_gen --diffusion-model ../models/Wan2.2-T2V-A14B-LowNoise-Q8_0.gguf -p "a lovely cat" --cfg-scale 6.0 --sampling-method euler -v -W 832 -H 480 --diffusion-fa --video-frames 33 --flow-shift 3.0 |
| Wan2.2 I2V A14B | No | Yes | No | No | No | No | Wan2.2-I2V-A14B-LowNoise-Q8_0.gguf, Wan2.2-I2V-A14B-HighNoise-Q8_0.gguf | ./bin/sd -M vid_gen --diffusion-model ../models/Wan2.2-I2V-A14B-LowNoise-Q8_0.gguf -r input.png -p "a lovely cat" --cfg-scale 6.0 --sampling-method euler -v -W 832 -H 480 --diffusion-fa --video-frames 33 --flow-shift 3.0 |
| Qwen Image Models | Yes | No | No | No | No | No | qwen-image-Q8_0.gguf, qwen_image_vae.safetensors, Qwen2.5-VL-7B-Instruct-Q8_0.gguf | ./bin/sd --diffusion-model ../models/qwen-image-Q8_0.gguf --vae ../models/qwen_image_vae.safetensors --qwen2vl ../models/Qwen2.5-VL-7B-Instruct-Q8_0.gguf -p '一个穿着"QWEN"标志的T恤的中国美女正拿着黑色的马克笔面相镜头微笑。' --cfg-scale 2.5 --sampling-method euler -v --offload-to-cpu -H 1024 -W 1024 --diffusion-fa --flow-shift 3 |
| Qwen Image Edit | Yes | No | No | No | No | No | Qwen_Image_Edit-Q8_0.gguf, qwen_image_vae.safetensors, qwen_2.5_vl_7b.safetensors | ./bin/sd --diffusion-model ../models/Qwen_Image_Edit-Q8_0.gguf --vae ../models/qwen_image_vae.safetensors --qwen2vl ../models/qwen_2.5_vl_7b.safetensors --cfg-scale 2.5 --sampling-method euler -v --offload-to-cpu --diffusion-fa --flow-shift 3 -r ../assets/flux/flux1-dev-q8_0.png -p "change 'flux.cpp' to 'edit.cpp'" |
| PhotoMaker | No | No | No | No | No | No | sdxlUnstableDiffusers_v11.safetensors, sdxl_vae.safetensors, photomaker-v1.safetensors | ./bin/sd -m ../models/sdxlUnstableDiffusers_v11.safetensors --vae ../models/sdxl_vae.safetensors --photo-maker ../models/photomaker-v1.safetensors --pm-id-images-dir ../assets/photomaker_examples/scarletthead_woman -p "a girl img, retro futurism" --cfg-scale 5.0 --sampling-method euler -H 1024 -W 1024 --pm-style-strength 10 --vae-on-cpu --steps 50 |
| LCM | No | No | No | No | No | No | lcm-lora-sdv1-5 | ./bin/sd -m ../models/v1-5-pruned-emaonly.safetensors -p "a lovely cat<lora:lcm-lora-sdv1-5:1>" --steps 4 --lora-model-dir ../models -v --cfg-scale 1 |
| SSD1B | No | No | No | No | No | No | (Various SSD-1B models) | ./bin/sd -m ../models/ssd-1b.safetensors -p "a lovely cat" |
| Tiny SD | No | No | No | No | No | No | (Various Tiny SD models) | ./bin/sd -m ../models/tiny-sd.safetensors -p "a lovely cat" |
| Architecture | Context Creation Method | Special Parameters | Model Files | Example Commands |
|---|---|---|---|---|
| SD 1.x, SD 2.x, SDXL | Standard prompt-based generation | --cfg-scale, --sampling-method, --steps | sd-v1-4.ckpt, v1-5-pruned-emaonly.safetensors, sd_xl_base_1.0.safetensors | ./bin/sd -m ../models/sd-v1-4.ckpt -p "a lovely cat" |
| SD3 | Multiple text encoders | --clip-on-cpu recommended | sd3_medium_incl_clips_t5xxlfp16.safetensors | ./bin/sd -m ../models/sd3_medium_incl_clips_t5xxlfp16.safetensors -H 1024 -W 1024 -p 'a lovely cat holding a sign says "Stable Diffusion CPP"' --cfg-scale 4.5 --sampling-method euler -v --clip-on-cpu |
| SD3.5 Large | Multiple text encoders | --clip-on-cpu recommended | sd3.5_large.safetensors, clip_l.safetensors, clip_g.safetensors, t5xxl_fp16.safetensors | ./bin/sd -m ../models/sd3.5_large.safetensors --clip_l ../models/clip_l.safetensors --clip_g ../models/clip_g.safetensors --t5xxl ../models/t5xxl_fp16.safetensors -H 1024 -W 1024 -p 'a lovely cat holding a sign says "Stable diffusion 3.5 Large"' --cfg-scale 4.5 --sampling-method euler -v --clip-on-cpu |
| FLUX Models | Text-to-image generation | --cfg-scale 1.0 recommended, --clip-on-cpu for memory efficiency | flux1-dev-q3_k.gguf, flux1-dev-q8_0.gguf, flux1-schnell-q8_0.gguf, ae.sft, clip_l.safetensors, t5xxl_fp16.safetensors | ./bin/sd --diffusion-model ../models/flux1-dev-q8_0.gguf --vae ../models/ae.sft --clip_l ../models/clip_l.safetensors --t5xxl ../models/t5xxl_fp16.safetensors -p "a lovely cat holding a sign says 'flux.cpp'" --cfg-scale 1.0 --sampling-method euler -v --clip-on-cpu |
| Kontext | Image-to-image transformation | -r for reference image, --cfg-scale 1.0 recommended | flux1-kontext-dev-q8_0.gguf, ae.sft, clip_l.safetensors, t5xxl_fp16.safetensors | ./bin/sd -r ./flux1-dev-q8_0.png --diffusion-model ../models/flux1-kontext-dev-q8_0.gguf --vae ../models/ae.sft --clip_l ../models/clip_l.safetensors --t5xxl ../models/t5xxl_fp16.safetensors -p "change 'flux.cpp' to 'kontext.cpp'" --cfg-scale 1.0 --sampling-method euler -v --clip-on-cpu |
| Chroma | Text-to-image generation | --cfg-scale 1.0 recommended, --clip-on-cpu for memory efficiency | chroma-unlocked-v40-q8_0.gguf, ae.sft, t5xxl_fp16.safetensors | ./bin/sd --diffusion-model ../models/chroma-unlocked-v40-q8_0.gguf --vae ../models/ae.sft --t5xxl ../models/t5xxl_fp16.safetensors -p "a lovely cat holding a sign says 'chroma.cpp'" --cfg-scale 4.0 --sampling-method euler -v --chroma-disable-dit-mask --clip-on-cpu |
| Chroma1-Radiance | Text-to-image generation | --cfg-scale 4.0 recommended | Chroma1-Radiance-v0.4-Q8_0.gguf, t5xxl_fp16.safetensors | ./bin/sd --diffusion-model ../models/Chroma1-Radiance-v0.4-Q8_0.gguf --t5xxl ../models/clip/t5xxl_fp16.safetensors -p "a lovely cat holding a sign says 'chroma radiance cpp'" --cfg-scale 4.0 --sampling-method euler -v |
| Wan Models | Video generation with text prompts | -M vid_gen, --video-frames, --flow-shift, --diffusion-fa | wan2.1_t2v_1.3B_fp16.safetensors, wan_2.1_vae.safetensors, umt5-xxl-encoder-Q8_0.gguf | ./bin/sd -M vid_gen --diffusion-model ../models/wan2.1_t2v_1.3B_fp16.safetensors --vae ../models/wan_2.1_vae.safetensors --t5xxl ../models/umt5-xxl-encoder-Q8_0.gguf -p "a lovely cat" --cfg-scale 6.0 --sampling-method euler -v -W 832 -H 480 --diffusion-fa --video-frames 33 --flow-shift 3.0 |
| Wan2.1 I2V Models | Image-to-video generation | Requires clip_vision_h.safetensors | wan2.1-i2v-14b-480p-Q8_0.gguf, wan_2.1_vae.safetensors, clip_vision_h.safetensors | ./bin/sd -M vid_gen --diffusion-model ../models/wan2.1-i2v-14b-480p-Q8_0.gguf --vae ../models/wan_2.1_vae.safetensors -r input.png -p "a lovely cat" --cfg-scale 6.0 --sampling-method euler -v -W 832 -H 480 --diffusion-fa --video-frames 33 --flow-shift 3.0 |
| Wan2.1 FLF2V Models | Flow-to-video generation | Requires clip_vision_h.safetensors | wan2.1-flf2v-14b-720p-Q8_0.gguf, wan_2.1_vae.safetensors, clip_vision_h.safetensors | ./bin/sd -M vid_gen --diffusion-model ../models/wan2.1-flf2v-14b-720p-Q8_0.gguf --vae ../models/wan_2.1_vae.safetensors -r flow.png -p "a lovely cat" --cfg-scale 6.0 --sampling-method euler -v -W 832 -H 480 --diffusion-fa --video-frames 33 --flow-shift 3.0 |
| Wan2.2 T2V A14B | Text-to-video generation | Uses dual diffusion models | Wan2.2-T2V-A14B-LowNoise-Q8_0.gguf, Wan2.2-T2V-A14B-HighNoise-Q8_0.gguf | ./bin/sd -M vid_gen --diffusion-model ../models/Wan2.2-T2V-A14B-LowNoise-Q8_0.gguf -p "a lovely cat" --cfg-scale 6.0 --sampling-method euler -v -W 832 -H 480 --diffusion-fa --video-frames 33 --flow-shift 3.0 |
| Wan2.2 I2V A14B | Image-to-video generation | Uses dual diffusion models | Wan2.2-I2V-A14B-LowNoise-Q8_0.gguf, Wan2.2-I2V-A14B-HighNoise-Q8_0.gguf | ./bin/sd -M vid_gen --diffusion-model ../models/Wan2.2-I2V-A14B-LowNoise-Q8_0.gguf -r input.png -p "a lovely cat" --cfg-scale 6.0 --sampling-method euler -v -W 832 -H 480 --diffusion-fa --video-frames 33 --flow-shift 3.0 |
| Qwen Image Models | Text-to-image generation with Chinese language support | --qwen2vl for the language model, --diffusion-fa, --flow-shift | qwen-image-Q8_0.gguf, qwen_image_vae.safetensors, Qwen2.5-VL-7B-Instruct-Q8_0.gguf | ./bin/sd --diffusion-model ../models/qwen-image-Q8_0.gguf --vae ../models/qwen_image_vae.safetensors --qwen2vl ../models/Qwen2.5-VL-7B-Instruct-Q8_0.gguf -p '一个穿着"QWEN"标志的T恤的中国美女正拿着黑色的马克笔面相镜头微笑。' --cfg-scale 2.5 --sampling-method euler -v --offload-to-cpu -H 1024 -W 1024 --diffusion-fa --flow-shift 3 |
| Qwen Image Edit | Image editing with reference image | -r for reference image, --qwen2vl_vision for vision model | Qwen_Image_Edit-Q8_0.gguf, qwen_image_vae.safetensors, qwen_2.5_vl_7b.safetensors | ./bin/sd --diffusion-model ../models/Qwen_Image_Edit-Q8_0.gguf --vae ../models/qwen_image_vae.safetensors --qwen2vl ../models/qwen_2.5_vl_7b.safetensors --cfg-scale 2.5 --sampling-method euler -v --offload-to-cpu --diffusion-fa --flow-shift 3 -r ../assets/flux/flux1-dev-q8_0.png -p "change 'flux.cpp' to 'edit.cpp'" |
| PhotoMaker | Personalized image generation with ID images | --photo-maker, --pm-id-images-dir, --pm-style-strength | sdxlUnstableDiffusers_v11.safetensors, sdxl_vae.safetensors, photomaker-v1.safetensors | ./bin/sd -m ../models/sdxlUnstableDiffusers_v11.safetensors --vae ../models/sdxl_vae.safetensors --photo-maker ../models/photomaker-v1.safetensors --pm-id-images-dir ../assets/photomaker_examples/scarletthead_woman -p "a girl img, retro futurism" --cfg-scale 5.0 --sampling-method euler -H 1024 -W 1024 --pm-style-strength 10 --vae-on-cpu --steps 50 |
| LCM | Fast generation with LoRA | --cfg-scale 1.0, --steps 2-8, --sampling-method lcm/euler_a | lcm-lora-sdv1-5 | ./bin/sd -m ../models/v1-5-pruned-emaonly.safetensors -p "a lovely cat<lora:lcm-lora-sdv1-5:1>" --steps 4 --lora-model-dir ../models -v --cfg-scale 1 |
| SSD1B | Standard prompt-based generation | Standard SD parameters | (Various SSD-1B models) | ./bin/sd -m ../models/ssd-1b.safetensors -p "a lovely cat" |
| Tiny SD | Standard prompt-based generation | Standard SD parameters | (Various Tiny SD models) | ./bin/sd -m ../models/tiny-sd.safetensors -p "a lovely cat" |
The stable-diffusion.cpp library supports various quantization levels to balance model size and performance:
| Quantization Level | Description | Model Size Reduction | Quality Impact |
|---|---|---|---|
| `f32` | 32-bit floating-point | None (original) | No quality loss |
| `f16` | 16-bit floating-point | ~50% | Minimal quality loss |
| `q8_0` | 8-bit integer quantization | ~75% | Slight quality loss |
| `q5_0`, `q5_1` | 5-bit integer quantization | ~80% | Moderate quality loss |
| `q4_0`, `q4_1` | 4-bit integer quantization | ~85% | Noticeable quality loss |
| `q3_k` | 3-bit K-quantization | ~87% | Significant quality loss |
| `q4_k` | 4-bit K-quantization | ~85% | Good balance of size/quality |
| `q2_k` | 2-bit K-quantization | ~90% | Major quality loss |
| `Q4_K_S` | 4-bit K-quantization (small) | ~85% | Optimized for smaller models |
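The size reductions above follow roughly from bits per weight. A small sketch for ballpark estimates; the bits-per-weight figures are approximate effective values for GGUF-style block quantization (scales included), and the helper ignores non-weight tensors and metadata, so treat results as rough guides only.

```python
# Approximate effective bits per weight (assumption: GGUF-style blocks
# with embedded scales; real files vary slightly).
BITS_PER_WEIGHT = {
    "f32": 32.0,
    "f16": 16.0,
    "q8_0": 8.5,
    "q5_0": 5.5,
    "q4_0": 4.5,
    "q3_k": 3.4,
    "q2_k": 2.6,
}

def estimated_size_gb(param_count: int, quant: str) -> float:
    """Rough on-disk size in GiB for a model with the given weight count."""
    bits = BITS_PER_WEIGHT[quant]
    return param_count * bits / 8 / 1024**3
```

For example, a 1B-parameter model is roughly 3.7 GiB at `f32` and half that at `f16`, matching the ~50% reduction in the table.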
To convert models from their original format to quantized GGUF format, use the following commands:
```sh
# Convert SD 1.5 model to 8-bit quantization
./bin/sd -M convert -m ../models/v1-5-pruned-emaonly.safetensors -o ../models/v1-5-pruned-emaonly.q8_0.gguf -v --type q8_0

# Convert SDXL model to 4-bit quantization
./bin/sd -M convert -m ../models/sd_xl_base_1.0.safetensors -o ../models/sd_xl_base_1.0.q4_0.gguf -v --type q4_0

# Convert Flux Dev model to 8-bit quantization
./bin/sd -M convert -m ../models/flux1-dev.sft -o ../models/flux1-dev-q8_0.gguf -v --type q8_0

# Convert Flux Schnell model to 3-bit K-quantization
./bin/sd -M convert -m ../models/flux1-schnell.sft -o ../models/flux1-schnell-q3_k.gguf -v --type q3_k

# Convert Chroma model to 8-bit quantization
./bin/sd -M convert -m ../models/chroma-unlocked-v40.safetensors -o ../models/chroma-unlocked-v40-q8_0.gguf -v --type q8_0

# Convert Kontext model to 8-bit quantization
./bin/sd -M convert -m ../models/flux1-kontext-dev.safetensors -o ../models/flux1-kontext-dev-q8_0.gguf -v --type q8_0
```
The project supports LoRA (Low-Rank Adaptation) models for fine-tuning and style transfer:
| LoRA Model | Compatible Base Models | Example Usage |
|---|---|---|
| `marblesh.safetensors` | SD 1.5, SD 2.1 | `./bin/sd -m ../models/v1-5-pruned-emaonly.safetensors -p "a lovely cat<lora:marblesh:1>" --lora-model-dir ../models` |
| `lcm-lora-sdv1-5` | SD 1.5 | `./bin/sd -m ../models/v1-5-pruned-emaonly.safetensors -p "a lovely cat<lora:lcm-lora-sdv1-5:1>" --steps 4 --lora-model-dir ../models -v --cfg-scale 1` |
| `realism_lora_comfy_converted` | FLUX Models | `./bin/sd --diffusion-model ../models/flux1-dev-q8_0.gguf --vae ../models/ae.sft --clip_l ../models/clip_l.safetensors --t5xxl ../models/t5xxl_fp16.safetensors -p "a lovely cat holding a sign says 'flux.cpp'<lora:realism_lora_comfy_converted:1>" --cfg-scale 1.0 --sampling-method euler -v --lora-model-dir ../models --clip-on-cpu` |
| `wan2.2_t2v_lightx2v_4steps_lora_v1.1_low_noise` | Wan2.2 T2V Models | `./bin/sd -M vid_gen --diffusion-model ../models/Wan2.2-T2V-A14B-LowNoise-Q8_0.gguf --lora-model-dir ../models -p "a lovely cat<lora:wan2.2_t2v_lightx2v_4steps_lora_v1.1_low_noise:1>" --steps 4` |
| `wan2.2_t2v_lightx2v_4steps_lora_v1.1_high_noise` | Wan2.2 T2V Models | `./bin/sd -M vid_gen --diffusion-model ../models/Wan2.2-T2V-A14B-HighNoise-Q8_0.gguf --lora-model-dir ../models -p "a lovely cat<lora:wan2.2_t2v_lightx2v_4steps_lora_v1.1_high_noise:1>" --steps 4` |
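LoRAs are applied by embedding a `<lora:name:weight>` tag directly in the prompt, as the examples above show. A tiny helper for composing that syntax (the tag format comes from stable-diffusion.cpp's prompt parser; the helper itself is illustrative):

```python
def with_lora(prompt: str, lora_name: str, weight: float = 1.0) -> str:
    """Append a <lora:name:weight> tag to a prompt string."""
    weight_str = f"{weight:g}"  # %g drops trailing zeros: 1.0 -> "1"
    return f"{prompt}<lora:{lora_name}:{weight_str}>"
```

The model name must match a file (minus extension) in the directory passed via `--lora-model-dir`.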
```sh
# Use ESRGAN for upscaling generated images
./bin/sd -m ../models/v1-5-pruned-emaonly.safetensors -p "a lovely cat" --upscale-model ../models/RealESRGAN_x4plus_anime_6B.pth

# Use TAESD for faster VAE decoding
./bin/sd -m ../models/v1-5-pruned-emaonly.safetensors -p "a lovely cat" --taesd ../models/diffusion_pytorch_model.safetensors
```
The project supports various model types, each with specific file extensions:
| Model Type | Enum Value | Description | Supported Extensions |
|---|---|---|---|
| LORA | 1 | Low-Rank Adaptation models | .safetensors, .pt, .ckpt |
| CHECKPOINT | 2 | Main model checkpoints | .safetensors, .pt, .ckpt |
| VAE | 4 | Variational Autoencoder models | .safetensors, .pt, .ckpt |
| PRESETS | 8 | Generation preset files | .json, .yaml, .yml |
| PROMPTS | 16 | Prompt template files | .txt, .json |
| NEG_PROMPTS | 32 | Negative prompt templates | .txt, .json |
| TAESD | 64 | Tiny AutoEncoder for SD | .safetensors, .pt, .ckpt |
| ESRGAN | 128 | Super-resolution models | .pth, .pt |
| CONTROLNET | 256 | ControlNet models | .safetensors, .pt, .ckpt |
| UPSCALER | 512 | Image upscaler models | .pth, .pt |
| EMBEDDING | 1024 | Textual embeddings | .safetensors, .pt, .ckpt |
```cpp
enum ModelType {
    LORA = 1,
    CHECKPOINT = 2,
    VAE = 4,
    PRESETS = 8,
    PROMPTS = 16,
    NEG_PROMPTS = 32,
    TAESD = 64,
    ESRGAN = 128,
    CONTROLNET = 256,
    UPSCALER = 512,
    EMBEDDING = 1024
};
```
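Because the enum values are distinct powers of two, they can be combined as a bitmask to request several model types at once. A client-side mirror in Python (the values match the C++ enum; the idea of OR-ing them into a filter is an assumption about how a client might use them, not a documented API contract):

```python
from enum import IntFlag

class ModelType(IntFlag):
    """Python mirror of the server's ModelType bitmask."""
    LORA = 1
    CHECKPOINT = 2
    VAE = 4
    PRESETS = 8
    PROMPTS = 16
    NEG_PROMPTS = 32
    TAESD = 64
    ESRGAN = 128
    CONTROLNET = 256
    UPSCALER = 512
    EMBEDDING = 1024

# Compose a filter for two types; membership is a bitwise-AND test.
wanted = ModelType.LORA | ModelType.VAE
```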
Generation:
- `POST /api/v1/generate/text2img` - Generate image from text prompt
- `POST /api/v1/generate/img2img` - Transform image with text prompt
- `POST /api/v1/generate/inpainting` - Inpaint image with mask
- `GET /api/v1/generate/{job_id}` - Get generation status and result
- `DELETE /api/v1/generate/{job_id}` - Cancel a generation job

Models:
- `GET /api/v1/models` - List available models
- `GET /api/v1/models/{type}` - List models of a specific type
- `POST /api/v1/models/load` - Load a model
- `POST /api/v1/models/unload` - Unload a model
- `GET /api/v1/models/{model_id}` - Get model information

Server:
- `GET /api/v1/status` - Get server status and statistics
- `GET /api/v1/system` - Get system information (GPU, memory, etc.)

Example request body for `POST /api/v1/generate`:
```json
{
  "prompt": "a beautiful landscape",
  "negative_prompt": "blurry, low quality",
  "model": "sd-v1-5",
  "width": 512,
  "height": 512,
  "steps": 20,
  "cfg_scale": 7.5,
  "seed": -1,
  "batch_size": 1
}
```
Response:
```json
{
  "job_id": "uuid-string",
  "status": "completed",
  "images": [
    {
      "data": "base64-encoded-image-data",
      "seed": 12345,
      "parameters": {
        "prompt": "a beautiful landscape",
        "negative_prompt": "blurry, low quality",
        "model": "sd-v1-5",
        "width": 512,
        "height": 512,
        "steps": 20,
        "cfg_scale": 7.5,
        "seed": 12345,
        "batch_size": 1
      }
    }
  ],
  "generation_time": 3.2
}
```
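Each entry in `images` carries the PNG bytes as base64 text, so a client decodes them before writing to disk. A small sketch using the field names from the response above:

```python
import base64

def decode_images(result: dict) -> list[bytes]:
    """Return raw image bytes for a completed generation result.

    Field names ("status", "images", "data") follow the example response;
    non-completed results yield an empty list.
    """
    if result.get("status") != "completed":
        return []
    return [base64.b64decode(img["data"]) for img in result.get("images", [])]
```

The decoded bytes can then be written straight to a `.png` file or wrapped in a data URL for a browser.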
The server supports multiple authentication methods to secure API access:
- No Authentication (default)
- JWT Token Authentication
- API Key Authentication
- PAM Authentication
Authentication can be configured via command-line arguments or configuration files:
```sh
# Enable PAM authentication
./stable-diffusion-rest-server --auth pam --models-dir /path/to/models --checkpoints checkpoints

# Enable JWT authentication
./stable-diffusion-rest-server --auth jwt --models-dir /path/to/models --checkpoints checkpoints

# Enable API key authentication
./stable-diffusion-rest-server --auth api-key --models-dir /path/to/models --checkpoints checkpoints

# Enable Unix authentication
./stable-diffusion-rest-server --auth unix --models-dir /path/to/models --checkpoints checkpoints

# No authentication (default)
./stable-diffusion-rest-server --auth none --models-dir /path/to/models --checkpoints checkpoints
```
- `none` - No authentication required (default)
- `jwt` - JWT token authentication
- `api-key` - API key authentication
- `unix` - Unix system authentication
- `pam` - PAM authentication
- `optional` - Authentication optional (guest access allowed)

The following options are deprecated and will be removed in a future version:
- `--enable-unix-auth` - Use `--auth unix` instead
- `--enable-pam-auth` - Use `--auth pam` instead

Authentication endpoints:
- `POST /api/v1/auth/login` - Authenticate with username/password (PAM/JWT)
- `POST /api/v1/auth/refresh` - Refresh JWT token
- `GET /api/v1/auth/profile` - Get current user profile
- `POST /api/v1/auth/logout` - Logout/invalidate token

For detailed authentication setup instructions, see PAM_AUTHENTICATION.md.
1. Clone the repository:

   ```sh
   git clone https://github.com/your-username/stable-diffusion.cpp-rest.git
   cd stable-diffusion.cpp-rest
   ```

2. Create a build directory:

   ```sh
   mkdir build
   cd build
   ```

3. Configure with CMake:

   ```sh
   cmake ..
   ```

4. Build the project:

   ```sh
   cmake --build . --parallel
   ```

5. (Optional) Install the binary:

   ```sh
   cmake --install .
   ```
The project uses CMake's external project feature to automatically download and build the stable-diffusion.cpp library:
```cmake
include(ExternalProject)

ExternalProject_Add(
    stable-diffusion.cpp
    GIT_REPOSITORY https://github.com/leejet/stable-diffusion.cpp.git
    GIT_TAG master-334-d05e46c
    SOURCE_DIR "${CMAKE_CURRENT_BINARY_DIR}/stable-diffusion.cpp-src"
    BINARY_DIR "${CMAKE_CURRENT_BINARY_DIR}/stable-diffusion.cpp-build"
    CMAKE_ARGS -DCMAKE_INSTALL_PREFIX=${CMAKE_CURRENT_BINARY_DIR}/stable-diffusion.cpp-install
    INSTALL_COMMAND ""
)
```
PAM authentication support is enabled by default when PAM libraries are available. You can control this with CMake options:
```sh
# Build with PAM support (default when available)
cmake -DENABLE_PAM_AUTH=ON ..

# Build without PAM support
cmake -DENABLE_PAM_AUTH=OFF ..

# Check if PAM support will be built
cmake -LA | grep ENABLE_PAM_AUTH
```
Note: PAM authentication requires the PAM development libraries:
```sh
# Ubuntu/Debian
sudo apt-get install libpam0g-dev

# CentOS/RHEL/Fedora
sudo yum install pam-devel
```
```sh
# Basic usage
./stable-diffusion-rest-server

# With custom configuration
./stable-diffusion-rest-server --config config.json

# With custom model directory
./stable-diffusion-rest-server --models-dir /path/to/models
```
```python
import requests
import base64

# Generate an image
response = requests.post('http://localhost:8080/api/v1/generate', json={
    'prompt': 'a beautiful landscape',
    'width': 512,
    'height': 512,
    'steps': 20
})

result = response.json()
if result['status'] == 'completed':
    # Decode and save the first image
    image_data = base64.b64decode(result['images'][0]['data'])
    with open('generated_image.png', 'wb') as f:
        f.write(image_data)
```
```javascript
async function generateImage() {
    const response = await fetch('http://localhost:8080/api/v1/generate', {
        method: 'POST',
        headers: {
            'Content-Type': 'application/json'
        },
        body: JSON.stringify({
            prompt: 'a beautiful landscape',
            width: 512,
            height: 512,
            steps: 20
        })
    });

    const result = await response.json();
    if (result.status === 'completed') {
        // Create an image element with the generated image
        const img = document.createElement('img');
        img.src = `data:image/png;base64,${result.images[0].data}`;
        document.body.appendChild(img);
    }
}
```
```sh
# Generate an image
curl -X POST http://localhost:8080/api/v1/generate \
  -H "Content-Type: application/json" \
  -d '{
    "prompt": "a beautiful landscape",
    "width": 512,
    "height": 512,
    "steps": 20
  }'

# Check job status
curl http://localhost:8080/api/v1/generate/{job_id}

# List available models
curl http://localhost:8080/api/v1/models
```
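Since generation is queued, a client typically polls the status endpoint until the job reaches a terminal state. A minimal polling sketch; the terminal state names are inferred from this README's examples (`completed`) and the cancel endpoint (`failed` and `cancelled` are assumptions):

```python
import json
import time
import urllib.request

# Assumed terminal states; only "completed" is shown explicitly above.
TERMINAL_STATES = {"completed", "failed", "cancelled"}

def is_terminal(status: str) -> bool:
    return status in TERMINAL_STATES

def poll_job(base_url: str, job_id: str, interval: float = 1.0,
             timeout: float = 300.0) -> dict:
    """Poll GET /api/v1/generate/{job_id} until the job finishes."""
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        with urllib.request.urlopen(f"{base_url}/api/v1/generate/{job_id}") as resp:
            result = json.load(resp)
        if is_terminal(result.get("status", "")):
            return result
        time.sleep(interval)
    raise TimeoutError(f"job {job_id} did not finish within {timeout}s")
```

Choose the polling interval against your typical generation time; a second or two is usually plenty for multi-second diffusion jobs.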
Contributions are welcome! Please feel free to submit a Pull Request. For major changes, please open an issue first to discuss what you would like to change.
This project is licensed under the MIT License - see the LICENSE file for details.