A production-ready C++ REST API server that wraps the stable-diffusion.cpp library, providing comprehensive HTTP endpoints for image generation with Stable Diffusion models. It features a modern web interface built with Next.js and a robust authentication system.
The stable-diffusion.cpp-rest project aims to create a high-performance REST API server that wraps the functionality of the stable-diffusion.cpp library. This enables developers to integrate Stable Diffusion image generation capabilities into their applications through standard HTTP requests, rather than directly using the C++ library.
A modern, responsive web interface is included and built automatically with the server. It uses Next.js 16, React 19, and Tailwind CSS 4.
# Build (automatically builds web UI)
mkdir build && cd build
cmake ..
cmake --build .
# Run server with web UI
./src/stable-diffusion-rest-server --models-dir /path/to/models --ui-dir ../build/webui
# Access web UI
open http://localhost:8080/ui/
webui/
├── app/ # Next.js app directory
│ ├── components/ # React components
│ ├── lib/ # Utilities and API clients
│ └── globals.css # Global styles
├── public/ # Static assets
├── package.json # Dependencies
├── next.config.ts # Next.js configuration
└── tsconfig.json # TypeScript configuration
The project is designed with a modular architecture consisting of three main components:
Provides type-based model organization
┌─────────────────┐ ┌─────────────────┐ ┌─────────────────┐
│ HTTP Server │───▶│ Generation Queue│───▶│ Model Manager │
│ │ │ │ │ │
│ - Request Parse │ │ - Job Queue │ │ - Model Loading │
│ - Response │ │ - Sequential │ │ - Type Detection│
│ Formatting │ │ Processing │ │ - Memory Mgmt │
└─────────────────┘ └─────────────────┘ └─────────────────┘
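The flow in the diagram can be sketched as a single worker draining a FIFO queue. This is an illustrative Python sketch of the sequential-processing idea only, not the server's C++ implementation; `run_pipeline` and the job dictionary fields are hypothetical names.

```python
import queue
import threading


def run_pipeline(prompts):
    """Process jobs one at a time, in submission order, on a single worker."""
    jobs = queue.Queue()  # stands in for the generation queue
    results = []

    def worker():
        while True:
            job = jobs.get()
            if job is None:  # sentinel: no more jobs will arrive
                break
            # A real worker would ask the model manager for a loaded model
            # and run inference; here we only record that the job ran.
            results.append({"job_id": job["job_id"], "status": "completed"})

    thread = threading.Thread(target=worker)
    thread.start()
    # The HTTP layer would enqueue one job per parsed request.
    for job_id, prompt in enumerate(prompts):
        jobs.put({"job_id": job_id, "prompt": prompt})
    jobs.put(None)
    thread.join()
    return results
```

Because only one worker thread consumes the queue, jobs finish in the order they were submitted, which is the "Sequential Processing" property shown in the diagram.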
stable-diffusion.cpp-rest/
├── CMakeLists.txt # Main CMake configuration
├── README.md # This file
├── src/ # Source code directory
│ ├── main.cpp # Application entry point
│ ├── http/ # HTTP server implementation
│ │ ├── server.h/.cpp # HTTP server class
│ │ ├── handlers.h/.cpp # Request handlers
│ │ └── responses.h/.cpp # Response formatting
│ ├── generation/ # Generation queue implementation
│ │ ├── queue.h/.cpp # Job queue management
│ │ ├── worker.h/.cpp # Generation worker thread
│ │ └── job.h/.cpp # Job definition and status
│ ├── models/ # Model manager implementation
│ │ ├── manager.h/.cpp # Model manager class
│ │ ├── loader.h/.cpp # Model loading logic
│ │ └── types.h/.cpp # Model type definitions
│ └── utils/ # Utility functions
│ ├── config.h/.cpp # Configuration management
│ └── logger.h/.cpp # Logging utilities
├── include/ # Public header files
├── external/ # External dependencies (managed by CMake)
├── models/ # Default model storage directory
│ ├── lora/ # LoRA models
│ ├── checkpoints/ # Checkpoint models
│ ├── vae/ # VAE models
│ ├── presets/ # Preset files
│ ├── prompts/ # Prompt templates
│ ├── neg_prompts/ # Negative prompt templates
│ ├── taesd/ # TAESD models
│ ├── esrgan/ # ESRGAN models
│ ├── controlnet/ # ControlNet models
│ ├── upscaler/ # Upscaler models
│ └── embeddings/ # Textual embeddings
├── tests/ # Unit and integration tests
├── examples/ # Usage examples
└── docs/ # Additional documentation
The stable-diffusion.cpp-rest server includes intelligent model detection that automatically identifies the model architecture and selects the appropriate loading method. This ensures compatibility with both traditional Stable Diffusion models and modern architectures.
These models are loaded using the traditional ctxParams.model_path parameter.
These models are loaded using the modern ctxParams.diffusion_model_path parameter.
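As a rough illustration of the selection step, the lookup below maps an architecture label to the parameter it would be loaded through. It is a hypothetical sketch: the grouping is inferred from which example commands in the tables of this README use `-m` versus `--diffusion-model`, and it assumes those flags correspond to `ctxParams.model_path` and `ctxParams.diffusion_model_path` respectively. The server's real detection inspects the model file and is not shown here.

```python
# Labels follow the architecture tables in this README.
# Assumption: -m maps to model_path, --diffusion-model to diffusion_model_path.
TRADITIONAL = {
    "SD 1.x", "SD 2.x", "SDXL", "SD3", "SD3.5 Large",
    "PhotoMaker", "LCM", "SSD1B", "Tiny SD",
}
MODERN = {
    "FLUX Models", "Kontext", "Chroma", "Wan Models",
    "Wan2.2 T2V A14B", "Wan2.2 I2V A14B",
    "Qwen Image Models", "Qwen Image Edit",
}


def context_path_field(architecture: str) -> str:
    """Return the ctxParams field name used to load the given architecture."""
    if architecture in TRADITIONAL:
        return "model_path"
    if architecture in MODERN:
        return "diffusion_model_path"
    raise ValueError(f"unknown architecture: {architecture!r}")
```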
Loading parameters: ctxParams.model_path (traditional), ctxParams.diffusion_model_path (modern).
Note: The following tables contain extensive information and may require horizontal scrolling to view all columns properly.
| Architecture | Extra VAE | Standalone High Noise | T5XXL | CLIP-Vision | CLIP-G | CLIP-L | Model Files | Example Commands |
|---|---|---|---|---|---|---|---|---|
| SD 1.x | No | No | No | No | No | No | sd-v1-4.ckpt, v1-5-pruned-emaonly.safetensors | ./bin/sd -m ../models/sd-v1-4.ckpt -p "a lovely cat" |
| SD 2.x | No | No | No | No | No | No | (Similar to SD 1.x) | ./bin/sd -m ../models/sd2-model.ckpt -p "a lovely cat" |
| SDXL | Yes | No | No | No | No | No | sd_xl_base_1.0.safetensors, sdxl_vae-fp16-fix.safetensors | ./bin/sd -m ../models/sd_xl_base_1.0.safetensors --vae ../models/sdxl_vae-fp16-fix.safetensors -H 1024 -W 1024 -p "a lovely cat" -v |
| SD3 | No | No | Yes | No | No | No | sd3_medium_incl_clips_t5xxlfp16.safetensors | ./bin/sd -m ../models/sd3_medium_incl_clips_t5xxlfp16.safetensors -H 1024 -W 1024 -p 'a lovely cat holding a sign says "Stable Diffusion CPP"' --cfg-scale 4.5 --sampling-method euler -v --clip-on-cpu |
| SD3.5 Large | No | No | Yes | No | Yes | Yes | sd3.5_large.safetensors, clip_l.safetensors, clip_g.safetensors, t5xxl_fp16.safetensors | ./bin/sd -m ../models/sd3.5_large.safetensors --clip_l ../models/clip_l.safetensors --clip_g ../models/clip_g.safetensors --t5xxl ../models/t5xxl_fp16.safetensors -H 1024 -W 1024 -p 'a lovely cat holding a sign says "Stable diffusion 3.5 Large"' --cfg-scale 4.5 --sampling-method euler -v --clip-on-cpu |
| FLUX Models | Yes | No | Yes | No | No | Yes | flux1-dev-q3_k.gguf, flux1-dev-q8_0.gguf, flux1-schnell-q8_0.gguf, ae.sft, clip_l.safetensors, t5xxl_fp16.safetensors | ./bin/sd --diffusion-model ../models/flux1-dev-q8_0.gguf --vae ../models/ae.sft --clip_l ../models/clip_l.safetensors --t5xxl ../models/t5xxl_fp16.safetensors -p "a lovely cat holding a sign says 'flux.cpp'" --cfg-scale 1.0 --sampling-method euler -v --clip-on-cpu |
| Kontext | Yes | No | Yes | No | No | Yes | flux1-kontext-dev-q8_0.gguf, ae.sft, clip_l.safetensors, t5xxl_fp16.safetensors | ./bin/sd -r ./flux1-dev-q8_0.png --diffusion-model ../models/flux1-kontext-dev-q8_0.gguf --vae ../models/ae.sft --clip_l ../models/clip_l.safetensors --t5xxl ../models/t5xxl_fp16.safetensors -p "change 'flux.cpp' to 'kontext.cpp'" --cfg-scale 1.0 --sampling-method euler -v --clip-on-cpu |
| Chroma | Yes | No | Yes | No | No | No | chroma-unlocked-v40-q8_0.gguf, ae.sft, t5xxl_fp16.safetensors | ./bin/sd --diffusion-model ../models/chroma-unlocked-v40-q8_0.gguf --vae ../models/ae.sft --t5xxl ../models/t5xxl_fp16.safetensors -p "a lovely cat holding a sign says 'chroma.cpp'" --cfg-scale 4.0 --sampling-method euler -v --chroma-disable-dit-mask --clip-on-cpu |
| Wan Models | Yes | No | Yes | Yes | No | No | wan2.1_t2v_1.3B_fp16.safetensors, wan_2.1_vae.safetensors, umt5-xxl-encoder-Q8_0.gguf | ./bin/sd -M vid_gen --diffusion-model ../models/wan2.1_t2v_1.3B_fp16.safetensors --vae ../models/wan_2.1_vae.safetensors --t5xxl ../models/umt5-xxl-encoder-Q8_0.gguf -p "a lovely cat" --cfg-scale 6.0 --sampling-method euler -v -W 832 -H 480 --diffusion-fa --video-frames 33 --flow-shift 3.0 |
| Wan2.2 T2V A14B | No | Yes | No | No | No | No | Wan2.2-T2V-A14B-LowNoise-Q8_0.gguf, Wan2.2-T2V-A14B-HighNoise-Q8_0.gguf | ./bin/sd -M vid_gen --diffusion-model ../models/Wan2.2-T2V-A14B-LowNoise-Q8_0.gguf -p "a lovely cat" --cfg-scale 6.0 --sampling-method euler -v -W 832 -H 480 --diffusion-fa --video-frames 33 --flow-shift 3.0 |
| Wan2.2 I2V A14B | No | Yes | No | No | No | No | Wan2.2-I2V-A14B-LowNoise-Q8_0.gguf, Wan2.2-I2V-A14B-HighNoise-Q8_0.gguf | ./bin/sd -M vid_gen --diffusion-model ../models/Wan2.2-I2V-A14B-LowNoise-Q8_0.gguf -r input.png -p "a lovely cat" --cfg-scale 6.0 --sampling-method euler -v -W 832 -H 480 --diffusion-fa --video-frames 33 --flow-shift 3.0 |
| Qwen Image Models | Yes | No | No | No | No | No | qwen-image-Q8_0.gguf, qwen_image_vae.safetensors, Qwen2.5-VL-7B-Instruct-Q8_0.gguf | ./bin/sd --diffusion-model ../models/qwen-image-Q8_0.gguf --vae ../models/qwen_image_vae.safetensors --qwen2vl ../models/Qwen2.5-VL-7B-Instruct-Q8_0.gguf -p '一个穿着"QWEN"标志的T恤的中国美女正拿着黑色的马克笔面相镜头微笑。' --cfg-scale 2.5 --sampling-method euler -v --offload-to-cpu -H 1024 -W 1024 --diffusion-fa --flow-shift 3 |
| Qwen Image Edit | Yes | No | No | No | No | No | Qwen_Image_Edit-Q8_0.gguf, qwen_image_vae.safetensors, qwen_2.5_vl_7b.safetensors | ./bin/sd --diffusion-model ../models/Qwen_Image_Edit-Q8_0.gguf --vae ../models/qwen_image_vae.safetensors --qwen2vl ../models/qwen_2.5_vl_7b.safetensors --cfg-scale 2.5 --sampling-method euler -v --offload-to-cpu --diffusion-fa --flow-shift 3 -r ../assets/flux/flux1-dev-q8_0.png -p "change 'flux.cpp' to 'edit.cpp'" |
| PhotoMaker | No | No | No | No | No | No | sdxlUnstableDiffusers_v11.safetensors, sdxl_vae.safetensors, photomaker-v1.safetensors | ./bin/sd -m ../models/sdxlUnstableDiffusers_v11.safetensors --vae ../models/sdxl_vae.safetensors --photo-maker ../models/photomaker-v1.safetensors --pm-id-images-dir ../assets/photomaker_examples/scarletthead_woman -p "a girl img, retro futurism" --cfg-scale 5.0 --sampling-method euler -H 1024 -W 1024 --pm-style-strength 10 --vae-on-cpu --steps 50 |
| LCM | No | No | No | No | No | No | lcm-lora-sdv1-5 | ./bin/sd -m ../models/v1-5-pruned-emaonly.safetensors -p "a lovely cat<lora:lcm-lora-sdv1-5:1>" --steps 4 --lora-model-dir ../models -v --cfg-scale 1 |
| SSD1B | No | No | No | No | No | No | (Various SSD-1B models) | ./bin/sd -m ../models/ssd-1b.safetensors -p "a lovely cat" |
| Tiny SD | No | No | No | No | No | No | (Various Tiny SD models) | ./bin/sd -m ../models/tiny-sd.safetensors -p "a lovely cat" |
| Architecture | Context Creation Method | Special Parameters | Model Files | Example Commands |
|---|---|---|---|---|
| SD 1.x, SD 2.x, SDXL | Standard prompt-based generation | --cfg-scale, --sampling-method, --steps | sd-v1-4.ckpt, v1-5-pruned-emaonly.safetensors, sd_xl_base_1.0.safetensors | ./bin/sd -m ../models/sd-v1-4.ckpt -p "a lovely cat" |
| SD3 | Multiple text encoders | --clip-on-cpu recommended | sd3_medium_incl_clips_t5xxlfp16.safetensors | ./bin/sd -m ../models/sd3_medium_incl_clips_t5xxlfp16.safetensors -H 1024 -W 1024 -p 'a lovely cat holding a sign says "Stable Diffusion CPP"' --cfg-scale 4.5 --sampling-method euler -v --clip-on-cpu |
| SD3.5 Large | Multiple text encoders | --clip-on-cpu recommended | sd3.5_large.safetensors, clip_l.safetensors, clip_g.safetensors, t5xxl_fp16.safetensors | ./bin/sd -m ../models/sd3.5_large.safetensors --clip_l ../models/clip_l.safetensors --clip_g ../models/clip_g.safetensors --t5xxl ../models/t5xxl_fp16.safetensors -H 1024 -W 1024 -p 'a lovely cat holding a sign says "Stable diffusion 3.5 Large"' --cfg-scale 4.5 --sampling-method euler -v --clip-on-cpu |
| FLUX Models | Text-to-image generation | --cfg-scale 1.0 recommended, --clip-on-cpu for memory efficiency | flux1-dev-q3_k.gguf, flux1-dev-q8_0.gguf, flux1-schnell-q8_0.gguf, ae.sft, clip_l.safetensors, t5xxl_fp16.safetensors | ./bin/sd --diffusion-model ../models/flux1-dev-q8_0.gguf --vae ../models/ae.sft --clip_l ../models/clip_l.safetensors --t5xxl ../models/t5xxl_fp16.safetensors -p "a lovely cat holding a sign says 'flux.cpp'" --cfg-scale 1.0 --sampling-method euler -v --clip-on-cpu |
| Kontext | Image-to-image transformation | -r for reference image, --cfg-scale 1.0 recommended | flux1-kontext-dev-q8_0.gguf, ae.sft, clip_l.safetensors, t5xxl_fp16.safetensors | ./bin/sd -r ./flux1-dev-q8_0.png --diffusion-model ../models/flux1-kontext-dev-q8_0.gguf --vae ../models/ae.sft --clip_l ../models/clip_l.safetensors --t5xxl ../models/t5xxl_fp16.safetensors -p "change 'flux.cpp' to 'kontext.cpp'" --cfg-scale 1.0 --sampling-method euler -v --clip-on-cpu |
| Chroma | Text-to-image generation | --cfg-scale 4.0 (matching the example command), --clip-on-cpu for memory efficiency | chroma-unlocked-v40-q8_0.gguf, ae.sft, t5xxl_fp16.safetensors | ./bin/sd --diffusion-model ../models/chroma-unlocked-v40-q8_0.gguf --vae ../models/ae.sft --t5xxl ../models/t5xxl_fp16.safetensors -p "a lovely cat holding a sign says 'chroma.cpp'" --cfg-scale 4.0 --sampling-method euler -v --chroma-disable-dit-mask --clip-on-cpu |
| Chroma1-Radiance | Text-to-image generation | --cfg-scale 4.0 recommended | Chroma1-Radiance-v0.4-Q8_0.gguf, t5xxl_fp16.safetensors | ./bin/sd --diffusion-model ../models/Chroma1-Radiance-v0.4-Q8_0.gguf --t5xxl ../models/clip/t5xxl_fp16.safetensors -p "a lovely cat holding a sign says 'chroma radiance cpp'" --cfg-scale 4.0 --sampling-method euler -v |
| Wan Models | Video generation with text prompts | -M vid_gen, --video-frames, --flow-shift, --diffusion-fa | wan2.1_t2v_1.3B_fp16.safetensors, wan_2.1_vae.safetensors, umt5-xxl-encoder-Q8_0.gguf | ./bin/sd -M vid_gen --diffusion-model ../models/wan2.1_t2v_1.3B_fp16.safetensors --vae ../models/wan_2.1_vae.safetensors --t5xxl ../models/umt5-xxl-encoder-Q8_0.gguf -p "a lovely cat" --cfg-scale 6.0 --sampling-method euler -v -W 832 -H 480 --diffusion-fa --video-frames 33 --flow-shift 3.0 |
| Wan2.1 I2V Models | Image-to-video generation | Requires clip_vision_h.safetensors | wan2.1-i2v-14b-480p-Q8_0.gguf, wan_2.1_vae.safetensors, clip_vision_h.safetensors | ./bin/sd -M vid_gen --diffusion-model ../models/wan2.1-i2v-14b-480p-Q8_0.gguf --vae ../models/wan_2.1_vae.safetensors -r input.png -p "a lovely cat" --cfg-scale 6.0 --sampling-method euler -v -W 832 -H 480 --diffusion-fa --video-frames 33 --flow-shift 3.0 |
| Wan2.1 FLF2V Models | First-last-frame-to-video generation | Requires clip_vision_h.safetensors | wan2.1-flf2v-14b-720p-Q8_0.gguf, wan_2.1_vae.safetensors, clip_vision_h.safetensors | ./bin/sd -M vid_gen --diffusion-model ../models/wan2.1-flf2v-14b-720p-Q8_0.gguf --vae ../models/wan_2.1_vae.safetensors -r flow.png -p "a lovely cat" --cfg-scale 6.0 --sampling-method euler -v -W 832 -H 480 --diffusion-fa --video-frames 33 --flow-shift 3.0 |
| Wan2.2 T2V A14B | Text-to-video generation | Uses dual diffusion models | Wan2.2-T2V-A14B-LowNoise-Q8_0.gguf, Wan2.2-T2V-A14B-HighNoise-Q8_0.gguf | ./bin/sd -M vid_gen --diffusion-model ../models/Wan2.2-T2V-A14B-LowNoise-Q8_0.gguf -p "a lovely cat" --cfg-scale 6.0 --sampling-method euler -v -W 832 -H 480 --diffusion-fa --video-frames 33 --flow-shift 3.0 |
| Wan2.2 I2V A14B | Image-to-video generation | Uses dual diffusion models | Wan2.2-I2V-A14B-LowNoise-Q8_0.gguf, Wan2.2-I2V-A14B-HighNoise-Q8_0.gguf | ./bin/sd -M vid_gen --diffusion-model ../models/Wan2.2-I2V-A14B-LowNoise-Q8_0.gguf -r input.png -p "a lovely cat" --cfg-scale 6.0 --sampling-method euler -v -W 832 -H 480 --diffusion-fa --video-frames 33 --flow-shift 3.0 |
| Qwen Image Models | Text-to-image generation with Chinese language support | --qwen2vl for the language model, --diffusion-fa, --flow-shift | qwen-image-Q8_0.gguf, qwen_image_vae.safetensors, Qwen2.5-VL-7B-Instruct-Q8_0.gguf | ./bin/sd --diffusion-model ../models/qwen-image-Q8_0.gguf --vae ../models/qwen_image_vae.safetensors --qwen2vl ../models/Qwen2.5-VL-7B-Instruct-Q8_0.gguf -p '一个穿着"QWEN"标志的T恤的中国美女正拿着黑色的马克笔面相镜头微笑。' --cfg-scale 2.5 --sampling-method euler -v --offload-to-cpu -H 1024 -W 1024 --diffusion-fa --flow-shift 3 |
| Qwen Image Edit | Image editing with reference image | -r for reference image, --qwen2vl_vision for vision model | Qwen_Image_Edit-Q8_0.gguf, qwen_image_vae.safetensors, qwen_2.5_vl_7b.safetensors | ./bin/sd --diffusion-model ../models/Qwen_Image_Edit-Q8_0.gguf --vae ../models/qwen_image_vae.safetensors --qwen2vl ../models/qwen_2.5_vl_7b.safetensors --cfg-scale 2.5 --sampling-method euler -v --offload-to-cpu --diffusion-fa --flow-shift 3 -r ../assets/flux/flux1-dev-q8_0.png -p "change 'flux.cpp' to 'edit.cpp'" |
| PhotoMaker | Personalized image generation with ID images | --photo-maker, --pm-id-images-dir, --pm-style-strength | sdxlUnstableDiffusers_v11.safetensors, sdxl_vae.safetensors, photomaker-v1.safetensors | ./bin/sd -m ../models/sdxlUnstableDiffusers_v11.safetensors --vae ../models/sdxl_vae.safetensors --photo-maker ../models/photomaker-v1.safetensors --pm-id-images-dir ../assets/photomaker_examples/scarletthead_woman -p "a girl img, retro futurism" --cfg-scale 5.0 --sampling-method euler -H 1024 -W 1024 --pm-style-strength 10 --vae-on-cpu --steps 50 |
| LCM | Fast generation with LoRA | --cfg-scale 1.0, --steps 2-8, --sampling-method lcm/euler_a | lcm-lora-sdv1-5 | ./bin/sd -m ../models/v1-5-pruned-emaonly.safetensors -p "a lovely cat<lora:lcm-lora-sdv1-5:1>" --steps 4 --lora-model-dir ../models -v --cfg-scale 1 |
| SSD1B | Standard prompt-based generation | Standard SD parameters | (Various SSD-1B models) | ./bin/sd -m ../models/ssd-1b.safetensors -p "a lovely cat" |
| Tiny SD | Standard prompt-based generation | Standard SD parameters | (Various Tiny SD models) | ./bin/sd -m ../models/tiny-sd.safetensors -p "a lovely cat" |
The stable-diffusion.cpp library supports various quantization levels to balance model size and performance:
| Quantization Level | Description | Model Size Reduction | Quality Impact |
|---|---|---|---|
| f32 | 32-bit floating-point | None (original) | No quality loss |
| f16 | 16-bit floating-point | ~50% | Minimal quality loss |
| q8_0 | 8-bit integer quantization | ~75% | Slight quality loss |
| q5_0, q5_1 | 5-bit integer quantization | ~80% | Moderate quality loss |
| q4_0, q4_1 | 4-bit integer quantization | ~85% | Noticeable quality loss |
| q3_k | 3-bit K-quantization | ~87% | Significant quality loss |
| q4_k | 4-bit K-quantization | ~85% | Good balance of size/quality |
| q2_k | 2-bit K-quantization | ~90% | Major quality loss |
| Q4_K_S | 4-bit K-quantization Small | ~85% | Optimized for smaller models |
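The reduction percentages can be sanity-checked with a little arithmetic. The sketch below assumes the standard ggml block layouts for the non-K formats, where each block stores 32 weights plus scale metadata (for example, q8_0 uses 34 bytes per block, or 8.5 bits per weight). The function names are illustrative, and real files come out somewhat larger because some tensors stay at higher precision.

```python
# Effective bits per weight, including per-block scale/offset metadata.
BITS_PER_WEIGHT = {
    "f32": 32.0,
    "f16": 16.0,
    "q8_0": 8.5,  # 32 x int8 + fp16 scale = 34 bytes per 32 weights
    "q5_1": 6.0,
    "q5_0": 5.5,
    "q4_1": 5.0,
    "q4_0": 4.5,  # 16 bytes of 4-bit values + fp16 scale = 18 bytes
}


def size_reduction(qtype: str) -> float:
    """Fraction of the f32 size saved by storing weights as `qtype`."""
    return 1.0 - BITS_PER_WEIGHT[qtype] / 32.0


def estimated_size_gb(params_billion: float, qtype: str) -> float:
    """Approximate weight storage in GB (10^9 bytes) for a model."""
    return params_billion * BITS_PER_WEIGHT[qtype] / 8.0
```

For instance, `size_reduction("q8_0")` evaluates to about 0.73 and `size_reduction("q4_0")` to about 0.86, in line with the rounded ~75% and ~85% figures in the table.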
To convert models from their original format to quantized GGUF format, use the following commands:
# Convert SD 1.5 model to 8-bit quantization
./bin/sd -M convert -m ../models/v1-5-pruned-emaonly.safetensors -o ../models/v1-5-pruned-emaonly.q8_0.gguf -v --type q8_0
# Convert SDXL model to 4-bit quantization
./bin/sd -M convert -m ../models/sd_xl_base_1.0.safetensors -o ../models/sd_xl_base_1.0.q4_0.gguf -v --type q4_0
# Convert Flux Dev model to 8-bit quantization
./bin/sd -M convert -m ../models/flux1-dev.sft -o ../models/flux1-dev-q8_0.gguf -v --type q8_0
# Convert Flux Schnell model to 3-bit K-quantization
./bin/sd -M convert -m ../models/flux1-schnell.sft -o ../models/flux1-schnell-q3_k.gguf -v --type q3_k
# Convert Chroma model to 8-bit quantization
./bin/sd -M convert -m ../models/chroma-unlocked-v40.safetensors -o ../models/chroma-unlocked-v40-q8_0.gguf -v --type q8_0
# Convert Kontext model to 8-bit quantization
./bin/sd -M convert -m ../models/flux1-kontext-dev.safetensors -o ../models/flux1-kontext-dev-q8_0.gguf -v --type q8_0
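When converting many models, it can help to assemble the command programmatically. The helper below is a small sketch that only reuses the flags shown in the commands above (`-M convert`, `-m`, `-o`, `-v`, `--type`); `convert_command` is a hypothetical name, and the accepted type list is taken from the lowercase entries of the quantization table.

```python
import shlex

# Lowercase type names from the quantization table in this README.
QUANT_TYPES = {
    "f32", "f16", "q8_0", "q5_0", "q5_1",
    "q4_0", "q4_1", "q3_k", "q4_k", "q2_k",
}


def convert_command(src, dst, qtype, sd_binary="./bin/sd"):
    """Build the argv list for converting `src` to a quantized GGUF file."""
    if qtype not in QUANT_TYPES:
        raise ValueError(f"unsupported quantization type: {qtype!r}")
    return [sd_binary, "-M", "convert", "-m", src, "-o", dst, "-v", "--type", qtype]


if __name__ == "__main__":
    argv = convert_command(
        "../models/v1-5-pruned-emaonly.safetensors",
        "../models/v1-5-pruned-emaonly.q8_0.gguf",
        "q8_0",
    )
    print(shlex.join(argv))  # pass argv to subprocess.run(argv, check=True) to execute
```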
The project supports LoRA (Low-Rank Adaptation) models for fine-tuning and style transfer:
| LoRA Model | Compatible Base Models | Example Usage |
|---|---|---|
| marblesh.safetensors | SD 1.5, SD 2.1 | ./bin/sd -m ../models/v1-5-pruned-emaonly.safetensors -p "a lovely cat<lora:marblesh:1>" --lora-model-dir ../models |
| lcm-lora-sdv1-5 | SD 1.5 | ./bin/sd -m ../models/v1-5-pruned-emaonly.safetensors -p "a lovely cat<lora:lcm-lora-sdv1-5:1>" --steps 4 --lora-model-dir ../models -v --cfg-scale 1 |
| realism_lora_comfy_converted | FLUX Models | ./bin/sd --diffusion-model ../models/flux1-dev-q8_0.gguf --vae ../models/ae.sft --clip_l ../models/clip_l.safetensors --t5xxl ../models/t5xxl_fp16.safetensors -p "a lovely cat holding a sign says 'flux.cpp'<lora:realism_lora_comfy_converted:1>" --cfg-scale 1.0 --sampling-method euler -v --lora-model-dir ../models --clip-on-cpu |
| wan2.2_t2v_lightx2v_4steps_lora_v1.1_low_noise | Wan2.2 T2V Models | ./bin/sd -M vid_gen --diffusion-model ../models/Wan2.2-T2V-A14B-LowNoise-Q8_0.gguf --lora-model-dir ../models -p "a lovely cat<lora:wan2.2_t2v_lightx2v_4steps_lora_v1.1_low_noise:1>" --steps 4 |
| wan2.2_t2v_lightx2v_4steps_lora_v1.1_high_noise | Wan2.2 T2V Models | ./bin/sd -M vid_gen --diffusion-model ../models/Wan2.2-T2V-A14B-HighNoise-Q8_0.gguf --lora-model-dir ../models -p "a lovely cat<lora:wan2.2_t2v_lightx2v_4steps_lora_v1.1_high_noise:1>" --steps 4 |
# Use ESRGAN for upscaling generated images
./bin/sd -m ../models/v1-5-pruned-emaonly.safetensors -p "a lovely cat" --upscale-model ../models/RealESRGAN_x4plus_anime_6B.pth
# Use TAESD for faster VAE decoding
./bin/sd -m ../models/v1-5-pruned-emaonly.safetensors -p "a lovely cat" --taesd ../models/diffusion_pytorch_model.safetensors
The project supports various model types, each with specific file extensions:
| Model Type | Enum Value | Description | Supported Extensions |
|---|---|---|---|
| LORA | 1 | Low-Rank Adaptation models | .safetensors, .pt, .ckpt |
| CHECKPOINT | 2 | Main model checkpoints | .safetensors, .pt, .ckpt |
| VAE | 4 | Variational Autoencoder models | .safetensors, .pt, .ckpt |
| PRESETS | 8 | Generation preset files | .json, .yaml, .yml |
| PROMPTS | 16 | Prompt template files | .txt, .json |
| NEG_PROMPTS | 32 | Negative prompt templates | .txt, .json |
| TAESD | 64 | Tiny AutoEncoder for SD | .safetensors, .pt, .ckpt |
| ESRGAN | 128 | Super-resolution models | .pth, .pt |
| CONTROLNET | 256 | ControlNet models | .safetensors, .pt, .ckpt |
| UPSCALER | 512 | Image upscaler models | .pth, .pt |
| EMBEDDING | 1024 | Textual embeddings | .safetensors, .pt, .ckpt |
enum ModelType {
LORA = 1,
CHECKPOINT = 2,
VAE = 4,
PRESETS = 8,
PROMPTS = 16,
NEG_PROMPTS = 32,
TAESD = 64,
ESRGAN = 128,
CONTROLNET = 256,
UPSCALER = 512,
EMBEDDING = 1024
};
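Because every enum value is a distinct power of two, the values can be combined into a bitmask. The Python mirror below is a client-side sketch under that assumption (the README does not state how the server combines them); `candidate_types` is a hypothetical helper that uses the extension table above to list which types a file could belong to.

```python
from enum import IntFlag
from pathlib import Path


class ModelType(IntFlag):
    LORA = 1
    CHECKPOINT = 2
    VAE = 4
    PRESETS = 8
    PROMPTS = 16
    NEG_PROMPTS = 32
    TAESD = 64
    ESRGAN = 128
    CONTROLNET = 256
    UPSCALER = 512
    EMBEDDING = 1024


_WEIGHTS = {".safetensors", ".pt", ".ckpt"}
EXTENSIONS = {
    ModelType.LORA: _WEIGHTS,
    ModelType.CHECKPOINT: _WEIGHTS,
    ModelType.VAE: _WEIGHTS,
    ModelType.PRESETS: {".json", ".yaml", ".yml"},
    ModelType.PROMPTS: {".txt", ".json"},
    ModelType.NEG_PROMPTS: {".txt", ".json"},
    ModelType.TAESD: _WEIGHTS,
    ModelType.ESRGAN: {".pth", ".pt"},
    ModelType.CONTROLNET: _WEIGHTS,
    ModelType.UPSCALER: {".pth", ".pt"},
    ModelType.EMBEDDING: _WEIGHTS,
}


def candidate_types(filename: str) -> ModelType:
    """Return a bitmask of every model type whose extensions match the file."""
    suffix = Path(filename).suffix.lower()
    mask = ModelType(0)
    for model_type, extensions in EXTENSIONS.items():
        if suffix in extensions:
            mask |= model_type
    return mask
```

The extension alone is ambiguous for most weight files, which is consistent with the project organizing models by directory (lora/, checkpoints/, vae/, and so on) rather than by suffix.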
POST /api/generate/text2img
Generate images from text prompts with comprehensive parameter support.
Example Request:
{
"prompt": "a beautiful landscape",
"negative_prompt": "blurry, low quality",
"width": 1024,
"height": 1024,
"steps": 20,
"cfg_scale": 7.5,
"sampling_method": "euler",
"scheduler": "karras",
"seed": "random",
"batch_count": 1,
"vae_model": "optional_vae_name"
}
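A client might assemble this request as follows. This is a minimal sketch using only the Python standard library; `build_text2img_request` is a hypothetical helper, the defaults are copied from the example request, and the size and step checks are client-side assumptions rather than the server's documented validation rules.

```python
import json
import urllib.request


def build_text2img_request(base_url, prompt, **overrides):
    """Return a POST request for /api/generate/text2img without sending it."""
    payload = {
        "prompt": prompt,
        "width": 1024,
        "height": 1024,
        "steps": 20,
        "cfg_scale": 7.5,
        "sampling_method": "euler",
    }
    payload.update(overrides)
    # Basic sanity checks before anything goes over the wire.
    for key in ("width", "height", "steps"):
        if not isinstance(payload[key], int) or payload[key] <= 0:
            raise ValueError(f"{key} must be a positive integer")
    return urllib.request.Request(
        base_url.rstrip("/") + "/api/generate/text2img",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
```

To send it, pass the result to `urllib.request.urlopen(request)` and decode the JSON body of the response.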
POST /api/generate/img2img
Transform existing images with text guidance.
Example Request:
{
"prompt": "transform into anime style",
"init_image": "base64_encoded_image_or_url",
"strength": 0.75,
"width": 1024,
"height": 1024,
"steps": 20,
"cfg_scale": 7.5
}
POST /api/generate/controlnet
Apply precise control using ControlNet models.
Example Request:
{
"prompt": "a person standing",
"control_image": "base64_encoded_control_image",
"control_net_model": "canny",
"control_strength": 0.9,
"width": 512,
"height": 512
}
POST /api/generate/inpainting
Edit specific regions of images using masks.
Example Request:
{
"prompt": "change hair color to blonde",
"source_image": "base64_encoded_source_image",
"mask_image": "base64_encoded_mask_image",
"strength": 0.75,
"width": 512,
"height": 512
}
POST /api/generate/upscale
Enhance image resolution using ESRGAN models.
Example Request:
{
"image": "base64_encoded_image",
"esrgan_model": "esrgan_model_name",
"upscale_factor": 4
}
GET /api/queue/job/{job_id}
Get detailed status and results for a specific job.
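Since jobs are queued, clients typically poll this endpoint until the job reaches a final state. The loop below is a sketch with an injectable `fetch_status` callable so it can be tried without a running server. The `"completed"` value appears in this README's example response; `"failed"` and `"cancelled"` are assumed names for the other terminal states.

```python
import time

# "completed" comes from the README; the other two values are assumptions.
TERMINAL_STATES = ("completed", "failed", "cancelled")


def wait_for_job(fetch_status, job_id, interval=1.0, timeout=300.0, sleep=time.sleep):
    """Poll fetch_status(job_id) until the job reaches a terminal state."""
    deadline = time.monotonic() + timeout
    while True:
        job = fetch_status(job_id)
        if job.get("status") in TERMINAL_STATES:
            return job
        if time.monotonic() >= deadline:
            raise TimeoutError(f"job {job_id} did not finish within {timeout} seconds")
        sleep(interval)
```

In practice `fetch_status` would issue `GET /api/queue/job/{job_id}` and return the parsed JSON body.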
GET /api/queue/status
Get current queue state and active jobs.
POST /api/queue/cancel
Cancel a pending or running job.
Example Request:
{
"job_id": "uuid-of-job-to-cancel"
}
POST /api/queue/clear
Clear all pending jobs from the queue.
GET /api/models
List all available models with metadata and filtering options.
Query Parameters:
type - Filter by model type (lora, checkpoint, vae, etc.)
search - Search in model names and descriptions
sort_by - Sort by name, size, date, type
sort_order - asc or desc
page - Page number for pagination
limit - Items per page
GET /api/models/{model_id}
Get detailed information about a specific model.
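For the list endpoint, GET /api/models, the query parameters (type, search, sort_by, sort_order, page, limit) can be encoded with the standard library. `models_url` is a hypothetical helper, and rejecting unknown parameter names is a client-side choice, not documented server behavior.

```python
from urllib.parse import urlencode

ALLOWED_PARAMS = ("type", "search", "sort_by", "sort_order", "page", "limit")


def models_url(base_url, **params):
    """Build a GET /api/models URL from the documented query parameters."""
    unknown = set(params) - set(ALLOWED_PARAMS)
    if unknown:
        raise ValueError(f"unsupported query parameters: {sorted(unknown)}")
    # Emit parameters in a fixed order and skip any that are None.
    query = urlencode(
        {key: params[key] for key in ALLOWED_PARAMS if params.get(key) is not None}
    )
    url = base_url.rstrip("/") + "/api/models"
    return f"{url}?{query}" if query else url
```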
POST /api/models/{model_id}/load
Load a model into memory.
POST /api/models/{model_id}/unload
Unload a model from memory.
GET /api/models/types
Get information about supported model types and their capabilities.
GET /api/models/directories
List and check status of model directories.
POST /api/models/refresh
Rescan model directories and update cache.
GET /api/models/stats
Get comprehensive statistics about models.
POST /api/models/batch
Perform batch operations on multiple models.
Example Request:
{
"operation": "load",
"models": ["model1", "model2", "model3"]
}
POST /api/models/validate
Validate model files and check compatibility.
POST /api/models/convert
Convert models between quantization formats.
Example Request:
{
"model_name": "checkpoint_model_name",
"quantization_type": "q8_0",
"output_path": "/path/to/output.gguf"
}
POST /api/models/hash
Generate SHA256 hashes for model verification.
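To compare a server-reported hash against a local copy of a model, the same digest can be computed on the client. This sketch reads the file in chunks so multi-gigabyte checkpoints do not need to fit in memory; the response format of the hash endpoint is not described here, so only the local side is shown.

```python
import hashlib


def sha256_file(path, chunk_size=1 << 20):
    """Return the hex SHA-256 digest of a file, reading 1 MiB at a time."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()
```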
GET /api/status
Get server status, queue information, and loaded models.
GET /api/system
Get detailed system information including hardware, capabilities, and limits.
GET /api/config
Get current server configuration and limits.
POST /api/system/restart
Trigger graceful server restart.
POST /api/auth/login
Authenticate user and receive access token.
Example Request:
{
"username": "admin",
"password": "password123"
}
GET /api/auth/validate
Validate and check current token status.
POST /api/auth/refresh
Refresh authentication token.
GET /api/auth/me
Get current user information and permissions.
POST /api/auth/logout
Logout and invalidate token.
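A client-side sketch of the login flow follows. The request body matches the login example above; attaching the token as an `Authorization: Bearer` header is an assumption based on common JWT practice, and the name of the token field in the login response is not documented here, so the token is passed in explicitly. Both helper names are hypothetical.

```python
import json
import urllib.request


def build_login_request(base_url, username, password):
    """Return a POST request for /api/auth/login without sending it."""
    body = json.dumps({"username": username, "password": password}).encode("utf-8")
    return urllib.request.Request(
        base_url.rstrip("/") + "/api/auth/login",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def with_token(request, token):
    """Attach an access token to a request (assumed Bearer scheme)."""
    request.add_header("Authorization", f"Bearer {token}")
    return request
```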
GET /api/samplers
Get available sampling methods and their properties.
GET /api/schedulers
Get available schedulers and their properties.
GET /api/parameters
Get detailed parameter information and validation rules.
POST /api/validate
Validate generation parameters before submission.
POST /api/estimate
Estimate generation time and memory usage.
POST /api/image/resize
POST /api/image/crop
Resize or crop images server-side.
GET /api/image/download?url=image_url
Download and encode images from URLs.
GET /api/v1/jobs/{job_id}/output/{filename}
GET /api/queue/job/{job_id}/output/{filename}
Download generated images and output files.
GET /api/v1/jobs/{job_id}/output/{filename}?thumb=1&size=200
Get thumbnails for faster web UI loading.
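Output and thumbnail URLs can be built from a job ID and filename as sketched below. `output_url` is a hypothetical helper; it percent-encodes the path segments so filenames containing spaces stay valid, and adds the `thumb` and `size` parameters shown above only when a thumbnail size is given.

```python
from urllib.parse import quote, urlencode


def output_url(base_url, job_id, filename, thumb_size=None):
    """Build the download URL for a job output, optionally as a thumbnail."""
    url = "{}/api/v1/jobs/{}/output/{}".format(
        base_url.rstrip("/"), quote(job_id, safe=""), quote(filename, safe="")
    )
    if thumb_size is not None:
        url += "?" + urlencode({"thumb": 1, "size": thumb_size})
    return url
```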
GET /api/health
Simple health check endpoint.
GET /api/version
Get detailed version and build information.
Public (No Authentication Required):
/api/health - Basic health check
/api/status - Server status (read-only)
/api/version - Version information
Protected (Authentication Required):
POST /api/v1/generate
{
"prompt": "a beautiful landscape",
"negative_prompt": "blurry, low quality",
"model": "sd-v1-5",
"width": 512,
"height": 512,
"steps": 20,
"cfg_scale": 7.5,
"seed": -1,
"batch_size": 1
}
{
"job_id": "uuid-string",
"status": "completed",
"images": [
{
"data": "base64-encoded-image-data",
"seed": 12345,
"parameters": {
"prompt": "a beautiful landscape",
"negative_prompt": "blurry, low quality",
"model": "sd-v1-5",
"width": 512,
"height": 512,
"steps": 20,
"cfg_scale": 7.5,
"seed": 12345,
"batch_size": 1
}
}
],
"generation_time": 3.2
}
The server supports multiple authentication methods to secure API access:
No Authentication (Default)
JWT Token Authentication
API Key Authentication
PAM Authentication
Authentication can be configured via command-line arguments or configuration files:
# Enable PAM authentication
./stable-diffusion-rest-server --auth pam --models-dir /path/to/models --checkpoints checkpoints
# Enable JWT authentication
./stable-diffusion-rest-server --auth jwt --models-dir /path/to/models --checkpoints checkpoints
# Enable API key authentication
./stable-diffusion-rest-server --auth api-key --models-dir /path/to/models --checkpoints checkpoints
# Enable Unix authentication
./stable-diffusion-rest-server --auth unix --models-dir /path/to/models --checkpoints checkpoints
# No authentication (default)
./stable-diffusion-rest-server --auth none --models-dir /path/to/models --checkpoints checkpoints
none - No authentication required (default)
jwt - JWT token authentication
api-key - API key authentication
unix - Unix system authentication
pam - PAM authentication
optional - Authentication optional (guest access allowed)
The following options are deprecated and will be removed in a future version:
--enable-unix-auth - Use --auth unix instead
--enable-pam-auth - Use --auth pam instead
POST /api/v1/auth/login - Authenticate with username/password (PAM/JWT)
POST /api/v1/auth/refresh - Refresh JWT token
GET /api/v1/auth/profile - Get current user profile
POST /api/v1/auth/logout - Logout/invalidate token
For detailed authentication setup instructions, see PAM_AUTHENTICATION.md.
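Launcher scripts that still pass the deprecated options can be migrated mechanically. The helper below is a hypothetical sketch that rewrites an argument list to the `--auth` form named above; it does not touch the server itself.

```python
# Deprecated flag -> value for the replacement --auth option.
DEPRECATED_AUTH_FLAGS = {
    "--enable-unix-auth": "unix",
    "--enable-pam-auth": "pam",
}


def normalize_auth_args(argv):
    """Replace deprecated auth flags with their --auth equivalents."""
    normalized = []
    for arg in argv:
        if arg in DEPRECATED_AUTH_FLAGS:
            normalized += ["--auth", DEPRECATED_AUTH_FLAGS[arg]]
        else:
            normalized.append(arg)
    return normalized
```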
Clone the repository:
git clone https://github.com/your-username/stable-diffusion.cpp-rest.git
cd stable-diffusion.cpp-rest
Create a build directory:
mkdir build
cd build
Configure with CMake:
cmake ..
Build the project:
cmake --build . --parallel
(Optional) Install the binary:
cmake --install .
The project uses CMake's external project feature to automatically download and build the stable-diffusion.cpp library:
include(ExternalProject)
ExternalProject_Add(
stable-diffusion.cpp
GIT_REPOSITORY https://github.com/leejet/stable-diffusion.cpp.git
GIT_TAG master-334-d05e46c
SOURCE_DIR "${CMAKE_CURRENT_BINARY_DIR}/stable-diffusion.cpp-src"
BINARY_DIR "${CMAKE_CURRENT_BINARY_DIR}/stable-diffusion.cpp-build"
CMAKE_ARGS -DCMAKE_INSTALL_PREFIX=${CMAKE_CURRENT_BINARY_DIR}/stable-diffusion.cpp-install
INSTALL_COMMAND ""
)
PAM authentication support is enabled by default when PAM libraries are available. You can control this with CMake options:
# Build with PAM support (default when available)
cmake -DENABLE_PAM_AUTH=ON ..
# Build without PAM support
cmake -DENABLE_PAM_AUTH=OFF ..
# Check if PAM support will be built
cmake -LA .. | grep ENABLE_PAM_AUTH
Note: PAM authentication requires the PAM development libraries:
# Ubuntu/Debian
sudo apt-get install libpam0g-dev
# CentOS/RHEL/Fedora
sudo yum install pam-devel
# Basic usage
./stable-diffusion-rest-server
# With custom configuration
./stable-diffusion-rest-server --config config.json
# With custom model directory
./stable-diffusion-rest-server --models-dir /path/to/models
import base64

import requests

# Generate an image
response = requests.post(
    'http://localhost:8080/api/v1/generate',
    json={
        'prompt': 'a beautiful landscape',
        'width': 512,
        'height': 512,
        'steps': 20,
    },
    timeout=600,  # generation can be slow; adjust for your hardware
)
response.raise_for_status()  # surface HTTP errors before parsing the body
result = response.json()
if result['status'] == 'completed':
    # Decode and save the first image
    image_data = base64.b64decode(result['images'][0]['data'])
    with open('generated_image.png', 'wb') as f:
        f.write(image_data)
async function generateImage() {
  const response = await fetch('http://localhost:8080/api/v1/generate', {
    method: 'POST',
    headers: {
      'Content-Type': 'application/json'
    },
    body: JSON.stringify({
      prompt: 'a beautiful landscape',
      width: 512,
      height: 512,
      steps: 20
    })
  });
  const result = await response.json();
  if (result.status === 'completed') {
    // Create an image element with the generated image
    const img = document.createElement('img');
    img.src = `data:image/png;base64,${result.images[0].data}`;
    document.body.appendChild(img);
  }
}
# Generate an image
curl -X POST http://localhost:8080/api/v1/generate \
-H "Content-Type: application/json" \
-d '{
"prompt": "a beautiful landscape",
"width": 512,
"height": 512,
"steps": 20
}'
# Check job status
curl http://localhost:8080/api/v1/generate/{job_id}
# List available models
curl http://localhost:8080/api/v1/models
Status: ✅ FIXED (See ISSUE_49_PROGRESS_CALLBACK_FIX.md)
The project is production-ready with:
This represents a mature, feature-complete implementation ready for production deployment with comprehensive documentation and robust error handling.
Contributions are welcome! Please feel free to submit a Pull Request. For major changes, please open an issue first to discuss what you would like to change.
This project is licensed under the MIT License - see the LICENSE file for details.