stable-diffusion.cpp-rest

A C++ based REST API wrapper for the stable-diffusion.cpp library, providing HTTP endpoints for image generation with Stable Diffusion models.

✨ Features

  • REST API - Complete HTTP API for Stable Diffusion image generation
  • Web UI - Modern, responsive web interface (automatically built with the server)
  • Queue System - Efficient job queue for managing generation requests
  • Model Management - Support for multiple model types with automatic detection
  • CUDA Support - Optional GPU acceleration for faster generation
  • Authentication - Multiple authentication methods including JWT, API keys, and PAM

Project Overview

The stable-diffusion.cpp-rest project aims to create a high-performance REST API server that wraps the functionality of the stable-diffusion.cpp library. This enables developers to integrate Stable Diffusion image generation capabilities into their applications through standard HTTP requests, rather than directly using the C++ library.

Objectives

  • Provide a simple, RESTful interface for Stable Diffusion image generation
  • Support all parameters available in examples/cli/main.cpp
  • Implement efficient resource management with a generation queue system
  • Support multiple model types with automatic detection and loading
  • Ensure thread-safe operation with separate HTTP server and generation threads

Web UI

A modern, responsive web interface is included and automatically built with the server!

Features:

  • Text-to-Image, Image-to-Image, Inpainting, and Upscaler interfaces
  • Real-time job queue monitoring
  • Model management (load/unload models, scan for new models)
  • Light/Dark theme with auto-detection
  • Full parameter control for generation
  • Interactive mask editor for inpainting

Quick Start:

# Build (automatically builds web UI)
mkdir build && cd build
cmake ..
cmake --build .

# Run server with web UI
./src/stable-diffusion-rest-server --models-dir /path/to/models --checkpoints checkpoints --ui-dir ../webui

# Access web UI
open http://localhost:8080/ui/

See WEBUI.md for detailed documentation.

Architecture

The project is designed with a modular architecture consisting of three main components:

HTTP Server

  • Handles incoming HTTP requests
  • Parses request parameters and validates input
  • Returns generated images or error responses
  • Operates independently of the generation process

Generation Queue

  • Manages image generation requests
  • Processes jobs sequentially (one at a time)
  • Maintains thread-safe operations
  • Provides job status tracking

Model Manager

  • Handles loading and management of different model types
  • Supports automatic model detection from default folders
  • Automatically detects diffusion model architecture and selects appropriate loading method
  • Supports traditional SD models (SD 1.5, SD 2.1, SDXL Base/Refiner) using ctxParams.model_path
  • Supports modern architectures (Flux Schnell/Dev/Chroma, SD3, Qwen2VL) using ctxParams.diffusion_model_path
  • Includes fallback mechanisms for unknown architectures
  • Applies optimal parameters based on detected model type
  • Manages model lifecycle and memory usage
  • Provides type-based model organization

    ┌─────────────────┐    ┌─────────────────┐    ┌─────────────────┐
    │   HTTP Server   │───▶│ Generation Queue│───▶│  Model Manager  │
    │                 │    │                 │    │                 │
    │ - Request Parse │    │ - Job Queue     │    │ - Model Loading │
    │ - Response      │    │ - Sequential    │    │ - Type Detection│
    │   Formatting    │    │   Processing    │    │ - Memory Mgmt   │
    └─────────────────┘    └─────────────────┘    └─────────────────┘
    

Technical Requirements

Core Technologies

  • C++17 or later
  • CMake 3.15 or later
  • Threading support (std::thread, std::mutex, std::condition_variable)
  • CUDA support (optional but recommended for performance)

Dependencies

  • stable-diffusion.cpp library (automatically downloaded via CMake)
  • HTTP server library (to be determined based on requirements)
  • JSON library for request/response handling
  • Build tools compatible with CMake

Platform Support

  • Linux (primary development platform)
  • Windows (planned support)
  • macOS (planned support)

Project Structure

stable-diffusion.cpp-rest/
├── CMakeLists.txt              # Main CMake configuration
├── README.md                   # This file
├── src/                        # Source code directory
│   ├── main.cpp                # Application entry point
│   ├── http/                   # HTTP server implementation
│   │   ├── server.h/.cpp       # HTTP server class
│   │   ├── handlers.h/.cpp     # Request handlers
│   │   └── responses.h/.cpp    # Response formatting
│   ├── generation/             # Generation queue implementation
│   │   ├── queue.h/.cpp        # Job queue management
│   │   ├── worker.h/.cpp       # Generation worker thread
│   │   └── job.h/.cpp          # Job definition and status
│   ├── models/                 # Model manager implementation
│   │   ├── manager.h/.cpp      # Model manager class
│   │   ├── loader.h/.cpp       # Model loading logic
│   │   └── types.h/.cpp        # Model type definitions
│   └── utils/                  # Utility functions
│       ├── config.h/.cpp       # Configuration management
│       └── logger.h/.cpp       # Logging utilities
├── include/                    # Public header files
├── external/                   # External dependencies (managed by CMake)
├── models/                     # Default model storage directory
│   ├── lora/                   # LoRA models
│   ├── checkpoints/            # Checkpoint models
│   ├── vae/                    # VAE models
│   ├── presets/                # Preset files
│   ├── prompts/                # Prompt templates
│   ├── neg_prompts/            # Negative prompt templates
│   ├── taesd/                  # TAESD models
│   ├── esrgan/                 # ESRGAN models
│   ├── controlnet/             # ControlNet models
│   ├── upscaler/               # Upscaler models
│   └── embeddings/             # Textual embeddings
├── tests/                      # Unit and integration tests
├── examples/                   # Usage examples
└── docs/                       # Additional documentation

Model Detection and Architecture Support

The stable-diffusion.cpp-rest server includes intelligent model detection that automatically identifies the model architecture and selects the appropriate loading method. This ensures compatibility with both traditional Stable Diffusion models and modern architectures.

Supported Architectures

Traditional Stable Diffusion Models

  • SD 1.5 - Stable Diffusion version 1.5 models
  • SD 2.1 - Stable Diffusion version 2.1 models
  • SDXL Base - Stable Diffusion XL base models
  • SDXL Refiner - Stable Diffusion XL refiner models

These models are loaded using the traditional ctxParams.model_path parameter.

Modern Architectures

  • Flux Schnell - Fast Flux variant
  • Flux Dev - Development version of Flux
  • Flux Chroma - Chroma-optimized Flux
  • SD 3 - Stable Diffusion 3 models
  • Qwen2VL - Qwen2 Vision-Language models

These models are loaded using the modern ctxParams.diffusion_model_path parameter.

Detection Process

  1. Model Analysis: When loading a model, the system analyzes the model file structure and metadata
  2. Architecture Identification: The model architecture is identified based on key signatures in the model
  3. Loading Method Selection: The appropriate loading method is automatically selected:
    • Traditional models → ctxParams.model_path
    • Modern architectures → ctxParams.diffusion_model_path
  4. Fallback Handling: Unknown architectures default to traditional loading for backward compatibility
  5. Error Recovery: If loading with the detected method fails, the system attempts fallback loading
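
The selection and fallback steps above can be sketched as follows (a minimal illustration; the names `TRADITIONAL`, `MODERN`, and `select_loading_field` are hypothetical, not the server's actual identifiers):

```python
# Hypothetical sketch of the loading-method selection described above.
TRADITIONAL = {"sd1.5", "sd2.1", "sdxl-base", "sdxl-refiner"}
MODERN = {"flux-schnell", "flux-dev", "flux-chroma", "sd3", "qwen2vl"}

def select_loading_field(architecture: str) -> str:
    """Return which ctxParams field should receive the model path."""
    if architecture in TRADITIONAL:
        return "model_path"
    if architecture in MODERN:
        return "diffusion_model_path"
    # Unknown architectures fall back to traditional loading
    # for backward compatibility (step 4 above).
    return "model_path"
```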

Benefits

  • Automatic Compatibility: No need to manually specify model type
  • Optimal Loading: Each architecture uses its optimal loading parameters
  • Future-Proof: Easy to add support for new architectures
  • Backward Compatible: Existing models continue to work without changes

Model Architecture Requirements

Note: The following tables are wide and may require horizontal scrolling to view all columns.

| Architecture | Extra VAE | Standalone High Noise | T5XXL | CLIP-Vision | CLIP-G | CLIP-L | Model Files | Example Commands |
|---|---|---|---|---|---|---|---|---|
| SD 1.x | No | No | No | No | No | No | sd-v1-4.ckpt, v1-5-pruned-emaonly.safetensors | `./bin/sd -m ../models/sd-v1-4.ckpt -p "a lovely cat"` |
| SD 2.x | No | No | No | No | No | No | (Similar to SD 1.x) | `./bin/sd -m ../models/sd2-model.ckpt -p "a lovely cat"` |
| SDXL | Yes | No | No | No | No | No | sd_xl_base_1.0.safetensors, sdxl_vae-fp16-fix.safetensors | `./bin/sd -m ../models/sd_xl_base_1.0.safetensors --vae ../models/sdxl_vae-fp16-fix.safetensors -H 1024 -W 1024 -p "a lovely cat" -v` |
| SD3 | No | No | Yes | No | No | No | sd3_medium_incl_clips_t5xxlfp16.safetensors | `./bin/sd -m ../models/sd3_medium_incl_clips_t5xxlfp16.safetensors -H 1024 -W 1024 -p 'a lovely cat holding a sign says "Stable Diffusion CPP"' --cfg-scale 4.5 --sampling-method euler -v --clip-on-cpu` |
| SD3.5 Large | No | No | Yes | No | Yes | Yes | sd3.5_large.safetensors, clip_l.safetensors, clip_g.safetensors, t5xxl_fp16.safetensors | `./bin/sd -m ../models/sd3.5_large.safetensors --clip_l ../models/clip_l.safetensors --clip_g ../models/clip_g.safetensors --t5xxl ../models/t5xxl_fp16.safetensors -H 1024 -W 1024 -p 'a lovely cat holding a sign says "Stable diffusion 3.5 Large"' --cfg-scale 4.5 --sampling-method euler -v --clip-on-cpu` |
| FLUX Models | Yes | No | Yes | No | No | Yes | flux1-dev-q3_k.gguf, flux1-dev-q8_0.gguf, flux1-schnell-q8_0.gguf, ae.sft, clip_l.safetensors, t5xxl_fp16.safetensors | `./bin/sd --diffusion-model ../models/flux1-dev-q8_0.gguf --vae ../models/ae.sft --clip_l ../models/clip_l.safetensors --t5xxl ../models/t5xxl_fp16.safetensors -p "a lovely cat holding a sign says 'flux.cpp'" --cfg-scale 1.0 --sampling-method euler -v --clip-on-cpu` |
| Kontext | Yes | No | Yes | No | No | Yes | flux1-kontext-dev-q8_0.gguf, ae.sft, clip_l.safetensors, t5xxl_fp16.safetensors | `./bin/sd -r ./flux1-dev-q8_0.png --diffusion-model ../models/flux1-kontext-dev-q8_0.gguf --vae ../models/ae.sft --clip_l ../models/clip_l.safetensors --t5xxl ../models/t5xxl_fp16.safetensors -p "change 'flux.cpp' to 'kontext.cpp'" --cfg-scale 1.0 --sampling-method euler -v --clip-on-cpu` |
| Chroma | Yes | No | Yes | No | No | No | chroma-unlocked-v40-q8_0.gguf, ae.sft, t5xxl_fp16.safetensors | `./bin/sd --diffusion-model ../models/chroma-unlocked-v40-q8_0.gguf --vae ../models/ae.sft --t5xxl ../models/t5xxl_fp16.safetensors -p "a lovely cat holding a sign says 'chroma.cpp'" --cfg-scale 4.0 --sampling-method euler -v --chroma-disable-dit-mask --clip-on-cpu` |
| Wan Models | Yes | No | Yes | Yes | No | No | wan2.1_t2v_1.3B_fp16.safetensors, wan_2.1_vae.safetensors, umt5-xxl-encoder-Q8_0.gguf | `./bin/sd -M vid_gen --diffusion-model ../models/wan2.1_t2v_1.3B_fp16.safetensors --vae ../models/wan_2.1_vae.safetensors --t5xxl ../models/umt5-xxl-encoder-Q8_0.gguf -p "a lovely cat" --cfg-scale 6.0 --sampling-method euler -v -W 832 -H 480 --diffusion-fa --video-frames 33 --flow-shift 3.0` |
| Wan2.2 T2V A14B | No | Yes | No | No | No | No | Wan2.2-T2V-A14B-LowNoise-Q8_0.gguf, Wan2.2-T2V-A14B-HighNoise-Q8_0.gguf | `./bin/sd -M vid_gen --diffusion-model ../models/Wan2.2-T2V-A14B-LowNoise-Q8_0.gguf -p "a lovely cat" --cfg-scale 6.0 --sampling-method euler -v -W 832 -H 480 --diffusion-fa --video-frames 33 --flow-shift 3.0` |
| Wan2.2 I2V A14B | No | Yes | No | No | No | No | Wan2.2-I2V-A14B-LowNoise-Q8_0.gguf, Wan2.2-I2V-A14B-HighNoise-Q8_0.gguf | `./bin/sd -M vid_gen --diffusion-model ../models/Wan2.2-I2V-A14B-LowNoise-Q8_0.gguf -r input.png -p "a lovely cat" --cfg-scale 6.0 --sampling-method euler -v -W 832 -H 480 --diffusion-fa --video-frames 33 --flow-shift 3.0` |
| Qwen Image Models | Yes | No | No | No | No | No | qwen-image-Q8_0.gguf, qwen_image_vae.safetensors, Qwen2.5-VL-7B-Instruct-Q8_0.gguf | `./bin/sd --diffusion-model ../models/qwen-image-Q8_0.gguf --vae ../models/qwen_image_vae.safetensors --qwen2vl ../models/Qwen2.5-VL-7B-Instruct-Q8_0.gguf -p '一个穿着"QWEN"标志的T恤的中国美女正拿着黑色的马克笔面相镜头微笑。' --cfg-scale 2.5 --sampling-method euler -v --offload-to-cpu -H 1024 -W 1024 --diffusion-fa --flow-shift 3` |
| Qwen Image Edit | Yes | No | No | No | No | No | Qwen_Image_Edit-Q8_0.gguf, qwen_image_vae.safetensors, qwen_2.5_vl_7b.safetensors | `./bin/sd --diffusion-model ../models/Qwen_Image_Edit-Q8_0.gguf --vae ../models/qwen_image_vae.safetensors --qwen2vl ../models/qwen_2.5_vl_7b.safetensors --cfg-scale 2.5 --sampling-method euler -v --offload-to-cpu --diffusion-fa --flow-shift 3 -r ../assets/flux/flux1-dev-q8_0.png -p "change 'flux.cpp' to 'edit.cpp'"` |
| PhotoMaker | No | No | No | No | No | No | sdxlUnstableDiffusers_v11.safetensors, sdxl_vae.safetensors, photomaker-v1.safetensors | `./bin/sd -m ../models/sdxlUnstableDiffusers_v11.safetensors --vae ../models/sdxl_vae.safetensors --photo-maker ../models/photomaker-v1.safetensors --pm-id-images-dir ../assets/photomaker_examples/scarletthead_woman -p "a girl img, retro futurism" --cfg-scale 5.0 --sampling-method euler -H 1024 -W 1024 --pm-style-strength 10 --vae-on-cpu --steps 50` |
| LCM | No | No | No | No | No | No | lcm-lora-sdv1-5 | `./bin/sd -m ../models/v1-5-pruned-emaonly.safetensors -p "a lovely cat<lora:lcm-lora-sdv1-5:1>" --steps 4 --lora-model-dir ../models -v --cfg-scale 1` |
| SSD1B | No | No | No | No | No | No | (Various SSD-1B models) | `./bin/sd -m ../models/ssd-1b.safetensors -p "a lovely cat"` |
| Tiny SD | No | No | No | No | No | No | (Various Tiny SD models) | `./bin/sd -m ../models/tiny-sd.safetensors -p "a lovely cat"` |

Context Creation Methods per Architecture

| Architecture | Context Creation Method | Special Parameters | Model Files | Example Commands |
|---|---|---|---|---|
| SD 1.x, SD 2.x, SDXL | Standard prompt-based generation | --cfg-scale, --sampling-method, --steps | sd-v1-4.ckpt, v1-5-pruned-emaonly.safetensors, sd_xl_base_1.0.safetensors | `./bin/sd -m ../models/sd-v1-4.ckpt -p "a lovely cat"` |
| SD3 | Multiple text encoders | --clip-on-cpu recommended | sd3_medium_incl_clips_t5xxlfp16.safetensors | `./bin/sd -m ../models/sd3_medium_incl_clips_t5xxlfp16.safetensors -H 1024 -W 1024 -p 'a lovely cat holding a sign says "Stable Diffusion CPP"' --cfg-scale 4.5 --sampling-method euler -v --clip-on-cpu` |
| SD3.5 Large | Multiple text encoders | --clip-on-cpu recommended | sd3.5_large.safetensors, clip_l.safetensors, clip_g.safetensors, t5xxl_fp16.safetensors | `./bin/sd -m ../models/sd3.5_large.safetensors --clip_l ../models/clip_l.safetensors --clip_g ../models/clip_g.safetensors --t5xxl ../models/t5xxl_fp16.safetensors -H 1024 -W 1024 -p 'a lovely cat holding a sign says "Stable diffusion 3.5 Large"' --cfg-scale 4.5 --sampling-method euler -v --clip-on-cpu` |
| FLUX Models | Text-to-image generation | --cfg-scale 1.0 recommended, --clip-on-cpu for memory efficiency | flux1-dev-q3_k.gguf, flux1-dev-q8_0.gguf, flux1-schnell-q8_0.gguf, ae.sft, clip_l.safetensors, t5xxl_fp16.safetensors | `./bin/sd --diffusion-model ../models/flux1-dev-q8_0.gguf --vae ../models/ae.sft --clip_l ../models/clip_l.safetensors --t5xxl ../models/t5xxl_fp16.safetensors -p "a lovely cat holding a sign says 'flux.cpp'" --cfg-scale 1.0 --sampling-method euler -v --clip-on-cpu` |
| Kontext | Image-to-image transformation | -r for reference image, --cfg-scale 1.0 recommended | flux1-kontext-dev-q8_0.gguf, ae.sft, clip_l.safetensors, t5xxl_fp16.safetensors | `./bin/sd -r ./flux1-dev-q8_0.png --diffusion-model ../models/flux1-kontext-dev-q8_0.gguf --vae ../models/ae.sft --clip_l ../models/clip_l.safetensors --t5xxl ../models/t5xxl_fp16.safetensors -p "change 'flux.cpp' to 'kontext.cpp'" --cfg-scale 1.0 --sampling-method euler -v --clip-on-cpu` |
| Chroma | Text-to-image generation | --cfg-scale 1.0 recommended, --clip-on-cpu for memory efficiency | chroma-unlocked-v40-q8_0.gguf, ae.sft, t5xxl_fp16.safetensors | `./bin/sd --diffusion-model ../models/chroma-unlocked-v40-q8_0.gguf --vae ../models/ae.sft --t5xxl ../models/t5xxl_fp16.safetensors -p "a lovely cat holding a sign says 'chroma.cpp'" --cfg-scale 4.0 --sampling-method euler -v --chroma-disable-dit-mask --clip-on-cpu` |
| Chroma1-Radiance | Text-to-image generation | --cfg-scale 4.0 recommended | Chroma1-Radiance-v0.4-Q8_0.gguf, t5xxl_fp16.safetensors | `./bin/sd --diffusion-model ../models/Chroma1-Radiance-v0.4-Q8_0.gguf --t5xxl ../models/clip/t5xxl_fp16.safetensors -p "a lovely cat holding a sign says 'chroma radiance cpp'" --cfg-scale 4.0 --sampling-method euler -v` |
| Wan Models | Video generation with text prompts | -M vid_gen, --video-frames, --flow-shift, --diffusion-fa | wan2.1_t2v_1.3B_fp16.safetensors, wan_2.1_vae.safetensors, umt5-xxl-encoder-Q8_0.gguf | `./bin/sd -M vid_gen --diffusion-model ../models/wan2.1_t2v_1.3B_fp16.safetensors --vae ../models/wan_2.1_vae.safetensors --t5xxl ../models/umt5-xxl-encoder-Q8_0.gguf -p "a lovely cat" --cfg-scale 6.0 --sampling-method euler -v -W 832 -H 480 --diffusion-fa --video-frames 33 --flow-shift 3.0` |
| Wan2.1 I2V Models | Image-to-video generation | Requires clip_vision_h.safetensors | wan2.1-i2v-14b-480p-Q8_0.gguf, wan_2.1_vae.safetensors, clip_vision_h.safetensors | `./bin/sd -M vid_gen --diffusion-model ../models/wan2.1-i2v-14b-480p-Q8_0.gguf --vae ../models/wan_2.1_vae.safetensors -r input.png -p "a lovely cat" --cfg-scale 6.0 --sampling-method euler -v -W 832 -H 480 --diffusion-fa --video-frames 33 --flow-shift 3.0` |
| Wan2.1 FLF2V Models | Flow-to-video generation | Requires clip_vision_h.safetensors | wan2.1-flf2v-14b-720p-Q8_0.gguf, wan_2.1_vae.safetensors, clip_vision_h.safetensors | `./bin/sd -M vid_gen --diffusion-model ../models/wan2.1-flf2v-14b-720p-Q8_0.gguf --vae ../models/wan_2.1_vae.safetensors -r flow.png -p "a lovely cat" --cfg-scale 6.0 --sampling-method euler -v -W 832 -H 480 --diffusion-fa --video-frames 33 --flow-shift 3.0` |
| Wan2.2 T2V A14B | Text-to-video generation | Uses dual diffusion models | Wan2.2-T2V-A14B-LowNoise-Q8_0.gguf, Wan2.2-T2V-A14B-HighNoise-Q8_0.gguf | `./bin/sd -M vid_gen --diffusion-model ../models/Wan2.2-T2V-A14B-LowNoise-Q8_0.gguf -p "a lovely cat" --cfg-scale 6.0 --sampling-method euler -v -W 832 -H 480 --diffusion-fa --video-frames 33 --flow-shift 3.0` |
| Wan2.2 I2V A14B | Image-to-video generation | Uses dual diffusion models | Wan2.2-I2V-A14B-LowNoise-Q8_0.gguf, Wan2.2-I2V-A14B-HighNoise-Q8_0.gguf | `./bin/sd -M vid_gen --diffusion-model ../models/Wan2.2-I2V-A14B-LowNoise-Q8_0.gguf -r input.png -p "a lovely cat" --cfg-scale 6.0 --sampling-method euler -v -W 832 -H 480 --diffusion-fa --video-frames 33 --flow-shift 3.0` |
| Qwen Image Models | Text-to-image generation with Chinese language support | --qwen2vl for the language model, --diffusion-fa, --flow-shift | qwen-image-Q8_0.gguf, qwen_image_vae.safetensors, Qwen2.5-VL-7B-Instruct-Q8_0.gguf | `./bin/sd --diffusion-model ../models/qwen-image-Q8_0.gguf --vae ../models/qwen_image_vae.safetensors --qwen2vl ../models/Qwen2.5-VL-7B-Instruct-Q8_0.gguf -p '一个穿着"QWEN"标志的T恤的中国美女正拿着黑色的马克笔面相镜头微笑。' --cfg-scale 2.5 --sampling-method euler -v --offload-to-cpu -H 1024 -W 1024 --diffusion-fa --flow-shift 3` |
| Qwen Image Edit | Image editing with reference image | -r for reference image, --qwen2vl_vision for vision model | Qwen_Image_Edit-Q8_0.gguf, qwen_image_vae.safetensors, qwen_2.5_vl_7b.safetensors | `./bin/sd --diffusion-model ../models/Qwen_Image_Edit-Q8_0.gguf --vae ../models/qwen_image_vae.safetensors --qwen2vl ../models/qwen_2.5_vl_7b.safetensors --cfg-scale 2.5 --sampling-method euler -v --offload-to-cpu --diffusion-fa --flow-shift 3 -r ../assets/flux/flux1-dev-q8_0.png -p "change 'flux.cpp' to 'edit.cpp'"` |
| PhotoMaker | Personalized image generation with ID images | --photo-maker, --pm-id-images-dir, --pm-style-strength | sdxlUnstableDiffusers_v11.safetensors, sdxl_vae.safetensors, photomaker-v1.safetensors | `./bin/sd -m ../models/sdxlUnstableDiffusers_v11.safetensors --vae ../models/sdxl_vae.safetensors --photo-maker ../models/photomaker-v1.safetensors --pm-id-images-dir ../assets/photomaker_examples/scarletthead_woman -p "a girl img, retro futurism" --cfg-scale 5.0 --sampling-method euler -H 1024 -W 1024 --pm-style-strength 10 --vae-on-cpu --steps 50` |
| LCM | Fast generation with LoRA | --cfg-scale 1.0, --steps 2-8, --sampling-method lcm/euler_a | lcm-lora-sdv1-5 | `./bin/sd -m ../models/v1-5-pruned-emaonly.safetensors -p "a lovely cat<lora:lcm-lora-sdv1-5:1>" --steps 4 --lora-model-dir ../models -v --cfg-scale 1` |
| SSD1B | Standard prompt-based generation | Standard SD parameters | (Various SSD-1B models) | `./bin/sd -m ../models/ssd-1b.safetensors -p "a lovely cat"` |
| Tiny SD | Standard prompt-based generation | Standard SD parameters | (Various Tiny SD models) | `./bin/sd -m ../models/tiny-sd.safetensors -p "a lovely cat"` |

Model Quantization and Conversion

Quantization Levels Supported

The stable-diffusion.cpp library supports various quantization levels to balance model size and performance:

| Quantization Level | Description | Model Size Reduction | Quality Impact |
|---|---|---|---|
| f32 | 32-bit floating-point | None (original) | No quality loss |
| f16 | 16-bit floating-point | ~50% | Minimal quality loss |
| q8_0 | 8-bit integer quantization | ~75% | Slight quality loss |
| q5_0, q5_1 | 5-bit integer quantization | ~80% | Moderate quality loss |
| q4_0, q4_1 | 4-bit integer quantization | ~85% | Noticeable quality loss |
| q3_k | 3-bit K-quantization | ~87% | Significant quality loss |
| q4_k | 4-bit K-quantization | ~85% | Good balance of size/quality |
| q2_k | 2-bit K-quantization | ~90% | Major quality loss |
| Q4_K_S | 4-bit K-quantization (small) | ~85% | Optimized for smaller models |
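
As an illustration, the reduction figures above can be used to estimate a quantized file's on-disk size (a rough sketch only; real sizes vary per model and quantization implementation):

```python
# Rough size estimate from the approximate reduction percentages above.
REDUCTION = {  # fraction of the original size removed (approximate)
    "f32": 0.00, "f16": 0.50, "q8_0": 0.75, "q5_0": 0.80,
    "q4_0": 0.85, "q3_k": 0.87, "q4_k": 0.85, "q2_k": 0.90,
}

def estimated_size_gb(original_gb: float, qtype: str) -> float:
    """Estimated size after quantization (illustrative only)."""
    return round(original_gb * (1.0 - REDUCTION[qtype]), 2)

# A 4 GB f32 checkpoint quantized to q8_0 lands near 1 GB.
```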

Model Conversion Commands

To convert models from their original format to quantized GGUF format, use the following commands:

Stable Diffusion Models

# Convert SD 1.5 model to 8-bit quantization
./bin/sd -M convert -m ../models/v1-5-pruned-emaonly.safetensors -o ../models/v1-5-pruned-emaonly.q8_0.gguf -v --type q8_0

# Convert SDXL model to 4-bit quantization
./bin/sd -M convert -m ../models/sd_xl_base_1.0.safetensors -o ../models/sd_xl_base_1.0.q4_0.gguf -v --type q4_0

Flux Models

# Convert Flux Dev model to 8-bit quantization
./bin/sd -M convert -m ../models/flux1-dev.sft -o ../models/flux1-dev-q8_0.gguf -v --type q8_0

# Convert Flux Schnell model to 3-bit K-quantization
./bin/sd -M convert -m ../models/flux1-schnell.sft -o ../models/flux1-schnell-q3_k.gguf -v --type q3_k

Chroma Models

# Convert Chroma model to 8-bit quantization
./bin/sd -M convert -m ../models/chroma-unlocked-v40.safetensors -o ../models/chroma-unlocked-v40-q8_0.gguf -v --type q8_0

Kontext Models

# Convert Kontext model to 8-bit quantization
./bin/sd -M convert -m ../models/flux1-kontext-dev.safetensors -o ../models/flux1-kontext-dev-q8_0.gguf -v --type q8_0

LoRA Models

The project supports LoRA (Low-Rank Adaptation) models for fine-tuning and style transfer:

| LoRA Model | Compatible Base Models | Example Usage |
|---|---|---|
| marblesh.safetensors | SD 1.5, SD 2.1 | `./bin/sd -m ../models/v1-5-pruned-emaonly.safetensors -p "a lovely cat<lora:marblesh:1>" --lora-model-dir ../models` |
| lcm-lora-sdv1-5 | SD 1.5 | `./bin/sd -m ../models/v1-5-pruned-emaonly.safetensors -p "a lovely cat<lora:lcm-lora-sdv1-5:1>" --steps 4 --lora-model-dir ../models -v --cfg-scale 1` |
| realism_lora_comfy_converted | FLUX Models | `./bin/sd --diffusion-model ../models/flux1-dev-q8_0.gguf --vae ../models/ae.sft --clip_l ../models/clip_l.safetensors --t5xxl ../models/t5xxl_fp16.safetensors -p "a lovely cat holding a sign says 'flux.cpp'<lora:realism_lora_comfy_converted:1>" --cfg-scale 1.0 --sampling-method euler -v --lora-model-dir ../models --clip-on-cpu` |
| wan2.2_t2v_lightx2v_4steps_lora_v1.1_low_noise | Wan2.2 T2V Models | `./bin/sd -M vid_gen --diffusion-model ../models/Wan2.2-T2V-A14B-LowNoise-Q8_0.gguf --lora-model-dir ../models -p "a lovely cat<lora:wan2.2_t2v_lightx2v_4steps_lora_v1.1_low_noise:1>" --steps 4` |
| wan2.2_t2v_lightx2v_4steps_lora_v1.1_high_noise | Wan2.2 T2V Models | `./bin/sd -M vid_gen --diffusion-model ../models/Wan2.2-T2V-A14B-HighNoise-Q8_0.gguf --lora-model-dir ../models -p "a lovely cat<lora:wan2.2_t2v_lightx2v_4steps_lora_v1.1_high_noise:1>" --steps 4` |

Additional Model Types

Upscaling (ESRGAN)

# Use ESRGAN for upscaling generated images
./bin/sd -m ../models/v1-5-pruned-emaonly.safetensors -p "a lovely cat" --upscale-model ../models/RealESRGAN_x4plus_anime_6B.pth

Fast Decoding (TAESD)

# Use TAESD for faster VAE decoding
./bin/sd -m ../models/v1-5-pruned-emaonly.safetensors -p "a lovely cat" --taesd ../models/diffusion_pytorch_model.safetensors

Model Types and File Extensions

The project supports various model types, each with specific file extensions:

| Model Type | Enum Value | Description | Supported Extensions |
|---|---|---|---|
| LORA | 1 | Low-Rank Adaptation models | .safetensors, .pt, .ckpt |
| CHECKPOINT | 2 | Main model checkpoints | .safetensors, .pt, .ckpt |
| VAE | 4 | Variational Autoencoder models | .safetensors, .pt, .ckpt |
| PRESETS | 8 | Generation preset files | .json, .yaml, .yml |
| PROMPTS | 16 | Prompt template files | .txt, .json |
| NEG_PROMPTS | 32 | Negative prompt templates | .txt, .json |
| TAESD | 64 | Tiny AutoEncoder for SD | .safetensors, .pt, .ckpt |
| ESRGAN | 128 | Super-resolution models | .pth, .pt |
| CONTROLNET | 256 | ControlNet models | .safetensors, .pt, .ckpt |
| UPSCALER | 512 | Image upscaler models | .pth, .pt |
| EMBEDDING | 1024 | Textual embeddings | .safetensors, .pt, .ckpt |

Model Type Enum Definition

enum ModelType {
    LORA = 1,
    CHECKPOINT = 2,
    VAE = 4,
    PRESETS = 8,
    PROMPTS = 16,
    NEG_PROMPTS = 32,
    TAESD = 64,
    ESRGAN = 128,
    CONTROLNET = 256,
    UPSCALER = 512,
    EMBEDDING = 1024
};
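
Because the enum values are powers of two, callers can combine several types into a single bitmask filter. A Python equivalent using `enum.IntFlag` illustrates the idea (the filtering use case is an illustration, not the server's actual API):

```python
from enum import IntFlag

class ModelType(IntFlag):
    LORA = 1
    CHECKPOINT = 2
    VAE = 4
    PRESETS = 8
    PROMPTS = 16
    NEG_PROMPTS = 32
    TAESD = 64
    ESRGAN = 128
    CONTROLNET = 256
    UPSCALER = 512
    EMBEDDING = 1024

# Request checkpoints and VAEs with one combined mask:
wanted = ModelType.CHECKPOINT | ModelType.VAE
assert ModelType.VAE & wanted          # matches the filter
assert not (ModelType.LORA & wanted)   # filtered out
```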

API Endpoints

Planned Endpoints

Image Generation

  • POST /api/v1/generate/text2img - Generate image from text prompt
  • POST /api/v1/generate/img2img - Transform image with text prompt
  • POST /api/v1/generate/inpainting - Inpaint image with mask
  • GET /api/v1/generate/{job_id} - Get generation status and result
  • DELETE /api/v1/generate/{job_id} - Cancel a generation job

Model Management

  • GET /api/v1/models - List available models
  • GET /api/v1/models/{type} - List models of specific type
  • POST /api/v1/models/load - Load a model
  • POST /api/v1/models/unload - Unload a model
  • GET /api/v1/models/{model_id} - Get model information

System Information

  • GET /api/v1/status - Get server status and statistics
  • GET /api/v1/system - Get system information (GPU, memory, etc.)

Example Request/Response

Generate Image Request

POST /api/v1/generate
{
    "prompt": "a beautiful landscape",
    "negative_prompt": "blurry, low quality",
    "model": "sd-v1-5",
    "width": 512,
    "height": 512,
    "steps": 20,
    "cfg_scale": 7.5,
    "seed": -1,
    "batch_size": 1
}

Generate Image Response

{
    "job_id": "uuid-string",
    "status": "completed",
    "images": [
        {
            "data": "base64-encoded-image-data",
            "seed": 12345,
            "parameters": {
                "prompt": "a beautiful landscape",
                "negative_prompt": "blurry, low quality",
                "model": "sd-v1-5",
                "width": 512,
                "height": 512,
                "steps": 20,
                "cfg_scale": 7.5,
                "seed": 12345,
                "batch_size": 1
            }
        }
    ],
    "generation_time": 3.2
}

Authentication

The server supports multiple authentication methods to secure API access:

Supported Authentication Methods

  1. No Authentication (Default)

    • Open access to all endpoints
    • Suitable for development or trusted networks
  2. JWT Token Authentication

    • JSON Web Tokens for stateless authentication
    • Configurable token expiration
    • Secure for production deployments
  3. API Key Authentication

    • Static API keys for service-to-service communication
    • Simple integration for external applications
  4. PAM Authentication

    • Integration with system authentication via PAM (Pluggable Authentication Modules)
    • Supports LDAP, Kerberos, and other PAM backends
    • Leverages existing system user accounts
    • See PAM_AUTHENTICATION.md for detailed setup

Authentication Configuration

Authentication can be configured via command-line arguments or configuration files:

# Enable PAM authentication
./stable-diffusion-rest-server --auth pam --models-dir /path/to/models --checkpoints checkpoints

# Enable JWT authentication
./stable-diffusion-rest-server --auth jwt --models-dir /path/to/models --checkpoints checkpoints

# Enable API key authentication
./stable-diffusion-rest-server --auth api-key --models-dir /path/to/models --checkpoints checkpoints

# Enable Unix authentication
./stable-diffusion-rest-server --auth unix --models-dir /path/to/models --checkpoints checkpoints

# No authentication (default)
./stable-diffusion-rest-server --auth none --models-dir /path/to/models --checkpoints checkpoints

Authentication Methods

  • none - No authentication required (default)
  • jwt - JWT token authentication
  • api-key - API key authentication
  • unix - Unix system authentication
  • pam - PAM authentication
  • optional - Authentication optional (guest access allowed)

Deprecated Options

The following options are deprecated and will be removed in a future version:

  • --enable-unix-auth - Use --auth unix instead
  • --enable-pam-auth - Use --auth pam instead

Authentication Endpoints

  • POST /api/v1/auth/login - Authenticate with username/password (PAM/JWT)
  • POST /api/v1/auth/refresh - Refresh JWT token
  • GET /api/v1/auth/profile - Get current user profile
  • POST /api/v1/auth/logout - Logout/invalidate token
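
A minimal client sketch for these endpoints (the `token` response field name and the credentials are assumptions; check the server's actual response shape):

```python
# Sketch of authenticating against the endpoints listed above.
import requests

BASE = "http://localhost:8080"

def login(username: str, password: str) -> str:
    """POST /api/v1/auth/login and return the bearer token.
    The 'token' field name is an assumption."""
    resp = requests.post(f"{BASE}/api/v1/auth/login",
                         json={"username": username, "password": password})
    resp.raise_for_status()
    return resp.json()["token"]

def auth_headers(token: str) -> dict:
    """Build the Authorization header for subsequent API calls."""
    return {"Authorization": f"Bearer {token}"}

# Example (requires a running server):
#   token = login("alice", "secret")
#   requests.get(f"{BASE}/api/v1/auth/profile", headers=auth_headers(token))
```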

For detailed authentication setup instructions, see PAM_AUTHENTICATION.md.

Build Instructions

Prerequisites

  1. CMake 3.15 or later
  2. C++17 compatible compiler
  3. Git for cloning dependencies
  4. CUDA Toolkit (optional but recommended)

Build Steps

  1. Clone the repository:

    git clone https://github.com/your-username/stable-diffusion.cpp-rest.git
    cd stable-diffusion.cpp-rest
    
  2. Create a build directory:

    mkdir build
    cd build
    
  3. Configure with CMake:

    cmake ..
    
  4. Build the project:

    cmake --build . --parallel
    
  5. (Optional) Install the binary:

    cmake --install .
    

CMake Configuration

The project uses CMake's ExternalProject module to automatically download and build the stable-diffusion.cpp library:

include(ExternalProject)

ExternalProject_Add(
    stable-diffusion.cpp
    GIT_REPOSITORY https://github.com/leejet/stable-diffusion.cpp.git
    GIT_TAG master-334-d05e46c
    SOURCE_DIR "${CMAKE_CURRENT_BINARY_DIR}/stable-diffusion.cpp-src"
    BINARY_DIR "${CMAKE_CURRENT_BINARY_DIR}/stable-diffusion.cpp-build"
    CMAKE_ARGS -DCMAKE_INSTALL_PREFIX=${CMAKE_CURRENT_BINARY_DIR}/stable-diffusion.cpp-install
    INSTALL_COMMAND ""
)

PAM Authentication Build Options

PAM authentication support is enabled by default when PAM libraries are available. You can control this with CMake options:

# Build with PAM support (default when available)
cmake -DENABLE_PAM_AUTH=ON ..

# Build without PAM support
cmake -DENABLE_PAM_AUTH=OFF ..

# Check if PAM support will be built
cmake -LA | grep ENABLE_PAM_AUTH

Note: PAM authentication requires the PAM development libraries:

# Ubuntu/Debian
sudo apt-get install libpam0g-dev

# CentOS/RHEL/Fedora
sudo yum install pam-devel

Usage Examples

Starting the Server

# Basic usage
./stable-diffusion-rest-server

# With custom configuration
./stable-diffusion-rest-server --config config.json

# With custom model directory
./stable-diffusion-rest-server --models-dir /path/to/models

Client Examples

Python with requests

import requests
import base64
import json

# Generate an image
response = requests.post('http://localhost:8080/api/v1/generate', json={
    'prompt': 'a beautiful landscape',
    'width': 512,
    'height': 512,
    'steps': 20
})

result = response.json()
if result['status'] == 'completed':
    # Decode and save the first image
    image_data = base64.b64decode(result['images'][0]['data'])
    with open('generated_image.png', 'wb') as f:
        f.write(image_data)
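
If the server returns a queued job instead of a completed result, the status endpoint (GET /api/v1/generate/{job_id}) can be polled. A small sketch; the terminal status names "failed" and "cancelled" are assumptions alongside the documented "completed":

```python
import time

def wait_for_job(fetch_status, poll_seconds: float = 1.0,
                 timeout: float = 300.0) -> dict:
    """Poll a job-status callable until the job reaches a terminal state.

    fetch_status: callable returning the status JSON as a dict, e.g.
      lambda: requests.get(f"{BASE}/api/v1/generate/{job_id}").json()
    """
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        result = fetch_status()
        if result.get("status") in ("completed", "failed", "cancelled"):
            return result
        time.sleep(poll_seconds)
    raise TimeoutError("generation job did not finish in time")
```

Passing the fetcher as a callable keeps the polling logic testable without a running server.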

JavaScript with fetch

async function generateImage() {
    const response = await fetch('http://localhost:8080/api/v1/generate', {
        method: 'POST',
        headers: {
            'Content-Type': 'application/json'
        },
        body: JSON.stringify({
            prompt: 'a beautiful landscape',
            width: 512,
            height: 512,
            steps: 20
        })
    });

    const result = await response.json();
    if (result.status === 'completed') {
        // Create an image element with the generated image
        const img = document.createElement('img');
        img.src = `data:image/png;base64,${result.images[0].data}`;
        document.body.appendChild(img);
    }
}

cURL

# Generate an image
curl -X POST http://localhost:8080/api/v1/generate \
  -H "Content-Type: application/json" \
  -d '{
    "prompt": "a beautiful landscape",
    "width": 512,
    "height": 512,
    "steps": 20
  }'

# Check job status
curl http://localhost:8080/api/v1/generate/{job_id}

# List available models
curl http://localhost:8080/api/v1/models

Development Roadmap

Phase 1: Core Infrastructure

  • Basic HTTP server implementation
  • Generation queue system
  • Model manager with basic loading
  • CMake configuration with external dependencies
  • Basic API endpoints for image generation

Phase 2: Feature Enhancement

  • Complete parameter support from examples/cli/main.cpp
  • Model type detection and organization
  • Job status tracking and cancellation
  • Error handling and validation
  • Configuration management

Phase 3: Advanced Features

  • Batch processing support
  • Model hot-swapping
  • Performance optimization
  • Comprehensive logging
  • API authentication and security

Phase 4: Production Readiness

  • Comprehensive testing suite
  • Documentation and examples
  • Docker containerization
  • Performance benchmarking
  • Deployment guides

Future Considerations

  • WebSocket support for real-time updates
  • Plugin system for custom processors
  • Distributed processing support
  • Web UI for model management
  • Integration with popular AI frameworks

Contributing

Contributions are welcome! Please feel free to submit a Pull Request. For major changes, please open an issue first to discuss what you would like to change.

License

This project is licensed under the MIT License - see the LICENSE file for details.

Acknowledgments

  • stable-diffusion.cpp for the underlying C++ implementation
  • The Stable Diffusion community for models and examples
  • Contributors and users of this project