stable-diffusion.cpp-rest

A production-ready C++ REST API server that wraps the stable-diffusion.cpp library, providing comprehensive HTTP endpoints for image generation with Stable Diffusion models. It includes a modern web interface built with Next.js and a robust authentication system.

✨ Features

Core Functionality

REST API - Complete HTTP API for Stable Diffusion image generation
Web UI - Modern, responsive web interface (automatically built with the server)
Queue System - Efficient job queue with status tracking and cancellation
Model Management - Intelligent model detection across multiple architectures
CUDA Support - Optional GPU acceleration for faster generation
Authentication - Multi-method authentication with JWT, PAM, Unix, and API keys

Generation Capabilities

Text-to-Image - Generate images from text prompts
Image-to-Image - Transform existing images with text guidance
ControlNet - Precise control over output composition
Inpainting - Edit specific regions of images
Upscaling - Enhance image resolution with ESRGAN models
Batch Processing - Generate multiple images in parallel

Advanced Features

Real-time Progress Tracking - WebSocket-like progress updates
Image Processing - Built-in resize, crop, and format conversion
Thumbnail Generation - Automatic thumbnail creation for galleries
Model Conversion - Convert models between quantization formats
System Monitoring - Comprehensive status and performance metrics
Flexible Authentication - Optional or required auth with multiple methods

Project Overview
Web UI Features
Architecture
Model Detection and Architecture Support
Model Architecture Requirements
Context Creation Methods per Architecture
Model Quantization and Conversion
Technical Requirements
Model Types and File Extensions
API Endpoints
Authentication System
Build Instructions
Usage Examples
Development Status

Project Overview

The stable-diffusion.cpp-rest project aims to create a high-performance REST API server that wraps the functionality of the stable-diffusion.cpp library. This enables developers to integrate Stable Diffusion image generation capabilities into their applications through standard HTTP requests, rather than directly using the C++ library.

Objectives

Provide a simple, RESTful interface for Stable Diffusion image generation
Support all parameters available in examples/cli/main.cpp
Implement efficient resource management with a generation queue system
Support multiple model types with automatic detection and loading
Ensure thread-safe operation with separate HTTP server and generation threads

Web UI Features

A modern, responsive web interface is included and built automatically with the server. It uses Next.js 16, React 19, and Tailwind CSS 4.

✨ Features

Multiple Generation Types
- Text-to-Image with comprehensive parameter controls
- Image-to-Image with strength adjustment
- ControlNet with multiple control modes
- Inpainting with interactive mask editor
- Upscaling with various ESRGAN models

📊 Real-time Monitoring

Job Queue Management - Real-time queue status and progress tracking
Generation Progress - Live progress updates with time estimates
System Status - Server performance and resource monitoring
Model Management - Load/unload models with dependency checking

🎨 User Experience

Responsive Design - Works on desktop, tablet, and mobile
Light/Dark Themes - Automatic theme detection and manual toggle
Interactive Controls - Intuitive parameter adjustments
Image Gallery - Thumbnail generation and batch download
Authentication - Secure login with multiple auth methods

⚡ Advanced Functionality

Image Processing - Built-in resize, crop, and format conversion
Batch Operations - Generate multiple images simultaneously
Model Compatibility - Smart model detection and requirement checking
URL Downloads - Import images from URLs for img2img and inpainting
CORS Support - Seamless integration with web applications

🔧 Technical Features

Static Export - Optimized for production deployment
Caching - Intelligent asset caching for performance
Error Handling - Comprehensive error reporting and recovery
WebSocket-like Updates - Real-time progress without WebSockets
Image Download - Direct file downloads with proper headers

🎯 Quick Start

# Build (automatically builds web UI)
mkdir build && cd build
cmake ..
cmake --build .

# Run server with web UI
./src/stable-diffusion-rest-server --models-dir /path/to/models --ui-dir ../build/webui

# Access web UI (open is the macOS launcher; on Linux use xdg-open or a browser)
open http://localhost:8080/ui/

📁 Web UI Structure

webui/
├── app/                     # Next.js app directory
│   ├── components/          # React components
│   ├── lib/                # Utilities and API clients
│   └── globals.css         # Global styles
├── public/                 # Static assets
├── package.json           # Dependencies
├── next.config.ts         # Next.js configuration
└── tsconfig.json          # TypeScript configuration

🚀 Built-in Components

Generation Forms - Specialized forms for each generation type
Model Browser - Interactive model selection with metadata
Progress Indicators - Visual progress bars and status updates
Image Preview - Thumbnail generation and full-size viewing
Settings Panel - Configuration and preference management

🎨 Styling System

Tailwind CSS 4 - Utility-first CSS framework
Radix UI Components - Accessible, unstyled components
Lucide React Icons - Beautiful icon system
Custom CSS Variables - Theme-aware design tokens
Responsive Grid - Mobile-first responsive layout

Architecture

The project is designed with a modular architecture consisting of three main components:

HTTP Server

Handles incoming HTTP requests
Parses request parameters and validates input
Returns generated images or error responses
Operates independently of the generation process

Generation Queue

Manages image generation requests
Processes jobs sequentially (one at a time)
Maintains thread-safe operations
Provides job status tracking
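
The queue behaviour listed above (many HTTP threads submitting, one worker processing jobs one at a time) can be sketched with the standard threading primitives named under Technical Requirements. This is only an illustration: `Job`, `JobQueue`, and their members are hypothetical names, not the project's actual classes in `src/generation/`.

```cpp
#include <condition_variable>
#include <cstddef>
#include <deque>
#include <mutex>
#include <optional>
#include <string>
#include <utility>

// Hypothetical job record; the project's real job type may carry more fields.
struct Job {
    std::string id;
    std::string prompt;
};

// Minimal thread-safe FIFO: HTTP handler threads push, one worker thread pops.
class JobQueue {
public:
    // Enqueue a job and wake the worker if it is waiting.
    void push(Job job) {
        {
            std::lock_guard<std::mutex> lock(mutex_);
            jobs_.push_back(std::move(job));
        }
        ready_.notify_one();
    }

    // Block until a job is available or the queue is closed.
    // Returns std::nullopt once the queue is closed and empty.
    std::optional<Job> pop() {
        std::unique_lock<std::mutex> lock(mutex_);
        ready_.wait(lock, [this] { return closed_ || !jobs_.empty(); });
        if (jobs_.empty()) return std::nullopt;
        Job job = std::move(jobs_.front());
        jobs_.pop_front();
        return job;
    }

    // Remove a still-pending job by id; returns false if it is not queued.
    bool cancel(const std::string& id) {
        std::lock_guard<std::mutex> lock(mutex_);
        for (auto it = jobs_.begin(); it != jobs_.end(); ++it) {
            if (it->id == id) {
                jobs_.erase(it);
                return true;
            }
        }
        return false;
    }

    // Mark the queue closed and wake every waiter so the worker can exit.
    void close() {
        {
            std::lock_guard<std::mutex> lock(mutex_);
            closed_ = true;
        }
        ready_.notify_all();
    }

private:
    std::mutex mutex_;
    std::condition_variable ready_;
    std::deque<Job> jobs_;
    bool closed_ = false;
};
```

A single worker thread would loop on `pop()` and run one generation per returned job, which gives the sequential processing described above without blocking the HTTP threads. Cancelling a job that is already running would need a separate flag checked by the worker, which this sketch leaves out.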

Model Manager

Handles loading and management of different model types
Supports automatic model detection from default folders
Automatically detects diffusion model architecture and selects appropriate loading method
Supports traditional SD models (SD 1.5, SD 2.1, SDXL Base/Refiner) using ctxParams.model_path
Supports modern architectures (Flux Schnell/Dev/Chroma, SD3, Qwen2VL) using ctxParams.diffusion_model_path
Includes fallback mechanisms for unknown architectures
Applies optimal parameters based on detected model type
Manages model lifecycle and memory usage
Provides type-based model organization

┌─────────────────┐    ┌─────────────────┐    ┌─────────────────┐
│   HTTP Server   │───▶│ Generation Queue│───▶│  Model Manager  │
│                 │    │                 │    │                 │
│ - Request Parse │    │ - Job Queue     │    │ - Model Loading │
│ - Response      │    │ - Sequential    │    │ - Type Detection│
│   Formatting    │    │   Processing    │    │ - Memory Mgmt   │
└─────────────────┘    └─────────────────┘    └─────────────────┘

Technical Requirements

Core Technologies

C++17 or later
CMake 3.15 or later
Threading support (std::thread, std::mutex, std::condition_variable)
CUDA support (optional but recommended for performance)

Dependencies

stable-diffusion.cpp library (automatically downloaded via CMake)
HTTP server library (to be determined based on requirements)
JSON library for request/response handling
Build tools compatible with CMake

Platform Support

Linux (primary development platform)
Windows (planned support)
macOS (planned support)

Project Structure

stable-diffusion.cpp-rest/
├── CMakeLists.txt              # Main CMake configuration
├── README.md                   # This file
├── src/                        # Source code directory
│   ├── main.cpp                # Application entry point
│   ├── http/                   # HTTP server implementation
│   │   ├── server.h/.cpp       # HTTP server class
│   │   ├── handlers.h/.cpp     # Request handlers
│   │   └── responses.h/.cpp    # Response formatting
│   ├── generation/             # Generation queue implementation
│   │   ├── queue.h/.cpp        # Job queue management
│   │   ├── worker.h/.cpp       # Generation worker thread
│   │   └── job.h/.cpp          # Job definition and status
│   ├── models/                 # Model manager implementation
│   │   ├── manager.h/.cpp      # Model manager class
│   │   ├── loader.h/.cpp       # Model loading logic
│   │   └── types.h/.cpp        # Model type definitions
│   └── utils/                  # Utility functions
│       ├── config.h/.cpp       # Configuration management
│       └── logger.h/.cpp       # Logging utilities
├── include/                    # Public header files
├── external/                   # External dependencies (managed by CMake)
├── models/                     # Default model storage directory
│   ├── lora/                   # LoRA models
│   ├── checkpoints/            # Checkpoint models
│   ├── vae/                    # VAE models
│   ├── presets/                # Preset files
│   ├── prompts/                # Prompt templates
│   ├── neg_prompts/            # Negative prompt templates
│   ├── taesd/                  # TAESD models
│   ├── esrgan/                 # ESRGAN models
│   ├── controlnet/             # ControlNet models
│   ├── upscaler/               # Upscaler models
│   └── embeddings/             # Textual embeddings
├── tests/                      # Unit and integration tests
├── examples/                   # Usage examples
└── docs/                       # Additional documentation

Model Detection and Architecture Support

The stable-diffusion.cpp-rest server includes intelligent model detection that automatically identifies the model architecture and selects the appropriate loading method. This ensures compatibility with both traditional Stable Diffusion models and modern architectures.

Supported Architectures

Traditional Stable Diffusion Models

SD 1.5 - Stable Diffusion version 1.5 models
SD 2.1 - Stable Diffusion version 2.1 models
SDXL Base - Stable Diffusion XL base models
SDXL Refiner - Stable Diffusion XL refiner models

These models are loaded using the traditional ctxParams.model_path parameter.

Modern Architectures

Flux Schnell - Fast Flux variant
Flux Dev - Development version of Flux
Flux Chroma - Chroma-optimized Flux
SD 3 - Stable Diffusion 3 models
Qwen2VL - Qwen2 Vision-Language models

These models are loaded using the modern ctxParams.diffusion_model_path parameter.

Detection Process

Model Analysis: When loading a model, the system analyzes the model file structure and metadata
Architecture Identification: The model architecture is identified based on key signatures in the model
Loading Method Selection: The appropriate loading method is automatically selected:
- Traditional models → ctxParams.model_path
- Modern architectures → ctxParams.diffusion_model_path
Fallback Handling: Unknown architectures default to traditional loading for backward compatibility
Error Recovery: If loading with the detected method fails, the system attempts fallback loading
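
The selection and fallback steps above can be expressed as a small mapping from detected architecture to the context field that receives the model path. The sketch below is an assumption-laden illustration: the `Architecture` and `LoadField` enums and the function names are hypothetical; only the field names `model_path` and `diffusion_model_path` come from this document.

```cpp
#include <string_view>

// Hypothetical architecture tags; the server's own detection result may differ.
enum class Architecture {
    SD15, SD21, SDXLBase, SDXLRefiner,           // traditional
    FluxSchnell, FluxDev, FluxChroma, SD3, Qwen2VL, // modern
    Unknown
};

// Which ctxParams field receives the model file path.
enum class LoadField { ModelPath, DiffusionModelPath };

// Step 3: pick the loading method. Unknown architectures fall through to
// the traditional field, matching the documented fallback behaviour.
constexpr LoadField selectLoadField(Architecture arch) {
    switch (arch) {
        case Architecture::FluxSchnell:
        case Architecture::FluxDev:
        case Architecture::FluxChroma:
        case Architecture::SD3:
        case Architecture::Qwen2VL:
            return LoadField::DiffusionModelPath;
        default:
            return LoadField::ModelPath;
    }
}

// Step 5: if the first attempt fails, retry with the other field.
constexpr LoadField fallbackField(LoadField first) {
    return first == LoadField::ModelPath ? LoadField::DiffusionModelPath
                                         : LoadField::ModelPath;
}

// Name of the ctxParams member, useful for logging.
constexpr std::string_view fieldName(LoadField f) {
    return f == LoadField::DiffusionModelPath ? "diffusion_model_path"
                                              : "model_path";
}
```

Keeping the mapping in one function means that adding a new architecture touches a single switch statement.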

Benefits

Automatic Compatibility: No need to manually specify model type
Optimal Loading: Each architecture uses its optimal loading parameters
Future-Proof: Easy to add support for new architectures
Backward Compatible: Existing models continue to work without changes

Model Architecture Requirements

Note: The following tables contain extensive information and may require horizontal scrolling to view all columns properly.

Architecture	Extra VAE	Standalone High Noise	T5XXL	CLIP-Vision	CLIP-G	CLIP-L	Model Files	Example Commands
SD 1.x	No	No	No	No	No	No	sd-v1-4.ckpt, v1-5-pruned-emaonly.safetensors	`./bin/sd -m ../models/sd-v1-4.ckpt -p "a lovely cat"`
SD 2.x	No	No	No	No	No	No	(Similar to SD 1.x)	`./bin/sd -m ../models/sd2-model.ckpt -p "a lovely cat"`
SDXL	Yes	No	No	No	No	No	sd_xl_base_1.0.safetensors, sdxl_vae-fp16-fix.safetensors	`./bin/sd -m ../models/sd_xl_base_1.0.safetensors --vae ../models/sdxl_vae-fp16-fix.safetensors -H 1024 -W 1024 -p "a lovely cat" -v`
SD3	No	No	Yes	No	No	No	sd3_medium_incl_clips_t5xxlfp16.safetensors	`./bin/sd -m ../models/sd3_medium_incl_clips_t5xxlfp16.safetensors -H 1024 -W 1024 -p 'a lovely cat holding a sign says "Stable Diffusion CPP"' --cfg-scale 4.5 --sampling-method euler -v --clip-on-cpu`
SD3.5 Large	No	No	Yes	No	Yes	Yes	sd3.5_large.safetensors, clip_l.safetensors, clip_g.safetensors, t5xxl_fp16.safetensors	`./bin/sd -m ../models/sd3.5_large.safetensors --clip_l ../models/clip_l.safetensors --clip_g ../models/clip_g.safetensors --t5xxl ../models/t5xxl_fp16.safetensors -H 1024 -W 1024 -p 'a lovely cat holding a sign says "Stable diffusion 3.5 Large"' --cfg-scale 4.5 --sampling-method euler -v --clip-on-cpu`
FLUX Models	Yes	No	Yes	No	No	Yes	flux1-dev-q3_k.gguf, flux1-dev-q8_0.gguf, flux1-schnell-q8_0.gguf, ae.sft, clip_l.safetensors, t5xxl_fp16.safetensors	`./bin/sd --diffusion-model ../models/flux1-dev-q8_0.gguf --vae ../models/ae.sft --clip_l ../models/clip_l.safetensors --t5xxl ../models/t5xxl_fp16.safetensors -p "a lovely cat holding a sign says 'flux.cpp'" --cfg-scale 1.0 --sampling-method euler -v --clip-on-cpu`
Kontext	Yes	No	Yes	No	No	Yes	flux1-kontext-dev-q8_0.gguf, ae.sft, clip_l.safetensors, t5xxl_fp16.safetensors	`./bin/sd -r ./flux1-dev-q8_0.png --diffusion-model ../models/flux1-kontext-dev-q8_0.gguf --vae ../models/ae.sft --clip_l ../models/clip_l.safetensors --t5xxl ../models/t5xxl_fp16.safetensors -p "change 'flux.cpp' to 'kontext.cpp'" --cfg-scale 1.0 --sampling-method euler -v --clip-on-cpu`
Chroma	Yes	No	Yes	No	No	No	chroma-unlocked-v40-q8_0.gguf, ae.sft, t5xxl_fp16.safetensors	`./bin/sd --diffusion-model ../models/chroma-unlocked-v40-q8_0.gguf --vae ../models/ae.sft --t5xxl ../models/t5xxl_fp16.safetensors -p "a lovely cat holding a sign says 'chroma.cpp'" --cfg-scale 4.0 --sampling-method euler -v --chroma-disable-dit-mask --clip-on-cpu`
Wan Models	Yes	No	Yes	Yes	No	No	wan2.1_t2v_1.3B_fp16.safetensors, wan_2.1_vae.safetensors, umt5-xxl-encoder-Q8_0.gguf	`./bin/sd -M vid_gen --diffusion-model ../models/wan2.1_t2v_1.3B_fp16.safetensors --vae ../models/wan_2.1_vae.safetensors --t5xxl ../models/umt5-xxl-encoder-Q8_0.gguf -p "a lovely cat" --cfg-scale 6.0 --sampling-method euler -v -W 832 -H 480 --diffusion-fa --video-frames 33 --flow-shift 3.0`
Wan2.2 T2V A14B	No	Yes	No	No	No	No	Wan2.2-T2V-A14B-LowNoise-Q8_0.gguf, Wan2.2-T2V-A14B-HighNoise-Q8_0.gguf	`./bin/sd -M vid_gen --diffusion-model ../models/Wan2.2-T2V-A14B-LowNoise-Q8_0.gguf -p "a lovely cat" --cfg-scale 6.0 --sampling-method euler -v -W 832 -H 480 --diffusion-fa --video-frames 33 --flow-shift 3.0`
Wan2.2 I2V A14B	No	Yes	No	No	No	No	Wan2.2-I2V-A14B-LowNoise-Q8_0.gguf, Wan2.2-I2V-A14B-HighNoise-Q8_0.gguf	`./bin/sd -M vid_gen --diffusion-model ../models/Wan2.2-I2V-A14B-LowNoise-Q8_0.gguf -r input.png -p "a lovely cat" --cfg-scale 6.0 --sampling-method euler -v -W 832 -H 480 --diffusion-fa --video-frames 33 --flow-shift 3.0`
Qwen Image Models	Yes	No	No	No	No	No	qwen-image-Q8_0.gguf, qwen_image_vae.safetensors, Qwen2.5-VL-7B-Instruct-Q8_0.gguf	`./bin/sd --diffusion-model ../models/qwen-image-Q8_0.gguf --vae ../models/qwen_image_vae.safetensors --qwen2vl ../models/Qwen2.5-VL-7B-Instruct-Q8_0.gguf -p '一个穿着"QWEN"标志的T恤的中国美女正拿着黑色的马克笔面相镜头微笑。' --cfg-scale 2.5 --sampling-method euler -v --offload-to-cpu -H 1024 -W 1024 --diffusion-fa --flow-shift 3`
Qwen Image Edit	Yes	No	No	No	No	No	Qwen_Image_Edit-Q8_0.gguf, qwen_image_vae.safetensors, qwen_2.5_vl_7b.safetensors	`./bin/sd --diffusion-model ../models/Qwen_Image_Edit-Q8_0.gguf --vae ../models/qwen_image_vae.safetensors --qwen2vl ../models/qwen_2.5_vl_7b.safetensors --cfg-scale 2.5 --sampling-method euler -v --offload-to-cpu --diffusion-fa --flow-shift 3 -r ../assets/flux/flux1-dev-q8_0.png -p "change 'flux.cpp' to 'edit.cpp'"`
PhotoMaker	No	No	No	No	No	No	sdxlUnstableDiffusers_v11.safetensors, sdxl_vae.safetensors, photomaker-v1.safetensors	`./bin/sd -m ../models/sdxlUnstableDiffusers_v11.safetensors --vae ../models/sdxl_vae.safetensors --photo-maker ../models/photomaker-v1.safetensors --pm-id-images-dir ../assets/photomaker_examples/scarletthead_woman -p "a girl img, retro futurism" --cfg-scale 5.0 --sampling-method euler -H 1024 -W 1024 --pm-style-strength 10 --vae-on-cpu --steps 50`
LCM	No	No	No	No	No	No	lcm-lora-sdv1-5	`./bin/sd -m ../models/v1-5-pruned-emaonly.safetensors -p "a lovely cat<lora:lcm-lora-sdv1-5:1>" --steps 4 --lora-model-dir ../models -v --cfg-scale 1`
SSD1B	No	No	No	No	No	No	(Various SSD-1B models)	`./bin/sd -m ../models/ssd-1b.safetensors -p "a lovely cat"`
Tiny SD	No	No	No	No	No	No	(Various Tiny SD models)	`./bin/sd -m ../models/tiny-sd.safetensors -p "a lovely cat"`

Context Creation Methods per Architecture

Architecture	Context Creation Method	Special Parameters	Model Files	Example Commands
SD 1.x, SD 2.x, SDXL	Standard prompt-based generation	--cfg-scale, --sampling-method, --steps	sd-v1-4.ckpt, v1-5-pruned-emaonly.safetensors, sd_xl_base_1.0.safetensors	`./bin/sd -m ../models/sd-v1-4.ckpt -p "a lovely cat"`
SD3	Multiple text encoders	--clip-on-cpu recommended	sd3_medium_incl_clips_t5xxlfp16.safetensors	`./bin/sd -m ../models/sd3_medium_incl_clips_t5xxlfp16.safetensors -H 1024 -W 1024 -p 'a lovely cat holding a sign says "Stable Diffusion CPP"' --cfg-scale 4.5 --sampling-method euler -v --clip-on-cpu`
SD3.5 Large	Multiple text encoders	--clip-on-cpu recommended	sd3.5_large.safetensors, clip_l.safetensors, clip_g.safetensors, t5xxl_fp16.safetensors	`./bin/sd -m ../models/sd3.5_large.safetensors --clip_l ../models/clip_l.safetensors --clip_g ../models/clip_g.safetensors --t5xxl ../models/t5xxl_fp16.safetensors -H 1024 -W 1024 -p 'a lovely cat holding a sign says "Stable diffusion 3.5 Large"' --cfg-scale 4.5 --sampling-method euler -v --clip-on-cpu`
FLUX Models	Text-to-image generation	--cfg-scale 1.0 recommended, --clip-on-cpu for memory efficiency	flux1-dev-q3_k.gguf, flux1-dev-q8_0.gguf, flux1-schnell-q8_0.gguf, ae.sft, clip_l.safetensors, t5xxl_fp16.safetensors	`./bin/sd --diffusion-model ../models/flux1-dev-q8_0.gguf --vae ../models/ae.sft --clip_l ../models/clip_l.safetensors --t5xxl ../models/t5xxl_fp16.safetensors -p "a lovely cat holding a sign says 'flux.cpp'" --cfg-scale 1.0 --sampling-method euler -v --clip-on-cpu`
Kontext	Image-to-image transformation	-r for reference image, --cfg-scale 1.0 recommended	flux1-kontext-dev-q8_0.gguf, ae.sft, clip_l.safetensors, t5xxl_fp16.safetensors	`./bin/sd -r ./flux1-dev-q8_0.png --diffusion-model ../models/flux1-kontext-dev-q8_0.gguf --vae ../models/ae.sft --clip_l ../models/clip_l.safetensors --t5xxl ../models/t5xxl_fp16.safetensors -p "change 'flux.cpp' to 'kontext.cpp'" --cfg-scale 1.0 --sampling-method euler -v --clip-on-cpu`
Chroma	Text-to-image generation	--cfg-scale 4.0 (as in the example command), --chroma-disable-dit-mask, --clip-on-cpu for memory efficiency	chroma-unlocked-v40-q8_0.gguf, ae.sft, t5xxl_fp16.safetensors	`./bin/sd --diffusion-model ../models/chroma-unlocked-v40-q8_0.gguf --vae ../models/ae.sft --t5xxl ../models/t5xxl_fp16.safetensors -p "a lovely cat holding a sign says 'chroma.cpp'" --cfg-scale 4.0 --sampling-method euler -v --chroma-disable-dit-mask --clip-on-cpu`
Chroma1-Radiance	Text-to-image generation	--cfg-scale 4.0 recommended	Chroma1-Radiance-v0.4-Q8_0.gguf, t5xxl_fp16.safetensors	`./bin/sd --diffusion-model ../models/Chroma1-Radiance-v0.4-Q8_0.gguf --t5xxl ../models/clip/t5xxl_fp16.safetensors -p "a lovely cat holding a sign says 'chroma radiance cpp'" --cfg-scale 4.0 --sampling-method euler -v`
Wan Models	Video generation with text prompts	-M vid_gen, --video-frames, --flow-shift, --diffusion-fa	wan2.1_t2v_1.3B_fp16.safetensors, wan_2.1_vae.safetensors, umt5-xxl-encoder-Q8_0.gguf	`./bin/sd -M vid_gen --diffusion-model ../models/wan2.1_t2v_1.3B_fp16.safetensors --vae ../models/wan_2.1_vae.safetensors --t5xxl ../models/umt5-xxl-encoder-Q8_0.gguf -p "a lovely cat" --cfg-scale 6.0 --sampling-method euler -v -W 832 -H 480 --diffusion-fa --video-frames 33 --flow-shift 3.0`
Wan2.1 I2V Models	Image-to-video generation	Requires clip_vision_h.safetensors	wan2.1-i2v-14b-480p-Q8_0.gguf, wan_2.1_vae.safetensors, clip_vision_h.safetensors	`./bin/sd -M vid_gen --diffusion-model ../models/wan2.1-i2v-14b-480p-Q8_0.gguf --vae ../models/wan_2.1_vae.safetensors -r input.png -p "a lovely cat" --cfg-scale 6.0 --sampling-method euler -v -W 832 -H 480 --diffusion-fa --video-frames 33 --flow-shift 3.0`
Wan2.1 FLF2V Models	First-last-frame-to-video (FLF2V) generation	Requires clip_vision_h.safetensors	wan2.1-flf2v-14b-720p-Q8_0.gguf, wan_2.1_vae.safetensors, clip_vision_h.safetensors	`./bin/sd -M vid_gen --diffusion-model ../models/wan2.1-flf2v-14b-720p-Q8_0.gguf --vae ../models/wan_2.1_vae.safetensors -r flow.png -p "a lovely cat" --cfg-scale 6.0 --sampling-method euler -v -W 832 -H 480 --diffusion-fa --video-frames 33 --flow-shift 3.0`
Wan2.2 T2V A14B	Text-to-video generation	Uses dual diffusion models	Wan2.2-T2V-A14B-LowNoise-Q8_0.gguf, Wan2.2-T2V-A14B-HighNoise-Q8_0.gguf	`./bin/sd -M vid_gen --diffusion-model ../models/Wan2.2-T2V-A14B-LowNoise-Q8_0.gguf -p "a lovely cat" --cfg-scale 6.0 --sampling-method euler -v -W 832 -H 480 --diffusion-fa --video-frames 33 --flow-shift 3.0`
Wan2.2 I2V A14B	Image-to-video generation	Uses dual diffusion models	Wan2.2-I2V-A14B-LowNoise-Q8_0.gguf, Wan2.2-I2V-A14B-HighNoise-Q8_0.gguf	`./bin/sd -M vid_gen --diffusion-model ../models/Wan2.2-I2V-A14B-LowNoise-Q8_0.gguf -r input.png -p "a lovely cat" --cfg-scale 6.0 --sampling-method euler -v -W 832 -H 480 --diffusion-fa --video-frames 33 --flow-shift 3.0`
Qwen Image Models	Text-to-image generation with Chinese language support	--qwen2vl for the language model, --diffusion-fa, --flow-shift	qwen-image-Q8_0.gguf, qwen_image_vae.safetensors, Qwen2.5-VL-7B-Instruct-Q8_0.gguf	`./bin/sd --diffusion-model ../models/qwen-image-Q8_0.gguf --vae ../models/qwen_image_vae.safetensors --qwen2vl ../models/Qwen2.5-VL-7B-Instruct-Q8_0.gguf -p '一个穿着"QWEN"标志的T恤的中国美女正拿着黑色的马克笔面相镜头微笑。' --cfg-scale 2.5 --sampling-method euler -v --offload-to-cpu -H 1024 -W 1024 --diffusion-fa --flow-shift 3`
Qwen Image Edit	Image editing with reference image	-r for reference image, --qwen2vl_vision for vision model	Qwen_Image_Edit-Q8_0.gguf, qwen_image_vae.safetensors, qwen_2.5_vl_7b.safetensors	`./bin/sd --diffusion-model ../models/Qwen_Image_Edit-Q8_0.gguf --vae ../models/qwen_image_vae.safetensors --qwen2vl ../models/qwen_2.5_vl_7b.safetensors --cfg-scale 2.5 --sampling-method euler -v --offload-to-cpu --diffusion-fa --flow-shift 3 -r ../assets/flux/flux1-dev-q8_0.png -p "change 'flux.cpp' to 'edit.cpp'"`
PhotoMaker	Personalized image generation with ID images	--photo-maker, --pm-id-images-dir, --pm-style-strength	sdxlUnstableDiffusers_v11.safetensors, sdxl_vae.safetensors, photomaker-v1.safetensors	`./bin/sd -m ../models/sdxlUnstableDiffusers_v11.safetensors --vae ../models/sdxl_vae.safetensors --photo-maker ../models/photomaker-v1.safetensors --pm-id-images-dir ../assets/photomaker_examples/scarletthead_woman -p "a girl img, retro futurism" --cfg-scale 5.0 --sampling-method euler -H 1024 -W 1024 --pm-style-strength 10 --vae-on-cpu --steps 50`
LCM	Fast generation with LoRA	--cfg-scale 1.0, --steps 2-8, --sampling-method lcm/euler_a	lcm-lora-sdv1-5	`./bin/sd -m ../models/v1-5-pruned-emaonly.safetensors -p "a lovely cat<lora:lcm-lora-sdv1-5:1>" --steps 4 --lora-model-dir ../models -v --cfg-scale 1`
SSD1B	Standard prompt-based generation	Standard SD parameters	(Various SSD-1B models)	`./bin/sd -m ../models/ssd-1b.safetensors -p "a lovely cat"`
Tiny SD	Standard prompt-based generation	Standard SD parameters	(Various Tiny SD models)	`./bin/sd -m ../models/tiny-sd.safetensors -p "a lovely cat"`

Model Quantization and Conversion

Quantization Levels Supported

The stable-diffusion.cpp library supports various quantization levels to balance model size and performance:

Quantization Level	Description	Model Size Reduction	Quality Impact
`f32`	32-bit floating-point	None (original)	No quality loss
`f16`	16-bit floating-point	~50%	Minimal quality loss
`q8_0`	8-bit integer quantization	~75%	Slight quality loss
`q5_0`, `q5_1`	5-bit integer quantization	~80%	Moderate quality loss
`q4_0`, `q4_1`	4-bit integer quantization	~85%	Noticeable quality loss
`q3_k`	3-bit K-quantization	~87%	Significant quality loss
`q4_k`	4-bit K-quantization	~85%	Good balance of size/quality
`q2_k`	2-bit K-quantization	~90%	Major quality loss
`Q4_K_S`	4-bit K-quantization Small	~85%	Optimized for smaller models
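
To get a feel for what these percentages mean on disk, the arithmetic can be sketched as below. The reduction figures are copied from the table and are approximate, relative to an `f32` baseline; the `Quant` enum and both function names are hypothetical helpers, not part of stable-diffusion.cpp.

```cpp
#include <cstdint>

// Quantization levels from the table above (hypothetical enum for illustration).
enum class Quant { F32, F16, Q8_0, Q5_0, Q5_1, Q4_0, Q4_1, Q4_K, Q3_K, Q2_K };

// Approximate size reduction versus f32, in percent, as listed in the table.
constexpr int reductionPercent(Quant q) {
    switch (q) {
        case Quant::F32:  return 0;
        case Quant::F16:  return 50;
        case Quant::Q8_0: return 75;
        case Quant::Q5_0:
        case Quant::Q5_1: return 80;
        case Quant::Q4_0:
        case Quant::Q4_1:
        case Quant::Q4_K: return 85;
        case Quant::Q3_K: return 87;
        case Quant::Q2_K: return 90;
    }
    return 0;
}

// Rough size estimate: f32 size scaled by the remaining percentage.
constexpr std::uint64_t estimatedBytes(std::uint64_t f32Bytes, Quant q) {
    return f32Bytes * static_cast<std::uint64_t>(100 - reductionPercent(q)) / 100;
}
```

For example, a hypothetical 8,000 MB `f32` model would come out near 2,000 MB at `q8_0` and 1,200 MB at `q4_0`. Many checkpoints are distributed as `f16`, so the saving relative to the downloaded file is roughly half of what the table suggests.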

Model Conversion Commands

To convert models from their original format to quantized GGUF format, use the following commands:

Stable Diffusion Models

# Convert SD 1.5 model to 8-bit quantization
./bin/sd -M convert -m ../models/v1-5-pruned-emaonly.safetensors -o ../models/v1-5-pruned-emaonly.q8_0.gguf -v --type q8_0

# Convert SDXL model to 4-bit quantization
./bin/sd -M convert -m ../models/sd_xl_base_1.0.safetensors -o ../models/sd_xl_base_1.0.q4_0.gguf -v --type q4_0

Flux Models

# Convert Flux Dev model to 8-bit quantization
./bin/sd -M convert -m ../models/flux1-dev.sft -o ../models/flux1-dev-q8_0.gguf -v --type q8_0

# Convert Flux Schnell model to 3-bit K-quantization
./bin/sd -M convert -m ../models/flux1-schnell.sft -o ../models/flux1-schnell-q3_k.gguf -v --type q3_k

Chroma Models

# Convert Chroma model to 8-bit quantization
./bin/sd -M convert -m ../models/chroma-unlocked-v40.safetensors -o ../models/chroma-unlocked-v40-q8_0.gguf -v --type q8_0

Kontext Models

# Convert Kontext model to 8-bit quantization
./bin/sd -M convert -m ../models/flux1-kontext-dev.safetensors -o ../models/flux1-kontext-dev-q8_0.gguf -v --type q8_0

LoRA Models

The project supports LoRA (Low-Rank Adaptation) models for fine-tuning and style transfer:

LoRA Model	Compatible Base Models	Example Usage
`marblesh.safetensors`	SD 1.5, SD 2.1	`./bin/sd -m ../models/v1-5-pruned-emaonly.safetensors -p "a lovely cat<lora:marblesh:1>" --lora-model-dir ../models`
`lcm-lora-sdv1-5`	SD 1.5	`./bin/sd -m ../models/v1-5-pruned-emaonly.safetensors -p "a lovely cat<lora:lcm-lora-sdv1-5:1>" --steps 4 --lora-model-dir ../models -v --cfg-scale 1`
`realism_lora_comfy_converted`	FLUX Models	`./bin/sd --diffusion-model ../models/flux1-dev-q8_0.gguf --vae ../models/ae.sft --clip_l ../models/clip_l.safetensors --t5xxl ../models/t5xxl_fp16.safetensors -p "a lovely cat holding a sign says 'flux.cpp'<lora:realism_lora_comfy_converted:1>" --cfg-scale 1.0 --sampling-method euler -v --lora-model-dir ../models --clip-on-cpu`
`wan2.2_t2v_lightx2v_4steps_lora_v1.1_low_noise`	Wan2.2 T2V Models	`./bin/sd -M vid_gen --diffusion-model ../models/Wan2.2-T2V-A14B-LowNoise-Q8_0.gguf --lora-model-dir ../models -p "a lovely cat<lora:wan2.2_t2v_lightx2v_4steps_lora_v1.1_low_noise:1>" --steps 4`
`wan2.2_t2v_lightx2v_4steps_lora_v1.1_high_noise`	Wan2.2 T2V Models	`./bin/sd -M vid_gen --diffusion-model ../models/Wan2.2-T2V-A14B-HighNoise-Q8_0.gguf --lora-model-dir ../models -p "a lovely cat<lora:wan2.2_t2v_lightx2v_4steps_lora_v1.1_high_noise:1>" --steps 4`

Additional Model Types

Upscaling (ESRGAN)

# Use ESRGAN for upscaling generated images
./bin/sd -m ../models/v1-5-pruned-emaonly.safetensors -p "a lovely cat" --upscale-model ../models/RealESRGAN_x4plus_anime_6B.pth

Fast Decoding (TAESD)

# Use TAESD for faster VAE decoding
./bin/sd -m ../models/v1-5-pruned-emaonly.safetensors -p "a lovely cat" --taesd ../models/diffusion_pytorch_model.safetensors

Model Types and File Extensions

The project supports various model types, each with specific file extensions:

Model Type	Enum Value	Description	Supported Extensions
LORA	1	Low-Rank Adaptation models	.safetensors, .pt, .ckpt
CHECKPOINT	2	Main model checkpoints	.safetensors, .pt, .ckpt
VAE	4	Variational Autoencoder models	.safetensors, .pt, .ckpt
PRESETS	8	Generation preset files	.json, .yaml, .yml
PROMPTS	16	Prompt template files	.txt, .json
NEG_PROMPTS	32	Negative prompt templates	.txt, .json
TAESD	64	Tiny AutoEncoder for SD	.safetensors, .pt, .ckpt
ESRGAN	128	Super-resolution models	.pth, .pt
CONTROLNET	256	ControlNet models	.safetensors, .pt, .ckpt
UPSCALER	512	Image upscaler models	.pth, .pt
EMBEDDING	1024	Textual embeddings	.safetensors, .pt, .ckpt

Model Type Enum Definition

enum ModelType {
    LORA = 1,
    CHECKPOINT = 2,
    VAE = 4,
    PRESETS = 8,
    PROMPTS = 16,
    NEG_PROMPTS = 32,
    TAESD = 64,
    ESRGAN = 128,
    CONTROLNET = 256,
    UPSCALER = 512,
    EMBEDDING = 1024
};
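
Because every enumerator is a distinct power of two, several model types can be combined into one bitmask. The sketch below repeats the enum so it is self-contained and adds two hypothetical helpers (`hasType`, `candidateTypes`); the extension mapping simply mirrors the table above and is not taken from the server's source.

```cpp
#include <string_view>

enum ModelType {
    LORA = 1, CHECKPOINT = 2, VAE = 4, PRESETS = 8, PROMPTS = 16,
    NEG_PROMPTS = 32, TAESD = 64, ESRGAN = 128, CONTROLNET = 256,
    UPSCALER = 512, EMBEDDING = 1024
};

// A set of model types stored as OR-ed flag bits.
using ModelTypeMask = unsigned;

// True if the mask contains the given type.
constexpr bool hasType(ModelTypeMask mask, ModelType t) {
    return (mask & static_cast<unsigned>(t)) != 0;
}

// Model types a file could belong to, judged by extension alone
// (extension includes the leading dot, e.g. ".pth").
constexpr ModelTypeMask candidateTypes(std::string_view ext) {
    constexpr ModelTypeMask weights =
        LORA | CHECKPOINT | VAE | TAESD | CONTROLNET | EMBEDDING;
    if (ext == ".safetensors" || ext == ".ckpt") return weights;
    if (ext == ".pt")   return weights | ESRGAN | UPSCALER;
    if (ext == ".pth")  return ESRGAN | UPSCALER;
    if (ext == ".json") return PRESETS | PROMPTS | NEG_PROMPTS;
    if (ext == ".yaml" || ext == ".yml") return PRESETS;
    if (ext == ".txt")  return PROMPTS | NEG_PROMPTS;
    return 0;  // extension not listed in the table
}
```

An extension alone is usually ambiguous (a `.safetensors` file may be a checkpoint, a LoRA, or a VAE), which is presumably why the server also relies on the folder a file sits in.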

API Endpoints

Core Generation Endpoints

Text-to-Image Generation

POST /api/generate/text2img

Generate images from text prompts with comprehensive parameter support.

Example Request:

{
    "prompt": "a beautiful landscape",
    "negative_prompt": "blurry, low quality",
    "width": 1024,
    "height": 1024,
    "steps": 20,
    "cfg_scale": 7.5,
    "sampling_method": "euler",
    "scheduler": "karras",
    "seed": "random",
    "batch_count": 1,
    "vae_model": "optional_vae_name"
}

Image-to-Image Generation

POST /api/generate/img2img

Transform existing images with text guidance.

Example Request:

{
    "prompt": "transform into anime style",
    "init_image": "base64_encoded_image_or_url",
    "strength": 0.75,
    "width": 1024,
    "height": 1024,
    "steps": 20,
    "cfg_scale": 7.5
}
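
A client that sends `init_image` as base64 first has to encode the raw image bytes. A minimal sketch of standard base64 (RFC 4648, with `=` padding) follows; `base64Encode` is a hypothetical helper name, and whether the server expects plain base64 or a `data:` URI prefix is not stated here, so treat the output format as an assumption.

```cpp
#include <cstddef>
#include <cstdint>
#include <string>
#include <vector>

// Encode raw bytes as standard base64 with '=' padding.
std::string base64Encode(const std::vector<std::uint8_t>& data) {
    static const char table[] =
        "ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789+/";
    std::string out;
    out.reserve((data.size() + 2) / 3 * 4);

    std::size_t i = 0;
    // Full 3-byte groups become 4 output characters.
    for (; i + 2 < data.size(); i += 3) {
        std::uint32_t n = (data[i] << 16) | (data[i + 1] << 8) | data[i + 2];
        out += table[(n >> 18) & 63];
        out += table[(n >> 12) & 63];
        out += table[(n >> 6) & 63];
        out += table[n & 63];
    }
    if (i + 1 == data.size()) {
        // One trailing byte: two characters plus "==".
        std::uint32_t n = data[i] << 16;
        out += table[(n >> 18) & 63];
        out += table[(n >> 12) & 63];
        out += "==";
    } else if (i + 2 == data.size()) {
        // Two trailing bytes: three characters plus "=".
        std::uint32_t n = (data[i] << 16) | (data[i + 1] << 8);
        out += table[(n >> 18) & 63];
        out += table[(n >> 12) & 63];
        out += table[(n >> 6) & 63];
        out += '=';
    }
    return out;
}
```

The resulting string would be placed in the `init_image` field of the request body; the same helper applies to `control_image`, `source_image`, `mask_image`, and `image` in the other endpoints.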

ControlNet Generation

POST /api/generate/controlnet

Apply precise control using ControlNet models.

Example Request:

{
    "prompt": "a person standing",
    "control_image": "base64_encoded_control_image",
    "control_net_model": "canny",
    "control_strength": 0.9,
    "width": 512,
    "height": 512
}

Inpainting

POST /api/generate/inpainting

Edit specific regions of images using masks.

Example Request:

{
    "prompt": "change hair color to blonde",
    "source_image": "base64_encoded_source_image",
    "mask_image": "base64_encoded_mask_image",
    "strength": 0.75,
    "width": 512,
    "height": 512
}

Upscaling

POST /api/generate/upscale

Enhance image resolution using ESRGAN models.

Example Request:

{
    "image": "base64_encoded_image",
    "esrgan_model": "esrgan_model_name",
    "upscale_factor": 4
}

Job Management

Job Status

GET /api/queue/job/{job_id}

Get detailed status and results for a specific job.

Queue Status

GET /api/queue/status

Get current queue state and active jobs.

Cancel Job

POST /api/queue/cancel

Cancel a pending or running job.

Example Request:

{
    "job_id": "uuid-of-job-to-cancel"
}

Clear Queue

POST /api/queue/clear

Clear all pending jobs from the queue.

Model Management

List Models

GET /api/models

List all available models with metadata and filtering options.

Query Parameters:

type - Filter by model type (lora, checkpoint, vae, etc.)
search - Search in model names and descriptions
sort_by - Sort by name, size, date, type
sort_order - asc or desc
page - Page number for pagination
limit - Items per page
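The query parameters above combine into an ordinary URL query string. As a small sketch, the hypothetical helper below builds such a URL and drops any filter left as `None`; the base URL is an assumption taken from the client examples in this README.

```python
from urllib.parse import urlencode

# Assumption: host and port match the client examples in this README.
BASE_URL = "http://localhost:8080"


def models_url(base_url=BASE_URL, **filters):
    """Build the GET /api/models URL, skipping filters that are None."""
    query = {key: value for key, value in filters.items() if value is not None}
    suffix = f"?{urlencode(query)}" if query else ""
    return f"{base_url}/api/models{suffix}"


print(models_url(type="lora", sort_by="size", sort_order="desc", page=1, limit=20))
# prints http://localhost:8080/api/models?type=lora&sort_by=size&sort_order=desc&page=1&limit=20
```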

Model Information

GET /api/models/{model_id}

Get detailed information about a specific model.

Load Model

POST /api/models/{model_id}/load

Load a model into memory.

Unload Model

POST /api/models/{model_id}/unload

Unload a model from memory.

Model Types

GET /api/models/types

Get information about supported model types and their capabilities.

Model Directories

GET /api/models/directories

List and check status of model directories.

Refresh Models

POST /api/models/refresh

Rescan model directories and update cache.

Model Statistics

GET /api/models/stats

Get comprehensive statistics about models.

Batch Model Operations

POST /api/models/batch

Perform batch operations on multiple models.

Example Request:

{
    "operation": "load",
    "models": ["model1", "model2", "model3"]
}

Model Validation

POST /api/models/validate

Validate model files and check compatibility.

Model Conversion

POST /api/models/convert

Convert models between quantization formats.

Example Request:

{
    "model_name": "checkpoint_model_name",
    "quantization_type": "q8_0",
    "output_path": "/path/to/output.gguf"
}

Model Hashing

POST /api/models/hash

Generate SHA256 hashes for model verification.

System Information

Server Status

GET /api/status

Get server status, queue information, and loaded models.

System Information

GET /api/system

Get detailed system information including hardware, capabilities, and limits.

Server Configuration

GET /api/config

Get current server configuration and limits.

Server Restart

POST /api/system/restart

Trigger graceful server restart.

Authentication

Login

POST /api/auth/login

Authenticate user and receive access token.

Example Request:

{
    "username": "admin",
    "password": "password123"
}
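A client usually logs in once and then attaches the returned token to later requests. The sketch below illustrates that flow under two assumptions that this README does not confirm: the token is sent with the `Bearer` scheme in the `Authorization` header, and the caller extracts the token from the login response itself, since the response field name is not documented here. Both helper names are hypothetical.

```python
import json
import urllib.request

# Assumption: host and port match the client examples in this README.
BASE_URL = "http://localhost:8080"


def build_login_request(username, password, base_url=BASE_URL):
    """Build (but do not send) the POST /api/auth/login request."""
    body = json.dumps({"username": username, "password": password}).encode("utf-8")
    return urllib.request.Request(
        url=f"{base_url}/api/auth/login",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def with_bearer_token(request, token):
    """Attach an access token to a request.

    Assumption: the server expects "Authorization: Bearer <token>".
    """
    request.add_header("Authorization", f"Bearer {token}")
    return request
```

For example, `with_bearer_token(urllib.request.Request(f"{BASE_URL}/api/auth/me"), token)` would prepare an authenticated profile request.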

Token Validation

GET /api/auth/validate

Validate and check current token status.

Refresh Token

POST /api/auth/refresh

Refresh authentication token.

User Profile

GET /api/auth/me

Get current user information and permissions.

Logout

POST /api/auth/logout

Logout and invalidate token.

Utility Endpoints

Samplers

GET /api/samplers

Get available sampling methods and their properties.

Schedulers

GET /api/schedulers

Get available schedulers and their properties.

Parameters

GET /api/parameters

Get detailed parameter information and validation rules.

Validation

POST /api/validate

Validate generation parameters before submission.

Time Estimation

POST /api/estimate

Estimate generation time and memory usage.

Image Processing

POST /api/image/resize
POST /api/image/crop

Resize or crop images server-side.

Image Download

GET /api/image/download?url=image_url

Download and encode images from URLs.

File Downloads

Job Output Files

GET /api/v1/jobs/{job_id}/output/{filename}
GET /api/queue/job/{job_id}/output/{filename}

Download generated images and output files.

Thumbnail Support

GET /api/v1/jobs/{job_id}/output/{filename}?thumb=1&size=200

Get thumbnails for faster web UI loading.
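A sketch of building these download URLs, with the thumbnail parameters added only when a size is requested, could look like this. The helper name is hypothetical and the base URL is an assumption taken from the client examples in this README; the path and the `thumb` and `size` parameters are the ones shown above.

```python
from urllib.parse import quote, urlencode

# Assumption: host and port match the client examples in this README.
BASE_URL = "http://localhost:8080"


def job_output_url(job_id, filename, thumb_size=None, base_url=BASE_URL):
    """Build the URL of a job output file, optionally as a thumbnail."""
    # Percent-encode both path segments so unusual filenames stay valid.
    path = f"/api/v1/jobs/{quote(job_id, safe='')}/output/{quote(filename, safe='')}"
    query = f"?{urlencode({'thumb': 1, 'size': thumb_size})}" if thumb_size else ""
    return f"{base_url}{path}{query}"
```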

Health Check

Basic Health

GET /api/health

Simple health check endpoint.

Version Information

GET /api/version

Get detailed version and build information.

Public vs Protected Endpoints

Public (No Authentication Required):

/api/health - Basic health check
/api/status - Server status (read-only)
/api/version - Version information
Image download endpoints (for web UI display)

Protected (Authentication Required):

All generation endpoints
Model management (except listing)
Job cancellation
System management operations
User profile access

Example Request/Response

Generate Image Request

POST /api/v1/generate

{
    "prompt": "a beautiful landscape",
    "negative_prompt": "blurry, low quality",
    "model": "sd-v1-5",
    "width": 512,
    "height": 512,
    "steps": 20,
    "cfg_scale": 7.5,
    "seed": -1,
    "batch_size": 1
}

Generate Image Response

{
    "job_id": "uuid-string",
    "status": "completed",
    "images": [
        {
            "data": "base64-encoded-image-data",
            "seed": 12345,
            "parameters": {
                "prompt": "a beautiful landscape",
                "negative_prompt": "blurry, low quality",
                "model": "sd-v1-5",
                "width": 512,
                "height": 512,
                "steps": 20,
                "cfg_scale": 7.5,
                "seed": 12345,
                "batch_size": 1
            }
        }
    ],
    "generation_time": 3.2
}

Authentication

The server supports multiple authentication methods to secure API access:

Supported Authentication Methods

No Authentication (Default)
- Open access to all endpoints
- Suitable for development or trusted networks

JWT Token Authentication
- JSON Web Tokens for stateless authentication
- Configurable token expiration
- Secure for production deployments

API Key Authentication
- Static API keys for service-to-service communication
- Simple integration for external applications

PAM Authentication
- Integration with system authentication via PAM (Pluggable Authentication Modules)
- Supports LDAP, Kerberos, and other PAM backends
- Leverages existing system user accounts
- See PAM_AUTHENTICATION.md for detailed setup

Authentication Configuration

Authentication can be configured via command-line arguments or configuration files:

# Enable PAM authentication
./stable-diffusion-rest-server --auth pam --models-dir /path/to/models --checkpoints checkpoints

# Enable JWT authentication
./stable-diffusion-rest-server --auth jwt --models-dir /path/to/models --checkpoints checkpoints

# Enable API key authentication
./stable-diffusion-rest-server --auth api-key --models-dir /path/to/models --checkpoints checkpoints

# Enable Unix authentication
./stable-diffusion-rest-server --auth unix --models-dir /path/to/models --checkpoints checkpoints

# No authentication (default)
./stable-diffusion-rest-server --auth none --models-dir /path/to/models --checkpoints checkpoints

Authentication Methods

none - No authentication required (default)
jwt - JWT token authentication
api-key - API key authentication
unix - Unix system authentication
pam - PAM authentication
optional - Authentication optional (guest access allowed)

Deprecated Options

The following options are deprecated and will be removed in a future version:

--enable-unix-auth - Use --auth unix instead
--enable-pam-auth - Use --auth pam instead

Authentication Endpoints

POST /api/v1/auth/login - Authenticate with username/password (PAM/JWT)
POST /api/v1/auth/refresh - Refresh JWT token
GET /api/v1/auth/profile - Get current user profile
POST /api/v1/auth/logout - Logout/invalidate token

For detailed authentication setup instructions, see PAM_AUTHENTICATION.md.

Build Instructions

Prerequisites

CMake 3.15 or later
C++17 compatible compiler
Git for cloning dependencies
CUDA Toolkit (optional but recommended)

Build Steps

Clone the repository:

git clone https://github.com/your-username/stable-diffusion.cpp-rest.git
cd stable-diffusion.cpp-rest

Create a build directory:

```
mkdir build
cd build
```

Configure with CMake:

```
cmake ..
```

Build the project:

```
cmake --build . --parallel
```

(Optional) Install the binary:

```
cmake --install .
```

CMake Configuration

The project uses CMake's external project feature to automatically download and build the stable-diffusion.cpp library:

include(ExternalProject)

ExternalProject_Add(
    stable-diffusion.cpp
    GIT_REPOSITORY https://github.com/leejet/stable-diffusion.cpp.git
    GIT_TAG master-334-d05e46c
    SOURCE_DIR "${CMAKE_CURRENT_BINARY_DIR}/stable-diffusion.cpp-src"
    BINARY_DIR "${CMAKE_CURRENT_BINARY_DIR}/stable-diffusion.cpp-build"
    CMAKE_ARGS -DCMAKE_INSTALL_PREFIX=${CMAKE_CURRENT_BINARY_DIR}/stable-diffusion.cpp-install
    INSTALL_COMMAND ""
)

PAM Authentication Build Options

PAM authentication support is enabled by default when PAM libraries are available. You can control this with CMake options:

# Build with PAM support (default when available)
cmake -DENABLE_PAM_AUTH=ON ..

# Build without PAM support
cmake -DENABLE_PAM_AUTH=OFF ..

# Check if PAM support will be built
cmake -LA | grep ENABLE_PAM_AUTH

Note: PAM authentication requires the PAM development libraries:

# Ubuntu/Debian
sudo apt-get install libpam0g-dev

# CentOS/RHEL/Fedora
sudo yum install pam-devel

Usage Examples

Starting the Server

# Basic usage
./stable-diffusion-rest-server

# With custom configuration
./stable-diffusion-rest-server --config config.json

# With custom model directory
./stable-diffusion-rest-server --models-dir /path/to/models

Client Examples

Python with requests

import base64

import requests

# Generate an image
response = requests.post('http://localhost:8080/api/v1/generate', json={
    'prompt': 'a beautiful landscape',
    'width': 512,
    'height': 512,
    'steps': 20
})
# Raise on HTTP errors so a failed request does not surface as a missing key
response.raise_for_status()

result = response.json()
if result.get('status') == 'completed':
    # Decode and save the first image
    image_data = base64.b64decode(result['images'][0]['data'])
    with open('generated_image.png', 'wb') as f:
        f.write(image_data)

JavaScript with fetch

async function generateImage() {
    const response = await fetch('http://localhost:8080/api/v1/generate', {
        method: 'POST',
        headers: {
            'Content-Type': 'application/json'
        },
        body: JSON.stringify({
            prompt: 'a beautiful landscape',
            width: 512,
            height: 512,
            steps: 20
        })
    });

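    // Stop early on HTTP errors instead of parsing an error body as a result
    if (!response.ok) {
        throw new Error(`Generation request failed with HTTP ${response.status}`);
    }

    const result = await response.json();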
    if (result.status === 'completed') {
        // Create an image element with the generated image
        const img = document.createElement('img');
        img.src = `data:image/png;base64,${result.images[0].data}`;
        document.body.appendChild(img);
    }
}

cURL

# Generate an image
curl -X POST http://localhost:8080/api/v1/generate \
  -H "Content-Type: application/json" \
  -d '{
    "prompt": "a beautiful landscape",
    "width": 512,
    "height": 512,
    "steps": 20
  }'

# Check job status (replace {job_id} with the job_id returned by the generate request)
curl http://localhost:8080/api/v1/generate/{job_id}

# List available models
curl http://localhost:8080/api/v1/models

Development Status

✅ Completed Features (Production Ready)

Core System

✅ REST API Server - Full HTTP server with comprehensive error handling
✅ Generation Queue - Thread-safe job queue with status tracking
✅ Model Manager - Intelligent model detection and management
✅ Model Detection - Support for 15+ model architectures
✅ Authentication System - JWT, PAM, Unix, API key methods
✅ Progress Tracking - Real-time progress updates for all generation types

Generation Capabilities

✅ Text-to-Image - Full parameter support with all stable-diffusion.cpp options
✅ Image-to-Image - Transform images with strength control
✅ ControlNet - Multiple control modes and models
✅ Inpainting - Interactive mask editing with source and mask images
✅ Upscaling - ESRGAN model support with various scaling factors
✅ Batch Processing - Generate multiple images simultaneously

Model Management

✅ Model Types - Checkpoint, LoRA, VAE, ControlNet, ESRGAN, Embeddings, TAESD
✅ Model Validation - File validation and compatibility checking
✅ Model Conversion - Convert between quantization formats
✅ Model Hashing - SHA256 generation for verification
✅ Dependency Checking - Automatic dependency detection for architectures
✅ Batch Operations - Load/unload multiple models simultaneously

Web UI

✅ Modern Interface - Next.js 16 with React 19 and TypeScript
✅ Responsive Design - Mobile-first with Tailwind CSS 4
✅ Real-time Updates - Live progress and queue monitoring
✅ Interactive Forms - Specialized forms for each generation type
✅ Theme Support - Light/dark themes with auto-detection
✅ Image Processing - Built-in resize, crop, and format conversion
✅ File Downloads - Direct downloads with thumbnail support
✅ Authentication - Secure login with multiple auth methods

API Features

✅ Comprehensive Endpoints - 40+ API endpoints covering all functionality
✅ Parameter Validation - Request validation with detailed error messages
✅ File Handling - Upload/download images with base64 and URL support
✅ Error Handling - Structured error responses with proper HTTP codes
✅ CORS Support - Proper CORS headers for web integration
✅ Request Tracking - Unique request IDs for debugging

Advanced Features

✅ System Monitoring - Server status, system info, and performance metrics
✅ Configuration Management - Flexible command-line and file configuration
✅ Logging System - File and console logging with configurable levels
✅ Build System - CMake with automatic dependency management
✅ Installation Scripts - Systemd service installation with configuration

Supported Models

✅ Traditional Models - SD 1.5, SD 2.1, SDXL (base/refiner)
✅ Modern Architectures - Flux (Schnell/Dev/Chroma), SD3, SD3.5
✅ Video Models - Wan 2.1/2.2 T2V/I2V/FLF2V models
✅ Vision-Language - Qwen2VL with Chinese language support
✅ Specialized Models - PhotoMaker, LCM, SSD1B, Tiny SD
✅ Model Formats - safetensors, ckpt, gguf with conversion support

🔄 In Development

WebSocket Support

Real-time WebSocket connections for live updates
Currently using HTTP polling approach (works well)

Advanced Caching

Redis backend for distributed caching
Currently using in-memory caching

📋 Known Issues & Limitations

Progress Callback Issue

Status: ✅ FIXED (See ISSUE_49_PROGRESS_CALLBACK_FIX.md)

Originally segfaulted on second generation
Root cause was CUDA error, not progress callback
Callback cleanup mechanism properly implemented
Thread-safe memory management added

GPU Memory Management

Issue: CUDA errors during consecutive generations
Status: Requires investigation at stable-diffusion.cpp level
Workaround: Server restart clears memory state
Impact: Functional but may need periodic restarts

File Encoding Issues

Issue: Occasional zero-byte output files
Status: Detection implemented, recovery in progress
Workaround: Automatic retry with different parameters

🎯 Production Deployment Ready

The project is production-ready with:

✅ Comprehensive API coverage
✅ Robust error handling
✅ Security features
✅ Modern web interface
✅ Installation and deployment scripts
✅ Extensive model support
✅ Real-time monitoring capabilities

📊 Statistics

Total Codebase: 12 C++ files (13,341 lines) + Web UI (29 files, 16,565 lines)
API Endpoints: 40+ endpoints covering all functionality
Model Types: 12 different model categories supported
Model Architectures: 15+ architectures with intelligent detection
Authentication Methods: 6 different authentication options
Build System: Complete CMake with automatic dependency management

🚀 Performance Characteristics

Architecture: Three-thread design (HTTP server, generation queue, model manager)
Concurrency: Single generation at a time (thread-safe queue)
Web UI: Static export with long-term caching for optimal performance
Memory: Intelligent model loading and unloading
Response Times: Sub-second API responses, generation depends on model size
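The "single generation at a time" behavior is a classic single-consumer queue. The sketch below illustrates that pattern in Python purely as an illustration; the server itself is written in C++, and none of these names come from its source.

```python
import queue
import threading


def run_jobs_sequentially(jobs, handler):
    """Process jobs one at a time on a single worker thread, in submission order."""
    pending = queue.Queue()  # thread-safe FIFO shared by producer and worker
    results = []

    def worker():
        while True:
            job = pending.get()
            if job is None:  # sentinel: no more work
                break
            results.append(handler(job))

    thread = threading.Thread(target=worker)
    thread.start()
    for job in jobs:
        pending.put(job)
    pending.put(None)
    thread.join()
    return results
```

Because only one worker drains the queue, jobs never run concurrently, which matches the trade-off described above: simple resource management at the cost of throughput.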

Taken together, these characteristics describe a mature, feature-complete implementation that is ready for production deployment, with comprehensive documentation and robust error handling.

Contributing

Contributions are welcome! Please feel free to submit a Pull Request. For major changes, please open an issue first to discuss what you would like to change.

License

This project is licensed under the MIT License - see the LICENSE file for details.

Acknowledgments

stable-diffusion.cpp for the underlying C++ implementation
The Stable Diffusion community for models and examples
Contributors and users of this project
