# stable-diffusion.cpp-rest

A production-ready C++ REST API server that wraps the [stable-diffusion.cpp](https://github.com/leejet/stable-diffusion.cpp.git) library, providing comprehensive HTTP endpoints for image generation with Stable Diffusion models. Features a modern web interface built with Next.js and robust authentication system.

## ✨ Features

### Core Functionality
- **REST API** - Complete HTTP API for Stable Diffusion image generation
- **Web UI** - Modern, responsive web interface (automatically built with the server)
- **Queue System** - Efficient job queue with status tracking and cancellation
- **Model Management** - Intelligent model detection across multiple architectures
- **CUDA Support** - Optional GPU acceleration for faster generation
- **Authentication** - Multi-method authentication with JWT, PAM, Unix, and API keys

### Generation Capabilities
- **Text-to-Image** - Generate images from text prompts
- **Image-to-Image** - Transform existing images with text guidance
- **ControlNet** - Precise control over output composition
- **Inpainting** - Edit specific regions of images
- **Upscaling** - Enhance image resolution with ESRGAN models
- **Batch Processing** - Generate multiple images in parallel

### Advanced Features
- **Real-time Progress Tracking** - WebSocket-like progress updates
- **Image Processing** - Built-in resize, crop, and format conversion
- **Thumbnail Generation** - Automatic thumbnail creation for galleries
- **Model Conversion** - Convert models between quantization formats
- **System Monitoring** - Comprehensive status and performance metrics
- **Flexible Authentication** - Optional or required auth with multiple methods

## Table of Contents
- [Project Overview](#project-overview)
- [Web UI Features](#web-ui-features)
- [Architecture](#architecture)
- [Model Detection and Architecture Support](#model-detection-and-architecture-support)
- [Model Architecture Requirements](#model-architecture-requirements)
- [Context Creation Methods per Architecture](#context-creation-methods-per-architecture)
- [Model Quantization and Conversion](#model-quantization-and-conversion)
- [Technical Requirements](#technical-requirements)
- [Model Types and File Extensions](#model-types-and-file-extensions)
- [API Endpoints](#api-endpoints)
- [Authentication System](#authentication-system)
- [Build Instructions](#build-instructions)
- [Usage Examples](#usage-examples)
- [Development Status](#development-status)

## Project Overview

The stable-diffusion.cpp-rest project aims to create a high-performance REST API server that wraps the functionality of the stable-diffusion.cpp library. This enables developers to integrate Stable Diffusion image generation capabilities into their applications through standard HTTP requests, rather than directly using the C++ library.

### Objectives

- Provide a simple, RESTful interface for Stable Diffusion image generation
- Support all parameters available in examples/cli/main.cpp
- Implement efficient resource management with a generation queue system
- Support multiple model types with automatic detection and loading
- Ensure thread-safe operation with separate HTTP server and generation threads

## Web UI Features

A modern, responsive web interface is included and automatically built with the server! Built with Next.js 16, React 19, and Tailwind CSS 4.

### ✨ Features
- **Multiple Generation Types**
  - Text-to-Image with comprehensive parameter controls
  - Image-to-Image with strength adjustment
  - ControlNet with multiple control modes
  - Inpainting with interactive mask editor
  - Upscaling with various ESRGAN models

### 📊 Real-time Monitoring
- **Job Queue Management** - Real-time queue status and progress tracking
- **Generation Progress** - Live progress updates with time estimates
- **System Status** - Server performance and resource monitoring
- **Model Management** - Load/unload models with dependency checking

### 🎨 User Experience
- **Responsive Design** - Works on desktop, tablet, and mobile
- **Light/Dark Themes** - Automatic theme detection and manual toggle
- **Interactive Controls** - Intuitive parameter adjustments
- **Image Gallery** - Thumbnail generation and batch download
- **Authentication** - Secure login with multiple auth methods

### ⚡ Advanced Functionality
- **Image Processing** - Built-in resize, crop, and format conversion
- **Batch Operations** - Generate multiple images simultaneously
- **Model Compatibility** - Smart model detection and requirement checking
- **URL Downloads** - Import images from URLs for img2img and inpainting
- **CORS Support** - Seamless integration with web applications

### 🔧 Technical Features
- **Static Export** - Optimized for production deployment
- **Caching** - Intelligent asset caching for performance
- **Error Handling** - Comprehensive error reporting and recovery
- **WebSocket-like Updates** - Real-time progress without WebSockets
- **Image Download** - Direct file downloads with proper headers

### 🎯 Quick Start
```bash
# Build (automatically builds web UI)
mkdir build && cd build
cmake ..
cmake --build .

# Run server with web UI
./src/stable-diffusion-rest-server --models-dir /path/to/models --ui-dir ../build/webui

# Access web UI
open http://localhost:8080/ui/
```

### 📁 Web UI Structure
```
webui/
├── app/                     # Next.js app directory
│   ├── components/          # React components
│   ├── lib/                # Utilities and API clients
│   └── globals.css         # Global styles
├── public/                 # Static assets
├── package.json           # Dependencies
├── next.config.ts         # Next.js configuration
└── tsconfig.json          # TypeScript configuration
```

### 🚀 Built-in Components
- **Generation Forms** - Specialized forms for each generation type
- **Model Browser** - Interactive model selection with metadata
- **Progress Indicators** - Visual progress bars and status updates
- **Image Preview** - Thumbnail generation and full-size viewing
- **Settings Panel** - Configuration and preference management

### 🎨 Styling System
- **Tailwind CSS 4** - Utility-first CSS framework
- **Radix UI Components** - Accessible, unstyled components
- **Lucide React Icons** - Beautiful icon system
- **Custom CSS Variables** - Theme-aware design tokens
- **Responsive Grid** - Mobile-first responsive layout

## Architecture

The project is designed with a modular architecture consisting of three main components:

### HTTP Server
- Handles incoming HTTP requests
- Parses request parameters and validates input
- Returns generated images or error responses
- Operates independently of the generation process

### Generation Queue
- Manages image generation requests
- Processes jobs sequentially (one at a time)
- Maintains thread-safe operations
- Provides job status tracking

### Model Manager
- Handles loading and management of different model types
- Supports automatic model detection from default folders
- Automatically detects diffusion model architecture and selects appropriate loading method
- Supports traditional SD models (SD 1.5, SD 2.1, SDXL Base/Refiner) using ctxParams.model_path
- Supports modern architectures (Flux Schnell/Dev/Chroma, SD3, Qwen2VL) using ctxParams.diffusion_model_path
- Includes fallback mechanisms for unknown architectures
- Applies optimal parameters based on detected model type
- Manages model lifecycle and memory usage
- Provides type-based model organization

```
┌─────────────────┐    ┌─────────────────┐    ┌─────────────────┐
│   HTTP Server   │───▶│ Generation Queue│───▶│  Model Manager  │
│                 │    │                 │    │                 │
│ - Request Parse │    │ - Job Queue     │    │ - Model Loading │
│ - Response      │    │ - Sequential    │    │ - Type Detection│
│   Formatting    │    │   Processing    │    │ - Memory Mgmt   │
└─────────────────┘    └─────────────────┘    └─────────────────┘
```

## Technical Requirements

### Core Technologies
- **C++17** or later
- **CMake** 3.15 or later
- **Threading** support (std::thread, std::mutex, std::condition_variable)
- **CUDA** support (optional but recommended for performance)

### Dependencies
- **stable-diffusion.cpp** library (automatically downloaded via CMake)
- **HTTP server library** (to be determined based on requirements)
- **JSON library** for request/response handling
- **Build tools** compatible with CMake

### Platform Support
- Linux (primary development platform)
- Windows (planned support)
- macOS (planned support)

## Project Structure

```
stable-diffusion.cpp-rest/
├── CMakeLists.txt              # Main CMake configuration
├── README.md                   # This file
├── src/                        # Source code directory
│   ├── main.cpp                # Application entry point
│   ├── http/                   # HTTP server implementation
│   │   ├── server.h/.cpp       # HTTP server class
│   │   ├── handlers.h/.cpp     # Request handlers
│   │   └── responses.h/.cpp    # Response formatting
│   ├── generation/             # Generation queue implementation
│   │   ├── queue.h/.cpp        # Job queue management
│   │   ├── worker.h/.cpp       # Generation worker thread
│   │   └── job.h/.cpp          # Job definition and status
│   ├── models/                 # Model manager implementation
│   │   ├── manager.h/.cpp      # Model manager class
│   │   ├── loader.h/.cpp       # Model loading logic
│   │   └── types.h/.cpp        # Model type definitions
│   └── utils/                  # Utility functions
│       ├── config.h/.cpp       # Configuration management
│       └── logger.h/.cpp       # Logging utilities
├── include/                    # Public header files
├── external/                   # External dependencies (managed by CMake)
├── models/                     # Default model storage directory
│   ├── lora/                   # LoRA models
│   ├── checkpoints/            # Checkpoint models
│   ├── vae/                    # VAE models
│   ├── presets/                # Preset files
│   ├── prompts/                # Prompt templates
│   ├── neg_prompts/            # Negative prompt templates
│   ├── taesd/                  # TAESD models
│   ├── esrgan/                 # ESRGAN models
│   ├── controlnet/             # ControlNet models
│   ├── upscaler/               # Upscaler models
│   └── embeddings/             # Textual embeddings
├── tests/                      # Unit and integration tests
├── examples/                   # Usage examples
└── docs/                       # Additional documentation
```

## Model Detection and Architecture Support

The stable-diffusion.cpp-rest server includes intelligent model detection that automatically identifies the model architecture and selects the appropriate loading method. This ensures compatibility with both traditional Stable Diffusion models and modern architectures.

### Supported Architectures

#### Traditional Stable Diffusion Models
- **SD 1.5** - Stable Diffusion version 1.5 models
- **SD 2.1** - Stable Diffusion version 2.1 models
- **SDXL Base** - Stable Diffusion XL base models
- **SDXL Refiner** - Stable Diffusion XL refiner models

These models are loaded using the traditional `ctxParams.model_path` parameter.

#### Modern Architectures
- **Flux Schnell** - Fast Flux variant
- **Flux Dev** - Development version of Flux
- **Flux Chroma** - Chroma-optimized Flux
- **SD 3** - Stable Diffusion 3 models
- **Qwen2VL** - Qwen2 Vision-Language models

These models are loaded using the modern `ctxParams.diffusion_model_path` parameter.

### Detection Process

1. **Model Analysis**: When loading a model, the system analyzes the model file structure and metadata
2. **Architecture Identification**: The model architecture is identified based on key signatures in the model
3. **Loading Method Selection**: The appropriate loading method is automatically selected:
   - Traditional models → `ctxParams.model_path`
   - Modern architectures → `ctxParams.diffusion_model_path`
4. **Fallback Handling**: Unknown architectures default to traditional loading for backward compatibility
5. **Error Recovery**: If loading with the detected method fails, the system attempts fallback loading

### Benefits

- **Automatic Compatibility**: No need to manually specify model type
- **Optimal Loading**: Each architecture uses its optimal loading parameters
- **Future-Proof**: Easy to add support for new architectures
- **Backward Compatible**: Existing models continue to work without changes

## Model Architecture Requirements

> **Note:** The following tables contain extensive information and may require horizontal scrolling to view all columns properly.

| Architecture | Extra VAE | Standalone High Noise | T5XXL | CLIP-Vision | CLIP-G | CLIP-L | Model Files | Example Commands |
|--------------|-----------|----------------------|-------|-------------|--------|--------|-------------|------------------|
| SD 1.x | No | No | No | No | No | No | sd-v1-4.ckpt, v1-5-pruned-emaonly.safetensors | `./bin/sd -m ../models/sd-v1-4.ckpt -p "a lovely cat"` |
| SD 2.x | No | No | No | No | No | No | (Similar to SD 1.x) | `./bin/sd -m ../models/sd2-model.ckpt -p "a lovely cat"` |
| SDXL | Yes | No | No | No | No | No | sd_xl_base_1.0.safetensors, sdxl_vae-fp16-fix.safetensors | `./bin/sd -m ../models/sd_xl_base_1.0.safetensors --vae ../models/sdxl_vae-fp16-fix.safetensors -H 1024 -W 1024 -p "a lovely cat" -v` |
| SD3 | No | No | Yes | No | No | No | sd3_medium_incl_clips_t5xxlfp16.safetensors | `./bin/sd -m ../models/sd3_medium_incl_clips_t5xxlfp16.safetensors -H 1024 -W 1024 -p 'a lovely cat holding a sign says "Stable Diffusion CPP"' --cfg-scale 4.5 --sampling-method euler -v --clip-on-cpu` |
| SD3.5 Large | No | No | Yes | No | Yes | Yes | sd3.5_large.safetensors, clip_l.safetensors, clip_g.safetensors, t5xxl_fp16.safetensors | `./bin/sd -m ../models/sd3.5_large.safetensors --clip_l ../models/clip_l.safetensors --clip_g ../models/clip_g.safetensors --t5xxl ../models/t5xxl_fp16.safetensors -H 1024 -W 1024 -p 'a lovely cat holding a sign says "Stable diffusion 3.5 Large"' --cfg-scale 4.5 --sampling-method euler -v --clip-on-cpu` |
| FLUX Models | Yes | No | Yes | No | No | Yes | flux1-dev-q3_k.gguf, flux1-dev-q8_0.gguf, flux1-schnell-q8_0.gguf, ae.sft, clip_l.safetensors, t5xxl_fp16.safetensors | `./bin/sd --diffusion-model ../models/flux1-dev-q8_0.gguf --vae ../models/ae.sft --clip_l ../models/clip_l.safetensors --t5xxl ../models/t5xxl_fp16.safetensors -p "a lovely cat holding a sign says 'flux.cpp'" --cfg-scale 1.0 --sampling-method euler -v --clip-on-cpu` |
| Kontext | Yes | No | Yes | No | No | Yes | flux1-kontext-dev-q8_0.gguf, ae.sft, clip_l.safetensors, t5xxl_fp16.safetensors | `./bin/sd -r ./flux1-dev-q8_0.png --diffusion-model ../models/flux1-kontext-dev-q8_0.gguf --vae ../models/ae.sft --clip_l ../models/clip_l.safetensors --t5xxl ../models/t5xxl_fp16.safetensors -p "change 'flux.cpp' to 'kontext.cpp'" --cfg-scale 1.0 --sampling-method euler -v --clip-on-cpu` |
| Chroma | Yes | No | Yes | No | No | No | chroma-unlocked-v40-q8_0.gguf, ae.sft, t5xxl_fp16.safetensors | `./bin/sd --diffusion-model ../models/chroma-unlocked-v40-q8_0.gguf --vae ../models/ae.sft --t5xxl ../models/t5xxl_fp16.safetensors -p "a lovely cat holding a sign says 'chroma.cpp'" --cfg-scale 4.0 --sampling-method euler -v --chroma-disable-dit-mask --clip-on-cpu` |
| Wan Models | Yes | No | Yes | Yes | No | No | wan2.1_t2v_1.3B_fp16.safetensors, wan_2.1_vae.safetensors, umt5-xxl-encoder-Q8_0.gguf | `./bin/sd -M vid_gen --diffusion-model ../models/wan2.1_t2v_1.3B_fp16.safetensors --vae ../models/wan_2.1_vae.safetensors --t5xxl ../models/umt5-xxl-encoder-Q8_0.gguf -p "a lovely cat" --cfg-scale 6.0 --sampling-method euler -v -W 832 -H 480 --diffusion-fa --video-frames 33 --flow-shift 3.0` |
| Wan2.2 T2V A14B | No | Yes | No | No | No | No | Wan2.2-T2V-A14B-LowNoise-Q8_0.gguf, Wan2.2-T2V-A14B-HighNoise-Q8_0.gguf | `./bin/sd -M vid_gen --diffusion-model ../models/Wan2.2-T2V-A14B-LowNoise-Q8_0.gguf -p "a lovely cat" --cfg-scale 6.0 --sampling-method euler -v -W 832 -H 480 --diffusion-fa --video-frames 33 --flow-shift 3.0` |
| Wan2.2 I2V A14B | No | Yes | No | No | No | No | Wan2.2-I2V-A14B-LowNoise-Q8_0.gguf, Wan2.2-I2V-A14B-HighNoise-Q8_0.gguf | `./bin/sd -M vid_gen --diffusion-model ../models/Wan2.2-I2V-A14B-LowNoise-Q8_0.gguf -r input.png -p "a lovely cat" --cfg-scale 6.0 --sampling-method euler -v -W 832 -H 480 --diffusion-fa --video-frames 33 --flow-shift 3.0` |
| Qwen Image Models | Yes | No | No | No | No | No | qwen-image-Q8_0.gguf, qwen_image_vae.safetensors, Qwen2.5-VL-7B-Instruct-Q8_0.gguf | `./bin/sd --diffusion-model ../models/qwen-image-Q8_0.gguf --vae ../models/qwen_image_vae.safetensors --qwen2vl ../models/Qwen2.5-VL-7B-Instruct-Q8_0.gguf -p '一个穿着"QWEN"标志的T恤的中国美女正拿着黑色的马克笔面相镜头微笑。' --cfg-scale 2.5 --sampling-method euler -v --offload-to-cpu -H 1024 -W 1024 --diffusion-fa --flow-shift 3` |
| Qwen Image Edit | Yes | No | No | No | No | No | Qwen_Image_Edit-Q8_0.gguf, qwen_image_vae.safetensors, qwen_2.5_vl_7b.safetensors | `./bin/sd --diffusion-model ../models/Qwen_Image_Edit-Q8_0.gguf --vae ../models/qwen_image_vae.safetensors --qwen2vl ../models/qwen_2.5_vl_7b.safetensors --cfg-scale 2.5 --sampling-method euler -v --offload-to-cpu --diffusion-fa --flow-shift 3 -r ../assets/flux/flux1-dev-q8_0.png -p "change 'flux.cpp' to 'edit.cpp'"` |
| PhotoMaker | No | No | No | No | No | No | sdxlUnstableDiffusers_v11.safetensors, sdxl_vae.safetensors, photomaker-v1.safetensors | `./bin/sd -m ../models/sdxlUnstableDiffusers_v11.safetensors --vae ../models/sdxl_vae.safetensors --photo-maker ../models/photomaker-v1.safetensors --pm-id-images-dir ../assets/photomaker_examples/scarletthead_woman -p "a girl img, retro futurism" --cfg-scale 5.0 --sampling-method euler -H 1024 -W 1024 --pm-style-strength 10 --vae-on-cpu --steps 50` |
| LCM | No | No | No | No | No | No | lcm-lora-sdv1-5 | `./bin/sd -m ../models/v1-5-pruned-emaonly.safetensors -p "a lovely cat<lora:lcm-lora-sdv1-5:1>" --steps 4 --lora-model-dir ../models -v --cfg-scale 1` |
| SSD1B | No | No | No | No | No | No | (Various SSD-1B models) | `./bin/sd -m ../models/ssd-1b.safetensors -p "a lovely cat"` |
| Tiny SD | No | No | No | No | No | No | (Various Tiny SD models) | `./bin/sd -m ../models/tiny-sd.safetensors -p "a lovely cat"` |

## Context Creation Methods per Architecture

| Architecture | Context Creation Method | Special Parameters | Model Files | Example Commands |
|--------------|------------------------|-------------------|-------------|------------------|
| SD 1.x, SD 2.x, SDXL | Standard prompt-based generation | --cfg-scale, --sampling-method, --steps | sd-v1-4.ckpt, v1-5-pruned-emaonly.safetensors, sd_xl_base_1.0.safetensors | `./bin/sd -m ../models/sd-v1-4.ckpt -p "a lovely cat"` |
| SD3 | Multiple text encoders | --clip-on-cpu recommended | sd3_medium_incl_clips_t5xxlfp16.safetensors | `./bin/sd -m ../models/sd3_medium_incl_clips_t5xxlfp16.safetensors -H 1024 -W 1024 -p 'a lovely cat holding a sign says "Stable Diffusion CPP"' --cfg-scale 4.5 --sampling-method euler -v --clip-on-cpu` |
| SD3.5 Large | Multiple text encoders | --clip-on-cpu recommended | sd3.5_large.safetensors, clip_l.safetensors, clip_g.safetensors, t5xxl_fp16.safetensors | `./bin/sd -m ../models/sd3.5_large.safetensors --clip_l ../models/clip_l.safetensors --clip_g ../models/clip_g.safetensors --t5xxl ../models/t5xxl_fp16.safetensors -H 1024 -W 1024 -p 'a lovely cat holding a sign says "Stable diffusion 3.5 Large"' --cfg-scale 4.5 --sampling-method euler -v --clip-on-cpu` |
| FLUX Models | Text-to-image generation | --cfg-scale 1.0 recommended, --clip-on-cpu for memory efficiency | flux1-dev-q3_k.gguf, flux1-dev-q8_0.gguf, flux1-schnell-q8_0.gguf, ae.sft, clip_l.safetensors, t5xxl_fp16.safetensors | `./bin/sd --diffusion-model ../models/flux1-dev-q8_0.gguf --vae ../models/ae.sft --clip_l ../models/clip_l.safetensors --t5xxl ../models/t5xxl_fp16.safetensors -p "a lovely cat holding a sign says 'flux.cpp'" --cfg-scale 1.0 --sampling-method euler -v --clip-on-cpu` |
| Kontext | Image-to-image transformation | -r for reference image, --cfg-scale 1.0 recommended | flux1-kontext-dev-q8_0.gguf, ae.sft, clip_l.safetensors, t5xxl_fp16.safetensors | `./bin/sd -r ./flux1-dev-q8_0.png --diffusion-model ../models/flux1-kontext-dev-q8_0.gguf --vae ../models/ae.sft --clip_l ../models/clip_l.safetensors --t5xxl ../models/t5xxl_fp16.safetensors -p "change 'flux.cpp' to 'kontext.cpp'" --cfg-scale 1.0 --sampling-method euler -v --clip-on-cpu` |
| Chroma | Text-to-image generation | --cfg-scale 1.0 recommended, --clip-on-cpu for memory efficiency | chroma-unlocked-v40-q8_0.gguf, ae.sft, t5xxl_fp16.safetensors | `./bin/sd --diffusion-model ../models/chroma-unlocked-v40-q8_0.gguf --vae ../models/ae.sft --t5xxl ../models/t5xxl_fp16.safetensors -p "a lovely cat holding a sign says 'chroma.cpp'" --cfg-scale 4.0 --sampling-method euler -v --chroma-disable-dit-mask --clip-on-cpu` |
| Chroma1-Radiance | Text-to-image generation | --cfg-scale 4.0 recommended | Chroma1-Radiance-v0.4-Q8_0.gguf, t5xxl_fp16.safetensors | `./bin/sd --diffusion-model ../models/Chroma1-Radiance-v0.4-Q8_0.gguf --t5xxl ../models/clip/t5xxl_fp16.safetensors -p "a lovely cat holding a sign says 'chroma radiance cpp'" --cfg-scale 4.0 --sampling-method euler -v` |
| Wan Models | Video generation with text prompts | -M vid_gen, --video-frames, --flow-shift, --diffusion-fa | wan2.1_t2v_1.3B_fp16.safetensors, wan_2.1_vae.safetensors, umt5-xxl-encoder-Q8_0.gguf | `./bin/sd -M vid_gen --diffusion-model ../models/wan2.1_t2v_1.3B_fp16.safetensors --vae ../models/wan_2.1_vae.safetensors --t5xxl ../models/umt5-xxl-encoder-Q8_0.gguf -p "a lovely cat" --cfg-scale 6.0 --sampling-method euler -v -W 832 -H 480 --diffusion-fa --video-frames 33 --flow-shift 3.0` |
| Wan2.1 I2V Models | Image-to-video generation | Requires clip_vision_h.safetensors | wan2.1-i2v-14b-480p-Q8_0.gguf, wan_2.1_vae.safetensors, clip_vision_h.safetensors | `./bin/sd -M vid_gen --diffusion-model ../models/wan2.1-i2v-14b-480p-Q8_0.gguf --vae ../models/wan_2.1_vae.safetensors -r input.png -p "a lovely cat" --cfg-scale 6.0 --sampling-method euler -v -W 832 -H 480 --diffusion-fa --video-frames 33 --flow-shift 3.0` |
| Wan2.1 FLF2V Models | Flow-to-video generation | Requires clip_vision_h.safetensors | wan2.1-flf2v-14b-720p-Q8_0.gguf, wan_2.1_vae.safetensors, clip_vision_h.safetensors | `./bin/sd -M vid_gen --diffusion-model ../models/wan2.1-flf2v-14b-720p-Q8_0.gguf --vae ../models/wan_2.1_vae.safetensors -r flow.png -p "a lovely cat" --cfg-scale 6.0 --sampling-method euler -v -W 832 -H 480 --diffusion-fa --video-frames 33 --flow-shift 3.0` |
| Wan2.2 T2V A14B | Text-to-video generation | Uses dual diffusion models | Wan2.2-T2V-A14B-LowNoise-Q8_0.gguf, Wan2.2-T2V-A14B-HighNoise-Q8_0.gguf | `./bin/sd -M vid_gen --diffusion-model ../models/Wan2.2-T2V-A14B-LowNoise-Q8_0.gguf -p "a lovely cat" --cfg-scale 6.0 --sampling-method euler -v -W 832 -H 480 --diffusion-fa --video-frames 33 --flow-shift 3.0` |
| Wan2.2 I2V A14B | Image-to-video generation | Uses dual diffusion models | Wan2.2-I2V-A14B-LowNoise-Q8_0.gguf, Wan2.2-I2V-A14B-HighNoise-Q8_0.gguf | `./bin/sd -M vid_gen --diffusion-model ../models/Wan2.2-I2V-A14B-LowNoise-Q8_0.gguf -r input.png -p "a lovely cat" --cfg-scale 6.0 --sampling-method euler -v -W 832 -H 480 --diffusion-fa --video-frames 33 --flow-shift 3.0` |
| Qwen Image Models | Text-to-image generation with Chinese language support | --qwen2vl for the language model, --diffusion-fa, --flow-shift | qwen-image-Q8_0.gguf, qwen_image_vae.safetensors, Qwen2.5-VL-7B-Instruct-Q8_0.gguf | `./bin/sd --diffusion-model ../models/qwen-image-Q8_0.gguf --vae ../models/qwen_image_vae.safetensors --qwen2vl ../models/Qwen2.5-VL-7B-Instruct-Q8_0.gguf -p '一个穿着"QWEN"标志的T恤的中国美女正拿着黑色的马克笔面相镜头微笑。' --cfg-scale 2.5 --sampling-method euler -v --offload-to-cpu -H 1024 -W 1024 --diffusion-fa --flow-shift 3` |
| Qwen Image Edit | Image editing with reference image | -r for reference image, --qwen2vl_vision for vision model | Qwen_Image_Edit-Q8_0.gguf, qwen_image_vae.safetensors, qwen_2.5_vl_7b.safetensors | `./bin/sd --diffusion-model ../models/Qwen_Image_Edit-Q8_0.gguf --vae ../models/qwen_image_vae.safetensors --qwen2vl ../models/qwen_2.5_vl_7b.safetensors --cfg-scale 2.5 --sampling-method euler -v --offload-to-cpu --diffusion-fa --flow-shift 3 -r ../assets/flux/flux1-dev-q8_0.png -p "change 'flux.cpp' to 'edit.cpp'"` |
| PhotoMaker | Personalized image generation with ID images | --photo-maker, --pm-id-images-dir, --pm-style-strength | sdxlUnstableDiffusers_v11.safetensors, sdxl_vae.safetensors, photomaker-v1.safetensors | `./bin/sd -m ../models/sdxlUnstableDiffusers_v11.safetensors --vae ../models/sdxl_vae.safetensors --photo-maker ../models/photomaker-v1.safetensors --pm-id-images-dir ../assets/photomaker_examples/scarletthead_woman -p "a girl img, retro futurism" --cfg-scale 5.0 --sampling-method euler -H 1024 -W 1024 --pm-style-strength 10 --vae-on-cpu --steps 50` |
| LCM | Fast generation with LoRA | --cfg-scale 1.0, --steps 2-8, --sampling-method lcm/euler_a | lcm-lora-sdv1-5 | `./bin/sd -m ../models/v1-5-pruned-emaonly.safetensors -p "a lovely cat<lora:lcm-lora-sdv1-5:1>" --steps 4 --lora-model-dir ../models -v --cfg-scale 1` |
| SSD1B | Standard prompt-based generation | Standard SD parameters | (Various SSD-1B models) | `./bin/sd -m ../models/ssd-1b.safetensors -p "a lovely cat"` |
| Tiny SD | Standard prompt-based generation | Standard SD parameters | (Various Tiny SD models) | `./bin/sd -m ../models/tiny-sd.safetensors -p "a lovely cat"` |

## Model Quantization and Conversion

### Quantization Levels Supported

The stable-diffusion.cpp library supports various quantization levels to balance model size and performance:

| Quantization Level | Description | Model Size Reduction | Quality Impact |
|-------------------|-------------|----------------------|----------------|
| `f32` | 32-bit floating-point | None (original) | No quality loss |
| `f16` | 16-bit floating-point | ~50% | Minimal quality loss |
| `q8_0` | 8-bit integer quantization | ~75% | Slight quality loss |
| `q5_0`, `q5_1` | 5-bit integer quantization | ~80% | Moderate quality loss |
| `q4_0`, `q4_1` | 4-bit integer quantization | ~85% | Noticeable quality loss |
| `q3_k` | 3-bit K-quantization | ~87% | Significant quality loss |
| `q4_k` | 4-bit K-quantization | ~85% | Good balance of size/quality |
| `q2_k` | 2-bit K-quantization | ~90% | Major quality loss |
| `Q4_K_S` | 4-bit K-quantization Small | ~85% | Optimized for smaller models |

### Model Conversion Commands

To convert models from their original format to quantized GGUF format, use the following commands:

#### Stable Diffusion Models
```bash
# Convert SD 1.5 model to 8-bit quantization
./bin/sd -M convert -m ../models/v1-5-pruned-emaonly.safetensors -o ../models/v1-5-pruned-emaonly.q8_0.gguf -v --type q8_0

# Convert SDXL model to 4-bit quantization
./bin/sd -M convert -m ../models/sd_xl_base_1.0.safetensors -o ../models/sd_xl_base_1.0.q4_0.gguf -v --type q4_0
```

#### Flux Models
```bash
# Convert Flux Dev model to 8-bit quantization
./bin/sd -M convert -m ../models/flux1-dev.sft -o ../models/flux1-dev-q8_0.gguf -v --type q8_0

# Convert Flux Schnell model to 3-bit K-quantization
./bin/sd -M convert -m ../models/flux1-schnell.sft -o ../models/flux1-schnell-q3_k.gguf -v --type q3_k
```

#### Chroma Models
```bash
# Convert Chroma model to 8-bit quantization
./bin/sd -M convert -m ../models/chroma-unlocked-v40.safetensors -o ../models/chroma-unlocked-v40-q8_0.gguf -v --type q8_0
```

#### Kontext Models
```bash
# Convert Kontext model to 8-bit quantization
./bin/sd -M convert -m ../models/flux1-kontext-dev.safetensors -o ../models/flux1-kontext-dev-q8_0.gguf -v --type q8_0
```

### LoRA Models

The project supports LoRA (Low-Rank Adaptation) models for fine-tuning and style transfer:

| LoRA Model | Compatible Base Models | Example Usage |
|------------|----------------------|----------------|
| `marblesh.safetensors` | SD 1.5, SD 2.1 | `./bin/sd -m ../models/v1-5-pruned-emaonly.safetensors -p "a lovely cat<lora:marblesh:1>" --lora-model-dir ../models` |
| `lcm-lora-sdv1-5` | SD 1.5 | `./bin/sd -m ../models/v1-5-pruned-emaonly.safetensors -p "a lovely cat<lora:lcm-lora-sdv1-5:1>" --steps 4 --lora-model-dir ../models -v --cfg-scale 1` |
| `realism_lora_comfy_converted` | FLUX Models | `./bin/sd --diffusion-model ../models/flux1-dev-q8_0.gguf --vae ../models/ae.sft --clip_l ../models/clip_l.safetensors --t5xxl ../models/t5xxl_fp16.safetensors -p "a lovely cat holding a sign says 'flux.cpp'<lora:realism_lora_comfy_converted:1>" --cfg-scale 1.0 --sampling-method euler -v --lora-model-dir ../models --clip-on-cpu` |
| `wan2.2_t2v_lightx2v_4steps_lora_v1.1_low_noise` | Wan2.2 T2V Models | `./bin/sd -M vid_gen --diffusion-model ../models/Wan2.2-T2V-A14B-LowNoise-Q8_0.gguf --lora-model-dir ../models -p "a lovely cat<lora:wan2.2_t2v_lightx2v_4steps_lora_v1.1_low_noise:1>" --steps 4` |
| `wan2.2_t2v_lightx2v_4steps_lora_v1.1_high_noise` | Wan2.2 T2V Models | `./bin/sd -M vid_gen --diffusion-model ../models/Wan2.2-T2V-A14B-HighNoise-Q8_0.gguf --lora-model-dir ../models -p "a lovely cat<lora:wan2.2_t2v_lightx2v_4steps_lora_v1.1_high_noise:1>" --steps 4` |

### Additional Model Types

#### Upscaling (ESRGAN)
```bash
# Use ESRGAN for upscaling generated images
./bin/sd -m ../models/v1-5-pruned-emaonly.safetensors -p "a lovely cat" --upscale-model ../models/RealESRGAN_x4plus_anime_6B.pth
```

#### Fast Decoding (TAESD)
```bash
# Use TAESD for faster VAE decoding
./bin/sd -m ../models/v1-5-pruned-emaonly.safetensors -p "a lovely cat" --taesd ../models/diffusion_pytorch_model.safetensors
```

## Model Types and File Extensions

The project supports various model types, each with specific file extensions:

| Model Type | Enum Value | Description | Supported Extensions |
|------------|------------|-------------|----------------------|
| LORA | 1 | Low-Rank Adaptation models | .safetensors, .pt, .ckpt |
| CHECKPOINT | 2 | Main model checkpoints | .safetensors, .pt, .ckpt |
| VAE | 4 | Variational Autoencoder models | .safetensors, .pt, .ckpt |
| PRESETS | 8 | Generation preset files | .json, .yaml, .yml |
| PROMPTS | 16 | Prompt template files | .txt, .json |
| NEG_PROMPTS | 32 | Negative prompt templates | .txt, .json |
| TAESD | 64 | Tiny AutoEncoder for SD | .safetensors, .pt, .ckpt |
| ESRGAN | 128 | Super-resolution models | .pth, .pt |
| CONTROLNET | 256 | ControlNet models | .safetensors, .pt, .ckpt |
| UPSCALER | 512 | Image upscaler models | .pth, .pt |
| EMBEDDING | 1024 | Textual embeddings | .safetensors, .pt, .ckpt |

### Model Type Enum Definition

```cpp
enum ModelType {
    LORA = 1,
    CHECKPOINT = 2,
    VAE = 4,
    PRESETS = 8,
    PROMPTS = 16,
    NEG_PROMPTS = 32,
    TAESD = 64,
    ESRGAN = 128,
    CONTROLNET = 256,
    UPSCALER = 512,
    EMBEDDING = 1024
};
```

## API Endpoints

### Core Generation Endpoints

#### Text-to-Image Generation
```bash
POST /api/generate/text2img
```
Generate images from text prompts with comprehensive parameter support.

**Example Request:**
```json
{
    "prompt": "a beautiful landscape",
    "negative_prompt": "blurry, low quality", 
    "width": 1024,
    "height": 1024,
    "steps": 20,
    "cfg_scale": 7.5,
    "sampling_method": "euler",
    "scheduler": "karras",
    "seed": "random",
    "batch_count": 1,
    "vae_model": "optional_vae_name"
}
```

#### Image-to-Image Generation  
```bash
POST /api/generate/img2img
```
Transform existing images with text guidance.

**Example Request:**
```json
{
    "prompt": "transform into anime style",
    "init_image": "base64_encoded_image_or_url",
    "strength": 0.75,
    "width": 1024,
    "height": 1024,
    "steps": 20,
    "cfg_scale": 7.5
}
```

#### ControlNet Generation
```bash
POST /api/generate/controlnet
```
Apply precise control using ControlNet models.

**Example Request:**
```json
{
    "prompt": "a person standing",
    "control_image": "base64_encoded_control_image",
    "control_net_model": "canny",
    "control_strength": 0.9,
    "width": 512,
    "height": 512
}
```

#### Inpainting
```bash
POST /api/generate/inpainting
```
Edit specific regions of images using masks.

**Example Request:**
```json
{
    "prompt": "change hair color to blonde",
    "source_image": "base64_encoded_source_image",
    "mask_image": "base64_encoded_mask_image",
    "strength": 0.75,
    "width": 512,
    "height": 512
}
```

#### Upscaling
```bash
POST /api/generate/upscale
```
Enhance image resolution using ESRGAN models.

**Example Request:**
```json
{
    "image": "base64_encoded_image",
    "esrgan_model": "esrgan_model_name",
    "upscale_factor": 4
}
```

### Job Management

#### Job Status
```bash
GET /api/queue/job/{job_id}
```
Get detailed status and results for a specific job.

#### Queue Status
```bash
GET /api/queue/status
```
Get current queue state and active jobs.

#### Cancel Job
```bash
POST /api/queue/cancel
```
Cancel a pending or running job.

**Example Request:**
```json
{
    "job_id": "uuid-of-job-to-cancel"
}
```

#### Clear Queue
```bash
POST /api/queue/clear
```
Clear all pending jobs from the queue.

### Model Management

#### List Models
```bash
GET /api/models
```
List all available models with metadata and filtering options.

**Query Parameters:**
- `type` - Filter by model type (lora, checkpoint, vae, etc.)
- `search` - Search in model names and descriptions
- `sort_by` - Sort by name, size, date, type
- `sort_order` - asc or desc
- `page` - Page number for pagination
- `limit` - Items per page

#### Model Information
```bash
GET /api/models/{model_id}
```
Get detailed information about a specific model.

#### Load Model
```bash
POST /api/models/{model_id}/load
```
Load a model into memory.

#### Unload Model  
```bash
POST /api/models/{model_id}/unload
```
Unload a model from memory.

#### Model Types
```bash
GET /api/models/types
```
Get information about supported model types and their capabilities.

#### Model Directories
```bash
GET /api/models/directories
```
List and check status of model directories.

#### Refresh Models
```bash
POST /api/models/refresh
```
Rescan model directories and update cache.

#### Model Statistics
```bash
GET /api/models/stats
```
Get comprehensive statistics about models.

#### Batch Model Operations
```bash
POST /api/models/batch
```
Perform batch operations on multiple models.

**Example Request:**
```json
{
    "operation": "load",
    "models": ["model1", "model2", "model3"]
}
```

#### Model Validation
```bash
POST /api/models/validate
```
Validate model files and check compatibility.

#### Model Conversion
```bash
POST /api/models/convert
```
Convert models between quantization formats.

**Example Request:**
```json
{
    "model_name": "checkpoint_model_name",
    "quantization_type": "q8_0",
    "output_path": "/path/to/output.gguf"
}
```

#### Model Hashing
```bash
POST /api/models/hash
```
Generate SHA256 hashes for model verification.

### System Information

#### Server Status
```bash
GET /api/status
```
Get server status, queue information, and loaded models.

#### System Information
```bash
GET /api/system
```
Get detailed system information including hardware, capabilities, and limits.

#### Server Configuration
```bash
GET /api/config
```
Get current server configuration and limits.

#### Server Restart
```bash
POST /api/system/restart
```
Trigger graceful server restart.

### Authentication

#### Login
```bash
POST /api/auth/login
```
Authenticate user and receive access token.

**Example Request:**
```json
{
    "username": "admin",
    "password": "password123"
}
```

#### Token Validation
```bash
GET /api/auth/validate
```
Validate and check current token status.

#### Refresh Token
```bash
POST /api/auth/refresh
```
Refresh authentication token.

#### User Profile
```bash
GET /api/auth/me
```
Get current user information and permissions.

#### Logout
```bash
POST /api/auth/logout
```
Logout and invalidate token.

### Utility Endpoints

#### Samplers
```bash
GET /api/samplers
```
Get available sampling methods and their properties.

#### Schedulers  
```bash
GET /api/schedulers
```
Get available schedulers and their properties.

#### Parameters
```bash
GET /api/parameters
```
Get detailed parameter information and validation rules.

#### Validation
```bash
POST /api/validate
```
Validate generation parameters before submission.

#### Time Estimation
```bash
POST /api/estimate
```
Estimate generation time and memory usage.

#### Image Processing
```bash
POST /api/image/resize
POST /api/image/crop
```
Resize or crop images server-side.

#### Image Download
```bash
GET /api/image/download?url=image_url
```
Download and encode images from URLs.

### File Downloads

#### Job Output Files
```bash
GET /api/v1/jobs/{job_id}/output/{filename}
GET /api/queue/job/{job_id}/output/{filename}
```
Download generated images and output files.

#### Thumbnail Support
```bash
GET /api/v1/jobs/{job_id}/output/{filename}?thumb=1&size=200
```
Get thumbnails for faster web UI loading.

### Health Check

#### Basic Health
```bash
GET /api/health
```
Simple health check endpoint.

#### Version Information
```bash
GET /api/version
```
Get detailed version and build information.

### Public vs Protected Endpoints

**Public (No Authentication Required):**
- `/api/health` - Basic health check
- `/api/status` - Server status (read-only)
- `/api/version` - Version information
- Image download endpoints (for web UI display)

**Protected (Authentication Required):**
- All generation endpoints
- Model management (except listing)
- Job management cancellation
- System management operations
- Authentication profile access

### Example Request/Response

#### Generate Image Request
```json
POST /api/v1/generate
{
    "prompt": "a beautiful landscape",
    "negative_prompt": "blurry, low quality",
    "model": "sd-v1-5",
    "width": 512,
    "height": 512,
    "steps": 20,
    "cfg_scale": 7.5,
    "seed": -1,
    "batch_size": 1
}
```

#### Generate Image Response
```json
{
    "job_id": "uuid-string",
    "status": "completed",
    "images": [
        {
            "data": "base64-encoded-image-data",
            "seed": 12345,
            "parameters": {
                "prompt": "a beautiful landscape",
                "negative_prompt": "blurry, low quality",
                "model": "sd-v1-5",
                "width": 512,
                "height": 512,
                "steps": 20,
                "cfg_scale": 7.5,
                "seed": 12345,
                "batch_size": 1
            }
        }
    ],
    "generation_time": 3.2
}
```

## Authentication

The server supports multiple authentication methods to secure API access:

### Supported Authentication Methods

1. **No Authentication** (Default)
   - Open access to all endpoints
   - Suitable for development or trusted networks

2. **JWT Token Authentication**
   - JSON Web Tokens for stateless authentication
   - Configurable token expiration
   - Secure for production deployments

3. **API Key Authentication**
   - Static API keys for service-to-service communication
   - Simple integration for external applications

4. **PAM Authentication**
   - Integration with system authentication via PAM (Pluggable Authentication Modules)
   - Supports LDAP, Kerberos, and other PAM backends
   - Leverages existing system user accounts
   - See [PAM_AUTHENTICATION.md](PAM_AUTHENTICATION.md) for detailed setup

### Authentication Configuration

Authentication can be configured via command-line arguments or configuration files:

```bash
# Enable PAM authentication
./stable-diffusion-rest-server --auth pam --models-dir /path/to/models --checkpoints checkpoints

# Enable JWT authentication
./stable-diffusion-rest-server --auth jwt --models-dir /path/to/models --checkpoints checkpoints

# Enable API key authentication
./stable-diffusion-rest-server --auth api-key --models-dir /path/to/models --checkpoints checkpoints

# Enable Unix authentication
./stable-diffusion-rest-server --auth unix --models-dir /path/to/models --checkpoints checkpoints

# No authentication (default)
./stable-diffusion-rest-server --auth none --models-dir /path/to/models --checkpoints checkpoints
```

#### Authentication Methods

- `none` - No authentication required (default)
- `jwt` - JWT token authentication
- `api-key` - API key authentication
- `unix` - Unix system authentication
- `pam` - PAM authentication
- `optional` - Authentication optional (guest access allowed)

#### Deprecated Options

The following options are deprecated and will be removed in a future version:
- `--enable-unix-auth` - Use `--auth unix` instead
- `--enable-pam-auth` - Use `--auth pam` instead

### Authentication Endpoints

- `POST /api/v1/auth/login` - Authenticate with username/password (PAM/JWT)
- `POST /api/v1/auth/refresh` - Refresh JWT token
- `GET /api/v1/auth/profile` - Get current user profile
- `POST /api/v1/auth/logout` - Logout/invalidate token

For detailed authentication setup instructions, see [PAM_AUTHENTICATION.md](PAM_AUTHENTICATION.md).

## Build Instructions

### Prerequisites

1. **CMake** 3.15 or later
2. **C++17** compatible compiler
3. **Git** for cloning dependencies
4. **CUDA Toolkit** (optional but recommended)

### Build Steps

1. Clone the repository:
```bash
git clone https://github.com/your-username/stable-diffusion.cpp-rest.git
cd stable-diffusion.cpp-rest
```

2. Create a build directory:
```bash
mkdir build
cd build
```

3. Configure with CMake:
```bash
cmake ..
```

4. Build the project:
```bash
cmake --build . --parallel
```

5. (Optional) Install the binary:
```bash
cmake --install .
```

### CMake Configuration

The project uses CMake's external project feature to automatically download and build the stable-diffusion.cpp library:

```cmake
include(ExternalProject)

ExternalProject_Add(
    stable-diffusion.cpp
    GIT_REPOSITORY https://github.com/leejet/stable-diffusion.cpp.git
    GIT_TAG master-334-d05e46c
    SOURCE_DIR "${CMAKE_CURRENT_BINARY_DIR}/stable-diffusion.cpp-src"
    BINARY_DIR "${CMAKE_CURRENT_BINARY_DIR}/stable-diffusion.cpp-build"
    CMAKE_ARGS -DCMAKE_INSTALL_PREFIX=${CMAKE_CURRENT_BINARY_DIR}/stable-diffusion.cpp-install
    INSTALL_COMMAND ""
)
```

### PAM Authentication Build Options

PAM authentication support is enabled by default when PAM libraries are available. You can control this with CMake options:

```bash
# Build with PAM support (default when available)
cmake -DENABLE_PAM_AUTH=ON ..

# Build without PAM support
cmake -DENABLE_PAM_AUTH=OFF ..

# Check if PAM support will be built
cmake -LA | grep ENABLE_PAM_AUTH
```

**Note:** PAM authentication requires the PAM development libraries:
```bash
# Ubuntu/Debian
sudo apt-get install libpam0g-dev

# CentOS/RHEL/Fedora
sudo yum install pam-devel
```

## Usage Examples

### Starting the Server

```bash
# Basic usage
./stable-diffusion.cpp-rest

# With custom configuration
./stable-diffusion.cpp-rest --config config.json

# With custom model directory
./stable-diffusion.cpp-rest --model-dir /path/to/models
```

### Client Examples

#### Python with requests
```python
import requests
import base64
import json

# Generate an image
response = requests.post('http://localhost:8080/api/v1/generate', json={
    'prompt': 'a beautiful landscape',
    'width': 512,
    'height': 512,
    'steps': 20
})

result = response.json()
if result['status'] == 'completed':
    # Decode and save the first image
    image_data = base64.b64decode(result['images'][0]['data'])
    with open('generated_image.png', 'wb') as f:
        f.write(image_data)
```

#### JavaScript with fetch
```javascript
async function generateImage() {
    const response = await fetch('http://localhost:8080/api/v1/generate', {
        method: 'POST',
        headers: {
            'Content-Type': 'application/json'
        },
        body: JSON.stringify({
            prompt: 'a beautiful landscape',
            width: 512,
            height: 512,
            steps: 20
        })
    });

    const result = await response.json();
    if (result.status === 'completed') {
        // Create an image element with the generated image
        const img = document.createElement('img');
        img.src = `data:image/png;base64,${result.images[0].data}`;
        document.body.appendChild(img);
    }
}
```

#### cURL
```bash
# Generate an image
curl -X POST http://localhost:8080/api/v1/generate \
  -H "Content-Type: application/json" \
  -d '{
    "prompt": "a beautiful landscape",
    "width": 512,
    "height": 512,
    "steps": 20
  }'

# Check job status
curl http://localhost:8080/api/v1/generate/{job_id}

# List available models
curl http://localhost:8080/api/v1/models
```

## Development Status

### ✅ **Completed Features (Production Ready)**

#### Core System
- **✅ REST API Server** - Full HTTP server with comprehensive error handling
- **✅ Generation Queue** - Thread-safe job queue with status tracking
- **✅ Model Manager** - Intelligent model detection and management
- **✅ Model Detection** - Support for 15+ model architectures
- **✅ Authentication System** - JWT, PAM, Unix, API key methods
- **✅ Progress Tracking** - Real-time progress updates for all generation types

#### Generation Capabilities
- **✅ Text-to-Image** - Full parameter support with all stable-diffusion.cpp options
- **✅ Image-to-Image** - Transform images with strength control
- **✅ ControlNet** - Multiple control modes and models
- **✅ Inpainting** - Interactive mask editing with source and mask images
- **✅ Upscaling** - ESRGAN model support with various scaling factors
- **✅ Batch Processing** - Generate multiple images simultaneously

#### Model Management
- **✅ Model Types** - Checkpoint, LoRA, VAE, ControlNet, ESRGAN, Embeddings, TAESD
- **✅ Model Validation** - File validation and compatibility checking
- **✅ Model Conversion** - Convert between quantization formats
- **✅ Model Hashing** - SHA256 generation for verification
- **✅ Dependency Checking** - Automatic dependency detection for architectures
- **✅ Batch Operations** - Load/unload multiple models simultaneously

#### Web UI
- **✅ Modern Interface** - Next.js 16 with React 19 and TypeScript
- **✅ Responsive Design** - Mobile-first with Tailwind CSS 4
- **✅ Real-time Updates** - Live progress and queue monitoring
- **✅ Interactive Forms** - Specialized forms for each generation type
- **✅ Theme Support** - Light/dark themes with auto-detection
- **✅ Image Processing** - Built-in resize, crop, and format conversion
- **✅ File Downloads** - Direct downloads with thumbnail support
- **✅ Authentication** - Secure login with multiple auth methods

#### API Features
- **✅ Comprehensive Endpoints** - 40+ API endpoints covering all functionality
- **✅ Parameter Validation** - Request validation with detailed error messages
- **✅ File Handling** - Upload/download images with base64 and URL support
- **✅ Error Handling** - Structured error responses with proper HTTP codes
- **✅ CORS Support** - Proper CORS headers for web integration
- **✅ Request Tracking** - Unique request IDs for debugging

#### Advanced Features
- **✅ System Monitoring** - Server status, system info, and performance metrics
- **✅ Configuration Management** - Flexible command-line and file configuration
- **✅ Logging System** - File and console logging with configurable levels
- **✅ Build System** - CMake with automatic dependency management
- **✅ Installation Scripts** - Systemd service installation with configuration

#### Supported Models
- **✅ Traditional Models** - SD 1.5, SD 2.1, SDXL (base/refiner)
- **✅ Modern Architectures** - Flux (Schnell/Dev/Chroma), SD3, SD3.5
- **✅ Video Models** - Wan 2.1/2.2 T2V/I2V/FLF2V models
- **✅ Vision-Language** - Qwen2VL with Chinese language support
- **✅ Specialized Models** - PhotoMaker, LCM, SSD1B, Tiny SD
- **✅ Model Formats** - safetensors, ckpt, gguf with conversion support

### 🔄 **In Development**

#### WebSocket Support
- Real-time WebSocket connections for live updates
- Currently using HTTP polling approach (works well)

#### Advanced Caching
- Redis backend for distributed caching
- Currently using in-memory caching

### 📋 **Known Issues & Limitations**

#### Progress Callback Issue
**Status**: ✅ **FIXED** (See ISSUE_49_PROGRESS_CALLBACK_FIX.md)
- Originally segfaulted on second generation
- Root cause was CUDA error, not progress callback
- Callback cleanup mechanism properly implemented
- Thread-safe memory management added

#### GPU Memory Management
- **Issue**: CUDA errors during consecutive generations
- **Status**: Requires investigation at stable-diffusion.cpp level
- **Workaround**: Server restart clears memory state
- **Impact**: Functional but may need periodic restarts

#### File Encoding Issues
- **Issue**: Occasional zero-byte output files
- **Status**: Detection implemented, recovery in progress
- **Workaround**: Automatic retry with different parameters

### 🎯 **Production Deployment Ready**

The project is **production-ready** with:
- ✅ Comprehensive API coverage
- ✅ Robust error handling
- ✅ Security features
- ✅ Modern web interface
- ✅ Installation and deployment scripts
- ✅ Extensive model support
- ✅ Real monitoring capabilities

### 📊 **Statistics**

- **Total Codebase**: 12 C++ files (13,341 lines) + Web UI (29 files, 16,565 lines)
- **API Endpoints**: 40+ endpoints covering all functionality
- **Model Types**: 12 different model categories supported
- **Model Architectures**: 15+ architectures with intelligent detection
- **Authentication Methods**: 6 different authentication options
- **Build System**: Complete CMake with automatic dependency management

### 🚀 **Performance Characteristics**

- **Architecture**: Three-thread design (HTTP server, generation queue, model manager)
- **Concurrency**: Single generation at a time (thread-safe queue)
- **Web UI**: Static export with long-term caching for optimal performance
- **Memory**: Intelligent model loading and unloading
- **Response Times**: Sub-second API responses, generation depends on model size

This represents a **mature, feature-complete implementation** ready for production deployment with comprehensive documentation and robust error handling.

## Contributing

Contributions are welcome! Please feel free to submit a Pull Request. For major changes, please open an issue first to discuss what you would like to change.

## License

This project is licensed under the MIT License - see the [LICENSE](LICENSE) file for details.

## Acknowledgments

- [stable-diffusion.cpp](https://github.com/leejet/stable-diffusion.cpp) for the underlying C++ implementation
- The Stable Diffusion community for models and examples
- Contributors and users of this project