CLAUDE.md

This file provides guidance to Claude Code (claude.ai/code) when working with code in this repository.

Project Overview

This is a C++ REST API server that wraps the stable-diffusion.cpp library, providing HTTP endpoints for Stable Diffusion image generation. The server is built with a modular architecture featuring three main components: HTTP Server, Generation Queue, and Model Manager.

Build Commands

Initial Setup and Build

# Create build directory and configure
mkdir build && cd build
cmake ..

# Build the project (parallel build)
cmake --build . --parallel

# Install (optional)
cmake --install .

Build Configuration Options

# Build with CUDA support (default: ON)
cmake -DSD_CUDA_SUPPORT=ON ..

# Build without CUDA
cmake -DSD_CUDA_SUPPORT=OFF ..

# Debug build
cmake -DCMAKE_BUILD_TYPE=Debug ..

# Release build (default)
cmake -DCMAKE_BUILD_TYPE=Release ..

Clean and Rebuild

# Clean build artifacts
cd build
cmake --build . --target clean

# Or delete build directory entirely
rm -rf build
mkdir build && cd build
cmake ..
cmake --build . --parallel

Running the Server

Required Parameters: Both --models-dir and --checkpoints are required.

# Basic usage with required parameters
./stable-diffusion-rest-server --models-dir /path/to/models --checkpoints checkpoints

# The above resolves checkpoints to: /path/to/models/checkpoints

# Using absolute path for checkpoints
./stable-diffusion-rest-server --models-dir /path/to/models --checkpoints /absolute/path/to/checkpoints

# With custom port and host
./stable-diffusion-rest-server --models-dir /path/to/models --checkpoints checkpoints --host 0.0.0.0 --port 8080

# With verbose logging
./stable-diffusion-rest-server --models-dir /path/to/models --checkpoints checkpoints --verbose

# With optional model directories (relative paths)
./stable-diffusion-rest-server --models-dir /path/to/models --checkpoints checkpoints --lora-dir lora --vae-dir vae

# With optional model directories (absolute paths)
./stable-diffusion-rest-server --models-dir /path/to/models --checkpoints checkpoints --lora-dir /other/lora --vae-dir /other/vae

Path Resolution Logic:

If a directory parameter is an absolute path, it's used as-is
If a directory parameter is a relative path, it's resolved relative to --models-dir
Example: --models-dir /data/models --checkpoints checkpoints → /data/models/checkpoints
Example: --models-dir /data/models --checkpoints /other/checkpoints → /other/checkpoints
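
The rules above can be sketched with std::filesystem. This is a minimal illustration, not the project's code: resolveDirectory is a hypothetical stand-in for resolveDirectoryPath(), whose real signature may differ, and the expected results assume POSIX-style paths.

```cpp
#include <filesystem>
#include <string>

namespace fs = std::filesystem;

// Hypothetical stand-in for resolveDirectoryPath(); the real signature may differ.
std::string resolveDirectory(const std::string& modelsDir, const std::string& dir) {
    const fs::path p(dir);
    if (p.is_absolute()) {
        return p.string();                      // absolute paths are used as-is
    }
    return (fs::path(modelsDir) / p).string();  // relative paths are joined onto --models-dir
}
```

With this sketch, resolveDirectory("/data/models", "checkpoints") yields /data/models/checkpoints, and an absolute second argument is returned unchanged.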

Architecture

Three-Component Design

HTTP Server (src/server.cpp, include/server.h)
- Uses cpp-httplib for HTTP handling
- Runs in separate thread from generation
- Handles request validation and response formatting
- All endpoints are registered in registerEndpoints()
- CORS is configured in setupCORS()

Generation Queue (src/generation_queue.cpp, include/generation_queue.h)
- Thread-safe queue for managing generation requests
- Uses Pimpl idiom (implementation hidden in .cpp)
- Processes jobs sequentially (one at a time by default)
- Provides job tracking via JobInfo structures
- Main methods: enqueueRequest(), getQueueStatus(), cancelJob()

Model Manager (src/model_manager.cpp, include/model_manager.h)
- Handles loading/unloading of different model types
- Uses Pimpl idiom for implementation hiding
- All model directories are explicitly configured
- Supports path resolution: absolute paths used as-is, relative paths resolved from base models directory
- Thread-safe with shared_mutex for concurrent reads
- Model scanning is cancellable via cancelScan()

Threading Architecture

Main thread: Initialization, signal handling, coordination
Server thread: HTTP request handling (in Server::serverThreadFunction())
Queue worker threads: Generation processing (managed by GenerationQueue)
Signal handler sets global g_running flag for graceful shutdown
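
As a rough illustration of the queue worker role described above, the following sketch runs jobs one at a time on a dedicated thread and shuts down cleanly. WorkerQueue and its members are hypothetical names chosen for this example; the actual GenerationQueue may be structured differently.

```cpp
#include <condition_variable>
#include <functional>
#include <mutex>
#include <queue>
#include <thread>
#include <utility>

// Hypothetical single-worker queue; not the project's GenerationQueue.
class WorkerQueue {
public:
    WorkerQueue() : worker_([this] { run(); }) {}

    ~WorkerQueue() {
        {
            std::lock_guard<std::mutex> lock(mutex_);
            stopping_ = true;
        }
        cv_.notify_all();
        worker_.join();  // the worker drains queued jobs, then exits
    }

    void enqueue(std::function<void()> job) {
        {
            std::lock_guard<std::mutex> lock(mutex_);
            jobs_.push(std::move(job));
        }
        cv_.notify_one();
    }

private:
    void run() {
        for (;;) {
            std::function<void()> job;
            {
                std::unique_lock<std::mutex> lock(mutex_);
                cv_.wait(lock, [this] { return stopping_ || !jobs_.empty(); });
                if (jobs_.empty()) return;  // only reachable once stopping_ is set
                job = std::move(jobs_.front());
                jobs_.pop();
            }
            job();  // run outside the lock so enqueue() never waits on a job
        }
    }

    std::mutex mutex_;
    std::condition_variable cv_;
    std::queue<std::function<void()>> jobs_;
    bool stopping_ = false;
    std::thread worker_;  // declared last so the other members exist before it starts
};
```

Running each job outside the lock keeps HTTP handlers that call enqueue() responsive while a long generation is in progress.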

Model Type System

Model types are bit flags (can be combined):

enum class ModelType {
    LORA = 1, CHECKPOINT = 2, VAE = 4, PRESETS = 8,
    PROMPTS = 16, NEG_PROMPTS = 32, TAESD = 64,
    ESRGAN = 128, CONTROLNET = 256, UPSCALER = 512,
    EMBEDDING = 1024
};
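
A scoped enum does not provide bitwise operators by itself, so combining flags needs small helpers. The sketch below shows one common way to do this; the operators and hasType() are assumptions made for illustration and may not match what model_manager.h actually defines.

```cpp
#include <type_traits>

enum class ModelType {
    LORA = 1, CHECKPOINT = 2, VAE = 4, PRESETS = 8,
    PROMPTS = 16, NEG_PROMPTS = 32, TAESD = 64,
    ESRGAN = 128, CONTROLNET = 256, UPSCALER = 512,
    EMBEDDING = 1024
};

using ModelTypeBits = std::underlying_type_t<ModelType>;

// Combine two flags (or masks) into one mask.
constexpr ModelType operator|(ModelType a, ModelType b) {
    return static_cast<ModelType>(static_cast<ModelTypeBits>(a) |
                                  static_cast<ModelTypeBits>(b));
}

// Keep only the bits present in both operands.
constexpr ModelType operator&(ModelType a, ModelType b) {
    return static_cast<ModelType>(static_cast<ModelTypeBits>(a) &
                                  static_cast<ModelTypeBits>(b));
}

// Hypothetical helper: true when every bit of `flag` is set in `mask`.
constexpr bool hasType(ModelType mask, ModelType flag) {
    return (mask & flag) == flag;
}
```

For example, ModelType::LORA | ModelType::VAE produces a mask for which hasType(mask, ModelType::VAE) is true and hasType(mask, ModelType::CHECKPOINT) is false.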

Supported file extensions by type:

LORA, CHECKPOINT, VAE, TAESD, CONTROLNET, EMBEDDING: .safetensors, .pt, .ckpt
PRESETS: .json, .yaml, .yml
PROMPTS, NEG_PROMPTS: .txt, .json
ESRGAN, UPSCALER: .pth, .pt

Key API Endpoints

Generation Endpoints

POST /api/v1/generate - General image generation
POST /api/v1/text2img - Text-to-image generation
POST /api/v1/img2img - Image-to-image generation
POST /api/v1/controlnet - ControlNet generation

Model Management

GET /api/v1/models - List all available models
GET /api/v1/models/{type} - List models by type
GET /api/v1/models/{id} - Get model details
POST /api/v1/models/load - Load a model
POST /api/v1/models/unload - Unload a model
POST /api/v1/models/scan - Rescan models directory

Queue Management

GET /api/v1/queue - Get queue status
GET /api/v1/jobs/{id} - Get job status
DELETE /api/v1/jobs/{id} - Cancel a job
DELETE /api/v1/queue - Clear queue

System Information

GET /api/v1/health - Health check
GET /api/v1/status - API status
GET /api/v1/system - System capabilities

Dependencies Management

Dependencies are managed in cmake/FindDependencies.cmake using CMake's FetchContent:

nlohmann/json (v3.11.2) - JSON parsing/serialization
cpp-httplib (v0.14.1) - HTTP server library
stable-diffusion.cpp - Core SD library (via ExternalProject)
Threads - POSIX threads
OpenMP (optional) - Parallel processing
CUDA (optional) - GPU acceleration

The stable-diffusion.cpp library is downloaded and built automatically via ExternalProject_Add() in the root CMakeLists.txt. The specific git tag is pinned to master-334-d05e46c.

Important Implementation Details

Pimpl Idiom Usage

Both GenerationQueue and ModelManager use the Pimpl (Pointer to Implementation) idiom:

class GenerationQueue {
private:
    class Impl;
    std::unique_ptr<Impl> pImpl;
};

All implementation details are in the .cpp files, not headers. When modifying these classes, update the inner Impl class definition in the .cpp file.
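
For reference, a complete Pimpl class looks roughly like the sketch below. JobLog is a hypothetical class used only to show the pattern in one file. The key detail is that the destructor is declared in the header part and defined where Impl is a complete type, which std::unique_ptr requires.

```cpp
#include <cstddef>
#include <memory>
#include <string>
#include <vector>

// --- would live in the header ---
class JobLog {  // hypothetical class, not part of the project
public:
    JobLog();
    ~JobLog();  // declared here, defined below where Impl is complete
    void add(const std::string& entry);
    std::size_t size() const;

private:
    class Impl;
    std::unique_ptr<Impl> pImpl;
};

// --- would live in the .cpp file ---
class JobLog::Impl {
public:
    std::vector<std::string> entries;
};

JobLog::JobLog() : pImpl(std::make_unique<Impl>()) {}
JobLog::~JobLog() = default;

void JobLog::add(const std::string& entry) { pImpl->entries.push_back(entry); }
std::size_t JobLog::size() const { return pImpl->entries.size(); }
```

Adding a field to Impl changes only the .cpp file, so code that includes the header does not need to be recompiled.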

Signal Handling

Global pointer g_server allows signal handler to trigger graceful shutdown
Signal handler sets g_running atomic flag and calls server->stop()
Shutdown sequence: stop server → stop queue → wait for threads → cleanup
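
A minimal sketch of the flag-based part of this design follows. It deliberately keeps the handler to a single atomic store, since only a small set of operations (such as lock-free atomic writes) is safe inside a signal handler; the call to stop the server is left to the main thread here, which is a simplification of what the section describes. installSignalHandlers is a hypothetical helper name.

```cpp
#include <atomic>
#include <csignal>

std::atomic<bool> g_running{true};

// The handler only flips the flag; the main loop reacts to it.
extern "C" void handleSignal(int) {
    g_running.store(false);
}

// Hypothetical helper that wires SIGINT and SIGTERM to the handler.
void installSignalHandlers() {
    std::signal(SIGINT, handleSignal);
    std::signal(SIGTERM, handleSignal);
}
```

The main thread can then loop while g_running is true and run the shutdown sequence once the flag becomes false.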

Directory Configuration

The server requires explicit configuration of model directories:

Required Parameters:

--models-dir: Base directory for models (required)
--checkpoints: Checkpoints directory (required)

Optional Parameters:

--lora-dir, --vae-dir, --controlnet-dir, etc. (optional)

Path Resolution:

Absolute paths (e.g., /absolute/path/to/models) are used as-is
Relative paths (e.g., checkpoints) are resolved relative to --models-dir
The resolveDirectoryPath() function in main.cpp handles this logic

Generation Parameters

The GenerationRequest structure in generation_queue.h contains all parameters from stable-diffusion.cpp's CLI including:

Basic: prompt, negative_prompt, width, height, steps, cfg_scale
Sampling: sampling_method (EULER, EULER_A, HEUN, etc.), scheduler (DISCRETE, KARRAS, etc.)
Advanced: clip_skip, strength, control_strength, skip_layers
Performance: n_threads, offload_params_to_cpu, clip_on_cpu, vae_on_cpu, diffusion_flash_attn
Model paths: vae_path, taesd_path, controlnet_path, lora_model_dir, embedding_dir
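
To make the role of these parameters concrete, here is a small validation sketch over a trimmed-down request. GenerationRequestSketch, its default values, and the specific limits are all assumptions for illustration; the real GenerationRequest has many more fields and its validation rules may differ.

```cpp
#include <string>
#include <vector>

// Hypothetical, trimmed-down view of the request; defaults are illustrative only.
struct GenerationRequestSketch {
    std::string prompt;
    int width = 512;
    int height = 512;
    int steps = 20;
    float cfg_scale = 7.0f;
    float strength = 0.75f;
};

// Returns one message per violated rule; an empty result means the request is acceptable.
std::vector<std::string> validate(const GenerationRequestSketch& r) {
    std::vector<std::string> errors;
    if (r.prompt.empty()) errors.push_back("prompt must not be empty");
    if (r.width <= 0 || r.height <= 0) errors.push_back("width and height must be positive");
    if (r.steps <= 0) errors.push_back("steps must be positive");
    if (r.cfg_scale < 0.0f) errors.push_back("cfg_scale must not be negative");
    if (r.strength < 0.0f || r.strength > 1.0f) errors.push_back("strength must be in [0, 1]");
    return errors;
}
```

Collecting all errors instead of stopping at the first one lets an endpoint report every problem in a single response.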

Development Notes

When Adding New Endpoints

Add handler method declaration in include/server.h
Implement handler in src/server.cpp
Register endpoint in Server::registerEndpoints()
Use helper methods sendJsonResponse() and sendErrorResponse() for consistent responses

When Adding New Model Types

Add enum value to ModelType in model_manager.h
Update modelTypeToString() and stringToModelType() in model_manager.cpp
Add supported file extensions to model scanning logic
Update ServerConfig struct if a new directory parameter is needed
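
One way to keep the two conversion functions in sync is to drive both from a single table, as in the sketch below. The enum is reduced to three values, and the string names other than "checkpoint" as well as the std::optional return type are assumptions; the real functions in model_manager.cpp may use different signatures.

```cpp
#include <optional>
#include <string>

enum class ModelType { LORA = 1, CHECKPOINT = 2, VAE = 4 };  // subset for illustration

struct ModelTypeName {
    ModelType type;
    const char* name;
};

// One table drives both directions, so a new type is added in exactly one place.
constexpr ModelTypeName kModelTypeNames[] = {
    {ModelType::LORA, "lora"},
    {ModelType::CHECKPOINT, "checkpoint"},
    {ModelType::VAE, "vae"},
};

std::string modelTypeToString(ModelType type) {
    for (const auto& entry : kModelTypeNames) {
        if (entry.type == type) return entry.name;
    }
    return "unknown";
}

std::optional<ModelType> stringToModelType(const std::string& name) {
    for (const auto& entry : kModelTypeNames) {
        if (name == entry.name) return entry.type;
    }
    return std::nullopt;  // unrecognized names are reported, not guessed
}
```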

When Modifying Generation Parameters

Update GenerationRequest struct in generation_queue.h
Update parameter validation in Server::validateGenerationParameters()
Update request parsing in generation endpoint handlers
Update StableDiffusionWrapper to pass new parameters to underlying library

Thread Safety Considerations

Model Manager uses std::shared_mutex - multiple readers OR single writer
Generation Queue uses std::mutex and std::condition_variable
Always use RAII locks (std::lock_guard, std::unique_lock, std::shared_lock)
Atomic types used for flags: std::atomic<bool> for g_running, m_isRunning
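
The reader/writer rule for std::shared_mutex can be sketched as follows. ModelRegistry is a hypothetical class used only to show which lock type goes with which kind of access; it is not the project's ModelManager.

```cpp
#include <map>
#include <mutex>
#include <optional>
#include <shared_mutex>
#include <string>

class ModelRegistry {  // hypothetical, for illustration only
public:
    void put(const std::string& name, const std::string& path) {
        std::unique_lock<std::shared_mutex> lock(mutex_);  // exclusive: single writer
        models_[name] = path;
    }

    std::optional<std::string> find(const std::string& name) const {
        std::shared_lock<std::shared_mutex> lock(mutex_);  // shared: many concurrent readers
        const auto it = models_.find(name);
        if (it == models_.end()) return std::nullopt;
        return it->second;  // returned by value, so no reference outlives the lock
    }

private:
    mutable std::shared_mutex mutex_;  // mutable so const readers can lock it
    std::map<std::string, std::string> models_;
};
```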

External Project Integration

The stable-diffusion.cpp library is built as an external project. Include directories and libraries are configured via the sd-cpp interface target:

target_link_libraries(stable-diffusion-rest-server PRIVATE
    sd-cpp
    ggml
    ggml-base
    ggml-cpu
    ${DEPENDENCY_LIBRARIES}
)

When accessing stable-diffusion.cpp APIs, include from the installed headers:

#include <stable-diffusion.h>
#include <ggml.h>

The wrapper class StableDiffusionWrapper (stable_diffusion_wrapper.cpp/h) encapsulates all interactions with the stable-diffusion.cpp library.

Model Architecture Detection System

The server includes an automatic model architecture detection system that analyzes checkpoint files to determine their type and required auxiliary models.

Supported Architectures

The system can detect the following architectures:

| Architecture | Required Files | Command-Line Flags |
|---|---|---|
| SD 1.5 | VAE (optional) | `--vae vae-ft-mse-840000-ema-pruned.safetensors` |
| SD 2.1 | VAE (optional) | `--vae vae-ft-ema-560000.safetensors` |
| SDXL Base/Refiner | VAE (optional) | `--vae sdxl_vae.safetensors` |
| Flux Schnell | VAE, CLIP-L, T5XXL | `--vae ae.safetensors --clip-l clip_l.safetensors --t5xxl t5xxl_fp16.safetensors` |
| Flux Dev | VAE, CLIP-L, T5XXL | `--vae ae.safetensors --clip-l clip_l.safetensors --t5xxl t5xxl_fp16.safetensors` |
| Flux Chroma | VAE, T5XXL | `--vae ae.safetensors --t5xxl t5xxl_fp16.safetensors` |
| SD3 | VAE, CLIP-L, CLIP-G, T5XXL | `--vae sd3_vae.safetensors --clip-l clip_l.safetensors --clip-g clip_g.safetensors --t5xxl t5xxl_fp16.safetensors` |
| Qwen2-VL | Qwen2VL, Qwen2VL-Vision | `--qwen2vl qwen2vl.safetensors --qwen2vl-vision qwen2vl_vision.safetensors` |

How Detection Works

File Format Support:
- Safetensors (.safetensors): Fully supported
- GGUF (.gguf): Fully supported (quantized models)
- PyTorch (.ckpt, .pt): Assumed to be SD1.5 (cannot parse pickle format safely)

Detection Method:
- Reads only file headers (~1MB) for fast detection
- Analyzes tensor names and shapes
- Checks for architecture-specific patterns:
  - Flux: double_blocks, single_blocks tensors
  - SD3: joint_blocks tensors
  - SDXL: conditioner, text_encoder_2 tensors
  - Chroma: Flux structure + "chroma" in filename
- Returns recommended settings (resolution, steps, sampler)

API Integration:
- Architecture info is returned in /api/models endpoint
- Includes required_models array listing needed auxiliary files
- Includes missing_models array if dependencies are not found
- Frontend can display warnings for missing dependencies
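
The tensor-name patterns listed under Detection Method can be sketched as a simple classifier. classifyArchitecture is a hypothetical function, not the project's ModelDetector. It treats a single matching pattern as sufficient and returns "Unknown" when nothing matches, both of which are assumptions about how the real detector behaves.

```cpp
#include <algorithm>
#include <cctype>
#include <string>
#include <vector>

namespace {

// True when any tensor name contains the given substring.
bool anyContains(const std::vector<std::string>& names, const std::string& needle) {
    return std::any_of(names.begin(), names.end(), [&](const std::string& n) {
        return n.find(needle) != std::string::npos;
    });
}

std::string toLower(std::string s) {
    std::transform(s.begin(), s.end(), s.begin(),
                   [](unsigned char c) { return static_cast<char>(std::tolower(c)); });
    return s;
}

}  // namespace

// Hypothetical classifier following the pattern list above.
std::string classifyArchitecture(const std::vector<std::string>& tensorNames,
                                 const std::string& filename) {
    if (anyContains(tensorNames, "double_blocks") || anyContains(tensorNames, "single_blocks")) {
        // Chroma shares the Flux structure and is told apart by its filename.
        return toLower(filename).find("chroma") != std::string::npos ? "Flux Chroma" : "Flux";
    }
    if (anyContains(tensorNames, "joint_blocks")) return "SD3";
    if (anyContains(tensorNames, "conditioner") || anyContains(tensorNames, "text_encoder_2")) {
        return "SDXL";
    }
    return "Unknown";
}
```

The Flux check runs first because Chroma detection depends on it; the order of the remaining checks is a choice made for this sketch.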

Usage in Model Manager

During model scanning (model_manager.cpp):

if (detectedType == ModelType::CHECKPOINT) {
    ModelDetectionResult detection = ModelDetector::detectModel(info.fullPath);
    info.architecture = detection.architectureName;
    info.recommendedVAE = detection.recommendedVAE;
    info.recommendedWidth = std::stoi(detection.suggestedParams["width"]);
    // ... parse other recommended parameters

    // Build required models list
    if (detection.needsVAE) {
        info.requiredModels.push_back("VAE: " + detection.recommendedVAE);
    }
}

API Response Example

{
  "name": "chroma-unlocked-v50-Q8_0.gguf",
  "type": "checkpoint",
  "architecture": "Flux Chroma (Unlocked)",
  "recommended_vae": "ae.safetensors",
  "recommended_width": 1024,
  "recommended_height": 1024,
  "recommended_steps": 20,
  "recommended_sampler": "euler",
  "required_models": [
    "VAE: ae.safetensors",
    "T5XXL: t5xxl_fp16.safetensors"
  ],
  "missing_models": [],
  "has_missing_dependencies": false
}

Testing the Detection System

A standalone test binary can be built to test model detection:

cd build
cmake -DBUILD_MODEL_DETECTOR_TEST=ON ..
cmake --build . --target test_model_detector

# Run tests
./src/test_model_detector /data/SD_MODELS/checkpoints

Architecture-Specific Loading

The server will automatically use the correct parameters when loading models based on detected architecture. For architectures requiring multiple auxiliary models (Flux, SD3, Qwen), the server will:

Check if all required models are available
Return warnings via API if models are missing
Display warnings in WebUI with instructions to load missing models
Provide correct command-line flags for manual loading
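
The availability check in the first step amounts to a set difference between required and available files. The sketch below illustrates that idea; findMissingModels is a hypothetical helper, and matching by exact filename is an assumption.

```cpp
#include <algorithm>
#include <string>
#include <vector>

// Hypothetical helper: returns every required file that is not in `available`,
// preserving the order of `required`.
std::vector<std::string> findMissingModels(const std::vector<std::string>& required,
                                           const std::vector<std::string>& available) {
    std::vector<std::string> missing;
    for (const auto& name : required) {
        if (std::find(available.begin(), available.end(), name) == available.end()) {
            missing.push_back(name);
        }
    }
    return missing;
}
```

An empty result would correspond to "has_missing_dependencies": false in the API response shown earlier.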

See MODEL_DETECTION.md for complete documentation on the detection system.

Web UI Architecture

The project includes a Next.js-based web UI located in /webui that provides a modern interface for interacting with the REST API.

Building the Web UI

# Build Web UI manually
cd webui
npm install
npm run build

# Build via CMake (automatically copies to build directory)
cmake --build build --target webui-build

The built UI is automatically copied to build/webui/ and served by the REST API server at /ui/.

WebUI Structure

Framework: Next.js 16 with React, TypeScript, and Tailwind CSS
Routing: App router with static page generation
UI Components: Shadcn/ui components in /webui/components/ui/
Pages:
- /webui/app/text2img/page.tsx - Text-to-image generation
- /webui/app/img2img/page.tsx - Image-to-image generation
- /webui/app/upscaler/page.tsx - Image upscaling
- /webui/app/models/page.tsx - Model management
- /webui/app/queue/page.tsx - Queue status

Important Components

Main Layout (/webui/components/main-layout.tsx)
- Provides consistent layout with sidebar and status bar
- Includes Sidebar and ModelStatusBar components

Sidebar (/webui/components/sidebar.tsx)
- Fixed position navigation (z-index: 40)
- Always visible on the left side
- Handles page navigation

Model Status Bar (/webui/components/model-status-bar.tsx)
- Fixed position at bottom (z-index: 35)
- Positioned with left-64 to avoid overlapping sidebar
- Shows current model status, queue status, and generation progress
- Polls server every 1-5 seconds for updates

Prompt Textarea (/webui/components/prompt-textarea.tsx)
- Advanced textarea with syntax highlighting
- Autocomplete for LoRAs and embeddings
- Suggestions dropdown (z-index: 30 - below sidebar)
- Highlights LoRA tags (<lora:name:weight>) and embedding names

Z-Index Layering

The WebUI uses a specific z-index hierarchy to ensure proper stacking:

Sidebar:                z-40  (always on top for navigation)
Model Status Bar:       z-35  (visible but doesn't block sidebar)
Autocomplete Dropdowns: z-30  (below sidebar to allow navigation)
Main Content:           z-0   (default)

Important: When adding new floating/fixed elements, respect this hierarchy to avoid blocking the sidebar navigation.

Form State Persistence

All generation pages (text2img, img2img, upscaler) use localStorage to persist form state across navigation.

Implementation (/webui/lib/hooks.ts):

// Custom hook for localStorage persistence
const [formData, setFormData] = useLocalStorage('page-form-data', defaultValues);

Keys used:

text2img-form-data - Text-to-image form state
img2img-form-data - Image-to-image form state
upscaler-form-data - Upscaler form state

Behavior:

Form state is automatically saved to localStorage on every change
State is restored when returning to the page
Users can navigate away and return without losing their settings
Large base64 images (img2img, upscaler) are also persisted but may hit localStorage size limits (~5-10MB)

API Client

The WebUI communicates with the REST API via /webui/lib/api.ts:

import { apiClient } from '@/lib/api';

// Examples
const models = await apiClient.getModels('checkpoint');
const job = await apiClient.text2img(formData);
const status = await apiClient.getJobStatus(jobId);

Base URL Configuration: The API base URL is configured in /webui/.env.local and defaults to the server's configured endpoint.

Development Workflow for WebUI

Making UI Changes:

cd webui
npm run dev  # Start development server on port 3000

Building for Production:

# Via CMake (recommended)
cmake --build build --target webui-build

# Or manually
cd webui && npm run build

Testing:
- Development: Changes are hot-reloaded at http://localhost:3000
- Production: Build and access via REST server at http://localhost:8080/ui/

Common UI Issues and Solutions

Issue: Sidebar menu items not clickable

Cause: Element with higher z-index overlapping sidebar
Solution: Ensure all floating elements have z-index < 40

Issue: Form state lost on navigation

Cause: Not using useLocalStorage hook
Solution: Replace useState with useLocalStorage for form data

Issue: Status bar not visible

Cause: Z-index too low or hidden behind content
Solution: Use z-index 35 and adjust left offset to avoid sidebar

WebUI Configuration

The server dynamically generates /ui/config.js with runtime configuration:

window.__SERVER_CONFIG__ = {
  apiUrl: 'http://localhost:8080',
  apiBasePath: '/api',
  host: 'localhost',
  port: 8080,
  uiVersion: 'a1b2c3d4'  // Git commit hash
};

This allows the WebUI to adapt to different server configurations without rebuilding.
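
Generating this file on the server side can be as simple as formatting a string from the runtime settings. The sketch below is an assumption about how that might look; buildConfigJs is a hypothetical function, and it performs no escaping, so it assumes the host and version contain no quote characters.

```cpp
#include <sstream>
#include <string>

// Hypothetical generator for the body of /ui/config.js.
// No escaping is done: host and uiVersion are assumed to be quote-free.
std::string buildConfigJs(const std::string& host, int port, const std::string& uiVersion) {
    std::ostringstream js;
    js << "window.__SERVER_CONFIG__ = {\n"
       << "  apiUrl: 'http://" << host << ":" << port << "',\n"
       << "  apiBasePath: '/api',\n"
       << "  host: '" << host << "',\n"
       << "  port: " << port << ",\n"
       << "  uiVersion: '" << uiVersion << "'\n"
       << "};\n";
    return js.str();
}
```

Because the values come from the running server, the same static UI build works regardless of which host and port the server is started with.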

UI Caching and Versioning

The WebUI implements a comprehensive caching strategy with git-based versioning to improve performance and ensure users always see the latest version.

Git-Based Versioning

Build Process:

During cmake --build build --target webui-build, the git commit hash is extracted

A version.json file is generated in /webui/public/ containing the short git hash (8 characters) and the build time:

{
  "version": "a1b2c3d4",
  "buildTime": "2025-11-02T18:00:00Z"
}

This file is copied to the build output and served at /ui/version.json

Server Implementation (src/server.cpp:294-380):

Reads version.json on startup to get current UI version
Injects version into config.js as uiVersion
Sets HTTP cache headers based on file type and version

Cache Headers Strategy

Static Assets (JS, CSS, images, fonts):

Cache-Control: public, max-age=31536000, immutable
ETag: "a1b2c3d4"

Cached for 1 year (31536000 seconds)
immutable flag tells browser file will never change
ETag based on git hash for validation
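When the version changes, the ETag changes, so any revalidation request downloads the new file; assets still fresh under immutable are not revalidated until the page references a new URL or the user hard-refreshes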

HTML Files:

Cache-Control: public, max-age=0, must-revalidate
ETag: "a1b2c3d4"

Always revalidate with server (max-age=0)
Can use cached version if ETag matches
Ensures users get latest HTML that references new assets

config.js (Dynamic Configuration):

Cache-Control: no-cache, no-store, must-revalidate
Pragma: no-cache
Expires: 0

Never cached - always fetched fresh
Contains runtime configuration and current version
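
The three policies above can be expressed as one function that picks a Cache-Control value from the request path. This is a sketch under assumptions: cacheControlFor and endsWith are hypothetical helpers, the extension list is illustrative, and treating every other file like HTML is a conservative default chosen for this example and not documented behavior.

```cpp
#include <string>

// True when `s` ends with `suffix`.
bool endsWith(const std::string& s, const std::string& suffix) {
    return s.size() >= suffix.size() &&
           s.compare(s.size() - suffix.size(), suffix.size(), suffix) == 0;
}

// Hypothetical policy selector mirroring the three cases described above.
std::string cacheControlFor(const std::string& path) {
    // config.js is dynamic and must be checked before the generic ".js" rule.
    if (endsWith(path, "/config.js")) {
        return "no-cache, no-store, must-revalidate";
    }
    static const char* const kStaticExtensions[] = {
        ".js", ".css", ".png", ".jpg", ".svg", ".ico", ".woff", ".woff2"};
    for (const char* ext : kStaticExtensions) {
        if (endsWith(path, ext)) {
            return "public, max-age=31536000, immutable";
        }
    }
    // HTML and anything unrecognized: always revalidate.
    return "public, max-age=0, must-revalidate";
}
```

The order of the checks matters: testing for config.js first prevents it from being served with the one-year static-asset policy.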

Version Mismatch Detection

Component: /webui/components/version-checker.tsx

The VersionChecker component:

Loads current version from /ui/version.json on mount
Reads server version from window.__SERVER_CONFIG__.uiVersion
Compares versions every 5 minutes
Shows notification banner if versions don't match
Provides "Refresh" button to reload and get new version

User Experience:

Users see a yellow notification banner at top of page
Clear message: "New UI Version Available"
One-click refresh to get latest version
Automatic check every 5 minutes (configurable)

Benefits

✅ Performance: Static assets cached for 1 year, reducing bandwidth and load times
✅ Automatic Updates: Version mismatch detection ensures users know when to refresh
✅ Cache Invalidation: Git hash in ETag guarantees cache busting on updates
✅ Reduced Server Load: Browsers serve most assets from cache
✅ Traceability: Git hash allows tracking exactly which UI version is deployed

Testing Cache Behavior

First Load (Cache Miss):

# Open browser DevTools → Network tab
# Load page → See all assets with Status 200
# Check Response Headers for Cache-Control and ETag

Reload (Cache Hit):

# Reload page → See assets with Status 200 (from disk cache)
# Or Status 304 (Not Modified) if revalidating

After Rebuild (Cache Invalidation):

# Rebuild UI: cmake --build build --target webui-build
# Restart server
# Reload page → Version checker shows update notification
# Click Refresh → All assets redownloaded with new ETag

Development vs Production

Development (npm run dev):

Next.js dev server at localhost:3000
Hot module replacement (no caching)
Version checking disabled

Production (served by REST server):

Static files at /ui/
Full caching with git versioning
Version checking active
Served from build/webui/

Troubleshooting

Issue: UI not showing latest changes after rebuild

Cause: Browser cache still using old assets
Solution: Check version.json was generated correctly, restart server, hard refresh (Ctrl+Shift+R)

Issue: Version checker not showing update notification

Cause: config.js was not loaded, or the cached UI version already matches the server version
Solution: Check browser console for errors, verify window.__SERVER_CONFIG__ exists

Issue: Assets returning 304 even after version change

Cause: Server not reading new version.json
Solution: Restart server to reload version file

When starting the server, use these parameters so that the WebUI is usable: --models-dir /data/SD_MODELS --port 8082 --host 0.0.0.0 --ui-dir ./webui
