This project includes a custom model detection system that analyzes model files to determine their architecture without modifying the stable-diffusion.cpp library.
Note: .ckpt and .pt files are Python pickle formats that cannot be safely parsed in C++ without the full PyTorch library. These files will return "Unknown" architecture. Consider converting them to safetensors format for detection support.
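A caller might want to screen out pickle-based files before attempting detection, for example to suggest conversion to the user. The following is a minimal sketch under that assumption; `isPickleCheckpoint` is a hypothetical helper and is not part of `model_detector.h`.

```cpp
#include <algorithm>
#include <cctype>
#include <filesystem>
#include <string>

// Hypothetical helper (not part of model_detector.h): returns true when the
// file extension marks a Python pickle checkpoint (.ckpt or .pt).
bool isPickleCheckpoint(const std::string& path) {
    std::string ext = std::filesystem::path(path).extension().string();
    // Lower-case the extension so ".CKPT" and ".ckpt" are treated alike.
    std::transform(ext.begin(), ext.end(), ext.begin(),
                   [](unsigned char c) { return static_cast<char>(std::tolower(c)); });
    return ext == ".ckpt" || ext == ".pt";
}
```

A scanner could call this first and skip header parsing for files where it returns true, since those files are reported as "Unknown" anyway.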
The system reads only the file headers, not the tensor data, to extract tensor information and metadata.
Model architecture is determined by analyzing:
conditioner and text_encoder_2 tensors
double_blocks and single_blocks
joint_blocks
modelspec.architecture field
diffusion_steps for Flux variants
For each detected architecture, the system provides:
SD 1.5:
Recommended VAE: vae-ft-mse-840000-ema-pruned.safetensors
Resolution: 512x512
Steps: 20
CFG Scale: 7.5
Sampler: euler_a
TAESD: Supported
SD 2.1:
Recommended VAE: vae-ft-ema-560000.safetensors
Resolution: 768x768
Steps: 25
CFG Scale: 7.0
Sampler: euler_a
TAESD: Supported
SDXL:
Recommended VAE: sdxl_vae.safetensors
Resolution: 1024x1024
Steps: 30
CFG Scale: 7.0
Sampler: dpm++2m
TAESD: Supported
Has Conditioner: true
Flux Schnell:
Recommended VAE: ae.safetensors
Resolution: 1024x1024
Steps: 4
CFG Scale: 1.0
Sampler: euler
Flux Dev:
Recommended VAE: ae.safetensors
Resolution: 1024x1024
Steps: 20
CFG Scale: 1.0
Sampler: euler
#include <iostream>
#include <string>
#include "model_detector.h"

// Detect model architecture
std::string modelPath = "/data/SD_MODELS/checkpoints/SDXL/myModel.safetensors";
ModelDetectionResult result = ModelDetector::detectModel(modelPath);

// Check detected architecture
std::cout << "Architecture: " << result.architectureName << std::endl;
std::cout << "Text Encoder Dim: " << result.textEncoderDim << std::endl;
std::cout << "UNet Channels: " << result.unetChannels << std::endl;

// Get VAE recommendation
if (result.needsVAE) {
    std::cout << "Recommended VAE: " << result.recommendedVAE << std::endl;
}

// Get loading parameters (structured bindings require C++17)
for (const auto& [param, value] : result.suggestedParams) {
    std::cout << param << ": " << value << std::endl;
}

// Access metadata
for (const auto& [key, value] : result.metadata) {
    std::cout << "Metadata " << key << ": " << value << std::endl;
}
You can integrate this into the ModelManager to:
Auto-detect model types during scanning
auto detection = ModelDetector::detectModel(filePath);
modelInfo.architecture = detection.architectureName;
modelInfo.recommendedVAE = detection.recommendedVAE;
Validate model-VAE compatibility
if (checkpoint.architecture == "SDXL" && vae.name != "sdxl_vae") {
    // Warn user about potential issues
}
Auto-configure generation parameters
auto params = ModelDetector::getRecommendedParams(architecture);
request.width = std::stoi(params["width"]);
request.height = std::stoi(params["height"]);
request.steps = std::stoi(params["steps"]);
Provide UI hints
from safetensors.torch import save_file
Safetensors Format:
[8 bytes: header length (uint64 LE)]
[N bytes: JSON header with tensor info]
[M bytes: tensor data (not read)]
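The layout above can be read in two steps: decode the 8-byte length, then read exactly that many bytes of JSON. The following is a minimal sketch of that idea, not the project's actual implementation; the function name `readSafetensorsHeader` and the `maxHeaderBytes` sanity limit are assumptions made for illustration.

```cpp
#include <cstddef>
#include <cstdint>
#include <fstream>
#include <optional>
#include <string>

// Reads only the JSON header of a safetensors file; tensor data is never touched.
// `maxHeaderBytes` is an assumed sanity limit, not a value taken from the project.
std::optional<std::string> readSafetensorsHeader(
        const std::string& path,
        std::uint64_t maxHeaderBytes = 100ull * 1024 * 1024) {
    std::ifstream in(path, std::ios::binary);
    if (!in) return std::nullopt;

    // First 8 bytes: header length as a little-endian uint64.
    unsigned char raw[8];
    if (!in.read(reinterpret_cast<char*>(raw), sizeof(raw))) return std::nullopt;
    std::uint64_t headerLen = 0;
    for (int i = 7; i >= 0; --i) headerLen = (headerLen << 8) | raw[i];

    // Reject empty or implausibly large headers before allocating.
    if (headerLen == 0 || headerLen > maxHeaderBytes) return std::nullopt;

    // Next N bytes: the JSON header describing each tensor.
    std::string json(static_cast<std::size_t>(headerLen), '\0');
    if (!in.read(json.data(), static_cast<std::streamsize>(headerLen))) return std::nullopt;
    return json;
}
```

The returned string can then be handed to any JSON parser to list tensor names and shapes. Decoding the length byte by byte keeps the sketch independent of the host's endianness.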
GGUF Format:
[4 bytes: magic "GGUF"]
[4 bytes: version (uint32)]
[8 bytes: tensor count (uint64)]
[8 bytes: metadata KV count (uint64)]
[metadata key-value pairs]
[tensor information]
[tensor data (not read)]
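The fixed-size fields at the start of a GGUF file can be checked before any metadata is parsed. The sketch below is illustrative only: `GgufPreamble`, `readLE`, and `readGgufPreamble` are hypothetical names, and the integers are assumed to be little-endian, which is the usual GGUF byte order.

```cpp
#include <cstdint>
#include <cstring>
#include <fstream>
#include <optional>
#include <string>

// Fixed-size fields at the start of a GGUF file, in the order listed above.
struct GgufPreamble {
    std::uint32_t version = 0;
    std::uint64_t tensorCount = 0;
    std::uint64_t metadataKvCount = 0;
};

// Decodes `bytes` little-endian bytes from the stream; clears `ok` on a short read.
static std::uint64_t readLE(std::istream& in, int bytes, bool& ok) {
    std::uint64_t value = 0;
    for (int i = 0; i < bytes; ++i) {
        int c = in.get();
        if (c == std::istream::traits_type::eof()) { ok = false; return 0; }
        value |= static_cast<std::uint64_t>(static_cast<unsigned char>(c)) << (8 * i);
    }
    return value;
}

// Returns the preamble, or nullopt if the magic is wrong or the file is truncated.
std::optional<GgufPreamble> readGgufPreamble(const std::string& path) {
    std::ifstream in(path, std::ios::binary);
    if (!in) return std::nullopt;

    char magic[4];
    if (!in.read(magic, 4) || std::memcmp(magic, "GGUF", 4) != 0) return std::nullopt;

    bool ok = true;
    GgufPreamble p;
    p.version = static_cast<std::uint32_t>(readLE(in, 4, ok));
    p.tensorCount = readLE(in, 8, ok);
    p.metadataKvCount = readLE(in, 8, ok);
    if (!ok) return std::nullopt;
    return p;
}
```

After this preamble, a reader would iterate over `metadataKvCount` key-value pairs and `tensorCount` tensor descriptors, stopping before the tensor data.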
The system combines multiple heuristics for robust detection, drawing on both tensor names and metadata fields.
This layered approach helps detection stay accurate even for models with incomplete metadata.
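A layered check of this kind could be sketched as follows. This is an illustration, not the project's detector: `guessArchitecture` and `anyNameContains` are hypothetical names, and the mapping from tensor-name markers to architectures (double_blocks and single_blocks for Flux, joint_blocks for SD3, conditioner or text_encoder_2 for SDXL) is an assumption based on how these checkpoints are commonly laid out.

```cpp
#include <map>
#include <string>
#include <vector>

// Returns true if any tensor name contains `marker` as a substring.
static bool anyNameContains(const std::vector<std::string>& names,
                            const std::string& marker) {
    for (const auto& n : names)
        if (n.find(marker) != std::string::npos) return true;
    return false;
}

// Hypothetical classifier. The marker-to-architecture mapping is an assumption,
// and the returned labels other than "SDXL" and "Unknown" are illustrative.
std::string guessArchitecture(const std::vector<std::string>& tensorNames,
                              const std::map<std::string, std::string>& metadata) {
    // 1. Prefer explicit metadata when the file carries it.
    auto it = metadata.find("modelspec.architecture");
    if (it != metadata.end() && !it->second.empty()) return it->second;

    // 2. Fall back to characteristic tensor-name markers.
    if (anyNameContains(tensorNames, "double_blocks") &&
        anyNameContains(tensorNames, "single_blocks")) return "Flux";
    if (anyNameContains(tensorNames, "joint_blocks")) return "SD3";
    if (anyNameContains(tensorNames, "conditioner") ||
        anyNameContains(tensorNames, "text_encoder_2")) return "SDXL";

    // 3. Nothing matched.
    return "Unknown";
}
```

Checking metadata first and tensor names second means a file with no metadata at all can still be classified, which is the point of layering the checks.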