
Fix model architecture detection: prioritize text encoder dimensions over UNet channel count

- Reordered detection logic in analyzeArchitecture() to check text encoder dimensions first
- Fixes false SDXL detection for models such as flatanime2d_v10.safetensors, which have a high UNet channel count (10240) but an SD 1.5 text encoder dimension (768)
- New detection order: Flux/SD3/Qwen patterns → SDXL patterns → Text encoder dimensions → UNet channels (fallback)
- Resolves the issue where SD 1.5 models were incorrectly classified as SDXL
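
The reordered fallback chain can be sketched in isolation as below. This is a minimal illustration, not the project's actual code: the `ModelArchitecture` enum here is a reduced stand-in, `classifyByDimensions` is a hypothetical free function, the `int` parameter types and the use of `0` for "dimension unavailable" are assumptions, and the final `UNKNOWN` return is assumed because the hunk does not show what follows the last check. The Flux/SD3/Qwen and SDXL pattern checks that run earlier are omitted.

```cpp
// Reduced stand-in for the project's enum (assumption: the real one has more values).
enum class ModelArchitecture { SD_1_5, SD_2_1, SDXL_BASE, UNKNOWN };

// Hypothetical helper mirroring the order of checks introduced by this commit.
// Requires C++14 or later (multiple statements in a constexpr function).
constexpr ModelArchitecture classifyByDimensions(int textEncoderOutputDim,
                                                 int maxUNetChannels) {
    // 1. Text encoder dimension is checked first.
    if (textEncoderOutputDim == 768) {
        return ModelArchitecture::SD_1_5;
    }
    if (textEncoderOutputDim >= 1024 && textEncoderOutputDim < 1280) {
        return ModelArchitecture::SD_2_1;
    }
    if (textEncoderOutputDim == 1280) {
        return ModelArchitecture::SDXL_BASE;
    }

    // 2. UNet channel count is consulted only when the text encoder
    //    dimension matched none of the cases above.
    if (maxUNetChannels >= 2048) {
        return ModelArchitecture::SDXL_BASE;
    }
    if (maxUNetChannels == 1280) {
        return ModelArchitecture::SD_2_1;
    }
    if (maxUNetChannels <= 1280) {
        return ModelArchitecture::SD_1_5;
    }

    // Assumption: values between 1281 and 2047 fall through to "unknown".
    return ModelArchitecture::UNKNOWN;
}

// The case from the commit message: a 768-dim text encoder wins over
// a large UNet channel count.
static_assert(classifyByDimensions(768, 10240) == ModelArchitecture::SD_1_5,
              "text encoder dimension takes priority");
```

With the previous ordering, the `maxUNetChannels >= 2048` check ran first, so the same inputs would have returned `SDXL_BASE`.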
Fszontagh 3 months ago
parent
commit
bfbd5a8da3
1 changed file with 19 additions and 5 deletions

+ 19 - 5
src/model_detector.cpp

@@ -360,16 +360,30 @@ ModelArchitecture ModelDetector::analyzeArchitecture(
         return ModelArchitecture::QWEN2VL;
     }
 
+    // Check text encoder dimensions first (more reliable than UNet channel count)
+    if (textEncoderOutputDim == 768) {
+        return ModelArchitecture::SD_1_5;
+    }
+    
+    if (textEncoderOutputDim >= 1024 && textEncoderOutputDim < 1280) {
+        return ModelArchitecture::SD_2_1;
+    }
+    
+    if (textEncoderOutputDim == 1280) {
+        return ModelArchitecture::SDXL_BASE;
+    }
+    
+    // Only use UNet channel count as a last resort when text encoder dimensions are unclear
     if (maxUNetChannels >= 2048) {
         return ModelArchitecture::SDXL_BASE;
     }
-
-    // Distinguish between SD1.x and SD2.x by text encoder dimension
-    if (textEncoderOutputDim >= 1024 || maxUNetChannels == 1280) {
+    
+    // Fallback detection based on UNet channels when text encoder info is unavailable
+    if (maxUNetChannels == 1280) {
         return ModelArchitecture::SD_2_1;
     }
-
-    if (textEncoderOutputDim == 768 || maxUNetChannels <= 1280) {
+    
+    if (maxUNetChannels <= 1280) {
         return ModelArchitecture::SD_1_5;
     }