recent updates

2026-06-09 18:31:59 +02:00
parent 398b94965c
commit 94f7a1f72a
42 changed files with 8686 additions and 0 deletions
@@ -0,0 +1,99 @@
+# 02 — FLUX.1 Schnell Image Pipeline
+
+## Why FLUX over SDXL
+
+FLUX.1 Schnell is a 12B-parameter transformer model; SDXL (RealVisXL) has about 3.5B parameters.
+In our tests FLUX was noticeably better at:
+- Spatial depth and perspective (lens simulation)
+- Scene geometry (vanishing points, depth-of-field)
+- Prompt following (T5-XXL understands long, detailed prompts)
+
+SDXL was tested on lahrcarpetcleaning.com and rejected: flat angles, no depth,
+poor spatial coherence. FLUX replaced it entirely.
+
+## Model stack
+
+| File | Size | Notes |
+|---|---|---|
+| flux1-schnell-Q8_0.gguf | 12GB | GGUF Q8, needs ComfyUI-GGUF node |
+| t5xxl_fp8_e4m3fn.safetensors | 4.6GB | T5-XXL text encoder, fp8 quantized |
+| clip_l.safetensors | 235MB | CLIP-L, short prompt encoder |
+| ae.safetensors | 108MB | Official FLUX VAE from Black Forest Labs |
+
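Before starting ComfyUI it can help to confirm that all four files are where the loaders will look for them. The following is a minimal sketch, not part of the repository: `missing_models` is a hypothetical helper, and the subdirectories are assumed from the `wget` targets used in the Download section.

```python
from pathlib import Path

# File name -> ComfyUI models subdirectory (assumed from the wget targets in this document).
MODEL_FILES = {
    "flux1-schnell-Q8_0.gguf": "unet",
    "t5xxl_fp8_e4m3fn.safetensors": "clip",
    "clip_l.safetensors": "clip",
    "ae.safetensors": "vae",
}


def missing_models(comfy_root: Path) -> list[str]:
    """Return relative paths of model files that are absent or empty."""
    missing = []
    for name, subdir in MODEL_FILES.items():
        path = comfy_root / "models" / subdir / name
        # A zero-byte file usually means an interrupted or rejected download.
        if not path.is_file() or path.stat().st_size == 0:
            missing.append(f"models/{subdir}/{name}")
    return missing
```

Calling `missing_models(Path.home() / "ComfyUI")` should return an empty list once every download has finished.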
+## Download (one-time)
+
+FLUX GGUF and text encoders (public, no auth):
+```bash
+wget "https://huggingface.co/city96/FLUX.1-schnell-gguf/resolve/main/flux1-schnell-Q8_0.gguf" \
+  -O ~/ComfyUI/models/unet/flux1-schnell-Q8_0.gguf
+
+wget "https://huggingface.co/comfyanonymous/flux_text_encoders/resolve/main/t5xxl_fp8_e4m3fn.safetensors" \
+  -O ~/ComfyUI/models/clip/t5xxl_fp8_e4m3fn.safetensors
+
+wget "https://huggingface.co/comfyanonymous/flux_text_encoders/resolve/main/clip_l.safetensors" \
+  -O ~/ComfyUI/models/clip/clip_l.safetensors
+```
+
+FLUX VAE (gated — requires HF login and license acceptance):
+```bash
+hf auth login   # paste read token
+HF_TOKEN=$(cat ~/.cache/huggingface/token)
+wget --header="Authorization: Bearer $HF_TOKEN" \
+  "https://huggingface.co/black-forest-labs/FLUX.1-schnell/resolve/main/ae.safetensors" \
+  -O ~/ComfyUI/models/vae/ae.safetensors
+```
+
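If the download needs to be scripted from Python, the same Bearer-token request can be built with the standard library. This is a sketch under assumptions: `build_vae_request` and `download_vae` are hypothetical helpers, the token path mirrors the one read in the shell snippet above, and redirect handling is left to `urllib` defaults.

```python
import shutil
import urllib.request
from pathlib import Path

VAE_URL = "https://huggingface.co/black-forest-labs/FLUX.1-schnell/resolve/main/ae.safetensors"


def build_vae_request(token: str) -> urllib.request.Request:
    """Build a GET request carrying the Hugging Face token as a Bearer header."""
    headers = {"Authorization": f"Bearer {token.strip()}"}
    return urllib.request.Request(VAE_URL, headers=headers)


def download_vae(token_file: Path, dest: Path) -> None:
    """Stream the VAE to dest; urlopen raises HTTPError if the token is rejected."""
    token = token_file.read_text()
    dest.parent.mkdir(parents=True, exist_ok=True)
    with urllib.request.urlopen(build_vae_request(token)) as resp, open(dest, "wb") as out:
        shutil.copyfileobj(resp, out)
```

A call would look like `download_vae(Path.home() / ".cache/huggingface/token", Path.home() / "ComfyUI/models/vae/ae.safetensors")`.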
+## ComfyUI workflow (what gen-images-flux.py sends)
+
+```
+UnetLoaderGGUF    → flux1-schnell-Q8_0.gguf
+DualCLIPLoader    → t5xxl_fp8_e4m3fn + clip_l (type=flux)
+VAELoader         → ae.safetensors
+CLIPTextEncode    → prompt
+EmptyLatentImage  → 1024×576, batch=1
+KSampler          → steps=4, cfg=1.0, euler, simple
+VAEDecode
+SaveImage
+```
+
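The node chain above can be written in ComfyUI's API JSON format, where each node lists its `class_type` and `inputs`, and a link is a `[node_id, output_index]` pair. The sketch below is an illustration, not the contents of `gen-images-flux.py`, which may differ. Assumptions: the helper names are hypothetical, the negative input of `KSampler` is fed by an empty prompt (the source does not say how it is wired), and ComfyUI listens on its default address `127.0.0.1:8188`.

```python
import json
import urllib.request


def build_workflow(prompt: str, seed: int = 0, prefix: str = "flux") -> dict:
    """Return the node graph above in ComfyUI's API (prompt) JSON format."""
    return {
        "1": {"class_type": "UnetLoaderGGUF",
              "inputs": {"unet_name": "flux1-schnell-Q8_0.gguf"}},
        "2": {"class_type": "DualCLIPLoader",
              "inputs": {"clip_name1": "t5xxl_fp8_e4m3fn.safetensors",
                         "clip_name2": "clip_l.safetensors", "type": "flux"}},
        "3": {"class_type": "VAELoader", "inputs": {"vae_name": "ae.safetensors"}},
        "4": {"class_type": "CLIPTextEncode", "inputs": {"text": prompt, "clip": ["2", 0]}},
        # Assumption: KSampler requires a negative input, so an empty prompt is wired in.
        "5": {"class_type": "CLIPTextEncode", "inputs": {"text": "", "clip": ["2", 0]}},
        "6": {"class_type": "EmptyLatentImage",
              "inputs": {"width": 1024, "height": 576, "batch_size": 1}},
        "7": {"class_type": "KSampler",
              "inputs": {"model": ["1", 0], "positive": ["4", 0], "negative": ["5", 0],
                         "latent_image": ["6", 0], "seed": seed, "steps": 4, "cfg": 1.0,
                         "sampler_name": "euler", "scheduler": "simple", "denoise": 1.0}},
        "8": {"class_type": "VAEDecode", "inputs": {"samples": ["7", 0], "vae": ["3", 0]}},
        "9": {"class_type": "SaveImage",
              "inputs": {"images": ["8", 0], "filename_prefix": prefix}},
    }


def queue_prompt(workflow: dict, host: str = "http://127.0.0.1:8188") -> dict:
    """POST the workflow to ComfyUI's /prompt endpoint and return its JSON reply."""
    body = json.dumps({"prompt": workflow}).encode("utf-8")
    req = urllib.request.Request(f"{host}/prompt", data=body,
                                 headers={"Content-Type": "application/json"})
    with urllib.request.urlopen(req) as resp:
        return json.loads(resp.read())
```

With ComfyUI running, `queue_prompt(build_workflow("wide shot of a freshly cleaned living room carpet"))` would queue one image. The prompt text here is only an example.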
+## Settings
+
+| Setting | Value | Why |
+|---|---|---|
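+| Steps | 4 | Schnell is distilled for 1 to 4 sampling steps; 4 is the usual setting |
+| CFG | 1.0 | Disables classifier-free guidance; the distilled model degrades at higher values |
+| Sampler | euler | Used in ComfyUI's reference FLUX workflows |
+| Scheduler | simple | Used with euler in the same reference workflows |
+| Negative prompt | none | At CFG 1.0 the negative conditioning has no effect |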
+| Resolution | 1024×576 | 16:9 hero format |
+
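The resolution row can be checked with a little arithmetic: 1024 / 576 reduces to 16 / 9, and both sides divide evenly by 16. The sketch below derives the height for other widths. `hero_size` is a hypothetical helper, and the multiple-of-16 constraint is an assumption based on FLUX's 8x VAE downscale followed by 2x2 latent patches.

```python
from fractions import Fraction


def hero_size(width: int, aspect: Fraction = Fraction(16, 9), multiple: int = 16) -> tuple[int, int]:
    """Return (width, height) for the aspect ratio, with height rounded to a multiple."""
    if width % multiple:
        raise ValueError(f"width must be a multiple of {multiple}")
    exact_height = Fraction(width) / aspect
    # Round to the nearest multiple so the latent grid stays aligned.
    height = round(exact_height / multiple) * multiple
    return width, height


print(hero_size(1024))  # (1024, 576)
```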
+## Running generation
+
+```bash
+# ComfyUI must be running first (see 01-comfyui-setup.md)
+cd /home/sirdrez/arisingmedia-websites/{domain}
+python3 tools/gen-images-flux.py 2>&1 | tee tools/flux-gen.log
+```
+
+Monitor:
+```bash
+tmux attach -t comfyui     # step progress bars
+tail -f tools/flux-gen.log  # per-image OK/FAIL
+```
+
+Speed: ~4 min/image on CPU (2GB of VRAM is not enough to run the model on the GPU). 28 images take roughly 1h50m.
+
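The batch estimate is plain multiplication, which is handy to redo when the image count changes. A minimal sketch follows; `batch_eta` is a hypothetical helper, and the 4 min/image default is the CPU figure quoted above.

```python
def batch_eta(images: int, minutes_per_image: float = 4.0) -> str:
    """Format the estimated wall-clock time for a batch as 'XhYYm'."""
    total_minutes = round(images * minutes_per_image)
    hours, minutes = divmod(total_minutes, 60)
    return f"{hours}h{minutes:02d}m"


print(batch_eta(28))  # 28 * 4 = 112 minutes -> 1h52m
```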
+## After generation
+
+```bash
+python3 tools/convert-to-webp.py          # resize + convert to WebP
+find assets/images -type f -name '*.jpg' -delete   # delete source JPGs (recursive; plain ** needs bash globstar)
+docker compose build --no-cache web        # bake WebP into image
+docker compose up -d
+```
+
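Because the source JPGs are deleted right after conversion, and each one takes minutes to regenerate, it may be worth checking that every JPG has a WebP counterpart first. This is a sketch under an assumption the source does not confirm: that `convert-to-webp.py` writes each `.webp` next to its source with the same file stem. `unconverted_jpgs` is a hypothetical helper.

```python
from pathlib import Path


def unconverted_jpgs(image_root: Path) -> list[Path]:
    """Return JPGs under image_root that lack a non-empty .webp with the same stem."""
    pending = []
    for jpg in sorted(image_root.rglob("*.jpg")):
        webp = jpg.with_suffix(".webp")
        if not webp.is_file() or webp.stat().st_size == 0:
            pending.append(jpg)
    return pending
```

Running `unconverted_jpgs(Path("assets/images"))` before the delete step and stopping if the list is non-empty avoids losing images that failed to convert.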
+Verify:
+```bash
+curl -s -o /dev/null -w "%{http_code}\n" "http://localhost:{port}/assets/images/hero/hero-carpet-cleaning.webp"
+# must print 200 (replace {port} with the published port first)
+```