# 02 — FLUX.1 Schnell Image Pipeline
## Why FLUX over SDXL
FLUX is a 12B-parameter transformer model. SDXL (RealVisXL) is 3.5B.
FLUX has significantly better:
- Spatial depth and perspective (lens simulation)
- Scene geometry (vanishing points, depth-of-field)
- Prompt following (T5-XXL understands long, detailed prompts)
SDXL was tested on lahrcarpetcleaning.com and rejected: flat angles, no depth,
poor spatial coherence. FLUX replaced it entirely.
## Model stack
| File | Size | Notes |
|---|---|---|
| flux1-schnell-Q8_0.gguf | 12GB | GGUF Q8, needs ComfyUI-GGUF node |
| t5xxl_fp8_e4m3fn.safetensors | 4.6GB | T5-XXL text encoder, fp8 quantized |
| clip_l.safetensors | 235MB | CLIP-L, short prompt encoder |
| ae.safetensors | 108MB | Official FLUX VAE from Black Forest Labs |
## Download (one-time)
FLUX GGUF and both text encoders (public, no auth):
```bash
wget "https://huggingface.co/city96/FLUX.1-schnell-gguf/resolve/main/flux1-schnell-Q8_0.gguf" \
  -O ~/ComfyUI/models/unet/flux1-schnell-Q8_0.gguf
wget "https://huggingface.co/comfyanonymous/flux_text_encoders/resolve/main/t5xxl_fp8_e4m3fn.safetensors" \
  -O ~/ComfyUI/models/clip/t5xxl_fp8_e4m3fn.safetensors
wget "https://huggingface.co/comfyanonymous/flux_text_encoders/resolve/main/clip_l.safetensors" \
  -O ~/ComfyUI/models/clip/clip_l.safetensors
```
FLUX VAE (gated — requires HF login and license acceptance):
```bash
hf auth login # paste read token
HF_TOKEN=$(cat ~/.cache/huggingface/token)
wget --header="Authorization: Bearer $HF_TOKEN" \
  "https://huggingface.co/black-forest-labs/FLUX.1-schnell/resolve/main/ae.safetensors" \
  -O ~/ComfyUI/models/vae/ae.safetensors
```
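Before starting ComfyUI it can help to confirm that all four files landed where the commands above put them. The following is a minimal sketch, not part of the project's tooling: the helper name `missing_models` is hypothetical, and the paths are simply the ones used in the `wget` commands. It treats a zero-byte file as missing, since `wget -O` can leave an empty file behind when a request fails (for example, an unauthorized request for the gated VAE).

```python
from pathlib import Path

# Relative locations taken from the wget commands in this section.
EXPECTED_MODELS = [
    "models/unet/flux1-schnell-Q8_0.gguf",
    "models/clip/t5xxl_fp8_e4m3fn.safetensors",
    "models/clip/clip_l.safetensors",
    "models/vae/ae.safetensors",
]


def missing_models(comfy_root: Path) -> list[str]:
    """Return the expected model paths that are absent or empty under comfy_root."""
    missing = []
    for rel in EXPECTED_MODELS:
        path = comfy_root / rel
        # A zero-byte file is treated the same as a missing one.
        if not path.is_file() or path.stat().st_size == 0:
            missing.append(rel)
    return missing
```

Calling `missing_models(Path.home() / "ComfyUI")` should return an empty list once every download has succeeded.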
## ComfyUI workflow (what gen-images-flux.py sends)
```
UnetLoaderGGUF → flux1-schnell-Q8_0.gguf
DualCLIPLoader → t5xxl_fp8_e4m3fn + clip_l (type=flux)
VAELoader → ae.safetensors
CLIPTextEncode → prompt
EmptyLatentImage → 1024×576, batch=1
KSampler → steps=4, cfg=1.0, euler, simple
VAEDecode
SaveImage
```
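The node chain above can be expressed as a ComfyUI API-format graph, where each node has a string id and each link is `[source_node_id, output_index]`. The sketch below is an illustration of that shape, not the actual contents of `gen-images-flux.py`: the function names, node ids, `filename_prefix`, and the `127.0.0.1:8188` address (ComfyUI's default) are assumptions. It also adds an empty negative `CLIPTextEncode` node, because `KSampler` requires a negative input even though it has no effect at CFG 1.0.

```python
import json
import urllib.request


def build_workflow(prompt: str, seed: int = 0) -> dict:
    """Build an API-format ComfyUI graph; links are [source_node_id, output_index]."""
    return {
        "1": {"class_type": "UnetLoaderGGUF",
              "inputs": {"unet_name": "flux1-schnell-Q8_0.gguf"}},
        "2": {"class_type": "DualCLIPLoader",
              "inputs": {"clip_name1": "t5xxl_fp8_e4m3fn.safetensors",
                         "clip_name2": "clip_l.safetensors",
                         "type": "flux"}},
        "3": {"class_type": "VAELoader",
              "inputs": {"vae_name": "ae.safetensors"}},
        "4": {"class_type": "CLIPTextEncode",
              "inputs": {"text": prompt, "clip": ["2", 0]}},
        # Empty negative prompt: required by KSampler, unused at cfg=1.0.
        "5": {"class_type": "CLIPTextEncode",
              "inputs": {"text": "", "clip": ["2", 0]}},
        "6": {"class_type": "EmptyLatentImage",
              "inputs": {"width": 1024, "height": 576, "batch_size": 1}},
        "7": {"class_type": "KSampler",
              "inputs": {"model": ["1", 0], "positive": ["4", 0],
                         "negative": ["5", 0], "latent_image": ["6", 0],
                         "seed": seed, "steps": 4, "cfg": 1.0,
                         "sampler_name": "euler", "scheduler": "simple",
                         "denoise": 1.0}},
        "8": {"class_type": "VAEDecode",
              "inputs": {"samples": ["7", 0], "vae": ["3", 0]}},
        "9": {"class_type": "SaveImage",
              "inputs": {"images": ["8", 0], "filename_prefix": "flux"}},
    }


def queue_prompt(workflow: dict, host: str = "127.0.0.1:8188") -> dict:
    """POST the graph to ComfyUI's /prompt endpoint and return its JSON reply."""
    body = json.dumps({"prompt": workflow}).encode("utf-8")
    req = urllib.request.Request(
        f"http://{host}/prompt", data=body,
        headers={"Content-Type": "application/json"})
    with urllib.request.urlopen(req) as resp:
        return json.loads(resp.read())
```

With ComfyUI running, `queue_prompt(build_workflow("bright living room, freshly cleaned carpet"))` would queue one image; the prompt text here is only an example.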
## Settings
| Setting | Value | Why |
|---|---|---|
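| Steps | 4 | Schnell is step-distilled for 1 to 4 sampling steps; 4 is the intended operating point |
| CFG | 1.0 | Distilled model is meant to run without classifier-free guidance; higher CFG degrades quality |
| Sampler | euler | Used in ComfyUI's FLUX example workflows |
| Scheduler | simple | Used in ComfyUI's FLUX example workflows |
| Negative prompt | none | At CFG 1.0 the negative conditioning has no effect on the result |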
| Resolution | 1024×576 | 16:9 hero format |
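If a different hero width is ever needed, the same 16:9 ratio can be derived rather than guessed. The helper below is a hypothetical sketch (the name `hero_size` is not from the project). It assumes both sides should be multiples of 16, which follows from FLUX's 8x VAE downscale combined with 2x2 latent patching.

```python
def hero_size(width: int, multiple: int = 16) -> tuple[int, int]:
    """Return (width, height) near 16:9 with both sides snapped down to `multiple`."""
    width -= width % multiple      # snap width down first
    height = width * 9 // 16       # exact 16:9 height for that width
    height -= height % multiple    # snapping may move slightly off 16:9
    return width, height
```

`hero_size(1024)` gives `(1024, 576)`, the value in the table. For widths such as 1920 the snapped height is 1072 rather than 1080, so the ratio is only approximately 16:9.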
## Running generation
```bash
# ComfyUI must be running first (see 01-comfyui-setup.md)
cd /home/sirdrez/arisingmedia-websites/{domain}
python3 tools/gen-images-flux.py 2>&1 | tee tools/flux-gen.log
```
Monitor:
```bash
tmux attach -t comfyui # step progress bars
tail -f tools/flux-gen.log # per-image OK/FAIL
```
Speed: ~4 min/image on CPU (the 2GB of VRAM is too small to run the model on the GPU). 28 images take about 1h50m.
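The batch estimate is simple multiplication and can be recomputed for other image counts. This is a small sketch; the function name is hypothetical and the 4 min/image default is the rough CPU figure quoted above, not a measured constant.

```python
def estimate_runtime(images: int, minutes_per_image: float = 4.0) -> str:
    """Format the expected wall-clock time for a batch as 'XhYYm'."""
    total_minutes = round(images * minutes_per_image)
    hours, minutes = divmod(total_minutes, 60)
    return f"{hours}h{minutes:02d}m"
```

For 28 images this returns `1h52m`, in line with the rounded figure above.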
## After generation
```bash
python3 tools/convert-to-webp.py # resize + convert to WebP
find assets/images -type f -name '*.jpg' -delete # delete source JPGs at any depth (bash ** needs globstar)
docker compose build --no-cache web # bake WebP into image
docker compose up -d
```
Verify:
```bash
curl -s -o /dev/null -w "%{http_code}" http://localhost:{port}/assets/images/hero/hero-carpet-cleaning.webp
# must return 200
```
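The `curl` check covers one image. To check every converted file, the same request can be looped over the `assets/images` tree. The sketch below assumes the container serves files at the same relative paths they have in the repository, as the `curl` URL suggests; the function names are hypothetical and the base URL must be filled in with the site's real port.

```python
from pathlib import Path
import urllib.error
import urllib.request


def asset_urls(site_root: Path, base_url: str) -> list[str]:
    """Map every .webp under assets/images to the URL it should be served at."""
    files = sorted((site_root / "assets" / "images").rglob("*.webp"))
    return [f"{base_url}/{f.relative_to(site_root).as_posix()}" for f in files]


def http_status(url: str) -> int:
    """Return the HTTP status code for a GET request; 0 if the server is unreachable."""
    try:
        with urllib.request.urlopen(url, timeout=10) as resp:
            return resp.status
    except urllib.error.HTTPError as err:
        return err.code   # server answered, but with an error status such as 404
    except OSError:
        return 0          # connection refused, timeout, DNS failure, etc.
```

Run from the site directory, `[u for u in asset_urls(Path("."), base) if http_status(u) != 200]` lists any image that is not being served, where `base` is `http://localhost:` followed by the site's port.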