# STACK — Architecture, Deployment, and Build Pipeline

Author: Andre Cobham / Arising Media

Updated: 2026-06-09

## Stack Philosophy

Two primary stacks. Pick based on page count and update frequency.

### Stack A — PHP Router + SQLite (50+ pages, standard as of 2026-05-21)

- **PHP Router** — `router.php` dispatches every content URL to the correct PHP template. Edit one template = entire page class updates on next request. No find-and-replace. No file edits.
- **SQLite** — single-file content DB. `pages.sqlite` holds all page content (title, meta, sections JSON, schema). 10,000 rows = 5MB. Sub-millisecond reads. No server process.
- **Vanilla JavaScript** — no frameworks. `fetch`, `IntersectionObserver`, `querySelector`
- **Plain CSS** — `tokens.css` (design tokens) + `main.css` (components). No Sass, no Tailwind
- **Docker + nginx** — nginx routes `/assets/*` directly; all content URLs → PHP-FPM → router.php
- **Resend** — transactional email via `/api/contact.php`
- **Reference:** `arisingmedia.us` — 10,000+ pages

### Stack B — Static HTML (fewer than 50 pages)

- **Static HTML** — every page is a `.html` file on disk
- Same JS, CSS, Docker, nginx, Resend as Stack A
- Python 3 stdlib for build scripts (no pip)
- **Reference:** `lahrcarpetcleaning.com`

### Never Use (Both Stacks)

- Node.js / npm packages on the website. Front-end JS uses ZERO packages
- WordPress for new builds (we migrate clients OUT of WordPress)
- CSS frameworks (Bootstrap, Tailwind, Bulma)
- JS frameworks (React, Vue, Angular, Svelte)
- jQuery, Lodash, Moment, axios, or any utility library
- CSS-in-JS, styled-components
- Build tools that require `node_modules` (webpack, vite, parcel, esbuild)
- Tracking pixels other than what the client explicitly requests

### Why This Stack

1. **Performance** — a static HTML page with vanilla JS loads in <100ms with no parse cost from frameworks
2. **Longevity** — no dependency rot. A site we build today still works in 10 years with no maintenance
3. **Security** — no `npm audit` warnings, no supply-chain attack vectors, no transitive deps to patch
4. **Auditability** — every line on the site is something we wrote and can read in plain text
5. **Hosting** — a static folder + tiny Python container fits in the smallest VM tier any provider sells

### When to Add a Server-Side Service

Static-only is the default. Add a small Python service ONLY when needed for:

- Form submission (handled via Resend in the stdlib HTTP server pattern)
- A specific dynamic feature the client paid for (e.g., booking widget, AI chat)

Each service is its own Docker container. Keep them small (single file when possible). Use Python `http.server` + `urllib` from stdlib. Do not introduce Flask, FastAPI, Django, or any third-party HTTP framework.

---

## Project Structure

Two folders per project: source and deployment.

### Source Folder

Lives in the dev tree under `concept-agent/projects/{domain}/site/`. Contains everything needed to maintain and rebuild the site.

```
{domain}/site/
├── index.html  # home page
├── about/index.html  # /about/
├── contact/index.html  # /contact/
├── reviews/index.html  # /reviews/
├── blog/index.html  # /blog/
├── locations/  # location pages
│   ├── index.html  # /locations/
│   ├── _template.html  # template stamped with JSON
│   ├── buffalo.html  # generated, flat URL
│   ├── amherst.html
│   └── ...
├── services/
│   ├── index.html
│   ├── _template.html
│   ├── floor-refinishing.html
│   └── ...
├── components/
│   ├── header.html  # loaded via fetch() by components.js
│   └── footer.html
├── data/
│   ├── locations.json  # source data for build_locations.py
│   └── services.json  # source data for build_services.py
├── assets/
│   ├── css/
│   │   ├── main.css  # variables, reset, layout
│   │   └── components.css  # cards, hero, header, footer, nav, responsive
│   ├── js/
│   │   ├── main.js  # scroll animations, count-up, etc.
│   │   ├── components.js  # fetch + inject header/footer
│   │   └── form.js  # form validation + submit
│   ├── images/
│   ├── videos/  # hero video files (.mp4 + .webm)
│   └── fonts/  # only if not using Google Fonts CDN
├── build_locations.py  # JSON → flat .html stamping
├── build_services.py
└── README.md  # project notes, content sources, status
```

### Deployment Folder

Lives at `/home/sirdrez/arisingmedia-websites/{domain}/`. Contains ONLY what's needed to run `docker compose up`.

```
{domain}/
├── index.html  # all public website folders
├── about/  # ↑
├── assets/  # ↑
├── blog/  # ↑
├── components/  # ↑
├── contact/  # ↑
├── locations/  # ↑
├── reviews/  # ↑
├── services/  # ↑
├── api/  # form-submit Python service (if used)
│   ├── server.py
│   ├── Dockerfile
│   ├── .env  # gitignored — Resend key, etc.
│   └── .env.example
├── Dockerfile  # nginx web container
├── nginx.conf
├── docker-compose.yml
├── .dockerignore
├── .gitignore
└── .planning/  # everything not needed at runtime
    ├── build_locations.py  # build scripts moved here
    ├── data/  # JSON sources moved here
    ├── README.md
    ├── DNS_*.txt  # DNS notes
    └── review_*.png  # design review screenshots
```

### What Goes Where

**Source folder gets** every working file (build scripts, data JSON, screenshots, notes, raw assets). This is the dev/maintenance copy. NOT what gets deployed.

**Deployment folder gets** ONLY the rendered website + the small API service. Build scripts, JSON data, and notes go into `.planning/` to keep root clean and prevent accidental web exposure.

### URL Structure — Two Valid Patterns

#### Pattern A: Flat HTML (default for Docker/nginx projects)

nginx `try_files $uri $uri/ $uri.html =404` serves `/locations/buffalo` and `/locations/buffalo.html`. Canonical form: `/locations/buffalo.html`.
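
The lookup order behind that `try_files` line can be sketched in Python. This is only a simplified model of the three candidates nginx tries, not nginx itself (it ignores the redirect nginx issues for a directory requested without a trailing slash). The function name `resolve` and the in-memory `site` set are assumptions made for illustration.

```python
from typing import Optional


def resolve(uri: str, files: set[str]) -> Optional[str]:
    """Model `try_files $uri $uri/ $uri.html =404` against a set of file paths.

    `files` holds web-root-relative paths such as "/locations/buffalo.html".
    Returns the file that would be served, or None for a 404.
    """
    # 1. $uri: an exact file match
    if uri in files:
        return uri
    # 2. $uri/: a directory, where the `index index.html` directive applies
    index = uri.rstrip("/") + "/index.html"
    if index in files:
        return index
    # 3. $uri.html: the flat-HTML fallback
    flat = uri + ".html"
    if flat in files:
        return flat
    return None  # =404


# Hypothetical site contents, for illustration only
site = {"/index.html", "/locations/index.html", "/locations/buffalo.html"}

print(resolve("/locations/buffalo", site))       # /locations/buffalo.html
print(resolve("/locations/buffalo.html", site))  # /locations/buffalo.html
print(resolve("/locations/", site))              # /locations/index.html
print(resolve("/locations/rochester", site))     # None
```

Both spellings of the Buffalo URL land on the same file, which is why a single canonical form has to be declared.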

Why flat:

- One file = one page, no `/index.html` confusion
- Easier sitemap generation
- `<a href>` links are unambiguous
- Crawl budget benefit — Google indexes one URL per page, not two

#### Pattern B: Directory-style (default for cPanel/Apache projects)

Each page lives at `{slug}/index.html`. Apache auto-serves `index.html` when visiting `/{slug}/`. Use this when deploying to cPanel shared hosting.

```
services/
├── carpet-cleaning/index.html  → /services/carpet-cleaning/
├── stairs/index.html  → /services/stairs/
commercial/
├── offices/index.html  → /commercial/offices/
└── vacation-rentals/index.html  → /commercial/vacation-rentals/
```

### Lahrcarpetcleaning.com Reference (Directory-Style, cPanel)

```
lahrcarpetcleaning.com/
├── index.html
├── about/index.html
├── contact/index.html
├── reviews/index.html
├── service-area/index.html
├── locations/
│   ├── index.html
│   ├── waterloo-ny/index.html
│   ├── geneva-ny/index.html
│   └── ... (20 location pages)
├── services/
│   ├── carpet-cleaning/index.html
│   ├── stairs/index.html
│   ├── upholstery/index.html
│   ├── floors/index.html
│   ├── area-rugs/index.html
│   ├── add-ons/index.html
│   └── commercial/index.html
├── commercial/
│   ├── offices/index.html
│   ├── vacation-rentals/index.html
│   ├── hotels-inns/index.html
│   ├── retail-showrooms/index.html
│   └── property-management/index.html
├── assets/
│   ├── css/styles.css?v=N  ← always cache-bust on change
│   ├── js/
│   │   ├── main.js
│   │   └── components.js  ← injects nav+footer via innerHTML
│   ├── images/
│   │   ├── hero/  ← hero-{slug}.webp, one per page
│   │   └── services/  ← {service}.webp card images
│   └── videos/hero/hero-reel.mp4
├── tools/  ← NOT deployed to webroot
│   ├── convert-to-webp.py
│   ├── gen-images-flux.py
│   └── gen-hero-images.py
├── .cpanel.yml
├── robots.txt
├── sitemap.xml
├── 404.html
└── 500.html
```

All images are `.webp`. cPanel deployment via `.cpanel.yml`.
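
The "all images are `.webp`" rule is easy to spot-check before a deploy. The sketch below is a stdlib-only assumption, not one of the scripts in `tools/`: the helper name `non_webp_images` and the list of suffixes are hypothetical, and SVG and ICO files are deliberately left alone.

```python
from pathlib import Path

# Raster formats that should already have been converted (assumed list).
RASTER_SUFFIXES = {".jpg", ".jpeg", ".png", ".gif"}


def non_webp_images(images_dir: Path) -> list[Path]:
    """Return raster images under images_dir that are not WebP, sorted by path."""
    return sorted(
        p for p in images_dir.rglob("*")
        if p.is_file() and p.suffix.lower() in RASTER_SUFFIXES
    )


if __name__ == "__main__":
    # Run from the site root; the path matches the tree shown above.
    leftovers = non_webp_images(Path("assets/images"))
    for path in leftovers:
        print(f"not WebP: {path}")
    print(f"{len(leftovers)} file(s) still need conversion")
```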

---

## Build Pipeline

When a site has many similar pages (location pages, service pages, blog posts, team-member pages), use a JSON + template + Python build script.

### When to Use a Build Script

Use it when there are 4+ pages with identical structure differing only in content. For example: 6 location pages where only the city name and city-specific copy differ.

For one-off pages (home, about, contact, services index), hand-write the HTML directly. Build scripts are for repetition, not for everything.

### Pattern

Three files per template family:

1. **`data/{thing}.json`** — array of objects, one per page
2. **`{thing}/_template.html`** — HTML with `{{placeholder}}` markers
3. **`build_{thing}.py`** — stdlib Python, stamps template with data

#### Example: locations.json

```json
[
  {
    "slug": "buffalo",
    "city": "Buffalo",
    "state": "NY",
    "title": "Hardwood Floor Refinishing in Buffalo, NY | Floor It",
    "meta_description": "Professional hardwood floor refinishing...",
    "canonical": "https://floorithardwoodfloors.com/locations/buffalo.html",
    "hero_h1": "Hardwood Floor Refinishing in Buffalo, NY",
    "hero_lead": "Western New York's most experienced...",
    "overview_h2": "Buffalo's Trusted Floor Refinishing Specialists",
    "overview_body_1": "...",
    "overview_body_2": "...",
    "faqs": [
      { "q": "...", "a": "..." }
    ]
  }
]
```

#### Example: _template.html

```html
<title>{{title}}</title>
...
<h1>{{hero_h1}}</h1>
<p>{{hero_lead}}</p>
...
```

#### Example: build_locations.py (skeleton)

```python
"""Build flat .html location pages from data/locations.json + locations/_template.html."""
import json
from pathlib import Path

SITE_ROOT = Path(__file__).parent
DATA_FILE = SITE_ROOT / "data" / "locations.json"
TEMPLATE_FILE = SITE_ROOT / "locations" / "_template.html"
OUT_DIR = SITE_ROOT / "locations"


def render(template: str, item: dict) -> str:
    out = template
    # Stamp every scalar field into its {{key}} marker
    for key, value in item.items():
        if isinstance(value, (str, int, float)):
            out = out.replace("{{" + key + "}}", str(value))
    # Custom rendering for nested arrays (e.g. faqs)
    # ... handle item['faqs'] etc.
    return out


def main():
    data = json.loads(DATA_FILE.read_text(encoding="utf-8"))
    template = TEMPLATE_FILE.read_text(encoding="utf-8")
    print(f"Building {len(data)} location pages...")
    for item in data:
        rendered = render(template, item)
        outfile = OUT_DIR / f"{item['slug']}.html"
        outfile.write_text(rendered, encoding="utf-8")
        print(f"  Built: {outfile.relative_to(SITE_ROOT)}")
    print(f"Done. {len(data)} pages written.")


if __name__ == "__main__":
    main()
```

### Rules

1. **Source of truth is JSON, not HTML.** When content needs to change, edit the JSON and re-run the build script. Never hand-edit a generated `.html` file — the next build will overwrite your changes.
2. **Generated files land in the same folder as their template.** Do not nest into a subfolder. The template file is always named `_template.html` (leading underscore so it sorts above the generated pages).
3. **Build script lives in the SOURCE root**, not in deployment. After running the build, sync the rendered `.html` files (not the script, not the JSON) to deployment.
4. **Verify zero unreplaced placeholders** after every build:

   ```bash
   grep -rn "{{" {thing}/*.html  # should return nothing
   ```

5. **Build is idempotent.** Running it twice produces identical files.
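
Rule 4 can also be enforced from inside the build script, so a bad build fails loudly instead of depending on someone running `grep`. A minimal sketch, assuming placeholders always look like `{{name}}` with word-character names; the helper `unreplaced_placeholders` is hypothetical and not part of the skeleton above.

```python
import re

# Matches {{key}} markers; assumes keys contain only letters, digits, and underscores.
PLACEHOLDER = re.compile(r"\{\{\s*(\w+)\s*\}\}")


def unreplaced_placeholders(html: str) -> list[str]:
    """Return the distinct placeholder names still present, in first-seen order."""
    seen: list[str] = []
    for name in PLACEHOLDER.findall(html):
        if name not in seen:
            seen.append(name)
    return seen


page = "<h1>{{hero_h1}}</h1><p>Serving Buffalo, NY</p><p>{{hero_lead}} {{hero_h1}}</p>"
print(unreplaced_placeholders(page))  # ['hero_h1', 'hero_lead']
```

Calling this on each rendered page before it is written, and exiting non-zero when the list is not empty, would stop a half-stamped page from reaching deployment. Unlike the `grep`, it only reports `{{` sequences that wrap a name.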

### Stamping Rules — Escaping

When a JSON value gets stamped into an HTML attribute or `<title>`, special characters can break the page. Use these rules:

- Plain text in `<p>` or `<h1>`: ampersand-encode (`&` → `&amp;`)
- `<title>` content: ampersand-encode + strip line breaks
- `<meta>` content attribute: encode `&`, `"`, and remove line breaks
- `href` URL attribute: never put user input here, but if needed, urlencode

For our typical use case (controlled content authored by us), the simple `str.replace("{{key}}", value)` is sufficient because we don't have hostile input. Just don't put angle brackets or quotes in the JSON values.

### Re-Running the Build

```bash
cd {project}/site
python3 build_locations.py
python3 build_services.py
```

After build, sync the rendered files to deployment.

---

## WordPress to Static HTML Migration

The playbook for migrating a WordPress (Divi, Elementor, classic, whatever) site to vanilla static HTML.

### Phase 1 — Capture Source

Before touching anything, capture the current site so nothing is lost.

1. **Database dump** — `wp db export ${domain}.sql --add-drop-table`
2. **wp-content snapshot** — tar the entire `wp-content/` (themes, plugins, uploads)
3. **Crawl the live site** — use `wget --mirror --convert-links --adjust-extension --page-requisites --no-parent https://{domain}` to capture rendered HTML + all assets
4. **Inventory pages** — list every URL returning 200 (use the sitemap if it has one)
5. **Inventory forms** — note every Gravity Form / Contact Form 7 / etc. field-by-field
6. **Inventory dynamic features** — search, comments, members, anything truly dynamic

Save all of this in the project's `.planning/` folder.

### Phase 2 — Decide What to Keep

Re-design pass.
Most WP sites have:

- Bloated copy → cut by 30-50%
- Outdated/inflated metrics → remove or replace with real, verifiable data
- Stock photos → replace with real client photos when available
- Cluttered layouts → strip back to one clear CTA per section
- Plugin features the client never uses → drop entirely

Show the client a wireframe of the simplified structure before building anything.

### Phase 3 — Information Architecture

Standard structure for a small business:

```
/                        home
/about/                  about / story / team
/services/               services index
/services/{slug}.html    one detail page per service
/locations/              locations index
/locations/{city}.html   one detail page per service area (SEO gold)
/reviews/                customer reviews
/contact/                contact + form
/blog/                   optional blog index
```

For each location and each service: one flat `.html` page generated from JSON + template.

### Phase 4 — Build

1. Set up source folder per `01-project-structure.md`
2. Write `assets/css/main.css` (variables, reset, typography, layout)
3. Write `assets/css/components.css` (header, footer, hero, cards, forms)
4. Write `components/header.html` and `components/footer.html`
5. Write `assets/js/components.js` (fetch + inject header/footer)
6. Write `assets/js/main.js` (scroll animations, anything page-wide)
7. Build `index.html` first — this is the design system in working form
8. Generate location and service detail pages from JSON
9. Build remaining pages: about, contact, reviews, blog index

### Phase 5 — Forms

If the WP site had Gravity Forms or similar, build a vanilla replacement:

- HTML form in `contact/index.html` (and inline on service/location pages if needed)
- Client-side validation in `assets/js/form.js`
- POST to `/api/estimate` (or similar) handled by Python stdlib service
- Server-side validation, reCAPTCHA verification, send via Resend

### Phase 6 — SEO Parity

Before launch, every old URL must either:

- Have a matching new URL with the same or better content, OR
- 301-redirect to a relevant new URL

Build a redirect map from the old WP sitemap. Add to `nginx.conf`:

```nginx
location = /old-page-slug { return 301 /new-slug.html; }

# location blocks match the path only, never the query string,
# so old ?p=123 permalinks are matched on $arg_p at server level
if ($arg_p = 123) { return 301 /about/; }
```

Per-page parity checklist:

- `<title>` matches or improves on the WP title
- `<meta name="description">` matches or improves
- `<link rel="canonical">` is set to the new URL
- Headings (h1, h2, h3) preserve the topical structure
- Internal links updated to new URLs
- Image alt text preserved or improved
- Schema.org JSON-LD added (`LocalBusiness`, `Service`, `BreadcrumbList`)

### Phase 7 — Switch DNS / Cutover

1. Deploy the static site to a separate URL first (`new.{domain}`) for client review
2. Once approved, point production DNS to the new container
3. Keep the WP container running for 14 days as fallback
4. Submit new sitemap to Google Search Console
5. Use Search Console URL inspection on 5-10 key pages to confirm indexing

### Phase 8 — Post-Launch

- Monitor Search Console for crawl errors / 404s, fix in nginx as redirects
- Monitor form submissions — first real lead through the new form is the ultimate "it works" check
- Decommission WP only after 30 days of clean operation

### What NOT to Do

- Do not run a "headless WordPress" or "WordPress as API" — that defeats the whole point. Static means static.
- Do not use a static-site-generator tool (Hugo, 11ty, Jekyll, Astro, Next.js static export).
  We hand-write HTML and use small Python build scripts only where data is repeated.
- Do not migrate the database. Content gets re-written cleaner during migration.

---

## WP + Divi to AM HTML Pipeline Overview

End-to-end playbook for converting a WordPress / Divi site backup (.wpress) into an Arising Media vanilla HTML + vanilla JS deployment.

### What This Pipeline Does

Takes a single `.wpress` archive (All-in-One WP Migration backup) and produces:

- A fully structured `src/` directory matching AM project layout
- A CSS design system derived from the original Divi theme settings
- All page content extracted, cleaned, and re-authored into AM HTML templates
- All media migrated to WebP and remapped to `/assets/images/`
- SEO metadata (titles, descriptions, canonicals, schema.org) preserved or improved
- Docker-ready deployment with nginx + PHP contact form

### Philosophy

The goal is NOT a 1:1 copy. The goal is:

1. Preserve all content, SEO equity, and brand identity
2. ENHANCE the design — cleaner, faster, more modern
3. Remove all WordPress / Divi bloat (plugin CSS, shortcode residue, 300KB JS bundles)
4. Produce a site that loads in <2s on mobile and scores 95+ on Lighthouse

Every migration is a design upgrade. The Divi site is the reference, not the target.

### Divi Version Matters

Two distinct extraction paths:

| Version | Content Storage | How to detect |
|---------|----------------|---------------|
| Divi 4 | `[et_pb_section]` shortcodes in `wp_posts.post_content` | `post_content` contains `[et_pb_` |
| Divi 5 | Gutenberg blocks (`<!-- wp:divi/section -->`) + JSON in `wp_postmeta` | `post_content` contains `<!-- wp:divi/` |

Run Phase 2 (database analysis) first to determine which version before choosing the extraction path.
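
The detection column of the table above reduces to two substring checks. The sketch below is illustrative only and is not the logic inside `analyze_db.py`; the function name `detect_divi_version` is hypothetical, and checking Divi 5 first is an assumption about how mixed content should be treated.

```python
from typing import Optional


def detect_divi_version(post_content: str) -> Optional[str]:
    """Return "5", "4", or None using the markers from the detection table."""
    # Divi 5 is checked first on the assumption that an upgraded page may
    # still carry leftover Divi 4 shortcodes.
    if "<!-- wp:divi/" in post_content:
        return "5"
    if "[et_pb_" in post_content:
        return "4"
    return None


print(detect_divi_version('[et_pb_section fb_built="1"][/et_pb_section]'))       # 4
print(detect_divi_version("<!-- wp:divi/section --><!-- /wp:divi/section -->"))  # 5
print(detect_divi_version("<p>Classic editor content</p>"))                      # None
```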

### Pipeline Phases

```
Phase 0   Setup           Verify .wpress location, create extraction directory
Phase 1   Extract         Unpack .wpress binary archive to wpress-extract/
Phase 2   DB Analysis     Inspect WordPress database dump, detect Divi version, inventory pages
Phase 3   Content         Extract page content via Divi 4 or Divi 5 path
Phase 4   Design System   Pull colors, fonts, spacing from wp_options → CSS custom properties
Phase 5   Media           Catalog uploads/, convert to WebP, generate image manifest
Phase 6   Build HTML      Map extracted content to AM templates, generate JSON data files
Phase 7   SEO             Port titles, metas, canonicals, schema.org; build redirect map
Phase 8   Forms           Replace Gravity Forms / CF7 with AM vanilla form + Python API
Phase 9   QA              Lighthouse audit, grep for unreplaced placeholders, protection check
```

### Script Reference

All scripts live in `.am-webdesign-sops/wp-divi-pipeline/scripts/`.

| Script | Phase | Purpose |
|--------|-------|---------|
| `extract_wpress.py` | 1 | Unpack .wpress binary archive |
| `analyze_db.py` | 2 | Parse SQL dump, inventory pages + detect Divi version |
| `extract_divi4.py` | 3 | Parse et_pb_ shortcodes → structured content JSON |
| `extract_divi5.py` | 3 | Parse Gutenberg/Divi5 blocks → structured content JSON |
| `extract_design.py` | 4 | Pull Divi theme options → design-system.json |
| `extract_media.py` | 5 | Catalog uploads/, emit media-manifest.json |
| `convert_images.py` | 5 | Batch convert images → WebP |
| `run_pipeline.sh` | 0-7 | Master script — runs all phases in order |

### Per-Project Working Directory

```
{domain}/
└── .planning/
    ├── vibrantyou-yoga-YYYYMMDD-*.wpress  ← source archive (never modify)
    ├── wpress-extract/  ← Phase 1 output (gitignored)
    │   ├── package.json  ← archive metadata
    │   ├── database.sql  ← MySQL dump
    │   └── uploads/  ← all media (NOT in wp-content/)
    ├── data/
    │   ├── pages.json  ← Phase 2 output
    │   ├── design-system.json  ← Phase 4 output
    │   └── media-manifest.json  ← Phase 5 output
    └── scripts/  ← project-specific overrides if needed
```

### .wpress Extraction Details

The `.wpress` binary format is NOT a standard zip or tar. It is a custom sequential binary format:

```
[HEADER 4377 bytes]
[FILE DATA n bytes]
[HEADER]
[FILE DATA]
...
```

Header breakdown:

```
Offset  Length  Field
0       255     Filename (null-padded)
255     14      File size in bytes (ASCII decimal, null-padded)
269     12      mtime unix timestamp (ASCII decimal, null-padded)
281     4096    Relative path (null-padded)
4377    n       Raw file bytes (size from header)
```

The archive ends when a header of all null bytes is encountered, or EOF.

Extraction script:

```bash
python3 /home/sirdrez/arisingmedia-websites/.am-webdesign-sops/wp-divi-pipeline/scripts/extract_wpress.py \
  /home/sirdrez/arisingmedia-websites/{domain}/.planning/{file}.wpress \
  /home/sirdrez/arisingmedia-websites/{domain}/.planning/wpress-extract/
```

### Database Analysis

Parse the WordPress MySQL dump to inventory pages, detect Divi version, extract design settings, and build the data JSON files.

```bash
python3 /home/sirdrez/arisingmedia-websites/.am-webdesign-sops/wp-divi-pipeline/scripts/analyze_db.py \
  {domain}/.planning/wpress-extract/ \
  {domain}/.planning/data/
```

Outputs three files into `.planning/data/`:

- `pages.json` — all published pages/posts with content and SEO meta
- `design-system.json` — colors, fonts, Divi settings
- `site-info.json` — domain, plugin list, WP version, Divi version

### Divi 5 Content Extraction

Parse raw Divi page content from `pages.json` into clean, structured HTML sections ready to map into AM templates.

```bash
python3 /home/sirdrez/arisingmedia-websites/.am-webdesign-sops/wp-divi-pipeline/scripts/extract_divi5.py \
  {domain}/.planning/data/pages.json \
  {domain}/.planning/data/content/
```

Produces one JSON file per page: `content/{slug}.json`

Key fields in page JSON:

- `slug`: page URL slug
- `title`: page title
- `seo_title`: SEO title (from Rank Math if available)
- `seo_description`: SEO description (from Rank Math if available)
- `sections`: array of content sections with type, background_color, and modules

Map each Divi module type to AM component:

| Divi module | Extract | Map to AM element |
|-------------|---------|-------------------|
| `divi/text` | inner HTML | `<section>`, `<p>`, headings as-is |
| `divi/button` | `text`, `url` | `<a class="btn-primary">` |
| `divi/image` | `src`, `alt`, `title` | `<img>` → rewrite to WebP path |
| `divi/blurb` | icon, title, body | `.am-card` component |
| `divi/testimonial` | quote, author, company | `.am-testimonial` component |
| `divi/video` | `src`, poster | `<video>` or YouTube embed |
| `divi/contact_form` | field list | → replace with AM form |
| `divi/accordion` | Q+A pairs | `<details><summary>` |
| `divi/fullwidth_header` | title, subhead, CTA | hero section |

Strip Divi class/attribute noise using `clean_divi_html()` from `divi_to_html.py`:

```python
from divi_to_html import clean_divi_html, rewrite_internal_links

cleaned = clean_divi_html(raw_html)
cleaned = rewrite_internal_links(cleaned, staging_hosts=("vibrantyou.yoga",))
```

### Design System Extraction

Convert Divi theme settings into AM CSS custom properties.

Input: `design-system.json` produced by `analyze_db.py` with fields:

- `primary_color`: main brand color
- `body_font`: font family name
- `header_font`: heading font name
- `body_font_size`: base font size in px
- `body_line_height`: line height ratio
- `divi_version`: "4" or "5"
- `wp_version`: WordPress version
- `site_url`: domain
- `site_name`: brand name

Never lift the Divi palette 1:1.
Use extracted colors as the base and build a full 5-step scale around the primary hue: ```css :root { --color-primary: {extracted-color}; --color-primary-dark: {darken-by-15%}; --color-primary-light: {lighten-by-40%}; --color-surface: #fafafa; --color-surface-alt: #f0f7f6; --color-text: #1a1a1a; --color-text-muted: #5a6e6b; --color-border: #c8dedd; --color-white: #ffffff; /* Fonts */ --font-body: '{body-font}', system-ui, sans-serif; --font-heading: '{header-font}', Georgia, serif; /* Modular scale (1.25 ratio) */ --text-xs: 0.75rem; --text-sm: 0.875rem; --text-base: 1rem; --text-lg: 1.125rem; --text-xl: 1.25rem; --text-2xl: 1.5rem; --text-3xl: 1.875rem; --text-4xl: 2.25rem; --text-5xl: 3rem; --text-6xl: 3.75rem; /* Spacing scale */ --space-1: 0.25rem; --space-2: 0.5rem; --space-3: 0.75rem; --space-4: 1rem; --space-5: 1.25rem; --space-6: 1.5rem; --space-8: 2rem; --space-10: 2.5rem; --space-12: 3rem; --space-16: 4rem; --space-20: 5rem; --space-24: 6rem; --space-32: 8rem; } ``` ### Content Migration Map extracted Divi content into AM HTML templates. Build order: 1. `src/assets/css/main.css` — design tokens, reset, typography, layout grid 2. `src/assets/css/components.css` — header, footer, hero, cards, forms, nav 3. `src/components/header.html` — navigation 4. `src/components/footer.html` — footer links, contact info 5. `src/assets/js/components.js` — fetch + inject header/footer 6. `src/assets/js/main.js` — scroll animations, intersection observer 7. `src/index.html` — home page (this IS the design system in working form) 8. Remaining pages: about, classes, contact, blog 9. 
`src/robots.txt`, `src/sitemap.xml`, `src/404.html`, `src/500.html` For 4+ similar pages (class types, locations), use JSON template build: ``` src/classes/ ├── _template.html ← class detail page template ├── hatha.html ← generated from classes.json ├── vinyasa.html └── yin.html .planning/data/ └── classes.json ← array of class objects ``` ### Media Assets Migrate WordPress uploads to AM `/assets/images/`, convert to WebP, and generate a media manifest for URL remapping. Steps: 1. Catalog all original media (skip WordPress-generated size variants like `-150x150`) 2. Copy originals to `src/assets/images/` 3. Convert to WebP using `cwebp` or Python Pillow 4. Generate media manifest with old → new URL mapping 5. Apply manifest during HTML build to rewrite all image paths ```bash # Catalog originals (skip WP size variants) find .planning/wpress-extract/uploads -type f \( -name "*.jpg" -o -name "*.png" \) | \ grep -v -E "\-[0-9]+x[0-9]+\.(jpg|png)$" > .planning/data/media-originals.txt # Copy and convert while IFS= read -r src; do cp "$src" "src/assets/images/$(basename $src)" done < .planning/data/media-originals.txt cd src/assets/images/ for img in *.jpg *.png; do [ -f "$img" ] || continue cwebp -q 82 "$img" -o "${img%.*}.webp" && rm "$img" done ``` Remap URLs during HTML build: ```python import json, re manifest = json.loads(open('.planning/data/media-manifest.json').read()) url_map = {m['wp_url']: m['am_url'] for m in manifest} def rewrite_media_urls(html: str) -> str: for wp_url, am_url in url_map.items(): html = html.replace(wp_url, am_url) return html ``` ### SEO Preservation Before building HTML, map every WordPress page URL to its new AM URL and ensure title, description, canonical, and schema.org are preserved or improved. Rank Math SEO extraction (already in `pages.json` as `seo_title` and `seo_description`). Priority order for SEO fields: 1. `seo_title` from Rank Math (if not empty and not a template) 2. 
`post_title` with AM format appended: `{Title} | {Brand Name}` 3. Never leave title as the raw WP default Rank Math title templates use `%` tokens — strip them and rebuild: ```python import re def clean_rm_title(rm_title: str, post_title: str, site_name: str) -> str: if not rm_title or "%" in rm_title: return f"{post_title} | {site_name}" return rm_title def clean_rm_desc(rm_desc: str) -> str: return re.sub(r"%[a-z_]+%", "", rm_desc).strip(" -|") ``` Schema.org by page type: | Page | Schema type | Required fields | |------|------------|----------------| | Home | `LocalBusiness` | name, url, telephone, address, areaServed, openingHours | | About | `AboutPage` + `Organization` | name, description, founders | | Contact | `ContactPage` | name, url, telephone, email, address | | Blog post | `Article` | headline, datePublished, author, image | Pre-launch SEO audit (all must return empty): ```bash SITE=src # Every page has title/description/canonical/JSON-LD find $SITE -name "*.html" | xargs grep -L '<title>' find $SITE -name "*.html" | xargs grep -L 'name="description"' find $SITE -name "*.html" | xargs grep -L 'rel="canonical"' find $SITE -name "*.html" | xargs grep -L 'application/ld+json' # No WP URLs leaked grep -r "wp-content\|wp-admin\|?p=\|?page_id=" $SITE --include="*.html" # No unreplaced placeholders grep -r "{{" $SITE --include="*.html" # No Divi residue grep -r "et_pb_\|wp:divi" $SITE --include="*.html" ``` ### Run Order (Complete Execution Sequence) ```bash export DOMAIN="vibrantyou.yoga" export PROJECT="/home/sirdrez/arisingmedia-websites/$DOMAIN" export SOPS="/home/sirdrez/arisingmedia-websites/.am-webdesign-sops" export WPRESS=$(ls $PROJECT/.planning/*.wpress | head -1) # Phase 0: Setup mkdir -p $PROJECT/{src/{about,services,contact,blog,classes,components,assets/{css,js,images,svg,fonts}},build,infra,api,.planning/{data/{content},scripts,wpress-extract}} # Phase 1: Extract archive python3 $SOPS/wp-divi-pipeline/scripts/extract_wpress.py "$WPRESS" 
"$PROJECT/.planning/wpress-extract/" # Phase 2: Database analysis python3 $SOPS/wp-divi-pipeline/scripts/analyze_db.py "$PROJECT/.planning/wpress-extract/" "$PROJECT/.planning/data/" # Phase 3: Content extraction (Divi 5 example) python3 $SOPS/wp-divi-pipeline/scripts/extract_divi5.py "$PROJECT/.planning/data/pages.json" "$PROJECT/.planning/data/content/" # Phase 4: Design system (manual — read design-system.json, write main.css) # Phase 5: Media migration find $PROJECT/.planning/wpress-extract/uploads -type f \( -name "*.jpg" -o -name "*.png" \) | \ grep -v -E "\-[0-9]+x[0-9]+\.(jpg|png)$" > $PROJECT/.planning/data/media-originals.txt while IFS= read -r src; do cp "$src" "$PROJECT/src/assets/images/$(basename $src)" done < $PROJECT/.planning/data/media-originals.txt cd $PROJECT/src/assets/images/ for img in *.jpg *.png; do [ -f "$img" ] || continue cwebp -q 82 "$img" -o "${img%.*}.webp" && rm "$img" done # Phase 6: Build HTML (manual — per 05-content-migration.md) # Phase 7: SEO audit cd $PROJECT/src find . -name "*.html" | grep -v "_template" | xargs grep -L '<title>' find . -name "*.html" | grep -v "_template" | xargs grep -L 'rel="canonical"' # Phase 8: Docker setup docker compose -f $PROJECT/docker-compose.yml build docker compose -f $PROJECT/docker-compose.yml up -d curl -I http://localhost:PORT/ # Phase 9: Protection check bash $SOPS/tools/verify-protection.sh https://$DOMAIN ``` --- ## Docker + Nginx Deployment Every project ships with ALL deployment configs so it can go to either a Docker VPS or a cPanel shared host without refactoring. ### docker-compose.yml ```yaml services: web: image: {domain}-static build: context: . 
      dockerfile: Dockerfile
    ports:
      - "{port}:80"
    depends_on:
      api:
        condition: service_healthy
    restart: unless-stopped

  api:
    image: {domain}-api
    build:
      context: ./api
      dockerfile: Dockerfile
    env_file: ./api/.env
    expose:
      - "3001"
    healthcheck:
      test: ["CMD", "python3", "-c", "import urllib.request,sys; sys.exit(0 if urllib.request.urlopen('http://localhost:3001/health',timeout=3).status==200 else 1)"]
      interval: 10s
      timeout: 5s
      retries: 3
    restart: unless-stopped
```

Port assignments are unique per project. Track in `/home/sirdrez/arisingmedia-websites/PORTS.md` so no two projects collide.

### Dockerfile (nginx web container)

CRITICAL — the Dockerfile must explicitly list which folders to copy. Never use `COPY . /usr/share/nginx/html/` because that copies `.env`, `Dockerfile`, build scripts, etc. into the web root where they become URL-accessible.

```dockerfile
FROM nginx:alpine

# nginx config — server-only, never served as a static file
COPY nginx.conf /etc/nginx/conf.d/default.conf

# Public website only — explicit list, no wildcards
COPY index.html /usr/share/nginx/html/
COPY assets /usr/share/nginx/html/assets/
COPY components /usr/share/nginx/html/components/
COPY about /usr/share/nginx/html/about/
COPY blog /usr/share/nginx/html/blog/
COPY contact /usr/share/nginx/html/contact/
COPY locations /usr/share/nginx/html/locations/
COPY reviews /usr/share/nginx/html/reviews/
COPY services /usr/share/nginx/html/services/

EXPOSE 80
```

### Dockerfile (api Python container)

```dockerfile
FROM python:3.13-alpine
WORKDIR /app
COPY server.py .
EXPOSE 3001
CMD ["python3", "-u", "server.py"]
```

No pip, no requirements.txt, no node_modules. Python stdlib only.

### nginx.conf

```nginx
server {
    listen 80;
    server_name _;
    root /usr/share/nginx/html;
    index index.html;

    # Defense in depth — deny dotfiles, configs, scripts, source files
    location ~ /\. { deny all; return 404; }
    location ~* \.(env|env\.example|conf|yml|yaml|py|pyc|md|txt|sh|sql|log|bak|old|swp|dockerfile)$ { deny all; return 404; }
    location = /Dockerfile { deny all; return 404; }

    # API proxy — strip /api/ prefix, forward to Python service
    location /api/ {
        proxy_pass http://api:3001/;
        proxy_http_version 1.1;
        proxy_set_header Host $host;
        proxy_set_header X-Real-IP $remote_addr;
        proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;
        proxy_read_timeout 10s;
        proxy_connect_timeout 5s;
    }

    # Flat HTML routing — /locations/buffalo serves /locations/buffalo.html
    location / {
        try_files $uri $uri/ $uri.html =404;
    }

    # Cache static assets aggressively
    location ~* \.(jpg|jpeg|png|webp|svg|ico|css|js|woff2?|mp4|webm)$ {
        expires 30d;
        add_header Cache-Control "public, immutable";
        access_log off;
    }

    # Security headers
    add_header X-Frame-Options "SAMEORIGIN";
    add_header X-Content-Type-Options "nosniff";
    add_header X-XSS-Protection "1; mode=block";
    add_header Referrer-Policy "strict-origin-when-cross-origin";
    add_header Permissions-Policy "geolocation=(), microphone=(), camera=()";
    add_header Content-Security-Policy "default-src 'self'; script-src 'self' 'unsafe-inline' https://www.google.com https://www.gstatic.com https://www.recaptcha.net; style-src 'self' 'unsafe-inline' https://fonts.googleapis.com; font-src https://fonts.gstatic.com; img-src 'self' data: https:; object-src 'none'; frame-ancestors 'self'; form-action 'self'; base-uri 'self';";
    add_header Strict-Transport-Security "max-age=31536000; includeSubDomains; preload";

    # Disable server tokens
    server_tokens off;

    client_max_body_size 16k;

    gzip on;
    gzip_types text/html text/css application/javascript image/svg+xml;
    gzip_min_length 1024;

    error_page 404 /404.html;
    error_page 500 /500.html;
}
```

### .dockerignore

Keeps sensitive files out of the build context:

```
.git
.gitignore
.dockerignore
api
build_*.py
__pycache__
*.pyc
*.md
*.txt
review_*.png
docker-compose.yml
.DS_Store
.planning
```

### .gitignore
```
api/.env
api/__pycache__/
__pycache__/
*.pyc
*.log
.DS_Store
```

The `api/.env` file is NEVER committed.

### Sync from Source to Deployment

After every change to source HTML/CSS/JS/assets:

```bash
SITE="/path/to/concept-agent/projects/{domain}/site"
DEPLOY="/home/sirdrez/arisingmedia-websites/{domain}"

rsync -a \
  --exclude=.git --exclude=.planning --exclude=api \
  --exclude=Dockerfile --exclude=nginx.conf --exclude=docker-compose.yml \
  --exclude=.dockerignore --exclude=.gitignore \
  --exclude='build_*.py' --exclude=__pycache__ --exclude=data \
  --exclude='*.md' --exclude='*.txt' --exclude='review_*.png' \
  "$SITE/" "$DEPLOY/"

cd "$DEPLOY"
docker compose up -d --build web
```

### Verify After Deploy

Every deploy MUST be audited with `tools/verify-protection.sh` before being considered live. The script probes a fixed list of sensitive paths (`Dockerfile`, `.env`, `nginx.conf`, `.planning/`, `__pycache__/`, build scripts, `.git/`, etc.) and fails if any returns 200.

```bash
~/arisingmedia-websites/.am-webdesign-sops/tools/verify-protection.sh \
  http://localhost:{port}
```

Exit codes:

- `0` PASS — every sensitive path 404, every required path reachable.
- `0` PASS (with warnings) — protection clean but `/robots.txt` or `/sitemap.xml` missing (content gap, not a leak).
- `1` FAIL — at least one sensitive path returned 200, or `/` is unreachable.

Run it manually after every `docker compose up -d --build`. Wire it into CI once the site has a remote pipeline. Treat a FAIL as a deploy rollback.

For ad-hoc spot checks:

```bash
curl -s -o /dev/null -w "site: %{http_code}\n" http://localhost:{port}/
curl -s -o /dev/null -w "css: %{http_code}\n" http://localhost:{port}/assets/css/main.css
curl -s -o /dev/null -w "api: %{http_code}\n" http://localhost:{port}/api/health
```

All public paths return 200. All sensitive paths return 404.

### Project Folder Rename Procedure

WHY: Docker Compose derives its project name from the folder the `docker-compose.yml` lives in.
Renaming the folder changes the compose project name, which orphans any running containers under the old name. The fix is to explicitly remove the old container before bringing up the new compose project:

```bash
# Stop and remove the old container by its known name
docker stop {container-name}
docker rm {container-name}

# Now bring up from the renamed folder — clean start
docker compose -f /path/to/renamed-folder/docker-compose.yml up -d
```

Always confirm the env vars loaded correctly after restart:

```bash
docker exec {container-name} env | grep RESEND
```

---

## cPanel + Apache Deployment

Use this deployment method when the client's host is cPanel-based (shared hosting, WHM, Bluehost, HostGator, SiteGround, etc.) instead of a VPS running Docker.

### Key Rule: Repo Path ≠ Webroot

cPanel Git requires an EMPTY directory as the repository path. The webroot (`public_html/{domain}/`) is never the repo path — cPanel rejects it if it already contains files.

```
Repo path (empty dir):   /home/{username}/repositories/{domain}/
Deploy target (webroot): /home/{username}/public_html/{domain}/
```

### Setting Up the Repo in cPanel

1. cPanel → Git Version Control → Create Repository
2. Repository Path: `/home/{username}/repositories/{domain}/` (must be empty)
3. Clone URL: your Git remote (GitHub, Bitbucket, etc.)
4. cPanel clones into the repo path — never into the webroot

### .cpanel.yml

This file lives in the repo root and tells cPanel what to copy to the webroot on every push/deploy. All paths are relative to the repo root.
```yaml
---
deployment:
  tasks:
    - export DEPLOYPATH=/home/{username}/public_html/{domain}/
    - /bin/cp -r assets $DEPLOYPATH
    - /bin/cp -r about $DEPLOYPATH
    - /bin/cp -r commercial $DEPLOYPATH
    - /bin/cp -r contact $DEPLOYPATH
    - /bin/cp -r locations $DEPLOYPATH
    - /bin/cp -r reviews $DEPLOYPATH
    - /bin/cp -r service-area $DEPLOYPATH
    - /bin/cp -r services $DEPLOYPATH
    - /bin/cp index.html $DEPLOYPATH
    - /bin/cp 404.html $DEPLOYPATH
    - /bin/cp robots.txt $DEPLOYPATH
    - /bin/cp sitemap.xml $DEPLOYPATH
```

Add or remove folder cp lines to match the project's actual directory structure. Do NOT copy: `tools/`, `*.py`, `*.md`, `.git/`, `docker-compose.yml`, `Dockerfile`.

### Lahrcarpetcleaning.com Reference

```yaml
---
deployment:
  tasks:
    - export DEPLOYPATH=/home/dev1communitypro/public_html/lahrcarpetcleaning.dev1.communityproud.com/
    - /bin/cp -r assets $DEPLOYPATH
    - /bin/cp -r about $DEPLOYPATH
    - /bin/cp -r commercial $DEPLOYPATH
    - /bin/cp -r contact $DEPLOYPATH
    - /bin/cp -r locations $DEPLOYPATH
    - /bin/cp -r reviews $DEPLOYPATH
    - /bin/cp -r service-area $DEPLOYPATH
    - /bin/cp -r services $DEPLOYPATH
    - /bin/cp index.html $DEPLOYPATH
    - /bin/cp 404.html $DEPLOYPATH
    - /bin/cp robots.txt $DEPLOYPATH
    - /bin/cp sitemap.xml $DEPLOYPATH
```

### Deploying After a Push

1. Push to the connected remote (GitHub)
2. cPanel → Git Version Control → Manage → Pull or Deploy
3. cPanel runs the `.cpanel.yml` tasks, copying files to webroot
4. Apache serves from webroot automatically — no nginx, no Docker

### Apache vs nginx

cPanel hosts use Apache (not nginx). There is no nginx.conf to manage.
URL routing is handled by `.htaccess`:

```apache
Options -Indexes
RewriteEngine On

# Directory-style URLs: /services/carpet-cleaning/ → index.html inside that folder
# Apache handles this automatically with DirectoryIndex — no extra rules needed

# Deny sensitive files
<FilesMatch "\.(py|yml|yaml|md|log|sh|env|conf|dockerfile)$">
    Order allow,deny
    Deny from all
</FilesMatch>

# Security headers
<IfModule mod_headers.c>
    Header set X-Frame-Options "SAMEORIGIN"
    Header set X-Content-Type-Options "nosniff"
    Header set X-XSS-Protection "1; mode=block"
    Header set Referrer-Policy "strict-origin-when-cross-origin"
    Header set Permissions-Policy "geolocation=(), microphone=(), camera=()"
    Header set Strict-Transport-Security "max-age=31536000; includeSubDomains"
</IfModule>

ErrorDocument 404 /404.html
ErrorDocument 500 /500.html
```

### Cache Busting on cPanel

Apache does not auto-invalidate cached assets. Bump `?v=N` on CSS/JS in all HTML files after every asset change:

```html
<link rel="stylesheet" href="/assets/css/styles.css?v=6">
<script src="/assets/js/main.js?v=3"></script>
```

Increment by 1 on every change. Apply across ALL HTML pages.

### Verify After cPanel Deploy

```bash
curl -s -o /dev/null -w "home: %{http_code}\n" https://{domain}/
curl -s -o /dev/null -w "css: %{http_code}\n" https://{domain}/assets/css/styles.css
curl -s -o /dev/null -w "404: %{http_code}\n" https://{domain}/page-that-does-not-exist
```

All public paths return 200. All non-existent paths return 404.

### Universal Project Checklist (Both Paths)

Every project must include ALL of these before first deploy:

```
Dockerfile          ✓ Docker/VPS
docker-compose.yml  ✓ Docker/VPS
nginx.conf          ✓ Docker/VPS
.htaccess           ✓ cPanel/Apache
.cpanel.yml         ✓ cPanel Git
.dockerignore       ✓ Docker build security
.gitignore          ✓ keeps .env and secrets out of git
robots.txt          ✓ both paths
sitemap.xml         ✓ both paths
404.html            ✓ both paths
500.html            ✓ both paths
```

Lahrcarpetcleaning.com is the reference implementation for both paths.
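
The Universal Project Checklist above can be checked mechanically before a first deploy. The following is a minimal stdlib-only sketch, not part of the existing tooling: the function name `missing_deploy_files` and the constant `REQUIRED_FILES` are hypothetical, and the sketch assumes every checklist file sits directly in the project root.

```python
from pathlib import Path

# File names taken from the Universal Project Checklist.
# Assumption: all of them live directly in the project root.
REQUIRED_FILES = [
    "Dockerfile",
    "docker-compose.yml",
    "nginx.conf",
    ".htaccess",
    ".cpanel.yml",
    ".dockerignore",
    ".gitignore",
    "robots.txt",
    "sitemap.xml",
    "404.html",
    "500.html",
]


def missing_deploy_files(project_root):
    """Return the checklist entries that are not regular files in project_root."""
    root = Path(project_root)
    return [name for name in REQUIRED_FILES if not (root / name).is_file()]
```

Calling `missing_deploy_files("/path/to/project")` returns an empty list when the checklist is satisfied. Any returned names are the files still to be added.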
---

## Domain, Email, DNS, and Resend

### Resend Account Setup

1. Sign up at https://resend.com
2. Generate an API key (one per project): https://resend.com/api-keys
3. Save the key in the project's `api/.env` as `RESEND_API_KEY=re_xxxx`
4. NEVER commit `.env`. NEVER paste the key in Slack, GitHub, or chat logs.

### Add and Verify the Sending Domain

1. https://resend.com/domains → **Add Domain**
2. Enter the domain (the one you'll send FROM, not necessarily the website domain)
3. Resend gives 3-4 DNS records. Add them all in Cloudflare (or whatever DNS host)
4. Wait 5-15 minutes, click **Verify** in Resend — all records must show green

### Records Resend Provides

| Type | Name | Value | Proxy | TTL |
|------|------|-------|-------|-----|
| TXT | `resend._domainkey` | `p=...long-rsa-key...` | DNS only | 1 hr |
| TXT | `send` | `v=spf1 include:amazonses.com ~all` | DNS only | 1 hr |
| MX | `send` | `feedback-smtp.{region}.amazonses.com` priority 10 | DNS only | 1 hr |

(Resend uses Amazon SES under the hood, hence `amazonses.com` in the SPF.)

### DMARC — REQUIRED for Inbox Placement

Without DMARC, Gmail flags otherwise-correctly-configured email as suspicious and routes it to spam. Resend doesn't auto-create this record. You must add it.

| Type | Name | Value | Proxy | TTL |
|------|------|-------|-------|-----|
| TXT | `_dmarc` | `v=DMARC1; p=none; rua=mailto:dev@{domain}` | DNS only | Auto |

Components:

- `v=DMARC1` — declares a DMARC policy exists
- `p=none` — monitor mode, doesn't reject anything yet (safe to start)
- `rua=mailto:...` — DMARC failure reports go to this inbox (review weekly)

After 30 days of clean DMARC reports with no false positives, optionally upgrade to `p=quarantine` then `p=reject`.

### Verify DNS is Live

```bash
dig +short TXT resend._domainkey.{domain} @8.8.8.8
dig +short TXT send.{domain} @8.8.8.8
dig +short TXT _dmarc.{domain} @8.8.8.8
dig +short MX send.{domain} @8.8.8.8
```

All four should return their expected values.
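
The DMARC value described above is a semicolon-separated list of `tag=value` pairs, so the string that `dig` returns for `_dmarc.{domain}` can be checked in a few lines. This is a minimal sketch under stated assumptions: the function names `parse_dmarc` and `dmarc_problems` are hypothetical, and the `rua` check reflects the record this document recommends rather than a requirement of DMARC itself.

```python
def parse_dmarc(record):
    """Split a DMARC TXT value into a {tag: value} dict.

    Accepts the value with or without the surrounding quotes that
    `dig +short TXT` prints.
    """
    tags = {}
    for part in record.strip().strip('"').split(";"):
        if "=" in part:
            key, value = part.split("=", 1)
            tags[key.strip().lower()] = value.strip()
    return tags


def dmarc_problems(record):
    """Return a list of problems; an empty list means the record looks sane."""
    tags = parse_dmarc(record)
    problems = []
    if tags.get("v") != "DMARC1":
        problems.append("missing or wrong v=DMARC1")
    if tags.get("p") not in ("none", "quarantine", "reject"):
        problems.append("p must be none, quarantine, or reject")
    # Assumption: this document always sets a report address.
    if not tags.get("rua", "").startswith("mailto:"):
        problems.append("rua should be a mailto: address")
    return problems
```

Passing the SPF value by mistake, for example, yields three problems, which makes it easy to spot a record pasted under the wrong name.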
### From-Name Format

Always use a friendly From name, not bare email. Bare email looks robotic and triggers spam filters.

```
FROM_EMAIL=Brand Name <webleads@{domain}>
```

### TO-Email Setup

The `TO_EMAIL` is wherever the lead actually goes. Often a Gmail group address or the owner's personal inbox.

- During Resend domain verification (BEFORE green): you can ONLY send TO the email tied to the Resend account
- After verification: send to anyone

For local testing without verification, use:

```
FROM_EMAIL=onboarding@resend.dev
TO_EMAIL={your-resend-account-email}
```

### When Emails Go to Spam

Run this checklist:

1. **All 4 DNS records green at Resend**? If not, deliverability suffers.
2. **DMARC TXT record exists**? Most common cause of spam folder.
3. **Friendly From name**? `Brand Name <webleads@...>` not bare `webleads@...`
4. **Both `html` and `text` parts in the payload**? HTML-only is suspicious.
5. **Subject line clean**? No em-dashes, no "Estimate Request URGENT", no all-caps.
6. **Recipient marked first emails as Not Spam**? Train Gmail.

### Cloudflare-Specific Notes

The user-agent quirk — Cloudflare in front of Resend's API blocks Python's default `User-Agent: Python-urllib/3.x`. Always set a custom `User-Agent` in the API request headers.

If the DNS provider is Cloudflare, ensure all Resend records have **proxy status: DNS only** (the gray cloud icon, not orange). Proxying these breaks authentication.

### Annual Key Rotation

Rotate Resend API keys annually:

1. Generate new key in Resend dashboard
2. Update `api/.env` on the server
3. `docker compose down && docker compose up -d` to reload env
4. Confirm a test submission still works
5. Revoke the old key in Resend dashboard

### Resend HTTP 403 — Domain Not Verified

A 403 from the Resend API does NOT mean the API key is wrong. The specific error is:

```json
{"statusCode":403,"message":"The {domain} domain is not verified. Please, add and verify your domain on https://resend.com/domains","name":"validation_error"}
```

This means the key is valid and authenticated, but the FROM domain has not been added or verified at resend.com/domains yet.

Rule: **verify the domain BEFORE testing the form endpoint.** If you test before verification, `{"ok":false}` will be returned to the visitor even though the API key is correct and the code is correct.

Sequence:

1. Set `RESEND_API_KEY` in `.env`
2. Add domain at resend.com/domains
3. Add DNS records in Cloudflare
4. Wait for green verification
5. Then test the form endpoint

### DKIM Key Rotation

Resend periodically rotates DKIM keys. They send email when this happens. Add the new `resend2._domainkey` (or whichever selector they specify) TXT record in Cloudflare, then click verify. Old key remains active until they remove it.

---

## Form Handling — Resend

Static sites can't send email by themselves. Every project that needs a contact form gets a small Python service running in its own Docker container, proxied by nginx.

### Architecture

```
Browser → POST /api/estimate (vanilla JS fetch in form.js)
   ↓
nginx → proxies /api/ to api:3001 (strips /api/ prefix)
   ↓
Python service (server.py, stdlib only)
   - Validates fields server-side
   - Verifies reCAPTCHA v3 with Google
   - Sends via Resend HTTPS API
   - Returns {ok: true} or {error: ...}
```

### Front-End (Vanilla JS)

`assets/js/form.js`:

- Real-time validation (blur events)
- Phone formatting `(###) ###-####`
- Email regex check
- Required-field check
- Async submit to `/api/estimate` with JSON body
- Disable submit button + show "Sending..."
during request
- Show success/error message in `.form-status` span
- Reset form on success
- reCAPTCHA v3 token fetched before submit and included in body

### Back-End (Python stdlib)

`api/server.py` (skeleton):

```python
#!/usr/bin/env python3
import hashlib, http.server, json, os, re, socketserver, time
import urllib.error, urllib.parse, urllib.request

PORT = int(os.environ.get("PORT", "3001"))
RESEND_API_KEY = os.environ.get("RESEND_API_KEY", "")
RECAPTCHA_SECRET = os.environ.get("RECAPTCHA_SECRET", "")
TO_EMAIL = os.environ.get("TO_EMAIL", "")
FROM_EMAIL = os.environ.get("FROM_EMAIL", "")
RECAPTCHA_MIN = float(os.environ.get("RECAPTCHA_MIN", "0.5"))

PHONE_RE = re.compile(r"^\(?\d{3}\)?[\s.\-]?\d{3}[\s.\-]?\d{4}$")
EMAIL_RE = re.compile(r"^[^\s@]+@[^\s@]+\.[^\s@]+$")

# Rate limit: 5 requests / IP / 15 minutes
RATE_MAP = {}
RATE_WINDOW = 15 * 60
RATE_MAX = 5

def sanitize(s):
    # HTML-escape a field and cap its length before it is rendered into the email.
    # "&" is replaced first so the entities added afterwards are not double-escaped.
    if not isinstance(s, str):
        return ""
    return (s.replace("&", "&amp;").replace("<", "&lt;").replace(">", "&gt;")
             .replace('"', "&quot;").strip()[:2000])

def validate_fields(body):
    # Return the names of the fields that failed server-side validation.
    errors = []
    if not body.get("name") or len((body["name"]).strip()) < 2:
        errors.append("name")
    if not EMAIL_RE.match((body.get("email") or "").strip()):
        errors.append("email")
    if not PHONE_RE.match((body.get("phone") or "").replace(" ", "")):
        errors.append("phone")
    return errors

def verify_recaptcha(token):
    # Return the reCAPTCHA v3 score, or 0.0 when verification cannot be performed.
    if not RECAPTCHA_SECRET or not token:
        return 0.0
    data = urllib.parse.urlencode({"secret": RECAPTCHA_SECRET, "response": token}).encode()
    req = urllib.request.Request("https://www.google.com/recaptcha/api/siteverify", data=data)
    try:
        with urllib.request.urlopen(req, timeout=8) as resp:
            return float(json.loads(resp.read()).get("score", 0))
    except Exception:
        return 0.0

def send_via_resend(fields):
    safe = {k: sanitize(fields.get(k,"")) for k in
            ["name","email","phone","address","city","zip","service","condition","message"]}
    html = f"""<!DOCTYPE html>...{safe['name']}..."""
    text = f"New estimate request\n\nName: {safe['name']}\n..."
    payload = json.dumps({
        "from": FROM_EMAIL,
        "to": [TO_EMAIL],
        "reply_to": fields.get("email","").strip(),
        "subject": f"New estimate request: {safe['name']} ({safe['city']})",
        "html": html,
        "text": text,
    }).encode("utf-8")
    # Identical payloads hash to the same key, so Resend can deduplicate them
    idem = hashlib.sha256(payload).hexdigest()[:64]
    req = urllib.request.Request("https://api.resend.com/emails", data=payload, headers={
        "Authorization": f"Bearer {RESEND_API_KEY}",
        "Content-Type": "application/json",
        "Idempotency-Key": idem,
        "User-Agent": "{Brand}-Estimate-Form/1.0",
    })
    try:
        with urllib.request.urlopen(req, timeout=10) as resp:
            if resp.status >= 300:
                raise RuntimeError(f"Resend {resp.status}: {resp.read().decode('utf-8','ignore')}")
    except urllib.error.HTTPError as e:
        raise RuntimeError(f"Resend {e.code}: {e.read().decode('utf-8','ignore')}") from None
```

Reference implementation: `floorithardwoodfloors.com/api/server.py`.

### Critical: User-Agent Header

When calling the Resend API from Python, you MUST set a non-default User-Agent. Cloudflare (which fronts Resend) blocks Python's default `Python-urllib/3.x` with HTTP 403 / Cloudflare error code 1010.

```python
"User-Agent": "{ProjectName}-Form/1.0"
```

### Idempotency

Every Resend request includes an `Idempotency-Key` header set to the SHA-256 hex digest of the payload (64 characters). Identical payloads within 24 hours are deduplicated by Resend automatically. This prevents:

- Double-clicks creating two leads
- Browser retries after a network blip
- Honest user submitting twice

### Security Checklist

- API key in `.env` file, NOT in source control. `.gitignore` it.
- API key NEVER reaches the browser bundle (only the server has it)
- `.env` file lives in `api/`, NOT in the nginx web root
- Server-side validation on EVERY field — never trust client
- HTML-escape every field rendered into the email body to prevent injection
- Rate limit per IP (5 / 15 min default)
- 16 KB body cap — reject anything larger
- 10-second upstream timeout — don't hold connections open
- CORS locked to the production domain only (`Access-Control-Allow-Origin: https://{domain}`)
- reCAPTCHA v3 with score threshold (default 0.5) once secret is configured

### Environment Variables

`api/.env`:

```
RESEND_API_KEY=re_xxxxxxxxxxxx
RECAPTCHA_SECRET=6Ldq...
TO_EMAIL=leads@{domain}
FROM_EMAIL=Brand Name <webleads@{domain}>
RECAPTCHA_MIN=0.5
PORT=3001
```

`api/.env.example` (committed) is the same file with placeholder values.

### reCAPTCHA Setup

1. Create site at https://www.google.com/recaptcha/admin
2. Type: **reCAPTCHA v3** (not v2)
3. Add your domain
4. Copy the **site key** into `assets/js/form.js`:
   ```js
   const RECAPTCHA_SITE_KEY = '6Ldq...';
   ```
5. Add the script tag to pages with the form:
   ```html
   <script src="https://www.google.com/recaptcha/api.js?render=6Ldq..."></script>
   ```
6. Copy the **secret key** into `api/.env` as `RECAPTCHA_SECRET`

### Deliverability Checklist

When emails are landing in spam:

1. Verify Resend domain is fully green (SPF + DKIM + DMARC)
2. From name set, not bare email: `Brand Name <webleads@{domain}>`
3. Both `html` and `text` parts in every Resend payload (no HTML-only)
4. Subject line is descriptive, no em-dash, no spam-trigger words
5. Recipient marks first 2-3 emails as "Not Spam" in Gmail to train the filter

### Testing

```bash
# Validation rejection (expect 422)
curl -X POST http://localhost:8096/api/estimate \
  -H "Content-Type: application/json" \
  -d '{"name":"","email":"bad"}'

# Full valid submission (expect 200, real email sent)
curl -X POST http://localhost:8096/api/estimate \
  -H "Content-Type: application/json" \
  -d '{"name":"Test","email":"test@example.com","phone":"(716) 555-1234","address":"100 Test St","city":"Buffalo","zip":"14201","service":"refinishing","message":"Test","token":""}'
```

The first real test email confirms end-to-end works.

---

## PHP App Stack (Server-Side Processing)

Use this pattern when a project requires server-side processing that static HTML cannot handle: file conversion, at-rest encryption, payment processing, user authentication, or API-gated features.

**Reference implementation:** `quickconvert.us`

### When to Use This Pattern

- File uploads and processing (image conversion, PDF generation, etc.)
- At-rest encryption of user data
- Payment processing with Stripe subscriptions
- User authentication with magic link or password-based login
- Rate-limited APIs that must be server-enforced

**Do not** introduce this pattern just to add a contact form. Use the Python stdlib form service instead.
### Stack

- **PHP 8.3** (php:8.3-fpm-alpine base image)
- **Nginx** (Alpine package, same container via supervisord)
- **SQLite** (pdo_sqlite extension, no separate DB container needed)
- **libsodium** (built into PHP 8.x — use for all encryption)
- **ImageMagick** (pecl imagick for image processing)
- **msmtp** (SMTP relay for outbound email)
- **supervisord** (manages nginx + php-fpm + crond in one container)

### Project Structure

```
project/
├── src/                      ← nginx document root
│   ├── index.php
│   ├── api/
│   │   ├── convert.php       ← POST endpoint (CSRF + reCAPTCHA protected)
│   │   └── download.php      ← GET endpoint (signed token)
│   ├── assets/css/
│   ├── assets/js/
│   └── assets/images/
├── includes/                 ← PHP classes (above doc root, not web-accessible)
│   ├── bootstrap.php         ← constants, session, autoload
│   ├── auth.php              ← login, register, magic token
│   ├── csrf.php
│   ├── db.php                ← SQLite PDO wrapper
│   ├── encryption.php        ← libsodium wrappers
│   └── mailer.php
├── components/
│   ├── header.php
│   └── footer.php
├── storage/                  ← volume-mounted, NOT in docker image
│   ├── uploads/              ← encrypted .enc files only
│   ├── converted/
│   ├── temp/
│   ├── .htaccess             ← deny all direct access
│   └── {app}.db
├── infra/
│   ├── nginx.conf
│   ├── php.ini
│   ├── supervisord.conf
│   └── docker-entrypoint.sh
├── tools/
│   └── cleanup.php           ← cron: delete expired tokens + files
├── Dockerfile
├── docker-compose.yml
└── .env                      ← gitignored, never committed
```

### Security Requirements (Non-Negotiable)

**CSRF** — every POST form and API endpoint must verify a CSRF token tied to the session.

**Rate limiting** — two layers:

1. nginx: `limit_req_zone` on /api/ (10 req/s, burst 20)
2. PHP: per-IP daily counter in SQLite rate_limits table

**reCAPTCHA v3** — on conversion/upload endpoints. Verify server-side via Google API. Cache result in session (verify once per session, not per request).

**At-rest encryption** — any user-uploaded file must be encrypted before writing to disk.
Use `sodium_crypto_secretstream_xchacha20poly1305_*` for files, `sodium_crypto_secretbox` for strings. Key stored in `.env` as `QC_ENCRYPTION_KEY` (32 bytes hex).

**Signed download tokens** — never expose file paths. Issue a 64-char hex token stored in SQLite with expiry and single-use enforcement.

**Magic link auth** — prefer magic link over password. On register: create account unverified, send verify email, block login until verified. Token: 64-char hex, 1-hour expiry, stored in `magic_tokens` table, consumed on use.

### Nginx Security Headers

```nginx
add_header X-Frame-Options "SAMEORIGIN" always;
add_header X-Content-Type-Options "nosniff" always;
add_header Referrer-Policy "strict-origin-when-cross-origin" always;
add_header Permissions-Policy "camera=(), microphone=(), geolocation=()" always;
add_header Content-Security-Policy "default-src 'self'; script-src 'self' 'unsafe-inline' https://www.google.com https://www.gstatic.com; style-src 'self' 'unsafe-inline'; img-src 'self' data: https:; object-src 'none'; base-uri 'self'; form-action 'self' https://checkout.stripe.com;" always;

# Stripe webhook — POST only
location = /api/stripe-webhook.php {
    limit_except POST { deny all; }
}

# Block dotfiles
location ~ /\. { deny all; return 403; }
```

### Database Schema Pattern (SQLite, Idempotent)

Use `CREATE TABLE IF NOT EXISTS` for all tables. Use `ALTER TABLE ... ADD COLUMN` wrapped in try/catch for schema migrations.

```php
try {
    $pdo->exec("ALTER TABLE users ADD COLUMN verified_at INTEGER DEFAULT NULL");
} catch (Throwable $e) { /* column already exists */ }
```

### Stripe Integration

- Checkout: create session server-side, redirect to Stripe-hosted page
- Webhook: verify `Stripe-Signature` header using HMAC-SHA256 (implement without Stripe SDK — use curl)
- Webhook tolerance: 300 seconds (5 min) on timestamp
- Register webhook endpoint at: `https://{domain}/api/stripe-webhook.php`
- Events to subscribe: `checkout.session.completed`, `customer.subscription.created`, `customer.subscription.updated`, `customer.subscription.deleted`, `invoice.payment_succeeded`, `invoice.payment_failed`

### .env Required Vars

```
APP_ENV=production
BASE_URL=https://{domain}
QC_ENCRYPTION_KEY={32-bytes-hex}
STRIPE_MODE=live
STRIPE_LIVE_SECRET_KEY=sk_live_...
STRIPE_LIVE_PUBLISHABLE_KEY=pk_live_...
STRIPE_WEBHOOK_SECRET=whsec_...
STRIPE_PRICE_ID=price_...
RECAPTCHA_SITE_KEY=...
RECAPTCHA_SECRET_KEY=...
SMTP_HOST=...
SMTP_PORT=587
SMTP_USER=...
SMTP_PASS=...
MAIL_FROM=noreply@{domain}
MAIL_FROM_NAME={Brand}
```

Generate encryption key: `php -r "echo bin2hex(random_bytes(32));"`
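
The SDK-free webhook check listed under Stripe Integration comes down to three steps: parse the `Stripe-Signature` header, enforce the 300-second tolerance, and compare an HMAC-SHA256 of `{timestamp}.{raw body}` against the `v1` value. The sketch below illustrates that logic in Python, the language of the other stdlib services in this document. It is not the `quickconvert.us` implementation, and the function name `verify_stripe_signature` and its `now` parameter are hypothetical.

```python
import hashlib
import hmac
import time

TOLERANCE_SECONDS = 300  # 5-minute window from the Stripe Integration notes


def verify_stripe_signature(payload, header, secret, now=None):
    """Return True if header carries a valid, fresh v1 signature for payload.

    payload: raw request body as bytes (exactly as received, not re-serialized)
    header:  Stripe-Signature header value, e.g. "t=1700000000,v1=ab12..."
    secret:  webhook signing secret (STRIPE_WEBHOOK_SECRET)
    now:     current Unix time; injectable so the check can be tested
    """
    timestamp = None
    signatures = []
    for item in header.split(","):
        key, _, value = item.strip().partition("=")
        if key == "t":
            timestamp = value
        elif key == "v1":
            signatures.append(value)
    if timestamp is None or not timestamp.isdigit() or not signatures:
        return False

    now = time.time() if now is None else now
    if abs(now - int(timestamp)) > TOLERANCE_SECONDS:
        return False  # too old or too far in the future: possible replay

    signed = timestamp.encode() + b"." + payload
    expected = hmac.new(secret.encode(), signed, hashlib.sha256).hexdigest()
    # Constant-time comparison on bytes avoids leaking how many characters matched
    return any(hmac.compare_digest(expected.encode(), sig.encode()) for sig in signatures)
```

In the PHP endpoint the same steps map onto `hash_hmac('sha256', ...)` for the digest and `hash_equals` for the constant-time comparison. The body must be read raw (for example from `php://input`) before any JSON decoding.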