# STACK: Architecture, Deployment, and Build Pipeline
|
|
Author: Andre Cobham / Arising Media
|
|
Updated: 2026-06-09
|
|
|
|
## Stack Philosophy
|
|
|
|
Two primary stacks. Pick based on page count and update frequency.
|
|
|
|
### Stack A: PHP Router + SQLite (50+ pages, standard as of 2026-05-21)
|
|
|
|
- **PHP Router**: `router.php` dispatches every content URL to the correct PHP template. Edit one template = entire page class updates on next request. No find-and-replace. No file edits.
|
|
- **SQLite**: single-file content DB. `pages.sqlite` holds all page content (title, meta, sections JSON, schema). 10,000 rows = 5MB. Sub-millisecond reads. No server process.
|
|
- **Vanilla JavaScript**: no frameworks. `fetch`, `IntersectionObserver`, `querySelector`
|
|
- **Plain CSS**: `tokens.css` (design tokens) + `main.css` (components). No Sass, no Tailwind
|
|
- **Docker + nginx**: nginx routes `/assets/*` directly; all content URLs → PHP-FPM → router.php
|
|
- **Resend**: transactional email via `/api/contact.php`
|
|
- **Reference:** `arisingmedia.us`: 10,000+ pages
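
To make the single-file content DB concrete, here is a minimal sketch of what a `pages` table and a lookup by URL path might look like. The production router is PHP; Python's stdlib `sqlite3` is used here only for illustration. The table name, column names, and `load_page` helper are assumptions, not the actual schema.

```python
import json
import sqlite3

# In-memory DB for the sketch; the real file would be pages.sqlite
conn = sqlite3.connect(":memory:")
conn.execute(
    """
    CREATE TABLE pages (
        slug        TEXT PRIMARY KEY,  -- URL path the router looks up
        template    TEXT NOT NULL,     -- which template renders this row
        title       TEXT NOT NULL,
        meta        TEXT NOT NULL,     -- meta description
        sections    TEXT NOT NULL,     -- JSON array of content sections
        schema_json TEXT               -- JSON-LD blob, optional
    )
    """
)
conn.execute(
    "INSERT INTO pages VALUES (?, ?, ?, ?, ?, ?)",
    (
        "/locations/buffalo",
        "location",
        "Floor Refinishing in Buffalo, NY",
        "Professional hardwood floor refinishing.",
        json.dumps([{"type": "hero", "h1": "Buffalo, NY"}]),
        None,
    ),
)


def load_page(path: str):
    """Return the page row for a URL path as a dict, or None for a 404."""
    row = conn.execute(
        "SELECT template, title, meta, sections FROM pages WHERE slug = ?",
        (path,),
    ).fetchone()
    if row is None:
        return None
    template, title, meta, sections = row
    return {
        "template": template,
        "title": title,
        "meta": meta,
        "sections": json.loads(sections),
    }
```

Because every page of a class shares one template, changing that template changes every row rendered through it on the next request.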
|
|
|
|
### Stack B: Static HTML (fewer than 50 pages)
|
|
|
|
- **Static HTML**: every page is a `.html` file on disk
|
|
- Same JS, CSS, Docker, nginx, Resend as Stack A
|
|
- Python 3 stdlib for build scripts (no pip)
|
|
- **Reference:** `lahrcarpetcleaning.com`
|
|
|
|
### Never Use (Both Stacks)
|
|
|
|
- Node.js / npm packages on the website. Front-end JS uses ZERO packages
|
|
- WordPress for new builds (we migrate clients OUT of WordPress)
|
|
- CSS frameworks (Bootstrap, Tailwind, Bulma)
|
|
- JS frameworks (React, Vue, Angular, Svelte)
|
|
- jQuery, Lodash, Moment, axios, or any utility library
|
|
- CSS-in-JS, styled-components
|
|
- Build tools that require `node_modules` (webpack, vite, parcel, esbuild)
|
|
- Tracking pixels other than what the client explicitly requests
|
|
|
|
### Why This Stack
|
|
|
|
1. **Performance**: a static HTML page with vanilla JS loads in <100ms with no parse cost from frameworks
|
|
2. **Longevity**: no dependency rot. A site we build today still works in 10 years with no maintenance
|
|
3. **Security**: no `npm audit` warnings, no supply-chain attack vectors, no transitive deps to patch
|
|
4. **Auditability**: every line on the site is something we wrote and can read in plain text
|
|
5. **Hosting**: a static folder + tiny Python container fits in the smallest VM tier any provider sells
|
|
|
|
### When to Add a Server-Side Service
|
|
|
|
Static-only is the default. Add a small Python service ONLY when needed for:
|
|
- Form submission (handled via Resend in the stdlib HTTP server pattern)
|
|
- A specific dynamic feature the client paid for (e.g., booking widget, AI chat)
|
|
|
|
Each service is its own Docker container. Keep them small (single file when possible).
|
|
Use Python `http.server` + `urllib` from stdlib. Do not introduce Flask, FastAPI, Django, or any third-party HTTP framework.
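
A minimal sketch of such a service, assuming a JSON POST body with `name`, `email`, and `message` fields and the environment variables `RESEND_API_KEY`, `MAIL_FROM`, and `MAIL_TO` (all of these names are illustrative assumptions). It sends through Resend's `POST /emails` endpoint using only `http.server` and `urllib`.

```python
"""Form-submit service sketch using only the standard library."""
import json
import os
import urllib.request
from http.server import BaseHTTPRequestHandler, HTTPServer

RESEND_URL = "https://api.resend.com/emails"
REQUIRED = ("name", "email", "message")  # assumed form fields


def build_email(form: dict) -> dict:
    """Turn validated form fields into a Resend payload."""
    return {
        "from": os.environ.get("MAIL_FROM", "forms@example.com"),
        "to": [os.environ.get("MAIL_TO", "owner@example.com")],
        "subject": f"New inquiry from {form['name']}",
        "text": f"Name: {form['name']}\nEmail: {form['email']}\n\n{form['message']}",
    }


def send_email(payload: dict) -> int:
    """POST the payload to Resend; returns the HTTP status code."""
    req = urllib.request.Request(
        RESEND_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {os.environ['RESEND_API_KEY']}",
            "Content-Type": "application/json",
            # Explicit UA: some API gateways reject urllib's default one
            "User-Agent": "am-form-service/1.0",
        },
        method="POST",
    )
    with urllib.request.urlopen(req, timeout=10) as resp:
        return resp.status


class FormHandler(BaseHTTPRequestHandler):
    def do_POST(self):
        length = int(self.headers.get("Content-Length", 0))
        try:
            form = json.loads(self.rfile.read(length))
            missing = [k for k in REQUIRED if not form.get(k)]
        except (ValueError, AttributeError):
            form, missing = None, ["body"]
        if missing:
            self._reply(400, {"ok": False, "missing": missing})
            return
        try:
            send_email(build_email(form))
        except (OSError, KeyError):
            self._reply(502, {"ok": False})
            return
        self._reply(200, {"ok": True})

    def _reply(self, status: int, body: dict):
        data = json.dumps(body).encode("utf-8")
        self.send_response(status)
        self.send_header("Content-Type", "application/json")
        self.send_header("Content-Length", str(len(data)))
        self.end_headers()
        self.wfile.write(data)


def serve(port: int = 8000):
    """Blocking entry point for the container's CMD."""
    HTTPServer(("0.0.0.0", port), FormHandler).serve_forever()
```

Calling `serve()` starts the listener. Spam protection and rate limiting are left out of this sketch.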
|
|
|
|
---
|
|
|
|
## Project Structure
|
|
|
|
Two folders per project: source and deployment.
|
|
|
|
### Source Folder
|
|
|
|
Lives in the dev tree under `concept-agent/projects/{domain}/site/`.
|
|
Contains everything needed to maintain and rebuild the site.
|
|
|
|
```
|
|
{domain}/site/
|
|
├── index.html # home page
|
|
├── about/index.html # /about/
|
|
├── contact/index.html # /contact/
|
|
├── reviews/index.html # /reviews/
|
|
├── blog/index.html # /blog/
|
|
├── locations/ # location pages
|
|
│ ├── index.html # /locations/
|
|
│ ├── _template.html # template stamped with JSON
|
|
│ ├── buffalo.html # generated, flat URL
|
|
│ ├── amherst.html
|
|
│ └── ...
|
|
├── services/
|
|
│ ├── index.html
|
|
│ ├── _template.html
|
|
│ ├── floor-refinishing.html
|
|
│ └── ...
|
|
├── components/
|
|
│ ├── header.html # loaded via fetch() by components.js
|
|
│ └── footer.html
|
|
├── data/
|
|
│ ├── locations.json # source data for build_locations.py
|
|
│ └── services.json # source data for build_services.py
|
|
├── assets/
|
|
│ ├── css/
|
|
│ │ ├── main.css # variables, reset, layout
|
|
│ │ └── components.css # cards, hero, header, footer, nav, responsive
|
|
│ ├── js/
|
|
│ │ ├── main.js # scroll animations, count-up, etc.
|
|
│ │ ├── components.js # fetch + inject header/footer
|
|
│ │ └── form.js # form validation + submit
|
|
│ ├── images/
|
|
│ ├── videos/ # hero video files (.mp4 + .webm)
|
|
│ └── fonts/ # only if not using Google Fonts CDN
|
|
├── build_locations.py # JSON → flat .html stamping
|
|
├── build_services.py
|
|
└── README.md # project notes, content sources, status
|
|
```
|
|
|
|
### Deployment Folder
|
|
|
|
Lives at `/home/sirdrez/arisingmedia-websites/{domain}/`.
|
|
Contains ONLY what's needed to run `docker compose up`.
|
|
|
|
```
{domain}/
├── index.html              # all public website folders
├── about/                  #   ↑
├── assets/                 #   ↑
├── blog/                   #   ↑
├── components/             #   ↑
├── contact/                #   ↑
├── locations/              #   ↑
├── reviews/                #   ↑
├── services/               #   ↑
├── api/                    # form-submit Python service (if used)
│   ├── server.py
│   ├── Dockerfile
│   ├── .env                # gitignored: Resend key, etc.
│   └── .env.example
├── Dockerfile              # nginx web container
├── nginx.conf
├── docker-compose.yml
├── .dockerignore
├── .gitignore
└── .planning/              # everything not needed at runtime
    ├── build_locations.py  # build scripts moved here
    ├── data/               # JSON sources moved here
    ├── README.md
    ├── DNS_*.txt           # DNS notes
    └── review_*.png        # design review screenshots
```
|
|
|
|
### What Goes Where
|
|
|
|
**Source folder gets** every working file (build scripts, data JSON, screenshots,
|
|
notes, raw assets). This is the dev/maintenance copy. NOT what gets deployed.
|
|
|
|
**Deployment folder gets** ONLY the rendered website + the small API service.
|
|
Build scripts, JSON data, and notes go into `.planning/` to keep root clean and
|
|
prevent accidental web exposure.
|
|
|
|
### URL Structure: Two Valid Patterns
|
|
|
|
#### Pattern A: Flat HTML (default for Docker/nginx projects)
|
|
|
|
nginx `try_files $uri $uri/ $uri.html =404` serves `/locations/buffalo` and
|
|
`/locations/buffalo.html`. Canonical form: `/locations/buffalo.html`.
|
|
|
|
Why flat:
|
|
- One file = one page, no `/index.html` confusion
|
|
- Easier sitemap generation
|
|
- `<a href>` links are unambiguous
|
|
- Crawl budget benefit: Google indexes one URL per page, not two
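
The sitemap point can be sketched as follows: walk the site root and emit one canonical URL per page, keeping the `.html` suffix for flat pages and a trailing slash for `index.html` pages. The function names and the skip list are illustrative assumptions.

```python
from pathlib import Path
from xml.sax.saxutils import escape

SKIP_FILES = {"404.html", "500.html"}  # error pages do not belong in the sitemap


def canonical_urls(site_root: Path, base_url: str) -> list[str]:
    """One canonical URL per page: flat pages keep .html, index pages end in /."""
    urls = []
    for page in sorted(site_root.rglob("*.html")):
        rel = page.relative_to(site_root)
        if rel.name.startswith("_") or rel.parts[0] == "components":
            continue  # templates and header/footer fragments are not pages
        if rel.name in SKIP_FILES:
            continue
        if rel.name == "index.html":
            folder = "/".join(rel.parts[:-1])
            urls.append(f"{base_url}/{folder}/" if folder else f"{base_url}/")
        else:
            urls.append(f"{base_url}/{rel.as_posix()}")
    return urls


def build_sitemap(urls: list[str]) -> str:
    """Wrap the URLs in a minimal sitemap.xml document."""
    entries = "\n".join(f"  <url><loc>{escape(u)}</loc></url>" for u in urls)
    return (
        '<?xml version="1.0" encoding="UTF-8"?>\n'
        '<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">\n'
        f"{entries}\n"
        "</urlset>\n"
    )
```

With flat pages, the file path and the canonical URL are the same string, so no lookup table is needed.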
|
|
|
|
#### Pattern B: Directory-style (default for cPanel/Apache projects)
|
|
|
|
Each page lives at `{slug}/index.html`. Apache auto-serves `index.html` when
|
|
visiting `/{slug}/`. Use this when deploying to cPanel shared hosting.
|
|
|
|
```
services/
├── carpet-cleaning/index.html     → /services/carpet-cleaning/
└── stairs/index.html              → /services/stairs/
commercial/
├── offices/index.html             → /commercial/offices/
└── vacation-rentals/index.html    → /commercial/vacation-rentals/
```
|
|
|
|
### Lahrcarpetcleaning.com Reference (Directory-Style, cPanel)
|
|
|
|
```
|
|
lahrcarpetcleaning.com/
|
|
├── index.html
|
|
├── about/index.html
|
|
├── contact/index.html
|
|
├── reviews/index.html
|
|
├── service-area/index.html
|
|
├── locations/
|
|
│ ├── index.html
|
|
│ ├── waterloo-ny/index.html
|
|
│ ├── geneva-ny/index.html
|
|
│ └── ... (20 location pages)
|
|
├── services/
|
|
│ ├── carpet-cleaning/index.html
|
|
│ ├── stairs/index.html
|
|
│ ├── upholstery/index.html
|
|
│ ├── floors/index.html
|
|
│ ├── area-rugs/index.html
|
|
│ ├── add-ons/index.html
|
|
│ └── commercial/index.html
|
|
├── commercial/
|
|
│ ├── offices/index.html
|
|
│ ├── vacation-rentals/index.html
|
|
│ ├── hotels-inns/index.html
|
|
│ ├── retail-showrooms/index.html
|
|
│ └── property-management/index.html
|
|
├── assets/
|
|
│ ├── css/styles.css?v=N ← always cache-bust on change
|
|
│ ├── js/
|
|
│ │ ├── main.js
|
|
│ │ └── components.js ← injects nav+footer via innerHTML
|
|
│ ├── images/
|
|
│ │ ├── hero/ ← hero-{slug}.webp, one per page
|
|
│ │ └── services/ ← {service}.webp card images
|
|
│ └── videos/hero/hero-reel.mp4
|
|
├── tools/ ← NOT deployed to webroot
|
|
│ ├── convert-to-webp.py
|
|
│ ├── gen-images-flux.py
|
|
│ └── gen-hero-images.py
|
|
├── .cpanel.yml
|
|
├── robots.txt
|
|
├── sitemap.xml
|
|
├── 404.html
|
|
└── 500.html
|
|
```
|
|
|
|
All images are `.webp`. cPanel deployment via `.cpanel.yml`.
|
|
|
|
---
|
|
|
|
## Build Pipeline
|
|
|
|
When a site has many similar pages (location pages, service pages, blog posts,
|
|
team-member pages), use a JSON + template + Python build script.
|
|
|
|
### When to Use a Build Script
|
|
|
|
Use it when there are 4+ pages with identical structure differing only in
|
|
content. For example: 6 location pages where only the city name and
|
|
city-specific copy differs.
|
|
|
|
For one-off pages (home, about, contact, services index), hand-write the HTML
|
|
directly. Build scripts are for repetition, not for everything.
|
|
|
|
### Pattern
|
|
|
|
Three files per template family:
|
|
|
|
1. **`data/{thing}.json`**: array of objects, one per page
|
|
2. **`{thing}/_template.html`**: HTML with `{{placeholder}}` markers
|
|
3. **`build_{thing}.py`**: stdlib Python, stamps template with data
|
|
|
|
#### Example: locations.json
|
|
|
|
```json
|
|
[
|
|
{
|
|
"slug": "buffalo",
|
|
"city": "Buffalo",
|
|
"state": "NY",
|
|
"title": "Hardwood Floor Refinishing in Buffalo, NY | Floor It",
|
|
"meta_description": "Professional hardwood floor refinishing...",
|
|
"canonical": "https://floorithardwoodfloors.com/locations/buffalo.html",
|
|
"hero_h1": "Hardwood Floor Refinishing in Buffalo, NY",
|
|
"hero_lead": "Western New York's most experienced...",
|
|
"overview_h2": "Buffalo's Trusted Floor Refinishing Specialists",
|
|
"overview_body_1": "...",
|
|
"overview_body_2": "...",
|
|
"faqs": [
|
|
{ "q": "...", "a": "..." }
|
|
]
|
|
}
|
|
]
|
|
```
|
|
|
|
#### Example: _template.html
|
|
|
|
```html
|
|
<!DOCTYPE html>
|
|
<html lang="en">
|
|
<head>
|
|
<title>{{title}}</title>
|
|
<meta name="description" content="{{meta_description}}">
|
|
<link rel="canonical" href="{{canonical}}">
|
|
...
|
|
</head>
|
|
<body>
|
|
<h1>{{hero_h1}}</h1>
|
|
<p>{{hero_lead}}</p>
|
|
...
|
|
</body>
|
|
</html>
|
|
```
|
|
|
|
#### Example: build_locations.py (skeleton)
|
|
|
|
```python
"""Build flat .html location pages from data/locations.json + locations/_template.html."""
import json
from pathlib import Path

SITE_ROOT = Path(__file__).parent
DATA_FILE = SITE_ROOT / "data" / "locations.json"
TEMPLATE_FILE = SITE_ROOT / "locations" / "_template.html"
OUT_DIR = SITE_ROOT / "locations"


def render(template: str, item: dict) -> str:
    """Replace each {{key}} marker with the matching scalar value from item."""
    out = template
    for key, value in item.items():
        if isinstance(value, (str, int, float)):
            out = out.replace("{{" + key + "}}", str(value))
    # Custom rendering for nested arrays (e.g. faqs)
    # ... handle item['faqs'] etc.
    return out


def main():
    data = json.loads(DATA_FILE.read_text(encoding="utf-8"))
    template = TEMPLATE_FILE.read_text(encoding="utf-8")
    print(f"Building {len(data)} location pages...")
    for item in data:
        rendered = render(template, item)
        outfile = OUT_DIR / f"{item['slug']}.html"
        outfile.write_text(rendered, encoding="utf-8")
        print(f"  Built: {outfile.relative_to(SITE_ROOT)}")
    print(f"Done. {len(data)} pages written.")


if __name__ == "__main__":
    main()
```
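
The skeleton leaves nested arrays such as `faqs` unhandled. One possible way to fill that gap is sketched below. It assumes the template carries a `{{faqs_html}}` marker and that FAQs render as `<details>` blocks; both the marker name and the markup are hypothetical.

```python
from html import escape


def render_faqs(faqs: list[dict]) -> str:
    """Render [{'q': ..., 'a': ...}] as <details> blocks (markup is an assumption)."""
    blocks = []
    for faq in faqs:
        blocks.append(
            "<details>\n"
            f"  <summary>{escape(faq['q'])}</summary>\n"
            f"  <p>{escape(faq['a'])}</p>\n"
            "</details>"
        )
    return "\n".join(blocks)


def render(template: str, item: dict) -> str:
    """Stamp scalar values first, then the pre-rendered FAQ markup."""
    out = template
    for key, value in item.items():
        if isinstance(value, (str, int, float)):
            out = out.replace("{{" + key + "}}", str(value))
    # {{faqs_html}} is a hypothetical marker; an item without faqs stamps ""
    return out.replace("{{faqs_html}}", render_faqs(item.get("faqs", [])))
```

Rendering the list to a string first keeps `render` a plain sequence of `str.replace` calls, so the placeholder grep check still applies unchanged.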
|
|
|
|
### Rules
|
|
|
|
1. **Source of truth is JSON, not HTML.** When content needs to change, edit the
|
|
   JSON and re-run the build script. Never hand-edit a generated `.html` file:
|
|
the next build will overwrite your changes.
|
|
|
|
2. **Generated files land in the same folder as their template.** Do not nest
|
|
into a subfolder. The template file is always named `_template.html` (leading
|
|
underscore so it sorts above the generated pages).
|
|
|
|
3. **Build script lives in the SOURCE root**, not in deployment. After running
|
|
the build, sync the rendered `.html` files (not the script, not the JSON) to
|
|
deployment.
|
|
|
|
4. **Verify zero unreplaced placeholders** after every build:
|
|
```bash
|
|
grep -rn "{{" {thing}/*.html # should return nothing
|
|
```
|
|
|
|
5. **Build is idempotent.** Running it twice produces identical files.
|
|
|
|
### Stamping Rules: Escaping
|
|
|
|
When a JSON value gets stamped into an HTML attribute or `<title>`, special
|
|
characters can break the page. Use these rules:
|
|
|
|
- Plain text in `<p>` or `<h1>`: ampersand-encode (`&` → `&amp;`)
|
|
- `<title>` content: ampersand-encode + strip line breaks
|
|
- `<meta>` content attribute: encode `&`, `"`, and remove line breaks
|
|
- `href` URL attribute: never put user input here, but if needed, urlencode
|
|
|
|
For our typical use case (controlled content authored by us), the simple
|
|
`str.replace("{{key}}", value)` is sufficient because we don't have hostile
|
|
input. Just keep angle brackets, double quotes, and bare ampersands out of the JSON values.
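
If a project ever needs real escaping, the rules above map onto the standard library roughly as follows. This is a sketch; the helper names are illustrative and not part of any existing build script.

```python
from html import escape
from urllib.parse import quote


def for_text(value: str) -> str:
    """<p> / <h1> content: encode &, <, >."""
    return escape(value, quote=False)


def for_title(value: str) -> str:
    """<title>: encode, and fold line breaks into single spaces."""
    return escape(" ".join(value.split()), quote=False)


def for_attr(value: str) -> str:
    """Attribute values such as <meta content="...">: also encode quotes."""
    return escape(" ".join(value.split()), quote=True)


def for_url_segment(value: str) -> str:
    """One path segment inside an href: percent-encode everything unsafe."""
    return quote(value, safe="")
```

Each helper would be applied to the value before the `str.replace` call, chosen by where the placeholder sits in the template.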
|
|
|
|
### Re-Running the Build
|
|
|
|
```bash
|
|
cd {project}/site
|
|
python3 build_locations.py
|
|
python3 build_services.py
|
|
```
|
|
|
|
After build, sync the rendered files to deployment.
|
|
|
|
---
|
|
|
|
## WordPress to Static HTML Migration
|
|
|
|
The playbook for migrating a WordPress (Divi, Elementor, classic, whatever) site
|
|
to vanilla static HTML.
|
|
|
|
### Phase 1: Capture Source
|
|
|
|
Before touching anything, capture the current site so nothing is lost.
|
|
|
|
1. **Database dump**: `wp db export ${domain}.sql --add-drop-table`
|
|
2. **`wp-content` snapshot**: tar the entire `wp-content/` (themes, plugins, uploads)
|
|
3. **Crawl the live site**: use `wget --mirror --convert-links --adjust-extension --page-requisites --no-parent https://{domain}` to capture rendered HTML + all assets
|
|
4. **Inventory pages**: list every URL returning 200 (use the sitemap if it has one)
|
|
5. **Inventory forms**: note every Gravity Form / Contact Form 7 / etc. field-by-field
|
|
6. **Inventory dynamic features**: search, comments, members, anything truly dynamic
|
|
|
|
Save all of this in the project's `.planning/` folder.
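
For the page inventory step, a sitemap can be turned into a URL list with the stdlib XML parser. A minimal sketch, assuming the sitemap follows the sitemaps.org schema (the function name is illustrative):

```python
import xml.etree.ElementTree as ET

SITEMAP_NS = {"sm": "http://www.sitemaps.org/schemas/sitemap/0.9"}


def urls_from_sitemap(xml_text: str) -> list[str]:
    """Return every <loc> value from a sitemap or sitemap-index document."""
    root = ET.fromstring(xml_text)
    return [
        loc.text.strip()
        for loc in root.iterfind(".//sm:loc", SITEMAP_NS)
        if loc.text
    ]
```

SEO plugins often publish a sitemap index that points at per-type sitemaps. In that case the function returns the child sitemap URLs, and each child needs the same treatment. Every URL in the final list should still be requested once to confirm it returns 200.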
|
|
|
|
### Phase 2: Decide What to Keep
|
|
|
|
Re-design pass. Most WP sites have:
|
|
- Bloated copy → cut by 30-50%
|
|
- Outdated/inflated metrics → remove or replace with real, verifiable data
|
|
- Stock photos → replace with real client photos when available
|
|
- Cluttered layouts → strip back to one clear CTA per section
|
|
- Plugin features the client never uses → drop entirely
|
|
|
|
Show the client a wireframe of the simplified structure before building anything.
|
|
|
|
### Phase 3: Information Architecture
|
|
|
|
Standard structure for a small business:
|
|
|
|
```
|
|
/ home
|
|
/about/ about / story / team
|
|
/services/ services index
|
|
/services/{slug}.html one detail page per service
|
|
/locations/ locations index
|
|
/locations/{city}.html one detail page per service area (SEO gold)
|
|
/reviews/ customer reviews
|
|
/contact/ contact + form
|
|
/blog/ optional blog index
|
|
```
|
|
|
|
For each location and each service: one flat `.html` page generated from JSON +
|
|
template.
|
|
|
|
### Phase 4: Build
|
|
|
|
1. Set up source folder per the Project Structure section in STACK.md
|
|
2. Write `assets/css/main.css` (variables, reset, typography, layout)
|
|
3. Write `assets/css/components.css` (header, footer, hero, cards, forms)
|
|
4. Write `components/header.html` and `components/footer.html`
|
|
5. Write `assets/js/components.js` (fetch + inject header/footer)
|
|
6. Write `assets/js/main.js` (scroll animations, anything page-wide)
|
|
7. Build `index.html` first: this is the design system in working form
|
|
8. Generate location and service detail pages from JSON
|
|
9. Build remaining pages: about, contact, reviews, blog index
|
|
|
|
### Phase 5: Forms
|
|
|
|
If the WP site had Gravity Forms or similar, build a vanilla replacement:
|
|
- HTML form in `contact/index.html` (and inline on service/location pages if needed)
|
|
- Client-side validation in `assets/js/form.js`
|
|
- POST to `/api/estimate` (or similar) handled by Python stdlib service
|
|
- Server-side validation, reCAPTCHA verification, send via Resend
|
|
|
|
### Phase 6: SEO Parity
|
|
|
|
Before launch, every old URL must either:
|
|
- Have a matching new URL with the same or better content, OR
|
|
- 301-redirect to a relevant new URL
|
|
|
|
Build a redirect map from the old WP sitemap. Add to `nginx.conf`:
|
|
|
|
```nginx
|
|
location = /old-page-slug { return 301 /new-slug.html; }
# location blocks never see the query string; match ?p=123 in server context
if ($arg_p = "123") { return 301 /about/; }
|
|
```
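
For a long redirect map, the nginx lines can be generated rather than typed. The sketch below assumes the map is a plain dict of old URL to new URL. Because nginx `location` matching ignores the query string, old `?p=` URLs are emitted as `if ($arg_p = ...)` checks intended for the `server` block. The function name is illustrative.

```python
from urllib.parse import parse_qs, urlsplit


def nginx_redirects(redirect_map: dict[str, str]) -> str:
    """Emit one nginx 301 rule per old URL."""
    lines = []
    for old, new in sorted(redirect_map.items()):
        parts = urlsplit(old)
        query = parse_qs(parts.query)
        if "p" in query:
            # Query-string URLs cannot be matched by `location`; test $arg_p
            post_id = query["p"][0]
            lines.append(f'if ($arg_p = "{post_id}") {{ return 301 {new}; }}')
        else:
            lines.append(f"location = {parts.path} {{ return 301 {new}; }}")
    return "\n".join(lines)
```

The output can be saved to a separate file and pulled into `nginx.conf` with an `include` directive, which keeps the redirect map out of the hand-written config.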
|
|
|
|
Per-page parity checklist:
|
|
- `<title>` matches or improves on the WP title
|
|
- `<meta name="description">` matches or improves
|
|
- `<link rel="canonical">` is set to the new URL
|
|
- Headings (h1, h2, h3) preserve the topical structure
|
|
- Internal links updated to new URLs
|
|
- Image alt text preserved or improved
|
|
- Schema.org JSON-LD added (`LocalBusiness`, `Service`, `BreadcrumbList`)
|
|
|
|
### Phase 7: Switch DNS / Cutover
|
|
|
|
1. Deploy the static site to a separate URL first (`new.{domain}`) for client review
|
|
2. Once approved, point production DNS to the new container
|
|
3. Keep the WP container running for 14 days as fallback
|
|
4. Submit new sitemap to Google Search Console
|
|
5. Use Search Console URL inspection on 5-10 key pages to confirm indexing
|
|
|
|
### Phase 8: Post-Launch
|
|
|
|
- Monitor Search Console for crawl errors / 404s, fix in nginx as redirects
|
|
- Monitor form submissions: first real lead through the new form is the
|
|
ultimate "it works" check
|
|
- Decommission WP only after 30 days of clean operation
|
|
|
|
### What NOT to Do
|
|
|
|
- Do not run a "headless WordPress" or "WordPress as API": that defeats the
|
|
whole point. Static means static.
|
|
- Do not use a static-site-generator tool (Hugo, 11ty, Jekyll, Astro, Next.js
|
|
static export). We hand-write HTML and use small Python build scripts only
|
|
where data is repeated.
|
|
- Do not migrate the database. Content gets re-written cleaner during migration.
|
|
|
|
---
|
|
|
|
## WP + Divi to AM HTML Pipeline Overview
|
|
|
|
End-to-end playbook for converting a WordPress / Divi site backup (.wpress)
|
|
into an Arising Media vanilla HTML + vanilla JS deployment.
|
|
|
|
### What This Pipeline Does
|
|
|
|
Takes a single `.wpress` archive (All-in-One WP Migration backup) and produces:
|
|
- A fully structured `src/` directory matching AM project layout
|
|
- A CSS design system derived from the original Divi theme settings
|
|
- All page content extracted, cleaned, and re-authored into AM HTML templates
|
|
- All media migrated to WebP and remapped to `/assets/images/`
|
|
- SEO metadata (titles, descriptions, canonicals, schema.org) preserved or improved
|
|
- Docker-ready deployment with nginx + PHP contact form
|
|
|
|
### Philosophy
|
|
|
|
The goal is NOT a 1:1 copy. The goal is:
|
|
1. Preserve all content, SEO equity, and brand identity
|
|
2. ENHANCE the design: cleaner, faster, more modern
|
|
3. Remove all WordPress / Divi bloat (plugin CSS, shortcode residue, 300KB JS bundles)
|
|
4. Produce a site that loads in <2s on mobile and scores 95+ on Lighthouse
|
|
|
|
Every migration is a design upgrade. The Divi site is the reference, not the target.
|
|
|
|
### Divi Version Matters
|
|
|
|
Two distinct extraction paths:
|
|
|
|
| Version | Content Storage | How to detect |
|---------|----------------|---------------|
| Divi 4 | `[et_pb_section]` shortcodes in `wp_posts.post_content` | `post_content` contains `[et_pb_` |
| Divi 5 | Gutenberg blocks (`<!-- wp:divi/section -->`) + JSON in `wp_postmeta` | `post_content` contains `<!-- wp:divi/` |
|
|
|
|
Run Phase 2 (database analysis) first to determine which version before choosing the extraction path.
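
The detection rule in the table can be sketched as a small function over the raw `post_content` values. This is illustrative only, not the logic of `analyze_db.py`. Preferring Divi 5 when both markers are present is an assumption.

```python
def detect_divi_version(post_contents: list[str]) -> str:
    """Return "4", "5", or "unknown" from raw wp_posts.post_content values."""
    has_v5 = any("<!-- wp:divi/" in content for content in post_contents)
    has_v4 = any("[et_pb_" in content for content in post_contents)
    if has_v5:
        # Assumption: a site showing both markers is treated as Divi 5
        return "5"
    if has_v4:
        return "4"
    return "unknown"
```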
|
|
|
|
### Pipeline Phases
|
|
|
|
```
|
|
Phase 0 Setup Verify .wpress location, create extraction directory
|
|
Phase 1 Extract Unpack .wpress binary archive to wpress-extract/
|
|
Phase 2 DB Analysis Inspect WordPress database dump, detect Divi version, inventory pages
|
|
Phase 3 Content Extract page content via Divi 4 or Divi 5 path
|
|
Phase 4 Design System Pull colors, fonts, spacing from wp_options → CSS custom properties
|
|
Phase 5 Media Catalog uploads/, convert to WebP, generate image manifest
|
|
Phase 6 Build HTML Map extracted content to AM templates, generate JSON data files
|
|
Phase 7 SEO Port titles, metas, canonicals, schema.org; build redirect map
|
|
Phase 8 Forms Replace Gravity Forms / CF7 with AM vanilla form + Python API
|
|
Phase 9 QA Lighthouse audit, grep for unreplaced placeholders, protection check
|
|
```
|
|
|
|
### Script Reference
|
|
|
|
All scripts live in `.am-webdesign-sops/wp-divi-pipeline/scripts/`.
|
|
|
|
| Script | Phase | Purpose |
|--------|-------|---------|
| `extract_wpress.py` | 1 | Unpack .wpress binary archive |
| `analyze_db.py` | 2 | Parse SQL dump, inventory pages + detect Divi version |
| `extract_divi4.py` | 3 | Parse et_pb_ shortcodes → structured content JSON |
| `extract_divi5.py` | 3 | Parse Gutenberg/Divi5 blocks → structured content JSON |
| `extract_design.py` | 4 | Pull Divi theme options → design-system.json |
| `extract_media.py` | 5 | Catalog uploads/, emit media-manifest.json |
| `convert_images.py` | 5 | Batch convert images → WebP |
| `run_pipeline.sh` | 0-7 | Master script: runs all phases in order |
|
|
|
|
### Per-Project Working Directory
|
|
|
|
```
{domain}/
└── .planning/
    ├── {domain}-YYYYMMDD-*.wpress   ← source archive (never modify)
    ├── wpress-extract/              ← Phase 1 output (gitignored)
    │   ├── package.json             ← archive metadata
    │   ├── database.sql             ← MySQL dump
    │   └── uploads/                 ← all media (NOT in wp-content/)
    ├── data/
    │   ├── pages.json               ← Phase 2 output
    │   ├── design-system.json       ← Phase 3 output
    │   └── media-manifest.json      ← Phase 4 output
    └── scripts/                     ← project-specific overrides if needed
```
|
|
|
|
### .wpress Extraction Details
|
|
|
|
The `.wpress` binary format is NOT a standard zip or tar. Custom sequential binary format:
|
|
|
|
```
|
|
[HEADER 4377 bytes] [FILE DATA n bytes] [HEADER] [FILE DATA] ...
|
|
```
|
|
|
|
Header breakdown:
|
|
```
|
|
Offset Length Field
|
|
0 255 Filename (null-padded)
|
|
255 14 File size in bytes (ASCII decimal, null-padded)
|
|
269 12 mtime unix timestamp (ASCII decimal, null-padded)
|
|
281 4096 Relative path (null-padded)
|
|
4377 n Raw file bytes (size from header)
|
|
```
|
|
|
|
The archive ends when a header of all null bytes is encountered, or EOF.
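
The header layout above is enough to write a reader. The following is a minimal sketch of one, not the contents of `extract_wpress.py`. It assumes fields are UTF-8 and that a relative path of `.` or empty means the archive root.

```python
from typing import BinaryIO, Iterator

HEADER_SIZE = 4377  # 255 + 14 + 12 + 4096


def _field(header: bytes, start: int, length: int) -> str:
    """Decode one null-padded header field."""
    return header[start:start + length].split(b"\x00", 1)[0].decode("utf-8")


def read_entries(stream: BinaryIO) -> Iterator[tuple[str, int, bytes]]:
    """Yield (path, mtime, data) for each file stored in the archive."""
    while True:
        header = stream.read(HEADER_SIZE)
        if len(header) < HEADER_SIZE or header.count(b"\x00") == HEADER_SIZE:
            return  # EOF or the all-null terminator header
        name = _field(header, 0, 255)
        size = int(_field(header, 255, 14))
        mtime = int(_field(header, 269, 12))
        prefix = _field(header, 281, 4096)
        data = stream.read(size)  # sketch only: reads the whole file into memory
        path = name if prefix in ("", ".") else f"{prefix}/{name}"
        yield path, mtime, data
```

A production extractor would stream large files in chunks and reject any path that resolves outside the extraction directory before writing to disk.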
|
|
|
|
Extraction script:
|
|
|
|
```bash
|
|
python3 /home/sirdrez/arisingmedia-websites/.am-webdesign-sops/wp-divi-pipeline/scripts/extract_wpress.py \
|
|
/home/sirdrez/arisingmedia-websites/{domain}/.planning/{file}.wpress \
|
|
/home/sirdrez/arisingmedia-websites/{domain}/.planning/wpress-extract/
|
|
```
|
|
|
|
### Database Analysis
|
|
|
|
Parse the WordPress MySQL dump to inventory pages, detect Divi version,
|
|
extract design settings, and build the data JSON files.
|
|
|
|
```bash
|
|
python3 /home/sirdrez/arisingmedia-websites/.am-webdesign-sops/wp-divi-pipeline/scripts/analyze_db.py \
|
|
{domain}/.planning/wpress-extract/ \
|
|
{domain}/.planning/data/
|
|
```
|
|
|
|
Outputs three files into `.planning/data/`:
|
|
- `pages.json`: all published pages/posts with content and SEO meta
|
|
- `design-system.json`: colors, fonts, Divi settings
|
|
- `site-info.json`: domain, plugin list, WP version, Divi version
|
|
|
|
### Divi 5 Content Extraction
|
|
|
|
Parse raw Divi page content from `pages.json` into clean, structured HTML
|
|
sections ready to map into AM templates.
|
|
|
|
```bash
|
|
python3 /home/sirdrez/arisingmedia-websites/.am-webdesign-sops/wp-divi-pipeline/scripts/extract_divi5.py \
|
|
{domain}/.planning/data/pages.json \
|
|
{domain}/.planning/data/content/
|
|
```
|
|
|
|
Produces one JSON file per page: `content/{slug}.json`
|
|
|
|
Key fields in page JSON:
|
|
- `slug`: page URL slug
|
|
- `title`: page title
|
|
- `seo_title`: SEO title (from Rank Math if available)
|
|
- `seo_description`: SEO description (from Rank Math if available)
|
|
- `sections`: array of content sections with type, background_color, and modules
|
|
|
|
Map each Divi module type to AM component:
|
|
|
|
| Divi module | Extract | Map to AM element |
|-------------|---------|-------------------|
| `divi/text` | inner HTML | `<section>`, `<p>`, headings as-is |
| `divi/button` | `text`, `url` | `<a class="btn-primary">` |
| `divi/image` | `src`, `alt`, `title` | `<img>` → rewrite to WebP path |
| `divi/blurb` | icon, title, body | `.am-card` component |
| `divi/testimonial` | quote, author, company | `.am-testimonial` component |
| `divi/video` | `src`, poster | `<video>` or YouTube embed |
| `divi/contact_form` | field list | → replace with AM form |
| `divi/accordion` | Q+A pairs | `<details><summary>` |
| `divi/fullwidth_header` | title, subhead, CTA | hero section |
|
|
|
|
Strip Divi class/attribute noise using `clean_divi_html()` from `divi_to_html.py`:
|
|
|
|
```python
|
|
from divi_to_html import clean_divi_html, rewrite_internal_links
|
|
|
|
cleaned = clean_divi_html(raw_html)
|
|
cleaned = rewrite_internal_links(cleaned, staging_hosts=("{domain}",))
|
|
```
|
|
|
|
### Design System Extraction
|
|
|
|
Convert Divi theme settings into AM CSS custom properties.
|
|
|
|
Input: `design-system.json` produced by `analyze_db.py` with fields:
|
|
- `primary_color`: main brand color
|
|
- `body_font`: font family name
|
|
- `header_font`: heading font name
|
|
- `body_font_size`: base font size in px
|
|
- `body_line_height`: line height ratio
|
|
- `divi_version`: "4" or "5"
|
|
- `wp_version`: WordPress version
|
|
- `site_url`: domain
|
|
- `site_name`: brand name
|
|
|
|
Never lift the Divi palette 1:1. Use extracted colors as the base and build a
|
|
full 5-step scale around the primary hue:
|
|
|
|
```css
|
|
:root {
|
|
--color-primary: {extracted-color};
|
|
--color-primary-dark: {darken-by-15%};
|
|
--color-primary-light: {lighten-by-40%};
|
|
--color-surface: #fafafa;
|
|
--color-surface-alt: #f0f7f6;
|
|
--color-text: #1a1a1a;
|
|
--color-text-muted: #5a6e6b;
|
|
--color-border: #c8dedd;
|
|
--color-white: #ffffff;
|
|
|
|
/* Fonts */
|
|
--font-body: '{body-font}', system-ui, sans-serif;
|
|
--font-heading: '{header-font}', Georgia, serif;
|
|
|
|
  /* Type scale (fixed rem steps) */
|
|
--text-xs: 0.75rem; --text-sm: 0.875rem;
|
|
--text-base: 1rem; --text-lg: 1.125rem;
|
|
--text-xl: 1.25rem; --text-2xl: 1.5rem;
|
|
--text-3xl: 1.875rem; --text-4xl: 2.25rem;
|
|
--text-5xl: 3rem; --text-6xl: 3.75rem;
|
|
|
|
/* Spacing scale */
|
|
--space-1: 0.25rem; --space-2: 0.5rem; --space-3: 0.75rem;
|
|
--space-4: 1rem; --space-5: 1.25rem; --space-6: 1.5rem;
|
|
--space-8: 2rem; --space-10: 2.5rem; --space-12: 3rem;
|
|
--space-16: 4rem; --space-20: 5rem; --space-24: 6rem;
|
|
--space-32: 8rem;
|
|
}
|
|
```
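
The `{darken-by-15%}` and `{lighten-by-40%}` slots can be computed from the extracted color with the stdlib `colorsys` module. The sketch below reads "darken by 15%" as scaling HLS lightness down by 15% and "lighten by 40%" as moving 40% of the way toward white. That reading, and the function names, are assumptions.

```python
import colorsys


def _hex_to_rgb(hex_color: str) -> tuple:
    """'#rrggbb' -> three floats in 0..1."""
    h = hex_color.lstrip("#")
    return tuple(int(h[i:i + 2], 16) / 255 for i in (0, 2, 4))


def _rgb_to_hex(rgb) -> str:
    """Three floats in 0..1 -> '#rrggbb'."""
    return "#" + "".join(f"{round(c * 255):02x}" for c in rgb)


def shift_lightness(hex_color: str, amount: float) -> str:
    """Move HLS lightness toward black (amount < 0) or white (amount > 0)."""
    hue, light, sat = colorsys.rgb_to_hls(*_hex_to_rgb(hex_color))
    if amount < 0:
        light = light * (1 + amount)
    else:
        light = light + (1 - light) * amount
    return _rgb_to_hex(colorsys.hls_to_rgb(hue, light, sat))


def primary_scale(primary: str) -> dict[str, str]:
    """Build the three primary custom properties from one extracted color."""
    return {
        "--color-primary": primary.lower(),
        "--color-primary-dark": shift_lightness(primary, -0.15),
        "--color-primary-light": shift_lightness(primary, 0.40),
    }
```

The resulting values are a starting point for the palette and should still be checked for contrast against `--color-text` and `--color-white`.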
|
|
|
|
### Content Migration
|
|
|
|
Map extracted Divi content into AM HTML templates.
|
|
|
|
Build order:
|
|
1. `src/assets/css/main.css`: design tokens, reset, typography, layout grid
|
|
2. `src/assets/css/components.css`: header, footer, hero, cards, forms, nav
|
|
3. `src/components/header.html`: navigation
|
|
4. `src/components/footer.html`: footer links, contact info
|
|
5. `src/assets/js/components.js`: fetch + inject header/footer
|
|
6. `src/assets/js/main.js`: scroll animations, intersection observer
|
|
7. `src/index.html`: home page (this IS the design system in working form)
|
|
8. Remaining pages: about, classes, contact, blog
|
|
9. `src/robots.txt`, `src/sitemap.xml`, `src/404.html`, `src/500.html`
|
|
|
|
For 4+ similar pages (class types, locations), use JSON template build:
|
|
|
|
```
|
|
src/classes/
|
|
├── _template.html ← class detail page template
|
|
├── hatha.html ← generated from classes.json
|
|
├── vinyasa.html
|
|
└── yin.html
|
|
|
|
.planning/data/
|
|
└── classes.json ← array of class objects
|
|
```
|
|
|
|
### Media Assets
|
|
|
|
Migrate WordPress uploads to AM `/assets/images/`, convert to WebP, and
|
|
generate a media manifest for URL remapping.
|
|
|
|
Steps:
|
|
1. Catalog all original media (skip WordPress-generated size variants like `-150x150`)
|
|
2. Copy originals to `src/assets/images/`
|
|
3. Convert to WebP using `cwebp` or Python Pillow
|
|
4. Generate media manifest with old → new URL mapping
|
|
5. Apply manifest during HTML build to rewrite all image paths
|
|
|
|
```bash
# Catalog originals (skip WP size variants)
find .planning/wpress-extract/uploads -type f \( -name "*.jpg" -o -name "*.png" \) | \
  grep -v -E "\-[0-9]+x[0-9]+\.(jpg|png)$" > .planning/data/media-originals.txt

# Copy and convert (quote "$src" so filenames with spaces survive)
while IFS= read -r src; do
  cp "$src" "src/assets/images/$(basename "$src")"
done < .planning/data/media-originals.txt

cd src/assets/images/
for img in *.jpg *.png; do
  [ -f "$img" ] || continue
  cwebp -q 82 "$img" -o "${img%.*}.webp" && rm "$img"
done
```
|
|
|
|
Remap URLs during HTML build:
|
|
|
|
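```python
import json
from pathlib import Path

manifest = json.loads(Path(".planning/data/media-manifest.json").read_text(encoding="utf-8"))
url_map = {m["wp_url"]: m["am_url"] for m in manifest}


def rewrite_media_urls(html: str) -> str:
    """Swap every known WordPress media URL for its AM asset path."""
    # Longest URLs first so a shorter URL never clobbers part of a longer one
    for wp_url in sorted(url_map, key=len, reverse=True):
        html = html.replace(wp_url, url_map[wp_url])
    return html
```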
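
The manifest consumed above needs one `wp_url` / `am_url` pair per original upload. A minimal sketch of generating it from the cataloged upload paths follows. It assumes old URLs take the form `{site_url}/wp-content/uploads/{path}` and that images are flattened into `/assets/images/` by base name. The function name is illustrative.

```python
import re
from pathlib import PurePosixPath

# WordPress-generated thumbnails end in -{width}x{height} before the extension
SIZE_VARIANT = re.compile(r"-\d+x\d+\.(jpe?g|png)$", re.IGNORECASE)


def build_manifest(upload_paths: list[str], site_url: str) -> list[dict]:
    """Map each original upload to its WebP path under /assets/images/."""
    manifest = []
    for rel in sorted(upload_paths):
        if SIZE_VARIANT.search(rel):
            continue  # skip variants such as photo-150x150.jpg
        stem = PurePosixPath(rel).stem
        manifest.append({
            "wp_url": f"{site_url}/wp-content/uploads/{rel}",
            "am_url": f"/assets/images/{stem}.webp",
        })
    return manifest
```

Two caveats with this approach: flattening by base name can collide when the same filename exists in different `uploads/` subfolders, and page HTML that references a size variant will not match any `wp_url` unless the variant suffix is stripped first.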
|
|
|
|
### SEO Preservation
|
|
|
|
Before building HTML, map every WordPress page URL to its new AM URL and
|
|
ensure title, description, canonical, and schema.org are preserved or improved.
|
|
|
|
Rank Math SEO fields are already captured in `pages.json` as `seo_title` and `seo_description`.
|
|
|
|
Priority order for SEO fields:
|
|
1. `seo_title` from Rank Math (if not empty and not a template)
|
|
2. `post_title` with AM format appended: `{Title} | {Brand Name}`
|
|
3. Never leave title as the raw WP default
|
|
|
|
Rank Math title templates use `%` tokens. Strip them and rebuild:
|
|
|
|
```python
import re


def clean_rm_title(rm_title: str, post_title: str, site_name: str) -> str:
    """Fall back to "{Title} | {Brand}" when the Rank Math title is empty or a template."""
    if not rm_title or "%" in rm_title:
        return f"{post_title} | {site_name}"
    return rm_title


def clean_rm_desc(rm_desc: str) -> str:
    """Drop %token% placeholders, then trim leftover separators."""
    return re.sub(r"%[a-z_]+%", "", rm_desc).strip(" -|")
```
|
|
|
|
Schema.org by page type:
|
|
|
|
| Page | Schema type | Required fields |
|------|------------|----------------|
| Home | `LocalBusiness` | name, url, telephone, address, areaServed, openingHours |
| About | `AboutPage` + `Organization` | name, description, founders |
| Contact | `ContactPage` | name, url, telephone, email, address |
| Blog post | `Article` | headline, datePublished, author, image |
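
As an illustration of the Home row, the sketch below builds a `LocalBusiness` JSON-LD `<script>` tag from a dict of business details. The input keys (`street`, `city`, `zip`, and so on) and the function name are assumptions. The output property names are standard schema.org vocabulary.

```python
import json


def local_business_jsonld(biz: dict) -> str:
    """Return a <script type="application/ld+json"> tag for the home page."""
    data = {
        "@context": "https://schema.org",
        "@type": "LocalBusiness",
        "name": biz["name"],
        "url": biz["url"],
        "telephone": biz["telephone"],
        "address": {
            "@type": "PostalAddress",
            "streetAddress": biz["street"],
            "addressLocality": biz["city"],
            "addressRegion": biz["state"],
            "postalCode": biz["zip"],
        },
        "areaServed": biz["area_served"],
        "openingHours": biz["opening_hours"],
    }
    # Escape "</" so a stray "</script>" in the data cannot close the tag early
    body = json.dumps(data, indent=2).replace("</", "<\\/")
    return f'<script type="application/ld+json">\n{body}\n</script>'
```

The returned string can be stamped into a template through a placeholder in `<head>`, the same way as any other value.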
|
|
|
|
Pre-launch SEO audit (all must return empty):
|
|
|
|
```bash
|
|
SITE=src
|
|
|
|
# Every page has title/description/canonical/JSON-LD
|
|
find $SITE -name "*.html" | xargs grep -L '<title>'
|
|
find $SITE -name "*.html" | xargs grep -L 'name="description"'
|
|
find $SITE -name "*.html" | xargs grep -L 'rel="canonical"'
|
|
find $SITE -name "*.html" | xargs grep -L 'application/ld+json'
|
|
|
|
# No WP URLs leaked
|
|
grep -r "wp-content\|wp-admin\|?p=\|?page_id=" $SITE --include="*.html"
|
|
|
|
# No unreplaced placeholders
|
|
grep -r "{{" $SITE --include="*.html"
|
|
|
|
# No Divi residue
|
|
grep -r "et_pb_\|wp:divi" $SITE --include="*.html"
|
|
```
|
|
|
|
### Run Order (Complete Execution Sequence)
|
|
|
|
```bash
|
|
export DOMAIN="{domain}"
|
|
export PROJECT="/home/sirdrez/arisingmedia-websites/$DOMAIN"
|
|
export SOPS="/home/sirdrez/arisingmedia-websites/.am-webdesign-sops"
|
|
export WPRESS=$(ls $PROJECT/.planning/*.wpress | head -1)
|
|
|
|
# Phase 0: Setup
|
|
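# Note: a single-item brace group such as {content} is not expanded by bash, so data/content is written out directly
mkdir -p $PROJECT/{src/{about,services,contact,blog,classes,components,assets/{css,js,images,svg,fonts}},build,infra,api,.planning/{data/content,scripts,wpress-extract}}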
|
|
|
|
# Phase 1: Extract archive
|
|
python3 $SOPS/wp-divi-pipeline/scripts/extract_wpress.py "$WPRESS" "$PROJECT/.planning/wpress-extract/"
|
|
|
|
# Phase 2: Database analysis
|
|
python3 $SOPS/wp-divi-pipeline/scripts/analyze_db.py "$PROJECT/.planning/wpress-extract/" "$PROJECT/.planning/data/"
|
|
|
|
# Phase 3: Content extraction (Divi 5 example)
|
|
python3 $SOPS/wp-divi-pipeline/scripts/extract_divi5.py "$PROJECT/.planning/data/pages.json" "$PROJECT/.planning/data/content/"
|
|
|
|
# Phase 4: Design system (manual: read design-system.json, write main.css)
|
|
|
|
# Phase 5: Media migration
|
|
find $PROJECT/.planning/wpress-extract/uploads -type f \( -name "*.jpg" -o -name "*.png" \) | \
  grep -v -E "\-[0-9]+x[0-9]+\.(jpg|png)$" > $PROJECT/.planning/data/media-originals.txt

# Quote "$src" inside basename so filenames with spaces survive the copy
while IFS= read -r src; do
  cp "$src" "$PROJECT/src/assets/images/$(basename "$src")"
done < $PROJECT/.planning/data/media-originals.txt
|
|
|
|
cd $PROJECT/src/assets/images/
|
|
for img in *.jpg *.png; do
|
|
[ -f "$img" ] || continue
|
|
cwebp -q 82 "$img" -o "${img%.*}.webp" && rm "$img"
|
|
done
|
|
|
|
# Phase 6: Build HTML (manual: per 05-content-migration.md)
|
|
|
|
# Phase 7: SEO audit
|
|
cd $PROJECT/src
|
|
find . -name "*.html" | grep -v "_template" | xargs grep -L '<title>'
|
|
find . -name "*.html" | grep -v "_template" | xargs grep -L 'rel="canonical"'
|
|
|
|
# Phase 8: Docker setup
|
|
docker compose -f $PROJECT/docker-compose.yml build
|
|
docker compose -f $PROJECT/docker-compose.yml up -d
|
|
curl -I http://localhost:{port}/
|
|
|
|
# Phase 9: Protection check
|
|
bash $SOPS/tools/verify-protection.sh https://$DOMAIN
|
|
```
|
|
|
|
---
|
|
|
|
## Docker + Nginx Deployment
|
|
|
|
Every project ships with ALL deployment configs so it can go to either a
|
|
Docker VPS or a cPanel shared host without refactoring.
|
|
|
|
### docker-compose.yml
|
|
|
|
```yaml
services:
  web:
    image: {domain}-static
    build:
      context: .
      dockerfile: Dockerfile
    ports:
      - "{port}:80"
    depends_on:
      api:
        condition: service_healthy
    restart: unless-stopped

  api:
    image: {domain}-api
    build:
      context: ./api
      dockerfile: Dockerfile
    env_file: ./api/.env
    expose:
      - "3001"
    healthcheck:
      test: ["CMD", "python3", "-c", "import urllib.request,sys; sys.exit(0 if urllib.request.urlopen('http://localhost:3001/health',timeout=3).status==200 else 1)"]
      interval: 10s
      timeout: 5s
      retries: 3
    restart: unless-stopped
```
|
|
|
|
Port assignments are unique per project. Track in
|
|
`/home/sirdrez/arisingmedia-websites/PORTS.md` so no two projects collide.
|
|
|
|
### Dockerfile (nginx web container)
|
|
|
|
CRITICAL: the Dockerfile must explicitly list which folders to copy. Never use
|
|
`COPY . /usr/share/nginx/html/` because that copies `.env`, `Dockerfile`,
|
|
build scripts, etc. into the web root where they become URL-accessible.
|
|
|
|
```dockerfile
|
|
FROM nginx:alpine
|
|
|
|
# nginx config: server-only, never served as a static file
|
|
COPY nginx.conf /etc/nginx/conf.d/default.conf
|
|
|
|
# Public website only: explicit list, no wildcards
|
|
COPY index.html /usr/share/nginx/html/
|
|
COPY assets /usr/share/nginx/html/assets/
|
|
COPY components /usr/share/nginx/html/components/
|
|
COPY about /usr/share/nginx/html/about/
|
|
COPY blog /usr/share/nginx/html/blog/
|
|
COPY contact /usr/share/nginx/html/contact/
|
|
COPY locations /usr/share/nginx/html/locations/
|
|
COPY reviews /usr/share/nginx/html/reviews/
|
|
COPY services /usr/share/nginx/html/services/
|
|
|
|
EXPOSE 80
|
|
```
|
|
|
|
### Dockerfile (api Python container)
|
|
|
|
```dockerfile
|
|
FROM python:3.13-alpine
|
|
WORKDIR /app
|
|
COPY server.py .
|
|
EXPOSE 3001
|
|
CMD ["python3", "-u", "server.py"]
|
|
```
|
|
|
|
No pip, no requirements.txt, no node_modules. Python stdlib only.
|
|
|
|
### nginx.conf
|
|
|
|
```nginx
|
|
server {
|
|
listen 80;
|
|
server_name _;
|
|
root /usr/share/nginx/html;
|
|
index index.html;
|
|
|
|
# Defense in depth: deny dotfiles, configs, scripts, source files
|
|
location ~ /\. {
|
|
deny all;
|
|
return 404;
|
|
}
|
|
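    # Note: `txt` in this pattern also matches /robots.txt. If robots.txt is served
    # from this container, add an exact-match `location = /robots.txt { }` block
    # (exact matches take precedence over regex locations).
    location ~* \.(env|env\.example|conf|yml|yaml|py|pyc|md|txt|sh|sql|log|bak|old|swp|dockerfile)$ {
        deny all;
        return 404;
    }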
|
|
location = /Dockerfile {
|
|
deny all;
|
|
return 404;
|
|
}
|
|
|
|
# API proxy: strip /api/ prefix, forward to Python service
|
|
location /api/ {
|
|
proxy_pass http://api:3001/;
|
|
proxy_http_version 1.1;
|
|
proxy_set_header Host $host;
|
|
proxy_set_header X-Real-IP $remote_addr;
|
|
proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;
|
|
proxy_read_timeout 10s;
|
|
proxy_connect_timeout 5s;
|
|
}
|
|
|
|
# Flat HTML routing: /locations/buffalo serves /locations/buffalo.html
|
|
location / {
|
|
try_files $uri $uri/ $uri.html =404;
|
|
}
|
|
|
|
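    # Cache static assets aggressively
    # Note: an add_header inside a location replaces the server-level add_header
    # directives for matching responses, so the security headers below are not sent
    # with these assets unless they are repeated in this block.
    location ~* \.(jpg|jpeg|png|webp|svg|ico|css|js|woff2?|mp4|webm)$ {
        expires 30d;
        add_header Cache-Control "public, immutable";
        access_log off;
    }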
|
|
|
|
# Security headers
|
|
add_header X-Frame-Options "SAMEORIGIN";
|
|
add_header X-Content-Type-Options "nosniff";
|
|
add_header X-XSS-Protection "1; mode=block";
|
|
add_header Referrer-Policy "strict-origin-when-cross-origin";
|
|
add_header Permissions-Policy "geolocation=(), microphone=(), camera=()";
|
|
add_header Content-Security-Policy "default-src 'self'; script-src 'self' 'unsafe-inline' https://www.google.com https://www.gstatic.com https://www.recaptcha.net; style-src 'self' 'unsafe-inline' https://fonts.googleapis.com; font-src https://fonts.gstatic.com; img-src 'self' data: https:; object-src 'none'; frame-ancestors 'self'; form-action 'self'; base-uri 'self';";
|
|
add_header Strict-Transport-Security "max-age=31536000; includeSubDomains; preload";
|
|
|
|
# Disable server tokens
|
|
server_tokens off;
|
|
client_max_body_size 16k;
|
|
|
|
gzip on;
|
|
gzip_types text/html text/css application/javascript image/svg+xml;
|
|
gzip_min_length 1024;
|
|
|
|
error_page 404 /404.html;
|
|
error_page 500 /500.html;
|
|
}
|
|
```
|
|
|
|
### .dockerignore
|
|
|
|
Keeps sensitive files out of the build context:
|
|
|
|
```
|
|
.git
|
|
.gitignore
|
|
.dockerignore
|
|
api
|
|
build_*.py
|
|
__pycache__
|
|
*.pyc
|
|
*.md
|
|
*.txt
|
|
review_*.png
|
|
docker-compose.yml
|
|
.DS_Store
|
|
.planning
|
|
```
|
|
|
|
### .gitignore
|
|
|
|
```
|
|
api/.env
|
|
api/__pycache__/
|
|
__pycache__/
|
|
*.pyc
|
|
*.log
|
|
.DS_Store
|
|
```
|
|
|
|
The `api/.env` file is NEVER committed.
|
|
|
|
### Sync from Source to Deployment
|
|
|
|
After every change to source HTML/CSS/JS/assets:
|
|
|
|
```bash
|
|
SITE="/path/to/concept-agent/projects/{domain}/site"
|
|
DEPLOY="/home/sirdrez/arisingmedia-websites/{domain}"
|
|
|
|
rsync -a \
  --exclude=.git --exclude=.planning --exclude=api \
  --exclude=Dockerfile --exclude=nginx.conf --exclude=docker-compose.yml \
  --exclude=.dockerignore --exclude=.gitignore \
  --exclude='build_*.py' --exclude=__pycache__ --exclude=data \
  --exclude='*.md' --exclude='*.txt' --exclude='review_*.png' \
  "$SITE/" "$DEPLOY/"
|
|
|
|
cd "$DEPLOY"
|
|
docker compose up -d --build web
|
|
```
|
|
|
|
### Verify After Deploy
|
|
|
|
Every deploy MUST be audited with `tools/verify-protection.sh` before being
|
|
considered live. The script probes a fixed list of sensitive paths
|
|
(`Dockerfile`, `.env`, `nginx.conf`, `.planning/`, `__pycache__/`, build
|
|
scripts, `.git/`, etc.) and fails if any returns 200.
|
|
|
|
```bash
|
|
~/arisingmedia-websites/.am-webdesign-sops/tools/verify-protection.sh \
|
|
http://localhost:{port}
|
|
```
|
|
|
|
Exit codes:
|
|
- `0` PASS: every sensitive path 404, every required path reachable.
|
|
- `0` PASS (with warnings): protection clean but `/robots.txt` or
|
|
`/sitemap.xml` missing (content gap, not a leak).
|
|
- `1` FAIL: at least one sensitive path returned 200, or `/` is unreachable.
|
|
|
|
Run it manually after every `docker compose up -d --build`. Wire it into CI
once the site has a remote pipeline. Treat a FAIL as a reason to roll back the deploy.
|
|
|
|
For ad-hoc spot checks:
|
|
|
|
```bash
|
|
curl -s -o /dev/null -w "site: %{http_code}\n" http://localhost:{port}/
|
|
curl -s -o /dev/null -w "css: %{http_code}\n" http://localhost:{port}/assets/css/main.css
|
|
curl -s -o /dev/null -w "api: %{http_code}\n" http://localhost:{port}/api/health
|
|
```
|
|
|
|
All public paths return 200. All sensitive paths return 404.
|
|
|
|
### Project Folder Rename Procedure
|
|
|
|
WHY: Docker Compose derives its project name from the folder the
|
|
`docker-compose.yml` lives in. Renaming the folder changes the compose project
|
|
name, which orphans any running containers under the old name.
|
|
|
|
The fix is to explicitly remove the old container before bringing up the new
|
|
compose project:
|
|
|
|
```bash
|
|
# Stop and remove the old container by its known name
|
|
docker stop {container-name}
|
|
docker rm {container-name}
|
|
|
|
# Now bring up from the renamed folder: clean start
|
|
docker compose -f /path/to/renamed-folder/docker-compose.yml up -d
|
|
```
|
|
|
|
Always confirm the env vars loaded correctly after restart:
|
|
|
|
```bash
|
|
docker exec {container-name} env | grep RESEND
|
|
```
|
|
|
|
---
|
|
|
|
## cPanel + Apache Deployment
|
|
|
|
Use this deployment method when the client's host is cPanel-based (shared hosting,
|
|
WHM, Bluehost, HostGator, SiteGround, etc.) instead of a VPS running Docker.
|
|
|
|
### Key Rule: Repo Path ≠ Webroot
|
|
|
|
cPanel Git requires an EMPTY directory as the repository path. The webroot
|
|
(`public_html/{domain}/`) is never the repo path: cPanel rejects it if it
|
|
already contains files.
|
|
|
|
```
|
|
Repo path (empty dir): /home/{username}/repositories/{domain}/
|
|
Deploy target (webroot): /home/{username}/public_html/{domain}/
|
|
```
|
|
|
|
### Setting Up the Repo in cPanel
|
|
|
|
1. cPanel → Git Version Control → Create Repository
|
|
2. Repository Path: `/home/{username}/repositories/{domain}/` (must be empty)
|
|
3. Clone URL: your Git remote (GitHub, Bitbucket, etc.)
|
|
4. cPanel clones into the repo path: never into the webroot
|
|
|
|
### .cpanel.yml
|
|
|
|
This file lives in the repo root and tells cPanel what to copy to the webroot
|
|
on every push/deploy. All paths are relative to the repo root.
|
|
|
|
```yaml
|
|
---
|
|
deployment:
|
|
tasks:
|
|
- export DEPLOYPATH=/home/{username}/public_html/{domain}/
|
|
- /bin/cp -r assets $DEPLOYPATH
|
|
- /bin/cp -r about $DEPLOYPATH
|
|
- /bin/cp -r commercial $DEPLOYPATH
|
|
- /bin/cp -r contact $DEPLOYPATH
|
|
- /bin/cp -r locations $DEPLOYPATH
|
|
- /bin/cp -r reviews $DEPLOYPATH
|
|
- /bin/cp -r service-area $DEPLOYPATH
|
|
- /bin/cp -r services $DEPLOYPATH
|
|
- /bin/cp index.html $DEPLOYPATH
|
|
- /bin/cp 404.html $DEPLOYPATH
|
|
- /bin/cp robots.txt $DEPLOYPATH
|
|
- /bin/cp sitemap.xml $DEPLOYPATH
|
|
```
|
|
|
|
Add or remove folder cp lines to match the project's actual directory structure.
|
|
Do NOT copy: `tools/`, `*.py`, `*.md`, `.git/`, `docker-compose.yml`, `Dockerfile`.
|
|
|
|
### Lahrcarpetcleaning.com Reference
|
|
|
|
```yaml
|
|
---
|
|
deployment:
|
|
tasks:
|
|
- export DEPLOYPATH=/home/dev1communitypro/public_html/lahrcarpetcleaning.dev1.communityproud.com/
|
|
- /bin/cp -r assets $DEPLOYPATH
|
|
- /bin/cp -r about $DEPLOYPATH
|
|
- /bin/cp -r commercial $DEPLOYPATH
|
|
- /bin/cp -r contact $DEPLOYPATH
|
|
- /bin/cp -r locations $DEPLOYPATH
|
|
- /bin/cp -r reviews $DEPLOYPATH
|
|
- /bin/cp -r service-area $DEPLOYPATH
|
|
- /bin/cp -r services $DEPLOYPATH
|
|
- /bin/cp index.html $DEPLOYPATH
|
|
- /bin/cp 404.html $DEPLOYPATH
|
|
- /bin/cp robots.txt $DEPLOYPATH
|
|
- /bin/cp sitemap.xml $DEPLOYPATH
|
|
```
|
|
|
|
### Deploying After a Push
|
|
|
|
1. Push to the connected remote (GitHub)
|
|
2. cPanel → Git Version Control → Manage → Pull or Deploy
|
|
3. cPanel runs the `.cpanel.yml` tasks, copying files to webroot
|
|
4. Apache serves from webroot automatically: no nginx, no Docker
|
|
|
|
### Apache vs nginx
|
|
|
|
cPanel hosts use Apache (not nginx). There is no nginx.conf to manage.
|
|
URL routing is handled by `.htaccess`:
|
|
|
|
```apache
|
|
Options -Indexes
|
|
RewriteEngine On
|
|
|
|
# Directory-style URLs: /services/carpet-cleaning/ → index.html inside that folder
|
|
# Apache handles this automatically with DirectoryIndex: no extra rules needed
|
|
|
|
# Deny sensitive files
|
|
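<FilesMatch "\.(py|yml|yaml|md|log|sh|env|conf|dockerfile)$">
  # Apache 2.4 uses Require; the Order/Deny form is 2.2 syntax kept as a fallback
  <IfModule mod_authz_core.c>
    Require all denied
  </IfModule>
  <IfModule !mod_authz_core.c>
    Order allow,deny
    Deny from all
  </IfModule>
</FilesMatch>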
|
|
|
|
# Security headers
|
|
<IfModule mod_headers.c>
|
|
Header set X-Frame-Options "SAMEORIGIN"
|
|
Header set X-Content-Type-Options "nosniff"
|
|
Header set X-XSS-Protection "1; mode=block"
|
|
Header set Referrer-Policy "strict-origin-when-cross-origin"
|
|
Header set Permissions-Policy "geolocation=(), microphone=(), camera=()"
|
|
Header set Strict-Transport-Security "max-age=31536000; includeSubDomains"
|
|
</IfModule>
|
|
|
|
ErrorDocument 404 /404.html
|
|
ErrorDocument 500 /500.html
|
|
```
|
|
|
|
### Cache Busting on cPanel
|
|
|
|
Apache does not auto-invalidate cached assets. Bump `?v=N` on CSS/JS in
|
|
all HTML files after every asset change:
|
|
|
|
```html
|
|
<link rel="stylesheet" href="/assets/css/styles.css?v=6">
|
|
<script src="/assets/js/main.js?v=3"></script>
|
|
```
|
|
|
|
Increment by 1 on every change. Apply across ALL HTML pages.
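
Bumping the version by hand across many pages is easy to get wrong. The following is a minimal sketch of how the bump could be scripted with the Python stdlib; the function names `bump_versions` and `bump_tree` are hypothetical, and the sketch assumes every CSS/JS reference already carries a `?v=N` suffix. It increments every versioned CSS/JS reference it finds, not only the asset that changed.

```python
import pathlib
import re

# Matches "?v=<digits>" directly after a .css or .js path,
# e.g. /assets/css/styles.css?v=6
VERSION_RE = re.compile(r"(\.(?:css|js)\?v=)(\d+)")

def bump_versions(html: str) -> str:
    """Return html with every ?v=N on a .css/.js URL incremented by 1."""
    return VERSION_RE.sub(lambda m: f"{m.group(1)}{int(m.group(2)) + 1}", html)

def bump_tree(root: str) -> int:
    """Rewrite every .html file under root; return how many files changed."""
    changed = 0
    for page in pathlib.Path(root).rglob("*.html"):
        original = page.read_text(encoding="utf-8")
        updated = bump_versions(original)
        if updated != original:
            page.write_text(updated, encoding="utf-8")
            changed += 1
    return changed
```

Calling `bump_tree(".")` from the repo root before committing would apply the bump to all pages in one pass.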
|
|
|
|
### Verify After cPanel Deploy
|
|
|
|
```bash
|
|
curl -s -o /dev/null -w "home: %{http_code}\n" https://{domain}/
|
|
curl -s -o /dev/null -w "css: %{http_code}\n" https://{domain}/assets/css/styles.css
|
|
curl -s -o /dev/null -w "404: %{http_code}\n" https://{domain}/page-that-does-not-exist
|
|
```
|
|
|
|
All public paths return 200. All non-existent paths return 404.
|
|
|
|
### Universal Project Checklist (Both Paths)
|
|
|
|
Every project must include ALL of these before first deploy:
|
|
|
|
```
|
|
Dockerfile ✓ Docker/VPS
|
|
docker-compose.yml ✓ Docker/VPS
|
|
nginx.conf ✓ Docker/VPS
|
|
.htaccess ✓ cPanel/Apache
|
|
.cpanel.yml ✓ cPanel Git
|
|
.dockerignore ✓ Docker build security
|
|
.gitignore ✓ keeps .env and secrets out of git
|
|
robots.txt ✓ both paths
|
|
sitemap.xml ✓ both paths
|
|
404.html ✓ both paths
|
|
500.html ✓ both paths
|
|
```
|
|
|
|
Lahrcarpetcleaning.com is the reference implementation for both paths.
|
|
|
|
---
|
|
|
|
## Domain, Email, DNS, and Resend
|
|
|
|
### Resend Account Setup
|
|
|
|
1. Sign up at https://resend.com
|
|
2. Generate an API key (one per project): https://resend.com/api-keys
|
|
3. Save the key in the project's `api/.env` as `RESEND_API_KEY=re_xxxx`
|
|
4. NEVER commit `.env`. NEVER paste the key in Slack, GitHub, or chat logs.
|
|
|
|
### Add and Verify the Sending Domain
|
|
|
|
1. https://resend.com/domains → **Add Domain**
|
|
2. Enter the domain (the one you'll send FROM, not necessarily the website domain)
|
|
3. Resend gives 3-4 DNS records. Add them all in Cloudflare (or whatever DNS host)
|
|
4. Wait 5-15 minutes, click **Verify** in Resend: all records must show green
|
|
|
|
### Records Resend Provides
|
|
|
|
| Type | Name | Value | Proxy | TTL |
|------|------|-------|-------|-----|
| TXT | `resend._domainkey` | `p=...long-rsa-key...` | DNS only | 1 hr |
| TXT | `send` | `v=spf1 include:amazonses.com ~all` | DNS only | 1 hr |
| MX | `send` | `feedback-smtp.{region}.amazonses.com` priority 10 | DNS only | 1 hr |
|
|
|
|
(Resend uses Amazon SES under the hood, hence `amazonses.com` in the SPF.)
|
|
|
|
### DMARC: REQUIRED for Inbox Placement
|
|
|
|
Without DMARC, Gmail flags otherwise-correctly-configured email as suspicious
|
|
and routes it to spam. Resend doesn't auto-create this record. You must add it.
|
|
|
|
| Type | Name | Value | Proxy | TTL |
|------|------|-------|-------|-----|
| TXT | `_dmarc` | `v=DMARC1; p=none; rua=mailto:dev@{domain}` | DNS only | Auto |
|
|
|
|
Components:
|
|
- `v=DMARC1`: declares a DMARC policy exists
|
|
- `p=none`: monitor mode, doesn't reject anything yet (safe to start)
|
|
- `rua=mailto:...`: DMARC failure reports go to this inbox (review weekly)
|
|
|
|
After 30 days of clean DMARC reports with no false positives, optionally
|
|
upgrade to `p=quarantine` then `p=reject`.
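
Before pasting a DMARC value into DNS, it can help to sanity-check its tags. The sketch below splits a record into `tag=value` pairs and flags the three components described above. The helper names `parse_dmarc` and `dmarc_problems` are hypothetical, and treating `rua` as required reflects this SOP's convention, not the DMARC standard (where it is optional).

```python
def parse_dmarc(record: str) -> dict[str, str]:
    """Split a DMARC TXT value into its tag=value pairs."""
    tags = {}
    for part in record.split(";"):
        part = part.strip()
        if not part:
            continue
        key, _, value = part.partition("=")
        tags[key.strip().lower()] = value.strip()
    return tags

def dmarc_problems(record: str) -> list[str]:
    """Return a list of human-readable problems; empty means the record looks fine."""
    tags = parse_dmarc(record)
    problems = []
    if tags.get("v") != "DMARC1":
        problems.append("missing or wrong v=DMARC1")
    if tags.get("p") not in {"none", "quarantine", "reject"}:
        problems.append("p must be none, quarantine, or reject")
    # Required by this SOP so failure reports land in a reviewed inbox
    if not tags.get("rua", "").startswith("mailto:"):
        problems.append("rua should be a mailto: address")
    return problems
```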
|
|
|
|
### Verify DNS is Live
|
|
|
|
```bash
|
|
dig +short TXT resend._domainkey.{domain} @8.8.8.8
|
|
dig +short TXT send.{domain} @8.8.8.8
|
|
dig +short TXT _dmarc.{domain} @8.8.8.8
|
|
dig +short MX send.{domain} @8.8.8.8
|
|
```
|
|
|
|
All four should return their expected values.
|
|
|
|
### From-Name Format
|
|
|
|
Always use a friendly From name, not bare email. Bare email looks robotic
|
|
and triggers spam filters.
|
|
|
|
```
|
|
FROM_EMAIL=Brand Name <webleads@{domain}>
|
|
```
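
A quick way to catch a bare address before it reaches `.env` is to parse the value with the stdlib. This is only a sketch; `has_friendly_from` is a hypothetical helper name, not part of the form service.

```python
from email.utils import parseaddr

def has_friendly_from(value: str) -> bool:
    """True when value looks like 'Display Name <user@domain>'."""
    name, addr = parseaddr(value)
    return bool(name.strip()) and "@" in addr
```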
|
|
|
|
### TO-Email Setup
|
|
|
|
`TO_EMAIL` is wherever the lead actually goes, often a Gmail group address
or the owner's personal inbox.
|
|
|
|
- During Resend domain verification (BEFORE green): you can ONLY send TO the
|
|
email tied to the Resend account
|
|
- After verification: send to anyone
|
|
|
|
For local testing without verification, use:
|
|
```
|
|
FROM_EMAIL=onboarding@resend.dev
|
|
TO_EMAIL={your-resend-account-email}
|
|
```
|
|
|
|
### When Emails Go to Spam
|
|
|
|
Run this checklist:
|
|
|
|
1. **All 4 DNS records green at Resend**? If not, deliverability suffers.
|
|
2. **DMARC TXT record exists**? Most common cause of spam folder.
|
|
3. **Friendly From name**? `Brand Name <webleads@...>` not bare `webleads@...`
|
|
4. **Both `html` and `text` parts in the payload**? HTML-only is suspicious.
|
|
5. **Subject line clean**? No em-dashes, no "Estimate Request URGENT", no all-caps.
|
|
6. **Recipient marked first emails as Not Spam**? Train Gmail.
|
|
|
|
### Cloudflare-Specific Notes
|
|
|
|
The user-agent quirk: Cloudflare in front of Resend's API blocks Python's default
|
|
`User-Agent: Python-urllib/3.x`. Always set a custom `User-Agent` in the API request headers.
|
|
|
|
If the DNS provider is Cloudflare, ensure all Resend records have **proxy status: DNS only**
|
|
(the gray cloud icon, not orange). Proxying these breaks authentication.
|
|
|
|
### Annual Key Rotation
|
|
|
|
Rotate Resend API keys annually:
|
|
1. Generate new key in Resend dashboard
|
|
2. Update `api/.env` on the server
|
|
3. `docker compose down && docker compose up -d` to reload env
|
|
4. Confirm a test submission still works
|
|
5. Revoke the old key in Resend dashboard
|
|
|
|
### Resend HTTP 403: Domain Not Verified
|
|
|
|
A 403 from the Resend API does NOT mean the API key is wrong. The specific
|
|
error is:
|
|
|
|
```json
|
|
{"statusCode":403,"message":"The {domain} domain is not verified. Please, add and verify your domain on https://resend.com/domains","name":"validation_error"}
|
|
```
|
|
|
|
This means the key is valid and authenticated, but the FROM domain has not
|
|
been added or verified at resend.com/domains yet.
|
|
|
|
Rule: **verify the domain BEFORE testing the form endpoint.** If you test
|
|
before verification, `{"ok":false}` will be returned to the visitor even
|
|
though the API key is correct and the code is correct.
|
|
|
|
Sequence:
|
|
1. Set `RESEND_API_KEY` in `.env`
|
|
2. Add domain at resend.com/domains
|
|
3. Add DNS records in Cloudflare
|
|
4. Wait for green verification
|
|
5. Then test the form endpoint
|
|
|
|
### DKIM Key Rotation
|
|
|
|
Resend periodically rotates DKIM keys and sends a notification email when this
happens. Add the new `resend2._domainkey` (or whichever selector they specify)
TXT record in Cloudflare, then click Verify. The old key remains active until
Resend removes it.
|
|
|
|
---
|
|
|
|
## Form Handling: Resend
|
|
|
|
Static sites can't send email by themselves. Every project that needs a
|
|
contact form gets a small Python service running in its own Docker container,
|
|
proxied by nginx.
|
|
|
|
### Architecture
|
|
|
|
```
|
|
Browser → POST /api/estimate (vanilla JS fetch in form.js)
|
|
↓
|
|
nginx → proxies /api/ to api:3001 (strips /api/ prefix)
|
|
↓
|
|
Python service (server.py, stdlib only)
|
|
- Validates fields server-side
|
|
- Verifies reCAPTCHA v3 with Google
|
|
- Sends via Resend HTTPS API
|
|
- Returns {ok: true} or {error: ...}
|
|
```
|
|
|
|
### Front-End (Vanilla JS)
|
|
|
|
`assets/js/form.js`:
|
|
|
|
- Real-time validation (blur events)
|
|
- Phone formatting `(###) ###-####`
|
|
- Email regex check
|
|
- Required-field check
|
|
- Async submit to `/api/estimate` with JSON body
|
|
- Disable submit button + show "Sending..." during request
|
|
- Show success/error message in `.form-status` span
|
|
- Reset form on success
|
|
- reCAPTCHA v3 token fetched before submit and included in body
|
|
|
|
### Back-End (Python stdlib)
|
|
|
|
`api/server.py` (skeleton):
|
|
|
|
```python
#!/usr/bin/env python3
import hashlib, http.server, json, os, re, socketserver, time
import urllib.error, urllib.parse, urllib.request

PORT = int(os.environ.get("PORT", "3001"))
RESEND_API_KEY = os.environ.get("RESEND_API_KEY", "")
RECAPTCHA_SECRET = os.environ.get("RECAPTCHA_SECRET", "")
TO_EMAIL = os.environ.get("TO_EMAIL", "")
FROM_EMAIL = os.environ.get("FROM_EMAIL", "")
RECAPTCHA_MIN = float(os.environ.get("RECAPTCHA_MIN", "0.5"))

PHONE_RE = re.compile(r"^\(?\d{3}\)?[\s.\-]?\d{3}[\s.\-]?\d{4}$")
EMAIL_RE = re.compile(r"^[^\s@]+@[^\s@]+\.[^\s@]+$")

# Rate limit: 5 requests / IP / 15 minutes
RATE_MAP = {}
RATE_WINDOW = 15 * 60
RATE_MAX = 5

def sanitize(s):
    # HTML-escape the value and cap its length before it goes into the email body
    if not isinstance(s, str):
        return ""
    return (s.replace("&", "&amp;").replace("<", "&lt;").replace(">", "&gt;")
             .replace('"', "&quot;").strip()[:2000])

def validate_fields(body):
    # Returns the names of the fields that failed validation
    errors = []
    if not body.get("name") or len((body["name"]).strip()) < 2: errors.append("name")
    if not EMAIL_RE.match((body.get("email") or "").strip()): errors.append("email")
    if not PHONE_RE.match((body.get("phone") or "").replace(" ", "")): errors.append("phone")
    return errors

def verify_recaptcha(token):
    # Returns the reCAPTCHA v3 score, or 0.0 when unconfigured or on any failure
    if not RECAPTCHA_SECRET or not token:
        return 0.0
    data = urllib.parse.urlencode({"secret": RECAPTCHA_SECRET, "response": token}).encode()
    req = urllib.request.Request("https://www.google.com/recaptcha/api/siteverify", data=data)
    try:
        with urllib.request.urlopen(req, timeout=8) as resp:
            return float(json.loads(resp.read()).get("score", 0))
    except Exception:
        return 0.0

def send_via_resend(fields):
    safe = {k: sanitize(fields.get(k, "")) for k in ["name","email","phone","address","city","zip","service","condition","message"]}
    html = f"""<!DOCTYPE html>...{safe['name']}..."""
    text = f"New estimate request\n\nName: {safe['name']}\n..."
    payload = json.dumps({
        "from": FROM_EMAIL,
        "to": [TO_EMAIL],
        "reply_to": fields.get("email", "").strip(),
        "subject": f"New estimate request: {safe['name']} ({safe['city']})",
        "html": html, "text": text,
    }).encode("utf-8")
    idem = hashlib.sha256(payload).hexdigest()[:64]
    req = urllib.request.Request("https://api.resend.com/emails", data=payload, headers={
        "Authorization": f"Bearer {RESEND_API_KEY}",
        "Content-Type": "application/json",
        "Idempotency-Key": idem,
        "User-Agent": "{Brand}-Estimate-Form/1.0",
    })
    try:
        with urllib.request.urlopen(req, timeout=10) as resp:
            if resp.status >= 300:
                raise RuntimeError(f"Resend {resp.status}: {resp.read().decode('utf-8','ignore')}")
    except urllib.error.HTTPError as e:
        raise RuntimeError(f"Resend {e.code}: {e.read().decode('utf-8','ignore')}") from None
```
|
|
|
|
Reference implementation: `floorithardwoodfloors.com/api/server.py`.
|
|
|
|
### Critical: User-Agent Header
|
|
|
|
When calling the Resend API from Python, you MUST set a non-default User-Agent.
|
|
Cloudflare (which fronts Resend) blocks Python's default `Python-urllib/3.x`
|
|
with HTTP 403 / Cloudflare error code 1010.
|
|
|
|
```python
|
|
"User-Agent": "{ProjectName}-Form/1.0"
|
|
```
|
|
|
|
### Idempotency
|
|
|
|
Every Resend request includes an `Idempotency-Key` header set to the SHA-256
hex digest of the payload (the digest is already 64 characters, so the `[:64]`
slice is only a guard). Identical payloads within 24 hours are deduplicated by
Resend automatically. This prevents:
|
|
- Double-clicks creating two leads
|
|
- Browser retries after a network blip
|
|
- Honest user submitting twice
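
The property that matters is that byte-identical payloads always produce the same key. A minimal sketch of that derivation follows; `idempotency_key` is a hypothetical helper, and passing `sort_keys=True` is an added assumption that makes the key independent of dict insertion order (the skeleton above serializes a dict built in a fixed order instead).

```python
import hashlib
import json

def idempotency_key(payload: dict) -> str:
    """Derive a stable 64-character key from the request payload."""
    # sort_keys gives a deterministic byte string for equal dicts
    body = json.dumps(payload, sort_keys=True).encode("utf-8")
    return hashlib.sha256(body).hexdigest()
```

Because the key depends only on the payload, a genuinely new submission with different field values gets a different key and is sent normally.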
|
|
|
|
### Security Checklist
|
|
|
|
- API key in `.env` file, NOT in source control. `.gitignore` it.
|
|
- API key NEVER reaches the browser bundle (only the server has it)
|
|
- `.env` file lives in `api/`, NOT in the nginx web root
|
|
- Server-side validation on EVERY field: never trust client
|
|
- HTML-escape every field rendered into the email body to prevent injection
|
|
- Rate limit per IP (5 / 15 min default)
|
|
- 16 KB body cap: reject anything larger
|
|
- 10-second upstream timeout: don't hold connections open
|
|
- CORS locked to the production domain only (`Access-Control-Allow-Origin: https://{domain}`)
|
|
- reCAPTCHA v3 with score threshold (default 0.5) once secret is configured
|
|
|
|
### Environment Variables
|
|
|
|
`api/.env`:
|
|
```
|
|
RESEND_API_KEY=re_xxxxxxxxxxxx
|
|
RECAPTCHA_SECRET=6Ldq...
|
|
TO_EMAIL=leads@{domain}
|
|
FROM_EMAIL=Brand Name <webleads@{domain}>
|
|
RECAPTCHA_MIN=0.5
|
|
PORT=3001
|
|
```
|
|
|
|
`api/.env.example` (committed) is the same file with placeholder values.
|
|
|
|
### reCAPTCHA Setup
|
|
|
|
1. Create site at https://www.google.com/recaptcha/admin
|
|
2. Type: **reCAPTCHA v3** (not v2)
|
|
3. Add your domain
|
|
4. Copy the **site key** into `assets/js/form.js`:
|
|
```js
|
|
const RECAPTCHA_SITE_KEY = '6Ldq...';
|
|
```
|
|
5. Add the script tag to pages with the form:
|
|
```html
|
|
<script src="https://www.google.com/recaptcha/api.js?render=6Ldq..."></script>
|
|
```
|
|
6. Copy the **secret key** into `api/.env` as `RECAPTCHA_SECRET`
|
|
|
|
### Deliverability Checklist
|
|
|
|
When emails are landing in spam:
|
|
1. Verify Resend domain is fully green (SPF + DKIM + DMARC)
|
|
2. From name set, not bare email: `Brand Name <webleads@{domain}>`
|
|
3. Both `html` and `text` parts in every Resend payload (no HTML-only)
|
|
4. Subject line is descriptive, no em-dash, no spam-trigger words
|
|
5. Recipient marks first 2-3 emails as "Not Spam" in Gmail to train the filter
|
|
|
|
### Testing
|
|
|
|
```bash
# Validation rejection (expect 422)
curl -X POST http://localhost:8096/api/estimate \
  -H "Content-Type: application/json" \
  -d '{"name":"","email":"bad"}'

# Full valid submission (expect 200, real email sent)
curl -X POST http://localhost:8096/api/estimate \
  -H "Content-Type: application/json" \
  -d '{"name":"Test","email":"test@example.com","phone":"(716) 555-1234","address":"100 Test St","city":"Buffalo","zip":"14201","service":"refinishing","message":"Test","token":""}'
```
|
|
|
|
The first real test email confirms the pipeline works end to end.
|
|
|
|
---
|
|
|
|
## PHP App Stack (Server-Side Processing)
|
|
|
|
Use this pattern when a project requires server-side processing that static HTML cannot handle: file conversion, at-rest encryption, payment processing, user authentication, or API-gated features.
|
|
|
|
**Reference implementation:** `quickconvert.us`
|
|
|
|
### When to Use This Pattern
|
|
|
|
- File uploads and processing (image conversion, PDF generation, etc.)
|
|
- At-rest encryption of user data
|
|
- Payment processing with Stripe subscriptions
|
|
- User authentication with magic link or password-based login
|
|
- Rate-limited APIs that must be server-enforced
|
|
|
|
**Do not** introduce this pattern just to add a contact form. Use the Python stdlib form service instead.
|
|
|
|
### Stack
|
|
|
|
- **PHP 8.3** (php:8.3-fpm-alpine base image)
|
|
- **Nginx** (Alpine package, same container via supervisord)
|
|
- **SQLite** (pdo_sqlite extension, no separate DB container needed)
|
|
- **libsodium** (built into PHP 8.x: use for all encryption)
|
|
- **ImageMagick** (pecl imagick for image processing)
|
|
- **msmtp** (SMTP relay for outbound email)
|
|
- **supervisord** (manages nginx + php-fpm + crond in one container)
|
|
|
|
### Project Structure
|
|
|
|
```
|
|
project/
|
|
├── src/ ← nginx document root
|
|
│ ├── index.php
|
|
│ ├── api/
|
|
│ │ ├── convert.php ← POST endpoint (CSRF + reCAPTCHA protected)
|
|
│ │ └── download.php ← GET endpoint (signed token)
|
|
│ ├── assets/css/
|
|
│ ├── assets/js/
|
|
│ └── assets/images/
|
|
├── includes/ ← PHP classes (above doc root, not web-accessible)
|
|
│ ├── bootstrap.php ← constants, session, autoload
|
|
│ ├── auth.php ← login, register, magic token
|
|
│ ├── csrf.php
|
|
│ ├── db.php ← SQLite PDO wrapper
|
|
│ ├── encryption.php ← libsodium wrappers
|
|
│ └── mailer.php
|
|
├── components/
|
|
│ ├── header.php
|
|
│ └── footer.php
|
|
├── storage/ ← volume-mounted, NOT in docker image
|
|
│ ├── uploads/ ← encrypted .enc files only
|
|
│ ├── converted/
|
|
│ ├── temp/
|
|
│ ├── .htaccess ← deny all direct access
|
|
│ └── {app}.db
|
|
├── infra/
|
|
│ ├── nginx.conf
|
|
│ ├── php.ini
|
|
│ ├── supervisord.conf
|
|
│ └── docker-entrypoint.sh
|
|
├── tools/
|
|
│ └── cleanup.php ← cron: delete expired tokens + files
|
|
├── Dockerfile
|
|
├── docker-compose.yml
|
|
└── .env ← gitignored, never committed
|
|
```
|
|
|
|
### Security Requirements (Non-Negotiable)
|
|
|
|
**CSRF**: every POST form and API endpoint must verify a CSRF token tied to the session.
|
|
|
|
**Rate limiting**: two layers:
|
|
1. nginx: `limit_req_zone` on /api/ (10 req/s, burst 20)
|
|
2. PHP: per-IP daily counter in SQLite rate_limits table
|
|
|
|
**reCAPTCHA v3**: on conversion/upload endpoints. Verify server-side via Google API. Cache result in session (verify once per session, not per request).
|
|
|
|
**At-rest encryption**: any user-uploaded file must be encrypted before writing to disk. Use `sodium_crypto_secretstream_xchacha20poly1305_*` for files, `sodium_crypto_secretbox` for strings. Key stored in `.env` as `QC_ENCRYPTION_KEY` (32 bytes hex).
|
|
|
|
**Signed download tokens**: never expose file paths. Issue a 64-char hex token stored in SQLite with expiry and single-use enforcement.
|
|
|
|
**Magic link auth**: prefer magic link over password. On register: create account unverified, send verify email, block login until verified. Token: 64-char hex, 1-hour expiry, stored in `magic_tokens` table, consumed on use.
|
|
|
|
### Nginx Security Headers
|
|
|
|
```nginx
|
|
add_header X-Frame-Options "SAMEORIGIN" always;
|
|
add_header X-Content-Type-Options "nosniff" always;
|
|
add_header Referrer-Policy "strict-origin-when-cross-origin" always;
|
|
add_header Permissions-Policy "camera=(), microphone=(), geolocation=()" always;
|
|
add_header Content-Security-Policy "default-src 'self'; script-src 'self' 'unsafe-inline' https://www.google.com https://www.gstatic.com; style-src 'self' 'unsafe-inline'; img-src 'self' data: https:; object-src 'none'; base-uri 'self'; form-action 'self' https://checkout.stripe.com;" always;
|
|
|
|
# Stripe webhook: POST only
|
|
location = /api/stripe-webhook.php {
|
|
limit_except POST { deny all; }
|
|
}
|
|
|
|
# Block dotfiles
|
|
location ~ /\. { deny all; return 403; }
|
|
```
|
|
|
|
### Database Schema Pattern (SQLite, Idempotent)
|
|
|
|
Use `CREATE TABLE IF NOT EXISTS` for all tables. Use `ALTER TABLE ... ADD COLUMN` wrapped in try/catch for schema migrations.
|
|
|
|
```php
|
|
try { $pdo->exec("ALTER TABLE users ADD COLUMN verified_at INTEGER DEFAULT NULL"); }
|
|
catch (Throwable $e) { /* column already exists */ }
|
|
```
|
|
|
|
### Stripe Integration
|
|
|
|
- Checkout: create session server-side, redirect to Stripe-hosted page
|
|
- Webhook: verify `Stripe-Signature` header using HMAC-SHA256 (implement without Stripe SDK: use curl)
|
|
- Webhook tolerance: 300 seconds (5 min) on timestamp
|
|
- Register webhook endpoint at: `https://{domain}/api/stripe-webhook.php`
|
|
- Events to subscribe: `checkout.session.completed`, `customer.subscription.created`, `customer.subscription.updated`, `customer.subscription.deleted`, `invoice.payment_succeeded`, `invoice.payment_failed`
|
|
|
|
### .env Required Vars
|
|
|
|
```
|
|
APP_ENV=production
|
|
BASE_URL=https://{domain}
|
|
QC_ENCRYPTION_KEY={32-bytes-hex}
|
|
STRIPE_MODE=live
|
|
STRIPE_LIVE_SECRET_KEY=sk_live_...
|
|
STRIPE_LIVE_PUBLISHABLE_KEY=pk_live_...
|
|
STRIPE_WEBHOOK_SECRET=whsec_...
|
|
STRIPE_PRICE_ID=price_...
|
|
RECAPTCHA_SITE_KEY=...
|
|
RECAPTCHA_SECRET_KEY=...
|
|
SMTP_HOST=...
|
|
SMTP_PORT=587
|
|
SMTP_USER=...
|
|
SMTP_PASS=...
|
|
MAIL_FROM=noreply@{domain}
|
|
MAIL_FROM_NAME={Brand}
|
|
```
|
|
|
|
Generate encryption key: `php -r "echo bin2hex(random_bytes(32));"`
|