recent updates
@@ -0,0 +1,94 @@
# 00 — WP + Divi to AM Stack A Pipeline — Overview

Converts a .wpress archive (All-in-One WP Migration) into a Stack A deployment:
PHP router + SQLite databases + vanilla JS/CSS. Never a 1:1 Divi copy.
Every migration is a content extraction and redesign, not a port.

## Stack A output (what this pipeline produces)

```
src/api/router.php               URL dispatcher
src/api/contact.php              form handler (Resend via curl)
src/api/templates/*.php          home | static | classes | schedule | glossary | blog
src/api/components/_header.php   nav from nav.sqlite
src/api/components/_footer.php
src/api/data/*.sqlite            one DB per content domain (see 09-stack-a-output.md)
build/seed_databases.py          creates + seeds all SQLite DBs — THE source of truth
assets/                          vanilla CSS/JS/images
infra/nginx.conf, supervisord.conf, php-fpm-pool.conf
Dockerfile                       (php:8.3-fpm-alpine)
docker-compose.yml
```

## Why NOT static HTML

Any site with a glossary, blog, schedule, or recurring content model gets Stack A.
Editing content = edit seed_databases.py → reseed → rebuild. No PHP file edits.

## Divi is the data source, not the design target

Extract from Divi:
- Page content (headings, body copy, CTAs)
- Navigation menus (menus in wp_terms + wp_term_taxonomy; items as nav_menu_item posts in wp_posts + wp_postmeta)
- Header logo + tagline (wp_options: blogname, blogdescription, et_divi)
- Media (uploads/ → WebP → assets/images/)
- Design tokens (colors, fonts → tokens.css)
- SEO (Yoast or Rank Math wp_postmeta → pages.sqlite meta_description)
- Blog posts (wp_posts where post_type=post)
- Custom post types (testimonials, FAQs, glossary terms if present)

Do NOT replicate:
- Divi section/row/column grid structure
- Divi module types (blurbs, toggles, CTAs, pricing tables)
- WordPress page slugs (map to clean slugs per nginx.conf pattern)
- WordPress menu item IDs

## Pipeline phases

```
Phase 0   Setup         Point pipeline at .wpress file; create working dirs
Phase 1   Extract       Unpack .wpress → wpress-extract/
Phase 2   DB Analysis   Parse SQL dump; detect Divi version; inventory pages, posts, menus
Phase 3   Content       Extract page sections + nav menus + blog posts from Divi
Phase 4   Design        Pull colors + fonts → tokens.css draft
Phase 5   Media         Catalog uploads/; convert to WebP; build media-manifest.json
Phase 6   Staging       Map extracted JSON → seed_databases.py skeleton (content on standby)
Phase 7   Fill          Agent fills each SQLite table row by row from staged JSON
Phase 8   Templates     Scaffold PHP templates + components from AM reference
Phase 9   SEO           Port titles, metas, canonicals, schema.org, redirect map
Phase 10  Build         docker compose build && docker compose up -d
Phase 11  QA            Lighthouse, protection check, grep for Divi residue
```

## CLI launcher

```
python3 scripts/migrate.py --wpress /path/to/backup.wpress --domain example.com
```

Runs phases 0-6 automatically, then prints agent breadcrumbs for phases 7-11.

## Key missed items from prior migrations (REQUIRED fixes)

1. **NAV MENUS**: Must extract menus from wp_terms + wp_term_taxonomy (taxonomy=nav_menu) and their
   items from wp_posts (post_type=nav_menu_item) + wp_postmeta for label/URL/order.
   Output: nav.json → seeded into nav.sqlite (label, href, display_order, is_cta).

2. **DIVI HEADER**: Must extract et_divi options from wp_options for logo, header layout, colors.
   The _header.php must be written from scratch using AM design tokens, not copied from Divi.

3. **MEDIA**: All uploads/ files must be: cataloged → copied to assets/images/ → converted to WebP.
   Every image reference in content JSON must be updated to /assets/images/{filename}.webp.

4. **SECTION REMAPPING**: Divi modules must be remapped to AM section types.
- blurb_module → feature_cards item
- toggle_module → accordion item
- cta_module → cta_band section
- pricing_module → booking_options section
- testimonial_mod → testimonials.sqlite row
- text_module → text_block section
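
The remapping list above can be written as a lookup table. The following is a minimal sketch, assuming the module names in the list are the keys the extractor emits (the SOP does not confirm this). `SECTION_MAP` and `remap_module` are hypothetical names, not part of the pipeline scripts.

```python
# Hypothetical lookup table built from the SECTION REMAPPING list above.
SECTION_MAP = {
    "blurb_module": "feature_cards",
    "toggle_module": "accordion",
    "cta_module": "cta_band",
    "pricing_module": "booking_options",
    "testimonial_mod": "testimonials",
    "text_module": "text_block",
}


def remap_module(module_type: str) -> str:
    """Return the AM section type for a Divi module name.

    Unknown modules raise instead of passing through, so Divi residue
    cannot slip into the staged JSON unnoticed.
    """
    try:
        return SECTION_MAP[module_type]
    except KeyError:
        raise ValueError(f"No AM section type mapped for {module_type!r}") from None
```

Failing loudly on unmapped modules is a design choice in this sketch; a real pipeline might prefer to log and continue.
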

## Related SOPs

- **09-stack-a-output.md** — SQLite schema + sections_json spec
- **10-agent-breadcrumbs.md** — Step-by-step ordered checklist for agent execution
- **00-stack-philosophy.md** — Stack A vs Stack B decision rationale
@@ -0,0 +1,120 @@
# 01 — .wpress Extraction

Unpack the All-in-One WP Migration `.wpress` archive into the project's
`.planning/wpress-extract/` directory.

## .wpress binary format

NOT a standard zip or tar. Custom sequential binary format:

```
[HEADER 4377 bytes] [FILE DATA n bytes] [HEADER] [FILE DATA] ...
```

Header breakdown:
```
Offset  Length  Field
0       255     Filename (null-padded)
255     14      File size in bytes (ASCII decimal, null-padded)
269     12      mtime unix timestamp (ASCII decimal, null-padded)
281     4096    Relative path (null-padded)
4377    n       Raw file bytes (size from header)
```

The archive ends when a header of all null bytes is encountered, or EOF.
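
The layout above is enough to walk an archive entry by entry. The following is a minimal sketch built only from the header table, not a copy of `extract_wpress.py`. `iter_entries` and `make_header` are hypothetical names, and each file's data is read into memory, which a real extractor would avoid for large media.

```python
import io
from typing import BinaryIO, Iterator, Tuple

NAME_LEN, SIZE_LEN, MTIME_LEN, PATH_LEN = 255, 14, 12, 4096
HEADER_SIZE = NAME_LEN + SIZE_LEN + MTIME_LEN + PATH_LEN  # 4377 bytes


def _field(raw: bytes) -> str:
    """Decode a null-padded header field."""
    return raw.split(b"\x00", 1)[0].decode("utf-8", errors="replace")


def iter_entries(stream: BinaryIO) -> Iterator[Tuple[str, str, int, bytes]]:
    """Yield (path, name, mtime, data) for each file in the archive."""
    while True:
        header = stream.read(HEADER_SIZE)
        # Stop at EOF, a truncated header, or the all-null end marker.
        if len(header) < HEADER_SIZE or not header.strip(b"\x00"):
            return
        size_end = NAME_LEN + SIZE_LEN
        mtime_end = size_end + MTIME_LEN
        name = _field(header[:NAME_LEN])
        size = int(_field(header[NAME_LEN:size_end]) or 0)
        mtime = int(_field(header[size_end:mtime_end]) or 0)
        path = _field(header[mtime_end:])
        yield path, name, mtime, stream.read(size)


def make_header(name: str, size: int, mtime: int, path: str) -> bytes:
    """Build one header block (useful for tests of the parser)."""
    return (
        name.encode().ljust(NAME_LEN, b"\x00")
        + str(size).encode().ljust(SIZE_LEN, b"\x00")
        + str(mtime).encode().ljust(MTIME_LEN, b"\x00")
        + path.encode().ljust(PATH_LEN, b"\x00")
    )
```

Because the parser takes any binary stream, it can be exercised against an `io.BytesIO` fixture before it is pointed at a multi-hundred-megabyte archive.
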

## Extraction script

Script: `.am-webdesign-sops/wp-divi-pipeline/scripts/extract_wpress.py`

```bash
python3 ~/.am-webdesign-sops-path/scripts/extract_wpress.py \
  .planning/vibrantyou-yoga-YYYYMMDD-*.wpress \
  .planning/wpress-extract/
```

Or from the SOP scripts directory directly:

```bash
python3 /home/sirdrez/arisingmedia-websites/.am-webdesign-sops/wp-divi-pipeline/scripts/extract_wpress.py \
  /home/sirdrez/arisingmedia-websites/{domain}/.planning/{file}.wpress \
  /home/sirdrez/arisingmedia-websites/{domain}/.planning/wpress-extract/
```

Progress prints every 200 files. A 300-400MB archive typically extracts in
2-5 minutes and produces 1,000-5,000 files.

## Expected archive contents

After extraction, `wpress-extract/` contains a flat layout with no `wp-content/` wrapper
(see the archive directory layout note in `02-database-analysis.md`):

```
wpress-extract/
├── package.json          ← archive metadata (domain, WP version, plugin list)
├── database.sql          ← full MySQL dump (the most important file)
├── uploads/              ← all media (images, PDFs, videos)
│   └── YYYY/MM/          ← WordPress date-organized subdirs
├── themes/
│   ├── Divi/             ← Divi 4 theme files (if Divi 4)
│   └── divi-5/           ← Divi 5 theme files (if Divi 5)
└── plugins/              ← installed plugins (useful for form schema)
    ├── gravityforms/
    └── contact-form-7/
```

## Verify extraction

After the script completes, confirm the key files exist:

```bash
# Database dump present?
ls -lh .planning/wpress-extract/database.sql

# Uploads present?
find .planning/wpress-extract/uploads -name "*.jpg" | wc -l
find .planning/wpress-extract/uploads -name "*.png" | wc -l

# Archive metadata
cat .planning/wpress-extract/package.json
```

`package.json` contains the site URL, WordPress version, Divi version, and
plugin list — read it before proceeding to Phase 2.

## Common issues

**"Not a zip file" error** — Expected. The .wpress format is not zip.
The `extract_wpress.py` script handles it correctly.

**Missing database.sql** — The archive may name it differently. Check:
```bash
find .planning/wpress-extract -name "*.sql" 2>/dev/null
```

**Partial extraction** — If the script stops early, check disk space:
```bash
df -h .planning/wpress-extract/
```
A 378MB .wpress typically expands to 1-3GB uncompressed.

**Path traversal in filenames** — The script strips leading `/` and `.` from
paths. If files land in unexpected locations, check the raw path field with:
```bash
python3 -c "
import sys
HEADER_SIZE=4377; NAME_LEN=255; SIZE_LEN=14; MTIME_LEN=12; PATH_LEN=4096
with open(sys.argv[1],'rb') as f:
    for i in range(5):
        h = f.read(HEADER_SIZE)
        # Stop at EOF or at the all-null end-of-archive header
        if len(h) < HEADER_SIZE or not h.strip(b'\x00'):
            break
        name = h[:NAME_LEN].split(b'\x00',1)[0].decode(errors='replace')
        size = int(h[NAME_LEN:NAME_LEN+SIZE_LEN].split(b'\x00',1)[0] or 0)
        path = h[NAME_LEN+SIZE_LEN+MTIME_LEN:].split(b'\x00',1)[0].decode(errors='replace')
        print(f' [{i}] path={repr(path)} name={repr(name)} size={size}')
        # Skip the file data to land on the next header
        f.seek(size, 1)
" .planning/file.wpress
```

## Next step

Proceed to `02-database-analysis.md` to inventory pages and detect Divi version.
@@ -0,0 +1,151 @@
# 02 — Database Analysis

Parse the WordPress MySQL dump to inventory pages, detect Divi version,
extract design settings, and build the data JSON files that drive the AM build.

## Script

```bash
python3 /home/sirdrez/arisingmedia-websites/.am-webdesign-sops/wp-divi-pipeline/scripts/analyze_db.py \
  {domain}/.planning/wpress-extract/ \
  {domain}/.planning/data/
```

Outputs three files into `.planning/data/`:
- `pages.json` — all published pages/posts with content and SEO meta
- `design-system.json` — colors, fonts, Divi settings
- `site-info.json` — domain, plugin list, WP version, Divi version

## Divi version detection

The script auto-detects Divi version by scanning `database.sql`:

| Signal in SQL | Divi version |
|---------------|-------------|
| `wp:divi/` in post_content | Divi 5 |
| `[et_pb_section` in post_content | Divi 4 |

**This determines the content extraction path.** Divi 4 → use `extract_divi4.py`.
Divi 5 → use `extract_divi5.py`. See `03-divi-content-extraction.md`.
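
The detection table can be sketched as a substring check. This is an illustration, not the logic of `analyze_db.py`: `detect_divi_version` is a hypothetical name, and preferring Divi 5 when both markers appear (for example, old shortcode revisions left in an upgraded site) is an assumption.

```python
from typing import Optional

DIVI5_MARKER = "wp:divi/"        # Divi 5 block comments
DIVI4_MARKER = "[et_pb_section"  # Divi 4 shortcodes


def detect_divi_version(sql_text: str) -> Optional[str]:
    """Return "5", "4", or None based on markers found in the SQL dump."""
    # Assumption: block markup wins when both styles are present.
    if DIVI5_MARKER in sql_text:
        return "5"
    if DIVI4_MARKER in sql_text:
        return "4"
    return None
```

For a large `database.sql`, scanning line by line instead of loading the whole file would keep memory use flat.
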

## Key WordPress tables

| Table | Contents | Used for |
|-------|----------|---------|
| `wp_posts` | All pages, posts, attachments, layouts | Page inventory, content |
| `wp_postmeta` | Per-post metadata | ACF fields, Rank Math SEO, Divi layout JSON |
| `wp_options` | Site-wide settings | Divi theme settings, colors, fonts |
| `wp_gf_form` + `wp_gf_form_meta` | Gravity Forms definitions | Form field schema |
| `wp_gf_entry` | Gravity Forms submissions | Not needed for migration |
| `wp_rank_math_seo_meta` | Rank Math SEO per page | SEO titles, descriptions |

## Reading pages.json

Each entry in `pages.json`:

```json
{
  "id": "42",
  "post_type": "page",
  "slug": "about",
  "title": "About VibrantYou Yoga",
  "status": "publish",
  "date": "2026-03-15",
  "modified": "2026-04-10",
  "content_raw": "<!-- wp:divi/section ... -->...",
  "excerpt": "",
  "parent_id": "0",
  "menu_order": "3",
  "seo_title": "About VibrantYou Yoga | Mindful Movement in [City]",
  "seo_description": "...",
  "seo_keywords": "yoga studio, mindful movement",
  "acf": {
    "vyy_hero_headline": "Move With Intention",
    "vyy_hero_subhead": "..."
  }
}
```

`content_raw` holds the raw Divi block markup. Pass it to the extractor scripts.
`acf` holds Advanced Custom Fields values — often cleaner than block content.

## Reading design-system.json

Contains extracted Divi theme settings. Key fields:

```json
{
  "primary_color": "#1a8a7a",
  "body_font": "DM Sans",
  "header_font": "DM Serif Display",
  "body_font_size": "16",
  "body_line_height": "1.7",
  "divi_version": "5",
  "wp_version": "6.9.4",
  "site_url": "https://vibrantyou.yoga",
  "site_name": "VibrantYou Yoga"
}
```

Use these values to seed the AM `main.css` CSS custom properties block.

## Manual inspection (when script output is sparse)

Sometimes the Divi theme options are stored as PHP-serialized data.
Use grep to find and eyeball the raw values:

```bash
DB=.planning/wpress-extract/database.sql

# Divi global colors
grep -o "'et_divi[^']*','[^']*'" "$DB" | head -30

# Site name + URL
grep -E "'(siteurl|blogname|admin_email)','[^']*'" "$DB"

# Rank Math SEO meta (first 20 matching lines)
grep "rank_math_title\|rank_math_description" "$DB" | head -20

# All published page slugs
grep -o "post_name','[^']*'" "$DB" | grep -v "revision\|auto-draft" | sort | uniq
```

## Gravity Forms schema (for form replacement)

Find form field definitions:

```bash
# Field definitions live in the display_meta JSON of wp_gf_form_meta.
# The regex grabs the last quoted value on each INSERT line; adjust it if
# display_meta is not the last column in your dump.
grep "INSERT INTO \`wp_gf_form_meta\`" .planning/wpress-extract/database.sql | \
python3 -c "
import sys, json, re
for line in sys.stdin:
    m = re.search(r\"'([^']+)'\s*\)\s*;\", line)
    if m:
        try: print(json.dumps(json.loads(m.group(1).replace('\\\\\"','\"')), indent=2)[:2000])
        except ValueError: pass
" 2>/dev/null | head -100
```

Field types seen in Gravity Forms: text, email, phone, textarea, select, checkbox, radio, name, address, fileupload. Map each to a plain HTML input equivalent.

## Archive directory layout note

The All-in-One WP Migration (AI1WM) `.wpress` archive extracts flat, with no `wp-content/` wrapper:

```
wpress-extract/
├── database.sql     ← NOT in wp-content/
├── package.json
├── uploads/         ← NOT wp-content/uploads/
├── themes/          ← NOT wp-content/themes/
├── plugins/         ← NOT wp-content/plugins/
└── et-cache/
```

Scripts must reference `uploads/`, `themes/`, `plugins/` directly under
`wpress-extract/`, not `wpress-extract/wp-content/`.

## Next step

Once `pages.json` is written, proceed to `03-divi-content-extraction.md`
to parse `content_raw` for each page into structured AM-ready HTML.
@@ -0,0 +1,157 @@
# 03 — Divi Content Extraction

Parse raw Divi page content from `pages.json` into clean, structured HTML
sections ready to map into AM templates.

## Divi 4 vs Divi 5 — critical difference

### Divi 4 (shortcode-based)

Content is stored as shortcodes in `wp_posts.post_content`:

```
[et_pb_section fb_built="1" admin_label="Hero" _builder_version="4.27.4"
  background_color="#0f5f53" ...]
  [et_pb_row ...]
    [et_pb_column type="4_4" ...]
      [et_pb_text ...]<h1>Move With Intention</h1>[/et_pb_text]
      [et_pb_button button_url="/contact" button_text="Book a Class" /]
    [/et_pb_column]
  [/et_pb_row]
[/et_pb_section]
```

Use `extract_divi4.py` → parses shortcode tree into section/row/module JSON.

### Divi 5 (block-based)

Content is stored as Gutenberg-style block comments:

```html
<!-- wp:divi/section {"id":"section-abc123","attrs":{"backgroundColor":{"value":"#0f5f53"}}} -->
<div class="et_pb_section ...">
  <!-- wp:divi/row ... -->
  <!-- wp:divi/column ... -->
  <!-- wp:divi/text ... -->
  <div class="et_pb_text_inner"><h1>Move With Intention</h1></div>
  <!-- /wp:divi/text -->
  <!-- /wp:divi/column -->
  <!-- /wp:divi/row -->
</div>
<!-- /wp:divi/section -->
```

Use `extract_divi5.py` → strips block wrapper, extracts inner HTML per module.

## Divi 5 extraction script

```bash
python3 /home/sirdrez/arisingmedia-websites/.am-webdesign-sops/wp-divi-pipeline/scripts/extract_divi5.py \
  {domain}/.planning/data/pages.json \
  {domain}/.planning/data/content/
```

Produces one JSON file per page: `content/{slug}.json`

```json
{
  "slug": "about",
  "title": "About VibrantYou Yoga",
  "seo_title": "About VibrantYou Yoga | ...",
  "seo_description": "...",
  "sections": [
    {
      "type": "hero",
      "background_color": "#0f5f53",
      "modules": [
        { "module": "text", "html": "<h1>Move With Intention</h1>" },
        { "module": "button", "text": "Book a Class", "url": "/contact/" }
      ]
    },
    {
      "type": "standard",
      "modules": [
        { "module": "text", "html": "<h2>Our Story</h2><p>...</p>" },
        { "module": "image", "src": "/assets/images/studio.webp", "alt": "..." }
      ]
    }
  ]
}
```

## ACF fields take priority

If a page has ACF fields (in `pages.json[].acf`), use those over block content.
ACF fields are typically cleaner, pre-authored copy without Divi wrapper noise.

Convention for VYY-specific ACF keys:
- `vyy_hero_headline` → `<h1>` in hero section
- `vyy_hero_subhead` → `<p class="hero-lead">` in hero
- `vyy_hero_cta_text` → primary CTA button label
- `vyy_hero_cta_url` → primary CTA button href

Always check `acf` keys before parsing `content_raw`.

## Stripping Divi class/attribute noise

After extraction, run every HTML snippet through the `clean_divi_html()`
function from `divi_to_html.py`:

```python
from divi_to_html import clean_divi_html, rewrite_internal_links

cleaned = clean_divi_html(raw_html)
cleaned = rewrite_internal_links(cleaned, staging_hosts=("vibrantyou.yoga",))
```

This removes:
- `<!-- wp:divi/... -->` block comments
- `data-et-*`, `data-builder-*` attributes
- `et_pb_*`, `divi-builder-*`, `d5_*` class tokens
- Empty `class=""` attributes
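
To make the four removal rules concrete, here is a rough regex-based sketch. It is not the implementation in `divi_to_html.py`, whose internals this SOP does not show; `clean_divi_html_sketch` is a hypothetical name, and regexes over HTML are brittle, so treat it as an illustration of the rules rather than a drop-in replacement.

```python
import re

# Opening and closing Divi 5 block comments.
BLOCK_COMMENT = re.compile(r"<!--\s*/?wp:divi/.*?-->", re.DOTALL)
# data-et-* and data-builder-* attributes, with or without a quoted value.
DATA_ATTR = re.compile(r"""\s+data-(?:et|builder)-[\w-]*(?:=(?:"[^"]*"|'[^']*'))?""")
# Double-quoted class attributes only (an assumption about the markup).
CLASS_ATTR = re.compile(r'\sclass="([^"]*)"')
NOISY_CLASS = re.compile(r"^(?:et_pb_|divi-builder-|d5_)")


def clean_divi_html_sketch(html: str) -> str:
    """Strip Divi block comments, data attributes, and builder class tokens."""
    html = BLOCK_COMMENT.sub("", html)
    html = DATA_ATTR.sub("", html)

    def _filter_classes(match: re.Match) -> str:
        kept = [t for t in match.group(1).split() if not NOISY_CLASS.match(t)]
        if not kept:
            return ""  # drop the attribute entirely rather than leave class=""
        joined = " ".join(kept)
        return f' class="{joined}"'

    return CLASS_ATTR.sub(_filter_classes, html).strip()
```

An HTML parser would be more robust for attribute handling; the regex form is used here only because it maps one-to-one onto the list above.
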

## What to extract per section type

| Divi module | Extract | Map to AM element |
|-------------|---------|-------------------|
| `divi/text` | inner HTML | `<section>`, `<p>`, headings as-is |
| `divi/button` | `text`, `url` | `<a class="btn-primary">` |
| `divi/image` | `src`, `alt`, `title` | `<img>` → rewrite to WebP path |
| `divi/blurb` | icon, title, body | `.am-card` component |
| `divi/testimonial` | quote, author, company | `.am-testimonial` component |
| `divi/video` | `src`, poster | `<video>` or YouTube embed |
| `divi/contact_form` | field list | → replace with AM form, see `08` |
| `divi/accordion` | Q+A pairs | `<details><summary>` |
| `divi/fullwidth_header` | title, subhead, CTA | hero section |

## Section background colors → AM section modifiers

Divi 5 stores `backgroundColor` in the block `attrs` JSON.
Map to AM CSS modifier classes:

| Divi background | AM class modifier |
|----------------|------------------|
| `#0f5f53` (dark teal) | `.section--dark` |
| `#1a8a7a` (mid teal) | `.section--brand` |
| `#f5f5f5` / `#fafafa` | `.section--light` |
| `#ffffff` / none | `.section--white` |
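
The table can be applied with a small normalising lookup. This is a sketch under assumptions: `section_modifier` is a hypothetical helper, the hex values are the VibrantYou Yoga ones from the table, and falling back to `section--white` for unlisted colors is a choice made here, not a rule stated by the SOP.

```python
from typing import Optional

# Hex values taken from the table above (site-specific to VibrantYou Yoga).
BACKGROUND_TO_MODIFIER = {
    "#0f5f53": "section--dark",
    "#1a8a7a": "section--brand",
    "#f5f5f5": "section--light",
    "#fafafa": "section--light",
    "#ffffff": "section--white",
}


def section_modifier(background: Optional[str]) -> str:
    """Map a Divi backgroundColor value to an AM section modifier class."""
    if not background:
        return "section--white"
    key = background.strip().lower()
    # Expand 3-digit shorthand such as #fff to #ffffff.
    if len(key) == 4 and key.startswith("#"):
        key = "#" + "".join(ch * 2 for ch in key[1:])
    # Assumption: unlisted colors fall back to the neutral white section.
    return BACKGROUND_TO_MODIFIER.get(key, "section--white")
```

Logging unlisted colors instead of silently defaulting may be preferable during QA, since an unmapped brand color usually signals a section that needs a design decision.
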

## Content quality pass (required before HTML build)

After extraction, review every page's content for:

1. **Cut bloated copy** — WordPress sites often have 3x more text than needed.
   Target 30-50% reduction. One clear idea per paragraph.
2. **Remove stale metrics** — "Over 500 students" only stays if it's verifiable.
   Otherwise remove or mark `DRAFT NEEDED`.
3. **Remove plugin artifacts** — Gravity Forms shortcodes `[gravityforms id="1"]`,
   Events Manager tags, Divi shortcode residue that survived extraction.
4. **Improve CTAs** — Replace generic "Learn More" with action-specific text:
   "Book a Free Class", "View the Schedule", "Start Your Practice".
5. **Flag images** — Note every `<img>` that needs a real photo vs stock.

## Next step

Proceed to `04-design-system-extraction.md` to convert Divi theme settings
into AM CSS custom properties, then `05-content-migration.md` to build the
HTML templates.
@@ -0,0 +1,172 @@
# 04 — Design System Extraction

Convert Divi theme settings into AM CSS custom properties.
The goal is to ENHANCE the design — cleaner, more modern — not replicate it.

## Input

`design-system.json` produced by `analyze_db.py`. Key fields:

```json
{
  "primary_color": "#1a8a7a",
  "body_font": "DM Sans",
  "header_font": "DM Serif Display",
  "body_font_size": "16",
  "body_line_height": "1.7",
  "site_name": "VibrantYou Yoga"
}
```

## Color palette strategy

Never lift the Divi palette 1:1. Use extracted colors as the base and build a
full 5-step scale around the primary hue:

| Token | Derived from | Role |
|-------|-------------|------|
| `--color-primary` | Divi accent_color | Buttons, links, active states |
| `--color-primary-dark` | Darken primary 15% | Hover states, section backgrounds |
| `--color-primary-light` | Lighten primary 40% | Subtle tints, borders |
| `--color-surface` | Always `#fafafa` | Page background |
| `--color-surface-alt` | `#f3f3f3` | Alternating sections |
| `--color-text` | Always `#1a1a1a` | Body copy |
| `--color-text-muted` | `#666` | Subheadings, captions |
| `--color-border` | 10% primary or `#e0e0e0` | Dividers, inputs |
| `--color-white` | `#ffffff` | Card backgrounds, hero text |

For VibrantYou Yoga (primary `#1a8a7a`, dark `#0f5f53`):

```css
:root {
  --color-primary:       #1a8a7a;
  --color-primary-dark:  #0f5f53;
  --color-primary-light: #d4f0eb;
  --color-surface:       #fafafa;
  --color-surface-alt:   #f0f7f6;
  --color-text:          #1a1a1a;
  --color-text-muted:    #5a6e6b;
  --color-border:        #c8dedd;
  --color-white:         #ffffff;
}
```
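
"Darken 15%" and "lighten 40%" can be computed in more than one way. The sketch below assumes a plain linear RGB mix toward black or white; `darken` and `lighten` are hypothetical helpers, and their results are starting points that will not necessarily match the hand-tuned values in the VibrantYou Yoga block above.

```python
from typing import Tuple


def _parse(hex_color: str) -> Tuple[int, int, int]:
    """Turn '#rrggbb' into an (r, g, b) tuple of ints."""
    h = hex_color.lstrip("#")
    return int(h[0:2], 16), int(h[2:4], 16), int(h[4:6], 16)


def _format(rgb: Tuple[int, int, int]) -> str:
    return "#{:02x}{:02x}{:02x}".format(*rgb)


def darken(hex_color: str, amount: float) -> str:
    """Mix the color toward black; amount is 0.0 (no change) to 1.0 (black)."""
    r, g, b = _parse(hex_color)
    return _format(tuple(int(round(c * (1 - amount))) for c in (r, g, b)))


def lighten(hex_color: str, amount: float) -> str:
    """Mix the color toward white; amount is 0.0 (no change) to 1.0 (white)."""
    r, g, b = _parse(hex_color)
    return _format(tuple(int(round(c + (255 - c) * amount)) for c in (r, g, b)))
```

Adjusting lightness in HSL or OKLCH is an equally valid reading of the table and tends to preserve saturation better; whichever method is used, review the derived tokens by eye before committing them.
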

## Typography strategy

Use the extracted fonts but upgrade the type scale.
Divi's default type scale is too small and too flat. Aim for 1.25–1.333 modular ratio.

```css
:root {
  /* Fonts from design-system.json */
  --font-body:    'DM Sans', system-ui, sans-serif;
  --font-heading: 'DM Serif Display', Georgia, serif;

  /* Type scale from a 16px base (steps of roughly 1.1 to 1.33, not a strict 1.25 ratio) */
  --text-xs:   0.75rem;   /* 12px */
  --text-sm:   0.875rem;  /* 14px */
  --text-base: 1rem;      /* 16px */
  --text-lg:   1.125rem;  /* 18px */
  --text-xl:   1.25rem;   /* 20px */
  --text-2xl:  1.5rem;    /* 24px */
  --text-3xl:  1.875rem;  /* 30px */
  --text-4xl:  2.25rem;   /* 36px */
  --text-5xl:  3rem;      /* 48px */
  --text-6xl:  3.75rem;   /* 60px */

  /* Line heights */
  --leading-tight:  1.2;
  --leading-normal: 1.6;
  --leading-loose:  1.8;

  /* Font weights */
  --weight-normal:   400;
  --weight-medium:   500;
  --weight-semibold: 600;
  --weight-bold:     700;
}
```

## Spacing and layout

Divi uses pixel-based margins/paddings that must be converted to a consistent
rem-based spacing scale:

```css
:root {
  --space-1:  0.25rem;  /* 4px */
  --space-2:  0.5rem;   /* 8px */
  --space-3:  0.75rem;  /* 12px */
  --space-4:  1rem;     /* 16px */
  --space-5:  1.25rem;  /* 20px */
  --space-6:  1.5rem;   /* 24px */
  --space-8:  2rem;     /* 32px */
  --space-10: 2.5rem;   /* 40px */
  --space-12: 3rem;     /* 48px */
  --space-16: 4rem;     /* 64px */
  --space-20: 5rem;     /* 80px */
  --space-24: 6rem;     /* 96px */
  --space-32: 8rem;     /* 128px */

  /* Section vertical padding */
  --section-py:    var(--space-20);  /* 80px default */
  --section-py-sm: var(--space-12);  /* 48px mobile */

  /* Container */
  --container-max: 1200px;
  --container-px:  var(--space-6);

  /* Border radius */
  --radius-sm:   4px;
  --radius-md:   8px;
  --radius-lg:   12px;
  --radius-xl:   20px;
  --radius-full: 9999px;

  /* Shadows */
  --shadow-sm: 0 1px 3px rgba(0,0,0,.08);
  --shadow-md: 0 4px 16px rgba(0,0,0,.1);
  --shadow-lg: 0 12px 40px rgba(0,0,0,.12);
}
```

## Google Fonts import

For DM Sans + DM Serif Display:

```html
<link rel="preconnect" href="https://fonts.googleapis.com">
<link rel="preconnect" href="https://fonts.gstatic.com" crossorigin>
<link href="https://fonts.googleapis.com/css2?family=DM+Sans:ital,opsz,wght@0,9..40,300;0,9..40,400;0,9..40,500;0,9..40,600;0,9..40,700;1,9..40,400&family=DM+Serif+Display:ital@0;1&display=swap" rel="stylesheet">
```

## Enhancement rules (required)

These upgrades apply to every AM migration regardless of source:

1. **Increase contrast** — body text must be #1a1a1a on white (WCAG AA minimum, 4.5:1 for body text).
   Never use the grey-on-grey color schemes that Divi themes commonly use.

2. **Whitespace is content** — section padding must be at minimum 80px vertical
   on desktop. Divi often uses 40-60px which feels cramped.

3. **One weight per heading level** — h1 at 700, h2 at 600, h3 at 500.
   Divi often leaves all headings at the same weight.

4. **Max-width prose** — body copy containers max 680px wide. Divi stretches
   copy to full column width on 1200px screens, which is unreadable.

5. **Brand color is a highlight, not a wallpaper** — primary color should
   appear on buttons, links, and 1-2 hero sections only. Divi sites often
paint every other section in the primary color.
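
Rule 1 can be checked numerically with the WCAG 2.x contrast-ratio formula. A minimal sketch follows: `relative_luminance` and `contrast_ratio` are hypothetical helper names, and the formula is the published WCAG relative-luminance definition for sRGB colors.

```python
def relative_luminance(hex_color: str) -> float:
    """WCAG 2.x relative luminance of an sRGB '#rrggbb' color."""
    h = hex_color.lstrip("#")
    channels = []
    for i in (0, 2, 4):
        c = int(h[i:i + 2], 16) / 255
        # Linearise each sRGB channel per the WCAG definition.
        channels.append(c / 12.92 if c <= 0.03928 else ((c + 0.055) / 1.055) ** 2.4)
    r, g, b = channels
    return 0.2126 * r + 0.7152 * g + 0.0722 * b


def contrast_ratio(foreground: str, background: str) -> float:
    """Contrast ratio between two colors, from 1.0 up to 21.0."""
    l1 = relative_luminance(foreground)
    l2 = relative_luminance(background)
    lighter, darker = max(l1, l2), min(l1, l2)
    return (lighter + 0.05) / (darker + 0.05)


def passes_aa_body_text(foreground: str, background: str) -> bool:
    """WCAG AA requires at least 4.5:1 for normal-size body text."""
    return contrast_ratio(foreground, background) >= 4.5
```

Running `passes_aa_body_text` over every text/background token pair in `:root` is a cheap QA step before the Lighthouse pass in Phase 11.
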

## Output: main.css variables block

Write the complete `:root {}` block into `src/assets/css/main.css` as the
first section. All other CSS rules reference only `var(--token-name)`.
Never hard-code a color, font, or spacing value outside of `:root`.

## Next step

Proceed to `05-content-migration.md` to map extracted content into AM HTML
templates using this design system.
@@ -0,0 +1,246 @@
# 05 — Content Migration

Map extracted Divi content into AM HTML templates. This is the build phase.
Follow `01-project-structure.md` for directory layout and `03-build-pipeline.md`
for JSON + template stamping.

## Source files

After running Phase 2-4 scripts, `.planning/data/` contains:

```
.planning/data/
├── pages.json           ← all published pages (from analyze_db.py)
├── site-info.json       ← domain, plugin list, Divi version
├── design-system.json   ← colors, fonts, spacing tokens
└── content/
    ├── home.json        ← parsed sections for home page
    ├── about.json       ← parsed sections for about page
    ├── services.json
    └── ...              ← one file per published page
```

## Information architecture for yoga sites

Standard AM structure for a yoga studio / wellness site:

```
/                      home (hero, classes preview, testimonials, CTA)
/about/                about / story / instructors
/classes/              class schedule index
/classes/{slug}.html   one page per class type (hatha, vinyasa, yin, etc.)
/private-sessions/     1:1 session offerings
/workshops/            workshops + retreats index
/contact/              contact + booking form
/blog/                 optional blog index
/blog/{slug}.html      individual blog posts
/404.html
/500.html
/robots.txt
/sitemap.xml
```

Map every WP page slug to this structure first. Some WP slugs may need to be
consolidated, renamed, or dropped. Document the redirect map in
`.planning/redirect-map.txt` (old slug → new path).

## Build order

Build in this sequence. Each page uses the previous as a reference:

1. `src/assets/css/main.css` — design tokens, reset, typography, layout grid
2. `src/assets/css/components.css` — header, footer, hero, cards, forms, nav
3. `src/components/header.html` — navigation
4. `src/components/footer.html` — footer links, contact info
5. `src/assets/js/components.js` — fetch + inject header/footer
6. `src/assets/js/main.js` — scroll animations, intersection observer
7. `src/index.html` — home page (this IS the design system in working form)
8. `src/about/index.html`
9. `src/classes/index.html` + individual class pages (from JSON template if 4+)
10. `src/contact/index.html` + AM form
11. `src/blog/index.html` + individual posts
12. `src/robots.txt`, `src/sitemap.xml`, `src/404.html`, `src/500.html`

## HTML page skeleton

Every page uses the same skeleton. Copy from 06-seo-meta.md for the full
`<head>` requirements. Shell:

```html
<!DOCTYPE html>
<html lang="en">
<head>
  <meta charset="UTF-8">
  <meta name="viewport" content="width=device-width, initial-scale=1.0">
  <meta name="site-root" content="/">
  <title>{{seo_title}}</title>
  <meta name="description" content="{{seo_description}}">
  <link rel="canonical" href="{{canonical}}">
  <!-- og, twitter, schema — see 06-seo-meta.md -->
  <link rel="stylesheet" href="/assets/css/main.css">
  <link rel="stylesheet" href="/assets/css/components.css">
</head>
<body>
  <div id="header-placeholder"></div>

  <main>
    <!-- page sections go here -->
  </main>

  <div id="footer-placeholder"></div>
  <script src="/assets/js/components.js"></script>
  <script src="/assets/js/main.js"></script>
</body>
</html>
```

## Section HTML patterns

Map each `content/{slug}.json` section to one of these AM patterns:

### Hero (type: "hero")

```html
<section class="hero hero--dark">
  <div class="container">
    <div class="hero__content">
      <h1 class="hero__title">Move With Intention</h1>
      <p class="hero__lead">Discover yoga classes for all levels in [City].</p>
      <div class="hero__actions">
        <a href="/classes/" class="btn btn--primary">Explore Classes</a>
        <a href="/contact/" class="btn btn--outline">Book a Session</a>
      </div>
    </div>
  </div>
</section>
```

### Feature grid (4-col blurb modules)

```html
<section class="section section--light">
  <div class="container">
    <h2 class="section__title text-center">Why VibrantYou Yoga</h2>
    <div class="grid grid--4">
      <div class="feature-card">
        <div class="feature-card__icon"><!-- SVG icon --></div>
        <h3 class="feature-card__title">All Levels Welcome</h3>
        <p class="feature-card__body">From first-timers to advanced practitioners.</p>
      </div>
      <!-- repeat -->
    </div>
  </div>
</section>
```

### Testimonials (3-col)

```html
<section class="section section--white">
  <div class="container">
    <h2 class="section__title text-center">What Students Say</h2>
    <div class="grid grid--3">
      <blockquote class="testimonial">
        <p class="testimonial__quote">"..."</p>
        <footer class="testimonial__author">
          <strong>Jane D.</strong>
          <span>Student since 2024</span>
        </footer>
      </blockquote>
    </div>
  </div>
</section>
```

### CTA section

```html
<section class="section section--brand">
  <div class="container text-center">
    <h2 class="section__title">Ready to Begin?</h2>
    <p class="section__lead">Your first class is on us.</p>
    <a href="/contact/" class="btn btn--white btn--lg">Book a Free Class</a>
  </div>
</section>
```

## Class pages — JSON template build
|
||||
|
||||
If there are 4+ class types (Hatha, Vinyasa, Yin, Meditation, etc.), use the
|
||||
build pipeline:
|
||||
|
||||
```
|
||||
src/classes/
|
||||
├── _template.html ← class detail page template
|
||||
├── hatha.html ← generated from classes.json
|
||||
├── vinyasa.html
|
||||
├── yin.html
|
||||
└── meditation.html
|
||||
|
||||
.planning/data/
|
||||
└── classes.json ← array of class objects
|
||||
```
|
||||
|
||||
`classes.json` schema:
|
||||
```json
|
||||
[
|
||||
{
|
||||
"slug": "hatha",
|
||||
"name": "Hatha Yoga",
|
||||
"title": "Hatha Yoga Classes | VibrantYou Yoga",
|
||||
"meta_description": "...",
|
||||
"canonical": "https://vibrantyou.yoga/classes/hatha.html",
|
||||
"hero_h1": "Hatha Yoga",
|
||||
"hero_lead": "A grounding practice for all experience levels.",
|
||||
"description": "<p>...</p>",
|
||||
"duration": "60 min",
|
||||
"level": "All levels",
|
||||
"schedule": "Mon, Wed, Fri — 9:00 AM",
|
||||
"instructor": "Sarah M.",
|
||||
"faqs": [
|
||||
{ "q": "Do I need prior experience?", "a": "No." }
|
||||
]
|
||||
}
|
||||
]
|
||||
```
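
As a rough sketch of how one `classes.json` record could be stamped into `_template.html`: the `{{key}}` placeholder syntax is borrowed from the page shell above, while `render_template` and the inline template string are hypothetical stand-ins, not the pipeline's actual build script.

```python
import re

def render_template(template: str, record: dict) -> str:
    """Replace every {{key}} with record[key]; fail loudly on unknown keys."""
    def substitute(match):
        key = match.group(1)
        if key not in record:
            raise KeyError("no value for placeholder: " + key)
        return str(record[key])
    return re.sub(r"\{\{\s*(\w+)\s*\}\}", substitute, template)

# Hypothetical template fragment; the real _template.html is a full page.
template = "<title>{{title}}</title>\n<h1>{{hero_h1}}</h1>\n<p>{{hero_lead}}</p>"
hatha = {
    "title": "Hatha Yoga Classes | VibrantYou Yoga",
    "hero_h1": "Hatha Yoga",
    "hero_lead": "A grounding practice for all experience levels.",
}
page = render_template(template, hatha)
print(page)
```

Raising on an unknown key is a deliberate choice here: a missing field stops the build instead of shipping a literal `{{...}}` to production. Values are inserted verbatim, so plain-text fields would still need HTML escaping, while `description` is already HTML.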
|
||||
|
||||
## Events Manager → static schedule
|
||||
|
||||
The site uses the Events Manager plugin. For static migration:
|
||||
- Extract recurring class schedule from the database (`wp_em_events` table)
|
||||
- Convert to a static schedule table / cards in `src/classes/index.html`
|
||||
- Do NOT recreate a dynamic booking system unless explicitly requested
|
||||
- Link the "Book" button to the contact form or an external booking URL
|
||||
|
||||
## Image remapping
|
||||
|
||||
Every `<img src="...">` extracted from Divi content will have a WordPress
|
||||
upload URL like `/wp-content/uploads/2026/03/image.jpg`.
|
||||
|
||||
Remap to AM path:
|
||||
- Source: `wpress-extract/uploads/2026/03/image.jpg`
|
||||
- AM dest: `src/assets/images/image.webp` (after WebP conversion)
|
||||
- HTML: `<img src="/assets/images/image.webp" alt="..." loading="lazy" width="800" height="600">`
|
||||
|
||||
Always include `width`, `height`, `loading="lazy"`, and `alt` on every `<img>`.
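
The attribute rule can be checked mechanically. The following is a minimal sketch using only the standard library; `ImgAudit` and `audit_images` are hypothetical names, and an empty `alt=""` is treated as missing to match the stricter grep check used in this pipeline.

```python
from html.parser import HTMLParser

REQUIRED = ("alt", "width", "height", "loading")

class ImgAudit(HTMLParser):
    """Collect <img> tags that lack any of the required attributes."""
    def __init__(self):
        super().__init__()
        self.problems = []  # list of (src, [missing attribute names])

    def handle_starttag(self, tag, attrs):
        if tag != "img":
            return
        # An attribute counts as present only when it has a non-empty value.
        present = {name for name, value in attrs if value}
        missing = [name for name in REQUIRED if name not in present]
        if missing:
            self.problems.append((dict(attrs).get("src"), missing))

def audit_images(html: str):
    parser = ImgAudit()
    parser.feed(html)
    return parser.problems

sample = (
    '<img src="/assets/images/a.webp" alt="Studio" loading="lazy" width="800" height="600">'
    '<img src="/assets/images/b.webp" alt="">'
)
print(audit_images(sample))
```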
|
||||
|
||||
## After build — verify
|
||||
|
||||
```bash
|
||||
# Zero unreplaced template placeholders
|
||||
grep -rn --include="*.html" "{{" src/
|
||||
|
||||
# All pages have canonical
|
||||
grep -rL --include="*.html" 'rel="canonical"' src/
|
||||
|
||||
# All images have alt text
|
||||
grep -rn --include="*.html" '<img' src/ | grep -v 'alt="[^"]'
|
||||
|
||||
# Protection check (after deploy)
|
||||
bash /home/sirdrez/arisingmedia-websites/.am-webdesign-sops/tools/verify-protection.sh https://{domain}
|
||||
```
|
||||
|
||||
## Next step
|
||||
|
||||
Proceed to `06-media-assets.md` for image migration and WebP conversion,
|
||||
then `07-seo-preservation.md` for redirect map and meta tag audit.
|
||||
@@ -0,0 +1,177 @@
|
||||
# 06 — Media Assets
|
||||
|
||||
Migrate WordPress uploads to AM `/assets/images/`, convert to WebP, and
|
||||
generate a media manifest for URL remapping during HTML build.
|
||||
|
||||
## Source location in wpress-extract
|
||||
|
||||
All-in-One WP Migration extracts flat, so uploads are at:
|
||||
```
|
||||
wpress-extract/uploads/ NOT wpress-extract/wp-content/uploads/
|
||||
```
|
||||
|
||||
Organized by WordPress date-upload subdirs:
|
||||
```
|
||||
uploads/
|
||||
├── 2026/
|
||||
│ ├── 03/
|
||||
│ │ ├── VibrantYouYogaLogo.png
|
||||
│ │ ├── hero-studio.jpg
|
||||
│ │ └── ...
|
||||
│ └── 04/
|
||||
│ └── ...
|
||||
└── woocommerce-placeholder.png ← skip
|
||||
```
|
||||
|
||||
## Step 1 — Catalog all media
|
||||
|
||||
```bash
|
||||
find .planning/wpress-extract/uploads -type f \
|
||||
\( -name "*.jpg" -o -name "*.jpeg" -o -name "*.png" -o -name "*.gif" -o -name "*.webp" -o -name "*.svg" \) \
|
||||
| sort > .planning/data/media-raw-list.txt
|
||||
|
||||
wc -l .planning/data/media-raw-list.txt
|
||||
```
|
||||
|
||||
## Step 2 — Skip WordPress-generated size variants
|
||||
|
||||
WordPress auto-generates resized variants: `-150x150`, `-300x200`, `-768x512`, etc.
|
||||
Skip these — they are redundant once we have the originals.
|
||||
|
||||
```bash
|
||||
grep -v -E "\-[0-9]+x[0-9]+\.(jpg|jpeg|png|webp)$" \
|
||||
.planning/data/media-raw-list.txt > .planning/data/media-originals.txt
|
||||
|
||||
echo "Originals: $(wc -l < .planning/data/media-originals.txt)"
|
||||
```
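
For reference, the same size-variant rule expressed in Python. This is a sketch with hypothetical names (`SIZE_SUFFIX`, `original_name`); it assumes the `-WIDTHxHEIGHT` suffix always sits directly before the extension.

```python
import re

# Matches "-300x200" only when it is immediately followed by a raster extension.
SIZE_SUFFIX = re.compile(r"-\d+x\d+(?=\.(?:jpe?g|png|webp)$)", re.IGNORECASE)

def is_size_variant(filename: str) -> bool:
    return SIZE_SUFFIX.search(filename) is not None

def original_name(filename: str) -> str:
    """Map a resized variant back to the name of its original upload."""
    return SIZE_SUFFIX.sub("", filename)

for name in ("hero-studio-768x512.jpg", "hero-studio.jpg", "logo.png"):
    print(name, is_size_variant(name), original_name(name))
```

One caveat applies to both this sketch and the grep filter: an original that was uploaded with dimensions in its name (for example `banner-1920x1080.png`) is indistinguishable from a generated variant and will be skipped, so spot-check the filtered list.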
|
||||
|
||||
## Step 3 — Copy originals to src/assets/images/
|
||||
|
||||
Flatten the date-organized subdirs into a single flat directory.
|
||||
Preserve filenames exactly (except extension will change to .webp).
|
||||
|
||||
```bash
|
||||
mkdir -p src/assets/images/
|
||||
|
||||
while IFS= read -r src_path; do
|
||||
filename=$(basename "$src_path")
|
||||
cp "$src_path" "src/assets/images/$filename"
|
||||
done < .planning/data/media-originals.txt
|
||||
|
||||
echo "Copied: $(ls src/assets/images/ | wc -l) files"
|
||||
```
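
Flattening date subdirs means two uploads with the same name overwrite each other, and `hero.jpg` plus `hero.png` both end up as `hero.webp` after conversion. A small pre-check along these lines could catch that before copying; `find_collisions` is a hypothetical helper and the sample paths are illustrative.

```python
from collections import defaultdict
from pathlib import PurePosixPath

def find_collisions(paths):
    """Group source paths by stem (name without extension); keep stems used more than once."""
    by_stem = defaultdict(list)
    for p in paths:
        by_stem[PurePosixPath(p).stem].append(p)
    return {stem: srcs for stem, srcs in by_stem.items() if len(srcs) > 1}

originals = [
    "uploads/2026/03/hero-studio.jpg",
    "uploads/2026/04/hero-studio.jpg",
    "uploads/2026/03/VibrantYouYogaLogo.png",
]
collisions = find_collisions(originals)
for stem, srcs in collisions.items():
    print(f"{stem}.webp would be written {len(srcs)} times: {srcs}")
```

In practice the input would be the lines of `.planning/data/media-originals.txt`; any reported stem needs a manual rename before Step 3 runs.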
|
||||
|
||||
## Step 4 — Convert to WebP
|
||||
|
||||
Use the project's standard WebP conversion script (see `12-image-assets.md`).
|
||||
If cwebp is available:
|
||||
|
||||
```bash
|
||||
cd src/assets/images/
|
||||
for img in *.jpg *.jpeg *.png; do
|
||||
[ -f "$img" ] || continue
|
||||
base="${img%.*}"
|
||||
cwebp -q 82 "$img" -o "${base}.webp" 2>/dev/null && rm "$img"
|
||||
done
|
||||
echo "WebP conversion done. Count: $(ls *.webp | wc -l)"
|
||||
```
|
||||
|
||||
Or use the Python Pillow batch script if cwebp is not installed:
|
||||
|
||||
```bash
|
||||
python3 /home/sirdrez/arisingmedia-websites/.am-webdesign-sops/wp-divi-pipeline/scripts/convert_images.py \
|
||||
src/assets/images/
|
||||
```
|
||||
|
||||
## Step 5 — Generate media manifest
|
||||
|
||||
After conversion, build the URL remap table used during HTML build:
|
||||
|
||||
```bash
|
||||
python3 -c "
import json, os, re
from pathlib import Path

uploads_dir = Path('.planning/wpress-extract/uploads')
site_url = 'https://vibrantyou.yoga'
am_path = '/assets/images'
raster = {'.jpg', '.jpeg', '.png', '.webp'}  # only these end up as .webp files

manifest = []
for root, dirs, files in os.walk(uploads_dir):
    for f in files:
        if Path(f).suffix.lower() not in raster:
            continue  # SVG, GIF and non-image files have no .webp counterpart
        rel = (Path(root) / f).relative_to(uploads_dir)
        # WordPress URL for this file
        wp_url = f'{site_url}/wp-content/uploads/{rel.as_posix()}'
        # Strip the size-variant suffix so every variant maps to the original
        stem_clean = re.sub(r'-\d+x\d+$', '', Path(f).stem)
        am_url = f'{am_path}/{stem_clean}.webp'
        manifest.append({'wp_url': wp_url, 'am_url': am_url, 'original': f})

Path('.planning/data/media-manifest.json').write_text(
    json.dumps(manifest, indent=2))
print(f'Manifest: {len(manifest)} entries')
"
|
||||
```
|
||||
|
||||
## Step 6 — Apply manifest during HTML build
|
||||
|
||||
When writing HTML from extracted content, use the manifest to rewrite
|
||||
every WordPress upload URL:
|
||||
|
||||
```python
|
||||
import json, re

with open('.planning/data/media-manifest.json') as fh:
    manifest = json.load(fh)
url_map = {m['wp_url']: m['am_url'] for m in manifest}

def rewrite_media_urls(html: str) -> str:
    for wp_url, am_url in url_map.items():
        html = html.replace(wp_url, am_url)
    # Also rewrite relative /wp-content/uploads/ paths for raster images,
    # dropping any -WIDTHxHEIGHT suffix so variants resolve to the original
    html = re.sub(
        r'/wp-content/uploads/\d{4}/\d{2}/(?:[^"\'>\s/]+/)*([^"\'>\s/]+?)(?:-\d+x\d+)?\.(?:jpe?g|png|webp)\b',
        r'/assets/images/\1.webp',
        html,
        flags=re.IGNORECASE,
    )
    return html
|
||||
```
|
||||
|
||||
## Files to skip
|
||||
|
||||
Do not migrate these WordPress system images to `src/assets/images/`:
|
||||
- `woocommerce-placeholder.png` and variants
|
||||
- `wp-includes/` images (WordPress core UI)
|
||||
- Plugin admin icons (anything from `plugins/` in uploads)
|
||||
- Files in `wc-logs/`, `ithemes-security/`, `amcu-chunks/` subdirs
|
||||
|
||||
## Logo handling
|
||||
|
||||
The logo is typically at:
|
||||
```
|
||||
uploads/YYYY/MM/VibrantYouYogaLogo.png
|
||||
```
|
||||
|
||||
Place the logo at:
|
||||
- `src/assets/images/logo.webp` — standard WebP version
|
||||
- `src/assets/svg/logo.svg` — if an SVG version exists (preferred)
|
||||
- `src/assets/images/logo.png` — keep PNG fallback for email/OG use
|
||||
|
||||
Reference in header.html:
|
||||
```html
|
||||
<a href="/" class="nav__logo">
|
||||
<img src="/assets/images/logo.webp" alt="VibrantYou Yoga" width="160" height="48">
|
||||
</a>
|
||||
```
|
||||
|
||||
## OG image
|
||||
|
||||
Generate one 1200×630px OG image per `06-seo-meta.md` requirements.
|
||||
Place at: `src/assets/images/og-default.jpg`
|
||||
|
||||
## Next step
|
||||
|
||||
Proceed to `07-seo-preservation.md` to build the redirect map and audit
|
||||
every page's title, description, and canonical before the HTML build.
|
||||
@@ -0,0 +1,182 @@
|
||||
# 07 — SEO Preservation
|
||||
|
||||
Before building HTML, map every WordPress page URL to its new AM URL and
|
||||
ensure title, description, canonical, and schema.org are preserved or improved.
|
||||
|
||||
## Step 1 — Inventory all WP URLs
|
||||
|
||||
Extract every published page slug from `pages.json`:
|
||||
|
||||
```bash
|
||||
python3 -c "
import json
pages = json.load(open('.planning/data/pages.json'))
for p in pages:
    slug = p['slug']
    ptype = p['post_type']
    print(f'/{slug}/ ({ptype}) title={p[\"title\"]!r}')
" | tee .planning/data/wp-url-inventory.txt
|
||||
```
|
||||
|
||||
## Step 2 — Build redirect map
|
||||
|
||||
Map each WP URL to the new AM URL. Write to `.planning/data/redirect-map.txt`:
|
||||
|
||||
Format: `OLD_PATH -> NEW_PATH`
|
||||
|
||||
Common mapping patterns for yoga sites:
|
||||
|
||||
| Old WP URL | New AM URL | Action |
|
||||
|-----------|-----------|--------|
|
||||
| `/` | `/` | Same |
|
||||
| `/about/` | `/about/` | Same |
|
||||
| `/classes/` | `/classes/` | Same |
|
||||
| `/yoga-class-name/` | `/classes/yoga-class-name.html` | Restructure |
|
||||
| `/private-yoga-sessions/` | `/private-sessions/` | Rename |
|
||||
| `/contact-us/` | `/contact/` | Simplify |
|
||||
| `/?page_id=42` | `/about/` | WP ID → slug |
|
||||
| `/blog/post-title/` | `/blog/post-title.html` | Flatten |
|
||||
| `/events/event-name/` | `/classes/` | Consolidate into schedule |
|
||||
|
||||
Redirects go into `infra/nginx.conf`:
|
||||
|
||||
```nginx
|
||||
# Exact-match redirects
|
||||
location = /contact-us/ { return 301 /contact/; }
|
||||
location = /private-yoga-sessions/ { return 301 /private-sessions/; }
|
||||
|
||||
# WP page ID redirects
|
||||
location = / {
|
||||
if ($arg_page_id = "42") { return 301 /about/; }
|
||||
if ($arg_p) { return 301 /blog/; }
|
||||
}
|
||||
|
||||
# WP upload URLs → AM asset paths (catch-all)
|
||||
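# $uri holds the full /wp-content/uploads/... path, so capture only the
# filename (minus any -WIDTHxHEIGHT suffix) and point it at the .webp
location ~* ^/wp-content/uploads/(?:.*/)?([^/]+?)(?:-\d+x\d+)?\.(?:jpe?g|png|webp)$ {
    return 301 /assets/images/$1.webp;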
|
||||
}
|
||||
|
||||
# Block all WP URLs
|
||||
location ~ ^/wp-(admin|login|json|cron|includes|content/plugins|content/themes) {
|
||||
return 410;
|
||||
}
|
||||
```
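
The exact-match rules do not have to be typed by hand. As a sketch, assuming `redirect-map.txt` holds one `OLD_PATH -> NEW_PATH` pair per line as described above, a hypothetical helper `nginx_redirects` could emit them:

```python
from typing import List

def nginx_redirects(map_text: str) -> List[str]:
    """Turn 'OLD -> NEW' lines into nginx exact-match 301 rules."""
    rules = []
    for line in map_text.splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "->" not in line:
            continue
        old, new = (part.strip() for part in line.split("->", 1))
        if old == new:
            continue  # unchanged paths need no rule
        if "?" in old:
            continue  # query-string URLs need $arg_ handling, not location =
        rules.append(f"location = {old} {{ return 301 {new}; }}")
    return rules

sample_map = """\
/ -> /
/contact-us/ -> /contact/
/private-yoga-sessions/ -> /private-sessions/
/?page_id=42 -> /about/
"""
print("\n".join(nginx_redirects(sample_map)))
```

Query-string entries such as `/?page_id=42` are skipped on purpose, since nginx `location` matching ignores the query string; those still belong in the `$arg_page_id` block.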
|
||||
|
||||
## Step 3 — Rank Math SEO extraction
|
||||
|
||||
Rank Math stores titles and descriptions in `wp_postmeta`.
|
||||
`analyze_db.py` already extracts these into `pages.json` as `seo_title` and `seo_description`.
|
||||
|
||||
For each page, the priority order for SEO fields:
|
||||
1. `seo_title` from Rank Math (if not empty and not a template like `%title% - %sitename%`)
|
||||
2. `post_title` with AM format appended: `{Title} | VibrantYou Yoga`
|
||||
3. Never leave title as the raw WP default
|
||||
|
||||
Rank Math title templates use `%` tokens — strip them and rebuild:
|
||||
```python
|
||||
import re

def clean_rm_title(rm_title: str, post_title: str, site_name: str) -> str:
    # Fall back to the AM format when Rank Math has no title or only a %token% template
    if not rm_title or "%" in rm_title:
        return f"{post_title} | {site_name}"
    return rm_title

def clean_rm_desc(rm_desc: str) -> str:
    # Strip %token% placeholders; a missing description is treated as empty
    return re.sub(r"%[a-z_]+%", "", rm_desc or "").strip(" -|")
|
||||
```
|
||||
|
||||
## Step 4 — Per-page SEO checklist
|
||||
|
||||
For every page in `pages.json`, fill in this record before writing HTML:
|
||||
|
||||
```json
|
||||
{
|
||||
"slug": "about",
|
||||
"new_path": "/about/",
|
||||
"canonical": "https://vibrantyou.yoga/about/",
|
||||
"title": "About VibrantYou Yoga | Mindful Movement in [City], [State]",
|
||||
"description": "Meet the instructors and story behind VibrantYou Yoga. [150-160 chars, include city]",
|
||||
"keywords": "yoga studio [city], yoga instructor, mindful movement",
|
||||
"og_image": "/assets/images/about-studio.webp",
|
||||
"schema_type": "AboutPage",
|
||||
"h1": "Our Story"
|
||||
}
|
||||
```
|
||||
|
||||
Write to `.planning/data/seo-map.json`. The HTML build reads this file to
|
||||
stamp `<head>` tags.
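
A minimal sketch of that stamping step, assuming the record shape shown above; `head_tags` is a hypothetical function name, and only three of the tags are rendered to keep the example short.

```python
from html import escape

def head_tags(record: dict) -> str:
    """Render title, description and canonical tags from one seo-map.json record."""
    title = escape(record["title"])
    description = escape(record["description"], quote=True)
    canonical = escape(record["canonical"], quote=True)
    return "\n".join([
        f"<title>{title}</title>",
        f'<meta name="description" content="{description}">',
        f'<link rel="canonical" href="{canonical}">',
    ])

about = {
    "title": "About VibrantYou Yoga | Mindful Movement & Breath",
    "description": 'Meet the instructors behind "VibrantYou Yoga".',
    "canonical": "https://vibrantyou.yoga/about/",
}
print(head_tags(about))
```

Escaping matters because titles and descriptions routinely contain `&` and quotes, which would otherwise break the attribute values.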
|
||||
|
||||
## Step 5 — Schema.org per page type
|
||||
|
||||
| Page | Schema type | Required fields |
|
||||
|------|------------|----------------|
|
||||
| Home | `LocalBusiness` | name, url, telephone, address, areaServed, openingHours |
|
||||
| About | `AboutPage` + `Organization` | name, description, founders |
|
||||
| Classes index | `ItemList` of `Course` | name, url, description per class |
|
||||
| Class detail | `Course` | name, description, provider, educationalLevel |
|
||||
| Contact | `ContactPage` | name, url, telephone, email, address |
|
||||
| Blog post | `Article` | headline, datePublished, author, image |
|
||||
| 404 | none | — |
|
||||
|
||||
LocalBusiness schema for vibrantyou.yoga (seed from `site-info.json`):
|
||||
```json
|
||||
{
|
||||
"@context": "https://schema.org",
|
||||
"@type": ["LocalBusiness", "HealthAndBeautyBusiness"],
|
||||
"@id": "https://vibrantyou.yoga/#business",
|
||||
"name": "VibrantYou Yoga",
|
||||
"url": "https://vibrantyou.yoga",
|
||||
"telephone": "",
|
||||
"priceRange": "$$",
|
||||
|
||||
"currenciesAccepted": "USD",
|
||||
"paymentAccepted": "Cash, Credit Card",
|
||||
"address": {
|
||||
"@type": "PostalAddress",
|
||||
"streetAddress": "",
|
||||
"addressLocality": "",
|
||||
"addressRegion": "",
|
||||
"postalCode": "",
|
||||
"addressCountry": "US"
|
||||
}
|
||||
}
|
||||
```
|
||||
Mark address fields `DRAFT NEEDED` — do not fabricate. Pull from `wp_options`
|
||||
(`admin_email`, Events Manager location settings) or ask client.
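
To make the gaps visible instead of guessing values, a sketch like the following could list every empty field in the schema before launch. `missing_fields` is a hypothetical helper; the sample dict is a trimmed copy of the schema above.

```python
def missing_fields(node, path=""):
    """Return dotted paths of every empty-string or null value in a nested dict."""
    missing = []
    if isinstance(node, dict):
        for key, value in node.items():
            child = f"{path}.{key}" if path else key
            missing += missing_fields(value, child)
    elif node in ("", None):
        missing.append(path)
    return missing

schema = {
    "@type": ["LocalBusiness", "HealthAndBeautyBusiness"],
    "name": "VibrantYou Yoga",
    "telephone": "",
    "address": {
        "@type": "PostalAddress",
        "streetAddress": "",
        "addressCountry": "US",
    },
}
for field in missing_fields(schema):
    print(f"DRAFT NEEDED: {field}")
```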
|
||||
|
||||
## Step 6 — Pre-launch SEO audit commands
|
||||
|
||||
Run these before declaring the build complete:
|
||||
|
||||
```bash
|
||||
SITE=src
|
||||
|
||||
# Every page has a <title>
|
||||
find $SITE -name "*.html" | xargs grep -L '<title>' | grep -v "_template"
|
||||
|
||||
# Every page has meta description
|
||||
find $SITE -name "*.html" | xargs grep -L 'name="description"' | grep -v "_template"
|
||||
|
||||
# Every page has canonical
|
||||
find $SITE -name "*.html" | xargs grep -L 'rel="canonical"' | grep -v "_template"
|
||||
|
||||
# Every page has JSON-LD
|
||||
find $SITE -name "*.html" | xargs grep -L 'application/ld+json' | grep -v "_template"
|
||||
|
||||
# No WP URLs leaked into HTML
|
||||
grep -r "wp-content\|wp-admin\|wordpress\|?p=\|?page_id=" $SITE --include="*.html"
|
||||
|
||||
# No unreplaced template placeholders
|
||||
grep -r "{{" $SITE --include="*.html"
|
||||
|
||||
# No Divi class residue
|
||||
grep -r "et_pb_\|divi-builder" $SITE --include="*.html"
|
||||
```
|
||||
|
||||
All seven commands must return zero results before launch.
|
||||
|
||||
## Next step
|
||||
|
||||
Proceed to `08-run-order.md` for the complete execution sequence,
|
||||
then `02-wordpress-to-html-migration.md` Phase 7 for DNS cutover.
|
||||
@@ -0,0 +1,230 @@
|
||||
# 08 — Run Order (DEPRECATED)
|
||||
|
||||
> **Superseded by `10-agent-breadcrumbs.md`.**
|
||||
> This file described the WP → static HTML (Stack B) run order.
|
||||
> The pipeline now targets Stack A (PHP router + SQLite).
|
||||
> Use `10-agent-breadcrumbs.md` for the current ordered execution checklist.
|
||||
|
||||
---
|
||||
|
||||
Step-by-step execution sequence for a complete .wpress → AM HTML migration.
|
||||
Run each command, verify the output, then proceed to the next.
|
||||
|
||||
## Prerequisites
|
||||
|
||||
```bash
|
||||
# Python 3.8+ required
|
||||
python3 --version
|
||||
|
||||
# cwebp for image conversion (optional — Python fallback available)
|
||||
which cwebp || echo "cwebp not installed — will use Python Pillow fallback"
|
||||
|
||||
# Set project domain variable (use throughout)
|
||||
export DOMAIN="vibrantyou.yoga"
|
||||
export PROJECT="/home/sirdrez/arisingmedia-websites/$DOMAIN"
|
||||
export SOPS="/home/sirdrez/arisingmedia-websites/.am-webdesign-sops"
|
||||
export WPRESS=$(ls $PROJECT/.planning/*.wpress | head -1)
|
||||
|
||||
echo "Domain: $DOMAIN"
|
||||
echo "Project: $PROJECT"
|
||||
echo "Archive: $WPRESS"
|
||||
```
|
||||
|
||||
---
|
||||
|
||||
## Phase 0 — Setup
|
||||
|
||||
```bash
|
||||
# Create directory structure
|
||||
mkdir -p $PROJECT/{src/{about,services,contact,blog,classes,components,assets/{css,js,images,svg,fonts}},build,infra,api,.planning/{data/content,scripts,wpress-extract}}
|
||||
|
||||
# Verify archive
|
||||
ls -lh $WPRESS
|
||||
file $WPRESS
|
||||
```
|
||||
|
||||
---
|
||||
|
||||
## Phase 1 — Extract archive
|
||||
|
||||
```bash
|
||||
python3 $SOPS/wp-divi-pipeline/scripts/extract_wpress.py \
|
||||
"$WPRESS" \
|
||||
"$PROJECT/.planning/wpress-extract/"
|
||||
|
||||
# Verify
|
||||
ls $PROJECT/.planning/wpress-extract/
|
||||
cat $PROJECT/.planning/wpress-extract/package.json | python3 -m json.tool | head -20
|
||||
ls -lh $PROJECT/.planning/wpress-extract/database.sql
|
||||
```
|
||||
|
||||
Expected output: `DONE: N files | X MB`
|
||||
|
||||
---
|
||||
|
||||
## Phase 2 — Database analysis
|
||||
|
||||
```bash
|
||||
python3 $SOPS/wp-divi-pipeline/scripts/analyze_db.py \
|
||||
"$PROJECT/.planning/wpress-extract/" \
|
||||
"$PROJECT/.planning/data/"
|
||||
|
||||
# Verify
|
||||
cat $PROJECT/.planning/data/site-info.json
|
||||
echo "Pages: $(python3 -c "import json; print(len(json.load(open('$PROJECT/.planning/data/pages.json'))))")"
|
||||
cat $PROJECT/.planning/data/design-system.json
|
||||
```
|
||||
|
||||
Expected output: `pages.json (N pages/posts)`
|
||||
If pages = 0, check the SQL prefix detection in the script output.
|
||||
|
||||
---
|
||||
|
||||
## Phase 3 — Content extraction
|
||||
|
||||
### Divi 5 (most common — check design-system.json divi_version first)
|
||||
|
||||
```bash
|
||||
python3 $SOPS/wp-divi-pipeline/scripts/extract_divi5.py \
|
||||
"$PROJECT/.planning/data/pages.json" \
|
||||
"$PROJECT/.planning/data/content/"
|
||||
|
||||
# Verify
|
||||
ls $PROJECT/.planning/data/content/
|
||||
cat $PROJECT/.planning/data/content/home.json | python3 -m json.tool | head -40
|
||||
```
|
||||
|
||||
---
|
||||
|
||||
## Phase 4 — Design system
|
||||
|
||||
Read `$PROJECT/.planning/data/design-system.json` and seed `main.css`:
|
||||
|
||||
```bash
|
||||
cat $PROJECT/.planning/data/design-system.json
|
||||
```
|
||||
|
||||
Manually translate to CSS custom properties per `04-design-system-extraction.md`.
|
||||
Write to: `$PROJECT/src/assets/css/main.css`
|
||||
|
||||
Key values for vibrantyou.yoga:
|
||||
- Primary: #1a8a7a Dark: #0f5f53
|
||||
- Body font: DM Sans Heading font: DM Serif Display
|
||||
|
||||
---
|
||||
|
||||
## Phase 5 — Media migration
|
||||
|
||||
```bash
|
||||
# Catalog originals (skip WP-generated size variants)
|
||||
find $PROJECT/.planning/wpress-extract/uploads -type f \
|
||||
\( -name "*.jpg" -o -name "*.jpeg" -o -name "*.png" -o -name "*.webp" \) | \
|
||||
grep -v -E "\-[0-9]+x[0-9]+\.(jpg|jpeg|png|webp)$" | \
|
||||
sort > $PROJECT/.planning/data/media-originals.txt
|
||||
|
||||
echo "Original images: $(wc -l < $PROJECT/.planning/data/media-originals.txt)"
|
||||
|
||||
# Copy to src/assets/images/
|
||||
while IFS= read -r src; do
|
||||
  cp "$src" "$PROJECT/src/assets/images/$(basename "$src")"
|
||||
done < $PROJECT/.planning/data/media-originals.txt
|
||||
|
||||
# Convert to WebP (cwebp path)
|
||||
cd $PROJECT/src/assets/images/
|
||||
for img in *.jpg *.jpeg *.png; do
|
||||
[ -f "$img" ] || continue
|
||||
base="${img%.*}"
|
||||
cwebp -q 82 "$img" -o "${base}.webp" 2>/dev/null && rm "$img"
|
||||
done
|
||||
echo "WebP count: $(ls *.webp 2>/dev/null | wc -l)"
|
||||
cd $PROJECT
|
||||
```
|
||||
|
||||
---
|
||||
|
||||
## Phase 6 — Build HTML
|
||||
|
||||
Per `05-content-migration.md`, build pages in this order:
|
||||
|
||||
```bash
|
||||
# 1. Write src/assets/css/main.css (design tokens — manual)
|
||||
# 2. Write src/assets/css/components.css (manual)
|
||||
# 3. Write src/components/header.html (manual)
|
||||
# 4. Write src/components/footer.html (manual)
|
||||
# 5. Write src/assets/js/components.js (fetch + inject)
|
||||
# 6. Write src/assets/js/main.js (scroll, animations)
|
||||
# 7. Write src/index.html (home page — first, establishes design)
|
||||
# 8. Write remaining pages
|
||||
|
||||
# After build, verify zero unreplaced placeholders
|
||||
grep -r "{{" $PROJECT/src --include="*.html" && echo "FAIL: placeholders found" || echo "OK"
|
||||
|
||||
# Verify no Divi residue
|
||||
grep -rn "et_pb_\|wp:divi\|\[et_pb" $PROJECT/src --include="*.html" && echo "FAIL: Divi residue" || echo "OK"
|
||||
```
|
||||
|
||||
---
|
||||
|
||||
## Phase 7 — SEO audit
|
||||
|
||||
```bash
|
||||
cd $PROJECT/src
|
||||
|
||||
# All pages have title
|
||||
find . -name "*.html" | grep -v "_template" | xargs grep -L '<title>' | head
|
||||
|
||||
# All pages have canonical
|
||||
find . -name "*.html" | grep -v "_template" | xargs grep -L 'rel="canonical"' | head
|
||||
|
||||
# All pages have JSON-LD
|
||||
find . -name "*.html" | grep -v "_template" | xargs grep -L 'ld+json' | head
|
||||
|
||||
cd $PROJECT
|
||||
```
|
||||
|
||||
All commands must return empty output.
|
||||
|
||||
---
|
||||
|
||||
## Phase 8 — Infra (Docker)
|
||||
|
||||
```bash
|
||||
# Copy infra from reference project
|
||||
cp /home/sirdrez/arisingmedia-websites/vibrantyoucoaching.com/Dockerfile $PROJECT/
|
||||
cp /home/sirdrez/arisingmedia-websites/vibrantyoucoaching.com/docker-compose.yml $PROJECT/
|
||||
cp -r /home/sirdrez/arisingmedia-websites/vibrantyoucoaching.com/infra/. $PROJECT/infra/
|
||||
|
||||
# Update nginx.conf: set server_name to $DOMAIN, add redirects from 07-seo-preservation.md
|
||||
# Update docker-compose.yml: set container_name and port
|
||||
|
||||
# Test build
|
||||
docker compose -f $PROJECT/docker-compose.yml build 2>&1 | tail -5
|
||||
docker compose -f $PROJECT/docker-compose.yml up -d
|
||||
curl -I http://localhost:PORT/ 2>&1 | head -5
|
||||
```
|
||||
|
||||
---
|
||||
|
||||
## Phase 9 — Protection check
|
||||
|
||||
```bash
|
||||
# Run after deploy
|
||||
bash $SOPS/tools/verify-protection.sh https://$DOMAIN
|
||||
|
||||
# Must return exit 0 with no FAIL lines
|
||||
```
|
||||
|
||||
---
|
||||
|
||||
## Checklist summary
|
||||
|
||||
- [ ] Phase 0: Directories created
|
||||
- [ ] Phase 1: .wpress extracted, database.sql present
|
||||
- [ ] Phase 2: pages.json > 0 entries, design-system.json has colors + fonts
|
||||
- [ ] Phase 3: content/ dir has one JSON per page
|
||||
- [ ] Phase 4: main.css written with full :root{} token block
|
||||
- [ ] Phase 5: WebP images in src/assets/images/
|
||||
- [ ] Phase 6: All HTML pages built, zero {{ placeholders, zero Divi residue
|
||||
- [ ] Phase 7: All SEO audit commands return empty
|
||||
- [ ] Phase 8: Docker container up, curl returns 200
|
||||
- [ ] Phase 9: verify-protection.sh exits 0
|
||||
@@ -0,0 +1,370 @@
|
||||
# 09 — Stack A Output Spec (SQLite Schema + sections_json)
|
||||
|
||||
## SQLite databases produced by seed_databases.py
|
||||
|
||||
### pages.sqlite
|
||||
|
||||
```sql
|
||||
CREATE TABLE pages (
|
||||
id INTEGER PRIMARY KEY,
|
||||
slug TEXT UNIQUE NOT NULL,
|
||||
template TEXT NOT NULL, -- home | static | classes | schedule | glossary | blog
|
||||
title TEXT NOT NULL,
|
||||
meta_description TEXT,
|
||||
canonical_url TEXT,
|
||||
og_image TEXT,
|
||||
schema_json TEXT,
|
||||
hero_eyebrow TEXT,
|
||||
hero_h1 TEXT,
|
||||
hero_lead TEXT,
|
||||
sections_json TEXT, -- JSON array of section objects
|
||||
updated_at TEXT
|
||||
);
|
||||
```
|
||||
|
||||
### nav.sqlite
|
||||
|
||||
```sql
|
||||
CREATE TABLE nav_items (
|
||||
id INTEGER PRIMARY KEY,
|
||||
label TEXT NOT NULL,
|
||||
href TEXT NOT NULL,
|
||||
display_order INTEGER DEFAULT 0,
|
||||
is_cta INTEGER DEFAULT 0 -- 1 = render as button
|
||||
);
|
||||
```
|
||||
|
||||
### blog.sqlite
|
||||
|
||||
```sql
|
||||
CREATE TABLE posts (
|
||||
id INTEGER PRIMARY KEY,
|
||||
slug TEXT UNIQUE NOT NULL,
|
||||
title TEXT NOT NULL,
|
||||
excerpt TEXT,
|
||||
body_html TEXT,
|
||||
author TEXT DEFAULT 'Admin',
|
||||
published_at TEXT,
|
||||
og_image TEXT,
|
||||
tags TEXT
|
||||
);
|
||||
```
|
||||
|
||||
### testimonials.sqlite
|
||||
|
||||
```sql
|
||||
CREATE TABLE testimonials (
|
||||
id INTEGER PRIMARY KEY,
|
||||
quote TEXT NOT NULL,
|
||||
author_name TEXT NOT NULL,
|
||||
author_role TEXT,
|
||||
is_featured INTEGER DEFAULT 0,
|
||||
display_order INTEGER DEFAULT 0
|
||||
);
|
||||
```
|
||||
|
||||
### glossary.sqlite (if site has a glossary)
|
||||
|
||||
```sql
|
||||
CREATE TABLE terms (
|
||||
id INTEGER PRIMARY KEY,
|
||||
slug TEXT UNIQUE NOT NULL,
|
||||
term TEXT NOT NULL,
|
||||
pronunciation TEXT,
|
||||
definition TEXT NOT NULL,
|
||||
category TEXT NOT NULL,
|
||||
level TEXT NOT NULL,
|
||||
display_order INTEGER DEFAULT 0
|
||||
);
|
||||
```
|
||||
|
||||
### faq.sqlite (if site has FAQs)
|
||||
|
||||
```sql
|
||||
CREATE TABLE faqs (
|
||||
id INTEGER PRIMARY KEY,
|
||||
question TEXT NOT NULL,
|
||||
answer TEXT NOT NULL,
|
||||
category TEXT NOT NULL,
|
||||
display_order INTEGER DEFAULT 0
|
||||
);
|
||||
```
|
||||
|
||||
## sections_json section types
|
||||
|
||||
Each page row's sections_json is a JSON array. Each element is a typed object:
|
||||
|
||||
### text_split
|
||||
|
||||
Two-column: text on one side, image on the other. CTAs optional.
|
||||
|
||||
```json
|
||||
{
|
||||
"type": "text_split",
|
||||
"eyebrow": "",
|
||||
"h2": "",
|
||||
"body": "",
|
||||
"img": "/assets/images/x.webp",
|
||||
"img_alt": "",
|
||||
"cta_label": "",
|
||||
"cta_href": "",
|
||||
"reverse": false
|
||||
}
|
||||
```
|
||||
|
||||
### feature_cards
|
||||
|
||||
Grid of 3-4 cards, each with icon + title + body.
|
||||
|
||||
```json
|
||||
{
|
||||
"type": "feature_cards",
|
||||
"eyebrow": "",
|
||||
"h2": "",
|
||||
"lead": "",
|
||||
"cards": [
|
||||
{"icon": "", "title": "", "body": ""}
|
||||
]
|
||||
}
|
||||
```
|
||||
|
||||
### accordion
|
||||
|
||||
Collapsible question/answer pairs.
|
||||
|
||||
```json
|
||||
{
|
||||
"type": "accordion",
|
||||
"eyebrow": "",
|
||||
"h2": "",
|
||||
"items": [
|
||||
{"q": "", "a": ""}
|
||||
]
|
||||
}
|
||||
```
|
||||
|
||||
### cta_band
|
||||
|
||||
Full-width call-to-action with headline + button.
|
||||
|
||||
```json
|
||||
{
|
||||
"type": "cta_band",
|
||||
"eyebrow": "",
|
||||
"h2": "",
|
||||
"lead": "",
|
||||
"btn_label": "",
|
||||
"btn_href": "",
|
||||
"variant": "forest"
|
||||
}
|
||||
```
|
||||
|
||||
### text_block
|
||||
|
||||
Simple text heading + body.
|
||||
|
||||
```json
|
||||
{
|
||||
"type": "text_block",
|
||||
"eyebrow": "",
|
||||
"h2": "",
|
||||
"body": ""
|
||||
}
|
||||
```
|
||||
|
||||
### stats_strip
|
||||
|
||||
Grid of stat + label pairs.
|
||||
|
||||
```json
|
||||
{
|
||||
"type": "stats_strip",
|
||||
"stats": [
|
||||
{"value": "", "label": ""}
|
||||
]
|
||||
}
|
||||
```
|
||||
|
||||
### topic_pills
|
||||
|
||||
Row of clickable topic/tag items.
|
||||
|
||||
```json
|
||||
{
|
||||
"type": "topic_pills",
|
||||
"eyebrow": "",
|
||||
"h2": "",
|
||||
"items": [
|
||||
{"label": "", "href": ""}
|
||||
]
|
||||
}
|
||||
```
|
||||
|
||||
### form_contact
|
||||
|
||||
Embedded contact form.
|
||||
|
||||
```json
|
||||
{
|
||||
"type": "form_contact",
|
||||
"h2": "",
|
||||
"lead": ""
|
||||
}
|
||||
```
|
||||
|
||||
### booking_options
|
||||
|
||||
Pricing table or service options grid.
|
||||
|
||||
```json
|
||||
{
|
||||
"type": "booking_options",
|
||||
"eyebrow": "",
|
||||
"h2": "",
|
||||
"options": [
|
||||
{"name": "", "price": "", "features": [], "cta_label": "", "cta_href": ""}
|
||||
]
|
||||
}
|
||||
```
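
Because the templates only understand the types listed above, it may help to validate `sections_json` before seeding. The sketch below is one possible shape for that check: `SECTION_KEYS` is transcribed from the examples in this section, and `validate_sections` is a hypothetical helper, not part of the pipeline scripts.

```python
import json

SECTION_KEYS = {
    "text_split": {"eyebrow", "h2", "body", "img", "img_alt", "cta_label", "cta_href", "reverse"},
    "feature_cards": {"eyebrow", "h2", "lead", "cards"},
    "accordion": {"eyebrow", "h2", "items"},
    "cta_band": {"eyebrow", "h2", "lead", "btn_label", "btn_href", "variant"},
    "text_block": {"eyebrow", "h2", "body"},
    "stats_strip": {"stats"},
    "topic_pills": {"eyebrow", "h2", "items"},
    "form_contact": {"h2", "lead"},
    "booking_options": {"eyebrow", "h2", "options"},
}

def validate_sections(sections_json: str):
    """Return a list of human-readable problems; an empty list means valid."""
    errors = []
    for index, section in enumerate(json.loads(sections_json)):
        kind = section.get("type")
        if kind not in SECTION_KEYS:
            errors.append(f"section {index}: unknown type {kind!r}")
            continue
        extra = set(section) - SECTION_KEYS[kind] - {"type"}
        if extra:
            errors.append(f"section {index} ({kind}): unexpected keys {sorted(extra)}")
    return errors

good = json.dumps([{"type": "text_block", "eyebrow": "", "h2": "Our Story", "body": ""}])
bad = json.dumps([{"type": "hero_slider"}, {"type": "form_contact", "h2": "", "color": "red"}])
print(validate_sections(good))
print(validate_sections(bad))
```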
|
||||
|
||||
## Divi module → section type mapping
|
||||
|
||||
| Divi Module | AM Section Type | Notes |
|
||||
|---|---|---|
|
||||
| et_pb_blurb | feature_cards item | Extract icon, title, body |
|
||||
| et_pb_toggle | accordion item | Extract q/a pairs |
|
||||
| et_pb_cta | cta_band | Extract headline, button text, href |
|
||||
| et_pb_pricing_table | booking_options | Extract plan names, prices, features |
|
||||
| et_pb_testimonial | testimonials.sqlite row | Extract quote, author, role |
|
||||
| et_pb_text | text_block | Extract body copy |
|
||||
| et_pb_code | text_block (sanitized) | Extract HTML, remove script tags |
|
||||
| et_pb_number_counter | stats_strip item | Extract number, label |
|
||||
| et_pb_button | cta_band (minimal) | Extract button text, href |
|
||||
| et_pb_menu / header | nav.sqlite rows | Extract label, URL, menu order |
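
The table above can be carried into the extraction code as plain lookup dicts. This is a sketch only: the dict and function names are hypothetical, and modules missing from the table are reported as unmapped so they get a manual decision.

```python
# Divi modules that become entries inside sections_json
SECTION_FOR_MODULE = {
    "et_pb_blurb": "feature_cards",
    "et_pb_toggle": "accordion",
    "et_pb_cta": "cta_band",
    "et_pb_pricing_table": "booking_options",
    "et_pb_text": "text_block",
    "et_pb_code": "text_block",
    "et_pb_number_counter": "stats_strip",
    "et_pb_button": "cta_band",
}

# Divi modules that become rows in their own SQLite database
TABLE_FOR_MODULE = {
    "et_pb_testimonial": "testimonials.sqlite",
    "et_pb_menu": "nav.sqlite",
}

def route_module(module: str):
    """Return (destination kind, destination name) for one Divi module slug."""
    if module in SECTION_FOR_MODULE:
        return ("section", SECTION_FOR_MODULE[module])
    if module in TABLE_FOR_MODULE:
        return ("table", TABLE_FOR_MODULE[module])
    return ("unmapped", module)

for name in ("et_pb_blurb", "et_pb_testimonial", "et_pb_slider"):
    print(name, route_module(name))
```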
|
||||
|
||||
## seed_databases.py structure
|
||||
|
||||
Every migration generates a seed_databases.py at `build/seed_databases.py`.
|
||||
|
||||
Template structure:
|
||||
|
||||
```python
|
||||
import sqlite3
|
||||
import json
|
||||
from pathlib import Path
|
||||
|
||||
pages_path = Path('src/api/data/pages.sqlite')
|
||||
nav_path = Path('src/api/data/nav.sqlite')
|
||||
blog_path = Path('src/api/data/blog.sqlite')
|
||||
testimonials_path = Path('src/api/data/testimonials.sqlite')
|
||||
|
||||
def seed_pages(conn):
|
||||
"""INSERT all pages with sections_json and hero data."""
|
||||
cursor = conn.cursor()
|
||||
cursor.execute('''
|
||||
CREATE TABLE IF NOT EXISTS pages (
|
||||
id INTEGER PRIMARY KEY,
|
||||
slug TEXT UNIQUE NOT NULL,
|
||||
template TEXT NOT NULL,
|
||||
title TEXT NOT NULL,
|
||||
meta_description TEXT,
|
||||
canonical_url TEXT,
|
||||
og_image TEXT,
|
||||
schema_json TEXT,
|
||||
hero_eyebrow TEXT,
|
||||
hero_h1 TEXT,
|
||||
hero_lead TEXT,
|
||||
sections_json TEXT,
|
||||
updated_at TEXT
|
||||
)
|
||||
''')
|
||||
|
||||
pages = [
|
||||
('home', 'home', 'Home', 'Home meta', '/home', '', '{}',
|
||||
         '', 'Welcome', 'Lead text', json.dumps([])),  # replace [] with this page's section objects
|
||||
# ... more rows
|
||||
]
|
||||
for page in pages:
|
||||
cursor.execute(
|
||||
            "INSERT INTO pages (slug, template, title, meta_description, canonical_url, og_image, schema_json, hero_eyebrow, hero_h1, hero_lead, sections_json, updated_at) VALUES (?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, datetime('now'))",
|
||||
page
|
||||
)
|
||||
|
||||
def seed_nav(conn):
|
||||
"""INSERT navigation items from nav.json."""
|
||||
cursor = conn.cursor()
|
||||
cursor.execute('''
|
||||
CREATE TABLE IF NOT EXISTS nav_items (
|
||||
id INTEGER PRIMARY KEY,
|
||||
label TEXT NOT NULL,
|
||||
href TEXT NOT NULL,
|
||||
display_order INTEGER DEFAULT 0,
|
||||
is_cta INTEGER DEFAULT 0
|
||||
)
|
||||
''')
|
||||
|
||||
items = [
|
||||
('Home', '/', 0, 0),
|
||||
('About', '/about', 1, 0),
|
||||
('Contact', '/contact', 2, 1),
|
||||
# ... more rows
|
||||
]
|
||||
for item in items:
|
||||
cursor.execute(
|
||||
'INSERT INTO nav_items (label, href, display_order, is_cta) VALUES (?, ?, ?, ?)',
|
||||
item
|
||||
)
|
||||
|
||||
def seed_blog(conn):
|
||||
"""INSERT blog posts if site has a blog."""
|
||||
cursor = conn.cursor()
|
||||
cursor.execute('''
|
||||
CREATE TABLE IF NOT EXISTS posts (
|
||||
id INTEGER PRIMARY KEY,
|
||||
slug TEXT UNIQUE NOT NULL,
|
||||
title TEXT NOT NULL,
|
||||
excerpt TEXT,
|
||||
body_html TEXT,
|
||||
author TEXT DEFAULT 'Admin',
|
||||
published_at TEXT,
|
||||
og_image TEXT,
|
||||
tags TEXT
|
||||
)
|
||||
''')
|
||||
# ... INSERT rows
|
||||
|
||||
def seed_testimonials(conn):
|
||||
"""INSERT testimonials if present."""
|
||||
# ... CREATE TABLE + INSERT rows
|
||||
|
||||
if __name__ == '__main__':
|
||||
for db_path, seeder_fn in [
|
||||
(pages_path, seed_pages),
|
||||
(nav_path, seed_nav),
|
||||
(blog_path, seed_blog),
|
||||
(testimonials_path, seed_testimonials),
|
||||
]:
|
||||
if db_path.exists():
|
||||
db_path.unlink() # clear if re-running
|
||||
conn = sqlite3.connect(db_path)
|
||||
seeder_fn(conn)
|
||||
conn.commit()
|
||||
conn.close()
|
||||
print(f"seeded: {db_path.name}")
|
||||
|
||||
print("All databases seeded successfully.")
|
||||
```

## Content validation checklist

After staging seed_databases.py and before running it:

- [ ] No raw shortcode residue: `[et_pb_` (Divi), `[vc_` (WPBakery), etc.
- [ ] No em-dashes (—): replace with commas, periods, or spaces
- [ ] No "Netherlands" or other location-specific copy (unless intentional)
- [ ] hero_h1 is 5-10 words (brand voice, not generic)
- [ ] Each section type matches the spec above (no custom types)
- [ ] All images are `/assets/images/{name}.webp` (not absolute URLs)
- [ ] All CTAs point to correct slugs (`/about`, `/contact`, etc.)
- [ ] Nav items include at least 3 menu links
- [ ] At least one nav item has `is_cta=1` (usually Contact or Book)
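
Several of the checks above are mechanical and could be scripted before the manual review. The following is a minimal sketch under stated assumptions: `lint_content` and `lint_hero_h1` are hypothetical helpers (not part of the pipeline scripts), and they operate on plain content strings taken from the staged seed data.

```python
import re
from typing import List

# Patterns for the mechanical checklist items. All names here are illustrative.
SHORTCODE_RE = re.compile(r"\[/?(?:et_pb_|vc_)")
ABSOLUTE_IMG_RE = re.compile(r"https?://[^\s\"']+\.(?:jpe?g|png|gif|webp)", re.IGNORECASE)
EM_DASH = "\u2014"


def lint_content(text: str) -> List[str]:
    """Return the checklist violations found in one content string."""
    problems = []
    if SHORTCODE_RE.search(text):
        problems.append("shortcode residue")
    if EM_DASH in text:
        problems.append("em-dash")
    if ABSOLUTE_IMG_RE.search(text):
        problems.append("absolute image URL")
    return problems


def lint_hero_h1(hero_h1: str) -> List[str]:
    """Flag hero_h1 values outside the 5-10 word range."""
    words = len(hero_h1.split())
    return [] if 5 <= words <= 10 else [f"hero_h1 has {words} words"]
```

Items that need judgment (brand voice, correct CTA targets, section types) still need a human or agent pass.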
@@ -0,0 +1,249 @@
# 10 — Agent Execution Breadcrumbs

Step-by-step ordered checklist for an agent executing a .wpress migration to Stack A.
Each step has: input, command, expected output, verification. Complete each step before starting the next.

## Pre-flight

- [ ] .wpress file confirmed at `$PROJECT/.planning/*.wpress`
- [ ] python3 --version >= 3.8
- [ ] docker compose version confirmed
- [ ] DOMAIN, PROJECT, SOPS, and WPRESS env vars set (the commands below use all four)

## Step 1 — Extract archive

**INPUT:** `$WPRESS` (path to .wpress file)

**CMD:**
```bash
python3 $SOPS/wp-divi-pipeline-to-am-stack/scripts/extract_wpress.py "$WPRESS" "$PROJECT/.planning/wpress-extract/"
```

**VERIFY:**
```bash
ls $PROJECT/.planning/wpress-extract/
```

Expected: `database.sql` and `wp-content/` present

**BLOCK:** If database.sql is missing, the .wpress format differs from what the script expects. Check the extract_wpress.py logs.

---

## Step 2 — Analyze database

**INPUT:** `$PROJECT/.planning/wpress-extract/database.sql`

**CMD:**
```bash
python3 $SOPS/wp-divi-pipeline-to-am-stack/scripts/analyze_db.py "$PROJECT/.planning/wpress-extract/" "$PROJECT/.planning/data/"
```

**VERIFY:**
```bash
cat $PROJECT/.planning/data/pages.json | python3 -m json.tool | head -20
cat $PROJECT/.planning/data/site-info.json
```

Expected: page objects with slug + title visible; divi_version: 4 or 5

**BLOCK:** If pages.json empty, check table prefix detection in analyze_db.py output.
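
When pages.json comes back empty, it can help to confirm which prefix the dump actually uses before digging into the script. A rough sketch, assuming the dump contains a `CREATE TABLE` statement for the options table; `detect_table_prefix` is a hypothetical helper, not a function in analyze_db.py:

```python
import re
from typing import Optional


def detect_table_prefix(sql_text: str) -> Optional[str]:
    """Return the table prefix implied by the options table, or None if absent."""
    m = re.search(
        r"CREATE TABLE (?:IF NOT EXISTS )?`(\w*?)options`",
        sql_text,
        re.IGNORECASE,
    )
    return m.group(1) if m else None
```

A result of `SERVMASK_PREFIX_` is normal for All-in-One WP Migration dumps; `None` suggests the dump is truncated or not a WordPress export.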

---

## Step 3 — Extract nav menus

**INPUT:** `$PROJECT/.planning/wpress-extract/database.sql`

**CMD:**
```bash
python3 $SOPS/wp-divi-pipeline-to-am-stack/scripts/extract_nav.py "$PROJECT/.planning/wpress-extract/" "$PROJECT/.planning/data/"
```

**VERIFY:**
```bash
cat $PROJECT/.planning/data/nav.json | python3 -m json.tool
```

Expected: array of `{label, href, display_order, is_cta}` objects. At least 3 items.

**NOTE:** `is_cta=1` for "Book", "Get Started", "Contact", "Sign Up" type items.
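
The CTA rule in the note can be expressed as a small heuristic. This is a sketch only: `is_cta` and `CTA_RE` are hypothetical names, the keyword list is taken from the note, and extract_nav.py may classify items differently.

```python
import re

# Keywords from the note above; extend per site as needed.
CTA_RE = re.compile(r"\b(?:book|get started|contact|sign up)\b", re.IGNORECASE)


def is_cta(label: str) -> int:
    """Return 1 if a nav label looks like a call to action, else 0."""
    normalized = " ".join(label.split())  # collapse repeated whitespace
    return int(bool(CTA_RE.search(normalized)))
```

Word boundaries keep labels such as "Facebook" from matching on "book". Review the result by hand; a site usually wants only one or two CTA items in the header.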

---

## Step 4 — Extract page content

**INPUT:** `$PROJECT/.planning/data/pages.json` + `wpress-extract/`

**CMD:** (choose based on Divi version from Step 2)

Divi 5:
```bash
python3 $SOPS/wp-divi-pipeline-to-am-stack/scripts/extract_divi5.py "$PROJECT/.planning/data/pages.json" "$PROJECT/.planning/data/content/"
```

Divi 4:
```bash
python3 $SOPS/wp-divi-pipeline-to-am-stack/scripts/extract_divi4.py "$PROJECT/.planning/data/pages.json" "$PROJECT/.planning/data/content/"
```

**VERIFY:**
```bash
ls $PROJECT/.planning/data/content/
cat $PROJECT/.planning/data/content/home.json | python3 -m json.tool | head -40
```

Expected: one .json file per page (home.json, about.json, etc.); sections array with type fields visible.

---

## Step 5 — Extract media

**INPUT:** `$PROJECT/.planning/wpress-extract/wp-content/uploads/`

**CMD:**
```bash
python3 $SOPS/wp-divi-pipeline-to-am-stack/scripts/extract_media.py "$PROJECT/.planning/wpress-extract/" "$PROJECT/.planning/data/" "$PROJECT/assets/images/"
```

**VERIFY:**
```bash
ls $PROJECT/assets/images/ | head -10
cat $PROJECT/.planning/data/media-manifest.json | python3 -m json.tool | head -20
```

Expected: .webp files present; media-manifest.json shows `original_url → /assets/images/x.webp` mapping.
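
The manifest exists so that extracted HTML can be pointed at the local WebP files. A minimal sketch of that rewrite, assuming the manifest loads as a flat `{original_url: local_path}` object (the real file's shape is not shown here and may differ); `rewrite_media_urls` is a hypothetical helper:

```python
from typing import Dict


def rewrite_media_urls(html: str, manifest: Dict[str, str]) -> str:
    """Replace each original upload URL in html with its local WebP path."""
    # Longest URLs first, so a URL that is a prefix of another is not replaced early.
    for original in sorted(manifest, key=len, reverse=True):
        html = html.replace(original, manifest[original])
    return html
```

Exact string replacement will not catch WordPress resized variants (for example `hero-300x200.jpg`) unless they appear in the manifest themselves.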

---

## Step 6 — Stage seed_databases.py skeleton

**INPUT:** All .json files in `$PROJECT/.planning/data/content/` + `nav.json` + `media-manifest.json`

**CMD:**
```bash
python3 $SOPS/wp-divi-pipeline-to-am-stack/scripts/stage_seed.py "$PROJECT/.planning/data/" "$PROJECT/build/seed_databases.py" --domain "$DOMAIN"
```

**VERIFY:**
```bash
python3 -c "import ast; ast.parse(open('$PROJECT/build/seed_databases.py').read()); print('syntax OK')"
grep "def seed_pages" $PROJECT/build/seed_databases.py
```

Expected: seed_databases.py is valid Python; contains seed_pages, seed_nav functions.

**NOTE:** Content stubs are in place. A human or agent reviews them and fills in the prose before the script is run.

---

## Step 7 — Review and fill content

**MANUAL:** Open `$PROJECT/build/seed_databases.py`

For each page's `sections_json`:
- [ ] Confirm `hero_h1` and `hero_lead` match the brand (not raw Divi copy-paste)
- [ ] Confirm each section has correct type (see 09-stack-a-output.md mapping)
- [ ] Replace any em-dashes (—) with commas or periods
- [ ] Remove any shortcode residue (`[et_pb_`, `[vc_`, etc.)
- [ ] Ensure no "Netherlands" or location-specific copy if site is global
- [ ] Confirm nav items in `seed_nav()` match final site IA
- [ ] Verify all image paths are `/assets/images/{name}.webp`
- [ ] Verify all CTAs point to correct slugs (`/about`, `/contact`, etc.)

---

## Step 8 — Run seed_databases.py

**CMD:**
```bash
cd $PROJECT && python3 build/seed_databases.py
```

**VERIFY:**
```bash
ls -lh src/api/data/
```

Expected: Output line shows counts > 0: `seeded: pages=N nav=N blog=N ...`. Database files exist.

**BLOCK:** Any count=0 means that seeder function has an error — fix before continuing.
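
Row counts can also be checked directly against the SQLite files, independent of what the seed script prints. A small sketch, assuming it is run from `$PROJECT`; `table_counts` is a hypothetical helper, not part of the pipeline:

```python
import sqlite3
from pathlib import Path


def table_counts(db_path: Path) -> dict:
    """Return {table_name: row_count} for every user table in one SQLite file."""
    conn = sqlite3.connect(db_path)
    try:
        tables = [row[0] for row in conn.execute(
            "SELECT name FROM sqlite_master "
            "WHERE type = 'table' AND name NOT LIKE 'sqlite_%'"
        )]
        return {
            t: conn.execute(f'SELECT COUNT(*) FROM "{t}"').fetchone()[0]
            for t in tables
        }
    finally:
        conn.close()


if __name__ == "__main__":
    for db in sorted(Path("src/api/data").glob("*.sqlite")):
        print(db.name, table_counts(db))
```

An empty dict for a file means the seeder never created its table; a count of 0 means the table exists but no rows were inserted.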

---

## Step 9 — Scaffold PHP templates

**CMD:** Copy reference templates from vibrantyou.yoga as starting point:

```bash
VYOGA="/home/sirdrez/arisingmedia-websites/vibrantyou.yoga"
# cp does not create missing target directories
mkdir -p $PROJECT/src/api/templates $PROJECT/src/api/components $PROJECT/assets
cp $VYOGA/src/api/router.php $PROJECT/src/api/router.php
cp $VYOGA/src/api/contact.php $PROJECT/src/api/contact.php
cp $VYOGA/src/api/templates/static.php $PROJECT/src/api/templates/static.php
cp $VYOGA/src/api/templates/home.php $PROJECT/src/api/templates/home.php
cp $VYOGA/src/api/components/_header.php $PROJECT/src/api/components/_header.php
cp $VYOGA/src/api/components/_footer.php $PROJECT/src/api/components/_footer.php
cp -r $VYOGA/assets/css $PROJECT/assets/
cp -r $VYOGA/assets/js $PROJECT/assets/
cp $VYOGA/Dockerfile $PROJECT/
cp $VYOGA/docker-compose.yml $PROJECT/
cp -r $VYOGA/infra $PROJECT/
```

**VERIFY:**
```bash
php -l $PROJECT/src/api/router.php
```

Expected: `No syntax errors detected`

**NOTE:** Update brand name, colors, and any site-specific logic in templates.

**NOTE:** `_header.php` reads from nav.sqlite, so no hardcoded nav is needed.

---

## Step 10 — Build and test

**CMD:**
```bash
cd $PROJECT && docker compose build --no-cache && docker compose up -d
```

**VERIFY:**
```bash
sleep 5
curl -I http://localhost:8000/
curl -s http://localhost:8000/ | grep -i "title\|h1" | head -3
```

Expected: HTTP 200; site name visible in page.
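
A fixed `sleep 5` can be too short on a cold build. One alternative is to poll until the container answers. This is a sketch using only the standard library; `wait_for_http_ok` is a hypothetical helper and the timeout values are arbitrary:

```python
import time
import urllib.error
import urllib.request


def wait_for_http_ok(url: str, timeout: float = 30.0, interval: float = 0.5) -> bool:
    """Poll url until it answers HTTP 200 or `timeout` seconds have passed."""
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        try:
            with urllib.request.urlopen(url, timeout=interval + 2) as resp:
                if resp.status == 200:
                    return True
        except (urllib.error.URLError, OSError):
            pass  # not accepting connections yet, or a non-2xx response
        time.sleep(interval)
    return False
```

Calling `wait_for_http_ok("http://localhost:8000/")` before the `curl` checks would replace the sleep.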

---

## Step 11 — Protection + SEO check

**CMD:**
```bash
bash /home/sirdrez/arisingmedia-websites/.am-webdesign-sops/tools/verify-protection.sh http://localhost:8000
```

**VERIFY:** Exit 0, no FAIL lines

---

## Step 12 — Lighthouse + cleanup

**MANUAL:**
- Open `http://localhost:8000/` in Chrome or Chromium (the Lighthouse panel is part of Chrome DevTools)
- Run Lighthouse (DevTools > Lighthouse)

**TARGET:**
- Performance >= 90
- SEO >= 95
- Accessibility >= 90

**CLEANUP:**
```bash
cd $PROJECT && docker compose down
```
@@ -0,0 +1,81 @@
# WP + Divi to AM Stack A Pipeline — SOP Index

End-to-end playbook for converting any WordPress / Divi site backup (.wpress)
into an Arising Media Stack A deployment: PHP router + SQLite + vanilla JS/CSS.

## Quick start (CLI launcher)

```bash
python3 scripts/migrate.py --wpress /path/to/backup.wpress --domain example.com
```

Runs phases 0-6 automatically (extract, analyze, nav, content, media, stage seed).
Prints agent breadcrumbs for phases 7-11. See `10-agent-breadcrumbs.md` for the
complete ordered execution checklist.

## SOPs in this folder

| File | Phase | Description |
|------|-------|-------------|
| `00-overview.md` | — | Pipeline overview, philosophy, what to extract vs not replicate |
| `01-wpress-extraction.md` | 1 | .wpress binary format, extraction script, verification |
| `02-database-analysis.md` | 2 | MySQL dump parsing, page inventory, Divi version detection |
| `03-divi-content-extraction.md` | 3 | Divi 4 shortcodes vs Divi 5 blocks, extraction scripts |
| `04-design-system-extraction.md` | 4 | Colors, fonts, spacing → tokens.css |
| `05-content-migration.md` | 5-6 | Section remapping, content staging, seed_databases.py |
| `06-media-assets.md` | 5 | Upload migration, WebP conversion, media manifest |
| `07-seo-preservation.md` | 7 | Redirect map, Rank Math extraction, schema.org |
| `08-run-order.md` | — | DEPRECATED, superseded by `10-agent-breadcrumbs.md` |
| `09-stack-a-output.md` | — | SQLite schemas, sections_json spec, Divi→AM module mapping |
| `10-agent-breadcrumbs.md` | 0-11 | Ordered agent execution checklist (.wpress → live Docker) |

## Scripts in scripts/

| Script | Purpose |
|--------|---------|
| `migrate.py` | CLI launcher: runs phases 0-6, prints breadcrumbs for 7-11 |
| `run_pipeline.sh` | Legacy shell wrapper (pre-migrate.py) |
| `extract_wpress.py` | Unpack .wpress binary archive |
| `analyze_db.py` | Parse SQL dump → pages.json + design-system.json |
| `extract_divi5.py` | Parse Divi 5 blocks → per-page content JSON |
| `extract_nav.py` | Extract WordPress nav menus → nav.json |
| `stage_seed.py` | Map extracted JSON → seed_databases.py skeleton (Phase 6) |

## Key facts about .wpress archives

- Format: custom sequential binary (NOT zip/tar) with 4377-byte headers
- Table prefix in SQL dump: `SERVMASK_PREFIX_` (placeholder, NOT `wp_`)
- Directory layout: flat, with `uploads/`, `themes/`, `plugins/` at archive root (no `wp-content/` wrapper)
- Divi 5 stores theme settings in `et_divi` option as PHP-serialized array
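
To make the header fact concrete, here is a sketch of walking an archive's entries. The 4377-byte header size comes from the list above; the field widths (255-byte name, 14-byte size, 12-byte mtime, 4096-byte path, NUL-padded) are an assumption based on the commonly described layout and should be checked against `extract_wpress.py`. `iter_entries` is a hypothetical helper.

```python
import io
from typing import BinaryIO, Iterator, Tuple

HEADER_SIZE = 4377  # from the key facts above
# Assumed field widths (name, size, mtime, path); they sum to HEADER_SIZE.
NAME_LEN, SIZE_LEN, MTIME_LEN, PATH_LEN = 255, 14, 12, 4096


def _field(raw: bytes) -> str:
    """Decode a NUL-padded header field."""
    return raw.split(b"\x00", 1)[0].decode("utf-8", errors="replace")


def iter_entries(stream: BinaryIO) -> Iterator[Tuple[str, int]]:
    """Yield (relative_path, size) for each file, skipping over file bodies."""
    while True:
        header = stream.read(HEADER_SIZE)
        if len(header) < HEADER_SIZE or not header.strip(b"\x00"):
            return  # truncated input or the all-NUL end-of-archive block
        name = _field(header[:NAME_LEN])
        size = int(_field(header[NAME_LEN:NAME_LEN + SIZE_LEN]) or 0)
        path = _field(header[NAME_LEN + SIZE_LEN + MTIME_LEN:])
        yield (f"{path}/{name}" if path not in ("", ".") else name), size
        stream.seek(size, io.SEEK_CUR)  # jump past this file's content
```

Listing entries this way is a quick check that `database.sql` is present before running a full extraction.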

## vibrantyou.yoga — extracted data reference

Site: Vibrant You Yoga (instructor: Meghan)
Domain: https://vibrantyou.yoga
Divi version: 5.0.3
WP version: 6.9.4

Design system:
- Primary: #1a8a7a Dark: #0f5f53 Secondary: #2ea3f2
- Body: #5a6b68 Headings: #2d2d2d
- Body font: DM Sans 17px / 1.6 lh
- Heading font: DM Serif Display 600 / 36px / 1.2 lh
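
Values like these end up in tokens.css. A minimal sketch of that step, assuming a dict shaped like design-system.json (keys as written by analyze_db.py); the CSS variable names and `render_tokens_css` are hypothetical and not the pipeline's actual naming:

```python
def render_tokens_css(design: dict) -> str:
    """Render design-system values as CSS custom properties in a :root block."""
    # Left: design-system.json key. Right: illustrative CSS variable name.
    mapping = {
        "primary_color": "--color-primary",
        "primary_color_dark": "--color-primary-dark",
        "secondary_color": "--color-secondary",
        "body_color": "--color-body",
        "heading_color": "--color-heading",
        "body_font": "--font-body",
        "heading_font": "--font-heading",
    }
    lines = [
        f"  {var}: {design[key]};"
        for key, var in mapping.items()
        if design.get(key)  # skip missing or empty values
    ]
    return ":root {\n" + "\n".join(lines) + "\n}\n"
```

Missing keys are skipped rather than given defaults, so gaps in the extracted design system stay visible.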

Pages to migrate (22 published):
- home, about, classes, schedule, instructors, contact, blog, faq
- book (private sessions), online-yoga, donate
- Drop: video-category, video-tag, search-videos, user-videos, player-embed,
  categories, tags, my-bookings (all plugin-generated archive pages)

Plugins requiring AM replacements:
- Gravity Forms + Stripe → AM HTML form + Python API + Resend
- Events Manager → static schedule table in /schedule/
- All-in-One Video Gallery → embed YouTube/Vimeo directly or drop

## Related SOPs

- `../01-project-structure.md` — AM deployment directory layout
- `../02-wordpress-to-html-migration.md` — Original 8-phase WP migration playbook
- `../03-build-pipeline.md` — JSON + template stamping for repeated pages
- `../06-seo-meta.md` — Full `<head>` requirements, schema.org per page type
- `../tools/verify-protection.sh` — Post-deploy security audit
Binary file not shown.
@@ -0,0 +1,368 @@
#!/usr/bin/env python3
"""Analyze WordPress MySQL dump from a .wpress extract.

Parses database.sql and outputs:
  - pages.json         : all published pages with title, slug, content, SEO meta
  - design-system.json : colors, fonts from wp_options (Divi theme settings)
  - site-info.json     : domain, WP version, detected Divi version, plugin list

Usage:
    python3 analyze_db.py <extract_dir> <output_data_dir>

    extract_dir     : path to wpress-extract/ (contains database.sql)
    output_data_dir : where to write JSON output files (e.g. .planning/data/)
"""
from __future__ import annotations

import json
import os
import re
import sys
from pathlib import Path
from typing import Any


# ---------------------------------------------------------------------------
# SQL parsing helpers
# ---------------------------------------------------------------------------

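# MySQL backslash escapes; any other escaped character stands for itself.
_SQL_ESCAPES = {"n": "\n", "r": "\r", "t": "\t", "0": "\0", "b": "\b", "Z": "\x1a"}


def _unescape_sql(s: str) -> str:
    """Undo MySQL string escaping in one left-to-right pass.

    Chained str.replace() calls are order-sensitive: an escaped backslash
    followed by the letter n would be decoded twice and become a newline.
    """
    return re.sub(
        r"\\(.)",
        lambda m: _SQL_ESCAPES.get(m.group(1), m.group(1)),
        s,
        flags=re.DOTALL,
    )

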
def _parse_values_block(sql_block: str) -> list[list[str]]:
    """Extract rows from a multi-row INSERT VALUES block.

    Handles commas inside quoted strings via a simple state machine.
    Returns list of rows; each row is a list of raw string values.
    """
    rows: list[list[str]] = []
    # Find VALUES section
    m = re.search(r"VALUES\s*", sql_block, re.IGNORECASE)
    if not m:
        return rows
    rest = sql_block[m.end():]

    i = 0
    n = len(rest)
    while i < n:
        # Skip to '('
        while i < n and rest[i] != '(':
            i += 1
        if i >= n:
            break
        i += 1  # skip '('

        row: list[str] = []
        field = []
        in_quote = False
        quote_char = ''

        while i < n:
            c = rest[i]
            if not in_quote:
                if c in ("'", '"'):
                    in_quote = True
                    quote_char = c
                    i += 1
                    continue
                elif c == ',':
                    row.append("".join(field))
                    field = []
                    i += 1
                    continue
                elif c == ')':
                    row.append("".join(field))
                    field = []
                    rows.append(row)
                    i += 1
                    break
                elif c == 'N' and rest[i:i+4] == 'NULL':
                    field.append('\x00NULL\x00')
                    i += 4
                    continue
                else:
                    field.append(c)
                    i += 1
            else:
                if c == '\\' and i + 1 < n:
                    # Keep the escape pair intact; _unescape_sql decodes it later.
                    field.append(c)
                    field.append(rest[i + 1])
                    i += 2
                    continue
                elif c == quote_char:
                    in_quote = False
                    i += 1
                    continue
                else:
                    field.append(c)
                    i += 1

    return rows


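def _statement_end(sql_text: str, start: int) -> int:
    """Return the index just past the ';' that terminates the statement.

    Scans from `start` and ignores semicolons inside quoted strings, so a
    ');' inside post content (inline JS, CSS) does not cut the INSERT short.
    """
    i, n = start, len(sql_text)
    quote = ""
    while i < n:
        c = sql_text[i]
        if quote:
            if c == "\\":
                i += 2  # skip the escaped character
                continue
            if c == quote:
                quote = ""
        elif c in ("'", '"'):
            quote = c
        elif c == ";":
            return i + 1
        i += 1
    return n


def load_table(sql_text: str, table_name: str) -> list[dict]:
    """Return all rows for table_name as list of dicts."""
    # Find column definition
    col_re = re.compile(
        rf"CREATE TABLE `{re.escape(table_name)}`\s*\((.*?)\)\s*ENGINE",
        re.DOTALL | re.IGNORECASE,
    )
    m = col_re.search(sql_text)
    if not m:
        return []
    col_block = m.group(1)
    # A column whose type is not listed here is skipped, which would shift
    # every later value in the row, so tinytext/tinyblob/timestamp are included.
    cols = re.findall(
        r"`([^`]+)`\s+(?:bigint|int|mediumint|smallint|tinyint|varchar|text"
        r"|tinytext|mediumtext|longtext|char|datetime|date|timestamp|float|double"
        r"|decimal|enum|set|blob|tinyblob|mediumblob|longblob)",
        col_block,
        re.IGNORECASE,
    )

    # Find each INSERT for this table, then scan to its real terminating ';'
    # (a non-greedy regex up to the first ');' drops rows whose content has one).
    insert_re = re.compile(
        rf"INSERT INTO `{re.escape(table_name)}`\s+VALUES\s*\(",
        re.IGNORECASE,
    )
    rows_out: list[dict] = []
    pos = 0
    while True:
        start = insert_re.search(sql_text, pos)
        if not start:
            break
        pos = _statement_end(sql_text, start.end() - 1)
        for row in _parse_values_block(sql_text[start.start():pos]):
            d: dict[str, Any] = {}
            for idx, col in enumerate(cols):
                val = row[idx] if idx < len(row) else ""
                if val == "\x00NULL\x00":
                    d[col] = None
                else:
                    d[col] = _unescape_sql(val)
            rows_out.append(d)
    return rows_out

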
# ---------------------------------------------------------------------------
# Divi version detection
# ---------------------------------------------------------------------------

def detect_divi_version(sql_text: str) -> str:
    if "wp:divi/" in sql_text:
        return "5"
    if "[et_pb_section" in sql_text:
        return "4"
    # Check et_theme_builder version in options
    m = re.search(r"'et_theme_builder_api_version','([^']+)'", sql_text)
    if m:
        return "5"
    return "unknown"


# ---------------------------------------------------------------------------
# Options extraction
# ---------------------------------------------------------------------------

def load_options(sql_text: str, prefix: str = "wp_") -> dict[str, str]:
    table = f"{prefix}options"
    rows = load_table(sql_text, table)
    return {r["option_name"]: r["option_value"] for r in rows if r.get("option_name")}


def _parse_php_serialized_pairs(raw: str) -> dict[str, str]:
    """Extract key/value string pairs from a PHP-serialized array.

    Handles both escaped (SQL-dump) and unescaped forms.
    Only returns s->s pairs (string key, string value).
    """
    result: dict[str, str] = {}
    # SQL dumps escape double-quotes as \\", giving patterns like:
    #   s:9:\\"body_font\\";s:7:\\"DM Sans\\";
    # Also handle unescaped form: s:9:"body_font";s:7:"DM Sans";
    pat = re.compile(
        r's:\d+:\\"([^"\\]+)\\";s:\d+:\\"([^"\\]*)\\"'  # SQL-escaped
        r'|s:\d+:"([^"]+)";s:\d+:"([^"]*)"',  # plain
    )
    for m in pat.finditer(raw):
        if m.group(1) is not None:
            k, v = m.group(1), m.group(2)
        else:
            k, v = m.group(3), m.group(4)
        result[k] = v
    return result


def extract_design_system(options: dict[str, str]) -> dict:
    """Pull Divi theme colors, fonts, and spacing from wp_options."""
    raw = options.get("et_divi", "") or options.get("et_divi_options", "")

    design: dict[str, Any] = {}

    # Parse PHP-serialized et_divi option (Divi 4 + 5 store settings here)
    if raw:
        pairs = _parse_php_serialized_pairs(raw)
        # Map Divi option keys to design-system keys
        key_map = {
            "accent_color": "primary_color_dark",
            "link_color": "primary_color",
            "body_font": "body_font",
            "heading_font": "heading_font",
            "header_font": "heading_font",  # Divi 4 alias
            "body_font_size": "body_font_size",
            "body_line_height": "body_line_height",
            "heading_font_weight": "heading_font_weight",
            "header_text_size": "heading_font_size",
            "header_line_height": "heading_line_height",
            "header_color": "heading_color",
            "font_color": "body_color",
            "secondary_accent_color": "secondary_color",
        }
        for divi_key, design_key in key_map.items():
            if divi_key in pairs:
                # setdefault: the first alias found wins (heading_font over header_font)
                design.setdefault(design_key, pairs[divi_key])

    # Site info
    design["site_url"] = options.get("siteurl", "")
    design["site_name"] = options.get("blogname", "")

    return design


# ---------------------------------------------------------------------------
# Page extraction
# ---------------------------------------------------------------------------

def extract_pages(sql_text: str, prefix: str = "wp_") -> list[dict]:
    """Return all published pages and posts with SEO meta."""
    posts = load_table(sql_text, f"{prefix}posts")
    postmeta = load_table(sql_text, f"{prefix}postmeta")

    # Build postmeta lookup: post_id -> {meta_key: meta_value}
    meta_map: dict[str, dict[str, str]] = {}
    for row in postmeta:
        pid = str(row.get("post_id", ""))
        meta_map.setdefault(pid, {})[row.get("meta_key", "")] = row.get("meta_value", "")

    pages = []
    for p in posts:
        if p.get("post_status") not in ("publish",):
            continue
        post_type = p.get("post_type", "")
        if post_type not in ("page", "post", "event"):
            continue

        pid = str(p.get("ID", ""))
        meta = meta_map.get(pid, {})

        # Rank Math SEO fields
        rm_title = meta.get("rank_math_title", "")
        rm_desc = meta.get("rank_math_description", "")
        rm_focus = meta.get("rank_math_focus_keyword", "")

        entry = {
            "id": pid,
            "post_type": post_type,
            "slug": p.get("post_name", ""),
            "title": p.get("post_title", ""),
            "status": p.get("post_status", ""),
            "date": p.get("post_date", "")[:10],
            "modified": p.get("post_modified", "")[:10],
            "content_raw": p.get("post_content", ""),
            "excerpt": p.get("post_excerpt", ""),
            "parent_id": p.get("post_parent", "0"),
            "menu_order": p.get("menu_order", "0"),
            "seo_title": rm_title,
            "seo_description": rm_desc,
            "seo_keywords": rm_focus,
            # Remaining public meta: skip private (_), Rank Math, and Divi (et_) keys
            "acf": {
                k: v for k, v in meta.items()
                if not k.startswith("_")
                and not k.startswith("rank_math")
                and not k.startswith("et_")
            },
        }
        pages.append(entry)

    pages.sort(key=lambda x: int(x["menu_order"] or 0))
    return pages


# ---------------------------------------------------------------------------
# Main
# ---------------------------------------------------------------------------

def main():
    if len(sys.argv) < 3:
        print(f"Usage: {sys.argv[0]} <extract_dir> <output_data_dir>")
        sys.exit(1)

    extract_dir = Path(sys.argv[1])
    out_dir = Path(sys.argv[2])
    out_dir.mkdir(parents=True, exist_ok=True)

    sql_file = extract_dir / "database.sql"
    if not sql_file.exists():
        # Search for it
        found = list(extract_dir.rglob("*.sql"))
        if not found:
            print(f"ERROR: No .sql file found under {extract_dir}")
            sys.exit(1)
        sql_file = found[0]
        print(f"Found SQL at: {sql_file}")

    print(f"Loading {sql_file} ({sql_file.stat().st_size / 1024 / 1024:.1f} MB)...")
    sql_text = sql_file.read_text(encoding="utf-8", errors="replace")

    # Detect Divi version
    divi_version = detect_divi_version(sql_text)
    print(f"Divi version detected: {divi_version}")

    # Load package.json (archive metadata: prefix, versions, plugins)
    pkg = {}
    pkg_file = extract_dir / "package.json"
    if pkg_file.exists():
        pkg = json.loads(pkg_file.read_text())

    # All-in-One WP Migration dumps use SERVMASK_PREFIX_ as a placeholder in
    # the SQL file. Detect which prefix the dump actually uses.
    if "SERVMASK_PREFIX_" in sql_text:
        sql_prefix = "SERVMASK_PREFIX_"
    else:
        sql_prefix = pkg.get("Database", {}).get("Prefix", "wp_")
    runtime_prefix = pkg.get("Database", {}).get("Prefix", "wp_")
    print(f"SQL prefix: {sql_prefix!r} (runtime prefix: {runtime_prefix!r})")

    options = load_options(sql_text, sql_prefix)
    print(f"Loaded {len(options)} options")

    # Design system
    design = extract_design_system(options)
    design["divi_version"] = divi_version
    design["wp_version"] = pkg.get("WordPress", {}).get("Version", "")
    design["plugins"] = pkg.get("Plugins", [])
    (out_dir / "design-system.json").write_text(json.dumps(design, indent=2, ensure_ascii=False))
    print(f"Wrote design-system.json ({len(design)} keys)")

    # Pages
    pages = extract_pages(sql_text, sql_prefix)
    (out_dir / "pages.json").write_text(json.dumps(pages, indent=2, ensure_ascii=False))
    print(f"Wrote pages.json ({len(pages)} pages/posts)")

    # Site info summary
    site_info = {
        "domain": pkg.get("SiteURL", options.get("siteurl", "")),
        "name": options.get("blogname", ""),
        "tagline": options.get("blogdescription", ""),
        "admin_email": options.get("admin_email", ""),
        "wp_version": pkg.get("WordPress", {}).get("Version", ""),
        "divi_version": divi_version,
        "plugins": pkg.get("Plugins", []),
        "prefix": runtime_prefix,
        "total_pages": len([p for p in pages if p["post_type"] == "page"]),
        "total_posts": len([p for p in pages if p["post_type"] == "post"]),
    }
    (out_dir / "site-info.json").write_text(json.dumps(site_info, indent=2, ensure_ascii=False))
    print("Wrote site-info.json")

    print(f"\nDone. Output in: {out_dir}")
    print(f"  pages.json        : {len(pages)} entries")
    print(f"  design-system.json: {len(design)} keys")
    print("  site-info.json    : done")


if __name__ == "__main__":
    main()
@@ -0,0 +1,271 @@
#!/usr/bin/env python3
"""Extract content from Divi 5 block markup in pages.json.

Reads .planning/data/pages.json (produced by analyze_db.py) and for each page
parses the `content_raw` Divi 5 block structure into a clean per-page JSON
under .planning/data/content/{slug}.json.

Usage:
    python3 extract_divi5.py <pages_json> <output_dir>

    pages_json : path to .planning/data/pages.json
    output_dir : directory to write {slug}.json files (created if missing)
"""
from __future__ import annotations

import json
import re
import sys
from pathlib import Path
from html.parser import HTMLParser


# ---------------------------------------------------------------------------
# HTML inner-text extractor
# ---------------------------------------------------------------------------

class _TextExtractor(HTMLParser):
    """Collect the text nodes of an HTML fragment, ignoring tags."""

    def __init__(self):
        super().__init__()
        self.parts: list[str] = []

    def handle_data(self, data: str):
        self.parts.append(data)

    def get_text(self) -> str:
        return " ".join(self.parts).strip()


def _text(html: str) -> str:
    p = _TextExtractor()
    p.feed(html)
    return p.get_text()


# ---------------------------------------------------------------------------
# Divi block parsing
# ---------------------------------------------------------------------------

# Matches opening block comment: <!-- wp:divi/MODULE {JSON} -->
_BLOCK_OPEN = re.compile(r"<!--\s*wp:(divi/[a-z0-9_-]+)\s*(.*?)--?>", re.DOTALL)
# Matches closing block comment: <!-- /wp:divi/MODULE -->
_BLOCK_CLOSE = re.compile(r"<!--\s*/wp:(divi/[a-z0-9_-]+)\s*-->")

# Strip et_pb_* class tokens and data-et-* attributes
_ET_CLASS = re.compile(r"\b(et_pb_[a-z0-9_-]+|divi-[a-z0-9_-]+-[a-z0-9_-]+|d5_[a-z0-9_-]+)\b", re.IGNORECASE)
_ET_ATTR = re.compile(r'\s+data-(?:et|builder|module-id|module-class|d5)-[a-z0-9_-]+\s*=\s*"[^"]*"', re.IGNORECASE)
_EMPTY_CL = re.compile(r'\s+class="\s*"')


def _clean(html: str) -> str:
    """Strip Divi noise from an HTML fragment."""
    out = _BLOCK_OPEN.sub("", html)
    out = _BLOCK_CLOSE.sub("", out)
    out = _ET_ATTR.sub("", out)
    out = _ET_CLASS.sub("", out)
    out = _EMPTY_CL.sub("", out)
    out = re.sub(r"\n{3,}", "\n\n", out)
    return out.strip()


def _parse_attrs(raw_json: str) -> dict:
    """Parse the JSON attrs blob from a block comment (may be empty)."""
    # For self-closing blocks (<!-- wp:divi/x {...} /-->) the text captured by
    # _BLOCK_OPEN ends with '/', which would make json.loads fail.
    raw_json = raw_json.strip().rstrip("/").strip()
    if not raw_json:
        return {}
    try:
        return json.loads(raw_json)
    except Exception:
        return {}


def _extract_inner(content: str, block_type: str) -> str:
    """Return the raw inner HTML of the first matching block."""
    open_pat = re.compile(rf"<!--\s*wp:{re.escape(block_type)}[^>]*-->", re.DOTALL)
    close_pat = re.compile(rf"<!--\s*/wp:{re.escape(block_type)}\s*-->")
    m = open_pat.search(content)
    if not m:
        return ""
    start = m.end()
    m2 = close_pat.search(content, start)
    end = m2.start() if m2 else len(content)
    return content[start:end]


def _bg_color(attrs: dict) -> str:
    """Extract background colour from Divi 5 attrs dict."""
    bg = attrs.get("backgroundColor", {})
    if isinstance(bg, dict):
        return bg.get("value", bg.get("color", ""))
    return str(bg) if bg else ""


def _section_type(bg: str) -> str:
    """Classify section by background colour."""
    dark_colors = {"#0f5f53", "#1a3a34", "#0d4d42"}
    brand_colors = {"#1a8a7a", "#20a090"}
    light_colors = {"#f5f5f5", "#fafafa", "#f0f0f0", "#efefef"}
    bg_lower = bg.lower().strip()
    if bg_lower in dark_colors:
        return "dark"
    if bg_lower in brand_colors:
        return "brand"
    if bg_lower in light_colors:
        return "light"
    if bg_lower in ("#ffffff", "#fff", ""):
        return "white"
    return "custom"


# ---------------------------------------------------------------------------
|
||||
# Section/module extraction
|
||||
# ---------------------------------------------------------------------------
|
||||
|
||||
def _extract_modules(section_html: str) -> list[dict]:
|
||||
"""Walk block comments inside a section and extract module data."""
|
||||
modules: list[dict] = []
|
||||
pos = 0
|
||||
content = section_html
|
||||
|
||||
for m in _BLOCK_OPEN.finditer(content):
|
||||
block_type = m.group(1) # e.g. "divi/text"
|
||||
attrs = _parse_attrs(m.group(2))
|
||||
inner_start = m.end()
|
||||
|
||||
# Find matching close tag
|
||||
close_pat = re.compile(rf"<!--\s*/wp:{re.escape(block_type)}\s*-->")
|
||||
close_m = close_pat.search(content, inner_start)
|
||||
inner_html = content[inner_start : close_m.start() if close_m else len(content)]
|
||||
clean_inner = _clean(inner_html)
|
||||
|
||||
module_type = block_type.split("/")[-1] # "text", "button", "image", etc.
|
||||
|
||||
mod: dict = {"module": module_type}
|
||||
|
||||
if module_type == "text":
|
||||
mod["html"] = clean_inner
|
||||
mod["text"] = _text(clean_inner)
|
||||
|
||||
elif module_type in ("button", "cta"):
|
||||
mod["text"] = attrs.get("buttonText", _text(clean_inner))
|
||||
mod["url"] = attrs.get("buttonUrl", attrs.get("url", "#"))
|
||||
|
||||
elif module_type == "image":
|
||||
src = attrs.get("src", attrs.get("url", ""))
|
||||
mod["src"] = src
|
||||
mod["alt"] = attrs.get("altText", attrs.get("alt", ""))
|
||||
mod["caption"] = attrs.get("caption", "")
|
||||
|
||||
elif module_type == "blurb":
|
||||
mod["title"] = attrs.get("title", "")
|
||||
mod["icon"] = attrs.get("iconName", "")
|
||||
mod["html"] = clean_inner
|
||||
mod["text"] = _text(clean_inner)
|
||||
|
||||
elif module_type == "testimonial":
|
||||
mod["quote"] = attrs.get("content", _text(clean_inner))
|
||||
mod["author"] = attrs.get("authorName", "")
|
||||
mod["company"] = attrs.get("authorJobTitle", "")
|
||||
|
||||
elif module_type == "video":
|
||||
mod["src"] = attrs.get("src", "")
|
||||
mod["poster"] = attrs.get("poster", attrs.get("image", ""))
|
||||
|
||||
elif module_type in ("accordion", "toggle"):
|
||||
items = re.findall(r"<dt[^>]*>(.*?)</dt>\s*<dd[^>]*>(.*?)</dd>", clean_inner, re.DOTALL)
|
||||
mod["items"] = [{"q": q.strip(), "a": a.strip()} for q, a in items]
|
||||
|
||||
elif module_type == "contact_form":
|
||||
mod["form_id"] = attrs.get("formId", "")
|
||||
mod["note"] = "REPLACE with AM vanilla form — see 08-forms.md"
|
||||
|
||||
else:
|
||||
mod["html"] = clean_inner
|
||||
mod["attrs"] = attrs
|
||||
|
||||
modules.append(mod)
|
||||
|
||||
return modules
|
||||
|
||||
|
||||
def parse_page_content(content_raw: str) -> list[dict]:
|
||||
"""Parse Divi 5 block content into a list of section dicts."""
|
||||
sections: list[dict] = []
|
||||
|
||||
section_pat = re.compile(r"<!--\s*wp:divi/section(.*?)-->", re.DOTALL)
|
||||
section_close = re.compile(r"<!--\s*/wp:divi/section\s*-->")
|
||||
|
||||
for sm in section_pat.finditer(content_raw):
|
||||
attrs = _parse_attrs(sm.group(1).strip())
|
||||
start = sm.end()
|
||||
close_m = section_close.search(content_raw, start)
|
||||
sec_html = content_raw[start : close_m.start() if close_m else len(content_raw)]
|
||||
|
||||
bg = _bg_color(attrs)
|
||||
sec_type = _section_type(bg)
|
||||
modules = _extract_modules(sec_html)
|
||||
|
||||
# Determine semantic role from first module
|
||||
role = "content"
|
||||
if modules and modules[0]["module"] in ("fullwidth_header", "text"):
|
||||
first_html = modules[0].get("html", "")
|
||||
if "<h1" in first_html:
|
||||
role = "hero"
|
||||
|
||||
sections.append({
|
||||
"role": role,
|
||||
"section_type": sec_type,
|
||||
"background_color": bg,
|
||||
"attrs": attrs,
|
||||
"modules": modules,
|
||||
})
|
||||
|
||||
return sections
|
||||
|
||||
|
||||
# ---------------------------------------------------------------------------
|
||||
# Main
|
||||
# ---------------------------------------------------------------------------
|
||||
|
||||
def main():
    if len(sys.argv) < 3:
        print(f"Usage: {sys.argv[0]} <pages_json> <output_dir>")
        sys.exit(1)

    pages_path = Path(sys.argv[1])
    out_dir = Path(sys.argv[2])
    out_dir.mkdir(parents=True, exist_ok=True)

    pages = json.loads(pages_path.read_text(encoding="utf-8"))
    print(f"Processing {len(pages)} pages...")

    for page in pages:
        slug = page.get("slug") or f"page-{page['id']}"
        # content_raw may be missing or null in pages.json
        content = page.get("content_raw") or ""

        sections = parse_page_content(content) if content.strip() else []

        output = {
            "id": page["id"],
            "slug": slug,
            "title": page["title"],
            "post_type": page["post_type"],
            "seo_title": page.get("seo_title", ""),
            "seo_description": page.get("seo_description", ""),
            "seo_keywords": page.get("seo_keywords", ""),
            "acf": page.get("acf", {}),
            "date": page.get("date", ""),
            "modified": page.get("modified", ""),
            "sections": sections,
            "section_count": len(sections),
        }

        out_file = out_dir / f"{slug}.json"
        # ensure_ascii=False emits raw non-ASCII text, so write UTF-8 explicitly
        # instead of relying on the platform default encoding.
        out_file.write_text(json.dumps(output, indent=2, ensure_ascii=False), encoding="utf-8")
        print(f"  {slug}.json  ({len(sections)} sections)")

    print(f"\nDone. {len(pages)} content files in {out_dir}")
|
||||
|
||||
|
||||
if __name__ == "__main__":
|
||||
main()
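
Note on block matching: `_extract_inner`, `_extract_modules`, and `parse_page_content` all stop at the first closing comment of the requested type, so a block nested inside another block of the same type would be cut short. Below is a minimal sketch of a depth-aware search. It assumes closing comments look like `<!-- /wp:TYPE -->` and that self-closing blocks end in `/-->`; `find_block_end` is a hypothetical helper and is not part of the script above.

```python
import re


def find_block_end(content: str, block_type: str, start: int) -> int:
    """Return the index of the closing comment that matches the block
    whose opening comment ends at `start` (len(content) if none is found)."""
    # Matches both opening and closing comments of exactly this block type.
    tag = re.compile(
        rf"<!--\s*(/?)wp:{re.escape(block_type)}(?=\s|-->)(.*?)-->",
        re.DOTALL,
    )
    depth = 1
    for m in tag.finditer(content, start):
        if m.group(1):
            # Closing comment: one level up.
            depth -= 1
            if depth == 0:
                return m.start()
        elif not m.group(2).rstrip().endswith("/"):
            # Opening comment that is not self-closing: one level down.
            depth += 1
    return len(content)
```

With this helper, the inner HTML of a block would be `content[start:find_block_end(content, block_type, start)]`, which keeps nested blocks of the same type intact.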
|
||||
@@ -0,0 +1,99 @@
|
||||
#!/usr/bin/env python3
|
||||
"""
|
||||
extract_nav.py — Extract WordPress navigation menus from database.sql dump.
|
||||
Outputs nav.json: [{label, href, display_order, is_cta}]
|
||||
|
||||
Usage: python3 extract_nav.py <wpress-extract-dir> <output-data-dir>
|
||||
"""
|
||||
import sys, re, json, os
|
||||
|
||||
CTA_KEYWORDS = {'book', 'get started', 'contact', 'sign up', 'register', 'join', 'buy', 'shop'}
|
||||
|
||||
def extract_nav(extract_dir: str, data_dir: str):
    sql_path = os.path.join(extract_dir, 'database.sql')
    if not os.path.exists(sql_path):
        print(f"ERROR: {sql_path} not found", file=sys.stderr)
        sys.exit(1)

    with open(sql_path, encoding='utf-8', errors='replace') as f:
        sql = f.read()

    # Detect table prefix
    prefix_match = re.search(r"INSERT INTO `(\w+)options`", sql)
    prefix = prefix_match.group(1) if prefix_match else 'wp_'

    # Find nav menu items: post_type = 'nav_menu_item'
    # Extract INSERT rows from wp_posts
    posts_pattern = re.compile(
        r"INSERT INTO `%sposts`[^;]+?;" % re.escape(prefix),
        re.DOTALL | re.IGNORECASE
    )
    postmeta_pattern = re.compile(
        r"INSERT INTO `%spostmeta`[^;]+?;" % re.escape(prefix),
        re.DOTALL | re.IGNORECASE
    )

    # Row pattern for the stock wp_posts column order (an assumption about the
    # dump). Numeric values may or may not be quoted.
    nav_row = re.compile(r"""
        \(\s*'?(\d+)'?          # ID
        ,\s*'?\d+'?             # post_author
        (?:,\s*'[^']*'){3}      # post_date, post_date_gmt, post_content
        ,\s*'([^']*)'           # post_title
        ,\s*'[^']*'             # post_excerpt
        ,\s*'([^']*)'           # post_status
        (?:,\s*'[^']*'){9}      # comment_status ... post_content_filtered
        ,\s*'?\d+'?             # post_parent
        ,\s*'[^']*'             # guid
        ,\s*'?(\d+)'?           # menu_order
        ,\s*'nav_menu_item'     # post_type
    """, re.VERBOSE)

    nav_posts = {}
    for m in posts_pattern.finditer(sql):
        for post_id, post_title, post_status, menu_order in nav_row.findall(m.group()):
            if post_status == 'publish':
                nav_posts[post_id] = {'label': post_title, 'href': '/', 'menu_order': int(menu_order)}

    if not nav_posts:
        # Fallback: simpler pattern. It pairs IDs and short quoted strings by
        # position, so the result is a rough guess and needs manual review.
        print("WARNING: no nav rows matched the strict pattern; using heuristic fallback", file=sys.stderr)
        for m in posts_pattern.finditer(sql):
            block = m.group()
            if "'nav_menu_item'" not in block:
                continue
            ids = re.findall(r"\((\d+),", block)
            titles = re.findall(r"'([^']{1,60})'", block)
            for i, post_id in enumerate(ids):
                if i < len(titles) and titles[i]:
                    nav_posts[post_id] = {'label': titles[i], 'href': '/', 'menu_order': i}

    # Extract menu item URLs from postmeta (_menu_item_url).
    # Columns are (meta_id, post_id, meta_key, meta_value): capture post_id,
    # not meta_id. Menu order is read from the posts row above, because
    # WordPress keeps it in posts.menu_order rather than in postmeta.
    for m in postmeta_pattern.finditer(sql):
        block = m.group()
        url_matches = re.findall(r"\(\s*'?\d+'?,\s*'?(\d+)'?,\s*'_menu_item_url',\s*'([^']*)'\)", block)
        for post_id, url in url_matches:
            if post_id in nav_posts and url:
                nav_posts[post_id]['href'] = url

    # Clean up hrefs. The scheme and host are stripped from every absolute URL,
    # so links to other sites become local paths too; review nav.json for those.
    items = []
    for post_id, item in sorted(nav_posts.items(), key=lambda x: x[1].get('menu_order', 0)):
        label = item['label'].strip()
        href = item['href'].strip()
        if not label:
            # Items that link to a page or category usually store an empty
            # title and URL here; they are skipped and must be added by hand.
            continue
        # Make relative
        href = re.sub(r'https?://[^/]+', '', href) or '/'
        if not href.startswith('/'):
            href = '/' + href
        is_cta = 1 if any(kw in label.lower() for kw in CTA_KEYWORDS) else 0
        items.append({
            'label': label,
            'href': href,
            # Number only the items that are kept, so there are no gaps.
            'display_order': len(items) + 1,
            'is_cta': is_cta
        })

    os.makedirs(data_dir, exist_ok=True)
    out_path = os.path.join(data_dir, 'nav.json')
    with open(out_path, 'w', encoding='utf-8') as f:
        json.dump(items, f, indent=2, ensure_ascii=False)

    print(f"nav.json: {len(items)} items → {out_path}")
    for item in items:
        print(f"  {'[CTA]' if item['is_cta'] else '     '} {item['label']} → {item['href']}")
|
||||
|
||||
if __name__ == '__main__':
|
||||
if len(sys.argv) != 3:
|
||||
print("Usage: python3 extract_nav.py <wpress-extract-dir> <output-data-dir>")
|
||||
sys.exit(1)
|
||||
extract_nav(sys.argv[1], sys.argv[2])
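
Note on row parsing: the `'[^']*'` patterns used above stop at the first quote, so a label containing an escaped quote or a comma inside a string can throw a row off. A small tokenizer over the `VALUES` list is one way around that. The sketch below is an illustration only: `split_sql_rows` and `ESCAPES` are hypothetical names, and it assumes MySQL-style dumps with backslash escapes or doubled quotes.

```python
from typing import List, Optional

# Backslash escapes handled here; any other escaped character is kept as-is.
ESCAPES = {"n": "\n", "r": "\r", "t": "\t", "0": "\0"}


def split_sql_rows(values: str) -> List[List[Optional[str]]]:
    """Split "(1,'a'),(2,NULL)" into rows of field strings (None for NULL)."""
    rows: List[List[Optional[str]]] = []
    row: List[Optional[str]] = []
    buf: List[str] = []
    in_row = in_str = quoted = False

    i = 0
    while i < len(values):
        ch = values[i]
        if in_str:
            if ch == "\\" and i + 1 < len(values):          # backslash escape
                nxt = values[i + 1]
                buf.append(ESCAPES.get(nxt, nxt))
                i += 2
                continue
            if ch == "'" and values[i + 1 : i + 2] == "'":  # doubled quote
                buf.append("'")
                i += 2
                continue
            if ch == "'":
                in_str = False                              # end of string
            else:
                buf.append(ch)
        elif not in_row:
            if ch == "(":                                   # start of a row
                in_row, row, buf, quoted = True, [], [], False
        elif ch == "'":
            in_str = quoted = True
        elif ch in ",)":                                    # end of a field
            text = "".join(buf)
            if quoted:
                row.append(text)
            else:
                row.append(None if text.upper() == "NULL" else text)
            buf, quoted = [], False
            if ch == ")":                                   # end of the row
                rows.append(row)
                in_row = False
        elif not ch.isspace():
            buf.append(ch)
        i += 1
    return rows
```

Each row then comes back as a plain list, so a column such as `post_type` can be read by position instead of being matched by one long pattern.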
|
||||
@@ -0,0 +1,110 @@
|
||||
#!/usr/bin/env python3
|
||||
"""Extract All-in-One WP Migration .wpress archive.
|
||||
|
||||
Usage:
|
||||
python3 extract_wpress.py <path/to/file.wpress> <output/directory>
|
||||
|
||||
The .wpress format is a sequential binary archive with 4377-byte headers:
|
||||
255 bytes filename (null-padded)
|
||||
14 bytes file size in bytes (ASCII digits, null-padded)
|
||||
12 bytes mtime unix timestamp (ASCII digits, null-padded)
|
||||
4096 bytes relative path (null-padded)
|
||||
Followed immediately by the raw file bytes, then the next header.
|
||||
"""
|
||||
import os
|
||||
import sys
|
||||
import argparse
|
||||
from pathlib import Path
|
||||
|
||||
HEADER_SIZE = 4377
|
||||
NAME_LEN = 255
|
||||
SIZE_LEN = 14
|
||||
MTIME_LEN = 12
|
||||
PATH_LEN = 4096
|
||||
|
||||
|
||||
def _parse_int(b: bytes) -> int:
|
||||
s = b.split(b"\x00", 1)[0].decode(errors="replace").strip()
|
||||
return int(s) if s else 0
|
||||
|
||||
|
||||
def _parse_str(b: bytes) -> str:
|
||||
return b.split(b"\x00", 1)[0].decode(errors="replace")
|
||||
|
||||
|
||||
def extract(wpress_path: str, out_dir: str, verbose: bool = True) -> dict:
    out = Path(out_dir)
    out.mkdir(parents=True, exist_ok=True)
    root = out.resolve()
    count = 0
    total_bytes = 0
    skipped = 0

    with open(wpress_path, "rb") as f:
        while True:
            header = f.read(HEADER_SIZE)
            if not header or len(header) < HEADER_SIZE:
                break
            if header == b"\x00" * HEADER_SIZE:
                break

            name = _parse_str(header[0:NAME_LEN])
            size = _parse_int(header[NAME_LEN : NAME_LEN + SIZE_LEN])
            mtime = _parse_int(header[NAME_LEN + SIZE_LEN : NAME_LEN + SIZE_LEN + MTIME_LEN])
            path = _parse_str(header[NAME_LEN + SIZE_LEN + MTIME_LEN : NAME_LEN + SIZE_LEN + MTIME_LEN + PATH_LEN])

            if not name:
                skipped += 1
                f.seek(size, 1)
                continue

            # Sanitise path traversal: resolve the target and skip any entry
            # that would land outside the output directory ("..", absolute paths).
            dest_file = (root / path.lstrip("/\\") / name).resolve()
            if root not in dest_file.parents:
                print(f"  SKIP unsafe path: {path}/{name}", file=sys.stderr)
                skipped += 1
                f.seek(size, 1)
                continue
            dest_file.parent.mkdir(parents=True, exist_ok=True)

            with open(dest_file, "wb") as o:
                remaining = size
                while remaining > 0:
                    chunk = f.read(min(65536, remaining))
                    if not chunk:
                        break
                    o.write(chunk)
                    remaining -= len(chunk)
            if remaining > 0:
                # The archive ended before the declared file size was reached.
                print(f"  WARNING: archive truncated inside {name} ({remaining} bytes missing)", file=sys.stderr)

            try:
                if mtime > 0:
                    os.utime(dest_file, (mtime, mtime))
            except Exception:
                pass

            count += 1
            total_bytes += size

            if verbose and count % 200 == 0:
                print(f"  [{count} files | {total_bytes / 1024 / 1024:.1f} MB extracted]", flush=True)

    result = {
        "files": count,
        "bytes": total_bytes,
        "mb": round(total_bytes / 1024 / 1024, 1),
        "skipped": skipped,
        "out_dir": str(out),
    }
    print(f"DONE: {count} files | {result['mb']} MB -> {out_dir} (skipped {skipped})")
    return result
|
||||
|
||||
|
||||
def main():
|
||||
p = argparse.ArgumentParser(description="Extract .wpress archive")
|
||||
p.add_argument("wpress", help="Path to .wpress file")
|
||||
p.add_argument("outdir", help="Destination directory")
|
||||
p.add_argument("-q", "--quiet", action="store_true", help="Suppress progress output")
|
||||
args = p.parse_args()
|
||||
extract(args.wpress, args.outdir, verbose=not args.quiet)
|
||||
|
||||
|
||||
if __name__ == "__main__":
|
||||
main()
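
Note on the header layout: the four fixed-width fields described in the docstring map directly onto a `struct` format string, which gives a compact way to check the 4377-byte size and to build test headers. This is a sketch for illustration; `pack_header` and `unpack_header` are hypothetical helpers, not part of the script above.

```python
import struct

# name, size, mtime, path: 255 + 14 + 12 + 4096 = 4377 bytes, no padding.
HEADER = struct.Struct("=255s14s12s4096s")


def pack_header(name: str, size: int, mtime: int, path: str) -> bytes:
    """Build one header; struct pads each field with NUL bytes."""
    return HEADER.pack(
        name.encode(), str(size).encode(), str(mtime).encode(), path.encode()
    )


def unpack_header(header: bytes) -> dict:
    """Split a 4377-byte header into its four fields."""

    def text(raw: bytes) -> str:
        # Fields are NUL-padded; keep only the part before the first NUL.
        return raw.split(b"\x00", 1)[0].decode(errors="replace")

    name, size, mtime, path = HEADER.unpack(header)
    return {
        "name": text(name),
        "size": int(text(size) or 0),
        "mtime": int(text(mtime) or 0),
        "path": text(path),
    }
```

A header built with `pack_header`, followed by the file bytes, is enough to assemble a tiny in-memory archive for exercising `extract()` without a real backup.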
|
||||
@@ -0,0 +1,149 @@
|
||||
#!/usr/bin/env python3
|
||||
"""
|
||||
migrate.py — AM Stack A migration launcher.
|
||||
Points at a .wpress file and runs all extraction phases automatically.
|
||||
Phases 7+ require human/agent review of staged seed_databases.py.
|
||||
|
||||
Usage:
|
||||
python3 migrate.py --wpress /path/to/backup.wpress --domain example.com [--project /path/to/project]
|
||||
|
||||
Output:
|
||||
Runs phases 0-6, then prints agent breadcrumbs for phases 7-11.
|
||||
"""
|
||||
import argparse, os, sys, subprocess, json
|
||||
|
||||
SOPS = os.path.dirname(os.path.dirname(os.path.abspath(__file__)))
|
||||
SCRIPTS = os.path.join(SOPS, 'scripts')
|
||||
|
||||
def run(cmd: list, label: str) -> bool:
|
||||
print(f"\n[{label}] Running: {' '.join(cmd)}")
|
||||
result = subprocess.run(cmd, capture_output=False)
|
||||
if result.returncode != 0:
|
||||
print(f"[{label}] FAILED (exit {result.returncode})")
|
||||
return False
|
||||
print(f"[{label}] OK")
|
||||
return True
|
||||
|
||||
def phase_header(n: int, title: str):
|
||||
print(f"\n{'='*60}")
|
||||
print(f" Phase {n} — {title}")
|
||||
print(f"{'='*60}")
|
||||
|
||||
def main():
|
||||
parser = argparse.ArgumentParser(description='AM Stack A migration launcher')
|
||||
parser.add_argument('--wpress', required=True, help='Path to .wpress backup file')
|
||||
parser.add_argument('--domain', required=True, help='Target domain (e.g. example.com)')
|
||||
parser.add_argument('--project', help='Project directory (default: ~/arisingmedia-websites/{domain})')
|
||||
args = parser.parse_args()
|
||||
|
||||
wpress = os.path.abspath(args.wpress)
|
||||
domain = args.domain
|
||||
project = args.project or os.path.expanduser(f'~/arisingmedia-websites/{domain}')
|
||||
extract_dir = os.path.join(project, '.planning', 'wpress-extract')
|
||||
data_dir = os.path.join(project, '.planning', 'data')
|
||||
content_dir = os.path.join(data_dir, 'content')
|
||||
|
||||
if not os.path.exists(wpress):
|
||||
print(f"ERROR: .wpress file not found: {wpress}")
|
||||
sys.exit(1)
|
||||
|
||||
print(f"\nAM Stack A Migration Pipeline")
|
||||
print(f" Domain: {domain}")
|
||||
print(f" Project: {project}")
|
||||
print(f" Archive: {wpress}")
|
||||
|
||||
# Phase 0 — Setup
|
||||
phase_header(0, 'Setup')
|
||||
for d in [extract_dir, data_dir, content_dir,
|
||||
os.path.join(project, 'assets', 'images'),
|
||||
os.path.join(project, 'build'),
|
||||
os.path.join(project, 'src', 'api', 'data'),
|
||||
os.path.join(project, 'src', 'api', 'templates'),
|
||||
os.path.join(project, 'src', 'api', 'components')]:
|
||||
os.makedirs(d, exist_ok=True)
|
||||
print(f" mkdir {d}")
|
||||
|
||||
# Phase 1 — Extract
|
||||
phase_header(1, 'Extract .wpress archive')
|
||||
if not run(['python3', os.path.join(SCRIPTS, 'extract_wpress.py'), wpress, extract_dir], 'Phase 1'):
|
||||
sys.exit(1)
|
||||
|
||||
# Phase 2 — DB Analysis
|
||||
phase_header(2, 'Database analysis')
|
||||
if not run(['python3', os.path.join(SCRIPTS, 'analyze_db.py'), extract_dir, data_dir], 'Phase 2'):
|
||||
sys.exit(1)
|
||||
|
||||
# Detect Divi version
|
||||
site_info_path = os.path.join(data_dir, 'site-info.json')
|
||||
divi_version = 5
|
||||
if os.path.exists(site_info_path):
|
||||
with open(site_info_path) as f:
|
||||
info = json.load(f)
|
||||
divi_version = info.get('divi_version', 5)
|
||||
print(f" Divi version detected: {divi_version}")
|
||||
|
||||
# Phase 3 — Nav extraction
|
||||
phase_header(3, 'Extract navigation menus')
|
||||
run(['python3', os.path.join(SCRIPTS, 'extract_nav.py'), extract_dir, data_dir], 'Phase 3 (nav)')
|
||||
|
||||
# Phase 3 — Content extraction
|
||||
extract_script = f'extract_divi{divi_version}.py'
|
||||
pages_json = os.path.join(data_dir, 'pages.json')
|
||||
if not run(['python3', os.path.join(SCRIPTS, extract_script), pages_json, content_dir], f'Phase 3 (divi{divi_version})'):
|
||||
print(f" WARNING: content extraction had errors — review {content_dir}")
|
||||
|
||||
# Phase 5 — Media
|
||||
phase_header(5, 'Extract and convert media')
|
||||
run(['python3', os.path.join(SCRIPTS, 'extract_media.py'), extract_dir, data_dir,
|
||||
os.path.join(project, 'assets', 'images')], 'Phase 5')
|
||||
|
||||
# Phase 6 — Stage seed_databases.py
|
||||
phase_header(6, 'Stage seed_databases.py skeleton')
|
||||
seed_path = os.path.join(project, 'build', 'seed_databases.py')
|
||||
# Check if stage_seed.py exists
|
||||
stage_script = os.path.join(SCRIPTS, 'stage_seed.py')
|
||||
if os.path.exists(stage_script):
|
||||
run(['python3', stage_script, data_dir, seed_path, '--domain', domain], 'Phase 6')
|
||||
else:
|
||||
print(f" WARNING: stage_seed.py not found — seed_databases.py must be written manually")
|
||||
print(f" Reference: /home/sirdrez/arisingmedia-websites/vibrantyou.yoga/build/seed_databases.py")
|
||||
|
||||
# Print agent breadcrumbs for remaining phases
|
||||
print(f"\n{'='*60}")
|
||||
print(" EXTRACTION COMPLETE — Manual/Agent phases follow")
|
||||
print(f"{'='*60}")
|
||||
print(f"""
|
||||
Phases 0-6 complete. Staged content is at:
|
||||
{data_dir}/content/ ← extracted page sections (JSON)
|
||||
{data_dir}/nav.json ← navigation items
|
||||
{data_dir}/media-manifest.json ← image URL mappings
|
||||
{seed_path} ← seed_databases.py skeleton
|
||||
|
||||
Next steps (see 10-agent-breadcrumbs.md for full detail):
|
||||
|
||||
Phase 7 — REVIEW seed_databases.py
|
||||
Open: {seed_path}
|
||||
For each page: verify sections_json has correct section types
|
||||
Replace em-dashes. Remove Divi shortcode residue. Review nav items.
|
||||
|
||||
Phase 8 — RUN seed_databases.py
|
||||
cd {project} && python3 build/seed_databases.py
|
||||
Verify: output shows all counts > 0
|
||||
|
||||
Phase 9 — SCAFFOLD PHP templates
|
||||
Copy from reference: vibrantyou.yoga/src/api/
|
||||
Update brand name and colors in _header.php + _footer.php
|
||||
|
||||
Phase 10 — BUILD
|
||||
cd {project} && docker compose build --no-cache && docker compose up -d
|
||||
Verify: curl -I http://localhost:PORT/
|
||||
|
||||
Phase 11 — QA
|
||||
bash {SOPS}/../tools/verify-protection.sh http://localhost:PORT
|
||||
Lighthouse in Firefox
|
||||
|
||||
Reference: {SOPS}/wp-divi-pipeline-to-am-stack/10-agent-breadcrumbs.md
|
||||
""")
|
||||
|
||||
if __name__ == '__main__':
|
||||
main()
|
||||
@@ -0,0 +1,175 @@
|
||||
#!/usr/bin/env bash
|
||||
# run_pipeline.sh — AM WP+Divi to HTML pipeline master script
|
||||
# Usage: bash run_pipeline.sh <domain>
|
||||
# Example: bash run_pipeline.sh vibrantyou.yoga
|
||||
set -euo pipefail
|
||||
|
||||
DOMAIN="${1:-}"
|
||||
if [ -z "$DOMAIN" ]; then
|
||||
echo "Usage: $0 <domain>"
|
||||
echo " Example: $0 vibrantyou.yoga"
|
||||
exit 1
|
||||
fi
|
||||
|
||||
PROJECT="/home/sirdrez/arisingmedia-websites/$DOMAIN"
|
||||
SOPS="/home/sirdrez/arisingmedia-websites/.am-webdesign-sops"
|
||||
SCRIPTS="$SOPS/wp-divi-pipeline/scripts"
|
||||
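# "|| true" keeps set -e / pipefail from aborting here when no archive exists,
# so the explicit error message below is reached.
WPRESS=$(ls "$PROJECT/.planning/"*.wpress 2>/dev/null | head -1 || true)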
|
||||
|
||||
if [ -z "$WPRESS" ]; then
|
||||
echo "ERROR: No .wpress file found in $PROJECT/.planning/"
|
||||
exit 1
|
||||
fi
|
||||
|
||||
echo "================================================"
|
||||
echo " AM WP+Divi Pipeline"
|
||||
echo " Domain: $DOMAIN"
|
||||
echo " Archive: $(basename "$WPRESS")"
|
||||
echo "================================================"
|
||||
echo ""
|
||||
|
||||
# ---------------------------------------------------------------------------
|
||||
# Phase 0 — Directory structure
|
||||
# ---------------------------------------------------------------------------
|
||||
echo "[Phase 0] Creating directory structure..."
|
||||
mkdir -p "$PROJECT"/{src/{about,services,contact,blog,classes,components,assets/{css,js,images,svg,fonts}},build,infra,api}
|
||||
mkdir -p "$PROJECT/.planning"/{data/content,scripts,wpress-extract}
|
||||
echo " OK: directories created"
|
||||
echo ""
|
||||
|
||||
# ---------------------------------------------------------------------------
|
||||
# Phase 1 — Extract .wpress archive
|
||||
# ---------------------------------------------------------------------------
|
||||
EXTRACT_DIR="$PROJECT/.planning/wpress-extract"
|
||||
|
||||
if [ -f "$EXTRACT_DIR/database.sql" ]; then
|
||||
echo "[Phase 1] Archive already extracted — skipping"
|
||||
echo " Found: $EXTRACT_DIR/database.sql"
|
||||
else
|
||||
echo "[Phase 1] Extracting archive (this may take a few minutes)..."
|
||||
python3 "$SCRIPTS/extract_wpress.py" "$WPRESS" "$EXTRACT_DIR"
|
||||
echo " OK: extraction complete"
|
||||
fi
|
||||
echo ""
|
||||
|
||||
# ---------------------------------------------------------------------------
|
||||
# Phase 2 — Database analysis
|
||||
# ---------------------------------------------------------------------------
|
||||
DATA_DIR="$PROJECT/.planning/data"
|
||||
echo "[Phase 2] Analyzing database..."
|
||||
python3 "$SCRIPTS/analyze_db.py" "$EXTRACT_DIR" "$DATA_DIR"
|
||||
|
||||
PAGE_COUNT=$(python3 -c "import json; print(len(json.load(open('$DATA_DIR/pages.json'))))" 2>/dev/null || echo 0)
|
||||
echo " OK: $PAGE_COUNT pages extracted"
|
||||
echo ""
|
||||
|
||||
# ---------------------------------------------------------------------------
|
||||
# Phase 3 — Content extraction (Divi 5)
|
||||
# ---------------------------------------------------------------------------
|
||||
echo "[Phase 3] Extracting Divi 5 content..."
|
||||
python3 "$SCRIPTS/extract_divi5.py" \
|
||||
"$DATA_DIR/pages.json" \
|
||||
"$DATA_DIR/content/"
|
||||
echo " OK: content JSON files written"
|
||||
echo ""
|
||||
|
||||
# ---------------------------------------------------------------------------
|
||||
# Phase 4 — Design system (manual step)
|
||||
# ---------------------------------------------------------------------------
|
||||
echo "[Phase 4] Design system (MANUAL STEP REQUIRED)"
|
||||
echo " Read: $DATA_DIR/design-system.json"
|
||||
echo " Write: $PROJECT/src/assets/css/main.css"
|
||||
echo " Ref: $SOPS/wp-divi-pipeline/04-design-system-extraction.md"
|
||||
echo ""
|
||||
|
||||
# ---------------------------------------------------------------------------
|
||||
# Phase 5 — Media migration
|
||||
# ---------------------------------------------------------------------------
|
||||
UPLOADS_DIR="$EXTRACT_DIR/uploads"
|
||||
IMAGES_DIR="$PROJECT/src/assets/images"
|
||||
|
||||
if [ -d "$UPLOADS_DIR" ]; then
|
||||
echo "[Phase 5] Migrating media..."
|
||||
# Catalog originals (skip WP-generated size variants)
|
||||
# grep exits 1 when nothing passes the filter; "|| true" keeps pipefail from aborting.
find "$UPLOADS_DIR" -type f \( -name "*.jpg" -o -name "*.jpeg" -o -name "*.png" -o -name "*.gif" -o -name "*.webp" \) \
  | { grep -v -E "\-[0-9]+x[0-9]+\.(jpg|jpeg|png|webp|gif)$" || true; } \
  | sort > "$DATA_DIR/media-originals.txt"
|
||||
|
||||
MEDIA_COUNT=$(wc -l < "$DATA_DIR/media-originals.txt")
|
||||
echo " Found: $MEDIA_COUNT original images"
|
||||
|
||||
# Copy to src/assets/images/
|
||||
while IFS= read -r src_img; do
|
||||
fname=$(basename "$src_img")
|
||||
cp "$src_img" "$IMAGES_DIR/$fname"
|
||||
done < "$DATA_DIR/media-originals.txt"
|
||||
|
||||
# Convert to WebP if cwebp available
|
||||
if command -v cwebp &>/dev/null; then
|
||||
echo " Converting to WebP..."
|
||||
cd "$IMAGES_DIR"
|
||||
for img in *.jpg *.jpeg *.png; do
|
||||
[ -f "$img" ] || continue
|
||||
base="${img%.*}"
|
||||
cwebp -q 82 "$img" -o "${base}.webp" 2>/dev/null && rm "$img"
|
||||
done
|
||||
WEBP_COUNT=$(ls *.webp 2>/dev/null | wc -l || true)
|
||||
echo " WebP files: $WEBP_COUNT"
|
||||
cd "$PROJECT"
|
||||
else
|
||||
echo " WARN: cwebp not found — images copied as-is (convert manually)"
|
||||
fi
|
||||
echo " OK: media migrated to $IMAGES_DIR"
|
||||
else
|
||||
echo "[Phase 5] No uploads/ directory found — skipping media migration"
|
||||
fi
|
||||
echo ""
|
||||
|
||||
# ---------------------------------------------------------------------------
|
||||
# Phase 6 — HTML build (manual step)
|
||||
# ---------------------------------------------------------------------------
|
||||
echo "[Phase 6] HTML Build (MANUAL STEP REQUIRED)"
|
||||
echo " Ref: $SOPS/wp-divi-pipeline/05-content-migration.md"
|
||||
echo " Build order:"
|
||||
echo " 1. src/assets/css/main.css"
|
||||
echo " 2. src/assets/css/components.css"
|
||||
echo " 3. src/components/header.html"
|
||||
echo " 4. src/components/footer.html"
|
||||
echo " 5. src/assets/js/components.js"
|
||||
echo " 6. src/assets/js/main.js"
|
||||
echo " 7. src/index.html (home — design system anchor)"
|
||||
echo " 8. Remaining pages"
|
||||
echo ""
|
||||
|
||||
# ---------------------------------------------------------------------------
|
||||
# Phase 7 — SEO audit
|
||||
# ---------------------------------------------------------------------------
|
||||
echo "[Phase 7] SEO audit (run after HTML build):"
|
||||
echo " grep -rL '<title>' $PROJECT/src --include='*.html' | grep -v _template"
|
||||
echo " grep -rL 'canonical' $PROJECT/src --include='*.html' | grep -v _template"
|
||||
echo " grep -rL 'ld+json' $PROJECT/src --include='*.html' | grep -v _template"
|
||||
echo " grep -r '{{' $PROJECT/src --include='*.html'"
|
||||
echo ""
|
||||
|
||||
# ---------------------------------------------------------------------------
|
||||
# Phase 8 — Infra
|
||||
# ---------------------------------------------------------------------------
|
||||
echo "[Phase 8] Infra setup:"
|
||||
echo " Copy Dockerfile + docker-compose.yml from vibrantyoucoaching.com"
|
||||
echo " Update server_name in infra/nginx.conf to: $DOMAIN"
|
||||
echo " Run: docker compose up -d --build"
|
||||
echo ""
|
||||
|
||||
# ---------------------------------------------------------------------------
|
||||
# Phase 9 — Protection check
|
||||
# ---------------------------------------------------------------------------
|
||||
echo "[Phase 9] After deploy, run:"
|
||||
echo " bash $SOPS/tools/verify-protection.sh https://$DOMAIN"
|
||||
echo ""
|
||||
|
||||
echo "================================================"
|
||||
echo " Pipeline setup complete."
|
||||
echo " Phases 0-3 + 5 executed automatically."
|
||||
echo " Phases 4, 6, 7, 8, 9 require manual steps."
|
||||
echo " See $SOPS/wp-divi-pipeline/ for all SOPs."
|
||||
echo "================================================"
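
Note on Phase 5: the copy loop flattens `uploads/` into a single directory by basename, so two originals with the same file name in different year/month folders overwrite each other. The sketch below shows one way to plan collision-free names before copying. It is an illustration in Python rather than a change to the script; `plan_copies` and the `YYYY-MM-` prefix scheme are assumptions.

```python
import re
from pathlib import PurePosixPath

# Same rule as the grep filter: names ending in -WIDTHxHEIGHT.ext are
# treated as WordPress-generated size variants and skipped.
VARIANT = re.compile(r"-\d+x\d+\.(jpg|jpeg|png|webp|gif)$")


def plan_copies(paths):
    """Map each original upload path to a unique flat file name."""
    planned = {}
    used = set()
    for p in sorted(paths):
        if VARIANT.search(p):
            continue
        path = PurePosixPath(p)
        name = path.name
        if name in used:
            # Hypothetical scheme: prefix the two parent folders (year, month).
            name = "-".join(path.parts[-3:-1] + (path.name,))
        used.add(name)
        planned[p] = name
    return planned
```

As with the grep filter, an original that happens to be named like `banner-1200x600.jpg` is treated as a variant and left out, so the resulting list is worth a quick review.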
|
||||
@@ -0,0 +1,574 @@
|
||||
#!/usr/bin/env python3
|
||||
"""
|
||||
stage_seed.py — Phase 6 of WP/Divi → Stack A migration pipeline.
|
||||
|
||||
Reads extracted JSON from prior pipeline run and generates a seed_databases.py
|
||||
skeleton for the target project. Human/agent reviews [FILL] markers and fills
|
||||
gaps before running the seeder.
|
||||
|
||||
Usage:
|
||||
python3 stage_seed.py <data_dir> <seed_path> --domain <domain> [--force]
|
||||
|
||||
Example:
|
||||
python3 stage_seed.py /path/to/.planning/data build/seed_databases.py --domain example.com
|
||||
"""
|
||||
|
||||
import argparse
|
||||
import json
|
||||
import os
|
||||
import re
|
||||
from datetime import datetime
|
||||
|
||||
|
||||
def slugify(text):
|
||||
"""Convert text to URL-safe slug."""
|
||||
return re.sub(r'[^a-z0-9]+', '-', text.lower()).strip('-')
|
||||
|
||||
|
||||
def infer_template(slug):
|
||||
"""Infer template type from page slug."""
|
||||
slug_lower = slug.lower()
|
||||
if slug_lower == 'home':
|
||||
return 'home'
|
||||
elif slug_lower in ('classes', 'class'):
|
||||
return 'classes'
|
||||
elif slug_lower == 'schedule':
|
||||
return 'schedule'
|
||||
elif slug_lower == 'glossary':
|
||||
return 'glossary'
|
||||
elif slug_lower in ('blog', 'posts', 'articles'):
|
||||
return 'blog'
|
||||
else:
|
||||
return 'static'
|
||||
|
||||
|
||||
def load_json_file(path):
    """Load a JSON file; return None if it is missing or cannot be parsed."""
    if not os.path.exists(path):
        return None
    try:
        with open(path, 'r', encoding='utf-8') as f:
            return json.load(f)
    except (OSError, ValueError) as e:
        print(f"Warning: Failed to load {path}: {e}")
        return None
|
||||
|
||||
|
||||
def generate_seed_script(data_dir, domain, design_system, pages, glossary, nav):
|
||||
"""Generate the seed_databases.py script content."""
|
||||
now = datetime.now().isoformat()
|
||||
|
||||
# Build pages_data list in outer scope
|
||||
pages_list = []
|
||||
for page in pages:
|
||||
if page.get('status') != 'publish' or page.get('post_type') != 'page':
|
||||
continue
|
||||
|
||||
slug = page.get('slug', '')
|
||||
title = page.get('title', '[FILL] Title needed')
|
||||
meta_desc = page.get('seo_description', '')
|
||||
if not meta_desc:
|
||||
meta_desc = f"[FILL] Meta description for {slug}"
|
||||
|
||||
canonical = f"https://{domain}/{slug}/" if slug != 'home' else f"https://{domain}/"
|
||||
date_str = page.get('date', datetime.now().isoformat())
|
||||
|
||||
# Infer template
|
||||
template_map = {
|
||||
'home': 'home',
|
||||
'classes': 'classes',
|
||||
'schedule': 'schedule',
|
||||
'glossary': 'glossary',
|
||||
'blog': 'blog',
|
||||
}
|
||||
template = template_map.get(slug, 'static')
|
||||
|
||||
pages_list.append({
|
||||
'slug': slug,
|
||||
'template': template,
|
||||
'title': title,
|
||||
'meta_description': meta_desc,
|
||||
'canonical_url': canonical,
|
||||
'hero_h1': f"[FILL] {title}",
|
||||
'sections_json': '[]',
|
||||
'updated_at': date_str
|
||||
})
|
||||
|
||||
# Build pages_data JSON string
|
||||
pages_json_str = json.dumps(pages_list, indent=8)
|
||||
|
||||
script = f'''#!/usr/bin/env python3
|
||||
"""
|
||||
seed_databases.py — generated by stage_seed.py on {now}
|
||||
Source: {data_dir}
|
||||
Domain: {domain}
|
||||
|
||||
EDIT THIS FILE then run: python3 build/seed_databases.py
|
||||
Content marked [FILL] needs human/agent review before seeding.
|
||||
"""
|
||||
import sqlite3
|
||||
import json
|
||||
import os
|
||||
from datetime import datetime
|
||||
|
||||
DB_DIR = os.path.join(os.path.dirname(__file__), '..', 'src', 'api', 'data')
|
||||
os.makedirs(DB_DIR, exist_ok=True)
|
||||
|
||||
|
||||
def slugify(text):
|
||||
"""Convert text to URL-safe slug."""
|
||||
import re
|
||||
return re.sub(r'[^a-z0-9]+', '-', text.lower()).strip('-')
|
||||
|
||||
|
||||
def seed_pages():
|
||||
"""Create pages.sqlite and populate with published pages."""
|
||||
db_path = os.path.join(DB_DIR, 'pages.sqlite')
|
||||
conn = sqlite3.connect(db_path)
|
||||
c = conn.cursor()
|
||||
|
||||
c.execute("""
|
||||
CREATE TABLE IF NOT EXISTS pages (
|
||||
id INTEGER PRIMARY KEY,
|
||||
slug TEXT UNIQUE NOT NULL,
|
||||
template TEXT NOT NULL,
|
||||
title TEXT NOT NULL,
|
||||
meta_description TEXT,
|
||||
canonical_url TEXT,
|
||||
og_image TEXT,
|
||||
schema_json TEXT,
|
||||
hero_eyebrow TEXT,
|
||||
hero_h1 TEXT,
|
||||
hero_lead TEXT,
|
||||
sections_json TEXT,
|
||||
updated_at TEXT
|
||||
)
|
||||
""")
|
||||
|
||||
pages_data = {pages_json_str}
|
||||
|
||||
for page in pages_data:
|
||||
c.execute("""
|
||||
INSERT OR REPLACE INTO pages
|
||||
(slug, template, title, meta_description, canonical_url, hero_h1, sections_json, updated_at)
|
||||
VALUES (?, ?, ?, ?, ?, ?, ?, ?)
|
||||
""", (
|
||||
page['slug'],
|
||||
page['template'],
|
||||
page['title'],
|
||||
page['meta_description'],
|
||||
page['canonical_url'],
|
||||
page['hero_h1'],
|
||||
page['sections_json'],
|
||||
page['updated_at']
|
||||
))
|
||||
|
||||
conn.commit()
|
||||
conn.close()
|
||||
print(f"✓ pages.sqlite created with {{len(pages_data)}} pages")
|
||||
|
||||
|
||||
def seed_nav():
|
||||
"""Create nav.sqlite and populate navigation items."""
|
||||
db_path = os.path.join(DB_DIR, 'nav.sqlite')
|
||||
conn = sqlite3.connect(db_path)
|
||||
c = conn.cursor()
|
||||
|
||||
c.execute("""
|
||||
CREATE TABLE IF NOT EXISTS nav_items (
|
||||
id INTEGER PRIMARY KEY,
|
||||
label TEXT NOT NULL,
|
||||
href TEXT NOT NULL,
|
||||
display_order INTEGER DEFAULT 0,
|
||||
is_cta INTEGER DEFAULT 0
|
||||
)
|
||||
""")
|
||||
'''
|
||||
|
||||
if nav:
|
||||
script += f'''
|
||||
nav_items = json.loads(r"""{json.dumps(nav, indent=8)}""")  # JSON text: null/true/false are not valid Python literals
|
||||
|
||||
c.execute("DELETE FROM nav_items")  # clear old rows so a reseed does not duplicate the menu
for item in nav_items:
|
||||
c.execute("""
|
||||
INSERT INTO nav_items (label, href, display_order, is_cta)
|
||||
VALUES (?, ?, ?, ?)
|
||||
""", (item['label'], item['href'], item.get('display_order', 0), item.get('is_cta', 0)))
|
||||
|
||||
conn.commit()
|
||||
conn.close()
|
||||
print(f"✓ nav.sqlite created with {{len(nav_items)}} nav items")
|
||||
'''
|
||||
else:
|
||||
script += '''
|
||||
# [FILL] nav.json not found — add navigation items manually
|
||||
# Example:
|
||||
# nav_items = [
|
||||
# {"label": "Home", "href": "/", "display_order": 1, "is_cta": 0},
|
||||
# {"label": "Classes", "href": "/classes", "display_order": 2, "is_cta": 0},
|
||||
# {"label": "Schedule", "href": "/schedule", "display_order": 3, "is_cta": 0},
|
||||
# {"label": "Get Started", "href": "/contact", "display_order": 4, "is_cta": 1},
|
||||
# ]
|
||||
# Then uncomment and insert rows
|
||||
|
||||
conn.commit()
|
||||
conn.close()
|
||||
print("✓ nav.sqlite created (empty — [FILL] navigation items)")
|
||||
'''
|
||||
|
||||
# Seed glossary
|
||||
if glossary:
|
||||
script += f'''
|
||||
|
||||
|
||||
def seed_glossary():
|
||||
"""Create glossary.sqlite and populate terms."""
|
||||
db_path = os.path.join(DB_DIR, 'glossary.sqlite')
|
||||
conn = sqlite3.connect(db_path)
|
||||
c = conn.cursor()
|
||||
|
||||
c.execute("""
|
||||
CREATE TABLE IF NOT EXISTS terms (
|
||||
id INTEGER PRIMARY KEY,
|
||||
slug TEXT UNIQUE NOT NULL,
|
||||
term TEXT NOT NULL,
|
||||
pronunciation TEXT,
|
||||
definition TEXT NOT NULL,
|
||||
category TEXT NOT NULL,
|
||||
level TEXT NOT NULL,
|
||||
display_order INTEGER DEFAULT 0
|
||||
)
|
||||
""")
|
||||
|
||||
glossary_items = json.loads(r"""{json.dumps(glossary, indent=8)}""")  # JSON text: null/true/false are not valid Python literals
|
||||
|
||||
for idx, item in enumerate(glossary_items):
|
||||
fields = item.get('fields', {{}})
|
||||
term = fields.get('sanskrit_name') or '[FILL] Term needed'
|
||||
slug = slugify(term)
|
||||
pronunciation = fields.get('pronunciation', '')
|
||||
definition = fields.get('definition') or '[FILL] Definition needed'
|
||||
category = fields.get('category') or 'yoga'
|
||||
level = fields.get('level') or 'beginner'
|
||||
|
||||
c.execute("""
|
||||
INSERT OR REPLACE INTO terms
|
||||
(slug, term, pronunciation, definition, category, level, display_order)
|
||||
VALUES (?, ?, ?, ?, ?, ?, ?)
|
||||
""", (slug, term, pronunciation, definition, category, level, idx))
|
||||
|
||||
conn.commit()
|
||||
conn.close()
|
||||
print(f"✓ glossary.sqlite created with {{len(glossary_items)}} terms")
|
||||
'''
|
||||
else:
|
||||
script += '''
|
||||
|
||||
|
||||
def seed_glossary():
|
||||
"""Create glossary.sqlite (empty — no glossary.json found)."""
|
||||
db_path = os.path.join(DB_DIR, 'glossary.sqlite')
|
||||
conn = sqlite3.connect(db_path)
|
||||
c = conn.cursor()
|
||||
|
||||
c.execute("""
|
||||
CREATE TABLE IF NOT EXISTS terms (
|
||||
id INTEGER PRIMARY KEY,
|
||||
slug TEXT UNIQUE NOT NULL,
|
||||
term TEXT NOT NULL,
|
||||
pronunciation TEXT,
|
||||
definition TEXT NOT NULL,
|
||||
category TEXT NOT NULL,
|
||||
level TEXT NOT NULL,
|
||||
display_order INTEGER DEFAULT 0
|
||||
)
|
||||
""")
|
||||
|
||||
conn.commit()
|
||||
conn.close()
|
||||
print("✓ glossary.sqlite created (empty)")
|
||||
'''
|
||||
|
||||
script += '''
|
||||
|
||||
|
||||
def seed_testimonials():
|
||||
"""Create testimonials.sqlite (empty stub)."""
|
||||
db_path = os.path.join(DB_DIR, 'testimonials.sqlite')
|
||||
conn = sqlite3.connect(db_path)
|
||||
c = conn.cursor()
|
||||
|
||||
c.execute("""
|
||||
CREATE TABLE IF NOT EXISTS testimonials (
|
||||
id INTEGER PRIMARY KEY,
|
||||
quote TEXT NOT NULL,
|
||||
author_name TEXT NOT NULL,
|
||||
author_role TEXT,
|
||||
is_featured INTEGER DEFAULT 0
|
||||
)
|
||||
""")
|
||||
|
||||
# [FILL] Add testimonials extracted from Divi testimonial modules or client-provided
|
||||
# rows = [
|
||||
# {"quote": "...", "author_name": "...", "author_role": "...", "is_featured": 0},
|
||||
# ]
|
||||
|
||||
conn.commit()
|
||||
conn.close()
|
||||
print("✓ testimonials.sqlite created (empty — [FILL] add testimonials)")
|
||||
|
||||
|
||||
def seed_blog():
|
||||
"""Create blog.sqlite (empty stub)."""
|
||||
db_path = os.path.join(DB_DIR, 'blog.sqlite')
|
||||
conn = sqlite3.connect(db_path)
|
||||
c = conn.cursor()
|
||||
|
||||
c.execute("""
|
||||
CREATE TABLE IF NOT EXISTS posts (
|
||||
id INTEGER PRIMARY KEY,
|
||||
slug TEXT UNIQUE NOT NULL,
|
||||
title TEXT NOT NULL,
|
||||
excerpt TEXT,
|
||||
content TEXT,
|
||||
author TEXT,
|
||||
published_at TEXT,
|
||||
is_featured INTEGER DEFAULT 0
|
||||
)
|
||||
""")
|
||||
|
||||
# [FILL] Add blog posts extracted from WP posts table
|
||||
# rows = [
|
||||
# {"slug": "...", "title": "...", "excerpt": "...", "content": "...", "author": "...", "published_at": "..."},
|
||||
# ]
|
||||
|
||||
conn.commit()
|
||||
conn.close()
|
||||
print("✓ blog.sqlite created (empty — [FILL] add blog posts)")
|
||||
|
||||
|
||||
def seed_videos():
|
||||
"""Create videos.sqlite (empty stub)."""
|
||||
db_path = os.path.join(DB_DIR, 'videos.sqlite')
|
||||
conn = sqlite3.connect(db_path)
|
||||
c = conn.cursor()
|
||||
|
||||
c.execute("""
|
||||
CREATE TABLE IF NOT EXISTS videos (
|
||||
id INTEGER PRIMARY KEY,
|
||||
slug TEXT UNIQUE NOT NULL,
|
||||
title TEXT NOT NULL,
|
||||
duration TEXT,
|
||||
embed_url TEXT,
|
||||
thumbnail TEXT,
|
||||
category TEXT,
|
||||
level TEXT,
|
||||
is_free INTEGER DEFAULT 1
|
||||
)
|
||||
""")
|
||||
|
||||
# [FILL] Add on-demand video entries if site has video content
|
||||
# rows = [
|
||||
# {"slug": "...", "title": "...", "duration": "12:34", "embed_url": "...", "category": "...", "level": "..."},
|
||||
# ]
|
||||
|
||||
conn.commit()
|
||||
conn.close()
|
||||
print("✓ videos.sqlite created (empty — [FILL] add videos)")
|
||||
|
||||
|
||||
def seed_events():
|
||||
"""Create events.sqlite (empty stub)."""
|
||||
db_path = os.path.join(DB_DIR, 'events.sqlite')
|
||||
conn = sqlite3.connect(db_path)
|
||||
c = conn.cursor()
|
||||
|
||||
c.execute("""
|
||||
CREATE TABLE IF NOT EXISTS events (
|
||||
id INTEGER PRIMARY KEY,
|
||||
slug TEXT UNIQUE NOT NULL,
|
||||
title TEXT NOT NULL,
|
||||
event_date TEXT,
|
||||
time_cet TEXT,
|
||||
format TEXT,
|
||||
capacity INTEGER,
|
||||
price_eur REAL,
|
||||
status TEXT DEFAULT 'open'
|
||||
)
|
||||
""")
|
||||
|
||||
# [FILL] Add workshop/event entries
|
||||
# rows = [
|
||||
# {"slug": "...", "title": "...", "event_date": "2026-06-15", "time_cet": "10:00", "format": "online", "capacity": 20, "price_eur": 29.99},
|
||||
# ]
|
||||
|
||||
conn.commit()
|
||||
conn.close()
|
||||
print("✓ events.sqlite created (empty — [FILL] add events)")
|
||||
|
||||
|
||||
def seed_schedule():
|
||||
"""Create schedule.sqlite (empty stub)."""
|
||||
db_path = os.path.join(DB_DIR, 'schedule.sqlite')
|
||||
conn = sqlite3.connect(db_path)
|
||||
c = conn.cursor()
|
||||
|
||||
c.execute("""
|
||||
CREATE TABLE IF NOT EXISTS classes (
|
||||
id INTEGER PRIMARY KEY,
|
||||
day_of_week TEXT NOT NULL,
|
||||
day_order INTEGER NOT NULL,
|
||||
time_cet TEXT NOT NULL,
|
||||
class_name TEXT NOT NULL,
|
||||
level TEXT NOT NULL,
|
||||
format TEXT NOT NULL,
|
||||
duration_min INTEGER NOT NULL,
|
||||
badge_variant TEXT DEFAULT ''
|
||||
)
|
||||
""")
|
||||
|
||||
# [FILL] Add recurring class schedule rows
|
||||
# rows = [
|
||||
# {"day_of_week": "Monday", "day_order": 1, "time_cet": "10:00", "class_name": "Hatha Yoga", "level": "beginner", "format": "online", "duration_min": 60, "badge_variant": "featured"},
|
||||
# ]
|
||||
|
||||
conn.commit()
|
||||
conn.close()
|
||||
print("✓ schedule.sqlite created (empty — [FILL] add class schedule)")
|
||||
|
||||
|
||||
def seed_instructors():
|
||||
"""Create instructors.sqlite (empty stub)."""
|
||||
db_path = os.path.join(DB_DIR, 'instructors.sqlite')
|
||||
conn = sqlite3.connect(db_path)
|
||||
c = conn.cursor()
|
||||
|
||||
c.execute("""
|
||||
CREATE TABLE IF NOT EXISTS instructors (
|
||||
id INTEGER PRIMARY KEY,
|
||||
slug TEXT UNIQUE NOT NULL,
|
||||
name TEXT NOT NULL,
|
||||
title TEXT,
|
||||
bio TEXT,
|
||||
certifications TEXT,
|
||||
image TEXT,
|
||||
is_primary INTEGER DEFAULT 0
|
||||
)
|
||||
""")
|
||||
|
||||
# [FILL] Add instructor rows
|
||||
# rows = [
|
||||
# {"slug": "alice-johnson", "name": "Alice Johnson", "title": "Lead Instructor", "bio": "...", "certifications": "...", "is_primary": 1},
|
||||
# ]
|
||||
|
||||
conn.commit()
|
||||
conn.close()
|
||||
print("✓ instructors.sqlite created (empty — [FILL] add instructors)")
|
||||
|
||||
|
||||
def seed_packages():
|
||||
"""Create packages.sqlite (empty stub)."""
|
||||
db_path = os.path.join(DB_DIR, 'packages.sqlite')
|
||||
conn = sqlite3.connect(db_path)
|
||||
c = conn.cursor()
|
||||
|
||||
c.execute("""
|
||||
CREATE TABLE IF NOT EXISTS packages (
|
||||
id INTEGER PRIMARY KEY,
|
||||
slug TEXT UNIQUE NOT NULL,
|
||||
name TEXT NOT NULL,
|
||||
price_eur REAL,
|
||||
sessions_count INTEGER,
|
||||
validity_days INTEGER,
|
||||
is_featured INTEGER DEFAULT 0
|
||||
)
|
||||
""")
|
||||
|
||||
# [FILL] Add class pack/package options
|
||||
# rows = [
|
||||
# {"slug": "starter", "name": "Starter Pack", "price_eur": 49.99, "sessions_count": 5, "validity_days": 30, "is_featured": 0},
|
||||
# {"slug": "unlimited", "name": "Unlimited Monthly", "price_eur": 99.99, "sessions_count": None, "validity_days": 30, "is_featured": 1},
|
||||
# ]
|
||||
|
||||
conn.commit()
|
||||
conn.close()
|
||||
print("✓ packages.sqlite created (empty — [FILL] add packages)")
|
||||
|
||||
|
||||
if __name__ == '__main__':
|
||||
seed_pages()
|
||||
seed_nav()
|
||||
seed_glossary()
|
||||
seed_testimonials()
|
||||
seed_blog()
|
||||
seed_videos()
|
||||
seed_events()
|
||||
seed_schedule()
|
||||
seed_instructors()
|
||||
seed_packages()
|
||||
print("\\nSeeding complete. Review [FILL] markers before running in production.")
|
||||
'''
|
||||
|
||||
return script
|
||||
|
||||
|
||||
def main():
|
||||
parser = argparse.ArgumentParser(
|
||||
description='Generate seed_databases.py from extracted WP/Divi JSON data'
|
||||
)
|
||||
parser.add_argument('data_dir', help='Path to extracted data directory (.planning/data/)')
|
||||
parser.add_argument('seed_path', help='Output path for seed_databases.py')
|
||||
parser.add_argument('--domain', required=True, help='Domain name (e.g., example.com)')
|
||||
parser.add_argument('--force', action='store_true', help='Overwrite existing seed_databases.py')
|
||||
|
||||
args = parser.parse_args()
|
||||
|
||||
# Validate inputs
|
||||
if not os.path.isdir(args.data_dir):
|
||||
print(f"Error: data_dir not found: {args.data_dir}")
|
||||
return 1
|
||||
|
||||
if os.path.exists(args.seed_path) and not args.force:
|
||||
print(f"Error: seed_databases.py already exists at {args.seed_path}")
|
||||
print("Use --force to overwrite")
|
||||
return 1
|
||||
|
||||
# Load required data files
|
||||
pages = load_json_file(os.path.join(args.data_dir, 'pages.json'))
|
||||
if not pages:
|
||||
print("Error: pages.json not found or invalid")
|
||||
return 1
|
||||
|
||||
design_system = load_json_file(os.path.join(args.data_dir, 'design-system.json'))
|
||||
glossary = load_json_file(os.path.join(args.data_dir, 'glossary.json'))
|
||||
nav = load_json_file(os.path.join(args.data_dir, 'nav.json'))
|
||||
|
||||
# Generate script
|
||||
script_content = generate_seed_script(
|
||||
args.data_dir,
|
||||
args.domain,
|
||||
design_system,
|
||||
pages,
|
||||
glossary,
|
||||
nav
|
||||
)
|
||||
|
||||
# Write output
|
||||
os.makedirs(os.path.dirname(os.path.abspath(args.seed_path)), exist_ok=True)
|
||||
with open(args.seed_path, 'w', encoding='utf-8') as f:
|
||||
f.write(script_content)
|
||||
|
||||
# Make executable
|
||||
os.chmod(args.seed_path, 0o755)
|
||||
|
||||
print(f"✓ Generated: {args.seed_path}")
|
||||
print(f" Pages: {len([p for p in pages if p.get('status') == 'publish' and p.get('post_type') == 'page'])}")
|
||||
print(f" Glossary terms: {len(glossary) if glossary else 0}")
|
||||
print(f" Nav items: {len(nav) if nav else 0}")
|
||||
print("\nNext: Review [FILL] markers, then run: python3 " + args.seed_path)
|
||||
|
||||
return 0
|
||||
|
||||
|
||||
if __name__ == '__main__':
|
||||
raise SystemExit(main())
|
||||