# 02 — Database Analysis

Parse the WordPress MySQL dump to inventory pages, detect Divi version, extract design settings, and build the data JSON files that drive the AM build.

## Script

```bash
python3 /home/sirdrez/arisingmedia-websites/.am-webdesign-sops/wp-divi-pipeline/scripts/analyze_db.py \
  {domain}/.planning/wpress-extract/ \
  {domain}/.planning/data/
```

Outputs three files into `.planning/data/`:

- `pages.json` — all published pages/posts with content and SEO meta
- `design-system.json` — colors, fonts, Divi settings
- `site-info.json` — domain, plugin list, WP version, Divi version

## Divi version detection

The script auto-detects Divi version by scanning `database.sql`:

| Signal in SQL | Divi version |
|---------------|-------------|
| `wp:divi/` in post_content | Divi 5 |
| `[et_pb_section` in post_content | Divi 4 |

**This determines the content extraction path.** Divi 4 → use `extract_divi4.py`. Divi 5 → use `extract_divi5.py`. See `03-divi-content-extraction.md`.

## Key WordPress tables

| Table | Contents | Used for |
|-------|----------|---------|
| `wp_posts` | All pages, posts, attachments, layouts | Page inventory, content |
| `wp_postmeta` | Per-post metadata | ACF fields, Rank Math SEO, Divi layout JSON |
| `wp_options` | Site-wide settings | Divi theme settings, colors, fonts |
| `wp_gf_forms` | Gravity Forms definitions | Form field schema |
| `wp_gf_entries` | Gravity Form submissions | Not needed for migration |
| `wp_rank_math_seo_meta` | Rank Math SEO per page | SEO titles, descriptions |

## Reading pages.json

Each entry in `pages.json`:

```json
{
  "id": "42",
  "post_type": "page",
  "slug": "about",
  "title": "About VibrantYou Yoga",
  "status": "publish",
  "date": "2026-03-15",
  "modified": "2026-04-10",
  "content_raw": "...",
  "excerpt": "",
  "parent_id": "0",
  "menu_order": "3",
  "seo_title": "About VibrantYou Yoga | Mindful Movement in [City]",
  "seo_description": "...",
  "seo_keywords": "yoga studio, mindful movement",
  "acf": {
    "vyy_hero_headline": "Move With Intention",
    "vyy_hero_subhead": "..."
  }
}
```

`content_raw` holds the raw Divi block markup. Pass it to the extractor scripts.
`acf` holds Advanced Custom Fields values — often cleaner than block content.

## Reading design-system.json

Contains extracted Divi theme settings. Key fields:

```json
{
  "primary_color": "#1a8a7a",
  "body_font": "DM Sans",
  "header_font": "DM Serif Display",
  "body_font_size": "16",
  "body_line_height": "1.7",
  "divi_version": "5",
  "wp_version": "6.9.4",
  "site_url": "https://vibrantyou.yoga",
  "site_name": "VibrantYou Yoga"
}
```

Use these values to seed the AM `main.css` CSS custom properties block.

## Manual inspection (when script output is sparse)

Sometimes the Divi theme options are stored as PHP-serialized data. Use grep to find and eyeball the raw values:

```bash
DB=.planning/wpress-extract/database.sql

# Divi global colors
grep -o "'et_divi[^']*','[^']*'" $DB | head -30

# Site name + URL
grep -E "'(siteurl|blogname|admin_email)','[^']*'" $DB

# Rank Math SEO meta for a specific post
grep "rank_math_title\|rank_math_description" $DB | head -20

# All published page slugs
grep -o "post_name','[^']*'" $DB | grep -v "revision\|auto-draft" | sort | uniq
```

## Gravity Forms schema (for form replacement)

Find form field definitions:

```bash
grep "INSERT INTO \`wp_gf_forms\`" .planning/wpress-extract/database.sql | \
python3 -c "
import sys, json, re
for line in sys.stdin:
    m = re.search(r\"'([^']+)'\s*\)\s*;\", line)
    if m:
        try:
            print(json.dumps(json.loads(m.group(1).replace('\\\\\"','\"')), indent=2)[:2000])
        except:
            pass
" 2>/dev/null | head -100
```

Field types seen in Gravity Forms: text, email, phone, textarea, select, checkbox, radio, name, address, fileupload. Map each to a plain HTML input equivalent.
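The field-type mapping can be sketched in Python as follows. This is a minimal illustration, not part of the pipeline scripts: the function name `gf_field_to_html`, the `field_<id>` naming scheme, and the choice to collapse `name` and `address` into a single text input are all assumptions, as is the expectation that each field dict carries `type`, `id`, `label`, `isRequired`, and `choices` keys.

```python
from html import escape

# Assumed mapping from Gravity Forms field types to HTML input types;
# "name" and "address" are collapsed to one text input for brevity.
INPUT_TYPES = {
    "text": "text",
    "email": "email",
    "phone": "tel",
    "name": "text",
    "address": "text",
    "fileupload": "file",
}


def gf_field_to_html(field):
    """Render one Gravity Forms field dict as a plain HTML form control."""
    ftype = field.get("type", "text")
    name = "field_{}".format(field.get("id", ""))  # hypothetical naming scheme
    label = escape(str(field.get("label", "")))
    required = " required" if field.get("isRequired") else ""
    # Each choice is assumed to be a dict with "text" and "value" keys.
    choices = [
        (escape(str(c.get("value", ""))), escape(str(c.get("text", ""))))
        for c in field.get("choices") or []
    ]

    if ftype in ("checkbox", "radio"):
        # Grouped controls get a fieldset with one labelled input per choice.
        inputs = "".join(
            '<label><input type="{}" name="{}" value="{}"> {}</label>'.format(
                ftype, name, value, text
            )
            for value, text in choices
        )
        return "<fieldset><legend>{}</legend>{}</fieldset>".format(label, inputs)

    if ftype == "textarea":
        control = '<textarea id="{0}" name="{0}"{1}></textarea>'.format(name, required)
    elif ftype == "select":
        options = "".join(
            '<option value="{}">{}</option>'.format(value, text)
            for value, text in choices
        )
        control = '<select id="{0}" name="{0}"{1}>{2}</select>'.format(
            name, required, options
        )
    else:
        # Unknown types fall back to a text input.
        input_type = INPUT_TYPES.get(ftype, "text")
        control = '<input type="{0}" id="{1}" name="{1}"{2}>'.format(
            input_type, name, required
        )

    return '<label for="{}">{}</label>{}'.format(name, label, control)
```

Calling `gf_field_to_html` on each entry of a form's field list gives a first-pass HTML form; multi-part `name` and `address` fields will likely need hand-splitting into separate inputs afterwards.
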
## Archive directory layout note

The AIOIM .wpress format extracts flat — no `wp-content/` wrapper:

```
wpress-extract/
├── database.sql          ← NOT in wp-content/
├── package.json
├── uploads/              ← NOT wp-content/uploads/
├── themes/               ← NOT wp-content/themes/
├── plugins/              ← NOT wp-content/plugins/
└── et-cache/
```

Scripts must reference `uploads/`, `themes/`, `plugins/` directly under `wpress-extract/`, not `wpress-extract/wp-content/`.

## Next step

Once `pages.json` is written, proceed to `03-divi-content-extraction.md` to parse `content_raw` for each page into structured AM-ready HTML.
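Before starting extraction, it can help to check which extractor each page will need. The sketch below applies the two signals from the Divi version detection table to each page's `content_raw`. It assumes `pages.json` is a JSON array of entries shaped like the example in this document; the helper names (`detect_divi_version`, `plan_extraction`, `load_pages`) are hypothetical and not part of `analyze_db.py`. Checking the Divi 5 signal first is also an assumption.

```python
import json
from pathlib import Path

# Signals taken from the Divi version detection table in this document.
DIVI5_SIGNAL = "wp:divi/"
DIVI4_SIGNAL = "[et_pb_section"

# Extractor script names as given in this document.
EXTRACTORS = {"5": "extract_divi5.py", "4": "extract_divi4.py"}


def detect_divi_version(content_raw):
    """Return "5", "4", or None based on the markup signals in one page."""
    if DIVI5_SIGNAL in content_raw:  # assumption: Divi 5 signal wins if both appear
        return "5"
    if DIVI4_SIGNAL in content_raw:
        return "4"
    return None


def plan_extraction(pages):
    """Group page slugs by the extractor script they would need.

    Pages with no Divi markup are grouped under the key None.
    """
    plan = {}
    for page in pages:
        version = detect_divi_version(page.get("content_raw") or "")
        script = EXTRACTORS.get(version)
        plan.setdefault(script, []).append(page.get("slug"))
    return plan


def load_pages(data_dir):
    """Load pages.json from the data directory (assumed to be a JSON array)."""
    path = Path(data_dir) / "pages.json"
    return json.loads(path.read_text(encoding="utf-8"))
```

For example, `plan_extraction(load_pages("{domain}/.planning/data"))` returns a dict keyed by extractor script name. Any slugs listed under `None` contain neither signal and are worth inspecting by hand before running the extractors.
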