07 — SEO Preservation

Before building HTML, map every WordPress page URL to its new AM URL and ensure title, description, canonical, and schema.org are preserved or improved.

Step 1 — Inventory all WP URLs

Extract every published page slug from pages.json:

python3 -c "
import json
pages = json.load(open('.planning/data/pages.json'))
for p in pages:
    slug = p['slug']
    ptype = p['post_type']
    print(f'/{slug}/  ({ptype})  title={p[\"title\"]!r}')
" | tee .planning/data/wp-url-inventory.txt

Step 2 — Build redirect map

Map each WP URL to the new AM URL. Write to .planning/data/redirect-map.txt:

Format: OLD_PATH -> NEW_PATH

Common mapping patterns for yoga sites:

Old WP URL	New AM URL	Action
`/`	`/`	Same
`/about/`	`/about/`	Same
`/classes/`	`/classes/`	Same
`/yoga-class-name/`	`/classes/yoga-class-name.html`	Restructure
`/private-yoga-sessions/`	`/private-sessions/`	Rename
`/contact-us/`	`/contact/`	Simplify
`/?page_id=42`	`/about/`	WP ID → slug
`/blog/post-title/`	`/blog/post-title.html`	Flatten
`/events/event-name/`	`/classes/`	Consolidate into schedule

Redirects go into infra/nginx.conf:

# Exact-match redirects
location = /contact-us/ { return 301 /contact/; }
location = /private-yoga-sessions/ { return 301 /private-sessions/; }

# WP page ID redirects
location = / {
  if ($arg_page_id = "42") { return 301 /about/; }
  if ($arg_p) { return 301 /blog/; }
}

# WP upload URLs → AM asset paths (catch-all)
location ^~ /wp-content/uploads/ {
  return 301 /assets/images/$uri;
}

# Block all WP URLs
location ~ ^/wp-(admin|login|json|cron|includes|content/plugins|content/themes) {
  return 410;
}

Step 3 — Rank Math SEO extraction

Rank Math stores titles and descriptions in wp_postmeta. analyze_db.py already extracts these into pages.json as seo_title and seo_description.

For each page, the priority order for SEO fields:

seo_title from Rank Math (if not empty and not a template like %title% - %sitename%)
post_title with AM format appended: {Title} | VibrantYou Yoga
Never leave title as the raw WP default

Rank Math title templates use % tokens — strip them and rebuild:

import re

def clean_rm_title(rm_title: str, post_title: str, site_name: str) -> str:
    if not rm_title or "%" in rm_title:
        return f"{post_title} | {site_name}"
    return rm_title

def clean_rm_desc(rm_desc: str) -> str:
    # Strip %token% placeholders
    return re.sub(r"%[a-z_]+%", "", rm_desc).strip(" -|")

Step 4 — Per-page SEO checklist

For every page in pages.json, fill in this record before writing HTML:

{
  "slug":         "about",
  "new_path":     "/about/",
  "canonical":    "https://vibrantyou.yoga/about/",
  "title":        "About VibrantYou Yoga | Mindful Movement in [City], [State]",
  "description":  "Meet the instructors and story behind VibrantYou Yoga. [150-160 chars, include city]",
  "keywords":     "yoga studio [city], yoga instructor, mindful movement",
  "og_image":     "/assets/images/about-studio.webp",
  "schema_type":  "AboutPage",
  "h1":           "Our Story"
}

Write to .planning/data/seo-map.json. The HTML build reads this file to stamp <head> tags.

Step 5 — Schema.org per page type

Page	Schema type	Required fields
Home	`LocalBusiness`	name, url, telephone, address, areaServed, openingHours
About	`AboutPage` + `Organization`	name, description, founders
Classes index	`ItemList` of `Course`	name, url, description per class
Class detail	`Course`	name, description, provider, educationalLevel
Contact	`ContactPage`	name, url, telephone, email, address
Blog post	`Article`	headline, datePublished, author, image
404	none	—

LocalBusiness schema for vibrantyou.yoga (seed from site-info.json):

{
  "@context": "https://schema.org",
  "@type": ["LocalBusiness", "HealthAndBeautyBusiness"],
  "@id": "https://vibrantyou.yoga/#business",
  "name": "VibrantYou Yoga",
  "url": "https://vibrantyou.yoga",
  "telephone": "",
  "priceRange": "$$",
  "servesCuisine": null,
  "currenciesAccepted": "USD",
  "paymentAccepted": "Cash, Credit Card",
  "address": {
    "@type": "PostalAddress",
    "streetAddress": "",
    "addressLocality": "",
    "addressRegion": "",
    "postalCode": "",
    "addressCountry": "US"
  }
}

Mark address fields DRAFT NEEDED — do not fabricate. Pull from wp_options (admin_email, Events Manager location settings) or ask client.

Step 6 — Pre-launch SEO audit commands

Run these before declaring the build complete:

SITE=src

# Every page has a <title>
find $SITE -name "*.html" | xargs grep -L '<title>' | grep -v "_template"

# Every page has meta description
find $SITE -name "*.html" | xargs grep -L 'name="description"' | grep -v "_template"

# Every page has canonical
find $SITE -name "*.html" | xargs grep -L 'rel="canonical"' | grep -v "_template"

# Every page has JSON-LD
find $SITE -name "*.html" | xargs grep -L 'application/ld+json' | grep -v "_template"

# No WP URLs leaked into HTML
grep -r "wp-content\|wp-admin\|wordpress\|?p=\|?page_id=" $SITE --include="*.html"

# No unreplaced template placeholders
grep -r "{{" $SITE --include="*.html"

# No Divi class residue
grep -r "et_pb_\|divi-builder" $SITE --include="*.html"

All six commands must return zero results before launch.

Next step

Proceed to 08-run-order.md for the complete execution sequence, then 02-wordpress-to-html-migration.md Phase 7 for DNS cutover.

5.6 KiB Raw Blame History