Files
2026-06-09 18:31:59 +02:00

183 lines
5.6 KiB
Markdown

# 07 — SEO Preservation
Before building HTML, map every WordPress page URL to its new AM URL and
ensure title, description, canonical, and schema.org are preserved or improved.
## Step 1 — Inventory all WP URLs
Extract every published page slug from `pages.json`:
```bash
python3 -c "
import json
pages = json.load(open('.planning/data/pages.json'))
for p in pages:
slug = p['slug']
ptype = p['post_type']
print(f'/{slug}/ ({ptype}) title={p[\"title\"]!r}')
" | tee .planning/data/wp-url-inventory.txt
```
## Step 2 — Build redirect map
Map each WP URL to the new AM URL. Write to `.planning/data/redirect-map.txt`:
Format: `OLD_PATH -> NEW_PATH`
Common mapping patterns for yoga sites:
| Old WP URL | New AM URL | Action |
|-----------|-----------|--------|
| `/` | `/` | Same |
| `/about/` | `/about/` | Same |
| `/classes/` | `/classes/` | Same |
| `/yoga-class-name/` | `/classes/yoga-class-name.html` | Restructure |
| `/private-yoga-sessions/` | `/private-sessions/` | Rename |
| `/contact-us/` | `/contact/` | Simplify |
| `/?page_id=42` | `/about/` | WP ID → slug |
| `/blog/post-title/` | `/blog/post-title.html` | Flatten |
| `/events/event-name/` | `/classes/` | Consolidate into schedule |
Redirects go into `infra/nginx.conf`:
```nginx
# Exact-match redirects
location = /contact-us/ { return 301 /contact/; }
location = /private-yoga-sessions/ { return 301 /private-sessions/; }
# WP page ID redirects
location = / {
if ($arg_page_id = "42") { return 301 /about/; }
if ($arg_p) { return 301 /blog/; }
}
# WP upload URLs → AM asset paths (catch-all)
location ^~ /wp-content/uploads/ {
return 301 /assets/images/$uri;
}
# Block all WP URLs
location ~ ^/wp-(admin|login|json|cron|includes|content/plugins|content/themes) {
return 410;
}
```
## Step 3 — Rank Math SEO extraction
Rank Math stores titles and descriptions in `wp_postmeta`.
`analyze_db.py` already extracts these into `pages.json` as `seo_title` and `seo_description`.
For each page, the priority order for SEO fields:
1. `seo_title` from Rank Math (if not empty and not a template like `%title% - %sitename%`)
2. `post_title` with AM format appended: `{Title} | VibrantYou Yoga`
3. Never leave title as the raw WP default
Rank Math title templates use `%` tokens — strip them and rebuild:
```python
import re
def clean_rm_title(rm_title: str, post_title: str, site_name: str) -> str:
if not rm_title or "%" in rm_title:
return f"{post_title} | {site_name}"
return rm_title
def clean_rm_desc(rm_desc: str) -> str:
# Strip %token% placeholders
return re.sub(r"%[a-z_]+%", "", rm_desc).strip(" -|")
```
## Step 4 — Per-page SEO checklist
For every page in `pages.json`, fill in this record before writing HTML:
```json
{
"slug": "about",
"new_path": "/about/",
"canonical": "https://vibrantyou.yoga/about/",
"title": "About VibrantYou Yoga | Mindful Movement in [City], [State]",
"description": "Meet the instructors and story behind VibrantYou Yoga. [150-160 chars, include city]",
"keywords": "yoga studio [city], yoga instructor, mindful movement",
"og_image": "/assets/images/about-studio.webp",
"schema_type": "AboutPage",
"h1": "Our Story"
}
```
Write to `.planning/data/seo-map.json`. The HTML build reads this file to
stamp `<head>` tags.
## Step 5 — Schema.org per page type
| Page | Schema type | Required fields |
|------|------------|----------------|
| Home | `LocalBusiness` | name, url, telephone, address, areaServed, openingHours |
| About | `AboutPage` + `Organization` | name, description, founders |
| Classes index | `ItemList` of `Course` | name, url, description per class |
| Class detail | `Course` | name, description, provider, educationalLevel |
| Contact | `ContactPage` | name, url, telephone, email, address |
| Blog post | `Article` | headline, datePublished, author, image |
| 404 | none | — |
LocalBusiness schema for vibrantyou.yoga (seed from `site-info.json`):
```json
{
"@context": "https://schema.org",
"@type": ["LocalBusiness", "HealthAndBeautyBusiness"],
"@id": "https://vibrantyou.yoga/#business",
"name": "VibrantYou Yoga",
"url": "https://vibrantyou.yoga",
"telephone": "",
"priceRange": "$$",
"servesCuisine": null,
"currenciesAccepted": "USD",
"paymentAccepted": "Cash, Credit Card",
"address": {
"@type": "PostalAddress",
"streetAddress": "",
"addressLocality": "",
"addressRegion": "",
"postalCode": "",
"addressCountry": "US"
}
}
```
Mark address fields `DRAFT NEEDED` — do not fabricate. Pull from `wp_options`
(`admin_email`, Events Manager location settings) or ask client.
## Step 6 — Pre-launch SEO audit commands
Run these before declaring the build complete:
```bash
SITE=src
# Every page has a <title>
find $SITE -name "*.html" | xargs grep -L '<title>' | grep -v "_template"
# Every page has meta description
find $SITE -name "*.html" | xargs grep -L 'name="description"' | grep -v "_template"
# Every page has canonical
find $SITE -name "*.html" | xargs grep -L 'rel="canonical"' | grep -v "_template"
# Every page has JSON-LD
find $SITE -name "*.html" | xargs grep -L 'application/ld+json' | grep -v "_template"
# No WP URLs leaked into HTML
grep -r "wp-content\|wp-admin\|wordpress\|?p=\|?page_id=" $SITE --include="*.html"
# No unreplaced template placeholders
grep -r "{{" $SITE --include="*.html"
# No Divi class residue
grep -r "et_pb_\|divi-builder" $SITE --include="*.html"
```
All six commands must return zero results before launch.
## Next step
Proceed to `08-run-order.md` for the complete execution sequence,
then `02-wordpress-to-html-migration.md` Phase 7 for DNS cutover.