# 07 — SEO Preservation

Before building HTML, map every WordPress page URL to its new AM URL and
ensure that each page's title, description, canonical URL, and schema.org
markup are preserved or improved.

## Step 1 — Inventory all WP URLs

Extract every published page slug from `pages.json`:

```bash
# Quoted heredoc: the shell leaves `!`, `$`, and quotes in the Python source alone.
python3 - <<'PY' | tee .planning/data/wp-url-inventory.txt
import json

with open('.planning/data/pages.json') as f:
    pages = json.load(f)

for p in pages:
    slug = p['slug']
    ptype = p['post_type']
    print(f'/{slug}/ ({ptype}) title={p["title"]!r}')
PY
```

## Step 2 — Build redirect map

Map each WP URL to the new AM URL and write the result to
`.planning/data/redirect-map.txt`, one mapping per line in the format
`OLD_PATH -> NEW_PATH`.

Common mapping patterns for yoga sites:

| Old WP URL | New AM URL | Action |
|-----------|-----------|--------|
| `/` | `/` | Same |
| `/about/` | `/about/` | Same |
| `/classes/` | `/classes/` | Same |
| `/yoga-class-name/` | `/classes/yoga-class-name.html` | Restructure |
| `/private-yoga-sessions/` | `/private-sessions/` | Rename |
| `/contact-us/` | `/contact/` | Simplify |
| `/?page_id=42` | `/about/` | WP ID → slug |
| `/blog/post-title/` | `/blog/post-title.html` | Flatten |
| `/events/event-name/` | `/classes/` | Consolidate into schedule |

Redirects go into `infra/nginx.conf`:

```nginx
# Exact-match redirects
location = /contact-us/            { return 301 /contact/; }
location = /private-yoga-sessions/ { return 301 /private-sessions/; }

# WP page ID redirects (?page_id=N and ?p=N arrive on "/")
location = / {
    if ($arg_page_id = "42") { return 301 /about/; }
    if ($arg_p)              { return 301 /blog/; }
}

# WP upload URLs → AM asset paths (catch-all).
# $uri would repeat the /wp-content/uploads/ prefix in the target, so capture
# only the part after it. Assumes the subpath is kept under /assets/images/.
location ^~ /wp-content/uploads/ {
    rewrite ^/wp-content/uploads/(.*)$ /assets/images/$1 permanent;
}

# Retire the remaining WP endpoints with 410 Gone
location ~ ^/wp-(admin|login|json|cron|includes|content/plugins|content/themes) {
    return 410;
}
```

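The exact-match rules can be generated from the redirect map rather than typed by hand. The following is a minimal sketch, assuming each line of `redirect-map.txt` holds exactly one `OLD_PATH -> NEW_PATH` pair; the function names `parse_redirect_map` and `to_nginx` are hypothetical, and skipping `#` comment lines is an assumption about the file, not a documented feature.

```python
from typing import Iterable


def parse_redirect_map(lines: Iterable[str]) -> list[tuple[str, str]]:
    """Parse `OLD_PATH -> NEW_PATH` lines into (old, new) pairs.

    Blank lines, `#` comment lines (an assumption), and mappings whose
    old and new paths are identical are skipped.
    """
    pairs = []
    for raw in lines:
        line = raw.strip()
        if not line or line.startswith("#"):
            continue
        old, sep, new = line.partition("->")
        if not sep:
            raise ValueError(f"malformed mapping: {raw!r}")
        old, new = old.strip(), new.strip()
        if old != new:
            pairs.append((old, new))
    return pairs


def to_nginx(pairs: list[tuple[str, str]]) -> list[str]:
    """Render exact-match 301 rules.

    Query-string URLs such as /?page_id=42 cannot be matched by a
    `location` block, so they are left for the hand-written $arg_ rules.
    """
    return [
        f"location = {old} {{ return 301 {new}; }}"
        for old, new in pairs
        if "?" not in old
    ]
```

To use it, pass the open file object for `.planning/data/redirect-map.txt` to `parse_redirect_map`, then paste the joined output of `to_nginx` into `infra/nginx.conf`. Pattern rows from the table (for example every `/events/...` URL) still need a prefix or regex rule written by hand.
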
## Step 3 — Rank Math SEO extraction

Rank Math stores titles and descriptions in `wp_postmeta`.
`analyze_db.py` already extracts these into `pages.json` as `seo_title` and `seo_description`.

For each page, choose the title in this priority order:

1. `seo_title` from Rank Math (if not empty and not a template like `%title% - %sitename%`)
2. `post_title` with the AM format appended: `{Title} | VibrantYou Yoga`

Never leave the title as the raw WP default.

Rank Math templates use `%token%` placeholders. A templated title is rebuilt from `post_title`; placeholders in a description are stripped:

```python
import re

# Rank Math placeholders such as %title% or %sep%, optionally with an
# argument in parentheses
RM_TOKEN = re.compile(r"%[a-z_]+(?:\([^)]*\))?%")


def clean_rm_title(rm_title: str, post_title: str, site_name: str) -> str:
    # Fall back when the title is empty or still contains a placeholder.
    # A literal percent sign ("50% off") does not trigger the fallback.
    if not rm_title or RM_TOKEN.search(rm_title):
        return f"{post_title} | {site_name}"
    return rm_title


def clean_rm_desc(rm_desc: str) -> str:
    # Strip %token% placeholders, then trim leftover separators
    return RM_TOKEN.sub("", rm_desc or "").strip(" -|")
```

## Step 4 — Per-page SEO checklist

For every page in `pages.json`, fill in this record before writing HTML:

```json
{
  "slug": "about",
  "new_path": "/about/",
  "canonical": "https://vibrantyou.yoga/about/",
  "title": "About VibrantYou Yoga | Mindful Movement in [City], [State]",
  "description": "Meet the instructors and story behind VibrantYou Yoga. [150-160 chars, include city]",
  "keywords": "yoga studio [city], yoga instructor, mindful movement",
  "og_image": "/assets/images/about-studio.webp",
  "schema_type": "AboutPage",
  "h1": "Our Story"
}
```

Write the records to `.planning/data/seo-map.json`. The HTML build reads this file to
stamp `<head>` tags.

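How the build stamps the tags is not specified here, so the following is only a sketch of one way to do it. `render_head` is a hypothetical helper, not part of the existing build; it assumes a record shaped like the example above and resolves the site-relative `og_image` against the canonical URL so the Open Graph tag carries an absolute URL.

```python
from html import escape
from urllib.parse import urljoin


def render_head(record: dict) -> str:
    """Render the SEO-related <head> tags for one seo-map record."""
    # Resolve a site-relative image path against the page's canonical URL
    og_image = urljoin(record["canonical"], record["og_image"])

    lines = [
        f"<title>{escape(record['title'])}</title>",
        f'<meta name="description" content="{escape(record["description"], quote=True)}">',
        f'<link rel="canonical" href="{escape(record["canonical"], quote=True)}">',
        f'<meta property="og:image" content="{escape(og_image, quote=True)}">',
    ]
    return "\n".join(lines)
```

Escaping every value keeps a stray quote or ampersand in a Rank Math description from breaking the attribute it lands in.
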
## Step 5 — Schema.org per page type

| Page | Schema type | Required fields |
|------|------------|----------------|
| Home | `LocalBusiness` | name, url, telephone, address, areaServed, openingHours |
| About | `AboutPage` + `Organization` | name, description, founders |
| Classes index | `ItemList` of `Course` | name, url, description per class |
| Class detail | `Course` | name, description, provider, educationalLevel |
| Contact | `ContactPage` | name, url, telephone, email, address |
| Blog post | `Article` | headline, datePublished, author, image |
| 404 | none | n/a |

LocalBusiness schema for vibrantyou.yoga (seed from `site-info.json`):

```json
{
  "@context": "https://schema.org",
  "@type": ["LocalBusiness", "HealthAndBeautyBusiness"],
  "@id": "https://vibrantyou.yoga/#business",
  "name": "VibrantYou Yoga",
  "url": "https://vibrantyou.yoga",
  "telephone": "",
  "priceRange": "$$",
  "currenciesAccepted": "USD",
  "paymentAccepted": "Cash, Credit Card",
  "address": {
    "@type": "PostalAddress",
    "streetAddress": "",
    "addressLocality": "",
    "addressRegion": "",
    "postalCode": "",
    "addressCountry": "US"
  }
}
```

`servesCuisine` is omitted because it applies only to food establishments.
Mark address fields `DRAFT NEEDED`; do not fabricate them. Pull the values from `wp_options`
(`admin_email`, Events Manager location settings) or ask the client.

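Marking the empty fields can be automated so that none are missed. This is a minimal sketch with hypothetical helper names (`mark_drafts`, `draft_paths`); it treats every empty string or null in the structure as a draft, which is broader than the address fields alone and also flags `telephone`.

```python
DRAFT = "DRAFT NEEDED"


def mark_drafts(node):
    """Return a copy of a JSON-LD structure with empty values replaced by DRAFT."""
    if isinstance(node, dict):
        return {key: mark_drafts(value) for key, value in node.items()}
    if isinstance(node, list):
        return [mark_drafts(value) for value in node]
    if node is None or (isinstance(node, str) and not node.strip()):
        return DRAFT
    return node


def draft_paths(node, prefix=""):
    """List the dotted path of every field still marked DRAFT."""
    if isinstance(node, dict):
        return [p for k, v in node.items() for p in draft_paths(v, f"{prefix}{k}.")]
    if isinstance(node, list):
        return [p for i, v in enumerate(node) for p in draft_paths(v, f"{prefix}{i}.")]
    return [prefix.rstrip(".")] if node == DRAFT else []
```

Running `draft_paths` on the marked schema gives a short list of values to request from the client, and a non-empty list can fail the build until they arrive.
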
## Step 6 — Pre-launch SEO audit commands

Run these before declaring the build complete:

```bash
SITE=src

# find -exec ... {} + copes with spaces in file names and runs nothing
# when no files match (xargs would run grep once on empty input).

# Every page has a <title>
find "$SITE" -name "*.html" -exec grep -L '<title>' {} + | grep -v "_template"

# Every page has meta description
find "$SITE" -name "*.html" -exec grep -L 'name="description"' {} + | grep -v "_template"

# Every page has canonical
find "$SITE" -name "*.html" -exec grep -L 'rel="canonical"' {} + | grep -v "_template"

# Every page has JSON-LD
find "$SITE" -name "*.html" -exec grep -L 'application/ld+json' {} + | grep -v "_template"

# No WP URLs leaked into HTML
grep -r "wp-content\|wp-admin\|wordpress\|?p=\|?page_id=" "$SITE" --include="*.html"

# No unreplaced template placeholders
grep -r "{{" "$SITE" --include="*.html"

# No Divi class residue
grep -r "et_pb_\|divi-builder" "$SITE" --include="*.html"
```

All seven commands must return zero results before launch.

## Next step

Proceed to `08-run-order.md` for the complete execution sequence,
then `02-wordpress-to-html-migration.md` Phase 7 for DNS cutover.