recent updates

2026-06-09 18:31:59 +02:00
parent 398b94965c
commit 94f7a1f72a
42 changed files with 8686 additions and 0 deletions
@@ -0,0 +1,182 @@
+# 07 — SEO Preservation
+
+Before building HTML, map every WordPress page URL to its new AM URL and
+ensure each page's title, meta description, canonical URL, and schema.org
+markup are preserved or improved.
+
+## Step 1 — Inventory all WP URLs
+
+Extract every published page slug from `pages.json`:
+
+```bash
+python3 -c "
+import json
+pages = json.load(open('.planning/data/pages.json'))
+for p in pages:
+    slug = p['slug']
+    ptype = p['post_type']
+    print(f'/{slug}/  ({ptype})  title={p[\"title\"]!r}')
+" | tee .planning/data/wp-url-inventory.txt
+```
+
+## Step 2 — Build redirect map
+
+Map each WP URL to the new AM URL. Write to `.planning/data/redirect-map.txt`:
+
+Format: `OLD_PATH -> NEW_PATH`
+
+Common mapping patterns for yoga sites:
+
+| Old WP URL | New AM URL | Action |
+|-----------|-----------|--------|
+| `/` | `/` | Same |
+| `/about/` | `/about/` | Same |
+| `/classes/` | `/classes/` | Same |
+| `/yoga-class-name/` | `/classes/yoga-class-name.html` | Restructure |
+| `/private-yoga-sessions/` | `/private-sessions/` | Rename |
+| `/contact-us/` | `/contact/` | Simplify |
+| `/?page_id=42` | `/about/` | WP ID → slug |
+| `/blog/post-title/` | `/blog/post-title.html` | Flatten |
+| `/events/event-name/` | `/classes/` | Consolidate into schedule |
+
+Redirects go into `infra/nginx.conf`:
+
+```nginx
+# Exact-match redirects
+location = /contact-us/ { return 301 /contact/; }
+location = /private-yoga-sessions/ { return 301 /private-sessions/; }
+
+# WP page ID redirects
+location = / {
+  if ($arg_page_id = "42") { return 301 /about/; }
+  if ($arg_p) { return 301 /blog/; }
+}
+
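+# WP upload URLs → AM asset paths (catch-all).
+# Capture only the part after /wp-content/uploads/; $uri holds the full
+# original path and would produce /assets/images//wp-content/uploads/...
+# Assumes the upload subpath is kept as-is under /assets/images/.
+location ^~ /wp-content/uploads/ {
+  rewrite ^/wp-content/uploads/(.*)$ /assets/images/$1 permanent;
+}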
+
+# Return 410 Gone for WP system paths (admin, login, REST API, cron, plugins, themes)
+location ~ ^/wp-(admin|login|json|cron|includes|content/plugins|content/themes) {
+  return 410;
+}
+```
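+
+As a rough sketch, the exact-match rules could be generated from
+`redirect-map.txt` instead of being written by hand. The function names
+`parse_redirect_map` and `to_nginx` are hypothetical, and skipping blank lines
+and `#` comments is an assumption about the file, not part of the format above.
+Query-string URLs such as `/?page_id=42` are left for hand-written rules,
+because `location =` matches only the path.
+
+```python
+from typing import Iterable
+
+def parse_redirect_map(lines: Iterable[str]) -> list[tuple[str, str]]:
+    """Parse 'OLD_PATH -> NEW_PATH' lines into (old, new) pairs."""
+    pairs = []
+    for raw in lines:
+        line = raw.strip()
+        if not line or line.startswith("#"):  # assumption: blanks/comments allowed
+            continue
+        old, sep, new = line.partition("->")
+        if not sep:
+            raise ValueError(f"malformed redirect line: {raw!r}")
+        pairs.append((old.strip(), new.strip()))
+    return pairs
+
+def to_nginx(pairs: Iterable[tuple[str, str]]) -> list[str]:
+    """Emit exact-match 301 rules; unchanged and query-string URLs are skipped."""
+    rules = []
+    for old, new in pairs:
+        if old == new or "?" in old:
+            continue
+        rules.append(f"location = {old} {{ return 301 {new}; }}")
+    return rules
+
+if __name__ == "__main__":
+    sample = ["/contact-us/ -> /contact/", "/about/ -> /about/"]
+    print("\n".join(to_nginx(parse_redirect_map(sample))))
+```
+
+Rows whose action is "Same" produce no rule, so the output stays limited to
+URLs that actually moved.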
+
+## Step 3 — Rank Math SEO extraction
+
+Rank Math stores titles and descriptions in `wp_postmeta`.
+`analyze_db.py` already extracts these into `pages.json` as `seo_title` and `seo_description`.
+
+For each page, choose the title in this priority order:
+1. `seo_title` from Rank Math (if not empty and not a template like `%title% - %sitename%`)
+2. `post_title` with the AM format appended: `{Title} | VibrantYou Yoga`
+
+Never leave the title as the raw WP default.
+
+Rank Math title templates use `%` tokens — strip them and rebuild:
+```python
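+import re
+
+# Matches Rank Math placeholders such as %title%, %sitename%, or
+# parameterised ones like %customfield(name)%.
+RM_TOKEN = re.compile(r"%[a-z_]+(?:\([^)]*\))?%")
+
+def clean_rm_title(rm_title: str, post_title: str, site_name: str) -> str:
+    # Fall back when the stored title is empty or still an unrendered template.
+    # A literal percent sign ("Save 20% on classes") is not treated as a token.
+    if not rm_title or RM_TOKEN.search(rm_title):
+        return f"{post_title} | {site_name}"
+    return rm_title.strip()
+
+def clean_rm_desc(rm_desc: str) -> str:
+    # Strip %token% placeholders, then collapse leftover whitespace and separators
+    cleaned = RM_TOKEN.sub("", rm_desc or "")
+    return re.sub(r"\s+", " ", cleaned).strip(" -|")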
+```
+
+## Step 4 — Per-page SEO checklist
+
+For every page in `pages.json`, fill in this record before writing HTML:
+
+```json
+{
+  "slug":         "about",
+  "new_path":     "/about/",
+  "canonical":    "https://vibrantyou.yoga/about/",
+  "title":        "About VibrantYou Yoga | Mindful Movement in [City], [State]",
+  "description":  "Meet the instructors and story behind VibrantYou Yoga. [150-160 chars, include city]",
+  "keywords":     "yoga studio [city], yoga instructor, mindful movement",
+  "og_image":     "/assets/images/about-studio.webp",
+  "schema_type":  "AboutPage",
+  "h1":           "Our Story"
+}
+```
+
+Write to `.planning/data/seo-map.json`. The HTML build reads this file to
+stamp `<head>` tags.
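+
+A minimal sketch of how the build might stamp those tags, assuming records
+shaped like the example above. `render_head` and `description_in_range` are
+hypothetical helpers, not part of the existing build; the 150-160 character
+range comes from the description note in the record.
+
+```python
+from html import escape
+
+def render_head(record: dict) -> str:
+    """Build the core <head> tags from one seo-map.json record."""
+    tags = [
+        f"<title>{escape(record['title'])}</title>",
+        f'<meta name="description" content="{escape(record["description"])}">',
+        f'<link rel="canonical" href="{escape(record["canonical"])}">',
+    ]
+    return "\n".join(tags)
+
+def description_in_range(record: dict, lo: int = 150, hi: int = 160) -> bool:
+    """True when the description length falls inside the target range."""
+    return lo <= len(record["description"]) <= hi
+```
+
+Escaping each value keeps characters such as `&` or `"` in a title from
+breaking the generated markup.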
+
+## Step 5 — Schema.org per page type
+
+| Page | Schema type | Required fields |
+|------|------------|----------------|
+| Home | `LocalBusiness` | name, url, telephone, address, areaServed, openingHours |
+| About | `AboutPage` + `Organization` | name, description, founder |
+| Classes index | `ItemList` of `Course` | name, url, description per class |
+| Class detail | `Course` | name, description, provider, educationalLevel |
+| Contact | `ContactPage` | name, url, telephone, email, address |
+| Blog post | `Article` | headline, datePublished, author, image |
+| 404 | none | — |
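+
+The table can be turned into a quick completeness check. The sketch below
+covers only some rows, and `REQUIRED_FIELDS` and `missing_fields` are
+hypothetical names; it treats empty strings as missing.
+
+```python
+REQUIRED_FIELDS = {
+    "LocalBusiness": ["name", "url", "telephone", "address", "areaServed", "openingHours"],
+    "Course": ["name", "description", "provider", "educationalLevel"],
+    "ContactPage": ["name", "url", "telephone", "email", "address"],
+    "Article": ["headline", "datePublished", "author", "image"],
+}
+
+def missing_fields(jsonld: dict) -> list[str]:
+    """Return required fields that are absent or empty for the node's @type(s)."""
+    types = jsonld.get("@type", [])
+    if isinstance(types, str):
+        types = [types]
+    missing = []
+    for schema_type in types:
+        for field in REQUIRED_FIELDS.get(schema_type, []):
+            if not jsonld.get(field) and field not in missing:
+                missing.append(field)
+    return missing
+```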
+
+LocalBusiness schema for vibrantyou.yoga (seed from `site-info.json`):
+```json
+{
+  "@context": "https://schema.org",
+  "@type": ["LocalBusiness", "HealthAndBeautyBusiness"],
+  "@id": "https://vibrantyou.yoga/#business",
+  "name": "VibrantYou Yoga",
+  "url": "https://vibrantyou.yoga",
+  "telephone": "",
+  "priceRange": "$$",
+  "currenciesAccepted": "USD",
+  "paymentAccepted": "Cash, Credit Card",
+  "address": {
+    "@type": "PostalAddress",
+    "streetAddress": "",
+    "addressLocality": "",
+    "addressRegion": "",
+    "postalCode": "",
+    "addressCountry": "US"
+  }
+}
+```
+Mark the empty telephone and address fields `DRAFT NEEDED`; do not fabricate
+values. Pull contact details from `wp_options` (`admin_email`, Events Manager
+location settings) or ask the client.
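+
+One way to apply that marking mechanically is sketched below, assuming the
+schema has been loaded into a Python dict. `mark_drafts` is a hypothetical
+helper; it changes the dict in place and reports which paths it touched.
+
+```python
+def mark_drafts(node, path=""):
+    """Replace empty-string values with 'DRAFT NEEDED'; return the paths changed."""
+    touched = []
+    if isinstance(node, dict):
+        for key, value in node.items():
+            child = f"{path}.{key}" if path else key
+            if value == "":
+                node[key] = "DRAFT NEEDED"  # overwrite in place, never invent data
+                touched.append(child)
+            else:
+                touched.extend(mark_drafts(value, child))
+    elif isinstance(node, list):
+        for index, item in enumerate(node):
+            touched.extend(mark_drafts(item, f"{path}[{index}]"))
+    return touched
+```
+
+The returned list doubles as a checklist of values still owed by the client.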
+
+## Step 6 — Pre-launch SEO audit commands
+
+Run these before declaring the build complete:
+
+```bash
+SITE=src
+
+# Every page has a <title>
+find "$SITE" -name "*.html" -print0 | xargs -0 grep -L '<title>' | grep -v "_template"
+
+# Every page has meta description
+find "$SITE" -name "*.html" -print0 | xargs -0 grep -L 'name="description"' | grep -v "_template"
+
+# Every page has canonical
+find "$SITE" -name "*.html" -print0 | xargs -0 grep -L 'rel="canonical"' | grep -v "_template"
+
+# Every page has JSON-LD
+find "$SITE" -name "*.html" -print0 | xargs -0 grep -L 'application/ld+json' | grep -v "_template"
+
+# No WP URLs leaked into HTML (case-insensitive, so "WordPress" is caught too)
+grep -ri "wp-content\|wp-admin\|wordpress\|?p=\|?page_id=" "$SITE" --include="*.html"
+
+# No unreplaced template placeholders
+grep -r "{{" "$SITE" --include="*.html"
+
+# No Divi class residue
+grep -r "et_pb_\|divi-builder" "$SITE" --include="*.html"
+```
+
+All seven commands must return zero results before launch.
+
+## Next step
+
+Proceed to `08-run-order.md` for the complete execution sequence,
+then `02-wordpress-to-html-migration.md` Phase 7 for DNS cutover.