Every page declares its canonical. Every surface is indexed under the domain it serves. Every sitemap references governed URLs only. 255 or reject.
Constraints
MUST: Emit <link rel="canonical"> on every page (absolute URL, lowercase, custom domain)
MUST: Generate sitemap.xml per fleet site via jekyll-sitemap plugin
MUST: Generate robots.txt per fleet site via build-surfaces (emit_robots_txt)
MUST: Exclude governance internals from sitemap via jekyll-exclude.py (DIMS + PRIVATE_DIRS)
MUST: Declare public/private boundary in magic.py PRIVATE_DIRS (single source of truth)
MUST: Reference custom domains in all public URLs — never origin domains
MUST: Emit JSON-LD structured data per page type
MUST: Verify all fleet domains in Google Search Console
MUST: Wire analytics (GA4, Google Ads, Meta Pixel, LinkedIn, Twitter/X) via HTTP.md contract — not hardcoded
MUST: Compose INTEL primitive — every claim evidence-linked
MUST: Maintain LEARNING.md — every indexing gap is a pattern
MUST NOT: Index origin URLs as canonical
MUST NOT: Reference origin domains in any sitemap, canonical tag, or OG tag
MUST NOT: Hardcode analytics IDs — they flow from HTTP.md → _config.yml → HEAD.html
MUST NOT: Hand-code robots.txt or _config.yml exclude blocks — compile from magic.py boundary
MUST NOT: Include governance files (CANON.md, COVERAGE.md, LEARNING.md, etc.) in sitemap
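The compile-from-boundary constraints above can be sketched as follows. This is a minimal illustration, not the actual build_surfaces_fleet.py or jekyll-exclude.py code: the PRIVATE_DIRS values, both function signatures, and the sitemap URL handling are assumptions made for the example.

```python
# Hypothetical boundary declaration; per the constraints this lives in magic.py.
PRIVATE_DIRS = ("SERVICES", "CHARTER")  # example values only


def emit_robots_txt(private_dirs, site_url):
    """Render robots.txt text from the private-directory boundary."""
    lines = ["User-agent: *"]
    for name in sorted(private_dirs):
        lines.append(f"Disallow: /{name}/")
    lines.append("")
    # The sitemap reference uses the custom domain, never the origin.
    lines.append(f"Sitemap: {site_url.rstrip('/')}/sitemap.xml")
    return "\n".join(lines) + "\n"


def jekyll_exclude(private_dirs, gov_filenames):
    """Build a _config.yml exclude list from the same boundary."""
    return sorted(f"{name}/" for name in private_dirs) + sorted(gov_filenames)
```

Because both outputs derive from one tuple, adding a private directory updates robots.txt and the Jekyll exclude list in the same build. The Disallow paths must match the casing actually served at the edge.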
Every page emits <link rel="canonical"> via DESIGN theme HEAD.html. The canonical URL is:
Absolute — includes https:// + domain
Lowercase — matches Cloudflare case normalization (HTTP.md Routing)
Custom domain — uses site.url from _config.yml, never origin domain
For proxy domains (mammochat.com, mammo.chat), Cloudflare Workers must rewrite the canonical tag to use the proxy domain, not the origin domain.
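The three canonical rules and the proxy rewrite can be sketched in Python. This is an illustration under stated assumptions: the real tag is emitted by HEAD.html and the real rewrite runs in a Cloudflare Worker, so the function names below are hypothetical, and the regex assumes the tag is written with rel before href and double-quoted attributes.

```python
import re
from urllib.parse import urlsplit, urlunsplit


def canonical_url(site_url, page_path):
    """Build an absolute, lowercase canonical URL on the custom domain."""
    site = urlsplit(site_url)
    if site.scheme != "https" or not site.netloc:
        raise ValueError(f"site.url must be an absolute https URL: {site_url!r}")
    path = "/" + page_path.lstrip("/")
    return urlunsplit(("https", site.netloc.lower(), path.lower(), "", ""))


# Assumes the attribute order rel, href (hypothetical; check HEAD.html).
_CANONICAL_RE = re.compile(r'(<link\s+rel="canonical"\s+href=")([^"]*)(")')


def rewrite_canonical(html, proxy_host):
    """Swap the host of the canonical tag for the proxy domain."""
    def _swap(match):
        parts = urlsplit(match.group(2))
        url = urlunsplit((parts.scheme, proxy_host, parts.path, parts.query, ""))
        return match.group(1) + url + match.group(3)

    return _CANONICAL_RE.sub(_swap, html)
```

For example, `canonical_url("https://hadleylab.org", "PAPERS/Example/")` yields a lowercase path on the custom domain, and `rewrite_canonical(html, "mammochat.com")` leaves everything except the canonical host untouched.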
Sitemaps
jekyll-sitemap is declared in the _config.yml plugins array, which build-surfaces generates from the HTTP.md contract. Sitemap URLs use site.url (custom domain). robots.txt references sitemap.xml at the site root.
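A post-build check of the sitemap rules might look like the sketch below. It is not part of the documented toolchain: the function name, the tuple-based result, and the case-insensitive match on the first path segment are assumptions.

```python
import xml.etree.ElementTree as ET
from urllib.parse import urlsplit

# Namespace defined by the sitemaps.org protocol.
SITEMAP_NS = {"sm": "http://www.sitemaps.org/schemas/sitemap/0.9"}


def sitemap_violations(xml_text, custom_host, private_dirs=()):
    """Return (url, reason) pairs for sitemap entries that break the rules."""
    private = {d.lower() for d in private_dirs}
    root = ET.fromstring(xml_text)
    violations = []
    for loc in root.findall("sm:url/sm:loc", SITEMAP_NS):
        url = (loc.text or "").strip()
        parts = urlsplit(url)
        segments = [s for s in parts.path.split("/") if s]
        if parts.netloc.lower() != custom_host.lower():
            violations.append((url, "non-custom domain"))
        elif segments and segments[0].lower() in private:
            violations.append((url, "private directory"))
    return violations
```

An empty list means every `<loc>` is on the custom domain and outside the private boundary.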
Structured Data
Per-page Schema.org JSON-LD via DESIGN SEO.html include:
| Page Type | Schema.org Type | Source Fields |
| --- | --- | --- |
| Service (CHAT/*) | SoftwareApplication | label, tagline, description |
| Paper | ScholarlyArticle | title, authors, date, description |
| Blog post | BlogPosting | title, date, description |
| Organization | Organization | site.title, site.url, site.description |
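The page-type mapping can be illustrated with a small Python sketch. The actual output comes from the SEO.html include; here the page-type keys, the function names, and the choice of Schema.org properties for each source field (for example tagline as alternativeHeadline) are assumptions, not the theme's confirmed behavior.

```python
import json

# Page type -> Schema.org type, as listed in the table above.
SCHEMA_TYPES = {
    "service": "SoftwareApplication",
    "paper": "ScholarlyArticle",
    "blog": "BlogPosting",
    "organization": "Organization",
}


def json_ld(page_type, fields):
    """Build a JSON-LD dict for one page (property choices are assumed)."""
    doc = {"@context": "https://schema.org", "@type": SCHEMA_TYPES[page_type]}
    if page_type == "service":
        doc["name"] = fields["label"]
        doc["alternativeHeadline"] = fields["tagline"]
    elif page_type == "organization":
        doc["name"] = fields["title"]
        doc["url"] = fields["url"]
    else:  # paper or blog
        doc["headline"] = fields["title"]
        doc["datePublished"] = fields["date"]
        if page_type == "paper":
            doc["author"] = [{"@type": "Person", "name": n} for n in fields["authors"]]
    doc["description"] = fields["description"]
    return doc


def json_ld_script(page_type, fields):
    """Wrap the JSON-LD in a script tag, escaping any closing-tag sequence."""
    body = json.dumps(json_ld(page_type, fields), ensure_ascii=False)
    return '<script type="application/ld+json">' + body.replace("</", "<\\/") + "</script>"
```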
Open Graph
| Tag | Source | Status |
| --- | --- | --- |
| og:type | page.layout | LIVE |
| og:title | page.title | LIVE |
| og:description | page.description | LIVE |
| og:url | page.url (lowercase) | LIVE |
| og:image | site.image | LIVE |
| og:site_name | site.title | LIVE |
| og:locale | en_US | LIVE |
| article:published_time | page.date | LIVE |
| twitter:card | summary_large_image | LIVE |
| twitter:title | page.title | LIVE |
| twitter:description | page.description | LIVE |
| twitter:image | site.image | LIVE |
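As a rough sketch of how the tag table maps onto markup, the Python below renders the same twelve tags from site and page dictionaries. The real tags come from OG.html; the dictionary shape and function name are hypothetical, prefixing site.url to page.url to get an absolute og:url is an assumption, and so is omitting article:published_time on undated pages.

```python
from html import escape


def og_tags(site, page):
    """Render Open Graph and Twitter Card meta tags for one page."""
    # Assumption: og:url is made absolute with site.url (custom domain).
    url = site["url"].rstrip("/") + page["url"].lower()
    pairs = [
        ("property", "og:type", page["layout"]),
        ("property", "og:title", page["title"]),
        ("property", "og:description", page["description"]),
        ("property", "og:url", url),
        ("property", "og:image", site["image"]),
        ("property", "og:site_name", site["title"]),
        ("property", "og:locale", "en_US"),
    ]
    if page.get("date"):  # only dated pages carry a publish time
        pairs.append(("property", "article:published_time", page["date"]))
    pairs += [
        ("name", "twitter:card", "summary_large_image"),
        ("name", "twitter:title", page["title"]),
        ("name", "twitter:description", page["description"]),
        ("name", "twitter:image", site["image"]),
    ]
    return [
        f'<meta {attr}="{key}" content="{escape(str(value), quote=True)}">'
        for attr, key, value in pairs
    ]
```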
Scope
SEO is the search engine optimization and indexing governance service for:
Canonical tags — <link rel="canonical"> per page, lowercase, custom domain
Sitemaps — sitemap.xml generated per fleet site via jekyll-sitemap
Structured data — JSON-LD per page type (Schema.org)
Open Graph — og: meta tags for social link previews
Search Console — Google Search Console verification and monitoring
Analytics — GA4 + Google Ads + Meta Pixel + LinkedIn + Twitter/X wired to fleet domains
Cross-References
| Asset | Location |
| --- | --- |
| HTTP Contract | canonic-canonic/MAGIC/TOOLCHAIN/HTTP.md |
| DESIGN Theme | canonic-canonic/DESIGN |
| INTEL Surface | hadleylab-canonic/SERVICES/SEO/INTEL.md |
| LEARNING Patterns | hadleylab-canonic/SERVICES/SEO/LEARNING.md |
| DOMAINS Service | hadleylab-canonic/SERVICES/DOMAINS |
| HEAD.html | design/_includes/HEAD.html |
| OG.html | design/_includes/OG.html |
*SEO | SERVICE | hadleylab-canonic*
INTEL
Search Visibility Architecture
| Layer | Technology | Governance |
| --- | --- | --- |
| Domain identity | Cloudflare DNS + Workers | HTTP.md § Zones, Domains |
| Canonical resolution | <link rel="canonical"> | HEAD.html (DESIGN theme) |
| Crawl directives | robots.txt + sitemap.xml | jekyll-sitemap plugin |
| Structured data | JSON-LD (Schema.org) | SEO.html (DESIGN theme) |
| Social projection | Open Graph + Twitter Cards | OG.html (DESIGN theme) |
| Analytics collection | GA4, Meta, LinkedIn, X, Google Ads | TRACKING.html (DESIGN theme) |
| Search Console | Google verification + index monitoring | DNS TXT records (Cloudflare) |
Data Sources
| Source | Technology | Access |
| --- | --- | --- |
| Domain fleet | HTTP.md § Zones + Domains tables | 9 domains: 2 fleet, 2 proxy, 5 brand |
| Canonical tags | HEAD.html <link rel="canonical"> | Per-page, absolute, lowercase |
| Sitemaps | jekyll-sitemap plugin | Per-fleet-site sitemap.xml |
| Structured data | SEO.html JSON-LD | Per-page Schema.org type mapping |
| OG metadata | OG.html meta tags | 12 tags per page |
| Analytics IDs | HTTP.md § Credential Registry | 5 providers × 2 fleet domains |
| Search Console | Google Search Console API | DNS TXT verification |
| CSP | HEAD.html Content-Security-Policy | Governs allowed script/connect origins |
Analytics Pipeline
| Stage | Input | Output |
| --- | --- | --- |
| REGISTER | HTTP.md Credential Registry | Provider IDs (PENDING → provisioned) |
| COMPILE | build-surfaces reads HTTP.md | _config.yml with ga4_id, meta_pixel_id, etc. |
| INJECT | TRACKING.html conditional load | Provider scripts (only when ID present) |
| COLLECT | User pageviews + events | Platform dashboards (GA4, Meta, LinkedIn, X, Google Ads) |
| VERIFY | Search Console DNS TXT | Domain ownership + index status |
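The INJECT rule (load a provider script only when its ID is present) can be sketched as a filter over the compiled config. Only ga4_id and meta_pixel_id are key names taken from the pipeline above; the other three keys, the function name, and the treatment of the literal string PENDING as "not provisioned" are assumptions.

```python
# Provider -> _config.yml key. Keys marked hypothetical are not in the source.
PROVIDER_KEYS = {
    "GA4": "ga4_id",
    "Meta Pixel": "meta_pixel_id",
    "Google Ads": "google_ads_id",  # hypothetical key name
    "LinkedIn": "linkedin_id",      # hypothetical key name
    "Twitter/X": "twitter_id",      # hypothetical key name
}


def active_providers(config):
    """Return only the providers whose ID is provisioned in the config."""
    active = {}
    for provider, key in PROVIDER_KEYS.items():
        value = str(config.get(key) or "").strip()
        # Assumption: unprovisioned entries are empty or the word PENDING.
        if value and value.upper() != "PENDING":
            active[provider] = value
    return active
```

A template would then emit one script block per entry in the returned mapping and nothing for the rest, so no analytics ID is ever hardcoded in markup.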
Cross-Scope Connections
| Service | Role |
| --- | --- |
| HTTP | Source of truth: domain fleet, routing, credential registry |
Missing og:site_name, og:locale, article:published_time — social link previews degraded
OG.html (before fix)
2026-03-26
SITEMAP_JUNK
sitemap.xml contained 1,011 URLs but 607 were governance internals (SERVICES/, CHARTER/, CANON.html, COVERAGE.html, etc.) blocked by robots.txt. Root cause: jekyll-exclude.py bug — DIMS descriptions like “Declarative CANON.md exists” don’t end with .md, so the filename extraction caught zero files. Only 4 hardcoded excludes were present. Fix: magic.py now declares PRIVATE_DIRS + gov_filenames() as single source of truth, jekyll-exclude.py imports both.
Chrome DevTools sitemap audit
2026-03-26
ROBOTS_HAND_CODED
robots.txt was hand-coded in GOV repo. HTTP.md contract said build-surfaces should generate it, but emit_robots_txt didn’t exist. Fix: build_surfaces_fleet.py now has emit_robots_txt() that generates robots.txt from magic.py PRIVATE_DIRS. Compiled, not hand-coded.
Compiler audit
2026-03-26
CLOUDFLARE_PREPEND
Cloudflare prepends its own content-signal block to robots.txt at the edge (User-agent: *, Content-Signal: search=yes,ai-train=no, plus blocks for GPTBot, ClaudeBot, etc.). The GOV-compiled robots.txt is the base; Cloudflare adds the AI policy layer. Both are valid.
curl hadleylab.org/robots.txt
2026-03-27
SITEMAP_FRONTMATTER
_config.yml exclude: only prevents Jekyll from processing source .md files. Pre-committed HTML (CANON.html, COVERAGE.html, etc.) bypasses exclude. Fix: governance emitters now inject sitemap: false in Jekyll frontmatter. jekyll-sitemap respects this field and omits the page. Three-layer defense: (1) _config.yml exclude: for source, (2) sitemap: false in frontmatter for compiled HTML, (3) robots.txt Disallow for crawlers.
Compiler code review
2026-03-27
GC_CONTENT_LANES
build --gc wiped 1,523 content files (BLOGS, PAPERS, BOOKS, etc.) because gc_orphans treated catalog-managed directories as orphans. Their CANON.json had _generated but was not in the compiled set. Fix: magic.py now declares a CONTENT_LANES constant, and gc_orphans skips paths whose top-level component is a content lane.
Content wipe incident 2026-03-27
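The GC_CONTENT_LANES fix can be illustrated with a short sketch. This is not the real gc_orphans: the function name, its arguments, and the lane values (a subset taken from the "BLOGS, PAPERS, BOOKS, etc." list above) are assumptions.

```python
from pathlib import PurePosixPath

# Example subset only; the real constant is declared in magic.py.
CONTENT_LANES = frozenset({"BLOGS", "PAPERS", "BOOKS"})


def gc_candidates(generated_paths, compiled_paths, content_lanes=CONTENT_LANES):
    """List generated paths that are safe to collect as orphans."""
    compiled = set(compiled_paths)
    orphans = []
    for path in generated_paths:
        parts = PurePosixPath(path).parts
        # Catalog-managed content is never an orphan, compiled or not.
        if parts and parts[0] in content_lanes:
            continue
        if path not in compiled:
            orphans.append(path)
    return orphans
```

The lane check runs before the compiled-set check, so a content file absent from the compiled set is skipped rather than deleted.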
ROADMAP
Now
[ ] Verify hadleylab.org in Google Search Console
[ ] Verify canonic.org in Google Search Console
[ ] Verify mammochat.com in Google Search Console
[ ] Run build to deploy fixed sitemap + robots.txt
[ ] Worker canonical rewrite for proxy domains (mammochat.com, mammo.chat)
[ ] Google Search Console: check indexing status, crawl errors after fix deploys
[ ] Enhance robots.txt with Cloudflare content-signal alignment check
VOCAB
Term
Definition
INHERITANCE CHAIN
SERVICES
SERVICES compose primitives — INTEL + CHAT + COIN. Every service governed. Every scope discovered.
MUST: Maintain TRIAD integrity (CANON.md + VOCAB.md + README.md)
MUST: Treat SPEC as scope identity (`{SCOPE}` directory), not as a file
MUST: Every SERVICE scope include ROADMAP.md, COVERAGE.md, LEARNING.md, and `{SCOPE}.md` as governed content surfaces
MUST: Discover SERVICE scopes from filesystem only (no manual catalog)
MUST: Keep http:// and magic:// on the same namespace (transport differs, scope path matches)
MUST: CANON.md = axiom + universal constraints (no service names, no paths, no implementation)
MUST: README.md = how to run the CANON (nothing else)
MUST: {SCOPE}.md = SPEC — the interface (purpose, routes, projections, ecosystem)
MUST: SHOP.md = public projection file (per scope, filesystem-discoverable)
MUST: VAULT.md = private projection file (per scope, filesystem-discoverable)
MUST: Runtime implementation remains under ~/.canonic; this workspace is governance-first
MUST NOT: Hardcode service names in CANON constraints (law speaks universals)
MUST NOT: Define ungoverned terms outside VOCAB.md
MUST NOT: Treat `{SCOPE}.md` as SPEC identity
MUST NOT: Move architecture/lifecycle into README
MUST NOT: Leak private projections to public surfaces
MUST NOT: Maintain duplicate mapping tables outside generated manifest outputs
MUST NOT: Add runtime jargon to governance contracts
MUST: Ledger-consuming services declare source ledgers, scope filters, and closure gates
MUST: Learning governance remains live — closure claims require fresh DISCOVER → GENERATE → RELINK evidence
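The filesystem-only discovery and required-file constraints above lend themselves to a simple check. The sketch below is an assumption-laden illustration, not the governance validator: the function names are hypothetical, and treating "a directory containing CANON.md" as the definition of a scope is an assumption.

```python
from pathlib import Path

TRIAD = ("CANON.md", "VOCAB.md", "README.md")
SURFACES = ("ROADMAP.md", "COVERAGE.md", "LEARNING.md")


def discover_scopes(services_root):
    """Find scopes from the filesystem alone (no manual catalog)."""
    root = Path(services_root)
    return sorted(
        p.name for p in root.iterdir()
        if p.is_dir() and (p / "CANON.md").is_file()  # assumed scope marker
    )


def missing_files(scope_dir):
    """List required governed files absent from one scope directory."""
    scope = Path(scope_dir)
    # {SCOPE}.md is named after the scope directory itself.
    required = TRIAD + SURFACES + (f"{scope.name}.md",)
    return [name for name in required if not (scope / name).is_file()]
```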
hadleylab-canonic
HADLEYLAB ships software. Every app, book, paper, deal, and patent is PROOF that MAGIC works. COIN = WORK. LEARNING = COMPUTE.
MUST: Every app, book, paper, deal, or patent is evidence of MAGIC
MUST: All scopes inherit canonic-canonic/CANONIC.md governance
MUST: All users governed under USERS/ via SERVICES/USER
MUST: Cross-index INTEL across users (INTEL.md)
MUST: Shared events propagate to ALL affected user dashboards
MUST: Maintain governance workspace purity (.md files only)
MUST: Ledger all COIN (validated work) through MAGIC 255
MUST: Compile all INTEL from governed sources
MUST: Keep frontend/runtime implementation under ~/.canonic (hidden runtime)
MUST: Surface governed TALK, Library, and SERVICES scopes (no orphan content)
MUST: Derive nav labels from governed scope names (no hardcoded strings)
MUST NOT: Publish without governance (CANON.md required)
MUST NOT: Duplicate primitives — compose from INTEL, CHAT, COIN
MUST NOT: Silo intelligence inside a single user when multiple are affected
MUST NOT: Expose VAULT contents outside NDA scope
MUST NOT: Store runtime artifacts in governance workspace
canonic-canonic
SPEC is governance. `canonic-canonic/` is the spec root.
MUST: Keep this repo governance-only (.md/.pdf)
MUST: Publish workspace mapping in CANONIC.git (no hardcoded repo lists)
MUST: Preserve three primary lanes: FOUNDATION, INDUSTRIES, MAGIC
MUST NOT: Commit runtime artifacts here (runtime belongs in ~/.canonic/)
MUST: Sell MAGIC tiers — the product, not the proof (proof is hadleylab-canonic)
MUST NOT: Embed beta-test app URLs in platform page content
SEO · SERVICE CONTRACT · CANONIC ∩