How our SEO audit actually works — under the hood.
A complete walk-through of the v3.14.0 ortho-seo plugin: 17 parallel sub-audits, a 1000-point composite rubric, what data each sub-audit pulls, how median-of-N scoring works, and real findings from Adirondack Orthodontics + Green Orthodontics audits.
How the audit actually runs
17 sub-audits do not run in sequence — they run in three coordinated waves. Wave 1 is massively parallel; Waves 2 and 3 are sequential because they depend on Wave 1's output.
Wave 1 — Parallel batch
In a single message, the orchestrator dispatches 16 sub-audits concurrently as independent agents. Each writes its findings to disk as audit_<name>.json.
Includes: backlinks, GBP, real GSC rankings, citations, reviews, technical, schema, page deep-dive (SCHOLAR), content quality, LLM visibility, competitor, content gap, map-pack heatmaps, link-graph, tracking, plus OTTO Phase A (raw pull only).
Wave 2 — OTTO Phase B
OTTO triage runs after Wave 1 because it scores each OTTO recommendation against the other 14 sub-audits' findings. It literally cannot score corroboration until those envelopes exist on disk.
The triage produces three CSVs: otto_keep.csv (worth deploying), otto_review.csv (human judgment needed), otto_noise.csv (suppressed with reasons logged). See §5.
Wave 3 — Content roadmap
The content-roadmap synthesizer reads every Wave 1 + Wave 2 output and produces the 90/180/365-day publication plan. New pages, upgrades, consolidations — each with target keyword, parent pillar, estimated traffic, and dependencies.
Pure synthesis. Zero new measurements. Just intelligent reading of what the audits already found.
1–2 sub-audits fail: continue. Affected sub-criteria get marked "Estimated — confirm via [tool]". Strategy presentation hides those sections via template guards.
4+ sub-audits fail: abort. That's an MCP/auth problem, not a content problem — the operator gets routed to run the test harness rather than waste time troubleshooting individual sub-audits.
Brand routing happens once, upfront (Step 2.4) — a single prompt to confirm HIP or NEON Canvas, written to brand.yaml. Prevents 8 sub-audits from each independently blocking on the same question.
The V5 rubric — 1000 points, unpacked
Eight categories, weighted by what actually moves the needle for orthodontic local SEO in 2025–2026. Each category breaks into sub-criteria; each sub-criterion has a max-point value and a deterministic scoring rule.
Sub-criteria — how each category's points get distributed
Every category subdivides into 3–7 sub-criteria. Each sub-criterion has a max-point value and a precise scoring rule (e.g., "50 pts if 5+ pillar pages with 2,000+ words and 10+ internal links each"). The rubric document is 502 lines and lives at skills/seo-audit-and-plan/references/v5-rubric.md.
| Category | Sub-criteria | Max pts | Primary sub-audit |
|---|---|---|---|
| Cat 1 Content · 300 | 1.1 Topical Map Completeness (75) · 1.2 Pillar Pages (50) · 1.3 Cluster Content (50) · 1.4 Service Pages (40) · 1.5 Location Pages (40) · 1.6 Content Velocity (25) · 1.7 E-E-A-T (20) | 300 | content-gap · content-quality · page-deep-dive · schema-audit |
| Cat 2 GBP · 200 | 2.1 GBP Completeness (50) · 2.2 Heatmap Performance (75) · 2.3 GBP Activity (40) · 2.4 GBP-Website Alignment (35) | 200 | map-pack · gbp-reviews · schema-audit |
| Cat 3 On-Page · 150 | 3.1 Title Tags (25) · 3.2 Meta Descriptions (20) · 3.3 Heading Structure (25) · 3.4 Internal Linking (50) · 3.5 Image Optimization (20) · 3.6 URL Structure (10) | 150 | technical-audit · link-graph-audit · page-deep-dive |
| Cat 4 Authority · 150 | 4.1 Domain Metrics (40) · 4.2 Backlink Quality (40) · 4.3 Backlink Gap (30) · 4.4 Off-Page Building (40) | 150 | backlink-audit · backlink-gap |
| Cat 5 Technical · 75 | 5.1 Schema Markup (25) · 5.2 Core Web Vitals (25) · 5.3 Crawlability (15) · 5.4 Site Security (10) | 75 | technical-audit · schema-audit |
| Cat 6 GEO · 50 | 6.1 LLM Visibility (25) · 6.2 Content Extractability (15) · 6.3 Cross-Domain Consensus (10) | 50 | llm-visibility |
| Cat 7 Reviews · 50 | 7.1 Review Quantity (20) · 7.2 Review Quality (20) · 7.3 Review Management (10) | 50 | gbp-reviews |
| Cat 8 Citations · 25 | 8.1 Citation Presence (15) · 8.2 NAP Consistency (10) | 25 | citation-audit |
How the final score maps to a strategy phase
Score band → phase mapping is deterministic. The grade drives the entire roadmap:
| Score | Grade | Status | Phase & Priority |
|---|---|---|---|
| 900–1000 | A+ | Market Dominator | Phase 6 — Maintain & Defend |
| 800–899 | A | Strong Performer | Phase 5 — Market Domination (GEO, trophy content) |
| 700–799 | B | Competitive | Phase 4 — Authority Building |
| 600–699 | C | Average | Phase 3 — Competitive Positioning |
| 500–599 | D | Below Average | Phase 2 — Core Optimization |
| 0–499 | F | Needs Major Work | Phase 1 — Foundation |
What the rubric actually looks like in practice — Adirondack Orthodontics
The headline
Strong on-page mechanics (Cat 3: 131/150) and a flagship review reputation (Cat 7: 40/50), but content is undeveloped (Cat 1: 106/300), authority is thin (Cat 4: 26/150), and the GEO trajectory is negative (LLM mentions down 41 in 30 days). Despite excellent map-pack rankings in established cities, Albany "braces" has slipped from an average rank of 1.2 to 2.8, and rival Albany Braces now outranks the practice in its own headline market.
| Category | Score | Max | % | Why |
|---|---|---|---|---|
| Cat 1 — Content | 106 | 300 | 35% | No topical map, no pillar pages, vendor templates, cannibalization. SCHOLAR avg 46.8 across priority pages. |
| Cat 2 — GBP | 66 | 200 | 33% | Strong existing heatmap rankings dragged down by missing GBP Galactic data + declining Albany "braces". |
| Cat 3 — On-Page | 131 | 150 | 87% | Highest % earned. SA Holistic Technical pillar 81/100 + real GSC CTR 3.52%. |
| Cat 4 — Authority | 26 | 150 | 17% | DR 15, spam score 21, 275 broken backlinks, no off-page work. |
| Cat 5 — Technical | 51 | 75 | 68% | CWV failing on audited pages, but HTTPS + crawlability solid. |
| Cat 6 — GEO | 28 | 50 | 56% | LLM visibility 50/100 but trajectory negative (−11.7 pts, −41 mentions in 30d). |
| Cat 7 — Reviews | 40 | 50 | 80% | 1,268 reviews, 4.94 avg, 89% reply rate. Best category by % earned. |
| Cat 8 — Citations | 13 | 25 | 52% | No SA citation submissions on record; multiple duplicate CIDs flagged. |
| Total | 461 | 1000 | 46% | Grade F · Phase 1 Foundation |
Composite scoring — how 17 envelopes become 1 score
Multiple sub-audits often measure the same category. The composite scorer's job is to merge them honestly without averaging away signal.
The composite scorer runs in two layers. A baseline scorer (score_rubric.py) computes a single-source score from the crawl alone — that's the v3.4 fallback if you run with zero sub-audits. The composite scorer (run_composite_score.py) is a non-breaking overlay on top that pulls in all 17 sub-audit JSON envelopes and merges them.
The merging rule: median-of-N
When two or more sub-audits score the same V5 category at the full point-range (e.g., both report a 0–150 score for Cat 3), the composite takes the median — not the average, not the sum. This is the system's defense against any single source skewing the result. One extreme reading gets absorbed.
Worked example — Cat 3 (On-Page) on the Adirondack audit:
- real-rankings contributed 131/150 (driven by SA Holistic Technical pillar 81 + real CTR 3.52%).
- technical-audit contributed 53/150 (driven by detailed Lighthouse + meta + alt-text measurement).
- page-deep-dive contributed 14/150 (SCHOLAR + internal linking on priority pages).
- link-graph contributed 22/150 (PageRank + click-depth analysis).
- Composite reported: 131 — the median of the full-range contributions, dominated here by real-rankings.
Three aggregation modes — selected automatically per category
median_of_full_scores
When multiple sub-audits each score the full category range. Used for Cat 3, 5, 6, 7, 8 on Adirondack. Median is robust to outliers.
sum_of_partial_ranges
When sub-audits cover complementary sub-ranges. E.g., backlink-audit owns sub 4.1–4.5 (90 pts), backlink-gap owns sub 4.6 (60 pts) — these sum cleanly. Used for Cat 1, 2, 4.
baseline_only
When no sub-audit covered the category, fall back to the crawl baseline. The audit envelope notes "Estimated — confirm via [tool]" so the gap is transparent.
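A minimal sketch of how mode selection could work, assuming each sub-audit envelope exposes its contributions as point values tagged with whether they cover the category's full range. The field names and the selection heuristic are illustrative, not the plugin's actual schema.

```python
from statistics import median

def aggregate_category(baseline_score: float, contributions: list[dict]) -> tuple[float, str]:
    """Pick an aggregation mode for one V5 category.

    contributions: e.g. [{"source": "technical-audit", "points": 53, "full_range": True}].
    Field names and the selection logic are illustrative assumptions.
    """
    if not contributions:
        # No sub-audit covered this category: fall back to the crawl baseline.
        return baseline_score, "baseline_only"

    full = [c["points"] for c in contributions if c["full_range"]]
    if len(full) >= 2:
        # Several full-range readings: the median absorbs a single outlier.
        return median(full), "median_of_full_scores"

    # Complementary sub-ranges: they sum cleanly.
    return sum(c["points"] for c in contributions), "sum_of_partial_ranges"

print(aggregate_category(38, []))  # falls back to baseline_only
print(aggregate_category(38, [{"source": "a", "points": 60, "full_range": True},
                              {"source": "b", "points": 48, "full_range": True},
                              {"source": "c", "points": 71, "full_range": True}]))
```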
Transparency fields — every score shows its work
- `score_by_source` — how many points each data source contributed per category (site crawl, Search Atlas, Ahrefs, DataForSEO, etc.).
- `category_contributions` — for each category: the baseline crawl score, every sub-audit's named contribution, the aggregation method used, plus cross-category evidence rows from supporting sub-audits.
- `sub_audits_succeeded` / `sub_audits_failed` — exact list so the operator can see which measurements are missing if any sub-audit errored.
- `evidence` arrays — each sub-audit's evidence strings flow through to the audit report and the strategy presentation. Example: "avg LCP 3550ms, CLS 0.165, 0% pages pass CWV, avg Lighthouse Performance 62/100" — that's the actual evidence string from Adirondack's Cat 5 score.
The 17 sub-audits — what each one actually does
Click any audit to expand. Each one shows its purpose, the data sources it pulls, the actual algorithm, what it contributes to the V5 rubric, and a real finding from the Adirondack or Green Orthodontics audit.
Site crawl + ground truth + pre-flight QA
Finds every content, NAP, schema, and structural error on a live site before any other audit runs — produces the canonical page list every downstream sub-audit reads.
Data sources
- Firecrawl — full markdown + rawHtml for every page (headers/footers included so NAP and JSON-LD survive).
- DataForSEO OnPage — bulk task crawl: pages, duplicate tags, internal links, redirect chains, non-indexable.
- Parallel content-analyzer subagents — fanned out for content QA in batches of ~30 pages.
Methodology
- Bulk-only on the happy path — two async jobs do the heavy lifting; no per-page calls until exceptions surface.
- A Python "ground-truth" step picks the homepage, contact, and location pages, extracts canonical practice name/phone/address/doctors, and asks the operator to confirm before flagging anything. Wrong ground truth = false positives everywhere.
- Pre-flight pattern checks against rawHtml: heading hierarchy, Google Maps URL resolution (must land on a real GBP CID, not an address pin), iframe form privacy/terms link inspection, deep JSON-LD validation, hostname canonicalization (catches Kinsta/WP Engine leaks).
- Cross-page Python pass detects cannibalization (exact title/H1 duplicates, neighborhood-doorway templating) and internal-linking issues (orphans, missing footer links, broken links).
Rubric contribution
Not directly scored. Produces crawl_pages.json + ground_truth.json that every other sub-audit consumes. Findings feed Cat 3 (On-Page) and Cat 1 (Content) indirectly.
Crawl seeded ground truth and surfaced 1 orphan page and 2 broken internal links, which flowed into Cat 3 scoring. Confirmed 7 GBP locations + 4 doctors (Berenshteyn, Boudreaux, Pacheco, Pellettieri).
Keyword gap vs competitors
Which keywords are our rivals ranking for that we aren't, ranked by how much traffic we'd capture if we won them?
Data sources
- Ahrefs — `site-explorer-organic-keywords`, top 1,000 by volume, per domain.
- DataForSEO Labs — `google_domain_intersection` with `intersections: false` flag = "rival ranks, we don't".
- Search Atlas — `se_analyze_keyword_gap` + `se_get_keyword_gap_results` (async submit-and-poll, cross-validation).
Methodology
- Pulls our top 1,000 organic keywords and each rival's top 1,000 from all three providers; subtracts our set from each rival's set — what remains is the "gap".
- Merges results across providers and dedupes. A keyword present in 2–3 sources is high-confidence; single-source gets a caveat.
- Scores each gap keyword as `highest_volume × rival_position_inverse` (position 1 → ×100, position 50 → ×51, position 100+ → ×0) — rewarding both volume AND rival proximity to page 1 (see the sketch after this list).
- Buckets the top 100 by topical category (braces, invisalign, ortho_general, info) via deterministic keyword-token rules — no LLM guessing.
- Cross-references against the existing internal link graph to tag each gap as `coverage: partial` (rewrite existing page) vs `coverage: none` (build new).
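A minimal sketch of the gap-scoring and bucketing rules above. The handling of the position-100+ cutoff and the specific bucket token rules are assumptions; the real skill's token list is larger.

```python
def gap_score(volume: int, best_rival_position: int) -> int:
    """Score one gap keyword: volume x rival proximity to page 1.

    Mirrors the rule above (position 1 -> x100, position 50 -> x51,
    position 100+ -> x0); the exact cutoff handling is an assumption.
    """
    inverse = 0 if best_rival_position >= 100 else 101 - best_rival_position
    return volume * inverse

def bucket(keyword: str) -> str:
    """Deterministic token-based topical bucketing, no LLM involved (rules illustrative)."""
    kw = keyword.lower()
    if "invisalign" in kw or "aligner" in kw:
        return "invisalign"
    if "braces" in kw:
        return "braces"
    if "orthodont" in kw:
        return "ortho_general"
    return "info"

# Hypothetical gap keyword: 1,900/mo volume, best rival sitting at position 4.
print(gap_score(volume=1900, best_rival_position=4))
print(bucket("clear aligners cost albany"))   # invisalign
```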
Rubric contribution
Cat 1.3 Cluster Content (max 50 pts). Computes a coverage_pct and maps to a delta: ≥90% coverage = 0; <30% = −20 pts off baseline.
Scored 25/50 on Cat 1.3, applied a −10 delta. Single rival used: albanybraces.com. Evidence: "DFS domain_intersection adirondackorthodontics.com vs albanybraces.com". Ahrefs and Search Atlas legs didn't surface keywords on this run, so the audit ran in DFS-only mode with the lower confidence noted.
Cannibalization, vendor content, freshness, pillar architecture
Where is content actively hurting us — duplicate pages competing with each other, boilerplate from old vendors, stale content, or a missing pillar structure?
Data sources
- `crawl_pages.json` from prior site-review (required, ≥5 pages).
- DataForSEO Labs — `google_ranked_keywords` for cannibalization confirmation (Phase 2).
- Search Atlas GSC — `gsc_get_keyword_performance` for canonical Google truth (Phase 3).
Methodology
- Cannibalization detection: normalize each page's title + H1, strip brand and geo tokens, cluster pages with identical phrases. Confirm via DFS (≥2 URLs ranking top 50 for same keyword) AND GSC (≥2 URLs accumulating impressions for the same query). GSC is the strongest signal — it's Google literally telling us two pages compete.
- Vendor boilerplate: hash every sentence ≥50 chars across the site. Sentences appearing on ≥2 pages indicate template reuse. Flag pages with <40% unique sentences AND >50% boilerplate score.
- Freshness: extract dates from JSON-LD first, then OpenGraph, then visible text. Bucket: fresh (≤6mo), current, stale (>12mo), untraceable.
- Pillar architecture: build URL tree, identify Mother pages matching `ground_truth.services`, count Sons and Grandsons. Compare to ideal (each Mother → 2–5 Sons → 1–3 Grandsons each).
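The sentence-hash boilerplate check above lends itself to a short sketch. This assumes naive sentence splitting on punctuation; the real skill may segment and threshold differently.

```python
import hashlib
import re
from collections import defaultdict

def boilerplate_report(pages: dict[str, str], min_len: int = 50) -> dict[str, float]:
    """Share of each page's sentences (>= min_len chars) that also appear on another page.

    pages: {url: plain_text}. Sentence splitting here is deliberately naive.
    """
    sentences_by_page = {}
    seen_on = defaultdict(set)   # sentence hash -> set of urls it appears on
    for url, text in pages.items():
        sents = [s.strip() for s in re.split(r"[.!?]\s+", text) if len(s.strip()) >= min_len]
        hashes = [hashlib.sha1(s.lower().encode()).hexdigest() for s in sents]
        sentences_by_page[url] = hashes
        for h in hashes:
            seen_on[h].add(url)

    report = {}
    for url, hashes in sentences_by_page.items():
        if not hashes:
            report[url] = 0.0
            continue
        shared = sum(1 for h in hashes if len(seen_on[h]) >= 2)
        report[url] = shared / len(hashes)   # boilerplate ratio, 0.0-1.0
    return report
```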
Rubric contribution
Cat 1.4 Service Pages (40) + 1.5 Location Pages (40) + 1.6 Content Velocity (25) + 1.7 E-E-A-T (20) = 125 pts. Also emits cross-category cannibalization evidence to Cat 3.
Scored 61/125. 1.6 Velocity = 17/25 with evidence "17% pages fresh, 17% stale, avg page age 611 days, 8 pages have no detectable date". 1.5 Location Pages flagged "8 location pages for 2 cities (400% coverage); 4 location pages involved in cannibalization" — the practice over-built city pages that now compete with each other.
90/180/365-day content plan synthesizer
Given everything every other audit found — what specific pages should we build, upgrade, or kill, and in what sequence?
Data sources
- Pure synthesizer — zero new measurements. Reads all the other audit envelopes plus `ground_truth.json` and `crawl_pages.json`.
Methodology
- Generates three candidate types: NEW_PAGE (from content-gap keywords, missing pillar Mothers/Sons, competitor top pages), UPGRADE (page-deep-dive SCHOLAR <60, cannibalization winners, buried pages with topical value), CONSOLIDATE (cannibalization losers, stale vestigial orphans).
- Each candidate gets a type-specific impact score. NEW_PAGE blends keyword volume (log-scaled), winnability (inverse of rival position), pillar gap severity, and parent Mother's PageRank percentile.
- Sequences into phases with dependencies: Phase 1 (90d) = all consolidations + top 3 upgrades + top 3 new pages. Phase 2 (180d) = remaining upgrades + Son/Grandson buildout. Phase 3 (365d) = depth content. A Son can't ship before its Mother.
- Generates a deployable brief for each item: title, slug, primary/secondary keywords, target word count, required schema types, H1/H2 skeleton, FAQ prompts, internal-link plan, E-E-A-T requirements.
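A hedged sketch of the NEW_PAGE impact blend described above. The 0–1 normalizations and the specific weights are illustrative assumptions; the shipped synthesizer defines its own.

```python
import math

def new_page_impact(volume: int, best_rival_position: int,
                    pillar_gap_severity: float, parent_pr_percentile: float) -> float:
    """Blend the four NEW_PAGE signals above into a 0-100 impact score.

    The normalizations and the weights (0.35/0.30/0.20/0.15) are assumptions.
    """
    vol = math.log10(volume + 1) / 5                       # ~0-1 for volumes up to 100k/mo
    winnability = max(0.0, (101 - best_rival_position) / 100)
    return 100 * (0.35 * vol + 0.30 * winnability
                  + 0.20 * pillar_gap_severity + 0.15 * parent_pr_percentile)

# Hypothetical candidate: 1,900/mo keyword, rival at position 6,
# severe pillar gap, parent Mother in the 80th PageRank percentile.
print(round(new_page_impact(1900, 6, 0.9, 0.8), 1))
```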
Rubric contribution
Evidence only — informational under Cat 1. The output is the deliverable; it doesn't change the score.
Roadmap proposes 2 new pages + 3 upgrades + 2 consolidations. Phase 1 (90d): 7 actions. Phases 2 + 3 empty because content-gap surfaced few keywords on this run (DFS-only mode) — synthesizer compressed everything into Phase 1 rather than padding without evidence. Estimated additional keyword capture: 2 keywords.
Crawlability, Core Web Vitals, indexability, meta validation
Is the site technically crawlable, indexable, fast, and clean — with one cloud crawl instead of running Screaming Frog?
Data sources
- DataForSEO OnPage task crawl — task post, summary, pages, duplicate tags, redirect chains, non-indexable, links.
- DataForSEO Lighthouse on top 10 pages — Performance, SEO, Accessibility, Best Practices + Core Web Vitals.
- Ahrefs Site Audit — independent error/warning verification.
- Firecrawl JS-render diff on 3 pages — detects hydration issues where content only appears post-JS.
Methodology
- One task crawl, then ~5 GET calls retrieve the whole site (versus 150 sequential page hits).
- Per-page `checks.*` boolean flags are the authoritative issue source — no inference, no LLM guessing.
- Severity is rule-based: Critical = 404/5xx, missing title, missing H1, HTTP on HTTPS site. Warning = duplicate tags, thin content (<300 words), missing alt, canonical mismatch, orphan. Info = LCP/INP, render-blocking, no schema.
- v3.13.0+ meta-conventions pass validates every title (50–60 char band, no YMYL superlatives, no "near me", no year), every meta description (120–160 chars, CTA before mobile cutoff, no phone numbers), and Title↔H1 token similarity ≥80%. State-board awareness layer flags FTC/state-dental-regulator risk for YMYL violations.
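A minimal sketch of the rule-based severity mapping, using simplified stand-ins for the real `checks.*` flag names.

```python
CRITICAL = {"is_4xx", "is_5xx", "no_title", "no_h1", "http_on_https_site"}
WARNING = {"duplicate_title", "duplicate_description", "thin_content",
           "missing_alt", "canonical_mismatch", "is_orphan"}

def severity(check: str) -> str:
    """Map one failing check flag to a bucket per the rules above (flag names simplified)."""
    if check in CRITICAL:
        return "critical"
    if check in WARNING:
        return "warning"
    return "info"   # LCP/INP, render-blocking resources, missing schema, etc.

page_checks = {"no_h1": True, "missing_alt": True, "render_blocking": True}
print({c: severity(c) for c, failed in page_checks.items() if failed})
# {'no_h1': 'critical', 'missing_alt': 'warning', 'render_blocking': 'info'}
```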
Rubric contribution
Full Cat 5 Technical (75 pts): 5.1 Schema (25), 5.2 CWV (25), 5.3 Crawlability (15), 5.4 Security/HTTPS (10). Also feeds Cat 3 sub-scores 3.1 (25) + 3.2 (20) + 3.4 (25) = 70 pts via title/meta/alt-text data.
Cat 5 earned 41/75. Evidence: "avg LCP 3550ms, CLS 0.165, INP 270ms, 0% of audited pages pass all 3 CWV, avg Lighthouse Performance 62/100, 3 redirect chains, 2 non-indexable pages, 8 Ahrefs errors."
Per-page SCHOLAR scoring on priority pages
Score the 5–10 most important pages page-by-page so the audit can say exactly which URLs need a rewrite vs. a minor tweak.
Data sources
- `crawl_pages.json` from site-review (no re-crawl).
- DataForSEO — `on_page_instant_pages` in parallel batches of 5; `on_page_lighthouse` on top 5.
- Firecrawl as JSON-LD fallback when rawHtml is missing.
SCHOLAR framework — 7 letters × 0–10 each, scaled to 0–100
- S — Specific: title / H1 / URL contain the primary service from ground truth + a tracked city.
- C — Conversational: ≥3 question patterns, FAQ schema present, 2nd-person voice ≥5 occurrences, dl/details blocks.
- H — Helpful: ≥2 CTAs ("book", "schedule", "consult", "call"), price or before/after content, descriptive alt text, clear next-step section.
- O — Original: practice-specific signals (doctor names, local landmarks), low vendor-boilerplate ratio (<0.6 of phrases like "state-of-the-art technology"), internal link to a related Mother page.
- L — Long: word-count tiers (≥400 / ≥800 / ≥1500) plus well-formed H2/H3 hierarchy.
- A — Authoritative: doctor credentials (DDS, DMD, AAO), Author/Person schema, outbound link to .gov/.edu, Reviewed/Updated date <12 months.
- R — Relevant: title keyword matches a tracked rank-tracker keyword, no sibling cannibalization, semantically-aligned inbound anchors, schema type matches content.
Interpretation: 80–100 pillar-ready · 60–79 needs work · 40–59 rewrite candidate · <40 failing.
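A minimal sketch of the SCHOLAR scaling and band labels above. The per-letter checks themselves (question patterns, CTAs, credentials, and so on) live in the skill; only the arithmetic is shown here.

```python
def scholar_score(letters: dict[str, int]) -> tuple[int, str]:
    """Scale seven 0-10 letter scores to 0-100 and attach the interpretation band."""
    raw = sum(letters[k] for k in "SCHOLAR")   # each of S, C, H, O, L, A, R is 0-10
    scaled = round(raw * 100 / 70)
    if scaled >= 80:
        band = "pillar-ready"
    elif scaled >= 60:
        band = "needs work"
    elif scaled >= 40:
        band = "rewrite candidate"
    else:
        band = "failing"
    return scaled, band

# Hypothetical priority page:
print(scholar_score({"S": 6, "C": 3, "H": 5, "O": 4, "L": 2, "A": 5, "R": 8}))
```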
Rubric contribution
Cat 1.3 Content Quality (30) + 1.4 Content Depth (30) · Cat 3.5 Internal Linking (25) + 3.6 Schema Coverage (25) = ~110 pts.
4 priority pages audited, avg SCHOLAR 46.8/100. Drove 1.3 Content Quality to 8/30 and 1.4 Content Depth to 2/30 ("avg word count 63; 50% of pages have ≥3 H2s"). Internal linking on priority pages: avg inbound 1.2, anchor diversity 1.0, 1 orphan → 4/25 on 3.5.
Structured data validation + NAP cross-reference
Does every page have the right JSON-LD, does it match visible content + GBP, and is it valid?
Data sources
- `crawl_pages.json` rawHtml — regex-extract every `<script type="application/ld+json">` block.
- DataForSEO — `onpage_pages.checks.has_json_ld` / `has_microdata` for the "which pages have schema" baseline.
- Firecrawl fallback when rawHtml is missing.
- Ground truth NAP + GBP data from `audit_map-pack.json` for cross-validation.
Methodology
- Parse every JSON-LD block; validate JSON syntax, required fields per `@type`, and correct subtype (`Dentist`/`Orthodontist` array form, not bare `LocalBusiness`).
- Determine expected schema by page_type: homepage → LocalBusiness + Organization; service → MedicalProcedure; provider → Person/Physician; location → LocalBusiness per office; blog → Article + Person author; FAQ section detected → expect FAQPage.
- Cross-reference NAP three ways: schema-vs-visible content, schema-vs-GBP, visible-vs-GBP. Mismatches go to the action list.
- Stale/vendor schema detection: out-of-date phone, wrong address, generic vendor strings injected by old plugins.
- `@id` cross-reference validation — every `{"@id": "..."}` must resolve to an entity in the `@graph`; broken refs are removed, never emitted.
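The `@id` cross-reference check is easy to sketch. This version only handles a top-level `@graph` and bare `{"@id": ...}` reference objects, which is narrower than full JSON-LD resolution.

```python
import json

def broken_id_refs(jsonld: str) -> list[str]:
    """Return @id references that don't resolve to any entity in the @graph."""
    doc = json.loads(jsonld)
    graph = doc.get("@graph", [doc])
    defined = {node.get("@id") for node in graph if isinstance(node, dict) and "@id" in node}

    broken = []

    def walk(value):
        if isinstance(value, dict):
            # A dict containing only "@id" is a reference, not a definition.
            if set(value.keys()) == {"@id"} and value["@id"] not in defined:
                broken.append(value["@id"])
            for v in value.values():
                walk(v)
        elif isinstance(value, list):
            for v in value:
                walk(v)

    for node in graph:
        walk(node)
    return broken

doc = '{"@graph": [{"@id": "#practice", "@type": "Orthodontist"}, {"@type": "WebPage", "about": {"@id": "#missing"}}]}'
print(broken_id_refs(doc))   # ['#missing']
```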
Rubric contribution
Cat 1.5 Schema Completeness (20) + Cat 2.4 GBP-Website Alignment / Schema NAP (25) + Cat 5.1 Schema Markup (25) = ~70 pts.
50% of pages have type-appropriate schema; 50% have any schema; 75% have no validation errors; NAP in schema matches visible content on 0/1 pages (0%). Scored 1.5 = 10/20 and 3.6 = 10/25. Recommended: deploy MedicalProcedure schema on all service pages, fix NAP-in-schema on the homepage.
Map-pack heatmap grid rankings
Where in the city does the practice actually show up in the Google Map Pack, and where does it disappear?
Data sources
- Search Atlas Local SEO Heatmaps (primary) — `list_businesses`, `list_businesses_heatmaps`, `get_heatmap_details`, `single_competitor_versus_report`.
- DataForSEO — `serp_organic_live_advanced` with `location_coordinate` for cross-validation at the address center.
Methodology
- Match each practice location to a SA heatmap business by address/lat-lng (within ~100m).
- For each (location × tracked keyword), pull the existing grid — every cell has a GPS coordinate and the practice's rank at that point. Typically 39 cells per grid.
- Compute four KPIs per grid: average rank, top-3 share %, the largest concentric radius where median rank ≤3 (proximity radius), and dead-zone clusters (≥3 contiguous cells where rank >20).
- Cross-validate with DataForSEO at the address center — if the practice doesn't appear at its own front door, SA data is suspect.
- Optional rival comparison runs `single_competitor_versus_report` against the primary rival domain.
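A minimal sketch of two of the four per-grid KPIs, assuming each cell carries a rank field. The proximity radius and dead-zone clustering need the grid geometry and are omitted; field names are illustrative.

```python
def grid_kpis(cells: list[dict]) -> dict:
    """Average rank and top-3 share for one (location x keyword) heatmap grid.

    cells: [{"lat": ..., "lng": ..., "rank": 2}, ...], 39 cells on a typical grid.
    Unranked cells are assumed to come back as rank 21+.
    """
    ranks = [c["rank"] for c in cells]
    return {
        "avg_rank": round(sum(ranks) / len(ranks), 1),
        "top3_share_pct": round(100 * sum(1 for r in ranks if r <= 3) / len(ranks), 1),
        "worst_cells": sum(1 for r in ranks if r > 20),   # raw dead-zone candidates
    }

# Hypothetical 5-cell excerpt of a grid:
print(grid_kpis([{"rank": 1}, {"rank": 2}, {"rank": 3}, {"rank": 8}, {"rank": 22}]))
```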
Rubric contribution
Cat 2.2 GBP Heatmap Performance (75 pts): 2.2.1 avg rank (25), 2.2.2 top-3 share (25), 2.2.3 proximity radius (15), 2.2.4 dead-zone coverage (10). Returns null if no heatmap project exists — "data unavailable" is not "failed."
6 of 7 GBP locations average 1.2–2.8 across braces / orthodontist / invisalign. Latham "orthodontics" = 1.2 across all 39 grid pins. The killer find: Albany "braces" declining from avg 1.2 (Jan 2026) to 2.8 — and rival Albany Braces now ranks 2.6, outranking the practice in its own headline market. Score: 64/75 with a −10 negative-trend penalty.
NAP consistency across directories
Is this practice's Name/Address/Phone consistent across the directory ecosystem, and is it listed where it matters?
Data sources
- Search Atlas — `gbp_list_citation_submissions` + `gbp_get_aggregator_details` (covers ~30–50 directories via 5 aggregators: Data Axle, Localeze, Foursquare, Factual, Acxiom).
- DataForSEO — `business_data_business_listings_search`, `google_my_business_info`, `google_reviews`.
Methodology
- Path A (preferred): read prior SA citation submissions if they exist.
- Path B (fallback): initialize a citation draft populated from `ground_truth.json` and submit a fresh scan; poll up to 5 minutes.
- Normalize every found listing: phone to E.164, address with abbreviation collapse, name with legal-suffix stripping, website with host-only.
- Compute Levenshtein distance per field vs canonical NAP — matches must be exact or ≤2 character edits (single-typo tolerance for name/address; exact match for phone/website).
- Bucket each mismatching listing into name / phone / address / website / hours discrepancies.
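A short sketch of the normalization plus edit-distance matching described above, with a plain Levenshtein implementation so it runs without dependencies. The phone handling assumes US numbers.

```python
import re

def normalize_phone(raw: str) -> str:
    """Collapse a US-format phone string toward E.164 (sketch; assumes +1 numbers)."""
    digits = re.sub(r"\D", "", raw)
    if len(digits) == 10:
        digits = "1" + digits
    return "+" + digits

def edit_distance(a: str, b: str) -> int:
    """Plain Levenshtein distance via dynamic programming."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        cur = [i]
        for j, cb in enumerate(b, 1):
            cur.append(min(prev[j] + 1,                  # deletion
                           cur[j - 1] + 1,               # insertion
                           prev[j - 1] + (ca != cb)))    # substitution
        prev = cur
    return prev[-1]

def field_matches(found: str, canonical: str, field: str) -> bool:
    """Exact for phone/website, <= 2 character edits for name/address (per the rule above)."""
    if field in ("phone", "website"):
        return found == canonical
    return edit_distance(found.lower(), canonical.lower()) <= 2

print(normalize_phone("(518) 555-0142"))                                           # +15185550142
print(field_matches("Adirondak Orthodontics", "Adirondack Orthodontics", "name"))  # True
```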
Rubric contribution
Cat 8 (25 pts total): 8.1 aggregator presence (10), 8.2 NAP consistency % (10), 8.3 niche directories — AAO / ADA / Healthgrades / Zocdoc (5).
Score 13/25 — partial because no SA citation scan history existed for any of the 6 locations. Baseline ran from GBP location data alone: 5/6 locations verified, 85.7% profile completeness, missing Healthgrades / Zocdoc / WebMD / RealSelf / AAO / Yelp. Map-pack data revealed duplicate CIDs — strong evidence a real scan would surface NAP fragmentation. Next action: gbp_init_citation_draft per location.
Real GSC rankings — clicks, impressions, CTR, position
What does Google itself say about this practice's actual organic performance — replacing third-party rank estimates with first-party GSC truth?
Data sources
- Search Atlas GSC integration — 9 tools including `gsc_get_site_property_performance`, `gsc_get_keyword_performance`, `gsc_get_page_performance`, `gsc_compare_performance`.
- Search Atlas KRT (Keyword Rank Tracker) — daily SERP-position history beyond GSC's 16-month window.
- DataForSEO Labs — `google_ranked_keywords` for cross-validation of estimate accuracy vs GSC truth.
Methodology
- Hard gate: `gsc_get_sites` must return the practice domain. If GSC isn't connected, the audit returns `status: error` — it deliberately does NOT fall back to estimates.
- Pulls two 30-day windows in parallel (current and prior) for period-over-period delta.
- Computes branded vs non-branded split by matching keywords against practice name + doctor surnames from ground truth.
- Identifies "winnable" keywords: position 4–10 with ≥50 impressions — small CTR/meta tuning lifts them to page 1.
- Computes mean absolute position delta between DFS estimates and GSC reality for top 50 keywords — feeds confidence intervals on the other audits.
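The "winnable keyword" filter is simple enough to show directly; the GSC-style field names are illustrative.

```python
def winnable(rows: list[dict], min_impressions: int = 50) -> list[dict]:
    """Keywords at positions 4-10 with enough impressions to matter, per the rule above.

    rows: GSC-style dicts like {"query": ..., "impressions": ..., "position": ...}.
    """
    hits = [r for r in rows
            if 4 <= r["position"] <= 10 and r["impressions"] >= min_impressions]
    return sorted(hits, key=lambda r: r["impressions"], reverse=True)

rows = [
    {"query": "braces cost albany", "impressions": 420, "position": 6.3},
    {"query": "invisalign latham", "impressions": 35, "position": 5.1},        # too few impressions
    {"query": "orthodontist albany ny", "impressions": 900, "position": 2.4},  # already top 3
]
print([r["query"] for r in winnable(rows)])   # ['braces cost albany']
```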
Rubric contribution
Does not own a category. Emits a cross_category block that the composite scorer overlays onto Cat 1.7, Cat 3.6, and Cat 6.3 — plus the SA Holistic Technical pillar feeds Cat 3 and Cat 5.
8,289 ranking keywords, 1,330 clicks / 86,043 impressions / 3.52% CTR / avg pos 10.3 over 40 days. KRT shows 35/42 tracked keywords in top 3 (83%) across 5 location projects. The killer find: /how-much-do-braces-cost/ has 9,671 impressions at 0.07% CTR (position 16) — the single largest CTR opportunity site-wide. Homepage ranks for 189 keywords at avg position 26 — page-3 cliff, massive upside.
Google review reputation across all locations
Is this practice's Google review reputation a competitive asset or a liability, and are they actively managing it?
Data sources
- Search Atlas — `gbp_list_locations` (enumerate verified GBP profiles) + `gbp_get_review_stats` per location (total reviews, weighted rating, reply count, rating distribution).
- Optional: `gbp_list_reviews` for individual review pulls when negative-handling drill-down is needed.
Methodology
- List all GBP locations under the practice's SA account; flag any unverified locations as warnings.
- Per location, pull review stats and compute reply rate (replied / total).
- Weights aggregate rating by review volume across locations — a 5.00 location with 118 reviews shouldn't dominate a 4.90 location with 405.
- Flag any location with reply rate <90% as a "reply gap"; flag any location with ≥5 one-star reviews as needing sentiment investigation.
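A minimal sketch of the volume-weighted rating and the two flag rules. Field names are illustrative; the review and one-star counts below echo the Adirondack findings, while the reply counts are hypothetical.

```python
def review_rollup(locations: list[dict]) -> dict:
    """Volume-weighted rating, overall reply rate, and per-location flags."""
    total = sum(l["reviews"] for l in locations)
    weighted = sum(l["rating"] * l["reviews"] for l in locations) / total
    flags = []
    for l in locations:
        if l["replied"] / l["reviews"] < 0.90:
            flags.append((l["name"], "reply gap"))
        if l.get("one_star", 0) >= 5:
            flags.append((l["name"], "sentiment investigation"))
    return {
        "weighted_rating": round(weighted, 2),
        "reply_rate": round(sum(l["replied"] for l in locations) / total, 3),
        "flags": flags,
    }

print(review_rollup([
    {"name": "Albany", "reviews": 405, "rating": 4.90, "replied": 360, "one_star": 6},
    {"name": "Glens Falls", "reviews": 118, "rating": 5.00, "replied": 118, "one_star": 0},
]))
```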
Rubric contribution
Full Cat 7 Reviews (50 pts): 7.1 review volume (20), 7.2 avg rating (20), 7.3 review management/response rate (10). Cross-category contribution to Cat 2.5 GBP Review Velocity.
Score 40/50. 1,268 total reviews, 4.94 weighted avg, 89.4% reply rate across 6 GBP locations. Clifton Park has the biggest gap — 70 unreplied of 429 reviews. Albany has 6 one-star reviews (1.5% of 405) needing sentiment investigation. Glens Falls is the flagship: perfect 5.00 across 118 reviews. One stale unverified location (Moe Rd Clifton Park, ID 54025) needs claim/redirect.
Practice's own backlink profile + velocity
How many real websites link to the practice, how trustworthy are those links, and is the link program growing, flat, or in decline?
Data sources
- Ahrefs Site Explorer (primary) — referring-domains history, all-backlinks top-500, anchors, DR history, URL Rating history, broken backlinks.
- DataForSEO Backlinks API (cross-validator) — `backlinks_summary`, `backlinks_anchors`, `backlinks_timeseries_new_lost_summary`, `backlinks_bulk_spam_score`.
Methodology
- Pulls 12 months of referring-domain history from both sources; computes new / lost / net velocity at 30 / 90 / 365 day windows.
- Classifies every anchor (weighted by referring domain, not raw link count) into five buckets: brand, exact-match, partial-match, naked URL, generic.
- Flags a referring domain "toxic" if DFS spam score ≥30, OR if DR ≤5 with exact-match anchor (anchor-bait pattern), OR if it's on a low-trust TLD with DR ≤10.
- Reconciles Ahrefs vs DFS — if they disagree by >50% on net velocity or >15pp on any anchor bucket, logs a warning and surfaces both numbers.
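The three toxicity triggers reduce to a short predicate. The low-trust TLD list here is an assumption; the skill defines its own.

```python
LOW_TRUST_TLDS = {".xyz", ".top", ".click", ".info"}   # illustrative list, not the skill's

def is_toxic(domain: str, dr: int, spam_score: int, anchor_type: str) -> bool:
    """Apply the three toxicity triggers above to one referring domain."""
    if spam_score >= 30:
        return True
    if dr <= 5 and anchor_type == "exact_match":        # anchor-bait pattern
        return True
    tld = "." + domain.rsplit(".", 1)[-1]
    return tld in LOW_TRUST_TLDS and dr <= 10

print(is_toxic("cheap-seo-links.xyz", dr=4, spam_score=12, anchor_type="exact_match"))  # True
```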
Rubric contribution
Owns sub-criteria 4.1–4.5 in Cat 4 (90 of 150 pts): Domain Power proxy (20), ref-domain count (20), velocity (20), anchor diversity (15), link quality/toxicity (15).
Adirondack: Ahrefs DR 15, 115 referring domains, 341 total backlinks but 275 broken (80% link rot), DFS spam_score 21 (high — toxic), Wildfire 2:1 ratio violated (822 external vs 9,130 internal links). Cat 4 contribution: 26/90.
Green Ortho: DR 16, 179 ref-domains, only 1 broken backlink, spam_score 5 → 47/90. Same emit logic, much healthier profile.
Link prospects — domains that link to rivals but not us
Which referring domains link to the practice's rivals but not the practice — the acquisition pipeline of plausible link prospects already proven willing to link to a similar business?
Data sources
- Ahrefs `batch-analysis` (primary) — one call returns ref-domain lists for practice + up to 3 rivals, aligned for diffing.
- DataForSEO — `backlinks_bulk_referring_domains` for cross-validation.
Methodology
- Resolves rivals from `ground_truth.json` or, failing that, from DFS `google_competitors_domain`. Zero rivals → fail fast.
- Normalizes every ref-domain, builds sets per target, computes `gap = union(rivals) – practice`.
- Scores each gap domain: `acquisition_value = DR × rival_count × topical_relevance`. Topical relevance: 2.0 for medical/dental, 1.7 for local-news (`*-times.com`, `*.patch.com`), 1.5 for chamber/rotary/community/.org, 1.0 otherwise (see the sketch below).
- Sorts, keeps top 30 for the deliverable. Buckets targets into medical / local-news / community / general.
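A hedged sketch of the acquisition-value scoring. The domain pattern matching for topical relevance is an approximation of the rule above, not the skill's exact matcher.

```python
import re

def topical_relevance(domain: str) -> float:
    """Relevance multipliers from the rule above; the pattern matching is a sketch."""
    if re.search(r"(dental|ortho|medical|smile)", domain):
        return 2.0
    if domain.endswith(".patch.com") or re.search(r"-times\.com$", domain):
        return 1.7
    if re.search(r"(chamber|rotary|community)", domain) or domain.endswith(".org"):
        return 1.5
    return 1.0

def acquisition_value(domain: str, dr: int, rival_count: int) -> float:
    """acquisition_value = DR x rival_count x topical_relevance."""
    return dr * rival_count * topical_relevance(domain)

print(acquisition_value("albanychamber.org", dr=55, rival_count=2))   # 55 * 2 * 1.5 = 165.0
```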
Rubric contribution
Cat 4.6 Authority gap closure pipeline (60 pts): pipeline depth (30), topical relevance ratio (15), tier mix of DR50+ targets (15).
Sub-audit fully implemented but the e2e smoke harness hasn't driven it yet for either Adirondack or Green Ortho — envelope is wired and ingestion-verified; live findings pending the next full audit pass. The methodology and emit script are complete.
Full-site profile of the primary rival
Build a full-site profile of one named rival — topical authority, brand-signal score, top traffic pages, blog cadence, schema — so the strategy can ground its competitor narrative in evidence.
Data sources
- Search Atlas Site Explorer v2 (primary) — holistic SEO scores, brand signals, organic keywords, organic pages, position distribution, indexed pages.
- Ahrefs — top-pages, DR history, traffic history.
- DataForSEO Labs — `google_competitors_domain` for rival selection cross-validation.
- FireCrawl — light crawl of rival homepage + top-5 pages for architecture / schema / blog signals.
Methodology
- Auto-detects the rival if not specified: top DFS competitor with >50 keyword intersections and avg position <30 (skipping aggregators like Healthgrades, WebMD).
- Pulls rival's SA holistic scores; pulls the same for the practice; computes deltas across topical authority, brand signal, top-10 page traffic, indexed-page count, blog cadence.
- Extracts cities-served from each domain's top-200 organic keywords — surfaces "cities the rival ranks for that we don't" for market-expansion strategy.
Rubric contribution
Diagnostic only — does not own a sub-criterion. Emits rubric_contributions.evidence_only.rival_diagnostics which the composite scorer overlays onto Cat 1 and Cat 4 evidence. The skill explicitly will not invent a category score; it informs, doesn't grade.
Adirondack: Top rival auto-detected as diamondbraces.com (61 keyword intersections, $261 ETV — strongest geographic match). Secondary rivals smileworksnyc.com (NYC) and coastlineorthodontics.com (FL) flagged as not true local rivals.
Green Ortho: Top rivals all surfaced as Atlanta-area practices — the audit raised a geography_note warning that the intake address may be wrong (the practice claims NJ/Westwood but ranks like an Atlanta practice).
Graph-theoretic internal linking (PageRank, click-depth, cohesion)
Treat the website as a directed graph of pages and links — then run the same algorithms Google uses to decide which pages are important.
Data sources
- None external. All math runs locally in Python on `crawl_pages.json`. Sub-30 seconds on small sites, 1–3 minutes on a 300-page site.
The 5 algorithms — in plain English
- Internal PageRank. Every page starts equal. On each iteration, a page hands a share of its authority to every page it links to. After ~100 passes the numbers stabilize — you have a ranking of which internal pages are most "trusted" by the site's own link structure. Top of the list should be homepage + Mother pillars. If not, the link graph is misallocating equity.
- Click-depth. Walk outward from the homepage in breadth-first order: depth 1 = one click away, depth 2 = two clicks away. Anything at depth ≥4 is buried — Google re-crawls those less often and weights them lower.
- Anchor distribution. For every target page, classify each incoming anchor (brand / exact-match / partial / generic / naked URL). Targets where exact-match >50% are over-optimization risks (Penguin territory). Where generic ("click here") >60% the anchor carries no semantic signal to Google.
- Cluster cohesion. For each Mother service, find the Mother page and its Son pages. Ask: of all possible Son-to-Son pairs, what fraction actually cross-link? Score 0.0 (isolated silos) to 1.0 (every sister links to every sister). Healthy ortho clusters: 0.4–0.7.
- PR concentration. What % of total PageRank do the top 5 pages hold? 40–60% is healthy. >70% means the site is so homepage-heavy nothing else can rank. <30% means equity is so diffuse no page is strong enough to win competitive keywords.
The audit then proposes the top 15 high-leverage link moves — specific source page → target page → suggested anchor — ranked by source_PR × (1/target_PR) × topical_relevance × priority_bonus. "This page has lots of authority to spare, this other page is starved despite being important, and they're topically related — add the link with this anchor."
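A minimal sketch tying the two pieces together: a plain internal-PageRank power iteration over a toy link graph, then the link-move formula above applied to one candidate edge. The graph, relevance value, and bonus are hypothetical.

```python
def pagerank(links: dict[str, list[str]], damping: float = 0.85, iters: int = 100) -> dict[str, float]:
    """Plain internal PageRank over the site's link graph (power iteration)."""
    pages = list(links)
    pr = {p: 1.0 / len(pages) for p in pages}
    for _ in range(iters):
        nxt = {p: (1 - damping) / len(pages) for p in pages}
        for src, outs in links.items():
            if not outs:                                  # dangling page: spread evenly
                for p in pages:
                    nxt[p] += damping * pr[src] / len(pages)
            else:
                for dst in outs:
                    nxt[dst] += damping * pr[src] / len(outs)
        pr = nxt
    return pr

def move_priority(src_pr: float, dst_pr: float, relevance: float, bonus: float = 1.0) -> float:
    """source_PR x (1/target_PR) x topical_relevance x priority_bonus."""
    return src_pr / dst_pr * relevance * bonus

# Hypothetical 4-page site: homepage hoards equity, /invisalign-cost/ is starved.
links = {
    "/": ["/braces/", "/invisalign/"],
    "/braces/": ["/"],
    "/invisalign/": ["/", "/invisalign-cost/"],
    "/invisalign-cost/": ["/"],
}
pr = pagerank(links)
print(round(move_priority(pr["/"], pr["/invisalign-cost/"], relevance=0.9, bonus=1.5), 1))
```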
Rubric contribution
Cat 3.5 Internal Linking (50 pts): orphan rate (12), click-depth (12), anchor distribution (8), cluster cohesion (10), PR concentration (8). Replaces the simpler "inbound link count" metric the older page-deep-dive skill used.
Sub-audit fully specified; emit script complete. Hasn't been driven end-to-end on either example practice yet. Expected envelope shape on a typical 124-page site: "542 edges, 2 orphans, avg depth 2.3, top 5 hold 52% of PR (healthy)."
Brand mentions + share-of-voice across LLM responses
When real people (and AI engines on their behalf) ask LLMs questions in this practice's space, how often does the brand show up — and is it gaining or losing ground?
Data sources (triangulated)
- Search Atlas Brand Radar (primary, 8 endpoints) — mentions overview + history, share-of-voice overview + history, cited-pages, cited-domains, impressions overview + history.
- Ahrefs Brand Radar (cross-validation, 7 endpoints) — AI-response samples, mentions, SoV, cited-pages/domains.
- DataForSEO LLM Mentions (third triangulation) — aggregate metrics, searchable mention list, top cited pages/domains.
- DataForSEO SERP Live — AI Overview parsing on 5–10 focal queries (e.g., "best orthodontist in Albany", "Invisalign Albany").
Methodology
- Pulls a 30-day window from all three providers in parallel (90 days if a recent rebrand).
- Triangulates mention counts; if sources disagree by >50%, takes the median (no single-source scores).
- Detects share-of-voice by comparing the practice to a resolved rival domain (from ground_truth.json or auto-detected via SA's organic competitors, filtering out aggregators).
- Parses each focal SERP's `ai_overview.references[]` to count how many AI Overviews actually cite the practice's domain.
- Classifies every cited URL by path (homepage vs service vs location vs about) so the rubric can reward depth, not just homepage hits.
- Computes 30-day mention slope (last-7-day avg vs first-7-day avg) for trajectory.
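A minimal sketch of the triangulation rule and the 30-day slope. How the providers are merged when they agree closely is not specified above, so the mean there is an assumption.

```python
from statistics import median

def triangulate(counts: dict[str, int]) -> int:
    """Merge mention counts from the three providers.

    If max and min disagree by more than 50% of the max, fall back to the
    median rather than trusting any single source (per the rule above).
    """
    values = sorted(counts.values())
    if values[-1] and (values[-1] - values[0]) / values[-1] > 0.5:
        return round(median(values))
    return round(sum(values) / len(values))   # close agreement: simple mean (assumption)

def mention_slope(daily: list[int]) -> float:
    """Last-7-day average minus first-7-day average over a 30-day window."""
    return sum(daily[-7:]) / 7 - sum(daily[:7]) / 7

# Hypothetical provider counts for one 30-day window:
print(triangulate({"search_atlas": 208, "ahrefs": 230, "dataforseo": 95}))   # disagreement -> median
```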
Key concepts in plain English
- Mention share = what % of LLM responses for queries in your space include your brand name at all. 50% means an LLM mentions you in half the answers it generates for in-category prompts.
- Share of voice = of all brand mentions in those responses, what % are yours vs competitors'.
- AI Overview citations = Google's AI Overview shows ~3–8 source links per answer; this counts how many of your focal queries cite your domain as a source. This is the AI-era equivalent of position 1.
Rubric contribution
Drives Cat 6 GEO in full — 50 pts: 6.1 Mention Presence (15), 6.2 SoV vs Rival (10), 6.3 AI Overview Citations (10), 6.4 Cited-Pages Quality (10), 6.5 Sentiment & Trajectory (5).
Scored 28/50. 208 total mentions across ChatGPT / Gemini / Perplexity / Copilot / Google AI Mode / Grok over 30 days, sentiment 71.6 (positive). But visibility score down 11.7 points from 61.7 and mention volume down 41 (249 → 208) — trajectory negative. Per-platform visibility ranges from 40% (ChatGPT) to 80% (Copilot, Google AI Mode, Grok).
STANCE compliance for SA tracking setup
Is the agency's Search Atlas tracking setup for this practice actually disciplined, or is it quietly burning credits on garbage data?
Data sources
- Search Atlas only (brand-routed) — KRT projects + keyword details, Local SEO heatmap businesses + heatmaps, LLM Visibility projects + queries/topics, GBP citation submission history, OTTO engagement, GBP location detail.
Methodology
- Detects the practice's vertical (orthodontics, dermatology, periodontics, chiropractic, etc.) and generates a candidate keyword set from the per-vertical YAML template — what should be tracked.
- Pulls actual tracked state from Search Atlas across KRT, heatmaps, LLMV, citations, OTTO, GBP.
- Diffs candidate vs actual: flags gaps (recommend add), excess (consider remove), matches (good).
- Runs 10 anti-pattern checks against the STANCE rules (word-order over-tracking, near-me in KRT, mobile-desktop doubling, cross-city tracking, missing brand, high NR rate, radius-market mismatch, multi-specialty in one project, stale heatmaps, no citation history).
- Builds a credit-costed action list (rough $/month per add or remove) so operators can weigh tradeoffs.
What "STANCE" is
Not an acronym — shorthand for the agency's source-of-truth doc "Agency Tracking-Setup Standards — Canonical Stance" (setup/research/STANCE.md), which codifies 10 numbered rules compiled from Whitespark, Sterling Sky, BrightLocal, Local Falcon, SearchAtlas docs, and a 41-project NEON/HIP audit. Headline rules: canonical word-order {service} in {city} {state}, near-me → heatmaps not KRT, one KRT project per location, mobile is source of truth, heatmap radius bounded by ~20-min drive-time.
Rubric contribution
Evidence-only — does NOT move the V5 score. STANCE compliance is methodology hygiene. Writes evidence rows into Cat 2 / Cat 6 / Cat 8 for internal review only. Never appears in client-facing strategy presentations.
Vertical = orthodontics, 16 tracked kws vs a 36-kw candidate set (35 gap, 12 extras). 7 anti-patterns detected including 2 critical: word-order over-tracking ("orthodontist albany" × 3 variants — wastes $2.50/mo) and cross-city tracking (5 of 16 kws / 31% target Rochester or other non-served cities). Brand name "Adirondack Orthodontics" is completely untracked — no anchor for NR debugging. 2 of 4 expected heatmap businesses set up. Zero citation scans run.
OTTO will make suggestions. We validate every single one.
Search Atlas OTTO is a recommendation engine — useful but noisy. The plugin scores every OTTO rec 0–100 against the other 14 sub-audits' findings and bucketizes it before anything enters the action plan.
OTTO surfaces hundreds of "fix this" items per site. Without triage, we'd drown in 184 recommendations with no way to distinguish a true critical fix from "add a meta description to your privacy page." Worse: we'd deploy noise to clients. OTTO-overlay validates against everything else we know about the site before recommending action.
The 5 scoring dimensions (100 pts before penalties)
| Dimension | Max | What it measures (and why this weight) |
|---|---|---|
| Corroboration | 40 | Most important signal. Does another independent sub-audit recommend the same fix on the same URL? +40 = technical-audit AND page-deep-dive both flagged this exact page for this exact issue. +20 = another audit flagged the same category of issue on a different URL. 0 = OTTO is alone. Three independent measurement systems agreeing → the fix is real. |
| Page priority | 20 | Homepage = +20. Pillar Mother page = +16. Top-10 inbound-link page = +12. Generic valid page = +4. A fix on a money page is worth more than the same fix on a buried page. |
| Specificity | 15 | Three +5 increments: has a real URL, has an actionable verb ("add" / "fix" / "remove"), has a measurable target ("compress to <100KB" beats "improve speed"). Vague recs get filtered. |
| V5 weakness | 15 | Pulls from v5_score.json — if the rec addresses a category currently scoring <50% of max, +15. 50–75%, +8. >75%, +2. Prioritize fixes in the categories where the practice is actually weak. |
| OTTO confidence | 10 | OTTO's own severity flag — critical +10, warning +6, info +2. Weighted lightest because it's vendor self-rating. |
The 4 noise penalties (subtractive)
| Penalty | Pts | Trigger |
|---|---|---|
| Low-value page target | −30 | URL matches /privacy/, /terms/, /sitemap/, /thank-you/, /404/. OTTO will happily tell you to optimize the meta description of your privacy policy. No. |
| Conflict with another audit | −20 | Another sub-audit explicitly disagrees. Example: OTTO says "expand content" but page-deep-dive SCHOLAR-L scored 9/10 — page is fine; OTTO is wrong. |
| Already fixed | −30 | SCHOLAR criterion this rec addresses already scores ≥8/10 in the crawl. The fix was done; OTTO hasn't recrawled. |
| Stale OTTO data | −10 | OTTO's last_recrawl_at is older than 30 days. Confidence decays with age. |
Three output buckets
Worth deploying
Strong corroboration + good page + specific. Flows into the action list and Gold Modules.
Human judgment
Maybe valid, maybe not. Strategist decides case-by-case. Goes into the review CSV.
Drop, with reasons logged
Explicit reasons listed (e.g., "conflicts with content-quality finding X"). Suppressed but auditable.
What this looks like in practice — same issue, different bucket
KEEP · score 84
"Add meta description to /braces/"
- Corroboration: +40 (technical-audit AND page-deep-dive both flagged this URL)
- Page priority: +18 (pillar page)
- Specificity: +11
- V5 weakness: +8
- OTTO confidence: +6
- Penalties: 0
NOISE · score 28
"Add meta description to /privacy/"
- Corroboration: 0 (no other audit flagged)
- Page priority: +4
- Specificity: +11
- V5 weakness: +8
- OTTO confidence: +6
- Penalty: −30 low-value page target
Suppressed. Reason: low_value_page_target.
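A minimal sketch of the scoring arithmetic and bucketing. The keep/review/noise cutoffs shown are assumptions; the actual thresholds live in the OTTO-overlay skill.

```python
def score_otto_rec(corroboration: int, page_priority: int, specificity: int,
                   v5_weakness: int, otto_confidence: int, penalties: int = 0) -> tuple[int, str]:
    """Sum the five dimensions, subtract penalties, and bucket the result.

    Bucket thresholds (keep >= 70, noise < 40) are illustrative assumptions.
    """
    score = (corroboration + page_priority + specificity
             + v5_weakness + otto_confidence - penalties)
    if score >= 70:
        return score, "keep"
    if score >= 40:
        return score, "review"
    return score, "noise"

# Hypothetical recs mirroring the two cards above (component values approximate):
print(score_otto_rec(40, 18, 11, 8, 6))              # strong corroboration -> keep
print(score_otto_rec(0, 4, 11, 8, 6, penalties=30))  # low-value page target -> noise
```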
OTTO is a recommendation source, not a measurement. Scoring V5 with OTTO data would be circular — you'd be scoring the practice on the volume of OTTO's suggestions rather than the underlying signal. The Keep bucket flows into the Gold Modules priority list and the internal audit report's Section 10, but never into the score itself, and never into the client-facing strategy presentation.
Page titles and meta descriptions
The receipts on why the audit's recommendations land where they do.
This section exists because you sent a proposed OTTO prompt asking for 40–50 character titles with "Best [Service]" required and "Fast Braces" allowed. The audit's recommendations land differently. Here's why.
The TL;DR rules the audit enforces
- Title: 50–58 characters, primary keyword forward, no superlatives on YMYL pages, no year, no "near me", state as 2-letter abbreviation, ≥80% token similarity to H1.
- Description: 150–160 characters, primary keyword in first 90 chars, CTA in first 120 chars (mobile-safe), no phone numbers, no hype.
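A hedged sketch of a title check against the rules above, using Jaccard token overlap as a stand-in for the real title-to-H1 similarity metric and an illustrative superlative list.

```python
import re

def check_title(title: str, h1: str, ymyl: bool = True) -> list[str]:
    """Flag violations of the title rules above. Superlative list is illustrative."""
    issues = []
    if not 50 <= len(title) <= 58:
        issues.append(f"length {len(title)} outside 50-58")
    if ymyl and re.search(r"\b(best|#1|top.rated)\b", title, re.I):
        issues.append("superlative on a YMYL page")
    if re.search(r"\bnear me\b", title, re.I):
        issues.append("contains 'near me'")
    if re.search(r"\b20\d{2}\b", title):
        issues.append("contains a year")
    # Token similarity vs H1 (Jaccard overlap as a stand-in for the real metric).
    t, h = set(re.findall(r"\w+", title.lower())), set(re.findall(r"\w+", h1.lower()))
    if h and len(t & h) / len(t | h) < 0.8:
        issues.append("title/H1 token similarity below 80%")
    return issues

print(check_title("Best Braces Near Me 2025 | Albany Orthodontist",
                  "Braces in Albany, NY"))
```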
Why 50–58 chars — Tier 1 empirical
Zyppy ran a 2025 study of 80,959 titles across 2,370 sites measuring how often Google rewrites a title by length:
| Title length | Google rewrite rate | Implication |
|---|---|---|
| < 50 chars | 50–96% | Google replaces our title — usually with H1 + brand |
| 51–55 chars | ~40% | Lowest rewrite rate of any bucket |
| 56–60 chars | ~45% | Still in the low-rewrite zone |
| 61–70 chars | ~70% | Often truncated on desktop |
| > 70 chars | ~100% | Always rewritten |
Mobile SERP truncation cuts at ~500 px ≈ 50 chars. So 50–58 is the sweet spot — in Zyppy's lowest-rewrite bucket AND inside mobile display. The 40–50 cap in the original prompt would push most pages into the 50–96% rewrite bucket. Empirical loss of brand control, not a judgment call.
Why no superlatives — what Google actually says
Orthodontics is YMYL (Your Money or Your Life) by definition. YMYL pages get extra Trust weighting in raters' assessments.
What we infer (our professional judgment, named honestly):
- Unsubstantiated superlatives ("Best", "#1", "Top-Rated") create Trust-debit signals — even though the QRG doesn't specifically call out superlatives. That's our read.
- FTC §5 truth-in-advertising applies broadly. "Best orthodontist" is a comparative claim needing substantiation — no industry-standard ranking methodology supplies it.
- State dental boards typically prohibit "false, misleading, or deceptive" advertising and require substantiation of comparative claims (see §7).
Honest caveats — what we cannot cite
- The 2025 FTC max civil penalty under §5 is $53,088 per violation — but only after specific procedural triggers. NOT an automatic per-superlative fine.
- There is no FTC enforcement action against an orthodontic practice for "Best Orthodontist" titles we can cite. OrthoAccel v. Propel (2016) was a private civil suit between manufacturers.
- The Google QRG does NOT specifically call out superlatives. That linkage is our inference.
The risk is real but indirect. We present it as such — which is why the audit's recommendations are tiered, not absolute.
State dental board awareness
When a YMYL violation fires, the audit names the relevant state regulator and the governing statute citation.
The audit knows the practice's state from ground_truth.address.state. For superlative or accelerated-treatment violations, the envelope now surfaces the relevant state dental board, the statute citation pattern, and a verification URL.
We name the regulator + the statute + the verification URL. We don't interpret whether specific phrasing violates the current rule. That's between the client + their attorney + the dental board.
What's seeded
Full citations on file
California (BPC §651), Texas (22 TAC §108), Florida (FAC 64B5-13.0046 + FS 466), New York (NY Educ. Law §6509 + 8 NYCRR §29).
Regulator + URL seeded
PA, IL, OH, GA, NC, MI, VA, AZ, MA, WA, NJ — flagged needs_verification until we confirm rule text.
Default fallback
Names "the state dental board for [state]" and points to a directory. No specific claims made about rule text.
SERP context — when competitors normalize "best"
If the live SERP for the practice's primary keyword rewards superlatives, our default may be counter-productive. The audit pulls competitor titles and surfaces a recommendation bucket.
Hold the line
SERP isn't rewarding the pattern. Keep no-superlatives default. Differentiate with a substantiable claim.
Differentiate
Mixed signal. Differentiate with a substantiable claim (years in practice, specific tech, board cert, patient volume).
Match or differentiate
SERP normalizes superlatives. Surface both paths: (a) match with signed client ack, or (b) differentiate with a substantiable claim competitors aren't making.
The bucket gets surfaced; the strategist makes the call with the client. If we pick "match with client ack," client_overrides.yaml records the exception so the audit history tracks it cleanly. The trail is what protects the agency if a regulator ever asks.
Coming next — Apify NAP, SEMrush, Moz Pro
Adding three more data sources so the audit accounts for whatever tool a client points at our work.
Clients run their own audits with Claude using Ahrefs, SEMrush, or Moz Pro. When they come back with issues we didn't surface, that's a credibility problem. The fix is to use all of them — so our plan already accounts for whatever any tool they have could possibly raise.
Apify NAP citation audit
Today's citation-audit uses Search Atlas's citation scan. Apify gets us coverage on directories SA doesn't index, plus niche orthodontic-specific directories. Ground truth NAP becomes the truth set; Apify-scraped citations get diffed against it.
SEMrush API integration
Unique strengths in keyword-difficulty modeling and competitor traffic estimation. SEMrush feeds the median-of-N composite scorer for content-gap, real-rankings, and backlink-gap.
Moz Pro API integration
Domain Authority + Page Authority + Spam Score — metrics clients reference. Lets us run their report against ours and resolve discrepancies before they surface them.
The strategic point
By v3.15 the audit uses 7 data sources in parallel: Search Atlas (×2 brands), Ahrefs, DataForSEO, Firecrawl, Apify, SEMrush, Moz Pro. Whatever tool a client points at our work, our plan already accounts for what that tool would surface. No more reactive audits where a client's SEMrush report blindsides us.
How to run an audit
Inputs, command, what to look at first when it finishes.
Inputs needed
• Practice domain (adirondackorthodontics.com)
• GBP location ID (or address)
• Primary keyword (orthodontist albany ny)
• Brand: HIP or NEON Canvas
Command
```
# From a Claude Code session in /seoaudit/:
/seo-audit-and-plan adirondackorthodontics.com
```
Wall time
Typical: 30–45 min end-to-end. ~80% is parallel sub-audit dispatch (limited by API rate limits). Larger sites (200+ pages): 60–75 min.
What to read first
1. V5_AUDIT_REPORT.md — start at §11 (KPIs) and §4 (Gold Modules)
2. strategy-presentation.html — what the client sees
3. v5_score.json — composite scoring per-source
4. action_items.csv — what we'd queue
Push back on anything in here.
Every rule, threshold, citation, and weight in this doc is editable. If something looks wrong, name it — we'll dig into the research or change the rule. The whole point of the methodology being this visible is so you can challenge it.
Email Justin