The SIGNALS Framework: 7 Dimensions of AI Citation Readiness

SIGNALS scores pages across 7 dimensions that predict citation frequency by ChatGPT, Claude, Perplexity, and Google AI Overviews. Each dimension is weighted by its measured effect size from peer-reviewed research — not by assumption or intuition.

The name is an acronym: Structure, Intent, Grounding, Newness, Alignment, Language, Substantiation. The order reflects the pipeline: structural signals determine whether a page can be retrieved and parsed; content signals determine whether it gets cited.

Score composition at a glance

The 7 dimensions are weighted by measured effect size. Alignment carries the most weight because it's the only dimension with a documented causal effect independent of domain authority.

  • Alignment: 35%
  • Substantiation: 15%
  • Grounding: 13%
  • Structure: 12%
  • Intent: 10%
  • Language: 10%
  • Newness: 5%

Victor Xu
Founder, SIGNALS · Building AI visibility infrastructure for websites
Updated May 2026

Each dimension explained

S
Structure — 12%
Is the content organized for AI extraction?
Source: ConvertMate 2026, AgentGEO 2026

Structure measures whether a page is organized so that AI systems can reliably extract its content and understand its hierarchy. The key finding from ConvertMate's 2026 benchmark: 68.7% of pages actually cited by AI engines use a logical H1→H2→H3 heading hierarchy. Most pages don't.

The problem isn't just that unstructured pages are harder to parse — it's that AI systems use heading structure to understand what a page is about. A page with multiple H1s, skipped heading levels, or headings used purely for styling gives the AI system conflicting signals about content organization.

What SIGNALS checks
  • One H1 per page, present and content-relevant
  • H2s present and covering distinct subtopics
  • No skipped heading levels (e.g., H1 directly to H3)
  • Content sections are self-contained (200–400 words)
  • TL;DR or summary box present for long pages
  • No content buried in JavaScript-rendered tabs or accordions
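
The heading checks above can be sketched in a few lines. This is a minimal illustration using Python's standard html.parser, not the SIGNALS implementation; the names HeadingCollector and check_heading_structure are hypothetical, and the sketch covers only the H1 count, H2 presence, and skipped-level checks.

```python
from html.parser import HTMLParser


class HeadingCollector(HTMLParser):
    """Collects heading levels (1-6) in document order."""

    def __init__(self):
        super().__init__()
        self.levels = []

    def handle_starttag(self, tag, attrs):
        # html.parser lowercases tag names, so "h1".."h6" is enough.
        if len(tag) == 2 and tag[0] == "h" and tag[1] in "123456":
            self.levels.append(int(tag[1]))


def check_heading_structure(html: str) -> dict:
    """Hypothetical helper: reports three of the Structure checks."""
    parser = HeadingCollector()
    parser.feed(html)
    levels = parser.levels
    # A skip is any step that goes more than one level deeper (H1 -> H3).
    skips = [(a, b) for a, b in zip(levels, levels[1:]) if b - a > 1]
    return {
        "single_h1": levels.count(1) == 1,
        "has_h2": 2 in levels,
        "skipped_levels": skips,
    }
```

Calling check_heading_structure on a page's raw HTML returns a small report. Moving back up the hierarchy (H3 to H2) is not treated as a skip, only jumps to a deeper level are.
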
I
Intent — 10%
Does the page cover adjacent buyer intents?
Source: Princeton GEO 2024 — query fan-out analysis

Princeton's GEO study included a query fan-out analysis showing that when a user types one prompt into an AI system, it internally decomposes that prompt into multiple sub-queries before retrieving pages. A page that only covers the primary intent gets cited for fewer of those sub-queries.

The practical implication: a page about "enterprise VR training" should also address "how much does enterprise VR training cost," "how long does enterprise VR training take to implement," and "enterprise VR training vs. classroom instruction." Each of those is a sub-query that AI systems check when answering the primary question.

What SIGNALS checks
  • H2s cover adjacent buyer questions, not just the primary topic
  • FAQ section addresses multiple buyer intent variations
  • Comparison content present (vs. alternatives, vs. traditional approaches)
  • Cost, timeline, and implementation questions addressed
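
One rough way to approximate the adjacent-intent check is keyword matching against a page's H2s. The cue lists and the function name intent_coverage below are illustrative assumptions, not the vocabulary SIGNALS actually uses.

```python
# Hypothetical cue phrases per adjacent intent; a real audit would
# derive these from buyer query data rather than a fixed list.
INTENT_CUES = {
    "cost": ("cost", "price", "pricing"),
    "timeline": ("how long", "timeline", "time to"),
    "comparison": (" vs", "versus", "compared"),
    "implementation": ("implement", "set up", "deploy"),
}


def intent_coverage(headings: list[str]) -> dict[str, bool]:
    """Returns, per intent, whether any heading contains one of its cues."""
    lowered = [h.lower() for h in headings]
    return {
        intent: any(cue in h for h in lowered for cue in cues)
        for intent, cues in INTENT_CUES.items()
    }
```

Run against the three example sub-queries from the "enterprise VR training" case above, every intent is reported as covered; a page whose only H2 is "What is enterprise VR training?" covers none of them.
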
G
Grounding — 13%
Are claims verifiable and sourced?
Source: Princeton GEO 2024 — +41% from statistics, +28% from expert quotes

Princeton's controlled experiment tested 22 content modifications across 10,000 queries. Adding statistics with source citations increased AI citation frequency by 41%. Adding named expert quotes increased it by 28%. These are among the largest effect sizes in the study — and they're achievable without rebuilding a page.

The mechanism makes sense: AI systems prefer citing content that makes verifiable claims. "VR reduces learning time significantly" is not verifiable. "VR reduces learning time by 40% compared to classroom instruction, according to a PwC study of 1,000 employees (2023)" is. The second version is a citation unit — the AI system can quote it directly and the user can verify it.

What SIGNALS checks
  • At least 2 statistics with named sources and years
  • Expert quotes with name and role/affiliation
  • Direct answer opening (BLUF) — key claim in first 2 sentences
  • No unsourced superlatives ("best," "leading," "most")
  • Claims are specific enough to be verifiable
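
As a rough sketch of how the first and fourth checks could be automated, the heuristic below flags sentences that pair a percentage with a year and an attribution phrase, and separately flags superlatives with no attribution. The regular expressions and the name grounding_report are assumptions for illustration; real pages would need a more careful sentence splitter and source detector.

```python
import re

STAT = re.compile(r"\d+(?:\.\d+)?\s?%")                 # e.g. "40%" or "68.7 %"
YEAR = re.compile(r"\b(?:19|20)\d{2}\b")                # four-digit year
ATTRIBUTION = re.compile(r"according to|source:|study|survey|report", re.IGNORECASE)
SUPERLATIVE = re.compile(r"\b(best|leading|most)\b", re.IGNORECASE)


def split_sentences(text: str) -> list[str]:
    # Naive splitter: breaks after ., ! or ? followed by whitespace.
    return [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]


def grounding_report(text: str) -> dict:
    """Hypothetical helper: counts sourced statistics and unsourced superlatives."""
    sourced, unsourced_superlatives = [], []
    for sentence in split_sentences(text):
        has_stat = bool(STAT.search(sentence))
        has_source = bool(YEAR.search(sentence)) and bool(ATTRIBUTION.search(sentence))
        if has_stat and has_source:
            sourced.append(sentence)
        elif SUPERLATIVE.search(sentence) and not ATTRIBUTION.search(sentence):
            unsourced_superlatives.append(sentence)
    return {
        "sourced_statistics": sourced,
        "unsourced_superlatives": unsourced_superlatives,
        "meets_minimum": len(sourced) >= 2,  # "at least 2 statistics" check
    }
```

With the two example sentences above, only the PwC version counts as a sourced statistic; "VR reduces learning time significantly" contributes nothing.
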
N
Newness — 5%
Is the content demonstrably current?
Source: Discovered Labs 2026 — β=+0.05

Page recency has a small but real positive effect on AI citation frequency — Discovered Labs measured β=+0.05. This is the smallest effect size in the framework, which is why Newness carries only 5% of the score. It's real, but it's not a priority fix unless everything else is already addressed.

What matters for Newness isn't just when the page was published — it's whether recency is visible on the page. An AI system reading raw HTML doesn't have access to CMS metadata. If "Last Updated: March 2025" isn't visible in the page body, it doesn't register.

What SIGNALS checks
  • Visible "Last Updated" date in page body (not just meta tags)
  • Statistics reference years within the last 3 years
  • Internal date consistency (no conflicting years across the page)
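
The distinction between visible dates and metadata can be made concrete with a short sketch. It assumes "visible" means text outside head, script, and style elements, and it treats any year older than max_age as stale; the names VisibleText and newness_report, and both of those rules, are illustrative assumptions.

```python
import re
from html.parser import HTMLParser


class VisibleText(HTMLParser):
    """Collects text that sits outside <head>, <script>, and <style>."""

    HIDDEN = {"head", "script", "style"}

    def __init__(self):
        super().__init__()
        self.depth = 0
        self.chunks = []

    def handle_starttag(self, tag, attrs):
        if tag in self.HIDDEN:
            self.depth += 1

    def handle_endtag(self, tag):
        if tag in self.HIDDEN and self.depth:
            self.depth -= 1

    def handle_data(self, data):
        if self.depth == 0:
            self.chunks.append(data)


UPDATED = re.compile(r"last updated:?\s*([A-Za-z]+\s+)?((?:19|20)\d{2})", re.IGNORECASE)
YEAR = re.compile(r"\b((?:19|20)\d{2})\b")


def newness_report(html: str, current_year: int, max_age: int = 3) -> dict:
    """Hypothetical helper: looks for a visible update date and stale years."""
    parser = VisibleText()
    parser.feed(html)
    text = " ".join(parser.chunks)
    match = UPDATED.search(text)
    years = sorted({int(y) for y in YEAR.findall(text)})
    return {
        "visible_last_updated": match is not None,
        "updated_year": int(match.group(2)) if match else None,
        "stale_years": [y for y in years if current_year - y > max_age],
    }
```

A date that appears only in a meta tag attribute never reaches handle_data, so it is reported as not visible, which mirrors the point made above.
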
A
Alignment — 35%
Does page vocabulary match buyer search language?
Source: Discovered Labs 2026 — β=+0.37, q≈10⁻⁷³ · Dominant signal

Alignment is the framework's dominant signal, and the reason it carries 35% comes down to one methodological choice that separates the Discovered Labs study from everything before it: domain fixed-effects controls.

Most content-citation correlation studies don't control for the domain. When you look at which pages get cited by AI engines, high-authority domains dominate — and high-authority domains tend to have better-structured content. So it looks like content quality drives citation. The Discovered Labs study used fixed-effects regression and double machine learning to isolate true causal effects at the page level, controlling for domain. When they did this, almost everything disappeared. Alignment — vocabulary matching buyer search intent — survived with β=+0.37.

Why this matters for small sites: If alignment is the signal that survives domain controls, it means a small site on a new domain can outperform an established brand in AI citation — if its vocabulary alignment is stronger. The large brand's domain authority doesn't protect it at the ranking stage if its pages use internal language instead of buyer language. We've seen this in audits repeatedly.

What SIGNALS checks
  • Page title matches documented buyer search queries for this topic
  • Opening paragraph contains 3+ buyer search phrases
  • H2s use buyer vocabulary, not internal product terminology
  • Meta description mirrors buyer query language
  • No jargon or internal language without buyer-facing explanation
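
A simplified version of these checks can be written as phrase matching. The sketch assumes you already have a list of buyer search phrases and a list of internal product terms; both lists, the product name "ImmersaSuite" in the usage note, and the function names are hypothetical.

```python
def phrase_hits(text: str, phrases: list[str]) -> list[str]:
    """Returns the phrases that occur in text, ignoring case and extra spaces."""
    lowered = " ".join(text.lower().split())
    return [p for p in phrases if p.lower() in lowered]


def alignment_report(
    title: str,
    opening: str,
    h2s: list[str],
    buyer_phrases: list[str],
    internal_terms: list[str],
) -> dict:
    """Hypothetical helper covering the title, opening, and H2 checks."""
    opening_hits = phrase_hits(opening, buyer_phrases)
    return {
        "title_matches_query": bool(phrase_hits(title, buyer_phrases)),
        "opening_hits": opening_hits,
        "opening_ok": len(opening_hits) >= 3,  # "3+ buyer search phrases"
        "jargon_h2s": [h for h in h2s if phrase_hits(h, internal_terms)],
    }
```

For example, with buyer phrases such as "enterprise vr training" and "vr training cost" and an internal term such as "ImmersaSuite", an H2 reading "ImmersaSuite Platform Overview" would be listed under jargon_h2s. Exact substring matching is deliberately strict; a production check would likely allow for word order and inflection.
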
L
Language — 10%
Are the page title and opening phrased for buyer queries?
Source: Princeton GEO 2024 — β=+0.09 title-prompt similarity, +15–30% fluency lift

Language is distinct from Alignment in scope: Alignment measures vocabulary throughout the page; Language focuses specifically on the title, H2s, and opening paragraph — the signals that AI systems weight most heavily when determining relevance at the ranking stage.

Princeton found that title-to-prompt similarity has an independent effect of β=+0.09, and that overall fluency optimization (writing that reads naturally for buyers rather than keyword-stuffed for search engines) increases AI visibility by 15–30%. The two effects compound: a well-phrased title that matches the buyer query, combined with a clear opening paragraph, can meaningfully move a page's ranking in the retrieved set.

What SIGNALS checks
  • Title phrased as a buyer query or directly answering one
  • H2s phrased as questions buyers actually ask
  • Opening paragraph reads naturally for a buyer, not keyword-stuffed
  • No passive voice or evasive hedging in key claims
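
Title-to-prompt similarity can be approximated in several ways. The text above does not say which similarity measure the study used, so the sketch below substitutes a plain bag-of-words cosine similarity as a stand-in; the function names are hypothetical.

```python
import math
import re
from collections import Counter


def tokens(text: str) -> list[str]:
    # Lowercased alphanumeric tokens; punctuation is dropped.
    return re.findall(r"[a-z0-9]+", text.lower())


def cosine_similarity(a: str, b: str) -> float:
    """Bag-of-words cosine similarity between two strings (0.0 to 1.0)."""
    ca, cb = Counter(tokens(a)), Counter(tokens(b))
    dot = sum(ca[t] * cb[t] for t in ca)
    norm = math.sqrt(sum(v * v for v in ca.values())) * math.sqrt(
        sum(v * v for v in cb.values())
    )
    return dot / norm if norm else 0.0
```

Against the buyer prompt "enterprise VR training cost", a title like "How much does enterprise VR training cost?" scores well above zero, while a title built from internal terminology with no shared words scores 0.0. An embedding-based measure would capture paraphrases that this word-overlap version misses.
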
S
Substantiation — 15%
Do external sources validate this content?
Source: ConvertMate 2026 — 6.5× citation multiplier from third-party mentions

Substantiation measures whether a brand or page has cross-domain validation — presence on third-party sites beyond its own domain. ConvertMate's 2026 benchmark found that brands mentioned on external domains receive 6.5× more AI citations than brands that exist only on their own site. This is the largest multiplier effect in the framework.

The underlying mechanism is trust: AI systems apply a credibility filter that downweights self-reported claims. A brand saying "we're the industry leader" carries less weight than a third-party source saying it. G2 reviews, press coverage, Reddit mentions, and ProductHunt listings all contribute to this cross-domain signal.

Substantiation is also the hardest dimension to improve quickly — you can't manufacture legitimate third-party mentions. It requires earning them through product launches, PR, customer reviews, and community presence. This is why it carries 15% rather than 25% — the signal is real and significant, but it's not something you can fix by editing your page.

What SIGNALS checks
  • Named third-party sources mentioned or linked on the page
  • Customer results or case studies with specific metrics
  • Author credentials stated (name, role, relevant experience)
  • Press or media mentions referenced
  • G2, Capterra, or review platform ratings visible
  • Presence in the Brave, Bing, and Google indexes — the retrieval-level cross-domain signal AI engines read from
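
Only the on-page part of this list lends itself to a quick script. The sketch below lists the third-party domains a page links to, using Python's standard html.parser and urllib.parse; it does not touch the index-presence check, and the names LinkCollector and external_domains are hypothetical.

```python
from html.parser import HTMLParser
from urllib.parse import urlparse


class LinkCollector(HTMLParser):
    """Collects href values from <a> tags."""

    def __init__(self):
        super().__init__()
        self.hrefs = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            href = dict(attrs).get("href")
            if href:
                self.hrefs.append(href)


def external_domains(html: str, own_domain: str) -> list[str]:
    """Returns sorted third-party hostnames linked from the page."""
    parser = LinkCollector()
    parser.feed(html)
    domains = set()
    for href in parser.hrefs:
        host = (urlparse(href).hostname or "").lower()
        if host.startswith("www."):
            host = host[4:]
        # Skip relative links, the site itself, and its subdomains.
        if host and host != own_domain and not host.endswith("." + own_domain):
            domains.add(host)
    return sorted(domains)
```

A linked domain is only a proxy for validation: it shows the page points at a third party, not that the third party mentions the brand.
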

Frequently asked questions

What is the SIGNALS framework?

A scoring system that measures AI citation readiness across 7 dimensions: Structure (12%), Intent (10%), Grounding (13%), Newness (5%), Alignment (35%), Language (10%), and Substantiation (15%). Each dimension is weighted by measured effect size from four peer-reviewed studies analyzing 2M+ AI citations.

Why is Alignment the dominant signal at 35%?

Because the Discovered Labs 2026 study, using fixed-effects regression and double machine learning, found it's the only page-level signal with a causal effect independent of domain authority, with an effect size of β=+0.37 at q≈10⁻⁷³. Almost every other signal collapses to near-zero when domain is controlled for.

What does Alignment actually measure?

Lexical overlap between page vocabulary and documented buyer search patterns for the topic category. A page using internal product terminology instead of buyer language scores low on Alignment regardless of how good the underlying content is.

How is a SIGNALS score calculated?

Weighted average of the 7 dimension scores (0–10 each), reported on a 0–100 scale. Weights: Alignment 35%, Substantiation 15%, Grounding 13%, Structure 12%, Intent 10%, Language 10%, Newness 5%. A score of 70+ is good, 45–69 is fair, and below 45 is critical.
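
The arithmetic can be sketched as follows. The weights and bands come from the answer above; multiplying the weighted average by 10 to reach a 0–100 scale is an assumption made here so that the 70 and 45 thresholds apply, and the function names are hypothetical.

```python
WEIGHTS = {
    "alignment": 0.35,
    "substantiation": 0.15,
    "grounding": 0.13,
    "structure": 0.12,
    "intent": 0.10,
    "language": 0.10,
    "newness": 0.05,
}


def signals_score(dimension_scores: dict[str, float]) -> float:
    """Weighted average of 0-10 dimension scores, scaled to 0-100 (assumed)."""
    missing = WEIGHTS.keys() - dimension_scores.keys()
    if missing:
        raise ValueError(f"missing dimensions: {sorted(missing)}")
    weighted = sum(WEIGHTS[name] * dimension_scores[name] for name in WEIGHTS)
    return round(weighted * 10, 1)


def band(score: float) -> str:
    """Maps a 0-100 score to the good / fair / critical bands."""
    if score >= 70:
        return "good"
    if score >= 45:
        return "fair"
    return "critical"
```

Under these assumptions, a page scoring 8 on every dimension except a 3 on Alignment comes out at 62.5, which lands in the fair band. That shows how much a weak Alignment score pulls down an otherwise strong page.
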

Which dimensions are easiest to improve?

Structure, Grounding, Language, and parts of Intent can be improved quickly by editing the page — adding FAQ sections, sourced statistics, rewriting headings, and fixing the opening paragraph. Alignment requires understanding buyer vocabulary, which SIGNALS generates for you. Substantiation is the hardest — it requires earning third-party coverage, which takes time.

