SIGNALS scores pages across 7 dimensions that predict citation frequency by ChatGPT, Claude, Perplexity, and Google AI Overviews. Each dimension is weighted by its measured effect size from peer-reviewed research — not by assumption or intuition.
The name is an acronym: Structure, Intent, Grounding, Newness, Alignment, Language, Substantiation. The order reflects the pipeline: structural signals determine whether a page can be retrieved and parsed; content signals determine whether it gets cited.
Alignment carries the most weight (35%) because it's the only dimension with a documented causal effect independent of domain authority.
Structure measures whether a page is organized in a way that AI systems can reliably extract and understand content hierarchy. The key finding from ConvertMate's 2026 benchmark: 68.7% of pages actually cited by AI engines use logical H1→H2→H3 heading hierarchy. Most pages don't.
The problem isn't just that unstructured pages are harder to parse — it's that AI systems use heading structure to understand what a page is about. A page with multiple H1s, skipped heading levels, or headings used purely for styling gives the AI system conflicting signals about content organization.
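As a rough illustration, the two mechanical problems named above (multiple H1s and skipped heading levels) can be detected with Python's standard `html.parser`. This is a minimal sketch, not how SIGNALS itself scores Structure; `HeadingCollector` and `audit_headings` are hypothetical names, and the check cannot tell whether a heading is used purely for styling.

```python
from html.parser import HTMLParser


class HeadingCollector(HTMLParser):
    """Collects heading levels (1-6) in document order."""

    def __init__(self):
        super().__init__()
        self.levels = []

    def handle_starttag(self, tag, attrs):
        # HTMLParser lower-cases tag names, so "h1".."h6" is enough.
        if len(tag) == 2 and tag[0] == "h" and tag[1] in "123456":
            self.levels.append(int(tag[1]))


def audit_headings(html):
    """Return a list of heading problems; an empty list means none found."""
    collector = HeadingCollector()
    collector.feed(html)
    levels = collector.levels

    problems = []
    h1_count = levels.count(1)
    if h1_count == 0:
        problems.append("no H1")
    elif h1_count > 1:
        problems.append(f"{h1_count} H1 elements")

    # A heading may go deeper by at most one level at a time.
    for prev, cur in zip(levels, levels[1:]):
        if cur > prev + 1:
            problems.append(f"skipped level: H{prev} -> H{cur}")
    return problems
```

Calling `audit_headings` on a page's HTML returns an empty list for a clean H1→H2→H3 outline and one message per problem otherwise.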
Intent measures how many of the sub-queries behind a prompt a page answers. Princeton's GEO study included a query fan-out analysis showing that when a user types one prompt into an AI system, the system internally decomposes that prompt into multiple sub-queries before retrieving pages. A page that covers only the primary intent gets cited for fewer of those sub-queries.
The practical implication: a page about "enterprise VR training" should also address "how much does enterprise VR training cost," "how long does enterprise VR training take to implement," and "enterprise VR training vs. classroom instruction." Each of those is a sub-query that AI systems check when answering the primary question.
Grounding measures whether a page's claims are backed by sourced statistics and named expert quotes. Princeton's controlled experiment tested 22 content modifications across 10,000 queries. Adding statistics with source citations increased AI citation frequency by 41%. Adding named expert quotes increased it by 28%. These are among the largest effect sizes in the study, and they're achievable without rebuilding a page.
The mechanism makes sense: AI systems prefer citing content that makes verifiable claims. "VR reduces learning time significantly" is not verifiable. "VR reduces learning time by 40% compared to classroom instruction, according to a PwC study of 1,000 employees (2023)" is. The second version is a citation unit — the AI system can quote it directly and the user can verify it.
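One way to spot-check a draft for citation units is a crude pattern test: does a sentence contain both a percentage and some sign of attribution? The sketch below is a heuristic under stated assumptions, not the framework's Grounding scorer. The function name `is_citation_unit` and both regular expressions are illustrative, and the attribution cues ("according to", "study", a year in parentheses) are guesses at common phrasing.

```python
import re

# Assumed cue: a number followed by "%" or "percent".
STAT = re.compile(r"\d+(?:\.\d+)?\s*(?:%|percent)", re.IGNORECASE)

# Assumed cues for attribution: common phrases or a parenthesised year.
SOURCE = re.compile(
    r"according to|\bstudy\b|\bsurvey\b|\breport\b|\(\d{4}\)",
    re.IGNORECASE,
)


def is_citation_unit(sentence):
    """True if the sentence has both a percentage and an attribution cue."""
    return bool(STAT.search(sentence)) and bool(SOURCE.search(sentence))


vague = "VR reduces learning time significantly"
sourced = (
    "VR reduces learning time by 40% compared to classroom instruction, "
    "according to a PwC study of 1,000 employees (2023)"
)
print(is_citation_unit(vague))    # False
print(is_citation_unit(sourced))  # True
```

A pattern match only shows that a claim looks verifiable; whether the cited source actually says so still needs a human check.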
Page recency has a small but real positive effect on AI citation frequency — Discovered Labs measured β=+0.05. This is the smallest effect size in the framework, which is why Newness carries only 5% of the score. It's real, but it's not a priority fix unless everything else is already addressed.
What matters for Newness isn't just when the page was published — it's whether recency is visible on the page. An AI system reading raw HTML doesn't have access to CMS metadata. If "Last Updated: March 2025" isn't visible in the page body, it doesn't register.
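To check whether recency is actually visible, one option is to strip everything that isn't rendered body text and then look for a dated "last updated" label. The sketch below assumes a few English label variants and month-name dates; `VisibleText`, `DATE_LABEL`, and `visible_update_date` are hypothetical names, and real pages will need more patterns.

```python
import re
from html.parser import HTMLParser

# Assumed label wording and date shape ("March 2025" or "March 5, 2025").
DATE_LABEL = re.compile(
    r"(last\s+updated|updated\s+on|last\s+reviewed)\s*:?\s*"
    r"([A-Z][a-z]+\s+(\d{1,2},\s+)?\d{4})",
    re.IGNORECASE,
)


class VisibleText(HTMLParser):
    """Collects text outside <head>, <script>, <style> and similar tags."""

    SKIP = {"head", "script", "style", "template", "noscript"}

    def __init__(self):
        super().__init__()
        self.depth = 0      # nesting depth inside skipped elements
        self.chunks = []

    def handle_starttag(self, tag, attrs):
        if tag in self.SKIP:
            self.depth += 1

    def handle_endtag(self, tag):
        if tag in self.SKIP and self.depth > 0:
            self.depth -= 1

    def handle_data(self, data):
        if self.depth == 0:
            self.chunks.append(data)


def visible_update_date(html):
    """Return the date text after a visible 'last updated' label, or None."""
    parser = VisibleText()
    parser.feed(html)
    parser.close()
    text = " ".join(" ".join(parser.chunks).split())
    match = DATE_LABEL.search(text)
    return match.group(2) if match else None
```

A page whose only date lives in a `<meta>` tag returns `None` here, which mirrors the point above: a date that isn't in the body text doesn't count.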
Alignment is the framework's dominant signal, and the reason it carries 35% comes down to one methodological choice that separates the Discovered Labs study from everything before it: domain fixed-effects controls.
Most content-citation correlation studies don't control for the domain. When you look at which pages get cited by AI engines, high-authority domains dominate — and high-authority domains tend to have better-structured content. So it looks like content quality drives citation. The Discovered Labs study used fixed-effects regression and double machine learning to isolate true causal effects at the page level, controlling for domain. When they did this, almost everything disappeared. Alignment — vocabulary matching buyer search intent — survived with β=+0.37.
Why this matters for small sites: If alignment is the signal that survives domain controls, it means a small site on a new domain can outperform an established brand in AI citation — if its vocabulary alignment is stronger. The large brand's domain authority doesn't protect it at the ranking stage if its pages use internal language instead of buyer language. We've seen this in audits repeatedly.
Language is distinct from Alignment in scope: Alignment measures vocabulary throughout the page; Language focuses specifically on the title, H2s, and opening paragraph — the signals that AI systems weight most heavily when determining relevance at the ranking stage.
Princeton found that title-to-prompt similarity has an independent effect of β=+0.09, and that overall fluency optimization (writing that reads naturally for buyers rather than keyword-stuffed for search engines) increases AI visibility by 15–30%. The two effects compound: a well-phrased title that matches the buyer query, combined with a clear opening paragraph, can meaningfully move a page's ranking in the retrieved set.
Substantiation measures whether a brand or page has cross-domain validation — presence on third-party sites beyond its own domain. ConvertMate's 2026 benchmark found that brands mentioned on external domains receive 6.5× more AI citations than brands that exist only on their own site. This is the largest multiplier effect in the framework.
The underlying mechanism is trust: AI systems apply a credibility filter that downweights self-reported claims. A brand saying "we're the industry leader" carries less weight than a third-party source saying it. G2 reviews, press coverage, Reddit mentions, and ProductHunt listings all contribute to this cross-domain signal.
Substantiation is also the hardest dimension to improve quickly — you can't manufacture legitimate third-party mentions. It requires earning them through product launches, PR, customer reviews, and community presence. This is why it carries 15% rather than 25% — the signal is real and significant, but it's not something you can fix by editing your page.
SIGNALS is a scoring system that measures AI citation readiness across 7 dimensions: Structure (12%), Intent (10%), Grounding (13%), Newness (5%), Alignment (35%), Language (10%), and Substantiation (15%). Each dimension is weighted by measured effect size from four peer-reviewed studies analyzing 2M+ AI citations.
Alignment carries the most weight because the Discovered Labs 2026 study, using fixed-effects regression and double machine learning, found it to be the only page-level signal with a causal effect independent of domain authority. The effect size is β=+0.37 at q≈10⁻⁷³. All other signals collapse to near-zero when domain is controlled for.
Alignment is measured as lexical overlap between page vocabulary and documented buyer search patterns for the topic category. A page using internal product terminology instead of buyer language scores low on Alignment regardless of how good the underlying content is.
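A simple stand-in for lexical overlap is the share of distinct buyer-query terms that also appear on the page. The sketch below assumes that definition; the framework's exact formula is not given here. The stopword list, the sample queries, and the names `tokens` and `alignment_overlap` are illustrative only.

```python
import re

# Small assumed stopword list; a real one would be longer.
STOPWORDS = {"a", "an", "the", "of", "for", "to", "in", "is",
             "how", "does", "much", "and", "vs"}


def tokens(text):
    """Lower-cased alphanumeric terms, minus stopwords."""
    return {t for t in re.findall(r"[a-z0-9]+", text.lower())
            if t not in STOPWORDS}


def alignment_overlap(page_text, buyer_queries):
    """Share (0.0-1.0) of distinct buyer terms that appear on the page."""
    buyer_vocab = set()
    for query in buyer_queries:
        buyer_vocab |= tokens(query)
    if not buyer_vocab:
        return 0.0
    return len(buyer_vocab & tokens(page_text)) / len(buyer_vocab)


queries = [
    "enterprise VR training cost",
    "enterprise VR training vs classroom instruction",
]
internal = "Our immersive XR platform delivers experiential upskilling modules"
buyer = "Enterprise VR training cost compared with classroom instruction"

print(alignment_overlap(internal, queries))  # 0.0
print(alignment_overlap(buyer, queries))     # 1.0
```

The two sample pages describe the same product. Only the second uses the words buyers type, and that gap is what Alignment penalises.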
The overall score is a weighted average of the 7 dimension scores (0–10 each). Weights: Alignment 35%, Substantiation 15%, Grounding 13%, Structure 12%, Intent 10%, Language 10%, Newness 5%. The bands imply the result is reported on a 0–100 scale: 70+ is good, 45–69 is fair, below 45 is critical.
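The arithmetic can be sketched in a few lines. The weights and bands below come straight from the text. Multiplying the 0–10 weighted average by 10 to reach a 0–100 scale is an assumption, made because the stated bands (45, 70) only make sense on that scale. `signals_score` and `band` are hypothetical names.

```python
# Weights as stated in the framework; they sum to 1.0.
WEIGHTS = {
    "structure": 0.12,
    "intent": 0.10,
    "grounding": 0.13,
    "newness": 0.05,
    "alignment": 0.35,
    "language": 0.10,
    "substantiation": 0.15,
}


def signals_score(scores):
    """Weighted average of 0-10 dimension scores, scaled to 0-100."""
    missing = WEIGHTS.keys() - scores.keys()
    if missing:
        raise ValueError(f"missing dimensions: {sorted(missing)}")
    for name in WEIGHTS:
        if not 0 <= scores[name] <= 10:
            raise ValueError(f"{name} must be between 0 and 10")
    weighted = sum(WEIGHTS[name] * scores[name] for name in WEIGHTS)
    return round(weighted * 10, 1)  # assumption: x10 to reach 0-100


def band(score):
    """Map a 0-100 score to the bands given in the text."""
    if score >= 70:
        return "good"
    if score >= 45:
        return "fair"
    return "critical"


# Example: strong Alignment (9), middling everything else (5).
example = {name: 5 for name in WEIGHTS}
example["alignment"] = 9
total = signals_score(example)
print(total, band(total))  # 64.0 fair
```

Because Alignment holds 35% of the weight, a page that scores 9 there and 5 everywhere else reaches 64.0, while the same page with Alignment at 5 would sit at 50.0.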
Structure, Grounding, Language, and parts of Intent can be improved quickly by editing the page — adding FAQ sections, sourced statistics, rewriting headings, and fixing the opening paragraph. Alignment requires understanding buyer vocabulary, which SIGNALS generates for you. Substantiation is the hardest — it requires earning third-party coverage, which takes time.
SIGNALS audits any page against the full framework — scoring all 7 dimensions, diagnosing the pipeline, and generating page-specific fixes. Free for 1 page, no account required.