Most teams treat answer engine optimization as a vibe — “make the content AI-friendly” — and then wonder why the citations never arrive. AEO is not a vibe. It is a list of concrete, checkable things, and if you work the list honestly you can score any site in an afternoon.

This is that list. Thirty-two checks, grouped into six audit areas, in the order we run them on every Answerly engagement. The order matters — crawler access gates everything, schema gates content, and measurement is what tells you whether the other 28 items did anything. Work top to bottom.

Each item has a checkbox, one or two sentences on why it matters, and a verify note so you are not guessing. A site that passes 28 of 32 we consider citation-ready. Below 20, the structural work has not started.

You can run most of this yourself. You can also let our free AI visibility audit score the majority of these 32 points automatically — it crawls the site, checks the schema, parses robots.txt and reports back. Either way, here is the full list.

Crawler and access

If the AI crawlers cannot reach your pages, nothing else on this checklist matters. This area gates the rest.

  • robots.txt allows GPTBot, ClaudeBot, PerplexityBot and Google-Extended. These four user-agent tokens govern access for ChatGPT, Claude, Perplexity and Google’s Gemini models respectively; block one and you risk being invisible to that platform. Google-Extended is a robots.txt control token rather than a separate crawler, and it does not affect inclusion in Google Search. Verify by opening yourdomain.com/robots.txt and confirming none of the four sit under a Disallow: /.

  • llms.txt is present at the site root. The llms.txt file, a proposed convention rather than a formal standard, gives AI systems a clean Markdown summary of what the brand does and which pages matter: a shortcut around messy HTML. Verify by loading yourdomain.com/llms.txt and checking it returns a 200 with real content, not a 404.

  • An XML sitemap exists and is referenced in robots.txt. Sitemaps are still how crawlers — AI and classical — discover your full URL set efficiently. Verify the sitemap loads, lists current URLs, and that robots.txt contains a Sitemap: line pointing to it.

  • Primary content is not blocked behind client-side JavaScript. Several AI crawlers do not execute JS, so a page that renders its body text in the browser ships them an empty shell. Verify with curl yourdomain.com/key-page — your headline and first paragraph must appear in the raw HTML.

  • Priority pages are server-rendered or pre-rendered. SSR or static pre-rendering guarantees every crawler sees the full content regardless of JS support — the safest default for AEO. Verify by disabling JavaScript in your browser and confirming the page still shows its full text and structure.

  • CDN bot-fight mode does not challenge legitimate AI agents. Aggressive bot protection (Cloudflare Bot Fight, hCaptcha walls) can silently 403 GPTBot and friends even when robots.txt allows them. Verify in your CDN bot analytics that GPTBot, ClaudeBot and PerplexityBot show successful 200 responses, not blocks — see our AI crawler access policy for the full configuration.
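
The robots.txt and sitemap checks above can be scripted rather than eyeballed. What follows is a minimal sketch using Python's standard urllib.robotparser, not a definitive audit tool. The sample robots.txt, the example.com URLs and the audit_robots helper are all hypothetical, and the script only reads robots.txt rules, so it cannot detect CDN-level blocks.

```python
from urllib.robotparser import RobotFileParser

# The four user-agent tokens named in the checklist.
AI_AGENTS = ["GPTBot", "ClaudeBot", "PerplexityBot", "Google-Extended"]


def audit_robots(robots_txt: str, page_url: str) -> dict:
    """Report which AI agents may fetch page_url and which sitemaps are declared."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    allowed = {agent: parser.can_fetch(agent, page_url) for agent in AI_AGENTS}
    # site_maps() returns None when robots.txt has no Sitemap: line (Python 3.8+).
    sitemaps = parser.site_maps() or []
    return {"allowed": allowed, "sitemaps": sitemaps}


# Hypothetical robots.txt that blocks GPTBot and allows everyone else.
SAMPLE = """\
User-agent: GPTBot
Disallow: /

User-agent: *
Allow: /

Sitemap: https://example.com/sitemap.xml
"""

report = audit_robots(SAMPLE, "https://example.com/key-page")
for agent, ok in report["allowed"].items():
    print(f"{agent}: {'allowed' if ok else 'BLOCKED'}")
print("sitemaps:", report["sitemaps"])
```

To run it against a live site, fetch yourdomain.com/robots.txt yourself and pass the body in as robots_txt. Keeping the fetch separate from the parsing makes the check easy to test offline.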

Schema and structured data

Structured data tells AI systems what your entities are — it removes ambiguity that plain prose leaves on the table.

  • Organization schema is deployed site-wide. Organization JSON-LD anchors the brand as a knowledge-graph entity with a name, logo, URL and contact point. Verify by pasting any page into Google’s Rich Results Test and confirming an Organization object is detected.

  • Article schema is on every blog post and guide. Article markup gives AI the headline, author, publish date and modified date in a machine-readable form — all signals it weighs when deciding what to cite. Verify each post emits Article (or BlogPosting) with a populated author and datePublished.

  • Each page uses the correct one of FAQPage and QAPage. FAQPage is for a list of questions you authored and answered; QAPage is for a single user-submitted question with answers. Picking the wrong one misdescribes the page, and AI systems may discount the markup as a result. Verify each FAQ block uses FAQPage and that no community-style page misuses it.

  • Product or Service schema is present where relevant. Commercial pages — service offerings, products — should carry Service or Product markup so AI can surface price, provider and offer details. Verify your service and product pages emit the matching type with at least name, description and provider.

  • sameAs links connect entities to authoritative profiles. sameAs pointers (LinkedIn, Wikidata, Crunchbase, official socials) let AI confirm an entity is the same one referenced elsewhere — the core of entity resolution. Verify your Organization and Person schema each include at least two working sameAs URLs.

  • BreadcrumbList schema reflects the real site hierarchy. Breadcrumb markup tells AI where a page sits in the information architecture, which helps it judge topical context. Verify deep pages emit a BreadcrumbList whose positions match the actual URL path.

  • Every schema block validates clean. Invalid JSON-LD is frequently ignored wholesale rather than partially read. Verify zero errors in both Google Rich Results Test and the Schema.org Validator — see our schema stack for AI citation for the full deployment.
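
Several of the schema checks above come down to "does the page emit this type, with these fields populated". The sketch below shows one way to test that with the Python standard library alone. The REQUIRED field lists are our reading of the checklist items, not Google's or Schema.org's official requirements, and the sample HTML is hypothetical. It assumes each JSON-LD block is a single top-level object with a string @type, so pages that use @graph or arrays would need extra handling. It does not replace the validators named above.

```python
import json
from html.parser import HTMLParser

# Minimum fields per type, taken from the checklist items (an assumption,
# not an official schema requirement).
REQUIRED = {
    "Organization": ["name", "url", "logo"],
    "Article": ["headline", "author", "datePublished"],
    "BlogPosting": ["headline", "author", "datePublished"],
}


class JsonLdExtractor(HTMLParser):
    """Collect the parsed contents of every application/ld+json script tag."""

    def __init__(self):
        super().__init__()
        self._in_jsonld = False
        self._buffer = []
        self.blocks = []  # successfully parsed JSON-LD objects
        self.errors = []  # JSON decoding error messages

    def handle_starttag(self, tag, attrs):
        if tag == "script" and dict(attrs).get("type") == "application/ld+json":
            self._in_jsonld = True
            self._buffer = []

    def handle_data(self, data):
        if self._in_jsonld:
            self._buffer.append(data)

    def handle_endtag(self, tag):
        if tag == "script" and self._in_jsonld:
            self._in_jsonld = False
            try:
                self.blocks.append(json.loads("".join(self._buffer)))
            except json.JSONDecodeError as exc:
                self.errors.append(str(exc))


def missing_fields(block: dict) -> list:
    """Return the checklist fields that are absent or empty in one JSON-LD object."""
    schema_type = block.get("@type")
    if not isinstance(schema_type, str):
        return []  # list-valued @type is out of scope for this sketch
    missing = [f for f in REQUIRED.get(schema_type, []) if not block.get(f)]
    if schema_type == "Organization":
        same_as = block.get("sameAs") or []
        if isinstance(same_as, str):
            same_as = [same_as]
        if len(same_as) < 2:  # the checklist asks for at least two sameAs URLs
            missing.append("sameAs (at least 2 URLs)")
    return missing


# Hypothetical page: one sameAs link too few, and an Article with no author.
SAMPLE_HTML = """
<html><head>
<script type="application/ld+json">
{"@context": "https://schema.org", "@type": "Organization",
 "name": "Example Co", "url": "https://example.com",
 "logo": "https://example.com/logo.png",
 "sameAs": ["https://www.linkedin.com/company/example"]}
</script>
<script type="application/ld+json">
{"@context": "https://schema.org", "@type": "Article",
 "headline": "Example post", "datePublished": "2024-01-15"}
</script>
</head><body></body></html>
"""

extractor = JsonLdExtractor()
extractor.feed(SAMPLE_HTML)
results = {block.get("@type"): missing_fields(block) for block in extractor.blocks}
for schema_type, missing in results.items():
    print(schema_type, "OK" if not missing else f"missing: {missing}")
print("JSON errors:", extractor.errors)
```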

Content structure

This is where most teams lose citations. The facts can be perfect — if the structure is wrong, the AI summarises instead of quoting, and the summary hides your URL.

  • Every priority page opens with a direct answer of 30 words or fewer. AI extractors quote short, self-contained answer sentences; anything past 30 words gets paraphrased and the citation goes elsewhere. Verify the first sentence after each H1 or H2-question answers the query in one concrete sentence under the word limit.

  • A Quick Facts table with at least five rows sits near the top. AI systems quote table rows verbatim because each row is self-contained and unambiguous. Verify each priority page has a Parameter/Value table of five or more rows — the pattern is detailed in our Quick Facts table guide.

  • H2 headings are phrased as the questions users actually ask. AI extractors map prompts to sections by H2 match, so a question-form H2 is a direct hook for that prompt. Verify your H2s read as natural questions ending in “?”, not as noun-phrase labels.

  • Each section is passage-extractable on its own. AI lifts whole passages, so a section must make sense without the paragraphs around it — no unresolved “this”, “that” or “it” referring back. Verify each section names its entity explicitly in the first sentence rather than relying on prior context.

  • Priority pages follow the four-layer extraction recipe. Hero, X-is-Y intro, Quick Facts, question-form H2s — the structure AI systems quote most reliably. Verify against the full four-layer extraction recipe and confirm all four layers are present in order.

  • No walls of text: paragraphs are short and broken up by lists and tables. Dense unbroken text buries the extractable sentence; short paragraphs, lists and tables surface it. Verify no paragraph runs longer than four or five lines, and that lists and tables break up the page.
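
Two of the content checks above are mechanical enough to automate: the 30-word limit on the opening answer and the question-form H2. Here is a rough sketch, assuming you have already extracted each heading and its body text. The audit_section helper and the sample sections are hypothetical, and the sentence splitter is deliberately naive, so abbreviations such as "e.g." will trip it.

```python
import re

MAX_ANSWER_WORDS = 30  # the limit stated in the checklist


def first_sentence(text: str) -> str:
    """Return the text up to the first ., ! or ? that is followed by whitespace or the end."""
    text = text.strip()
    match = re.search(r".+?[.!?](?=\s|$)", text, flags=re.S)
    return match.group(0) if match else text


def audit_section(heading: str, body: str) -> dict:
    """Check one section against the question-form and opening-answer rules."""
    opener = first_sentence(body)
    word_count = len(opener.split())
    return {
        "heading_is_question": heading.strip().endswith("?"),
        "opener_words": word_count,
        "opener_within_limit": word_count <= MAX_ANSWER_WORDS,
    }


# Hypothetical sections: the first passes both checks, the second has a label-style H2.
sections = [
    ("What is a Quick Facts table?",
     "A Quick Facts table is a short Parameter/Value table placed near the top of a page. "
     "Each row states one fact."),
    ("Pricing overview", "Pricing depends on scope."),
]

results = [audit_section(heading, body) for heading, body in sections]
for (heading, _), result in zip(sections, results):
    print(heading, result)
```

A script like this only flags candidates for a human to review. Whether the opening sentence actually answers the query remains a judgement call.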

Brand and entity

AI systems cite entities they recognise. If the model is unsure who you are, it defaults to a competitor it does know.

  • The brand has a Wikidata entity. Wikidata is a primary structured source AI models draw on for entity recognition — an entry makes the brand a known, resolvable thing. Verify by searching wikidata.org for the brand and confirming an item exists with the correct description and identifiers.

  • A disambiguatingDescription separates the brand from namesakes. When a brand name collides with other entities, a disambiguatingDescription in schema tells AI which one you are. Verify your Organization schema includes a short, specific description that no namesake could share — see our Wikidata and knowledge graph guide.

  • Name, address and phone are identical everywhere. Inconsistent NAP across the site, Google Business Profile and directories fragments the entity and weakens recognition. Verify the exact same legal name, address and phone string appears on the site, in schema and on every major listing.

  • Every article and page carries a named-author byline. Anonymous content reads as low-trust to AI; a named, real author is a citability signal. Verify each post shows a human author name linked to a bio page, not “Admin” or “The Team”.

  • The brand has a knowledge-graph presence. Beyond Wikidata, the brand should be resolvable in Google’s Knowledge Graph and surface a knowledge panel for branded queries. Verify by searching the brand name in Google and checking whether a knowledge panel appears.
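
The name, address and phone check above asks for exact string equality, which is easy to verify once the values are collected in one place. The sketch below assumes you have copied the NAP strings by hand from each source. The source names, the company details and the nap_mismatches helper are all hypothetical.

```python
from collections import Counter

FIELDS = ("name", "address", "phone")


def nap_mismatches(listings: dict) -> dict:
    """For each NAP field, return the sources whose value differs from the most common one."""
    report = {}
    for field in FIELDS:
        values = {source: entry.get(field, "") for source, entry in listings.items()}
        most_common, _ = Counter(values.values()).most_common(1)[0]
        outliers = {s: v for s, v in values.items() if v != most_common}
        if outliers:
            report[field] = outliers
    return report


# Hypothetical data: the phone number is formatted differently on one listing.
LISTINGS = {
    "website_footer": {
        "name": "Example Co Ltd",
        "address": "1 Example Street, London",
        "phone": "+44 20 7946 0000",
    },
    "organization_schema": {
        "name": "Example Co Ltd",
        "address": "1 Example Street, London",
        "phone": "+44 20 7946 0000",
    },
    "google_business_profile": {
        "name": "Example Co Ltd",
        "address": "1 Example Street, London",
        "phone": "020 7946 0000",
    },
}

mismatches = nap_mismatches(LISTINGS)
print(mismatches or "NAP is identical across all sources")
```

The comparison is intentionally strict. A formatting difference in a phone number counts as a mismatch here, because the checklist asks for the exact same string everywhere.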

Authority and E-E-A-T

Structure gets you in the running. Authority is what makes the AI pick you over the equally well-structured competitor.

  • Named experts have verifiable sameAs profiles. A real expert with a LinkedIn and one other confirmable profile is far more citable than an unverifiable name. Verify each author’s Person schema includes sameAs links that resolve to live, matching profiles — see E-E-A-T with named experts.

  • Claims are backed by external citations. Pages that cite authoritative external sources read as more trustworthy to AI than pages that assert without evidence. Verify your priority pages link out to primary sources — regulators, standards bodies, original research — where they make factual claims.

  • The brand has HARO or press mentions on third-party sites. Independent mentions on news and industry sites are off-site authority signals AI weighs heavily. Verify by searching the brand name and confirming coverage exists on sites you do not control.

  • dateModified is current on pages you still stand behind. AI systems prefer fresh content and use dateModified as the freshness signal. Verify priority pages carry an accurate, recent dateModified — and that it reflects a real edit, not a fake bump.

  • Author schema is deployed for every contributor. Person schema with name, role, bio and sameAs gives AI the structured author identity that backs the byline. Verify every author named on the site has a corresponding Person object with populated fields.

Measurement

Without this area, you have no idea whether the other 28 items worked. Set it up before, not after.

  • Citation tracking is set up across the major LLMs. You cannot improve what you do not measure — tracking tells you who gets cited for your prompts in ChatGPT, Perplexity, Gemini and Google AI Overviews. Verify a tracker is running on a schedule and producing weekly citation data — the best AEO tools piece names the options.

  • A prompt universe is defined and documented. The set of prompts you want to win — five seeds expanding to 15-30 variants — is the scoreboard everything else is measured against. Verify the prompt list exists in writing and that each prompt maps to a target page.

  • A baseline is frozen before the work starts. A dated, screenshotted snapshot of where you stand at day zero is the only honest reference point for proving progress later. Verify you have stored baseline screenshots of each tracked prompt’s answer set.

  • A monthly reporting cadence is in place. Citation behaviour shifts week to week, so a fixed monthly report catches drift and proves trend rather than noise. Verify a recurring report exists covering visibility, share of voice and citations per platform — the method is in measuring AI citations.
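
To make the reporting items above concrete, here is a small sketch of how a week of tracked citations might be turned into share of voice. It assumes share of voice means each domain's fraction of all citations across the prompt universe; your tracker may define it differently. The prompts, the domains and both helper functions are hypothetical.

```python
from collections import Counter


def share_of_voice(citations: dict) -> dict:
    """Return each domain's fraction of all citations across the tracked prompts."""
    counts = Counter(domain for domains in citations.values() for domain in domains)
    total = sum(counts.values())
    return {domain: n / total for domain, n in counts.items()} if total else {}


def uncited_prompts(citations: dict, domain: str) -> list:
    """List the prompts whose answer set does not cite the given domain."""
    return [prompt for prompt, domains in citations.items() if domain not in domains]


# Hypothetical tracker output: prompt -> domains cited in the answer.
WEEK = {
    "what is answer engine optimization": ["yourdomain.com", "competitor-a.com"],
    "how to check robots.txt for ai crawlers": ["competitor-a.com", "competitor-b.com"],
    "best aeo tools": ["yourdomain.com", "competitor-a.com"],
}

sov = share_of_voice(WEEK)
gaps = uncited_prompts(WEEK, "yourdomain.com")
for domain, share in sorted(sov.items(), key=lambda item: -item[1]):
    print(f"{domain}: {share:.0%}")
print("prompts to work on:", gaps)
```

Running the same calculation on the frozen baseline and on each monthly snapshot gives a like-for-like trend line.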

How to use this checklist

Score the site honestly. Count the boxes you can truthfully tick today, not the ones you intend to. The number tells you where you are: 28 or above and the site is citation-ready and the work shifts to content depth and authority; between 20 and 27 and the structural foundation is half-built; below 20 and AEO has not really started.
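
If you keep the results in a spreadsheet or script, the banding can be expressed in a few lines. This sketch treats a score of exactly 28 as citation-ready, following the "passes 28 of 32" threshold stated in the introduction. The readiness_band function name and the band labels are ours, not part of any tool.

```python
def readiness_band(passed: int, total: int = 32) -> str:
    """Map the number of passed checks to the bands described in this checklist."""
    if not 0 <= passed <= total:
        raise ValueError(f"passed must be between 0 and {total}")
    if passed >= 28:
        return "citation-ready"
    if passed >= 20:
        return "foundation half-built"
    return "not started"


for score in (30, 28, 24, 12):
    print(score, "->", readiness_band(score))
```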

My honest opinion after running this list across hundreds of sites: the six areas are not equal in effort. Crawler and schema are an afternoon of work each and you should never fail them. Content structure is a few weeks of disciplined rewriting. Brand entity and authority are the slow ones, measured in months, and they are exactly where competitors give up. That is the opportunity. The boring, checkable items are boring because they work, and most of your competition has not finished them either.

Run the free AI visibility audit and we will score most of these 32 points for you, with the failing items punch-listed in priority order.