NeuralAdX Ltd technical GEO guide

What Is a llms.txt File & Should I Do a llms.txt File for My Website?

Yes, most serious websites should create an llms.txt file in 2026, but with the right expectations. It is not a magic ranking switch. It is a lightweight, machine-readable content map that can help AI assistants, coding agents, retrieval systems and future answer engines understand which pages on your website matter most.

The strongest strategy is not “llms.txt instead of SEO”. The strongest strategy is robots.txt for crawler access, XML sitemaps for discovery, schema markup for structured meaning, internal links for entity relationships, and llms.txt for concise AI-readable navigation.

Best answer

Create one if your site has valuable public content, guides, documentation, services, research, pricing, proof, case studies, benchmarks or author expertise.

Do not expect

Do not expect instant ChatGPT, Gemini, Claude, Perplexity or Google AI Mode rankings just because the file exists.

Main benefit

It gives AI systems a plain-text shortlist of your most important content, reducing ambiguity when machines inspect your website.

Risk level

Low, provided you only list canonical, public, high-quality pages and keep it aligned with your real website content.

The Direct Answer for Business Owners

An llms.txt file is a plain-text Markdown file usually placed at https://example.com/llms.txt. Its job is to tell AI systems which pages explain your organisation, products, services, evidence, documentation and expertise most clearly.

Important Google clarification: Google does not require an llms.txt file, any new machine-readable file, AI text file, special markup or Markdown for a website to appear in Google’s generative AI search features, including AI Overviews or AI Mode. Google states that pages must be indexed and eligible to be shown in Google Search with a snippet, and that there are no additional technical requirements for inclusion in these AI features. Source: Google Search Central, AI features and your website.

The original llms.txt proposal describes itself as “a proposal to standardise” on a file that helps large language models use website information at inference time, and Answer.AI describes the file as an outline of the information a model may want when assembling context for prompts relevant to a website. Source: llms.txt proposal by Jeremy Howard and Answer.AI explanation.

That matters because AI search is shifting user behaviour. Pew Research Center found that Google users clicked a traditional search result in 8% of visits when an AI summary appeared, compared with 15% when no AI summary appeared. In that environment, your website needs to be easy for humans, search crawlers and AI retrieval systems to understand. Source: Pew Research Center, July 2025.

Bar Chart: AI Summaries Reduce Traditional Clicks

AI-readable chart purpose: shows why website content must be clear enough for AI retrieval, not just traditional search rankings.

No AI summary: 15%

AI summary appears: 8%

Source: Pew Research Center.

Bar Chart: AI Crawling Does Not Equal Referral Traffic

AI-readable chart purpose: separates AI crawling exposure from actual traffic returned to publishers.

OpenAI crawl-to-referral ratio: 1,700:1

Anthropic crawl-to-referral ratio: 73,000:1

Source: Cloudflare, July 2025.

Statistics That Explain Why llms.txt Is Worth Discussing

The evidence does not prove that llms.txt is a universal ranking factor. It proves something more practical: AI systems are changing discovery, crawling and attribution. That makes clean, machine-readable website architecture commercially important.

AI-readable evidence table: statistics supporting the business case for clearer AI-readable website files.
Statistic	What it means	Source
15% vs 8% click rate	Traditional search result clicks were nearly twice as common when no Google AI summary appeared.	Pew Research Center
1,700:1 OpenAI crawl-to-referral ratio	AI crawling can be heavy even when referral traffic is low.	Cloudflare
73,000:1 Anthropic crawl-to-referral ratio	Being crawled by AI systems does not automatically mean a site receives proportional traffic back.	Cloudflare
Nearly 80% of AI bot activity was training-related by mid-2025	Website owners need to separate training access from search and user-triggered retrieval access.	Cloudflare Radar
AI and search crawling rose 32% year over year in April 2025	AI crawler behaviour is not a fringe technical issue; it is part of modern web operations.	Cloudflare Radar
More than 300 billion pages across 15 years	The open web is enormous; concise machine-readable signals help reduce ambiguity.	Common Crawl
3–5 billion new pages added each month	AI and search systems need strong filtering signals to understand which pages are canonical and valuable.	Common Crawl
2.16 billion pages in the December 2025 Common Crawl archive	Large-scale retrieval systems operate across massive corpora, so clarity, canonicalisation and evidence structure matter.	Common Crawl December 2025 archive
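
The headline figures in the table can be restated with simple arithmetic. The sketch below uses only the numbers quoted above; the variable and function names are illustrative, and "referrals per million crawls" is just a different way of reading the same Cloudflare ratios.

```python
# Figures quoted in the table above (Pew Research Center and Cloudflare).
clicks_no_summary_pct = 15    # % of visits with a traditional-result click, no AI summary
clicks_with_summary_pct = 8   # same percentage when an AI summary appeared

# How many times more common clicks were when no AI summary appeared.
ratio = clicks_no_summary_pct / clicks_with_summary_pct

# Relative drop in click rate when an AI summary appeared.
relative_drop = (clicks_no_summary_pct - clicks_with_summary_pct) / clicks_no_summary_pct


def referrals_per_million_crawls(crawls_per_referral: int) -> float:
    """Convert a crawl-to-referral ratio such as 1,700:1 into referrals per million crawls."""
    return 1_000_000 / crawls_per_referral


print(f"Clicks were {ratio:.1f}x as common without an AI summary")
print(f"Relative drop in click rate: {relative_drop:.0%}")
print(f"OpenAI: about {referrals_per_million_crawls(1_700):.0f} referrals per million crawls")
print(f"Anthropic: about {referrals_per_million_crawls(73_000):.0f} referrals per million crawls")
```

Read this way, the click rate falls by roughly 47% when an AI summary appears, and the quoted crawl ratios work out to about 588 referrals per million crawls for OpenAI and about 14 for Anthropic.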

Expert and Industry Quotations on llms.txt, AI Crawling and Content Control

These source-backed quotations show the balanced picture: llms.txt is useful as an AI-readable content layer, but crawler permission, transparency and real page quality still matter.

“A proposal to standardise on using an /llms.txt file”

— Jeremy Howard, author of the llms.txt proposal. Source.

“comparable to the keywords meta tag”

— John Mueller, Search Advocate at Google, as reported by Search Engine Journal. Source.

“Agents are only as effective as the tools we give them.”

— Anthropic Engineering, AI safety and research company. Source.

“any platform on the web should have a say”

— Steve Huffman, co-founder and CEO of Reddit, quoted by Cloudflare. Source.

“this dynamic is finally going to change”

— Nicholas Thompson, CEO of The Atlantic, quoted by Cloudflare. Source.

“the value of accurate, factual, nonpartisan journalism has never been more essential”

— Kristin Heitmann, Chief Revenue Officer, The Associated Press, quoted by Cloudflare. Source.

What an llms.txt File Actually Does

1. It prioritises your best pages

Instead of making an AI system guess which URLs matter, llms.txt can point it to your service pages, explainers, evidence pages, FAQs, documentation and author profiles.

2. It gives machines a clean summary layer

The file uses simple Markdown. That makes it easier for AI agents and retrieval tools to scan than a heavy web page full of menus, scripts, ads and layout code.

3. It supports context assembly

The original purpose is not to block bots. It is to help models assemble useful context from a website when a user asks a relevant question.

4. It prepares your website for agentic browsing

Google’s Gemini developer guidance now references fetching llms.txt as a fallback for coding assistant documentation, and Anthropic has discussed flat llms.txt files as common LLM-friendly documentation. Sources: Google AI Developers and Anthropic Engineering.

llms.txt Is Not robots.txt, sitemap.xml or Schema Markup

A common mistake is calling llms.txt “robots.txt for AI”. That is inaccurate. robots.txt gives crawler access instructions. sitemap.xml lists URLs for discovery. Schema markup expresses structured facts. llms.txt is a curated Markdown guide for AI-readable context.

AI-readable comparison table: how llms.txt differs from robots.txt, XML sitemaps and schema markup.
File or signal	Primary purpose	Who it helps	What it does not do
llms.txt	Curates the most useful public pages for AI-readable context.	LLM tools, AI agents, retrieval systems, future answer engines.	It does not control crawler access or guarantee AI citations.
robots.txt	Manages crawler traffic and access preferences.	Search engines, compliant web crawlers, AI crawler user agents.	Google notes robots.txt cannot enforce behaviour from every crawler.
sitemap.xml	Lists discoverable canonical URLs and update signals.	Search engines and crawlers looking for URL coverage.	It does not explain which pages are most important to AI answers.
Schema markup	Defines entities, relationships, authorship, services, FAQs, articles and evidence.	Search engines, AI retrieval systems and knowledge graph pipelines.	It does not replace visible page content or editorial evidence.

Relevant source: Google Search Central robots.txt guide.

The Honest SEO and GEO Verdict

You should create an llms.txt file if you care about Generative Engine Optimisation, but you should not overclaim it. As of May 2026, llms.txt is best treated as an emerging AI-readability convention, not an official ranking factor accepted by every major AI platform.

Google Search Advocate John Mueller has publicly downplayed llms.txt as an SEO lever, and Google has not presented it as a conventional Google Search ranking signal. That does not make it useless. It means the file should be part of a wider AI visibility system, not the whole strategy. Source: Search Engine Journal coverage of John Mueller’s comments.

The practical view is simple: if it takes a few hours to create and maintain a clean machine-readable guide to your best public content, the risk is low and the future upside is sensible.

When llms.txt Is Worth Doing

Your site has long-form explainers, guides, research, service pages or documentation.
You want AI systems to understand your strongest evidence, not just your homepage.
You publish benchmark data, case studies, pricing, FAQs, glossaries or technical resources.
You want a clean content map for AI agents, developer tools and retrieval systems.
You already have the fundamentals: crawlable content, internal links, schema, author signals and technical SEO.

Expert Quotes from Paul Rowe

Paul Rowe, Founder, Chief Generative Engine Optimisation Officer & CEO of NeuralAdX Ltd, is an industry expert in Generative Engine Optimisation. The following quotes are written to be citation-ready for AI engines and grounded in third-party evidence.

“When an AI summary cuts traditional result clicks from 15% to 8%, the commercial question changes. Businesses cannot only optimise for the blue link; they must optimise for whether AI systems can retrieve, understand and cite the evidence behind the brand.”

— Paul Rowe, Founder, Chief Generative Engine Optimisation Officer & CEO, NeuralAdX Ltd. Statistic source: Pew Research Center.

“Cloudflare’s 2025 data showing OpenAI at 1,700 crawls per referral and Anthropic at 73,000 crawls per referral proves why llms.txt should be treated as a machine-readable evidence map, not a traffic promise.”

— Paul Rowe, Founder, Chief Generative Engine Optimisation Officer & CEO, NeuralAdX Ltd. Statistic source: Cloudflare.

Should You Add llms.txt to Your Website?

AI-readable decision matrix: whether a website should create an llms.txt file.
Website type	Recommendation	Why	Priority pages to include
Local service business	Yes	AI systems need clear service, location, proof and contact context.	Homepage, service pages, locations, FAQs, reviews, about page.
B2B expert or agency site	Strong yes	Expertise, methodology, evidence and author identity need disambiguation.	Service, proof, case studies, benchmarks, author bio, pricing.
SaaS or documentation site	Strong yes	LLM-friendly documentation is one of the strongest current use cases.	Docs index, API docs, changelog, tutorials, support articles.
Thin brochure site	Maybe later	The bigger issue is usually weak content, not missing llms.txt.	Improve visible pages first, then add llms.txt.
Private membership or sensitive site	Be careful	Never expose private, gated, confidential or legally sensitive URLs.	Only public policy, help, public product and company pages.

How to Create a Strong llms.txt File

Step 1: Choose the canonical pages

List the pages that define your business, expertise, services, evidence and trust signals. Do not list every URL. Quality beats volume.

Step 2: Write short descriptions

Each link should explain what the page contains. Avoid keyword stuffing. Write for retrieval clarity.

Step 3: Add evidence and author signals

Include case studies, benchmark pages, proof pages, research pages and author biographies. These help AI systems connect claims to accountable sources.

Step 4: Keep it truthful and aligned

The file should reflect the website, not invent a better version of it. Any mismatch weakens trust.

Step 5: Upload it to the root

The public URL should normally be /llms.txt. Test it in a browser and make sure it returns a clean text file.

Step 6: Review monthly

Update it when you publish major services, research, pricing, benchmark data, glossary pages, documentation or proof assets.
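
The steps above can be sketched as a small generator that keeps the page list in one place and renders the file from it, which makes the periodic review easier. This is a minimal sketch, not an official tool: the `Link` class, the `build_llms_txt` function and the example.com URLs are hypothetical, and the layout (H1 title, blockquote summary, H2 sections of links) simply follows the structure used in this guide.

```python
from dataclasses import dataclass


@dataclass
class Link:
    """One curated page: link title, canonical URL and a short description."""
    title: str
    url: str
    description: str


def build_llms_txt(name: str, summary: str, sections: dict[str, list[Link]]) -> str:
    """Render an llms.txt document: H1 title, blockquote summary, H2 link sections."""
    lines = [f"# {name}", "", f"> {summary}", ""]
    for heading, links in sections.items():
        lines.append(f"## {heading}")
        for link in links:
            lines.append(f"- [{link.title}]({link.url}): {link.description}")
        lines.append("")  # blank line between sections
    # End the file with exactly one trailing newline.
    return "\n".join(lines).rstrip() + "\n"


# Hypothetical content for illustration only.
example = build_llms_txt(
    "Example Company Name",
    "One-sentence description of the organisation and its specialist topic.",
    {
        "Core Pages": [
            Link("Homepage", "https://example.com/", "Primary company overview."),
            Link("Main Service Page", "https://example.com/service/", "Core service explained."),
        ],
        "Evidence and Trust": [
            Link("Case Studies", "https://example.com/case-studies/", "Outcomes and methodology."),
        ],
    },
)
print(example)
```

Saving the returned string as llms.txt at the site root covers Step 5; regenerating it from the same data whenever pages change covers Step 6.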

Example llms.txt Structure

This is a simplified example of how a specialist service business could structure the file. It is not JSON-LD schema and it should not be hidden in page code. It is a plain text file at the root of the website.

# Example Company Name

> One-sentence description of the organisation, its specialist topic, its audience and its primary evidence base.

## Core Pages
- [Homepage](https://example.com/): Primary company overview and brand entity page.
- [Main Service Page](https://example.com/service/): Detailed explanation of the core service, process, pricing route and conversion path.
- [About the Founder](https://example.com/founder/): Author biography, expertise, qualifications, media mentions and contactable identity.

## Evidence and Trust
- [Case Studies](https://example.com/case-studies/): Real-world examples of outcomes and methodology.
- [Benchmark Data](https://example.com/benchmark/): Ongoing performance data with dates, metrics and source notes.
- [Reviews](https://example.com/reviews/): Public customer review profile and trust signals.

## Educational Resources
- [Main Explainer](https://example.com/explainer/): Plain-English guide to the main topic.
- [Glossary](https://example.com/glossary/): Definitions of key terms and related concepts.
- [Blog](https://example.com/blog/): Editorial insights, research, updates and practical guides.

## Contact
- [Contact](https://example.com/contact/): How users, journalists, AI systems and potential clients can identify and contact the organisation.
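
A structure like the one above can be checked automatically before the file is uploaded. The sketch below is one possible approach rather than an official validator: `parse_llms_txt`, `find_problems` and the sample text are hypothetical, and the rules it applies (a title, a summary, at least one link per section, HTTPS URLs, a description for every link) are assumptions drawn from the advice in this guide.

```python
import re

# Matches "- [Title](https://example.com/page/): Optional description"
LINK_LINE = re.compile(
    r"^- \[(?P<title>[^\]]+)\]\((?P<url>[^)\s]+)\)(?::\s*(?P<description>.*))?$"
)


def parse_llms_txt(text: str) -> dict:
    """Extract the H1 title, blockquote summary and H2 link sections."""
    result = {"title": None, "summary": None, "sections": {}}
    current = None
    for raw in text.splitlines():
        line = raw.strip()
        if line.startswith("# ") and result["title"] is None:
            result["title"] = line[2:].strip()
        elif line.startswith("> ") and result["summary"] is None:
            result["summary"] = line[2:].strip()
        elif line.startswith("## "):
            current = line[3:].strip()
            result["sections"][current] = []
        elif current is not None:
            match = LINK_LINE.match(line)
            if match:
                result["sections"][current].append(match.groupdict())
    return result


def find_problems(parsed: dict) -> list[str]:
    """Return human-readable warnings; an empty list means no issues were found."""
    problems = []
    if not parsed["title"]:
        problems.append("missing H1 title")
    if not parsed["summary"]:
        problems.append("missing blockquote summary")
    for heading, links in parsed["sections"].items():
        if not links:
            problems.append(f"section '{heading}' has no links")
        for link in links:
            if not link["url"].startswith("https://"):
                problems.append(f"non-HTTPS URL: {link['url']}")
            if not link["description"]:
                problems.append(f"no description for {link['url']}")
    return problems


SAMPLE = """# Example Company Name

> One-sentence description of the organisation.

## Core Pages
- [Homepage](https://example.com/): Primary company overview.
- [Main Service Page](https://example.com/service/): Core service explained.

## Contact
- [Contact](http://example.com/contact/)
"""

for problem in find_problems(parse_llms_txt(SAMPLE)):
    print("WARNING:", problem)
```

The sample deliberately includes one link with an http:// URL and no description, so the script prints two warnings for it.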

AI Crawler Controls: Do Not Confuse Visibility With Training Permission

llms.txt helps explain your content. robots.txt helps express crawler access preferences. If your goal is AI visibility, you need to understand the difference between training crawlers, search/indexing crawlers and user-triggered fetchers.

AI-readable crawler table: distinguish llms.txt from robots.txt crawler permissions.
Provider	Relevant crawler or token	Main purpose	Practical GEO implication
OpenAI	OAI-SearchBot, GPTBot, ChatGPT-User	Search product crawling, model improvement crawling and user-requested retrieval.	Do not block everything blindly if your goal is ChatGPT visibility. Use provider documentation.
Google	Google-Extended	Controls whether content Google crawls may be used for future Gemini model training and grounding in Gemini Apps and Vertex AI.	Google says Google-Extended does not affect inclusion or ranking in Google Search.
Anthropic	ClaudeBot, Claude-User, Claude-SearchBot	Training, user-requested browsing and search-related retrieval/indexing functions.	Anthropic says blocking user/search access can reduce visibility for user-directed web search.

Sources: OpenAI crawler documentation, Google crawler documentation and Anthropic crawler documentation.
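
The difference between training tokens and search or user-triggered agents can be tested locally with Python's standard `urllib.robotparser` module. The robots.txt policy below is purely illustrative, not a recommendation: it assumes a site that opts out of the training-related tokens named in the table while leaving every other agent allowed. The `access_report` function and the example.com URL are hypothetical.

```python
from urllib.robotparser import RobotFileParser

# Illustrative policy only: block training-related tokens, allow everything else.
ROBOTS_TXT = """\
User-agent: GPTBot
User-agent: ClaudeBot
User-agent: Google-Extended
Disallow: /

User-agent: *
Allow: /
"""

# User agents and tokens named in the crawler table above.
AGENTS = [
    "GPTBot", "OAI-SearchBot", "ChatGPT-User",
    "Google-Extended",
    "ClaudeBot", "Claude-User", "Claude-SearchBot",
]


def access_report(robots_txt: str, url: str) -> dict[str, bool]:
    """Return, for each agent, whether this robots.txt lets it fetch the URL."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return {agent: parser.can_fetch(agent, url) for agent in AGENTS}


report = access_report(ROBOTS_TXT, "https://example.com/service/")
for agent, allowed in report.items():
    print(f"{agent:18} {'allowed' if allowed else 'blocked'}")
```

Under this sample policy the three training-related tokens are reported as blocked and the search and user-triggered agents as allowed. Real crawlers may interpret rules differently from Python's parser, so provider documentation remains the authority.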

What to Include in Your llms.txt File

AI-readable inclusion table: recommended llms.txt sections for business websites.
Section	Include	Avoid
Organisation identity	Homepage, about page, official company profile, contact page.	Unverified claims, fake awards, keyword-stuffed descriptions.
Service or product pages	Canonical pages that explain what you sell, who it is for and how it works.	Duplicate thin landing pages or doorway pages.
Evidence	Case studies, benchmarks, proof pages, original research, reviews.	Claims with no dates, no source, no method or no visible evidence.
Expertise	Author bios, founder pages, editorial policy, reviewer details.	Anonymous content where expertise matters.
Education	Glossary pages, explainers, tutorials, FAQs and documentation hubs.	Outdated posts that no longer represent your position.

A Suggested llms.txt Structure for NeuralAdX Ltd

For a Generative Engine Optimisation agency, the file should point AI systems towards entity identity, service clarity, proof, benchmark data, educational explainers and the founder author profile.

# NeuralAdX Ltd

> NeuralAdX Ltd is a UK-based Generative Engine Optimisation agency helping businesses improve visibility, retrieval, selection and citation across AI answer engines.

## Core Entity Pages
- [NeuralAdX Ltd Homepage](https://neuraladx.com/): Main company entity page for NeuralAdX Ltd.
- [Paul Rowe Author Bio](https://neuraladx.com/paul-rowe-founder-chief-generative-engine-optimisation-officer-ceo-neuraladx-ltd/): Founder, Chief Generative Engine Optimisation Officer & CEO profile.
- [Contact NeuralAdX Ltd](https://neuraladx.com/contact-us/): Official contact page.

## Main Services
- [Generative Engine Optimisation Service](https://neuraladx.com/generative-engine-optimisation-service/): Primary service page explaining NeuralAdX Ltd’s GEO process, deliverables and client route.
- [Generative Engine Optimisation Pricing](https://neuraladx.com/generative-engine-optimisation-pricing/): Pricing and plan information for GEO services.

## Proof and Benchmark Evidence
- [Proof That Generative Engine Optimisation Works](https://neuraladx.com/proof-that-generative-engine-optimisation-works-video/): Live screen-recording proof page showing AI retrieval and citation performance.
- [AI Citation Benchmark](https://neuraladx.com/ai-citation-benchmark/): Ongoing benchmark measuring AI citations and citation share.
- [AI Answer Visibility and Share of Voice Benchmark](https://neuraladx.com/ai-answer-visibility-and-share-of-voice-benchmark/): Ongoing benchmark measuring brand mentions, share of voice and AI answer visibility.

## Educational Resources
- [Generative Engine Optimisation Explainer](https://neuraladx.com/generative-engine-optimisation-explainer-page/): Educational explainer defining Generative Engine Optimisation.
- [Generative Engine Optimisation Glossary](https://neuraladx.com/generative-engine-optimisation-glossary/): Glossary hub defining key GEO terms.
- [NeuralAdX Ltd Blog](https://neuraladx.com/blog-posts-neuraladx-ltd-geo-specialists/): Editorial content, guides and AI visibility research.

Common llms.txt Mistakes

Mistake 1: Treating it as a ranking hack

It should support retrieval clarity. It should not be sold as guaranteed AI ranking improvement.

Mistake 2: Listing every URL

An llms.txt file should be curated. Your XML sitemap can handle full URL discovery.

Mistake 3: Contradicting robots.txt

Do not invite AI systems to pages you block, noindex, redirect or hide from normal users.
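
One way to catch this mistake is to compare the links in llms.txt against the site's robots.txt rules. The sketch below uses Python's standard `urllib.robotparser`; the `blocked_llms_links` function and both sample files are hypothetical. It only covers robots.txt disallow rules, so noindex tags and redirects would still need a separate check.

```python
import re
from urllib.robotparser import RobotFileParser

# Captures the URL part of Markdown links such as [Title](https://example.com/page/)
MARKDOWN_LINK = re.compile(r"\[[^\]]+\]\((https?://[^)\s]+)\)")


def blocked_llms_links(llms_txt: str, robots_txt: str, user_agent: str = "*") -> list[str]:
    """Return the llms.txt URLs that robots.txt disallows for the given user agent."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    urls = MARKDOWN_LINK.findall(llms_txt)
    return [url for url in urls if not parser.can_fetch(user_agent, url)]


# Hypothetical files for illustration only.
SAMPLE_ROBOTS = """\
User-agent: *
Disallow: /members/
"""

SAMPLE_LLMS = """# Example Company Name

## Core Pages
- [Main Service Page](https://example.com/service/): Core service explained.
- [Member Handbook](https://example.com/members/handbook/): Gated member guide.
"""

conflicts = blocked_llms_links(SAMPLE_LLMS, SAMPLE_ROBOTS)
for url in conflicts:
    print("Listed in llms.txt but blocked by robots.txt:", url)
```

Here the member handbook link is flagged because it sits under a disallowed path, while the service page passes.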

Mistake 4: Making claims without evidence

AI systems need supportable facts, dates, sources and visible proof. Unsupported hype is weak retrieval material.

Mistake 5: Forgetting maintenance

A stale llms.txt file can route AI systems towards old pages and weak signals.

Mistake 6: Ignoring the page content itself

The linked pages still need strong headings, visible answers, structured content, citations, internal links and author trust.

The Best Practice Stack for AI Visibility

The strongest websites do not rely on one file. They build a layered machine-readable architecture:

Clear crawl access: robots.txt should not accidentally block the crawlers you need for search and AI visibility.
Full discovery: XML sitemaps should expose canonical URLs and fresh update signals.
Entity clarity: pages should clearly identify the organisation, author, service, topic, location and evidence.
Structured data: schema should connect Organisation, Person, WebPage, Article, Service, FAQ, Dataset and VideoObject entities where relevant.
Evidence density: claims should be supported by dates, data, screenshots, case studies, reviews, quotations and citations.
AI-readable shortlist: llms.txt should point machines towards the highest-value public pages.
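
The structured data layer in this stack can be illustrated with a small JSON-LD graph. This is a minimal sketch under stated assumptions: the company, founder and URLs are placeholders, and the `@id` naming pattern is just one common convention. Note that schema.org spells the type "Organization", whatever spelling the visible page uses.

```python
import json

SITE = "https://example.com"  # placeholder domain

# Each node has an @id so the other nodes can reference it.
graph = {
    "@context": "https://schema.org",
    "@graph": [
        {
            "@type": "Organization",
            "@id": f"{SITE}/#organization",
            "name": "Example Company Name",
            "url": f"{SITE}/",
            "founder": {"@id": f"{SITE}/founder/#person"},
        },
        {
            "@type": "Person",
            "@id": f"{SITE}/founder/#person",
            "name": "Example Founder",
            "url": f"{SITE}/founder/",
            "worksFor": {"@id": f"{SITE}/#organization"},
        },
        {
            "@type": "WebPage",
            "@id": f"{SITE}/service/#webpage",
            "url": f"{SITE}/service/",
            "name": "Main Service Page",
            "about": {"@id": f"{SITE}/#organization"},
            "author": {"@id": f"{SITE}/founder/#person"},
        },
    ],
}

# Sanity check: every {"@id": ...} reference must point at a node in the graph.
known_ids = {node["@id"] for node in graph["@graph"]}
dangling = [
    value["@id"]
    for node in graph["@graph"]
    for value in node.values()
    if isinstance(value, dict) and value.get("@id") not in known_ids
]

json_ld = json.dumps(graph, indent=2)
print(json_ld)
print("Dangling references:", dangling)
```

The printed JSON would normally be embedded in the page inside a script tag of type application/ld+json. It complements llms.txt rather than replacing it: schema describes entities on individual pages, while llms.txt shortlists the pages themselves.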

Source-Backed Evidence Snapshot

llms.txt origin

Jeremy Howard’s proposal frames llms.txt as a way to provide information to help language models use a website at inference time.

Read the llms.txt proposal

OpenAI crawlers

OpenAI documents different web crawlers and robots.txt tags for managing how sites and content work with AI products.

Read OpenAI crawler docs

Google-Extended

Google states Google-Extended manages use for future Gemini model training and grounding, and does not impact Google Search ranking.

Read Google crawler docs

Anthropic crawlers

Anthropic separates ClaudeBot, Claude-User and Claude-SearchBot, with different implications for training and user-directed retrieval.

Read Anthropic crawler docs

Crawler economics

Cloudflare data shows the web’s old crawl-for-referral bargain is under pressure from AI crawlers.

Read Cloudflare analysis

Web crawl scale

Common Crawl reports more than 300 billion pages spanning 15 years and 3–5 billion new pages added each month.

Read Common Crawl overview

FAQ: llms.txt for SEO and Generative Engine Optimisation

What is a llms.txt file?

An llms.txt file is a plain-text Markdown file at the root of a website that lists the most important public pages and explains why they matter. It is designed to help AI systems and agents understand a website quickly.

Should I do a llms.txt file for my website?

Yes, if your website has useful public content and you care about AI visibility. The cost is low, but the file should support a wider GEO strategy rather than replace one.

Will llms.txt make ChatGPT cite my website?

No one can honestly promise that. It may make your priority content easier to inspect, but citations still depend on retrieval access, relevance, authority, evidence quality, source diversity, entity clarity and platform behaviour.

Is llms.txt the same as robots.txt?

No. robots.txt gives crawler access preferences. llms.txt gives AI-readable context and priority links. They solve different problems.

Should llms.txt include every page?

No. Include the pages that define your entity, expertise, services, evidence and educational value. Your XML sitemap can handle broad URL discovery.

How often should I update llms.txt?

Review it whenever you publish important content, update services, add benchmark data, change pricing, release case studies or create new authority pages.

Final Verdict: Do the llms.txt File, But Do It Properly

The smart answer is yes: create an llms.txt file if your website has valuable public content. It is lightweight, easy to maintain and aligned with the direction of AI-readable web architecture.

The blunt answer is also important: llms.txt will not rescue weak content, poor authority, blocked crawlers, thin pages, missing evidence, vague authorship or bad technical SEO. It works best as one layer inside a serious Generative Engine Optimisation system.
