ChatGPT cites your competitor when users ask about your industry. Perplexity surfaces a five-year-old blog post that contradicts your current pricing. AI Overviews show a paraphrase of your guide credited to a forum thread. The way large language models discover, prioritise, and cite content is fundamentally different from how Google ranks it, and llms.txt is the first community-adopted standard for telling AI systems which of your pages they should treat as authoritative.

What is llms.txt?

llms.txt is a plain-text Markdown file placed at the root of your domain that lists the URLs and short descriptions of the content you most want large language models to read and cite. Proposed by Jeremy Howard in September 2024, it acts as a curated content map for AI systems: a content-discovery hint, not an access control mechanism.

Think of llms.txt as the AI-era equivalent of an XML sitemap, but for LLMs instead of search engines. Where a sitemap dumps every URL on your site for crawlers to evaluate, llms.txt is a hand-picked, human-readable list of your canonical references: the pages an AI should fetch when it wants to understand what your business does.

llms.txt: Key Facts

What it is: A Markdown file at /llms.txt that lists URLs and descriptions for AI systems

Proposed by: Jeremy Howard (Answer.AI), September 2024

Format: Markdown with an H1 site name, blockquote summary, H2 section groupings, and bullet links with descriptions

Location: https://yoursite.com/llms.txt (root of domain, like robots.txt)

Optional sibling: /llms-full.txt, the full content of every referenced page bundled into one file

Adopters as of 2026: Anthropic Claude, Perplexity, smaller AI search tools; OpenAI and Google not yet committed

Risk: None. It is an additive hint that does not affect existing search SEO

Effort: 30 minutes for a small site, half a day for a large documentation site

How Does llms.txt Actually Work?

An AI system that supports llms.txt fetches your /llms.txt first when it needs information about your domain, treats the listed URLs as the most authoritative starting points, and uses the human-written descriptions to decide which page best matches the user's question. This spares the LLM from guessing which pages matter amid the noise of your full site.

The protocol is intentionally simple. There is no API key, no verification, no signing. You publish a static file at a known location and AI clients that support the spec can choose to honour it. Adoption is driven by the AI tool, not by you; your job is to publish the signal and wait for tools to read it.
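Because the file lives at a known location, a client can derive its URL from any page on the site. The following is a minimal sketch of that lookup step, not the behaviour of any particular AI tool; the function name llms_txt_url and its filename parameter are hypothetical.

```python
from urllib.parse import urlsplit, urlunsplit


def llms_txt_url(page_url: str, filename: str = "llms.txt") -> str:
    """Return the root-level llms.txt URL for the site hosting page_url.

    Hypothetical helper for illustration only.
    """
    parts = urlsplit(page_url)
    if not parts.scheme or not parts.netloc:
        raise ValueError(f"not an absolute URL: {page_url!r}")
    # Keep the scheme and host; drop the path, query string, and fragment.
    return urlunsplit((parts.scheme, parts.netloc, "/" + filename, "", ""))
```

Calling llms_txt_url("https://yoursite.com/blog/post?id=7#intro") returns "https://yoursite.com/llms.txt". Passing filename="llms-full.txt" gives the location of the optional sibling file.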

llms.txt vs robots.txt vs Sitemap: What Each One Does

| | llms.txt | robots.txt | sitemap.xml |
|---|---|---|---|
| Audience | Large language models | Search engine crawlers | Search engine crawlers |
| Purpose | "Read these pages first" | "Do not fetch these URLs" | "These URLs exist, here is the freshness" |
| Format | Markdown | Plain text directives | XML |
| Effect | Discovery + curation | Access control | Discovery |
| Enforced? | Voluntary by AI client | Voluntary by good-citizen bots | Voluntary by search engines |
| Coexist? | Yes | Yes | Yes |

All three serve different masters and should be published together. llms.txt does not replace robots.txt or sitemap.xml; it complements them. A site with all three is signalling clearly to traditional search engines AND to AI systems.

What Should Your llms.txt Contain?

Jeremy Howard's spec defines a loose Markdown structure with five elements. Only the H1 is required at the file level; bullet links are required only inside the H2 sections you choose to include.

Required: H1 Site Name

The single H1 at the top is your domain or product name. AI clients use this as the canonical reference for who you are. Example: # Daylytix.

Recommended: Blockquote Summary

A 2-3 sentence elevator pitch. This is what the AI will quote when it introduces you. Make it a clear definition: classification, mechanism, primary use case. Example: > Daylytix is an AI-powered SEO audit platform built for agencies and in-house SEO teams. It analyses 200+ technical, content, and AI-readiness signals per site.

Recommended: H2 Section Groupings

Group your linked URLs by category. Common sections: Documentation, Blog, Product, Pricing, Legal. Each section is just an H2 heading.

Required (within sections): Bullet Links

Inside each H2, list the URLs as Markdown bullets with descriptions. Format: - [Page title](https://yoursite.com/page): Short description of what this page covers. The description should be human-readable but information-dense, 1-2 sentences at most.

Optional: "Optional" Sub-Section

The spec defines an ## Optional H2 where you list secondary references. AI clients may skip these to save tokens. Use this for cross-references and "see also" content rather than core authoritative pages.
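To make the five elements concrete, here is a minimal sketch of how a client might read them back out of the file. It is an illustration under simplifying assumptions (one link per bullet, no parentheses inside URLs), not a reference parser for the spec; the names Link, LlmsTxt, LINK_RE, and parse_llms_txt are hypothetical.

```python
import re
from dataclasses import dataclass, field

# Matches "- [Title](url)" with an optional ": description" suffix.
LINK_RE = re.compile(
    r"^-\s*\[(?P<title>[^\]]+)\]\((?P<url>[^)\s]+)\)(?::\s*(?P<desc>.*))?$"
)


@dataclass
class Link:
    title: str
    url: str
    description: str = ""


@dataclass
class LlmsTxt:
    title: str = ""
    summary: str = ""
    sections: dict[str, list[Link]] = field(default_factory=dict)


def parse_llms_txt(text: str) -> LlmsTxt:
    """Split llms.txt text into H1 title, blockquote summary, and H2 link sections."""
    doc = LlmsTxt()
    summary_lines = []
    current = None  # name of the H2 section being read, if any
    for raw in text.splitlines():
        line = raw.strip()
        if line.startswith("## "):
            current = line[3:].strip()
            doc.sections[current] = []
        elif line.startswith("# ") and not doc.title:
            doc.title = line[2:].strip()
        elif line.startswith(">") and current is None:
            summary_lines.append(line[1:].strip())
        elif current is not None:
            match = LINK_RE.match(line)
            if match:
                description = (match["desc"] or "").strip()
                doc.sections[current].append(
                    Link(match["title"], match["url"], description)
                )
    doc.summary = " ".join(summary_lines)
    return doc
```

A client that wants to save tokens could then drop doc.sections["Optional"] before fetching anything, which is the behaviour the Optional section is meant to allow.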

How to Build Your llms.txt (Step by Step)

This is a 30-minute exercise for most sites. The hardest part is choosing which URLs to include. Keep the list short and signal-dense rather than comprehensive.

List Your Top 15-30 Canonical URLs

Open Google Analytics or your CMS and identify the pages that get the most organic traffic, that explain what your product does, or that answer the most common support questions. Aim for 15-30 URLs; more becomes noise to the LLM.

Write a 1-2 Sentence Description for Each

For each URL, write a description that an AI could quote verbatim when answering a related question. Lead with the page's purpose, not its title. Example: "Pricing page: Daylytix offers three tiers (Starter $39/mo, Pro $99/mo, Agency $299/mo) with a 14-day free trial, no credit card required."

Group URLs Into Logical H2 Sections

Common groupings: Product (homepage, features, pricing), Docs (getting started, API reference, integrations), Blog (the 5-10 most cited posts), Company (about, contact). Use whatever taxonomy matches your site.
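Once the URLs, descriptions, and groupings exist, turning them into the file is mechanical. The sketch below assumes you keep that data as a plain dictionary of sections; render_llms_txt and its argument layout are hypothetical, chosen only for illustration.

```python
def render_llms_txt(site_name, summary, sections):
    """Render llms.txt text.

    sections maps an H2 heading to a list of (title, url, description)
    tuples; an empty description produces a bare link.
    """
    lines = [f"# {site_name}", "", f"> {summary}", ""]
    for heading, links in sections.items():
        lines.append(f"## {heading}")
        for title, url, description in links:
            entry = f"- [{title}]({url})"
            if description:
                entry += f": {description}"
            lines.append(entry)
        lines.append("")  # blank line between sections
    # End the file with exactly one trailing newline.
    return "\n".join(lines).rstrip("\n") + "\n"
```

Keeping the source data separate from the rendered file makes it easier to regenerate llms.txt whenever a description or price changes.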

Save as Plain Text at the Domain Root

Create the file /llms.txt and serve it with content-type text/plain or text/markdown. Most CDNs do this automatically based on the .txt extension. Verify by visiting https://yoursite.com/llms.txt in a browser: you should see plain text, not a 404.
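The browser check can also be scripted. This is a rough sketch using only the Python standard library; the function names is_acceptable_content_type and check_llms_txt are hypothetical, and the accepted media types are simply the two named above.

```python
from urllib.request import urlopen

ACCEPTED_MEDIA_TYPES = {"text/plain", "text/markdown"}


def is_acceptable_content_type(header_value: str) -> bool:
    """Check a Content-Type header value against the accepted media types."""
    # Drop parameters such as "; charset=utf-8" and compare case-insensitively.
    media_type = header_value.split(";", 1)[0].strip().lower()
    return media_type in ACCEPTED_MEDIA_TYPES


def check_llms_txt(url: str, timeout: float = 10.0) -> bool:
    """Fetch url and report whether it returns 200 with an accepted content type.

    urlopen raises urllib.error.HTTPError for 4xx/5xx responses such as a 404.
    """
    with urlopen(url, timeout=timeout) as response:
        content_type = response.headers.get("Content-Type", "")
        return response.status == 200 and is_acceptable_content_type(content_type)
```

Running check_llms_txt("https://yoursite.com/llms.txt") after each deploy would catch a CMS update that starts wrapping the file in an HTML template.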

(Optional) Generate llms-full.txt

The full-content sibling concatenates the body of every URL in your llms.txt into one large Markdown file. AI clients that support it can fetch one file instead of dozens. Many SSGs and headless CMSes now have plugins that generate llms-full.txt at build time.
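If your platform has no such plugin, the concatenation itself is simple. The sketch below assumes you already have each page's body as Markdown; the function build_llms_full, the per-page H2 heading, and the "Source:" line are illustrative layout choices of this example, not something the article or the spec prescribes.

```python
def build_llms_full(site_name: str, pages: list[tuple[str, str, str]]) -> str:
    """Bundle page bodies into one Markdown document.

    pages holds (title, url, markdown_body) tuples in the order the
    links appear in llms.txt. The layout is an assumption for illustration.
    """
    parts = [f"# {site_name}"]
    for title, url, body in pages:
        parts.append(f"## {title}\n\nSource: {url}\n\n{body.strip()}")
    return "\n\n".join(parts) + "\n"
```

Generating this at build time, from the same list that produces llms.txt, keeps the two files from drifting apart.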


Example: A Minimal Working llms.txt

# Acme Tools

> Acme Tools is a SaaS platform for project managers. It combines task tracking,
> time logging, and client reporting into one workspace.

## Product
- [Homepage](https://acmetools.com/): Overview of features and use cases.
- [Pricing](https://acmetools.com/pricing): Three tiers from $19 to $99 per user / month.
- [Features](https://acmetools.com/features): Detailed list of every feature with screenshots.

## Documentation
- [Quickstart](https://docs.acmetools.com/quickstart): How to create your first project in 5 minutes.
- [API reference](https://docs.acmetools.com/api): REST API documentation with code samples.

## Blog (most-cited posts)
- [Project management for remote teams](https://acmetools.com/blog/remote-pm): 2026 guide.
- [Time tracking ethics](https://acmetools.com/blog/time-tracking): When to track and when to stop.

## Optional
- [About us](https://acmetools.com/about)
- [Press kit](https://acmetools.com/press)

Does llms.txt Actually Work in 2026?

Adoption is partial and growing. Anthropic's Claude was among the earliest commercial AI tools to honour llms.txt; Perplexity has indicated support; several smaller AI search engines and developer tools (Cursor, Continue.dev) actively use llms.txt to fetch documentation context. OpenAI and Google have not made public commitments. They may already be using it internally, but no public confirmation exists.

The honest answer: publishing llms.txt in 2026 is more about positioning than immediate traffic. The cost is half a day of work, the downside is zero, and the upside is that the moment a major AI tool announces formal support, your file is already there. Sites that wait risk being absent from the first wave of AI training and retrieval pipelines that honour the standard.

Common Mistakes With llms.txt

Mistake 1: Treating It Like a Sitemap

Why it happens: SEOs default to "more URLs = more visibility". Why it backfires: An llms.txt with 500 URLs is just noise, and the LLM cannot tell which pages are authoritative. What to do instead: 15-30 hand-picked URLs with thoughtful descriptions.

Mistake 2: Forgetting the Descriptions

Why it happens: Easy to auto-generate a bare list of URLs. Why it backfires: The descriptions are what the LLM quotes. Without them, you lose control of how your site is summarised. What to do instead: Write each description as if it were the answer the AI will give.
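Both of these mistakes are easy to catch automatically before publishing. The following is a small sketch of such a check; lint_links is a hypothetical helper, and the default limit of 30 is simply the upper bound suggested in this article.

```python
def lint_links(links, max_urls=30):
    """Return warnings for an over-long list or missing descriptions.

    links is a list of (title, url, description) tuples.
    """
    warnings = []
    if len(links) > max_urls:
        warnings.append(
            f"{len(links)} URLs listed; consider trimming to {max_urls} or fewer"
        )
    for _title, url, description in links:
        # Whitespace-only descriptions count as missing.
        if not description.strip():
            warnings.append(f"missing description: {url}")
    return warnings
```

An empty return value means the list passes both checks; it says nothing about whether the descriptions are any good, which still needs a human read.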

Mistake 3: Serving It as text/html

Why it happens: Some CMSes wrap text files in HTML templates by default. Why it backfires: Strict parsers may reject anything that is not text/plain or text/markdown. What to do instead: Configure your web server to serve .txt with the correct content type, or use a CDN edge rule to override.

Limitation: No Enforcement

llms.txt is voluntary on the AI client side. There is no way to verify that ChatGPT actually fetched your file or used your descriptions. The current ecosystem is built on goodwill and convention, much like robots.txt in 1994. Expect this to formalise as AI search matures.

TL;DR: llms.txt Summary

What it is: A Markdown file at /llms.txt listing your canonical URLs and descriptions for AI systems.

How it works: AI clients that support it fetch llms.txt first, treat the listed URLs as authoritative, and use descriptions to match user questions.

vs robots.txt: Different audience, different purpose; both coexist.

Adopters in 2026: Claude, Perplexity, Cursor, several smaller AI tools; partial commercial adoption.

Effort to implement: 30 minutes to half a day depending on site size.

Risk: Zero. It is an additive hint that does not affect search SEO.

Bottom line: Publish it now. The downside is nothing; the upside is being ready when adoption hits critical mass.

Frequently Asked Questions

What is llms.txt?

llms.txt is a plain-text Markdown file placed at the root of your domain that lists the URLs and short descriptions of the content you most want large language models to read and cite. It is a content-discovery hint, not an access control mechanism.

Is llms.txt the same as robots.txt?

No. robots.txt tells crawlers which URLs they are allowed to fetch. llms.txt tells AI systems which URLs you would most like them to read. The two coexist and serve different purposes.

Do ChatGPT and Perplexity actually read llms.txt?

Adoption is partial as of 2026. Anthropic Claude, Perplexity, and several smaller AI search tools have indicated support. OpenAI and Google have not made public commitments. Publishing the file costs nothing and positions you for future support.

Where do I put the llms.txt file?

At the root of your domain, served at https://yoursite.com/llms.txt. It must be a plain-text file in Markdown format, returning a 200 OK with content-type text/plain or text/markdown.

Should I publish llms-full.txt as well?

Yes, if you want to bundle the full content of your canonical pages into a single Markdown file. llms-full.txt is the expanded sibling of llms.txt and gives LLMs everything they need in one fetch instead of dozens.

Getting Started

I published llms.txt for Daylytix in October 2024, six weeks after Jeremy Howard's announcement. We have not seen a measurable traffic spike from it, and that is fine. The file was 40 minutes of work, and the moment OpenAI or Google formalises support, we will already be discoverable.

Treat it like investing in HTTPS in 2015: not a magic bullet, but a clear signal that you are paying attention to where the platform is going. Run a free audit with Daylytix and we will check your llms.txt presence plus the other AI-readiness signals in one pass.