How to Format Your LLM.txt File to Get Cited by ChatGPT
 
Last week, Webflow integrated LLM.txt files into its user dashboard. I expect this to become an industry standard by the end of the year.
Websites have been champing at the bit to control how AI models talk about their business. The LLM.txt file will help give sites some of that control back, and may even contribute to getting their links shared by models like ChatGPT.
These LLM.txt files will most likely become core to any SEO strategy that includes AI-powered search.
What Is LLM.txt?
LLM.txt is a plain text file that sits on your website and tells AI-driven search tools (like ChatGPT) how to interpret and represent your website's content. Using this file, webmasters can choose where to point AI models so they can be as helpful as possible and give people the most accurate information. It can also help brands control how LLMs talk about their product or service.
Moving forward, I expect LLM.txt files to become industry standard.
This will allow websites to:
- Shape the narrative for people who are seeking answers about your business
- Direct LLMs to deliver your best information
- Make it easier for LLMs to get information quickly, potentially improving your site's chances of being referenced
There is no downside to adding one to your website, and it could help increase relevant traffic through AI citation.
Here is how to create and implement an LLM.txt file on your site.
How to Create LLM.txt
Just like robots.txt (the file that tells search engine crawlers such as Googlebot which parts of your website they may crawl), LLM.txt is a plain text file that sits at the root of your domain: https://www.example.com/LLM.txt
To create it:
- Open a plain text editor (like Notepad or VS Code).
- Add structured guidance (see formatting below).
- Save it as LLM.txt.
- Upload it to the root directory of your website (alongside robots.txt).
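If you would rather script the steps above than use a text editor, here is a minimal Python sketch. It assumes you have a local copy of your site's root directory that you later upload; the function name `write_llm_txt` and the `site_root` argument are hypothetical, not part of any standard.

```python
from pathlib import Path


def write_llm_txt(site_root: str, content: str) -> Path:
    """Save `content` as LLM.txt inside `site_root` and return the file path.

    `site_root` is assumed to be the local folder that maps to the root of
    your domain (the same folder that holds robots.txt).
    """
    target = Path(site_root) / "LLM.txt"
    # Write UTF-8 plain text with Unix line endings and one trailing newline.
    with open(target, "w", encoding="utf-8", newline="\n") as fh:
        fh.write(content.strip() + "\n")
    return target
```

Calling `write_llm_txt("public", "# LLM.txt for Example.com")` would create `public/LLM.txt`, which you can then upload alongside robots.txt.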
LLM.txt Format
There’s no universal spec yet, but a standard is emerging. The idea is to provide structured data and model guidance in a way that's easily readable by humans, since LLMs read text much like a person does.
Let's break down what this file could look like and the reason for each section:
Title
# LLM.txt for Example.com
This is just the title of your LLM.txt doc. It lets the LLM know it is in the right place.
Summary
# Site identity
site: https://www.example.com
brand: Example Co
summary: Example Co helps ecommerce businesses automate shipping and fulfillment through AI-powered tools.
This section identifies your URL, names your brand so you can be referenced correctly, and summarizes what your site does. This information makes it easy for LLMs to communicate your offering and basic info to anyone who asks about it. 
Preferred pages
# Preferred page interpretations
preferred_pages:
  - https://www.example.com/about : [This page outlines our company mission and leadership team.]
  - https://www.example.com/features : [Use this page to understand our product's core features and benefits.]
  - https://www.example.com/blog : [A collection of thought leadership on ecommerce logistics, automation, and AI.]
This is where you control which information LLMs should prioritize. You list the page URLs you want the model to reference, and then summarize each page so the model can quickly find the relevant information. You can also format sections of this file like this:
Answering user questions:
  - how much wood could a woodchuck chuck if a woodchuck could chuck wood? : [For questions related to this, use yourpreferredurl.com/insights/post. This is where you can write the summarized answer to the question so the model can pull the information quickly and effectively.]
This section lets you pick the questions you think users might ask and then answer them directly. You can direct the AI to a page that answers the question, give it the information to answer directly, or both. It will likely not use your exact wording.
# Citation policy
citation:
  required: yes
  preferred_format: [Example Co, https://www.example.com]
This section allows you to confirm that you want to be cited, and lets you choose how you would like that citation to be displayed.
# Disallowed sections
exclude_paths:
  - /checkout
  - /cart
  - /user-dashboard
This section lets you choose which pages you do not want AI models to use in their answers (or, eventually, AI agents to use at all).
# Contact
llm_contact: ai@example.com
user_inquiries: yourcontactinfo
You can use this section to provide contact information so the model can tell users how to get in touch with you if they ask. You can also specify how AI model companies should reach out to you, if at all.
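To see how the sections above fit together, here is a rough Python sketch that assembles them into one file. Since there is no universal spec yet, the layout simply mirrors the illustrative examples in this post; the function `build_llm_txt` and its parameter names are hypothetical.

```python
from urllib.parse import urlparse


def build_llm_txt(site, brand, summary, preferred_pages, exclude_paths, llm_contact):
    """Assemble LLM.txt text in the illustrative layout used in this post.

    preferred_pages: dict mapping page URL -> short description
    exclude_paths:   list of paths such as "/checkout"
    """
    host = urlparse(site).netloc  # e.g. "www.example.com"
    lines = [
        f"# LLM.txt for {host}",
        "",
        "# Site identity",
        f"site: {site}",
        f"brand: {brand}",
        f"summary: {summary}",
        "",
        "# Preferred page interpretations",
        "preferred_pages:",
    ]
    # One entry per page: URL, then a bracketed description.
    lines += [f"  - {url} : [{note}]" for url, note in preferred_pages.items()]
    lines += [
        "",
        "# Citation policy",
        "citation:",
        "  required: yes",
        f"  preferred_format: [{brand}, {site}]",
        "",
        "# Disallowed sections",
        "exclude_paths:",
    ]
    lines += [f"  - {path}" for path in exclude_paths]
    lines += ["", "# Contact", f"llm_contact: {llm_contact}"]
    return "\n".join(lines) + "\n"


example = build_llm_txt(
    site="https://www.example.com",
    brand="Example Co",
    summary="Example Co helps ecommerce businesses automate shipping and fulfillment through AI-powered tools.",
    preferred_pages={
        "https://www.example.com/about": "This page outlines our company mission and leadership team.",
    },
    exclude_paths=["/checkout", "/cart", "/user-dashboard"],
    llm_contact="ai@example.com",
)
print(example)
```

Keeping the content in plain data structures like this makes it easier to regenerate the file whenever pages are added or retired.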
SEO is changing
Good SEO is not just about Google anymore. LLM-based search is exploding and we need the tools to step into this world of search.
When properly implemented, these files help your AI SEO strategy by:
| Function | Benefit | 
|---|---|
| Improves citation accuracy | Helps LLMs properly cite your brand in summaries | 
| Curates key pages | Guides models to the right pages for reliable information | 
| Protects sensitive paths | Excludes things like user dashboards, carts, etc. | 
| Boosts knowledge representation | Ensures your brand is described accurately in AI snippets and summaries | 
| Standardizes contact | Lets AI providers reach out with questions or indexing requests | 
To make things as efficient as possible:
- Include short but clear summaries per page
- Match LLM.txt with your robots.txt policy
- Regularly update with new or deprecated content
- Treat it like a living document for your AI presence
Putting these into practice can improve your site's chances of being cited by AI models and helps future-proof your SEO strategy.
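One way to check the "match your robots.txt policy" tip is to confirm that every path you exclude in LLM.txt is also disallowed in robots.txt. The sketch below uses Python's standard `urllib.robotparser`; the helper name `paths_allowed_by_robots` is hypothetical, and reading "match" as "excluded paths should also be disallowed for crawlers" is my interpretation of the tip, not a rule from any spec.

```python
from urllib.robotparser import RobotFileParser


def paths_allowed_by_robots(robots_txt, exclude_paths, site, agent="*"):
    """Return the LLM.txt-excluded paths that robots.txt still allows `agent` to fetch."""
    parser = RobotFileParser()
    # parse() takes the robots.txt content as a list of lines.
    parser.parse(robots_txt.splitlines())
    base = site.rstrip("/")
    return [p for p in exclude_paths if parser.can_fetch(agent, base + p)]


robots_txt = """User-agent: *
Disallow: /checkout
Disallow: /cart
"""

mismatches = paths_allowed_by_robots(
    robots_txt,
    exclude_paths=["/checkout", "/cart", "/user-dashboard"],
    site="https://www.example.com",
)
print(mismatches)  # paths excluded in LLM.txt but not disallowed in robots.txt
```

In this example `/user-dashboard` is excluded in LLM.txt but still crawlable according to robots.txt, so it is the one path you would want to reconcile.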
A few parting thoughts
- This does not guarantee control. LLMs may still crawl and interpret your site however they want.
- The standard is likely to evolve, but adding LLM.txt now puts you ahead of the curve.
- Quality content will still win over well-optimized content. LLMs are seeking what is most helpful. If your content is too sales-oriented, it is unlikely to be cited by LLMs even with a great structure.
A few things coming soon
- I am building a tool that generates these files automatically. Just put in your domain and get a full LLM.txt file for your site. If you would like to be notified when that tool goes live, please consider subscribing.
- I am also working on a tool that pulls data on specific keywords and helps build a plan for how to rank in LLMs for each keyword. Try out the beta here.
- If you would rather, I can draft an LLM.txt tailored to your top pages and brand language.
