Making your Jekyll site agent-friendly
A growing share of your readers are not human. ChatGPT, Claude, Perplexity, agent frameworks, and the indexing crawlers behind them all want to read your site. They are also pretty bad at it. They burn tokens parsing HTML, stripping nav chrome, decoding theme toggles, and hunting for the actual prose.
Moonwalk now ships with two tiny Jekyll plugins that fix this. Browsers still see your site exactly as before. Agents get clean Markdown.
What ships
jekyll-markdown-output
For every post rendered at /foo, this plugin emits a sibling at /foo.md containing:
- a small YAML frontmatter block (title, date, url, summary, tags, category, author)
- the post’s source Markdown with Liquid rendered
No layouts, no nav, no theme toggle scripts. Just the content.
```
_site/
  foo.html   <- as before
  foo.md     <- new: clean Markdown
```
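As an illustration, the twin for a post at /foo might look roughly like this (the field names follow the frontmatter list above; every value here is a placeholder):

```markdown
---
title: "Some post title"
date: 2025-01-01
url: /foo
summary: "One-line summary of the post."
tags: [jekyll, plugins]
category: blog
author: Your Name
---

The post body, as plain Markdown with Liquid already rendered...
```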
An agent that lands on /foo can fetch /foo.md and skip HTML parsing entirely. It is the same pattern Anthropic, Stripe, and a growing set of agent-friendly sites use.
jekyll-llms-output
This one generates two files at the root of your site:
- /llms.txt - a curated index of your most important pages, following the llmstxt.org spec. Think of it as a sitemap for LLMs.
- /llms-full.txt - the full content of all configured collections, concatenated. One fetch and an agent has your entire blog.
If you drop a YAML data file at _data/llms.yml with a title, description, and sections, the plugin uses your hand-curated structure. Otherwise it auto-generates from your posts.
Why two plugins instead of one
They solve different problems and compose well.
The llms.txt file is a single discovery file. An agent that has never seen your site can fetch it and learn what is here. But it is one file, and it does not scale to deep linking. If a user shares a specific post URL with an agent, the agent still needs to fetch that page.
That is where Markdown siblings come in. Any post URL has a Markdown twin one extension away. Agents do not need to consult the index first.
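The extension-swap convention is trivial for client code to apply. A minimal sketch, assuming pretty permalinks where trailing slashes and .html are normalized away first (this helper is hypothetical, not part of either plugin):

```ruby
# Derive the Markdown twin of a rendered post URL by stripping
# any trailing slash or .html extension, then appending .md.
def markdown_twin(url)
  url.sub(%r{/\z}, "").sub(/\.html\z/, "") + ".md"
end

markdown_twin("https://example.com/foo")       # => "https://example.com/foo.md"
markdown_twin("https://example.com/foo/")      # => "https://example.com/foo.md"
markdown_twin("https://example.com/foo.html")  # => "https://example.com/foo.md"
```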
Ship both. They are complementary.
How to enable them
Both ship as Moonwalk runtime dependencies. If you are on Moonwalk 0.2.0 or later, you only need to add them to your config:
```yaml
plugins:
  - jekyll-markdown-output
  - jekyll-llms-output

markdown_output:
  collections: [posts]

llms_output:
  index:
    collections: [posts]
  full:
    collections: [posts]
    respect_markdown_output: true
```
Build the site. You should see something like:
```
MarkdownOutput: wrote 9 markdown file(s)
LlmsOutput: wrote /llms.txt
LlmsOutput: wrote /llms-full.txt
```
That’s it.
Customizing what shows up
Skip a single post
Add markdown_output: false to a post’s frontmatter and no Markdown twin is emitted for it. Useful for drafts you want indexed by Google but not picked up by agents.
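For example, a post's frontmatter with the opt-out flag set (title and body are placeholders):

```yaml
---
title: "A draft I'm not ready to syndicate"
markdown_output: false   # no .md twin is emitted for this post
---
```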
Curate llms.txt
The auto-generated index is fine. A curated one is much better. Drop this in _data/llms.yml:
```yaml
title: Your Site
description: >
  One paragraph about who you are and what readers
  (human or otherwise) will find here.
sections:
  - heading: Featured Writing
    links:
      - title: My best post
        url: https://example.com/best-post
        description: One line on what it covers
  - heading: About
    links:
      - title: About me
        url: https://example.com/about
        description: Background and contact
```
The plugin uses this verbatim. You decide what an agent encounters first.
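Rendered in the llmstxt.org layout (an H1 title, a blockquote description, H2 sections with link lists), the data file above should come out roughly as:

```markdown
# Your Site

> One paragraph about who you are and what readers (human or otherwise) will find here.

## Featured Writing

- [My best post](https://example.com/best-post): One line on what it covers

## About

- [About me](https://example.com/about): Background and contact
```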
Respect Markdown twins in the bulk file
The default config sets respect_markdown_output: true. This tells the llms-full plugin to skip any post that has opted out via the `markdown_output: false` frontmatter flag. Without it, opted-out posts would still leak into the bulk file.
What about GitHub Pages
GitHub Pages restricts plugins to a whitelist, and these two are not on it. Your options:
- Build the site yourself in CI - Actions, Netlify, Cloudflare Pages, Vercel - and deploy the built site to GH Pages.
- Skip the plugins and serve HTML only.
Cloudflare Pages, Netlify, and Vercel run them without restriction. If you are on GH Pages and want this to work, the cleanest path is a small GitHub Actions workflow that builds and deploys.
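As a sketch, a workflow along these lines builds the site with your full plugin set and deploys it to Pages (action and Ruby versions here are illustrative; adjust to your setup):

```yaml
name: Build and deploy to GitHub Pages
on:
  push:
    branches: [main]

permissions:
  contents: read
  pages: write
  id-token: write

jobs:
  build:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - uses: ruby/setup-ruby@v1
        with:
          ruby-version: "3.3"
          bundler-cache: true   # installs gems, including both plugins
      - run: bundle exec jekyll build
      - uses: actions/upload-pages-artifact@v3
        with:
          path: _site
  deploy:
    needs: build
    runs-on: ubuntu-latest
    environment:
      name: github-pages
    steps:
      - uses: actions/deploy-pages@v4
```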
Should you bother
If your site is mostly for humans and you do not care whether an LLM cites it well, no. Move on.
If you write things you want agents to find, recommend, and quote accurately, this is a roughly five-minute change with a non-trivial payoff. Agents are reading your site whether you help them or not. The question is whether they read source Markdown or a wrestled-down approximation of your HTML.
Both plugins are open source and small enough to read in one sitting.
They were the smallest change I could think of that meaningfully improves how the agentic web reads my own writing. Now they are part of Moonwalk too.