Google On Serving Markdown Pages To LLM Crawlers

Google's John Mueller responded to a question about the pros and cons of serving raw Markdown pages to LLM crawlers and bots. He did not say much, but he did list several concerns to stay on top of if you go down that route.

Markdown is a lightweight markup language that formats plain text with special characters, and it is often used for technical documents. A Markdown parser converts Markdown files into HTML, which browsers can then display to readers.
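To make that conversion step concrete, here is a minimal sketch in Python. It is not a real Markdown parser: the function name `tiny_markdown_to_html` is made up for illustration, and it only handles `#` headings, `[text](url)` links, and plain paragraphs, assuming each heading sits in its own block separated by blank lines.

```python
import html
import re


def tiny_markdown_to_html(text: str) -> str:
    """Convert a very small Markdown subset to HTML (illustrative only)."""
    out = []
    # Blocks are separated by one or more blank lines.
    for block in re.split(r"\n\s*\n", text.strip()):
        # Escape &, < and > so raw text cannot inject HTML.
        block = html.escape(block.strip(), quote=False)
        # Inline links: [label](url) becomes <a href="url">label</a>.
        block = re.sub(r"\[([^\]]+)\]\(([^)\s]+)\)", r'<a href="\2">\1</a>', block)
        # Headings: one to six '#' characters followed by the heading text.
        heading = re.match(r"(#{1,6})\s+(.*)", block)
        if heading:
            level = len(heading.group(1))
            out.append(f"<h{level}>{heading.group(2)}</h{level}>")
        else:
            out.append(f"<p>{block}</p>")
    return "\n".join(out)


sample = "# Hello\n\nRead the [docs](https://example.com/docs)."
print(tiny_markdown_to_html(sample))
```

The point is only that a Markdown file is not something a browser renders directly. Something has to turn it into HTML first.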

The question posted on Reddit was, "What is the actual risk/reward impact of serving raw Markdown to LLM bots?"

John replied with these concerns:

  • Are you sure they can even recognize MD on a website as anything other than a text file?
  • Can they parse & follow the links?
  • What will happen to your site's internal linking, header, footer, sidebar, navigation?
  • It's one thing to give it a MD file manually, it seems very different to serve it a text file when they're looking for a HTML page.
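The link question can be illustrated with a small Python sketch. It says nothing about how any particular crawler actually works. It only shows that a client expecting HTML and collecting `<a href>` tags finds no links in a raw Markdown body, while a client that knows to look for Markdown link syntax does. The names `HrefCollector`, `links_seen_as_html`, and `links_seen_as_markdown`, and the sample pages, are hypothetical.

```python
import re
from html.parser import HTMLParser


class HrefCollector(HTMLParser):
    """Collects href values from <a> tags, as an HTML-oriented client might."""

    def __init__(self):
        super().__init__()
        self.hrefs = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.hrefs.append(value)


def links_seen_as_html(body: str) -> list[str]:
    # Treat the body as HTML and return every <a href="..."> target.
    collector = HrefCollector()
    collector.feed(body)
    collector.close()
    return collector.hrefs


def links_seen_as_markdown(body: str) -> list[str]:
    # Treat the body as Markdown and return every [label](url) target.
    return re.findall(r"\[[^\]]+\]\(([^)\s]+)\)", body)


markdown_page = "# Pricing\n\nSee the [plans](/plans) and [contact](/contact) pages."
html_page = (
    '<h1>Pricing</h1><p>See the <a href="/plans">plans</a> '
    'and <a href="/contact">contact</a> pages.</p>'
)

print(links_seen_as_html(markdown_page))      # []
print(links_seen_as_markdown(markdown_page))  # ['/plans', '/contact']
print(links_seen_as_html(html_page))          # ['/plans', '/contact']
```

Whether a given bot takes the first or the second path for a `.md` response is exactly what the questions above ask you to verify.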

John then wrote on Bluesky, "Converting pages to markdown is such a stupid idea. Did you know LLMs can read images? WHY NOT TURN YOUR WHOLE SITE INTO AN IMAGE?"

Keep these questions in mind if you are considering serving Markdown pages to LLM crawlers.

