July 18, 2023  SEONews

Robots.txt is not the answer: Proposing a new meta tag for LLM/AI – Search Engine Land

While Google is opening up the discussion on giving credit and adhering to copyright when training large language models (LLMs) for generative AI products, its focus is on the robots.txt file.

However, in my opinion, this is the wrong tool to look at.

My former colleague Pierre Far wrote an excellent article, "Crawlers, search engines and the sleaze of generative AI companies," in which he highlighted some of the immense challenges currently facing the online publishing industry. As in his article, I will keep this proposal high-level, since developments in this field are extremely fast-paced.

Why not use robots.txt

There are a few reasons why using robots.txt is the wrong starting point for the discussion on how to respect the copyright of publishers.

Not all LLMs use crawlers and identify themselves

The burden is on the website operator to identify and block individual crawlers, which may use and/or sell their data for generative AI products. This creates a lot of extra (and…
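To illustrate why per-crawler blocking puts the burden on the operator, here is a minimal sketch using Python's standard `urllib.robotparser`. The crawler names (`ExampleAIBot`, `AnotherAIBot`, `UnlistedAIBot`), the `example.com` URL, and the `is_allowed` helper are hypothetical placeholders, not real bots or part of the original article. The sketch only shows that robots.txt rules are keyed by user agent, so a crawler the operator has not listed falls through to the default rule.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt. The crawler names are placeholders for illustration,
# not the user agents of any real AI product.
ROBOTS_TXT = """\
User-agent: ExampleAIBot
Disallow: /

User-agent: AnotherAIBot
Disallow: /

User-agent: *
Allow: /
"""


def is_allowed(user_agent: str, url: str = "https://example.com/article") -> bool:
    """Return True if the given user agent may fetch the URL under ROBOTS_TXT."""
    parser = RobotFileParser()
    parser.parse(ROBOTS_TXT.splitlines())
    return parser.can_fetch(user_agent, url)


if __name__ == "__main__":
    # The two listed crawlers are blocked; the unlisted one falls back to the
    # wildcard rule and is allowed, which is the gap described above.
    for agent in ("ExampleAIBot", "AnotherAIBot", "UnlistedAIBot"):
        print(agent, is_allowed(agent))
```

Each new crawler has to be added by name before it is blocked, and the file only has any effect on crawlers that identify themselves and choose to honor it.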

Read Full Story: https://news.google.com/rss/articles/CBMiQmh0dHBzOi8vc2VhcmNoZW5naW5lbGFuZC5jb20vcm9ib3RzLXR4dC1uZXctbWV0YS10YWctbGxtLWFpLTQyOTUxMNIBAA?oc=5

source: https://news.oneseocompany.com/2023/07/18/robotstxt-is-not-the-answer-proposing-a-new-meta-tag-for-llmai-search-engine-land_2023071847768.html
