Crawlers, search engines and the sleaze of generative AI companies – Search Engine Land
The boom of generative AI products over the past few months has prompted many websites to take countermeasures.
The basic concern goes like this:
AI products depend on consuming large volumes of content to train their language models (the so-called large language models, or LLMs for short), and this content has to come from somewhere. AI companies see the openness of the web as permitting large-scale crawling to obtain training data, but some website operators disagree, including Reddit, Stack Overflow and Twitter.
The answer to this interesting question will no doubt be litigated in courts around the world.
This article will explore this question, focusing on the business and technical aspects. But before we dive in, a few points:
- Although this topic touches on some legal arguments, and I include them in this article, I am not a lawyer, I am not your lawyer, and I am not giving you advice of any sort. Talk to your favorite lawyer cat if you need legal advice.
- I used to work at…
The post Crawlers, search engines and the sleaze of generative AI companies – Search Engine Land first appeared on SEO, Marketing and Social News | OneSEOCompany.com.
source: https://news.oneseocompany.com/2023/07/13/crawlers-search-engines-and-the-sleaze-of-generative-ai-companies-search-engine-land_2023071347606.html