Some context on this: https://arstechnica.com/information-technology/2023/08/openai-details-how-to-keep-chatgpt-from-gobbling-up-website-data/

The robots.txt would be updated with this entry:

User-agent: GPTBot
Disallow: /

Obviously this is meaningless against non-OpenAI scrapers, or any crawler that just doesn't give a shit about robots.txt.
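
For anyone wondering what that entry actually does for a crawler that plays by the rules, here is a minimal sketch using Python's standard urllib.robotparser. The helper name is_allowed and the example.com URL are made up for illustration, and the check uses the bare product token "GPTBot" rather than a full User-Agent header. It only models a well-behaved crawler; nothing here enforces anything.

```python
from urllib.robotparser import RobotFileParser

# The entry proposed above, as it would appear in robots.txt.
ROBOTS_TXT = """\
User-agent: GPTBot
Disallow: /
"""


def is_allowed(user_agent: str, url: str, robots_txt: str = ROBOTS_TXT) -> bool:
    """Return True if a compliant crawler with this user agent may fetch url.

    Hypothetical helper: it parses the rules from a string instead of
    fetching them over the network.
    """
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


if __name__ == "__main__":
    # example.com is a placeholder host, not a real instance.
    print(is_allowed("GPTBot", "https://example.com/post/1"))
    print(is_allowed("SomeOtherBot", "https://example.com/post/1"))
```

A crawler that never reads robots.txt never runs a check like this, which is the whole problem.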

  • 7heo@lemmy.ml · 11 months ago

    That won't stop OpenAI. We need actual blocking, on the server side. Problem is, with federation and all, it will be really, really difficult to do. And expensive.
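
On the server-side blocking mentioned in the comment above, a rough sketch of the simplest version: a WSGI middleware that refuses requests whose User-Agent header contains a blocked token. This assumes a Python WSGI app purely for illustration (it is not how any particular instance software does it), and block_crawlers, demo_app and the token list are hypothetical names. It only stops crawlers that identify themselves honestly, since the header is trivial to spoof, and it does nothing about copies of the content on other federated instances.

```python
# Assumption: tokens to match, lowercase, against the User-Agent header.
BLOCKED_AGENT_TOKENS = ("gptbot",)


def block_crawlers(app, blocked=BLOCKED_AGENT_TOKENS):
    """Wrap a WSGI app and answer 403 to requests from blocked user agents."""

    def middleware(environ, start_response):
        user_agent = environ.get("HTTP_USER_AGENT", "").lower()
        if any(token in user_agent for token in blocked):
            body = b"Forbidden\n"
            start_response(
                "403 Forbidden",
                [("Content-Type", "text/plain"),
                 ("Content-Length", str(len(body)))],
            )
            return [body]
        # Everyone else is passed through to the wrapped app unchanged.
        return app(environ, start_response)

    return middleware


def demo_app(environ, start_response):
    """Stand-in application used only to exercise the middleware."""
    body = b"hello\n"
    start_response(
        "200 OK",
        [("Content-Type", "text/plain"), ("Content-Length", str(len(body)))],
    )
    return [body]


if __name__ == "__main__":
    from wsgiref.simple_server import make_server

    # Serve the wrapped demo app locally on port 8000.
    with make_server("127.0.0.1", 8000, block_crawlers(demo_app)) as server:
        server.serve_forever()
```

Matching on the header is cheap. The difficult and expensive part is everything past that, such as catching scrapers that lie about who they are.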