Hi, I’m building a personal website and I don’t want it to be used to train AI. In my robots.txt
file I blocked:
- ChatGPT-User
- GPTBot
- Google-Extended
- FacebookBot
Which other bots should I add? Are there other ways to block AI bots?
IMPORTANT: I don’t want to block search engine crawlers, only bots that are used to train AI.
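For reference, here is a rough sketch of what the rules above might look like, plus a quick local check that they block the listed bots without touching search crawlers. The robots.txt text is my reading of the list above, not a copy of the actual file. `is_allowed`, the example.com URL, and the use of "Googlebot" as a stand-in for a search crawler are assumptions for illustration. Keep in mind robots.txt is only advisory, so this checks the rules themselves, not whether a given crawler obeys them.

```python
from urllib.robotparser import RobotFileParser

# Assumed robots.txt content: one group per bot named in the post,
# each disallowing the whole site. No wildcard group, so every other
# crawler stays allowed by default.
ROBOTS_TXT = """\
User-agent: ChatGPT-User
Disallow: /

User-agent: GPTBot
Disallow: /

User-agent: Google-Extended
Disallow: /

User-agent: FacebookBot
Disallow: /
"""


def is_allowed(user_agent: str, url: str = "https://example.com/") -> bool:
    """Return True if the rules above let `user_agent` fetch `url`.

    Hypothetical helper; the default URL is a placeholder.
    """
    parser = RobotFileParser()
    parser.parse(ROBOTS_TXT.splitlines())
    return parser.can_fetch(user_agent, url)


if __name__ == "__main__":
    for bot in ("ChatGPT-User", "GPTBot", "Google-Extended", "FacebookBot", "Googlebot"):
        print(bot, "allowed" if is_allowed(bot) else "blocked")
```

Any bot added later can be checked the same way by appending another group to the string and calling `is_allowed` with its user-agent name.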
I get that argument. Perhaps the fact that I’m a professor influences my thinking. And, since we are in a privacy community, something like ChatGPT and privacy don’t mix.
Meredith Whittaker (Signal) says[1]:
(I do keep an eye on their progress because it is interesting: https://benchmarks.llmonitor.com/)
[1] https://time.com/collection/time100-ai/6309018/meredith-whittaker/
Agreed that privacy can be a concern. Ideally it will be possible to run LLMs locally in the near future, but we’ll see.