I'm working on a project that involves gathering a lot of data from a website for later analysis. It's a crank conspiracy site, most likely populated by boomers, but I'm still wary of getting caught and having my IP blocked, as I can't use a VPN. Ideally I want to visit the site every 10 minutes or so. Is there anything I can do, other than varying the re-visit time in the script, to avoid detection? Also, how likely is detection in the first place? Is that something site admins are likely to check?
I'm making HTTP requests in Python, scheduled via cron on a remote server. I'll check out the links; they look like they might be useful.
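
For reference, this is roughly what I mean by varying the re-visit time. Cron fires on a fixed schedule, so the script sleeps for a random offset before making the request. It's only a sketch: the URL, the two-minute jitter window, and the function names are placeholders I made up, and it uses the standard library rather than any particular HTTP package.

```python
import random
import time
import urllib.request

# Placeholder values, not the real target or a recommended setting.
TARGET_URL = "https://example.com/"
MAX_JITTER_SECONDS = 120


def pick_delay(max_jitter=MAX_JITTER_SECONDS, rng=random):
    """Return a random offset in seconds between 0 and max_jitter."""
    return rng.uniform(0, max_jitter)


def fetch(url, timeout=30):
    """Fetch the page and return the raw response body as bytes."""
    with urllib.request.urlopen(url, timeout=timeout) as response:
        return response.read()


def main():
    # Cron starts the script every 10 minutes; the sleep shifts the
    # actual request time so it does not land on the same second each run.
    time.sleep(pick_delay())
    body = fetch(TARGET_URL)
    print(f"fetched {len(body)} bytes")


# main()  # uncomment when running the script from cron
```

The matching crontab entry would be something like `*/10 * * * * python3 /path/to/script.py`, with the path being whatever the script is actually saved as.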