There are laws against unauthorized computer access.
This is a scenario where you have a server explicitly saying "Stop! You are not permitted to access this computer!", and yet you persist in circumventing that by hiding your identity and accessing it anyway. Those are some murky waters.
For those who are interested in the specifics, Jamie Williams wrote a piece for the EFF[0] in the wake of hiQ v. LinkedIn, which dealt with this exact question.
It depends on who the server operator is. If it's your own server, sure, anyone you don't want there should go away. If it's your enemy's server, the argument that they're sending that page to the rest of the Internet turns out to be a decent one.
The server says nothing of the kind. The response that was previously positive is now broken, and it happens to be fixed if you access it from a different IP.
Maybe we need a status code that means ‘lay off all the requests made from this entire system’?
> Although the HTTP standard specifies "unauthorized", semantically this response means "unauthenticated". That is, the client must authenticate itself to get the requested response.
So it would seem that it actually doesn't positively imply that you're NOT authorized.
Which kind of makes sense; machines can't detect the legality of things, just that certain procedural niceties haven't been observed.
> The client does not have access rights to the content; that is, it is unauthorized, so the server is refusing to give the requested resource.
Machines don't have any legal responsibility; bot operators do. Which is why respecting these things is sort of important. At any rate, 40x does not mean "try again with a different user agent and another IP".
403 is per request, not requester. I get random 403s when just browsing some websites. Does that mean I should close the browser and not hit refresh for fear of breaking some wire fraud unauthorized access law?
If you go by the semantics of the 403 code, absolutely: that's exactly what the status code means.
In practice there's of course nuance, like anyone will occasionally type in the wrong password on a log-in screen, maybe try again and then realize it was the wrong log-in prompt. That's mostly fine.
That's different from deliberately trying to circumvent a measure like this. If you are doing the stuff in the link, you are absolutely crossing a line and you know it.
There's a large difference between "I got a 403 so I hit F5 once" and "I got a 403 so I used a residential proxy and spoofed my user-agent".
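To make that distinction concrete, here is a rough sketch of how a well-behaved client might react to these codes. It is only one reading of the quoted definitions, not a legal standard: the function name `next_action`, the action labels, and the "one retry" limit are all made up for illustration, and the one thing it never does is change IP or user agent.

```python
from http import HTTPStatus


def next_action(status: int, retries_so_far: int, max_retries: int = 1) -> str:
    """Decide what a polite client does next, without ever changing identity.

    All names and labels here are hypothetical; max_retries=1 mirrors the
    "I hit F5 once" case and is an assumption, not a rule from any standard.
    """
    if 200 <= status < 300:
        return "ok"
    if status == HTTPStatus.UNAUTHORIZED:
        # 401: the server is asking the client to authenticate.
        return "authenticate"
    if status == HTTPStatus.FORBIDDEN:
        # 403: the server refuses. A single retry with the same IP and
        # user agent covers a transient glitch; after that, stop.
        if retries_so_far < max_retries:
            return "retry_same_identity"
        return "stop"
    # Anything else is left to the caller; this sketch simply stops.
    return "stop"


if __name__ == "__main__":
    for status, retries in [(200, 0), (401, 0), (403, 0), (403, 1)]:
        print(status, retries, next_action(status, retries))
```

There is deliberately no branch that returns something like "rotate proxy": under this reading, a persistent 403 ends the session rather than starting a workaround.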
> On the contrary, there are no laws that say you can't scrape a site.
You are both wrong: copyright law says both that you can't (in some cases, for some uses) and that you can (under implicit license, fair use, and other rules, in others).
In that case, the data compilation itself would be protected, not the individual data points. If I used a scraper to copy everything verbatim, then yes, it would be a violation.