Why can't a malicious site obtain a CSRF token via GET before attacking?
Your description is correct.
If site A tells your browser to go to B and get the token, that's fine, but because it is a cross-origin request, A's JavaScript cannot read the response and so never sees the token (this is the browser's same-origin policy). So when A tells your browser to go back to B and actually do something, it still cannot include the token in the request.
That is, unless B checks the token only as a cookie. That would be flawed, because the browser sends the token cookie automatically with the forged request, negating any protection. So the token must be sent as either a form value or a request header (or something else that, unlike a cookie, is not sent automatically).
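To make the point about form values and headers concrete, here is a minimal server-side sketch in Python. The names `issue_token`, `is_request_allowed`, the `csrf_token` field and the `X-CSRF-Token` header are assumptions chosen for illustration, and the plain dicts stand in for whatever session and request objects a real framework provides.

```python
import hmac
import secrets


def issue_token(session: dict) -> str:
    """Create a random token and remember it in the server-side session."""
    token = secrets.token_urlsafe(32)
    session["csrf_token"] = token  # hypothetical session key
    return token


def is_request_allowed(session: dict, form: dict, headers: dict) -> bool:
    """Accept a state-changing request only if it explicitly carries the token."""
    expected = session.get("csrf_token")
    # Read the token from a form field or a custom header, never from a cookie:
    # cookies are attached automatically, so they prove nothing about the sender.
    supplied = form.get("csrf_token") or headers.get("X-CSRF-Token")
    if not expected or not supplied:
        return False
    # Constant-time comparison; bytes avoid errors on non-ASCII input.
    return hmac.compare_digest(expected.encode(), supplied.encode())
```

Site A can make the browser send B's session cookie, but it has no way to fill in the form field or header, so the check fails for a forged request.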
This also means that if B is vulnerable to cross-site scripting, it is also vulnerable to CSRF, because the token can then be stolen, but CSRF is the smaller problem then. :)
Correct.
Site A can't get site B's CSRF token because of the browser's same-origin policy (which B would have to relax explicitly via CORS for A to read the response).
We should also validate the request's Referer header, although it can be forged or omitted, so treat it only as an additional check: https://en.wikipedia.org/wiki/HTTP_referer
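A Referer check could look roughly like the sketch below. The host name `b.example`, the `ALLOWED_HOSTS` set and the function name are hypothetical, and rejecting requests with no Referer at all is a policy choice assumed here, not something the header itself requires.

```python
from urllib.parse import urlsplit

# Hypothetical: the hosts that are allowed to submit forms to site B.
ALLOWED_HOSTS = {"b.example"}


def referer_is_trusted(headers: dict) -> bool:
    """Return True only if the Referer header names one of our own hosts."""
    referer = headers.get("Referer")
    if not referer:
        return False  # assumed policy: a missing Referer is rejected
    try:
        host = urlsplit(referer).hostname
    except ValueError:
        return False  # malformed URL
    # Compare the exact host; a substring test would accept b.example.a.example.
    return host in ALLOWED_HOSTS
```

This complements the token check rather than replacing it.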
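A CSRF token can also be validated when it is passed in the URL (a.k.a. the query string), though a token in the URL can leak through logs, browser history and the Referer header, so a form field or request header is usually preferable.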
FYI, Laravel, a popular web framework, uses a hidden CSRF token field in the form to prevent CSRF attacks.
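As a rough illustration of the hidden-field approach, the Python sketch below renders a form with the token embedded. It is not Laravel's code: the function `render_form` is hypothetical, and only the field name `_token` is borrowed from Laravel's convention.

```python
import html


def render_form(action: str, csrf_token: str) -> str:
    """Build an HTML form that carries the CSRF token in a hidden input."""
    safe_action = html.escape(action, quote=True)
    safe_token = html.escape(csrf_token, quote=True)
    return (
        f'<form method="post" action="{safe_action}">\n'
        # The browser submits this field only from a page that B itself served.
        f'  <input type="hidden" name="_token" value="{safe_token}">\n'
        '  <button type="submit">Save</button>\n'
        '</form>'
    )
```

On submission the server compares the `_token` value with the one stored in the user's session and rejects the request if they differ.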