Serving 404 directly
First of all, as has already been pointed out, doing a 301 redirect from a non-existent page to a single /notfound moniker is really bad practice, and is likely against the RFCs.
What if the user simply mistyped a single character of a long URL? Modern browsers make it non-trivial to go back to what was typed in order to correct it. The user would have to decide whether your site is worth retyping the URL from scratch, or whether your competitor might have thought of a better experience.
What if the user simply followed a broken link that is broken in a very obvious way and could easily be fixed? E.g., http://www.example.org/www.example.com/page, where the creator mistyped an absolute URL as a relative one, or maybe a URI like /page.html. with an extra dot at the end. Likewise, you'll totally confuse the user about what's going on and offer a terrible user experience, whereas, left alone, the URL could easily have been corrected promptly.
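To illustrate how the first kind of broken link arises, here is a minimal sketch using Python's standard urllib.parse.urljoin. The base page URL is an assumption chosen to match the example above; the point is only that a scheme-less href is resolved as a relative path.

```python
from urllib.parse import urljoin

# Page on which the broken link appears (assumed for illustration).
base = "http://www.example.org/"

# The author meant an absolute URL but left off the scheme,
# so the href is treated as a relative path.
href = "www.example.com/page"

# Resolve the href the same way a browser would resolve a relative link.
resolved = urljoin(base, href)
print(resolved)
```

A user who sees the resulting URL in the address bar can spot and fix the mistake, but only if the server leaves that URL in place.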
But, more importantly, what real problem are you actually trying to solve?!
For better or worse, it's a pretty common practice to indiscriminately redirect from the http to the https scheme, without regard to whether a given page may or may not exist. In fact, if you employ HSTS, then content served over http effectively becomes meaningless; a browser that has the policy would never even request anything over http from there on out.
Undoubtedly, in order to know whether or not a given page exists, you must consult the backend. As such, you might as well do the redirect from http to https from within your backend; but it'll likely tie up your valuable server resources for little to no extra benefit.
Moreover, the presence or absence of the page may be dictated by the contents of the cookies. As such, if you require that your backend discern whether a page does or does not exist for an http request, then you'll effectively be leaking private information that was meant to be protected by https in the first place. (In turn, if your site has no such private information, then maybe you shouldn't be using https in the first place.)
So, overall, the whole approach is just a REALLY, REALLY bad idea!
Consider instead:
- Do NOT do a 301 redirect from all non-existent pages to a single /notfound page. Very bad practice, very bad UX.
- It is totally OK to do an indiscriminate redirect from http to https, without accounting for whether or not the page exists. In fact, it's not only okay, but it's the way God intended, because an adversary should not be capable of discerning whether or not a given page exists for an https-based site, so, if you do find and implement a solution for your "problem", then you'll effectively create a security vulnerability and a data leak.
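The two recommendations above could be sketched roughly as follows. This is a toy model rather than real server configuration: the respond function, the EXISTING_PATHS set, and the example host and paths are all hypothetical names introduced for illustration.

```python
from typing import Dict, Set, Tuple

# Hypothetical set of paths that exist on the site.
EXISTING_PATHS: Set[str] = {"/", "/about"}


def respond(scheme: str, host: str, path: str) -> Tuple[int, Dict[str, str]]:
    """Return (status, headers) for a request to scheme://host/path."""
    if scheme == "http":
        # Redirect indiscriminately: same host and path, no existence check,
        # so nothing about the site's contents is revealed over plain http.
        return 301, {"Location": f"https://{host}{path}"}
    if path in EXISTING_PATHS:
        return 200, {}
    # Serve the 404 at the requested URL; no Location header is sent,
    # so the address bar keeps what the user typed.
    return 404, {}


print(respond("http", "example.org", "/abuot"))
print(respond("https", "example.org", "/abuot"))
```

Note that the http branch never consults EXISTING_PATHS; only the https branch decides between 200 and 404.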
Use the https://www.drupal.org/project/fast_404 module to serve 404 pages directly without much overhead.
I'd suggest redirecting to a 404 page is a poor choice, and you should instead serve the 404 on the incorrect URL.
My reasons for stating this are:
- By redirecting away from the page, you are issuing headers that implicitly say "The content does not exist on this URL, but it does over here". I'm not sure how the various search engines would react to being redirected to a 404.
- I can speak from my own experience as a user when I say that having the URL change on me when I've mis-typed by a single character can be very frustrating. I then need to spend the time to type out the entire URL again.
- You can avoid having logic in your .htaccess file (or wherever) to judge whether a page is a 404. This will greatly simplify your initial logic (which, by the by, gets computed on every single page load) and will remove far more redirects than just the odd chain of http://badurl to https://badurl to https://404
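As a rough illustration of the redirect chain mentioned in the last point, the sketch below follows both policies for a mistyped URL and records every URL visited. The policy functions, the example.org host, and the /notfound path are assumptions for the example, and both policies treat every requested page as missing to keep the model small.

```python
from typing import Callable, List, Optional, Tuple

Response = Tuple[int, Optional[str]]  # (status, redirect target or None)


def direct_404(url: str) -> Response:
    """Upgrade http to https, then answer 404 at the requested URL."""
    if url.startswith("http://"):
        return 301, "https://" + url[len("http://"):]
    return 404, None


def redirect_to_notfound(url: str) -> Response:
    """Upgrade http to https, then bounce to a single /notfound URL."""
    if url.startswith("http://"):
        return 301, "https://" + url[len("http://"):]
    if url.endswith("/notfound"):
        return 404, None
    return 301, "https://example.org/notfound"


def follow(url: str, policy: Callable[[str], Response]) -> List[str]:
    """Return every URL visited; the last one is what the address bar shows."""
    visited = [url]
    status, target = policy(url)
    while status == 301 and target is not None:
        visited.append(target)
        status, target = policy(target)
    return visited


start = "http://example.org/abuot"
print(follow(start, direct_404))
print(follow(start, redirect_to_notfound))
```

With the direct policy the chain ends on the URL the user typed (upgraded to https); with the /notfound policy there is an extra hop and the typed URL is gone from the address bar.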