Fastest way to detect external URLs


If you consider a URL external when its scheme, host, or port differs from the current page's, you could do something like this:

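function isExternal(url) {
    // Split the URL into scheme, authority (host[:port]), path, query and fragment.
    var match = url.match(/^([^:\/?#]+:)?(?:\/\/([^\/?#]*))?([^?#]+)?(\?[^#]*)?(#.*)?/);
    // A scheme that differs from the page's scheme makes the URL external.
    if (typeof match[1] === "string" && match[1].length > 0 && match[1].toLowerCase() !== location.protocol) return true;
    // Lowercase the authority (host names are case-insensitive), strip the scheme's default port, then compare.
    if (typeof match[2] === "string" && match[2].length > 0 && match[2].toLowerCase().replace(new RegExp(":("+{"http:":80,"https:":443}[location.protocol]+")?$"), "") !== location.host) return true;
    return false;
}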
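To see what the pattern in that function captures, here is a small sketch that names the five groups. The helper `splitUrl` and the constant `URL_PARTS` are made up for illustration and are not part of the answer.

```javascript
// Same URI-splitting pattern as in isExternal above.
const URL_PARTS = /^([^:\/?#]+:)?(?:\/\/([^\/?#]*))?([^?#]+)?(\?[^#]*)?(#.*)?/;

// Hypothetical helper: returns the capture groups under descriptive names.
// Groups that did not participate in the match come back as undefined.
function splitUrl(url) {
  const [, scheme, authority, path, query, fragment] = url.match(URL_PARTS);
  return { scheme, authority, path, query, fragment };
}

console.log(splitUrl("https://example.com:8443/a/b?x=1#top"));
console.log(splitUrl("/about"));
```

Every part of the pattern is optional, so `match` never returns `null`. For a relative input such as `/about`, only the path group is set, which is why `isExternal` falls through to `return false`.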


Update: I did some more research and found that using new URL is easily fast enough, and IMO it is the most straightforward way of doing this.

It is important to note that every method I've tried takes less than 1 ms to run, even on an old phone, so performance shouldn't be your primary consideration unless you are doing large batch processing. Use the regex version if performance is your top priority.

These are the three methods I tried:

new URL:

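// The second argument resolves relative URLs such as /about against the current page.
const isExternalURL = (url) => new URL(url, location.href).origin !== location.origin;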
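One caveat with the URL constructor: it throws a TypeError when it cannot parse its input, and with a single argument that includes relative inputs such as `/about`. Below is a minimal sketch of a more defensive variant. The name `isExternalURLSafe` and the `pageHref` parameter (a stand-in for `location.href`, so the code also runs outside a browser) are assumptions, as is the choice to treat unparseable input as internal.

```javascript
// Hypothetical helper: `pageHref` plays the role of location.href.
function isExternalURLSafe(url, pageHref) {
  let target;
  try {
    // Resolve relative inputs such as "/about" or "#top" against the page URL.
    target = new URL(url, pageHref);
  } catch (err) {
    // Assumption: input that cannot be parsed is treated as not external.
    return false;
  }
  return target.origin !== new URL(pageHref).origin;
}

const page = "https://example.com/docs/index.html";
console.log(isExternalURLSafe("/about", page));                     // false
console.log(isExternalURLSafe("https://example.com:443/x", page));  // false (default port)
console.log(isExternalURLSafe("http://example.com/x", page));       // true (different scheme)
console.log(isExternalURLSafe("mailto:someone@example.org", page)); // true
```

Comparing `origin` covers scheme, host, and port in one step, and the parser already normalizes casing and default ports.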

String.replace:

function isExternalReplace(url) {
  const domain = (url) => url.replace('http://','').replace('https://','').split('/')[0];
  return domain(location.href) !== domain(url);
}

Regex:

const isExternalRegex = (function(){
  const domainRe = /https?:\/\/((?:[\w\d-]+\.)+[\w\d]{2,})/i;
  return (url) => {
    const domain = (url) => domainRe.exec(url)[1];
    return domain(location.href) !== domain(url);
  }
})();
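Note that `domainRe.exec(url)` returns `null` when the input contains no http(s) host, so the regex version throws on inputs like `/about` or `#top`. A guarded sketch follows. The names `domainOf` and `isExternalRegexGuarded` and the `pageHref` parameter (standing in for `location.href`) are hypothetical, and counting host-less inputs as internal is an assumption.

```javascript
const domainRe = /https?:\/\/((?:[\w\d-]+\.)+[\w\d]{2,})/i;

// Hypothetical helper: returns the matched host in lowercase, or null if none.
const domainOf = (url) => {
  const match = domainRe.exec(url);
  return match ? match[1].toLowerCase() : null;
};

// `pageHref` stands in for location.href so the sketch runs outside a browser.
function isExternalRegexGuarded(url, pageHref) {
  const domain = domainOf(url);
  // Assumption: inputs without an http(s) host (relative paths, anchors) are internal.
  return domain !== null && domain !== domainOf(pageHref);
}

const page = "https://example.com/";
console.log(isExternalRegexGuarded("/about", page));                      // false
console.log(isExternalRegexGuarded("https://other.example.org/x", page)); // true
console.log(isExternalRegexGuarded("http://example.com/x", page));        // false
```

As the last call shows, this approach compares host names only, so a different scheme or port on the same host is not reported as external.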

Here are some basic tests I used to test performance: https://is-external-url-test.glitch.me/
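If you want to take similar measurements yourself, a rough timing harness could look like the sketch below. It is not the code behind the linked page: `timeIt`, the sample URLs, and the round count are all made up for illustration, and only the new URL variant is timed, with the page URL hard-coded so it runs outside a browser.

```javascript
// Hypothetical harness: calls `fn` on every URL, `rounds` times over,
// and returns the mean time per call in milliseconds.
function timeIt(fn, urls, rounds = 1000) {
  const start = performance.now();
  for (let i = 0; i < rounds; i++) {
    for (const url of urls) fn(url);
  }
  return (performance.now() - start) / (rounds * urls.length);
}

// Assumed page URL, standing in for location.href.
const page = "https://example.com/";
const pageOrigin = new URL(page).origin;
const viaURL = (url) => new URL(url, page).origin !== pageOrigin;

const sampleUrls = ["https://example.com/about", "https://other.example.org/", "/contact"];
const msPerCall = timeIt(viaURL, sampleUrls);
console.log(`new URL: ${msPerCall.toFixed(5)} ms per call`);
```

Numbers from a loop like this vary with the engine and device, so treat them as a rough comparison rather than a precise benchmark.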


I've been using psuedosavant's method, but ran into a few cases where it triggered false positives, such as domain-less links (/about, image.jpg) and anchor links (#about). The old method would also give inaccurate results for different protocols (http vs. https).

Here's my slightly modified version:

var checkDomain = function(url) {
  if ( url.indexOf('//') === 0 ) { url = location.protocol + url; }
  return url.toLowerCase().replace(/([a-z])?:\/\//,'$1').split('/')[0];
};

var isExternal = function(url) {
  return ( ( url.indexOf(':') > -1 || url.indexOf('//') > -1 ) && checkDomain(location.href) !== checkDomain(url) );
};

Here are some tests with the updated function:

isExternal('http://google.com'); // true
isExternal('https://google.com'); // true
isExternal('//google.com'); // true (no protocol)
isExternal('mailto:mail@example.com'); // true
isExternal('http://samedomain.com:8080/port'); // true (same domain, different port)
isExternal('https://samedomain.com/secure'); // true (same domain, https)
isExternal('http://samedomain.com/about'); // false (same domain, different page)
isExternal('HTTP://SAMEDOMAIN.COM/about'); // false (same domain, but different casing)
isExternal('//samedomain.com/about'); // false (same domain, no protocol)
isExternal('/about'); // false
isExternal('image.jpg'); // false
isExternal('#anchor'); // false
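To run these cases outside a browser, one option is to pass the page location in explicitly. In the sketch below, the `loc` parameter and the stand-in location object (a page served from http://samedomain.com) are assumptions added for illustration. The logic is otherwise the same as in the functions above.

```javascript
// Same logic as above, with the page location passed in as `loc`
// (a hypothetical parameter) instead of read from window.location.
function checkDomain(url, loc) {
  if (url.indexOf('//') === 0) { url = loc.protocol + url; }
  return url.toLowerCase().replace(/([a-z])?:\/\//, '$1').split('/')[0];
}

function isExternal(url, loc) {
  return (url.indexOf(':') > -1 || url.indexOf('//') > -1) &&
    checkDomain(loc.href, loc) !== checkDomain(url, loc);
}

// Stand-in for window.location on a page served from http://samedomain.com.
const loc = { protocol: 'http:', href: 'http://samedomain.com/index.html' };

// [input, expected result], mirroring the list above.
const cases = [
  ['http://google.com', true],
  ['https://google.com', true],
  ['//google.com', true],
  ['mailto:mail@example.com', true],
  ['http://samedomain.com:8080/port', true],
  ['https://samedomain.com/secure', true],
  ['http://samedomain.com/about', false],
  ['HTTP://SAMEDOMAIN.COM/about', false],
  ['//samedomain.com/about', false],
  ['/about', false],
  ['image.jpg', false],
  ['#anchor', false],
];

// Collect any case whose result differs from the expectation.
const failures = cases.filter(([url, expected]) => isExternal(url, loc) !== expected);
console.log(failures.length === 0 ? 'all cases pass' : failures);
```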

It's more accurate overall, and it even ends up being marginally faster, according to some basic jsperf tests. If you don't need case-insensitive matching, you can leave off the .toLowerCase() call to speed it up even more.