How does Chrome's address bar determine whether the input is a URL or a search string?


The definitive answer is in the Chromium source, as funroll mentioned, but here's the basic idea of what's going on, at least according to my testing.

A string entered into the omnibox is treated as a URL if it follows this format:

[protocol][subdomains].[subdomains].[domain name].[tld]

Here the subdomains (which are optional, of course) and the domain name may contain only letters (for Chrome, this seems to include accented letters), numbers, and hyphens; a space inside the name, as in "stack overflow.com", turns the input into a search. The TLD (top-level domain) must come from an approved list (.com, .net, etc.) unless a protocol is specified, in which case any TLD is treated as valid. Protocols also come from a set list, but the separator is flexible: a colon followed by any number of slashes, including none. If the protocol is not on that list, the entire string is treated as a search instead.

If a slash follows a string in the above format (e.g., stackoverflow.com/), anything after the slash is accepted.

Alternatively, if the string starts with a slash, Chrome also treats it as a URL (using the file:// protocol).
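
The rules above can be sketched in Python roughly as follows. This is only a model of the behaviour I observed, not Chromium's actual logic: `looks_like_url`, `KNOWN_SCHEMES` and `KNOWN_TLDS` are made-up names, and the two sets are tiny stand-ins for whatever lists Chrome really uses.

```python
import re

# Assumed, deliberately tiny stand-ins for Chrome's real lists.
KNOWN_SCHEMES = {"http", "https", "ftp"}
KNOWN_TLDS = {"com", "net", "org"}

# A protocol prefix: a scheme name, a colon, then any number of slashes.
SCHEME = re.compile(r"([A-Za-z][A-Za-z0-9+-]*):/*")

# One host label: letters (accented ones too), digits and hyphens only.
LABEL = re.compile(r"(?:[^\W_]|-)+")


def looks_like_url(text: str) -> bool:
    """Return True if `text` fits the observed URL format, else False (search)."""
    text = text.strip()
    if not text:
        return False
    if text.startswith("/"):
        return True  # treated as a file:// path

    has_scheme = False
    match = SCHEME.match(text)
    if match:
        if match.group(1).lower() not in KNOWN_SCHEMES:
            return False  # unknown protocol: the whole input is a search
        has_scheme = True
        text = text[match.end():]

    # Everything after the first slash is accepted as-is.
    host = text.partition("/")[0]
    labels = host.split(".")
    if len(labels) < 2:
        return False
    if not all(LABEL.fullmatch(label) for label in labels):
        return False
    # The TLD is only checked when no protocol was given.
    if not has_scheme and labels[-1].lower() not in KNOWN_TLDS:
        return False
    return True


print(looks_like_url("http:stackoverflow.mynewtld"))  # True
print(looks_like_url("stackoverflow.mynewtld"))       # False
```

The sketch ignores things I did not test, such as ports, IP addresses and bare host names like localhost.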


Examples of valid URLs (according to Chrome):

  • stackoverflow.com
  • abc.stackoverflow.com
  • abc.abc.abc.abc.stackoverflow.com
  • stáckoverflow.com (this changes the URL, but is allowed—try it!)
  • stack-overflow.com
  • -stackoverflow.com (might not even be a legal domain name, but it works)
  • 4stackoverflow.com
  • stackoverflow.com/not valid characters !@#$^æ
  • [http]://stackoverflow.com (the brackets aren't legal, but I can't include the link otherwise)
  • [http]:////stackoverflow.com
  • [http]:stackoverflow.com
  • [http]:stackoverflow.mynewtld

Examples of invalid URLs:

  • stack overflow.com
  • stackoverflow*.com
  • stack/overflow.com
  • stackoverflow.mynewtld

And, well, just about everything else.


Let's just hope there's a library out there somewhere to do all this instead.


    // Returns YES if `string` contains `character`.
    - (BOOL)doesString:(NSString *)string containCharacter:(char)character
    {
        NSString *needle = [NSString stringWithFormat:@"%c", character];
        return [string rangeOfString:needle].location != NSNotFound;
    }

    - (void)openURL:(NSString *)urlString
    {
        urlString = [urlString stringByTrimmingCharactersInSet:[NSCharacterSet whitespaceCharacterSet]];
        if ([self doesString:urlString containCharacter:'.'])
        {
            // Contains a dot: treat it as a URL and add a scheme if it lacks one.
            if ([urlString rangeOfString:@"http"].location != 0)
            {
                urlString = [@"http://" stringByAppendingString:urlString];
            }
        }
        else
        {
            // No dot: treat it as a search query.
            // GOOGLE_CODE is defined elsewhere (presumably the search URL prefix).
            urlString = [GOOGLE_CODE stringByAppendingString:urlString];
        }
        urlString = [urlString stringByAddingPercentEscapesUsingEncoding:NSUTF8StringEncoding];
        // ... the rest of the method (loading urlString) is not shown
    }

I couldn't find Chrome's code for the address bar, so I ended up using this code, which has a small bug.
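
For comparison, here is roughly the same dot-based heuristic as a Python sketch. `resolve_input` and `SEARCH_PREFIX` are names invented for this example, and the Google search URL is only a guess at what the undefined `GOOGLE_CODE` constant holds.

```python
from urllib.parse import quote

# Hypothetical stand-in for the undefined GOOGLE_CODE constant.
SEARCH_PREFIX = "https://www.google.com/search?q="

# Characters left unescaped so the URL structure survives percent-encoding.
URL_SAFE = ":/?#[]@!$&'()*+,;=%"


def resolve_input(text: str) -> str:
    """Turn address-bar input into a URL to load, using the dot heuristic."""
    text = text.strip()
    if "." in text:
        # Contains a dot: assume a URL and add a scheme if it lacks one.
        if not text.startswith("http"):
            text = "http://" + text
    else:
        # No dot: assume a search query.
        text = SEARCH_PREFIX + text
    return quote(text, safe=URL_SAFE)


print(resolve_input("stackoverflow.com"))  # http://stackoverflow.com
print(resolve_input("chrome omnibox"))     # https://www.google.com/search?q=chrome%20omnibox
print(resolve_input("what is 2.5"))        # http://what%20is%202.5
```

The last call shows one way the heuristic misfires: any query that happens to contain a dot is treated as a URL.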


In response to username tbd's post

Note: RFC 952/1123 specify that domain names cannot start with a hyphen (-), although a hyphen is a valid interior character, so this regex has been modified to comply with that.

Edit: Updated to comply with RFC 3986

Here's a regular expression that checks for URLs according to username tbd's observations. Some invalid URLs will still be flagged as valid. The regex is in Python flavour, so if you're using JavaScript or PHP, make sure to escape the /'s.

((http|https|file)://)?([a-z0-9][a-z0-9\-_~\/:\?#\[\]@!$&\'\(\)\*+,;=]*)(\.[a-z0-9\-_~\/:\?#\[\]@!$&\'\(\)\*+,;=]+)+
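
A quick way to try the pattern in Python is shown below. `URL_RE` and `is_url` are names made up for this example, and using `fullmatch` (so that the whole input has to match) is my assumption about how the pattern is meant to be applied.

```python
import re

# The pattern from above, split across lines for readability only.
URL_RE = re.compile(
    r"((http|https|file)://)?"
    r"([a-z0-9][a-z0-9\-_~\/:\?#\[\]@!$&\'\(\)\*+,;=]*)"
    r"(\.[a-z0-9\-_~\/:\?#\[\]@!$&\'\(\)\*+,;=]+)+"
)


def is_url(text: str) -> bool:
    """Return True if the whole string matches the pattern."""
    return URL_RE.fullmatch(text) is not None


for candidate in [
    "stackoverflow.com",
    "http://stackoverflow.com",
    "-stackoverflow.com",      # rejected: leading hyphen
    "stack overflow.com",      # rejected: contains a space
    "stackoverflow.mynewtld",  # accepted: the TLD is not checked
    "stackoverflow*.com",      # accepted: * is in the allowed set
]:
    print(candidate, is_url(candidate))
```

The last two inputs are the kind of invalid URL that still gets flagged as valid. The pattern is also lowercase-only as written, so uppercase input needs `re.IGNORECASE` or lowercasing first.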