Validate a hostname string
    import re

    def is_valid_hostname(hostname):
        if len(hostname) > 255:
            return False
        if hostname.endswith("."):
            hostname = hostname[:-1]  # strip exactly one dot from the right, if present
        # raw string, so that \d reaches the regex engine unchanged
        allowed = re.compile(r"(?!-)[A-Z\d-]{1,63}(?<!-)$", re.IGNORECASE)
        return all(allowed.match(x) for x in hostname.split("."))
This ensures that each segment
- contains at least one character and a maximum of 63 characters
- consists only of allowed characters
- doesn't begin or end with a hyphen.
It also avoids double negatives (`not disallowed`), and if `hostname` ends in a `.`, that's OK, too. It will (and should) fail if `hostname` ends in more than one dot.
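A quick way to check the rules above is to run the function against a handful of inputs. The following is only a sketch: the function is repeated (with a raw-string pattern) so that the snippet runs on its own, and the sample hostnames and the `samples` dictionary are made up for illustration.

```python
import re

def is_valid_hostname(hostname):
    if len(hostname) > 255:
        return False
    if hostname.endswith("."):
        hostname = hostname[:-1]  # strip exactly one trailing dot
    allowed = re.compile(r"(?!-)[A-Z\d-]{1,63}(?<!-)$", re.IGNORECASE)
    return all(allowed.match(x) for x in hostname.split("."))

# Hypothetical sample inputs mapped to the result each rule should produce.
samples = {
    "example.com": True,        # plain hostname
    "example.com.": True,       # a single trailing dot is accepted
    "example.com..": False,     # two trailing dots leave an empty segment
    "-example.com": False,      # segment begins with a hyphen
    "example-.com": False,      # segment ends with a hyphen
    "exa_mple.com": False,      # underscore is not an allowed character
    "a" * 64 + ".com": False,   # segment longer than 63 characters
}

for name, expected in samples.items():
    assert is_valid_hostname(name) == expected, name
```

The empty segment left behind by a second trailing dot fails the `{1,63}` quantifier, which is why `example.com..` is rejected without any special-casing.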
Here's a slightly stricter version of Tim Pietzcker's answer with the following improvements:
- Limit the length of the hostname to 253 characters (after stripping the optional trailing dot).
- Limit the character set to ASCII (i.e. use `[0-9]` instead of `\d`).
- Check that the TLD is not all-numeric.
    import re

    def is_valid_hostname(hostname):
        if hostname.endswith("."):
            # strip exactly one dot from the right, if present
            hostname = hostname[:-1]
        if len(hostname) > 253:
            return False

        labels = hostname.split(".")

        # the TLD must be not all-numeric
        if re.match(r"[0-9]+$", labels[-1]):
            return False

        allowed = re.compile(r"(?!-)[a-z0-9-]{1,63}(?<!-)$", re.IGNORECASE)
        return all(allowed.match(label) for label in labels)
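To see what the three improvements change in practice, here is a small sketch. The function is repeated so that the snippet is self-contained; the inputs (including the Arabic-Indic digits and the names `arabic_digits` and `label`) are illustrative choices, not part of the original answer.

```python
import re

def is_valid_hostname(hostname):
    if hostname.endswith("."):
        hostname = hostname[:-1]  # strip exactly one trailing dot
    if len(hostname) > 253:
        return False
    labels = hostname.split(".")
    if re.match(r"[0-9]+$", labels[-1]):  # the TLD must not be all-numeric
        return False
    allowed = re.compile(r"(?!-)[a-z0-9-]{1,63}(?<!-)$", re.IGNORECASE)
    return all(allowed.match(label) for label in labels)

# 1. ASCII digits only: in Python 3, \d on a str pattern also matches
#    non-ASCII decimal digits, while [0-9] does not.
arabic_digits = "\u0661\u0662\u0663"  # Arabic-Indic digits 1, 2, 3
assert re.match(r"\d+$", arabic_digits) is not None
assert re.match(r"[0-9]+$", arabic_digits) is None
assert not is_valid_hostname(arabic_digits + ".com")

# 2. An all-numeric TLD is rejected; numeric labels elsewhere are fine.
assert not is_valid_hostname("example.123")
assert is_valid_hostname("123.example")

# 3. The 253-character limit applies after the trailing dot is stripped.
label = "a" * 63
name_253 = ".".join([label, label, label, "a" * 61])  # 63*3 + 61 + 3 dots
name_254 = ".".join([label, label, label, "a" * 62])
assert is_valid_hostname(name_253)
assert is_valid_hostname(name_253 + ".")  # trailing dot does not count
assert not is_valid_hostname(name_254)
```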
Don't reinvent the wheel. You can use a library, e.g. validators. Or you can copy their code:
Installation
    pip install validators
Usage
    import validators

    if validators.domain('example.com'):
        print('this domain is valid')