How to validate a URL in Python? (Malformed or not)
Use the validators package:
>>> import validators
>>> validators.url("http://google.com")
True
>>> validators.url("http://google")
ValidationFailure(func=url, args={'value': 'http://google', 'require_tld': True})
>>> if not validators.url("http://google"):
...     print("not valid")
...
not valid
>>>
Install it from PyPI with pip (pip install validators).
Actually, I think this is the best way.
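from django.core.validators import URLValidator
from django.core.exceptions import ValidationError

# Note: verify_exists was removed in Django 1.5; on newer versions
# call URLValidator() without that argument.
val = URLValidator(verify_exists=False)
try:
    val('http://www.google.com')
except ValidationError as e:
    print(e)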
If you set verify_exists to True, it will actually verify that the URL exists; otherwise it will just check whether it's formed correctly.
edit: ah yeah, this question is a duplicate of this: How can I check if a URL exists with Django’s validators?
Django URL validation regex (source):
import re

regex = re.compile(
    r'^(?:http|ftp)s?://'  # http:// or https://
    r'(?:(?:[A-Z0-9](?:[A-Z0-9-]{0,61}[A-Z0-9])?\.)+(?:[A-Z]{2,6}\.?|[A-Z0-9-]{2,}\.?)|'  # domain...
    r'localhost|'  # localhost...
    r'\d{1,3}\.\d{1,3}\.\d{1,3}\.\d{1,3})'  # ...or ip
    r'(?::\d+)?'  # optional port
    r'(?:/?|[/?]\S+)$', re.IGNORECASE)

print(re.match(regex, "http://www.example.com") is not None)  # True
print(re.match(regex, "example.com") is not None)  # False
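If you want to reuse this check in several places, one option is to wrap the pattern in a small helper. This is only a minimal sketch using the standard library: the names URL_REGEX and is_valid_url and the sample URLs are my own illustrative choices, not anything Django provides.

```python
import re

# Same pattern as the Django-derived regex above, compiled once at import time.
URL_REGEX = re.compile(
    r'^(?:http|ftp)s?://'  # scheme: http, https, ftp or ftps
    r'(?:(?:[A-Z0-9](?:[A-Z0-9-]{0,61}[A-Z0-9])?\.)+(?:[A-Z]{2,6}\.?|[A-Z0-9-]{2,}\.?)|'  # domain...
    r'localhost|'  # localhost...
    r'\d{1,3}\.\d{1,3}\.\d{1,3}\.\d{1,3})'  # ...or ip
    r'(?::\d+)?'  # optional port
    r'(?:/?|[/?]\S+)$', re.IGNORECASE)


def is_valid_url(url):
    """Hypothetical helper: True if url is a string matching the pattern."""
    return isinstance(url, str) and URL_REGEX.match(url) is not None


# Illustrative inputs (assumptions for the demo, not from the original answer).
for candidate in ("http://www.example.com",
                  "https://localhost:8000/admin/",
                  "ftp://192.168.0.1/file.txt",
                  "example.com",
                  "http://"):
    print(candidate, is_valid_url(candidate))
```

Keep in mind this only checks the shape of the string. It accepts only the http, https, ftp and ftps schemes, and it does not range-check IP octets, so something like http://999.999.999.999 still passes.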