Is it better to store telephone numbers in some canonical format or "as entered"?

Is it better to store telephone numbers in some canonical format or "as entered"?


Store it however you prefer, but convert it to a human-readable format before you show it to the user. And please don't force your users to enter phone numbers in a format of your choosing; let them type the number in however they like.

That's how I do it.


Hopefully this is a more practical and applied answer to an old question.

Take a look at https://github.com/googlei18n/libphonenumber.

As @Gumbo alluded to, I would store the phone number as E.164, which the above library parses for you. It can be used from several different programming languages.

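For DB storage you could in fact treat the E.164 string as Base64 (ironically, "+" and the digits are all valid Base64 characters) and decode it to bytes. That will not generally fit a standard long, though: a full-length E.164 string is 16 characters, which decodes to 12 bytes, while a long holds 8. The digits alone (at most 15) do fit in a 64-bit integer. Personally, I would just store the E.164 as a string in the database.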

Of course, you should probably also store what the user originally entered, before parsing, but I highly recommend you also store some canonical form like E.164 for future integration with other systems.
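
To make the "keep both" idea concrete, here is a minimal Python sketch. The `StoredPhone` record and `make_record` helper are hypothetical names made up for illustration, not part of libphonenumber. The check only confirms that a value has the E.164 shape ("+" followed by at most 15 digits, the first non-zero); it does not replace a real parser and says nothing about whether the number exists.

```python
import re
from dataclasses import dataclass

# E.164 shape: "+", then 2 to 15 ASCII digits, the first of which is non-zero.
E164_SHAPE = re.compile(r"\+[1-9][0-9]{1,14}")


@dataclass(frozen=True)
class StoredPhone:
    """Hypothetical record: the user's input plus the canonical form."""
    raw: str   # exactly what the user typed, kept for reference
    e164: str  # canonical form, e.g. "+12025550143"


def make_record(raw: str, e164: str) -> StoredPhone:
    """Build a record, rejecting a canonical value that is not E.164-shaped.

    In practice `e164` would come from a parser such as libphonenumber;
    this function only verifies the overall shape of the string.
    """
    if not E164_SHAPE.fullmatch(e164):
        raise ValueError(f"not an E.164-shaped string: {e164!r}")
    return StoredPhone(raw=raw, e164=e164)
```

Something like `make_record("(202) 555-0143", "+12025550143")` keeps both values side by side, so you can always re-parse the raw input later if the parsing rules change.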


What's your userbase?

If they're going to be limited geographically (e.g., US-only) and you're going to validate numbers strictly, then format the number canonically for them: strip out any formatting they used (like periods between the digits) and put in the dashes. Do not fail validation if they don't stick to your formatting; that's just mean. I'd store that cleaned-up version in the DB as well, not a bare string of digits; it makes your life a bit easier when generating custom reports, etc.
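
For the US-only case, the clean-up could look roughly like this in Python. This is just a sketch under my own assumptions: `format_us_number` is a made-up helper name, the target format is NNN-NNN-NNNN, a leading country code "1" is tolerated, and extensions are not handled.

```python
import re


def format_us_number(entered: str) -> str:
    """Return a US number as NNN-NNN-NNNN, whatever punctuation was typed.

    Raises ValueError when the input does not contain exactly ten digits
    after an optional leading country code "1" is dropped.
    """
    digits = re.sub(r"[^0-9]", "", entered)  # drop dots, spaces, parens, dashes
    if len(digits) == 11 and digits.startswith("1"):
        digits = digits[1:]  # tolerate "+1 ..." and "1-..." prefixes
    if len(digits) != 10:
        raise ValueError(f"expected 10 digits, got {len(digits)}")
    return f"{digits[:3]}-{digits[3:6]}-{digits[6:]}"
```

The point is that "202.555.0143", "(202) 555-0143" and "+1 202 555 0143" are all accepted and all end up stored the same way; only input with the wrong number of digits is rejected.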

If you might have users/numbers from all over the world, it might be better to save the formatting they used. Also don't forget that US residents are sometimes traveling and using a foreign number: don't block them unintentionally.

Either way: make sure you DON'T define the column as numeric, or make it too small. International numbers with formatting can easily be over 16 chars long.
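
Here is a small sketch of why the column type matters, using Python's built-in `sqlite3` module. The table and column names are invented for the example, and the width of 32 is just an assumed amount of headroom (SQLite itself does not enforce the declared length, but most other databases do).

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    """
    CREATE TABLE contacts (
        id         INTEGER PRIMARY KEY,
        phone_text VARCHAR(32),  -- character column with some headroom
        phone_num  INTEGER       -- numeric column, shown only as the pitfall
    )
    """
)

entered = "+44 (0)20 7946 0958 ext. 12"  # 27 characters with formatting
conn.execute(
    "INSERT INTO contacts (phone_text, phone_num) VALUES (?, ?)",
    (entered, "02079460958"),
)

text_back, num_back = conn.execute(
    "SELECT phone_text, phone_num FROM contacts"
).fetchone()
# text_back is the string exactly as entered;
# num_back has been coerced to an integer and has lost its leading zero.
```

The character column round-trips the input untouched, while the numeric column silently turns the digits into a number, which is exactly the kind of loss you want to avoid.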