
Detect base64 encoding in PHP?


Apologies for a late response to an already-answered question, but I don't think base64_decode($x,true) is a good enough solution for this problem. In fact, there may not be a very good solution that works against any given input. For example, I can put lots of bad values into $x and not get a false return value.

var_dump(base64_decode('wtf mate', true));
string(5) "���j�"

var_dump(base64_decode('This is definitely not base64 encoded', true));
string(24) "N���^~)��r��[jǺ��ܡם"

I think that in addition to the strict return value check, you'd also need to do post-decode validation. The most reliable way is if you could decode and then check against a known set of possible values.
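As a minimal sketch of that idea, assuming the application knows every payload it could legitimately receive: the helper name decode_if_expected() and the $allowed list below are hypothetical, made up for illustration only.

```php
<?php
// Hypothetical helper: strict-decode the input, then accept the result
// only if it is one of the values the application expects.
function decode_if_expected(string $input, array $allowedValues): ?string
{
    $decoded = base64_decode($input, true);
    if ($decoded === false) {
        return null; // contains characters outside the base64 alphabet
    }
    // Strict comparison, so only an exact string match counts.
    return in_array($decoded, $allowedValues, true) ? $decoded : null;
}

$allowed = ['start', 'stop', 'restart']; // assumed set of valid payloads

var_dump(decode_if_expected(base64_encode('stop'), $allowed)); // the decoded value
var_dump(decode_if_expected('wtf mate', $allowed));            // NULL: decodes, but to garbage
```

This only works when the set of valid values is small and known up front; for free-form payloads you are back to heuristics.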

A more general solution with less than 100% accuracy (better with longer strings, unreliable for short ones) is to check whether many bytes of the decoded output fall outside the normal range for UTF-8 text (or whatever encoding you use).

See this example:

<?php
$english = array();
foreach (str_split('az019AZ~~~!@#$%^*()_+|}?><": Iñtërnâtiônàlizætiøn') as $char) {
  echo ord($char) . "\n";
  $english[] = ord($char);
}
echo "Max value english = " . max($english) . "\n";

$nonsense = array();
echo "\n\nbase64:\n";
foreach (str_split(base64_decode('Not base64 encoded', true)) as $char) {
  echo ord($char) . "\n";
  $nonsense[] = ord($char);
}
echo "Max nonsense = " . max($nonsense) . "\n";
?>

Results:

Max value english = 195
Max nonsense = 233

So you may do something like this:

if ($maxDecodedValue > 200) {
    // decoded string is garbage - original string was not base64 encoded
} else {
    // decoded string is useful - it was base64 encoded
}

You should probably use the mean of the decoded values instead of the max. I used max() in this example only because PHP sadly has no built-in mean(), although array_sum($values) / count($values) does the job. Which measure you use (mean, max, etc.) and which threshold you compare it against (e.g. 200) depends on your estimated usage profile.
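A rough sketch of the mean-based variant follows. Both function names are hypothetical, and the threshold is deliberately left as a required parameter because any particular number is an assumption you would have to tune against your own data.

```php
<?php
// Hypothetical helper (not from the answer above): average byte value of a string.
function mean_byte_value(string $bytes): float
{
    if ($bytes === '') {
        return 0.0; // avoid dividing by zero on empty input
    }
    // unpack('C*') returns the bytes as unsigned integers.
    $values = unpack('C*', $bytes);
    return array_sum($values) / count($values);
}

// Hypothetical wrapper: $threshold is an assumption, tune it for your inputs.
function decoded_looks_like_garbage(string $candidate, float $threshold): bool
{
    $decoded = base64_decode($candidate, true);
    if ($decoded === false) {
        return true; // contains characters outside the base64 alphabet
    }
    return mean_byte_value($decoded) > $threshold;
}
```

Plain ASCII text tends to average well below uniformly random bytes, but the gap is small, so expect misclassifications on short strings.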

In conclusion, the only winning move is not to play. I'd try to avoid having to discern base64 in the first place.


function is_base64_encoded($data)
{
    if (preg_match('%^[a-zA-Z0-9/+]*={0,2}$%', $data)) {
        return TRUE;
    } else {
        return FALSE;
    }
}

is_base64_encoded("iash21iawhdj98UH3"); // true
is_base64_encoded("#iu3498r"); // false
is_base64_encoded("asiudfh9w=8uihf"); // false
is_base64_encoded("a398UIhnj43f/1!+sadfh3w84hduihhjw=="); // false

http://php.net/manual/en/function.base64-decode.php#81425


I had the same problem and ended up with this solution:

if (base64_encode(base64_decode($data)) === $data) {
    echo '$data is valid';
} else {
    echo '$data is NOT valid';
}
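For reuse, the same round-trip check could be wrapped in a function. This is only a sketch: the name survives_base64_round_trip() is hypothetical, and the strict flag on base64_decode() is an addition that the snippet above does not use.

```php
<?php
// Hypothetical wrapper around the round-trip comparison shown above.
function survives_base64_round_trip(string $data): bool
{
    $decoded = base64_decode($data, true);
    if ($decoded === false) {
        return false; // contains characters outside the base64 alphabet
    }
    // Canonical base64 re-encodes to exactly the same string.
    return base64_encode($decoded) === $data;
}

var_dump(survives_base64_round_trip(base64_encode('hello'))); // bool(true)
var_dump(survives_base64_round_trip('wtf mate'));             // bool(false)
var_dump(survives_base64_round_trip('test'));                 // bool(true)
```

Note the last call: an ordinary word such as "test" is itself well-formed base64, so the round trip proves only that the input is canonically encoded, not that anyone meant it as base64.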