
Special characters in Android SMS


You are suffering from encoding problems. From the description it looks like 'A' is sending data in one charset and not including information about which charset that is. The root cause is that to pass extended (non-ASCII) characters between two systems, both have to agree on an encoding. If you are restricted to 8-bit values, the systems must agree to use the same codepage. SMS defines a special GSM 7-bit alphabet and an 8-bit encoding, or UTF-16 (UCS-2) can be used, which takes 2 octets to represent each character.

What you see when you enter 250 characters followed by a single extended character shows what is happening in the application. The payload of an SMS message is restricted to 140 octets. With the GSM 7-bit alphabet that is 160 characters in a single message, or 153 per part once the text is split, because each part then carries a small concatenation header. Your 250 characters therefore fit into 2 messages. Once you added the "ç", the app switched to UTF-16, so suddenly every character takes 2 octets and only 70 characters fit into a message (67 per part when split). The same text now needs 4 SMS messages.
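The message-count arithmetic can be sketched roughly as below. This is only an illustration, not how any particular messaging app does it: the class and method names are made up, and it treats plain ASCII as a stand-in for the GSM 7-bit alphabet, which is a simplification (the real alphabet has a different repertoire and some characters cost two septets). The limits used are the usual 160/153 characters for 7-bit text and 70/67 for UTF-16 text, where the smaller figure applies once a message has to be split into parts.

```java
// Hypothetical helper for estimating how many SMS parts a text needs.
final class SmsSegmentEstimate {
    // Capacities that follow from the 140-octet payload limit.
    static final int GSM7_SINGLE = 160; // 140 * 8 / 7 septets
    static final int GSM7_CONCAT = 153; // per part, after the concatenation header
    static final int UCS2_SINGLE = 70;  // 140 / 2 code units
    static final int UCS2_CONCAT = 67;  // per part, after the concatenation header

    // ASSUMPTION: plain ASCII approximates the GSM 7-bit alphabet.
    static boolean fitsSevenBit(String text) {
        for (int i = 0; i < text.length(); i++) {
            if (text.charAt(i) > 0x7F) {
                return false;
            }
        }
        return true;
    }

    // Estimated number of SMS parts needed for the text.
    static int segments(String text) {
        boolean sevenBit = fitsSevenBit(text);
        int single = sevenBit ? GSM7_SINGLE : UCS2_SINGLE;
        int concat = sevenBit ? GSM7_CONCAT : UCS2_CONCAT;
        int units = text.length(); // UTF-16 code units
        if (units <= single) {
            return 1;
        }
        return (units + concat - 1) / concat; // ceiling division
    }
}
```

Under these assumptions, 250 plain characters come out as 2 parts, and the same 250 characters plus one "ç" come out as 4 parts.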

On Android, the decoding of the SMS message is part of the framework telephony code in SmsCbMessage.java. It works out the language code and encoding of the message body. If this is incorrect (for example, the message was encoded with an English codepage but uses French extended characters), you can get odd characters appearing.

You are right that the mobile network is not at fault here. I suspect phone A's messaging application, although it is possible that Android is failing to correctly identify the encoding of a valid SMS. I wonder how it behaves between A and an iPhone, or some other manufacturer's device.


I encountered the same problem when I had to show a few special characters in an SMS Unicode app. The method I used was to take the string to be sent as an SMS, loop over it character by character, find each character's numeric code, and build a new string from those integer values separated by a delimiter. That string can be sent as the SMS. On the receiving side, split it on the same delimiter, convert each numeric code back into its (language-specific) character, and append the characters to form a string. The resulting text will be the same as the one that was sent.
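A minimal sketch of that approach might look like the following. The class name, method names and the comma delimiter are placeholders chosen for illustration, not the code actually used in the app, and each Java `char` (UTF-16 code unit) is encoded as its decimal value.

```java
// Hypothetical codec: sends each character as a decimal code, joined by a delimiter.
final class CharCodeCodec {
    // ASSUMPTION: a comma is used as the delimiter; any character that
    // cannot appear inside a decimal number would work.
    static final String DELIMITER = ",";

    // Turn the text into delimiter-separated character codes.
    static String encode(String text) {
        StringBuilder out = new StringBuilder();
        for (int i = 0; i < text.length(); i++) {
            if (i > 0) {
                out.append(DELIMITER);
            }
            out.append((int) text.charAt(i));
        }
        return out.toString();
    }

    // Rebuild the original text from the delimiter-separated codes.
    static String decode(String encoded) {
        if (encoded.isEmpty()) {
            return "";
        }
        StringBuilder out = new StringBuilder();
        for (String part : encoded.split(DELIMITER)) {
            out.append((char) Integer.parseInt(part));
        }
        return out.toString();
    }
}
```

With this sketch, `CharCodeCodec.encode("ça")` gives `231,97`, and decoding that string returns the original text. The encoded form contains only digits and commas, so it stays within the 7-bit alphabet, at the cost of a noticeably longer message.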

Regards