Cross-platform Unicode in C/C++: Which encoding to use?
UTF-8 on all platforms, with just-in-time conversion to UTF-16 on Windows, is a common tactic for cross-platform Unicode.
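As a rough illustration of what "just-in-time conversion" could look like, here is a minimal, portable sketch that turns a UTF-8 std::string into UTF-16 code units. The function name utf8_to_utf16 is hypothetical and not taken from any library; on Windows one would more typically call the platform API MultiByteToWideChar with CP_UTF8 right before invoking a wide-character function. The sketch checks the basic byte structure but does not reject overlong encodings or encoded surrogates.

```cpp
#include <cstdint>
#include <stdexcept>
#include <string>

// Hypothetical helper (an assumption for illustration): decode UTF-8 and
// re-encode the code points as UTF-16 code units.
std::u16string utf8_to_utf16(const std::string& in) {
    std::u16string out;
    std::size_t i = 0;
    while (i < in.size()) {
        const unsigned char b0 = static_cast<unsigned char>(in[i]);
        std::uint32_t cp = 0;
        std::size_t len = 0;
        // The lead byte tells us how many bytes the sequence has.
        if (b0 < 0x80)                { cp = b0;        len = 1; }
        else if ((b0 & 0xE0) == 0xC0) { cp = b0 & 0x1F; len = 2; }
        else if ((b0 & 0xF0) == 0xE0) { cp = b0 & 0x0F; len = 3; }
        else if ((b0 & 0xF8) == 0xF0) { cp = b0 & 0x07; len = 4; }
        else throw std::invalid_argument("invalid UTF-8 lead byte");

        if (i + len > in.size())
            throw std::invalid_argument("truncated UTF-8 sequence");

        // Each continuation byte contributes six payload bits.
        for (std::size_t k = 1; k < len; ++k) {
            const unsigned char b = static_cast<unsigned char>(in[i + k]);
            if ((b & 0xC0) != 0x80)
                throw std::invalid_argument("invalid UTF-8 continuation byte");
            cp = (cp << 6) | (b & 0x3F);
        }
        i += len;

        if (cp > 0x10FFFF)
            throw std::invalid_argument("code point out of Unicode range");

        if (cp < 0x10000) {
            // Basic Multilingual Plane: a single UTF-16 code unit.
            out.push_back(static_cast<char16_t>(cp));
        } else {
            // Supplementary planes: encode as a surrogate pair.
            cp -= 0x10000;
            out.push_back(static_cast<char16_t>(0xD800 + (cp >> 10)));
            out.push_back(static_cast<char16_t>(0xDC00 + (cp & 0x3FF)));
        }
    }
    return out;
}
```

The idea is that the rest of the program keeps std::string in UTF-8 everywhere and only calls a helper like this at the boundary to a Windows wide-character API.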
Our software is cross-platform as well, and we faced similar problems. We decided that our goal was to have as few conversions as possible. This means that we use wchar_t on Windows and char on Unix/Mac.
We do this by supporting _T, LPCTSTR and similar on Unix, and by having generic functions that easily convert between std::string and std::wstring. We also have a generic std::basic_string<TCHAR> (tstring), which we use in most cases.
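A minimal sketch of how such a setup might look is shown below. It is an illustration under assumptions, not our actual code: the helper names widen, narrow and to_tstring are hypothetical, and the conversions rely on the standard std::mbstowcs / std::wcstombs functions, which depend on the current C locale. On Windows the sketch assumes both UNICODE and _UNICODE are defined so that TCHAR is wchar_t.

```cpp
#include <cstdlib>
#include <stdexcept>
#include <string>
#include <vector>

#if defined(_WIN32)
  #include <windows.h>   // TCHAR, LPCTSTR
  #include <tchar.h>     // _T
#else
  // Unix/Mac shims mirroring the Windows names (narrow characters).
  // Note: _T is formally a reserved identifier; it is defined here only
  // to mirror the Windows macro.
  typedef char TCHAR;
  typedef const TCHAR* LPCTSTR;
  #define _T(x) x
#endif

typedef std::basic_string<TCHAR> tstring;

// Hypothetical helper: std::string -> std::wstring via the current C locale.
// Stops at the first embedded '\0'.
inline std::wstring widen(const std::string& s) {
    // A multibyte string never yields more wide characters than bytes.
    std::vector<wchar_t> buf(s.size() + 1);
    const std::size_t n = std::mbstowcs(buf.data(), s.c_str(), buf.size());
    if (n == static_cast<std::size_t>(-1))
        throw std::runtime_error("invalid multibyte sequence");
    return std::wstring(buf.data(), n);
}

// Hypothetical helper: std::wstring -> std::string via the current C locale.
inline std::string narrow(const std::wstring& s) {
    // Each wide character needs at most MB_CUR_MAX bytes.
    std::vector<char> buf(s.size() * MB_CUR_MAX + 1);
    const std::size_t n = std::wcstombs(buf.data(), s.c_str(), buf.size());
    if (n == static_cast<std::size_t>(-1))
        throw std::runtime_error("character not representable in locale");
    return std::string(buf.data(), n);
}

// Convert either string type to the platform's native tstring.
// One of the two overloads is always a plain copy.
#if defined(_WIN32) && defined(_UNICODE)
inline tstring to_tstring(const std::string& s)  { return widen(s); }
inline tstring to_tstring(const std::wstring& s) { return s; }
#else
inline tstring to_tstring(const std::string& s)  { return s; }
inline tstring to_tstring(const std::wstring& s) { return narrow(s); }
#endif
```

With definitions like these, code can write tstring name = _T("file.txt"); and pass name.c_str() wherever an LPCTSTR is expected, and the same source compiles on both platforms.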
So far this works quite well. Basically, most functions take a tstring or an LPCTSTR, and those that don't get their parameters converted from a tstring. That means that most of the time we don't convert our strings at all and simply pass most parameters through.