
Design considerations for internationalization


Our game Gemsweeper has been translated into 8 languages. Some things I have learned during that process:

  • If the translator is given single sentences to translate, make sure that he knows about the context that each sentence is used in. Otherwise he might provide one possible translation, but not the one you meant. Tools such as Babelfish translate without understanding the context, which is why the result is usually so bad. Just try translating any non-trivial text from English to German and back and you'll see what I mean.

  • For the same reason, sentences that are to be translated must not be broken into parts. You need to maintain the context (see previous point), and some languages put the variable at the beginning or end of the sentence. Use placeholders instead of breaking up the sentence. For example, instead of

"This is step" "of our 15-step tutorial"

write something like:

"This is step %1 of our 15-step tutorial"

and replace the placeholder programmatically.

  • Don't expect the translator to be funny or creative. He usually isn't motivated enough to do it unless you name the particular text passages and pay him extra. For example, if you have any word jokes in your language assets, tell the translator in a side note not to try to translate them, but to leave them out or replace them with a more somber sentence instead. Otherwise the translator will probably translate the joke word for word, which usually results in complete nonsense. In our case we had one translator and one joke writer for the most critical translation (English).

  • Try to find a translator whose first language is the language he is going to translate your software into, not the other way round. Otherwise he is likely to write a text that might be correct, but sounds odd or old-fashioned to native speakers. Also, he should be living in the country you are targeting with your translation. For example, a German-speaking guy from Switzerland would not be a good choice for a German translation aimed at users in Germany.

  • If at all possible, have one of your public beta test users who understands the particular translation verify the translated assets and the completed software. We've had some very good and some very bad translations, depending on the person who provided them. According to some of our users, the Swedish translation was total gibberish, but it was too late to do anything about it.

  • Be aware that, for every updated version with new features, you will have to have your language assets translated again. This can create some serious overhead.

  • Be aware that end users will expect tech support to speak their language if your software is translated. Once again, Babelfish will most probably not do.
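The placeholder advice above could look roughly like this in Python. This is only a sketch: the `TRANSLATIONS` table, the `tutorial_caption` helper and the German wording are made up for illustration and are not Gemsweeper's actual assets. The point is that every language gets one complete sentence, and named placeholders let each translation put the variables wherever its grammar needs them.

```python
# Hypothetical catalog: one complete sentence per language, with named
# placeholders instead of string fragments glued together in code.
TRANSLATIONS = {
    "en": "This is step {step} of our {total}-step tutorial",
    "de": "Dies ist Schritt {step} unseres Tutorials mit {total} Schritten",
}


def tutorial_caption(lang: str, step: int, total: int = 15) -> str:
    """Return the caption for `lang`, falling back to English."""
    template = TRANSLATIONS.get(lang, TRANSLATIONS["en"])
    # The placeholders are filled in only after the whole sentence
    # has been looked up, so word order is up to the translator.
    return template.format(step=step, total=total)


print(tutorial_caption("en", 3))
print(tutorial_caption("de", 3))
```

Because the translator sees the whole sentence, he also gets the context that single fragments would lose.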

Edit - Some more points

  • Make switching between localizations as easy as possible. In Gemsweeper, we have a hotkey to switch between different languages. It makes testing much easier.

  • If you are going to use exotic fonts, make sure these include special characters. The fonts we chose for Gemsweeper were fine for English text, but we had to add quite a few characters by hand which only exist in German, French, Portuguese, Swedish, and so on.

  • Don't code your own localization framework. You're probably much better off with an open source framework like Gettext. Gettext supports features like variables within sentences and pluralization, and is rock-solid. Localized resources are compiled into binary catalogs, which discourages casual tampering. Plus, you can use tools like Poedit for translating your files or checking someone else's translation, and for making sure that all strings are properly translated and still up to date in case you change the underlying source code. I've tried both rolling my own and using Gettext, and I have to say that Gettext plus Poedit was way superior.

Edits - Even More Points

  • Understand that different cultures have different styles of number and date formats. Numbering schemes differ not only per culture, but also per purpose within that culture. In en-US you might format a number as '-1234', '-1,234' or '(1,234)', depending on what the purpose of the number is. Understand that other cultures do the same thing.

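  • Know where you're getting your globalization information from. For example, .NET on Windows has CurrentCulture, CurrentUICulture, and InvariantCulture. Understand what each one means and how it interacts with your system (they're not as obvious as you might think): CurrentCulture drives number and date formatting, CurrentUICulture selects which translated resources are loaded, and InvariantCulture is a culture-independent setting for data that must not vary by user.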

  • If you're going to do East Asian translations, really do your homework. East Asian languages have quite a few differences from Western languages. In addition to using multiple scripts simultaneously, they can use different layout systems (top-down or grid-based). Numbers in East Asian languages can also be very different. In en-US you only change number forms in limited cases (e.g. 1 versus 1st); in East Asian languages there are additional numeric considerations beyond the comma and the period.
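As a rough illustration of "per culture and per purpose", the sketch below formats the same number in the three en-US styles mentioned above and in a German style. The `SEPARATORS` table and the `format_number` helper are hand-written assumptions for this example only; a real application should take this data from the platform or from a CLDR-based library rather than hard-coding it.

```python
# Hypothetical, hand-written separator table for illustration only.
SEPARATORS = {
    "en-US": {"group": ",", "decimal": "."},
    "de-DE": {"group": ".", "decimal": ","},
}


def format_number(value, culture, style="plain", decimals=0):
    """Format `value` for `culture`.

    style: "plain" (-1234), "grouped" (-1,234) or "accounting" ((1,234)).
    """
    seps = SEPARATORS[culture]
    magnitude = abs(value)
    if style == "plain":
        text = f"{magnitude:.{decimals}f}"
    else:
        text = f"{magnitude:,.{decimals}f}"
    # Python emits en-US separators; swap them for the culture's own,
    # using a temporary marker so the two replacements don't collide.
    text = (
        text.replace(",", "\0")
        .replace(".", seps["decimal"])
        .replace("\0", seps["group"])
    )
    if value < 0:
        return f"({text})" if style == "accounting" else f"-{text}"
    return text


print(format_number(-1234, "en-US"))
print(format_number(-1234, "en-US", "grouped"))
print(format_number(-1234, "en-US", "accounting"))
print(format_number(-1234.5, "de-DE", "grouped", decimals=2))
```

Even this small helper needs a style argument on top of the culture, which is exactly the "purpose within that culture" problem.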


  • My menus and various lists in the application are sorted alphabetically for each language for easier reading.

Lists should be sorted; menus shouldn't. Keep in mind that a given user might want to use your application in more than one language, and should still find everything in the same place.

The same goes for shortcuts, if you have any: do not translate them.

Also, remember that internationalization and translation are two very different things; manage them separately.


When we worked on the i18n/l10n issues of Dreamfall and Age of Conan, we came across a few issues that are worth keeping in mind. Some of these we solved, some were solved for us, and some we worked around. Some we never solved...

  • Make sure all your tools and all your code support all the charsets you want to use, and double check that assumption twice during the course of the project and a couple more times to be sure.

  • Make sure you use a font that supports all the languages you want to use. Most fonts that claim to be Unicode are only Unicode in the sense that the characters they have are at the correct codepoints. It does not mean they have usable glyphs for all codepoints.

  • Text-wrapping is not only done at spaces, as some languages don't use spaces to separate words (Chinese comes to mind). Make sure your text-wrapping routines handle text without any spaces at all.

  • Handling plurals correctly is tricky in the easy cases, and damned hard in the hard cases. Make sure you know enough about the languages you'll be using to be able to write code to handle the plural issue correctly. Keep in mind that English (and the other "Western" languages) are among the easy ones.

  • Never break sentences and build strings with them to fit a variable, as the variable might be placed elsewhere in the sentence in a different language. Use placeholders.

  • Keep in mind that for some languages, the value of the placeholder might change how to write the sentence. Grammar is hard. Make sure you have a plan for dealing with it. (Specifically, make sure you have a way to classify the values you use in the placeholders according to gender, time, etc.)
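To make the plural point concrete, here is a small Python sketch that classifies a count before picking a sentence. It is not how Dreamfall or Age of Conan did it; the function names and the `FILES_RU` table are made up for the example. The category names follow the usual CLDR convention, and the Russian rule shown only covers whole, non-negative numbers.

```python
def plural_category_en(n: int) -> str:
    """English: one form for exactly 1, another for everything else."""
    return "one" if n == 1 else "other"


def plural_category_ru(n: int) -> str:
    """Russian: three forms for whole numbers, chosen by the last digits."""
    if n % 10 == 1 and n % 100 != 11:
        return "one"
    if 2 <= n % 10 <= 4 and not 12 <= n % 100 <= 14:
        return "few"
    return "many"


# One complete template per category, each with a placeholder.
FILES_RU = {
    "one": "{n} файл",
    "few": "{n} файла",
    "many": "{n} файлов",
}


def files_ru(n: int) -> str:
    return FILES_RU[plural_category_ru(n)].format(n=n)


for n in (1, 2, 5, 11, 21, 22, 112):
    print(files_ru(n))
```

The same idea extends to the last point: classify the placeholder value first (by count here, by gender or case elsewhere), then look up a full sentence written for that class instead of patching word endings in code.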