Rails I18N and L10N
When making multi-lingual web applications using the Ruby on Rails platform there is a lot of supporting code available to make the process easier.
Or more complicated.
This is a guide to building a Ruby on Rails application so that it is easy to build, easy to maintain, and easy to update as new translations become available. I'll walk you through the process of making your Rails application in such a way that you don't tear your hair out trying to pull it all together correctly.
First let's discuss the history of I18N on Rails so that you have a context to work within and so that you can see what your Rails project may need to do in order to take advantage of the current I18N functionality within Rails.
Rails History
Rails 1.2 supported I18N only in a somewhat limited way. There was a plugin called Globalize that did a really nice job of handling translated text. It stored translated phrases within a database and also supported the use of translated view files. The database of translated phrases ended up not working out so well because it hurt application performance: every single web page needed multiple database queries to fetch all the possible translations for all possible phrases, even if they weren't used on the given page.
Rails 2.0 and 2.1 included some fundamentals intended to be useful for I18N/L10N support but these releases had so little actual functionality, and no compatible plugins or gems to fill the gap, that they are not very useful for supporting a multi-lingual web application.
Rails 2.2 revealed the first real glimpse of a useful set of functionality necessary to support an application in multiple languages. Even so it was limited to only supporting the use of translation phrases within the application. Entire views in different languages or for different locales were not possible.
Rails 2.3 now provides a more complete set of functionality for support of Internationalized web applications. It adds support for translated views. In addition some very good gems are available for getting the L10N job done well.
Because the capabilities of older versions of Rails were limited, I'm going to focus on the use of Rails 2.3.2 (or newer) in the remainder of this document.
The HOWTO Guide
What you will find here is a description of exactly how to set up your code structure, some useful rake tasks to create, and procedures for how to use all of this so that you can create localization (L10N) kits for translators and then readily use the files returned by the translators to localize the application.
gettext Overview
The approach taken here relies on a set of gems that together provide a full and robust implementation of the well-known gettext mechanism. The gettext code and process were created many years ago as a part of the GNU project in order to support the translation of many UNIX utilities. Because gettext has been in use for so many years, there is a significant body of translation tools that know how to work with the files associated with gettext.
There are two major file types used by gettext, a po file and an mo file. The po file is plain text while the mo file is a binary file designed for high performance lookups of translation keys. There is a third type of file, a pot file, that can be considered to be a 'master po' file. There is a gettext utility that converts a pot file into a po file for a given language that's ready to add translations into.
A po file is primarily made up of pairs of lines where the first line contains a translation lookup key labeled msgid and the second line contains a translated phrase labeled msgstr. Here's an example:
msgid "close_button"
msgstr "Fermer"
Note: as a general rule the gettext code will simply show the value for the msgid if it does not find an mo file for the currently specified language or locale. Many people rely on this as the mechanism for displaying English text, which is commonly the default locale for Rails applications unless explicitly specified. This is generally not a good practice because it results in slower lookups due to the use of lookup keys that are longer than truly necessary. Instead you should have an English po/mo file pair alongside the rest of the translations. This also provides the very valuable advantage of being able to change the English text without having to change any code!
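The fallback behavior described in the note can be sketched in plain Ruby. This is a toy illustration only: the real gettext gems read compiled mo files, and the names parse_po and translate are hypothetical helpers invented here, not part of any gem. It handles only simple single-line msgid/msgstr pairs.

```ruby
# Toy illustration: real applications read compiled mo files through the gettext gems.
# parse_po and translate are hypothetical helper names.

# Parse simple single-line msgid/msgstr pairs into a Hash.
def parse_po(text)
  catalog = {}
  current_id = nil
  text.each_line do |line|
    case line.strip
    when /\Amsgid "(.*)"\z/
      current_id = $1
    when /\Amsgstr "(.*)"\z/
      catalog[current_id] = $1 unless current_id.nil? || current_id.empty?
      current_id = nil
    end
  end
  catalog
end

# Look up a key; fall back to the key itself when no translation exists.
def translate(catalogs, locale, msgid)
  catalog = catalogs[locale] || {}
  translated = catalog[msgid]
  translated.nil? || translated.empty? ? msgid : translated
end

FR_PO = <<PO
msgid "close_button"
msgstr "Fermer"
PO

EN_PO = <<PO
msgid "close_button"
msgstr "Close"
PO

CATALOGS = { "fr" => parse_po(FR_PO), "en" => parse_po(EN_PO) }

puts translate(CATALOGS, "fr", "close_button")  # Fermer
puts translate(CATALOGS, "en", "close_button")  # Close
puts translate(CATALOGS, "de", "close_button")  # close_button (no German catalog)
```

With an English catalog in place, the key close_button can stay short and stable while the English wording changes freely in the po file.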
Configuring Rails to use gettext
Configuration consists of several discrete elements.
gems
You must first install the necessary gems. I recommend that you install gettext and all dependencies at the same time like this:
gem install --include-dependencies gettext_rails
This will typically install gettext, gettext_rails, gettext_activerecord, locale, and locale_rails.
Note: You must use version 2.0.3 or newer of the locale_rails gem. Version 2.0.3 currently has a bug, for which I've submitted a patch, that prevents it from displaying the default language view if a view for any other supported language is found. I expect this to be fixed in version 2.0.4 of the locale_rails gem.
Next you'll need to revise your Rails code so that it uses the gettext gems. In your config/environment.rb file add the following within the Rails::Initializer.run block:
config.gem "locale"
config.gem "locale_rails"
config.gem "gettext"
config.gem "gettext_rails"
config.gem "gettext_activerecord"
Do this in order to be thorough. Rails will probably find the gems anyway, but by adding these lines you also provide some documentation of your dependencies for other maintainers.
app/controllers/application_controller.rb
Next edit the main application controller. In Rails 2.3 the name of this file is now consistent with the naming conventions of the rest of the controller files, application_controller.rb. Add this line inside the ApplicationController class, before any method definitions:
init_gettext 'myapp'
Of course you should use a label that represents the actual Rails application that you are working with instead of 'myapp'. This value is used when searching for po and mo files, so the base file name needs to match this label. For example, if you actually used 'myapp' here then your po files would be named myapp.po and your mo files would be named myapp.mo.
config/initializers/i18n.rb
You'll need to create this file. This performs initialization of the I18N context at the time that the web server starts.
Note: if you make changes to this file or add entries to the locale directory you will need to restart the web server.
Place this code within the file:
# Tell the supported locales. This is one of the additional settings provided by Ruby-Locale for Ruby on Rails.
# If supported_locales is not set, the locale information which is given by WWW browser is used.
# This setting is required if your application wants to restrict the locales.
# This technique keys off the subdirectory names of the locale directory to establish the list of
# currently supported locales. Specify a plain language name for those cases where the country
# specific locale is not needed. Specify a full locale value, such as zh-Hant, for those cases
# where it is significant.
locale_list = Dir.entries(File.join(RAILS_ROOT, 'locale'))
locale_list.delete('.') # Remove current directory notation
locale_list.delete('..') # Remove parent directory notation
I18n.supported_locales = locale_list
# Tell the default locale. If this value is not set, "en" is set.
# With this library, this value is used as the lowest priority locale
# (If other locale candidates are not found, this value is used).
I18n.default_locale = "en"
The setup of I18n.supported_locales is designed so that as new languages or locales are added to the application they'll be automatically picked up at the next web server startup. The advantage to this is that it works in conjunction with the locale_rails plugin to constrain the selection of locales to only those actually available. The method that I'm showing you for configuration here sniffs the web browser's preferred languages list and selects the first one that matches a supported locale.
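To make the selection step concrete, here is a minimal plain-Ruby sketch of picking a locale from a browser's Accept-Language header. The locale_rails gem does this work for you, so this is not its implementation. The names preferred_tags and pick_locale are hypothetical, and the sketch assumes that supported locales are written with underscores (de_AT) and that falling back to the bare language code is acceptable.

```ruby
# Sketch of browser-language negotiation; not the locale_rails implementation.
# preferred_tags and pick_locale are hypothetical names.

# Parse an Accept-Language header into tags ordered by descending q-value.
# Ties keep the order in which the browser listed them.
def preferred_tags(header)
  entries = header.to_s.split(",").each_with_index.map do |part, index|
    tag, *params = part.strip.split(";")
    q_param = params.find { |param| param.strip.start_with?("q=") }
    quality = q_param ? q_param.strip.sub("q=", "").to_f : 1.0
    [tag.to_s.strip, quality, index]
  end
  entries.reject { |tag, _, _| tag.empty? }
         .sort_by { |_, quality, index| [-quality, index] }
         .map { |tag, _, _| tag }
end

# Return the first supported locale, trying the full tag and then the
# language portion (assumption), or the default when nothing matches.
def pick_locale(header, supported, default = "en")
  preferred_tags(header).each do |tag|
    normalized = tag.tr("-", "_")
    return normalized if supported.include?(normalized)
    language = normalized.split("_").first
    return language if supported.include?(language)
  end
  default
end

puts pick_locale("de-AT,de;q=0.9,en;q=0.8", %w[de en fr])  # de
puts pick_locale("ja", %w[de en fr])                       # en
```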
Note that the directory names you use matter, because they become the list of supported locales. I generally set up language-only directory names unless there is a specific need to differentiate the web page presentation based on a specific locale. Typically you want locale specific translations for views rather than PO text. For example a view in German for Austria might have different marketing messages and different support contact information than a view in German for Germany.
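Since every entry in the locale directory becomes a supported locale, stray files or hidden directories (a version control folder, for instance) would end up in the list too. A slightly more defensive scan might look like the following sketch. The helper name locales_in is hypothetical, and the demonstration runs against a throwaway temporary directory rather than a real Rails tree.

```ruby
require "tmpdir"
require "fileutils"

# Hypothetical helper: keep only real subdirectories, skipping ".", "..",
# hidden entries such as ".svn", and stray files.
def locales_in(locale_root)
  Dir.entries(locale_root).select do |entry|
    !entry.start_with?(".") && File.directory?(File.join(locale_root, entry))
  end.sort
end

# Demonstration against a throwaway directory tree.
Dir.mktmpdir do |root|
  %w[en de zh_Hant .svn].each { |name| FileUtils.mkdir_p(File.join(root, name)) }
  FileUtils.touch(File.join(root, "README"))
  p locales_in(root)  # ["de", "en", "zh_Hant"]
end
```

In the initializer this could stand in for the Dir.entries and delete calls, with File.join(RAILS_ROOT, 'locale') passed as the argument.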
lib/tasks/gettext.rake
You'll need to create a rake file that helps with the maintenance of po and mo files. This is a huge win and absolutely should not be skipped. The contents of your rake file should be something like this:
desc "Update pot/po files."
task :updatepo do
  require 'gettext_rails/tools' # provides GetText.update_pofiles
  GetText.update_pofiles("myapp", Dir.glob("{app,lib,bin}/**/*.{rb,erb,rjs}"), "MyApp 1.0.0")
end
desc "Create mo-files"
task :makemo do
  require 'gettext_rails/tools' # provides GetText.create_mofiles
  GetText.create_mofiles
end
You'll again want to substitute your application label for 'myapp' and 'MyApp' here.
Later, when we have added some text translation handling to the code, we'll be able to run rake in order to extract all of the translatable strings from the entire application all at once. This acts to guarantee that absolutely none of the translations will be missed, a classic problem when working with a site that has hundreds of different translated phrases scattered through dozens (or hundreds) of files.
The updatepo task creates a pot file in a po directory underneath the Rails root. Generate your po files from it by using the msginit command. Here is an example of the sequence of steps to make a German po file. Begin in the po directory. Make a directory for the German po file. Generally using just the language works best unless you need to account for different character sets. Examples of languages with different character sets for the same language are Chinese and Romanian.
mkdir de
cd de
msginit -i ../myapp.pot -o myapp.po --locale=de_DE
When you run the makemo task it creates the mo files in a directory structure whose top level directory is 'locale' with subdirectories for each language or locale that has an associated mo file. The gettext code will automatically find these mo files. You may need to restart your web server in order for the new translations to be seen due to the way that the gettext library handles caching of data for better performance.
Coding for gettext use
Code that uses the gettext library relies on two primary mechanisms: lookup of words and phrases, and selection of entire views.
Phrases
Now we're ready to actually write code that uses gettext! In order to obtain a translation lookup for the current language or locale we use a method named _(). Here's an example:
translated_string = _("hello_world")
If you were to include this someplace in your Rails application at this point (remember, we have not yet made any mo files) you would see that the value of translated_string is 'hello_world'. Not particularly impressive.
Go ahead now and write or revise your code so that short phrases or paragraphs are written as shown above. If you have entire views that need to be translated then they should be handled differently. I'll explain that next.
Views
In order to present a translated view a naming convention is used. The default language is what will be found in an ordinarily named erb file such as index.html.erb. The rails default language is English so unless you've specified a different default then the regularly named views will contain English. If you want a view that works for any locale in a given language then use only the language code in the file name. If you want a locale specific view then use the full locale code.
Here are some examples:
- show.html.erb (Default English)
- show_en_GB.html.erb (British English)
- show_de.html.erb (German)
- show_de_DE.html.erb (German in Germany)
- _footer.html.erb (Default English)
- _footer_zh_Hans.html.erb (Simplified Chinese)
- _footer_zh_Hant.html.erb (Traditional Chinese)
- _footer_fr_FR.html.erb (France French)
- _footer_fr_CA.html.erb (Canadian French)
- _footer_fr.html.erb (French in locales other than Canada and France)
- _footer_en_GB.html.erb (British English)
As a general rule if the user's locale does not exactly match an available translation then an attempt is made to match the language portion of the locale. If that works then the translation is shown. If that does not work then the default language is shown, usually English.
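The fallback order just described can be written out as a short plain-Ruby sketch. The gems perform the real lookup. The helper names view_candidates and resolve_view are hypothetical, and the list of available files is made up for the demonstration.

```ruby
# Hypothetical helpers illustrating the fallback order; not the gem's code.

# Build the file names to try, most specific first:
# full locale, then language only, then the default view.
def view_candidates(base, locale, extension = "html.erb")
  language = locale.split("_").first
  names = ["#{base}_#{locale}", "#{base}_#{language}", base].uniq
  names.map { |name| "#{name}.#{extension}" }
end

# Pick the first candidate that actually exists.
def resolve_view(base, locale, available)
  view_candidates(base, locale).find { |file| available.include?(file) }
end

available = %w[show.html.erb show_de.html.erb show_de_DE.html.erb]
puts resolve_view("show", "de_DE", available)  # show_de_DE.html.erb
puts resolve_view("show", "de_AT", available)  # show_de.html.erb
puts resolve_view("show", "fr_FR", available)  # show.html.erb
```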
When to Use a Phrase or View
Sometimes you'll find that it makes more sense to translate entire views rather than individual phrases. Here are the criteria that I recommend to help decide which to use.
- If you already have a view then translate the view.
- If your text is longer than a single sentence, use a view.
- If you have a phrase used within some JavaScript code use the _() method on it.
- If you have a phrase used within an ERB (or other) tag then use the _() method on it.
If you're reading carefully you'll realize that in the above I'm actually advocating the use of phrases within views in some cases. Let me explain the logic of that.
Translators have a tough job translating web sites. Each new technology that comes along seems to be just a little bit different. But translators are not, in general, technical people. They work best when provided clear and concise instructions on what to do with the strange looking stuff that we send them. From working with translators for many years now I've found that the following seems to work pretty well.
Anything needing translation that falls inside of an erb tag should be a translated phrase. So for example you have a view with the following HTML in it:
<p>Thanks for visiting us. You may <%= link_to _('place an order'), :controller => 'order', :action => 'index' -%> easily on our site. </p>
Translators understand ordinary HTML. They have no idea what that 'junk' is inside of the strange <% %> markers. So being able to say to them "Don't touch anything inside the <% %> markers" means that they will not translate the names of your controller or action and thus break the code and cause you headaches. As you see I've encapsulated the 'place an order' phrase inside the _() method so that it gets translated separately.
Note: there may be some situations where the translation that results from this is not sensible since the context doesn't work out right. In those cases you'll need to deviate from this rule on a case-by-case basis.
Making PO and MO Files
Now that you have at least one instance of using the _() method you can generate a pot file. Use the rake task we created earlier.
rake updatepo
You'll now find a directory named po under the RAILS_ROOT. It will contain a pot file whose basename is the label you used with init_gettext. From this file you can then generate po files for each language or locale that you plan to support.
Now use the utility called msginit, which is part of the operating system's gettext package, to properly make a po file from the pot file. msginit does not create directories, so from within the po directory make the locale subdirectory first and name the output file explicitly. Here's an example:
mkdir fr_FR
msginit --input=myapp.pot --output-file=fr_FR/myapp.po --locale=fr_FR
This leaves you with a file called myapp.po in a subdirectory of po called fr_FR. This is your new po file! Do this for each locale or language that you plan to support. When you later run the makemo task, the compiled myapp.mo file lands in locale/fr_FR/LC_MESSAGES.
As a general rule make your translations general to a language unless you need a specific locale or variant such as zh_Hans or pt_BR. It is usually more common to find that views are locale specific, particularly when marketing messages are targeted to a geography, and phrases are language generic.
Coding for Translation
When writing code that must handle translatable strings there are a number of things that you need to know for the best success.
The sentence structure of different languages can be significantly different from your native language. The order of different words can be completely opposite what you have come to expect with your own language. For example in English you might talk about "the red ball" but when speaking French the order of words is "the ball red" instead.
Many languages are very context sensitive in their phrasing. This means that the form of words is dependent on the other words surrounding them. Because of this it may sometimes not be possible to break up a sentence, or even a paragraph, into smaller pieces and then 'glue' them together into a complete concept that is syntactically correct. When possible set things up so that you're translating an entire sentence at once.
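One way to keep a whole sentence together while still inserting values is to use named placeholders, so each translation can position them wherever its grammar requires. The sketch below uses a plain Hash as a stand-in for the catalog. In the application the strings would come from _() lookups. The MESSAGES table and upload_message helper are hypothetical, and the sketch relies on the named-reference support in String#% found in modern Ruby.

```ruby
# Stand-in catalog; in the application these strings would come from _() lookups.
# Each translation is a complete sentence and places the placeholders freely.
MESSAGES = {
  "en" => "%{user} uploaded %{count} photos to %{album}",
  "de" => "In %{album} hat %{user} %{count} Fotos hochgeladen"
}

# Hypothetical helper: fetch the sentence and fill in the named values.
def upload_message(locale, values)
  MESSAGES.fetch(locale) % values
end

values = { :user => "Ana", :count => 3, :album => "Lisbon" }
puts upload_message("en", values)  # Ana uploaded 3 photos to Lisbon
puts upload_message("de", values)  # In Lisbon hat Ana 3 Fotos hochgeladen
```

Gluing "uploaded", a number, and "photos to" together in code would force the English word order on every language.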
Pluralization
Sometimes you'll need to translate a phrase that contains a value from a variable. You might want to report on how many records were found matching some criteria. You might want to provide some details from a database that you just looked up.
When translating phrases that have to do with quantities of an item we run into some real challenges. Some languages use a single word form of an item regardless of how many are being described. English uses two word forms for items. Here's an example:
zero apples
one apple
two apples
three apples
Only in the case of a single apple is the word 'apple' singular. In all other cases, including zero, it is plural.
Not all languages are like this. Some have special word forms for zero items, others have special word forms for two, three, four and five of an item. This makes translation particularly tricky. Here's how to set up the po file for these cases.
#: myapp_plural.rb:11
msgid "There is an apple.\n"
msgid_plural "There are %d apples.\n"
msgstr[0] "Ringo ga arimasu.\n"
msgstr[1] "Ringo ga %d arimasu.\n"
The integer array index for msgstr is used to determine which translation to select for a given situation. Refer to the GNU gettext manual (link is at the bottom under References) for more details.
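To show how a count turns into a msgstr index, here is a minimal plain-Ruby sketch. In a real catalog the rule lives in the Plural-Forms header of the po file and the gettext library evaluates it. The tables PLURAL_RULES and PLURAL_MESSAGES and the helper plural_message are hypothetical stand-ins. The rules for English, Japanese, and Polish follow the examples given in the GNU gettext manual, and the message strings are illustrative only.

```ruby
# Each rule maps a count to a msgstr index, mirroring a Plural-Forms header.
PLURAL_RULES = {
  "en" => lambda { |n| n == 1 ? 0 : 1 },  # two forms
  "ja" => lambda { |_n| 0 },              # one form for every count
  "pl" => lambda do |n|                   # three forms
    if n == 1
      0
    elsif (2..4).include?(n % 10) && !(12..14).include?(n % 100)
      1
    else
      2
    end
  end
}

# Stand-in for the msgstr[0], msgstr[1], ... entries of each catalog.
PLURAL_MESSAGES = {
  "en" => ["%d apple", "%d apples"],
  "ja" => ["Ringo ga %d arimasu."],
  "pl" => ["%d jabłko", "%d jabłka", "%d jabłek"]
}

# Hypothetical helper: choose the form for the count, then insert the count.
def plural_message(locale, count)
  index = PLURAL_RULES.fetch(locale).call(count)
  format(PLURAL_MESSAGES.fetch(locale)[index], count)
end

[0, 1, 2, 5, 22].each do |count|
  puts [plural_message("en", count), plural_message("pl", count)].join(" | ")
end
```

Note that zero selects the plural form in English, matching the apple list above, while Polish needs a third form for counts such as five.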
References
The Rails Internationalization (I18N) API
The I18N Rails Guide
Ruby on Rails gettext
GNU gettext Manual