[UE4] Localising your game

Introduction

Localisation is something that people often don’t think about until the end of their project, which is why the UE4 localisation system aims to stay out of your way as much as possible. Basically, you’ll be fine as long as you use FText and its associated functions for your user-facing text (with some caveats about text formatting and shipping with the correct internationalisation support - both of which were covered in the previous post).

This post will cover what you need to know to localise the text in your game.

What are cultures?

Cultures in UE4 contain all the internationalisation information for a particular locale.

Their names are composed of up to three hyphen-separated parts (an IETF language tag):

  • A two-letter ISO 639-1 language code (eg, “zh”).
  • An optional four-letter ISO 15924 script code (eg, “Hans”).
  • An optional two-letter ISO 3166-1 country code (eg, “CN”).

When UE4 looks for a translation to use for a particular culture, it tries the candidate culture names from most to least specific; for example:

  • zh-Hans-CN:
    • zh-Hans-CN
    • zh-CN
    • zh-Hans
    • zh
  • en-GB:
    • en-GB
    • en
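The lookup order above can be sketched in a few lines of standalone C++. This is only an illustration of the ordering shown in the two examples, not the engine's actual implementation; the function name PrioritizedCultureNames and its parameters are hypothetical.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical helper (not a UE4 API): builds the list of culture names to try,
// most specific first. An empty Script or Country means that part is absent.
std::vector<std::string> PrioritizedCultureNames(const std::string& Language,
                                                 const std::string& Script,
                                                 const std::string& Country)
{
    assert(!Language.empty()); // the language code is the only mandatory part

    std::vector<std::string> Names;
    if (!Script.empty() && !Country.empty())
    {
        Names.push_back(Language + "-" + Script + "-" + Country); // eg, zh-Hans-CN
    }
    if (!Country.empty())
    {
        Names.push_back(Language + "-" + Country); // eg, zh-CN
    }
    if (!Script.empty())
    {
        Names.push_back(Language + "-" + Script); // eg, zh-Hans
    }
    Names.push_back(Language); // eg, zh
    return Names;
}
```

Calling it with ("zh", "Hans", "CN") yields the four names from the first example in the same order, and ("en", "", "GB") yields "en-GB" followed by "en".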

This means that you should choose the least specific viable culture to achieve maximum coverage for a particular translation. In most cases that’s just the language code, although there are a couple of exceptions (there may be more, but these are the ones I have personal experience of).

Chinese

Chinese has two variants, Simplified and Traditional, represented by the “Hans” and “Hant” script codes. You’ll want to use “zh-Hans” and “zh-Hant” for your Simplified and Traditional translations.

Note: You may notice that UE4 itself uses “zh-CN” for its Simplified Chinese translation. This is because not all platforms give you a script code for Chinese, and historically UE4 didn’t have a way to infer the script code if it was missing. This has since been fixed (so if you’re given “zh-CN” by a platform, UE4 will infer “Hans” for it, and treat it as “zh-Hans-CN”), and we’ll likely be migrating the Simplified Chinese translation of UE4 to “zh-Hans” sometime this year.

Spanish

Spanish has two main variants, European and Latin American, but there is no convenient script code that can be used to differentiate them. There is an IETF language tag for Latin American Spanish, “es-419”, but few platforms will give you this, preferring instead to give you an actual country code (eg, “es-MX”).

To solve this, we recommend that you use “es” for European Spanish and “es-419” for Latin American Spanish, and then use the culture re-mapping feature of UE4 to map the Latin American Spanish speaking countries onto “es-419”.

This is done by adding the following to your DefaultGame.ini file:

[Internationalization]
+CultureMappings="es-AR;es-419"
+CultureMappings="es-BO;es-419"
+CultureMappings="es-CL;es-419"
+CultureMappings="es-CO;es-419"
+CultureMappings="es-CR;es-419"
+CultureMappings="es-CU;es-419"
+CultureMappings="es-DO;es-419"
+CultureMappings="es-EC;es-419"
+CultureMappings="es-GT;es-419"
+CultureMappings="es-HN;es-419"
+CultureMappings="es-MX;es-419"
+CultureMappings="es-NI;es-419"
+CultureMappings="es-PA;es-419"
+CultureMappings="es-PE;es-419"
+CultureMappings="es-PR;es-419"
+CultureMappings="es-PY;es-419"
+CultureMappings="es-SV;es-419"
+CultureMappings="es-US;es-419"
+CultureMappings="es-UY;es-419"
+CultureMappings="es-VE;es-419"
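To make the effect of these entries concrete, here is a minimal standalone C++ sketch of a re-mapping lookup. It assumes each entry is a "Source;Destination" pair as in the config above; the function names ParseCultureMappings and RemapCulture are hypothetical, and the engine's own handling may well differ.

```cpp
#include <cassert>
#include <map>
#include <string>
#include <vector>

// Hypothetical helper: turns "Source;Destination" entries into a lookup table.
// Entries without a ';' separator are skipped.
std::map<std::string, std::string> ParseCultureMappings(const std::vector<std::string>& Entries)
{
    std::map<std::string, std::string> Mappings;
    for (const std::string& Entry : Entries)
    {
        const std::string::size_type Separator = Entry.find(';');
        if (Separator == std::string::npos)
        {
            continue;
        }
        Mappings[Entry.substr(0, Separator)] = Entry.substr(Separator + 1);
    }
    return Mappings;
}

// Hypothetical helper: returns the mapped culture name, or the input unchanged
// when no mapping exists for it.
std::string RemapCulture(const std::map<std::string, std::string>& Mappings,
                         const std::string& Culture)
{
    assert(!Culture.empty()); // callers are expected to pass a real culture name

    const auto It = Mappings.find(Culture);
    return It != Mappings.end() ? It->second : Culture;
}
```

With the mappings above, a platform reporting “es-MX” would end up on “es-419”, while “es-ES” has no mapping and so falls back to “es” through the usual most-to-least-specific lookup.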

What is the localisation pipeline?

The UE4 localisation pipeline is modelled around an “author-at-source” approach to localisation. This means that if you need some text saying “Hello World!” in your UI, you just type “Hello World!” into the text property (or use the NSLOCTEXT or LOCTEXT macro in C++) and something else takes care of grabbing that text so that it can be localised.

While the “author-at-source” approach is very dynamic and avoids people having to think about localisation during development, it can also be very frustrating for teams that want strict control over the text used in their game. To address this, 4.16 will be adding support for String Tables to allow an “author-once-and-reference” approach to localisation (although internally the pipeline just treats String Tables as an “author-at-source” source).

Note: People wanting stricter control over their text would often work around the lack of native String Tables by using a “fake” native culture (most likely “en-US-POSIX”) with IDs as the source text. Then, taking advantage of the string collapsing that we had prior to 4.14, they would translate these IDs into each language they wanted to target. As of 4.16 this approach is no longer recommended, and people using it should consider transitioning to String Tables.

The localisation pipeline itself works on “localisation targets”, and a localisation target is made up of two parts: its configuration (stored in Config/Localization/), and its data (stored in Content/Localization/{TargetName}/).

If we assume a localisation target working with English (“en”) and French (“fr”), then its layout in the Content/Localization/ folder would look like this:

  • {TargetName}/
    • {TargetName}.manifest
    • en/
      • {TargetName}.archive
      • {TargetName}.po
      • {TargetName}.locres
    • fr/
      • {TargetName}.archive
      • {TargetName}.po
      • {TargetName}.locres

All of the above files and folders are generated by the various parts of the localisation pipeline.

  • {TargetName}.manifest:
    • Manifests are custom JSON files that store all of the text gathered from your source code and assets by the localisation pipeline.
    • Manifests are re-generated each time the localisation gather step is run, and shouldn’t be hand-edited.
  • {TargetName}.archive:
    • Archives are custom JSON files that store per-culture translations for the text gathered into the manifest.
    • Archives are trimmed each time the localisation gather or import steps are run to remove entries for old sources.
    • Archives may be hand-edited, however we strongly advise against doing so (preferring instead to edit the PO files).
  • {TargetName}.po:
    • PO (Portable Object) files contain the per-culture text to be translated, along with their current translations.
    • PO files are generated by the localisation export step, and are re-imported into the archives by the localisation import step.
    • PO is a common format, and may be edited locally either by hand or via a translation tool like Poedit, or collaboratively via something like OneSky.
  • {TargetName}.locres:
    • LocRes are custom binary files that store the compiled per-culture translations for use at runtime.
    • LocRes files are re-generated each time the localisation compile step is run, and are the only files that get staged into a packaged game build.
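As a quick sanity check of the layout described above, the following standalone C++ sketch lists the files you would expect to find for a target once the pipeline has run. The function name ExpectedTargetFiles is hypothetical and the paths are relative to the project root; it simply mirrors the folder structure shown earlier rather than anything the engine exposes.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical helper: lists the files the pipeline is expected to produce for
// a localisation target, following the Content/Localization/ layout.
std::vector<std::string> ExpectedTargetFiles(const std::string& TargetName,
                                             const std::vector<std::string>& Cultures)
{
    assert(!TargetName.empty());

    const std::string Root = "Content/Localization/" + TargetName + "/";
    const char* const PerCultureExtensions[] = {".archive", ".po", ".locres"};

    std::vector<std::string> Files;
    Files.push_back(Root + TargetName + ".manifest"); // one manifest per target

    for (const std::string& Culture : Cultures)
    {
        for (const char* Extension : PerCultureExtensions)
        {
            // one archive, PO, and LocRes file per culture
            Files.push_back(Root + Culture + "/" + TargetName + Extension);
        }
    }
    return Files;
}
```

For a “Game” target with “en” and “fr”, this gives one manifest plus three files per culture, so seven paths in total.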

What is the localisation dashboard?

The localisation dashboard is a tool that takes care of managing your localisation target configuration. While it’s still classed as “experimental”, it is stable (albeit a little clunky), and we use it internally for all of our games. It’s the recommended way to manage localisation target configuration.

Before you can use it, you’ll need to enable it via “Editor Settings” -> “Experimental” -> “Localization Dashboard”, and once enabled you can access the dashboard via “Windows” -> “Localization Dashboard”.

The dashboard will create a “Game” target for you by default, and unless your game is particularly complex this is likely the only target you will need. You can use the “Gather Text” settings to specify where your source code and assets can be found, and you can use the “Cultures” settings to specify which languages you’re going to localise your game for (you’ll also need to choose a “native” culture - this should be the language you author your content in).

Once your target is set up, you can use the toolbar under the “Cultures” section to Gather, Export, Import, and Compile your game text. This is a process that can be run iteratively over time as new translations become available, or as new source text is added.

Once these actions have been run, you’ll find some INI files under Config/Localization/. These are generated each time those actions are run via the dashboard, and don’t need to be submitted to source control unless you plan to automate your localisation pipeline. Automation is outside the scope of this post, but you can take a look at the “Localise” class (Localisation.Automation.cs) in the Unreal Automation Tool (UAT) for an example of how it would work.

Note: Currently the localisation dashboard and localisation commandlets (which form the localisation pipeline) have two completely different configuration layouts for a localisation target (with the localisation dashboard generating the commandlet version). I plan to unify these at some point into a leaner “convention-based-configuration” localisation target format that removes a lot of the repetition, along with all the old deprecated settings.

 