============
Localization
============

.. toctree::
   :maxdepth: 1

   glossary

The documentation here is targeted at developers writing localizable code
for Firefox and Firefox for Android, as well as Thunderbird and SeaMonkey.

If you haven't dealt with localization in Gecko code before, it's a good
idea to check the :doc:`./glossary` for what localization is and which terms
we use for what.

Exposing strings
----------------

Localizers only handle a few file formats in well-known locations in the
source tree.

The locations are in directories like

    :file:`browser/`\ ``locales/en-US/``\ :file:`subdir/file.ext`

The first thing to note is that only files beneath :file:`locales/en-US` are
exposed to localizers. The second thing to note is that only a few directories
are exposed. Which directories are exposed is defined in files called
``l10n.ini``, which are at a
`few places <https://dxr.mozilla.org/mozilla-central/search?q=path%3Al10n.ini&redirect=true>`_
in the source code.

An example looks like this:

.. code-block:: ini

    [general]
    depth = ../..

    [compare]
    dirs = browser
        browser/branding/official

    [includes]
    toolkit = toolkit/locales/l10n.ini

This tells the l10n infrastructure three things: resolve the paths against the
directory two levels up, include files in :file:`browser/locales/en-US` and
:file:`browser/branding/official/locales/en-US`, and load more data from
:file:`toolkit/locales/l10n.ini`.
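
As a rough illustration of that resolution, the sketch below reads the example
with Python's standard ``configparser``. It assumes that ``configparser``
accepts the ``l10n.ini`` syntax as shown above. The function name
``read_l10n_ini`` and the location :file:`browser/locales/l10n.ini` are
hypothetical and chosen for this example only; the actual tools may work
differently.

.. code-block:: python

    import configparser
    import posixpath

    EXAMPLE = """\
    [general]
    depth = ../..

    [compare]
    dirs = browser
        browser/branding/official

    [includes]
    toolkit = toolkit/locales/l10n.ini
    """


    def read_l10n_ini(text, ini_path):
        """Return (en-US directories, included ini files) for one l10n.ini.

        Hypothetical helper, not part of the l10n tools.
        """
        parser = configparser.ConfigParser()
        parser.read_string(text)
        # "depth" is resolved against the directory holding the ini file.
        base = posixpath.normpath(
            posixpath.join(posixpath.dirname(ini_path), parser["general"]["depth"])
        )
        # "dirs" is a whitespace-separated list that may span several lines.
        modules = parser["compare"]["dirs"].split()
        en_us_dirs = [
            posixpath.normpath(posixpath.join(base, mod, "locales", "en-US"))
            for mod in modules
        ]
        included = [
            posixpath.normpath(posixpath.join(base, path))
            for path in parser["includes"].values()
        ]
        return en_us_dirs, included


    # Assumed location of the example file inside the source tree.
    dirs, includes = read_l10n_ini(EXAMPLE, "browser/locales/l10n.ini")

With the assumed location, ``dirs`` holds the two ``en-US`` directories named
above, and ``includes`` holds the path of the toolkit ``l10n.ini``.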

For projects like Thunderbird and SeaMonkey in ``comm-central``, additional
data needs to be provided when including an ``l10n.ini`` from a different
repository:

.. code-block:: ini

    [include_toolkit]
    type = hg
    mozilla = mozilla-central
    repo = http://hg.mozilla.org/
    l10n.ini = toolkit/locales/l10n.ini

This tells the l10n pieces where to find the repository, and where inside
that repository the ``l10n.ini`` file is. This is needed because for local
builds, :file:`mail/locales/l10n.ini` references
:file:`mozilla/toolkit/locales/l10n.ini`, which is where the comm-central
build setup expects toolkit to be.

Now that the directories exposed to l10n are known, we can talk about the
supported file formats.

File formats
------------

This is just a quick overview. Please check the
`XUL Tutorial <https://developer.mozilla.org/docs/Mozilla/Tech/XUL/Tutorial/Localization>`_
for an in-depth tour.

The following file formats are known to the l10n tool chains:

DTD
    Used in XUL and XHTML, and also for Android native strings.
Properties
    Used from JavaScript and C++. When used from JavaScript, it also comes with
    `plural support <https://developer.mozilla.org/docs/Mozilla/Localization/Localization_and_Plurals>`_.
ini
    Used by the crashreporter and updater. Avoid if possible.
foo.defines
    Used during builds, for example to create :file:`install.rdf` for
    language packs.

Adding new formats involves changing several tools, and is strongly
discouraged.

Exceptions
----------

Generally, anything that exists in ``en-US`` needs a one-to-one mapping in
all localizations. There are a few cases where that's not wanted, notably
around search settings and spell-checking dictionaries.

To enable tools to adjust to those exceptions, there's a Python module,
:py:mod:`filter.py`, implementing :py:func:`test` with the following
signature:

.. code-block:: python

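    def test(mod, path, entity=None):
        # does_not_matter and show_but_do_not_merge are placeholders for
        # project-specific conditions on mod, path and entity.
        if does_not_matter:
            return "ignore"
        if show_but_do_not_merge:
            return "report"
        # default behavior, localizer or build need to do something
        return "error"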

For any missing file, this function is called with ``mod`` being
the *module*, and ``path`` being the relative path inside
:file:`locales/en-US`. The module is the top-level dir as referenced in
:file:`l10n.ini`.

For missing strings, the :py:data:`entity` parameter is the key of the string
in the en-US file.
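
To make the placeholders more concrete, here is a minimal sketch of such a
filter. The paths, the key prefix and the rules themselves are hypothetical
and only illustrate the three return values; they are not taken from an actual
:file:`filter.py`. The function is named ``example_test`` to keep it apart
from a real filter, where it would be called ``test``.

.. code-block:: python

    def example_test(mod, path, entity=None):
        """Hypothetical filter rules for a ``browser`` module."""
        if mod != "browser":
            # Only the browser module gets special treatment in this sketch.
            return "error"
        # Hypothetical rule: spell-checking dictionaries differ per locale,
        # so missing ones do not matter.
        if path.startswith("extensions/spellcheck/"):
            return "ignore"
        # Hypothetical rule: search settings are shown, but not merged.
        if path == "chrome/browser/region.properties":
            if entity is None or entity.startswith("browser.search."):
                return "report"
        # Everything else needs attention from the localizer or the build.
        return "error"

Calling ``example_test("browser", "chrome/browser/browser.dtd")`` falls
through to the default and returns ``"error"``.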

l10n-merge
----------

Gecko doesn't support fallback from a localization to ``en-US`` at runtime.
Thus, the build needs to ensure that the localization as it's built into
the package has all required strings, and that the strings don't contain
errors. To ensure that, we *merge* the localization and ``en-US``
at build time, a step nicknamed :term:`l10n-merge`.

The process is usually triggered via

.. code-block:: bash

    # run from $objdir/browser/locales
    make merge-de LOCALE_MERGEDIR=$PWD/merge-de

It creates another directory in the object dir, :file:`merge-ab-CD`, in
which the modified files are stored. The actual repackaging process looks for
each file in the merge dir first, then in the localization, and then
in ``en-US``. Thus, for the ``de`` localization of
:file:`browser/locales/en-US/chrome/browser/browser.dtd`, it checks

1. :file:`$objdir/browser/locales/merge-de/browser/chrome/browser/browser.dtd`
2. :file:`$(LOCALE_BASEDIR)/de/browser/chrome/browser/browser.dtd`
3. :file:`browser/locales/en-US/chrome/browser/browser.dtd`

and will include the first of those files it finds.
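
The lookup order can be sketched in a few lines of Python. This is an
illustration only: ``candidates_for`` and ``find_localized_file`` are
hypothetical names, and the arguments ``objdir`` and ``l10n_base`` stand in
for the object directory and ``LOCALE_BASEDIR``. The real repackaging is done
by the build system.

.. code-block:: python

    import os.path
    import posixpath


    def candidates_for(objdir, l10n_base, locale, module, path):
        """List merge dir, localization and en-US paths, in lookup order."""
        return [
            posixpath.join(objdir, module, "locales", "merge-" + locale, module, path),
            posixpath.join(l10n_base, locale, module, path),
            posixpath.join(module, "locales", "en-US", path),
        ]


    def find_localized_file(candidates, exists=os.path.isfile):
        """Return the first candidate for which ``exists`` is true, or None."""
        for candidate in candidates:
            if exists(candidate):
                return candidate
        return None

The ``exists`` parameter defaults to a file system check and can be swapped
out, which makes it easy to try the lookup without a full build tree.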

l10n-merge modifies a file if it supports the particular file type, and there
are missing strings which are not filtered out, or if an existing string
shows an error. See the Checks section below for details.
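
The decision described above can be sketched for a file that is already
parsed into a dictionary of strings. This is a simplified model and not the
actual l10n-merge code. It assumes that a string with an error is replaced by
its ``en-US`` value, and it uses two hypothetical callables: ``filter_test``,
which takes only the key, and ``has_error``, which stands in for the checks.

.. code-block:: python

    def merge_strings(reference, localized, filter_test, has_error):
        """Return the merged strings, or None if nothing had to change.

        reference and localized map string keys to values.
        """
        merged = dict(localized)
        changed = False
        for key, ref_value in reference.items():
            if key not in localized:
                # Only missing strings that are not filtered out are merged;
                # "ignore" and "report" leave the localization as it is.
                if filter_test(key) == "error":
                    merged[key] = ref_value
                    changed = True
            elif has_error(key, localized[key]):
                # Assumption: a broken string falls back to the en-US value.
                merged[key] = ref_value
                changed = True
        return merged if changed else None

Returning ``None`` for an unchanged file mirrors the idea that only modified
files end up in the merge dir.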

Checks
------

As part of the build and other localization tool chains, we run a variety
of source-based checks. Think of them as linters.

The suite of checks is usually determined by file type, i.e., there's a
suite of checks for DTD files and one for properties files, etc. The
Android-specific checks are an exception.

Android
^^^^^^^

For Android, we need to localize :file:`strings.xml`. We do so via DTD
files, which mostly works. However, the strings inside the XML file have to
satisfy additional constraints, about quotes for example, that are not part
of XML.
There's probably some historic background on why things are the way they are.

The Android-specific checks are enabled for DTD files that are in
:file:`mobile/android/base/locales/en-US/`.

Localizations
-------------

Now that we have talked in depth about how to expose content to localizers,
where are the localizations?

We host a Mercurial repository per locale and per branch. Most of our
localizations only work starting with aurora, so the bulk of the localizations
is found on https://hg.mozilla.org/releases/l10n/mozilla-aurora/. Several
localizations work continuously with mozilla-central; those
repositories are on https://hg.mozilla.org/l10n-central/.

You can search inside our localized files on
`Transvision <https://transvision.mozfr.org/>`_ and
http://dxr.mozilla.org/l10n-mozilla-aurora/.