BabelFish is a polyglot in the JavaScript world

by admin

The Internet makes our lives global, and many web resources are no longer limited to an audience living in one country and speaking one language. However, maintaining several language versions of a site by hand is unpleasant work and, beyond a certain scale, hardly realistic.
For example, the REG.RU dictionaries currently contain over 15,000 phrases, of which about 200 use declension and over 2,000 use variable substitution. At least 10 phrases are added every day. And that is even though we have only just started localizing the site and have more languages planned.
Although the tasks of internationalizing and localizing software (including on the web) are not new and are, in general, quite standard, there are not many good universal tools for solving them. It is not always easy to pick such a tool for a particular stack of client and server technologies, especially if you want to use the same tool on both sides.
DON’T PANIC.
The BabelFish 1.0 package, designed for internationalizing JavaScript applications, was recently published.
We liked the ideas behind it so much that we even ported them to Perl as the CPAN module Locale::Babelfish, and we use that for our Perl applications. But let’s go back to the JavaScript implementation.

Overview

So what are the features of this library?

  • A very convenient and compact syntax for declension and substitution.
  • Ability to work on both server and client (older browsers will need the es5-shim support package).
  • Automatically converts data structures to a "flat" form.
  • Ability to store and give complex structures instead of text.

Let’s look at some examples of the module’s capabilities. A typical phrase looks like this:

In the sky #{cachalotes_count} ((sperm whale|sperm whales|sperm whales)):cachalotes_count.

The alternatives inside the double parentheses are the plural forms of the noun, and the form is chosen by the value of the variable named after the colon. The example comes from Russian, which needs three plural forms; in this English rendering two of them look the same.
Exact matches and nested interpolation of variables are also supported. A typical case: instead of "0 sperm whales" we want to write "no sperm whales", and instead of "1 sperm whale" just "sperm whale", while other numbers, such as 21, keep the usual spelling of number plus noun:

((=0 no sperm whales|=1 sperm whale|#{count} sperm whale|#{count} sperm whales|#{count} sperm whales))

Note that if the variable is named count, you can omit the colon and the variable name at the end of the plural block.
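To make the selection rule more concrete, here is a minimal sketch of how a form could be chosen for Russian. It is an illustration only, not BabelFish's actual code: the names pluralIndexRu and pickForm are made up for this example, only three forms are handled, and fractional numbers are ignored.

```javascript
// Hypothetical helper: index of the Russian plural form for an integer n.
// 0 = "one" (1, 21, 31...), 1 = "few" (2-4, 22-24...), 2 = "many" (0, 5-20, 25-30...).
function pluralIndexRu(n) {
  const abs = Math.abs(n);
  const mod10 = abs % 10;
  const mod100 = abs % 100;
  if (mod10 === 1 && mod100 !== 11) return 0;
  if (mod10 >= 2 && mod10 <= 4 && (mod100 < 12 || mod100 > 14)) return 1;
  return 2;
}

// Hypothetical helper: exact matches (like =0 or =1) win over the plural rule.
function pickForm(n, forms, exact = {}) {
  if (Object.prototype.hasOwnProperty.call(exact, n)) return exact[n];
  return forms[pluralIndexRu(n)];
}
```

With this rule, 21 falls into the same form as 1, unless an exact match such as =1 is listed, which applies only to the number 1 itself.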
The Babelfish API offers the method t(locale, key, parameters), which resolves a key in a particular locale to the finished text or data structure. A call looks like this:

babelfish.t( 'ru-RU', 'some.complex.key', { a: "test" } );
babelfish.t( 'ru-RU', 'some.complex.key', 17 ); // the variables count and value will both equal 17

To make the code easier to read and to type less, you usually create a helper like this (CoffeeScript):

window.t = t = (key, params, locale) ->
  locale = _locale unless locale?
  babelfish.t.call babelfish, locale, key, params

Here the locale is moved to the end of the argument list and becomes optional. Now the calls can be shorter:

t( 'some.complex.key', { a: "test" } );
// the two calls below are equivalent:
t( 'some.complex.key', 17 );
t( 'some.complex.key', { count: 17, value: 17 } );
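For readers who do not use CoffeeScript, the same wrapper idea can be sketched in plain JavaScript. The factory name makeTranslator and the fakeBabelfish stand-in are assumptions made for this example; the only thing taken from the article is the t(locale, key, params) argument order.

```javascript
// Hypothetical factory: wraps any object exposing t(locale, key, params)
// so that the locale becomes an optional last argument.
function makeTranslator(babelfish, defaultLocale) {
  return function t(key, params, locale) {
    return babelfish.t(locale == null ? defaultLocale : locale, key, params);
  };
}

// Stand-in object used only to make the sketch runnable; it just echoes its arguments.
const fakeBabelfish = {
  t(locale, key, params) {
    return `${locale}:${key}:${JSON.stringify(params)}`;
  },
};

const t = makeTranslator(fakeBabelfish, 'ru-RU');
```

In a real application the stand-in would be replaced by the BabelFish instance, and the default locale would come from wherever the application stores it.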

The downside of such a concise syntax is that translators (the staff who maintain dictionaries and templates) have to get used to it, even though it is not complicated.
One solution is to give translators an interface that shows, in addition to the phrase to be translated, its context, fixtures with typical data used to render it, and a preview area for the result.
It is also useful to provide snippets that insert ready-made constructs for declension and variable substitution.
Now consider how to integrate Babelfish into an application on the browser side.

Installation

Babelfish is available as both an npm package and a bower package. If you need to work with both Node.js and browsers, we recommend the npm package plus browserify (there is an example in the babelfish demo), but most developers will find bower easier.
Here we assume that the current locale is defined as window.lang:

# assets/coffee/babelfish-init.coffee
do (window) ->
  "use strict"
  BabelFish = require 'babelfish'
  locale = switch window.lang
    when 'ru' then 'ru-RU'
    when 'en' then 'en-US'
    else window.lang
  window.l10n = l10n = BabelFish()
  l10n.setFallback 'be-BY', [ 'ru-RU', 'en-US' ]
  window.t = t = (args...) ->
    l10n.t.apply l10n, [ locale ].concat(args)
  null

Storage and compilation of dictionaries

Internal format

Dictionaries are stored in Babelfish’s internal format, which lets you bind not only text but also other data structures to a key. A mechanism for serializing dictionaries to JSON and loading them back is included (stringify/load).
In practice, you can add phrases to dictionaries like this:

babelfish.addPhrase( 'ru-RU', 'some.complex.key', 'key text' );
babelfish.addPhrase( 'ru-RU', 'some.complex.anotherkey', 'other key text' );

Or like this:

babelfish.addPhrase('ru-RU', 'some', {complex: {key: 'key text', anotherkey: 'otherkey text'}});

When adding complex data structures, you can pass the flattenLevel parameter (false or 0) as the last argument:

babelfish.addPhrase('ru-RU', 'myhash', {key: 'key text', anotherkey: 'another key text'}, false);

Then a call to t('myhash') returns an object with the keys key and anotherkey. This is very useful when localizing external libraries (for example, to provide configurations for jQuery UI plugins).
The only requirement for serialization of such data is that it can be represented in JSON format.
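The flattening described above can be pictured with the following sketch. It is a simplified illustration and not BabelFish's internal code: the function name flattenPhrases and the numeric levels argument are assumptions, with 0 meaning "store the value as is".

```javascript
// Hypothetical helper: expands nested plain objects into dotted keys.
// `levels` limits how many nesting levels are expanded (Infinity = all, 0 = none).
function flattenPhrases(prefix, value, levels = Infinity, out = {}) {
  const isPlainObject =
    value !== null && typeof value === 'object' && !Array.isArray(value);
  if (isPlainObject && levels > 0) {
    for (const [key, child] of Object.entries(value)) {
      flattenPhrases(`${prefix}.${key}`, child, levels - 1, out);
    }
  } else {
    // Strings, numbers, arrays and unexpanded objects are stored as they are.
    out[prefix] = value;
  }
  return out;
}
```

Calling flattenPhrases('some', {complex: {key: 'key text'}}) produces a single entry under some.complex.key, while passing 0 as the third argument keeps the whole object under the original key.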
Note that Babelfish parses phrase syntax lazily. For phrases with parameters, a function is generated the first time the phrase is used, and later calls reuse it, so the result is produced quickly. On the one hand this simplifies serialization; on the other hand it can be a problem if you use strict (paranoid) CSP policies that forbid eval and Function() in the browser. The package author is not against implementing a compatibility mode, so if you really need it, just create a ticket in the project tracker.
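The idea of lazy compilation can be sketched as follows. This is not BabelFish's implementation: createLazyStore is a made-up name, the sketch handles only #{name} substitution, and it builds a closure instead of generating code, so it does not reproduce the CSP issue mentioned above.

```javascript
// Hypothetical phrase store: raw strings are kept as-is and compiled on first use.
function createLazyStore() {
  const raw = new Map();      // key -> source string
  const compiled = new Map(); // key -> function(params) -> string
  let compileCount = 0;       // only here to show that compilation happens once

  function compile(source) {
    compileCount += 1;
    // Split into literal text (even indexes) and placeholder names (odd indexes).
    const parts = source.split(/#\{(\w+)\}/);
    return (params) =>
      parts.map((part, i) => (i % 2 === 1 ? String(params[part]) : part)).join('');
  }

  return {
    add(key, source) { raw.set(key, source); compiled.delete(key); },
    render(key, params = {}) {
      if (!compiled.has(key)) compiled.set(key, compile(raw.get(key)));
      return compiled.get(key)(params);
    },
    get compileCount() { return compileCount; },
  };
}
```

Because only the raw strings are stored until a phrase is first rendered, the dictionary stays plain data and can be serialized without any special handling.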

YAML format

For most applications, the YAML format, which is also supported out of the box, is more convenient. I would recommend storing data in this format and compiling it into the internal format before use. In particular, dictionaries can be merged with each other and delivered to the client as plain JavaScript.
Nested YAML keys are converted into a flat structure:

some:
  complex:
    key: "Some text at least of #{count}"

is converted to some.complex.key.
By the way, Babelfish automatically recognizes not only phrases but also lists (as complex data structures), without any explicit instruction. So, if you specify

mylist:
  - british
  - irish

then calling t('mylist') returns ['british', 'irish']. This will come in handy later on.

Transformations of localization phrases

Usually we need to apply additional transformations to phrases before compiling them. Ours include:

  • Conversion from Markdown to HTML;
  • Typography;
  • Adding classes and attributes specific to our BEM implementation.

Automatic typography is useful to everyone, and using the Markdown format makes it easier to both read text and interact with translators.
We put the original dictionaries in the assets/locales directory, converting them further into ready-to-use ones in config/locales.
Clearly, your transformation stack is likely to be different from ours.
And here is an example of compiling YAML dictionaries to Babelfish internal format with Markdown conversion (grunt):

# Gruntfile.coffee
# requires the packages glob, marked, traverse
marked = require 'marked'
traverse = require 'traverse'
grunt.registerTask 'babelfish', 'Compile config/locales/*.<locale>.yaml to Babelfish assets', ->
  fs = require 'fs'
  Babelfish = require 'babelfish'
  glob = require 'glob'
  files = glob.sync '**/*.yaml', { cwd: 'config/locales' }
  # captures: folder (with trailing slash), dictionary name, locale
  reFile = /(^|.+\/)(.+)\.([^.]+)\.yaml$/
  # do not wrap each line with <p>
  renderer = new marked.Renderer()
  renderer.paragraph = (text) -> text
  for file in files
    m = reFile.exec(file)
    continue unless m
    [folder, dict, locale] = [m[1], m[2], m[3]]
    b = Babelfish locale
    # file is relative to config/locales and already includes the folder
    translations = grunt.file.readYAML "config/locales/#{file}"
    # render Markdown in every string value
    traverse(translations).forEach (value) ->
      if typeof value is 'string'
        @update marked( value, { renderer: renderer } )
    b.addPhrase locale, dict, translations
    res = "// #{file} translation\n"
    res += "window.l10n.load("
    res += b.stringify locale
    res += ");\n"
    resPath = "assets/javascripts/l10n/#{folder}#{dict}.#{locale}.js"
    grunt.file.write resPath, res
    grunt.log.writeln "#{resPath} compiled."

Now you can include the finished scripts in your application in any way you like.

Locale selection

The most correct way to choose a locale on the server side is to parse the Accept-Language header. The npm module locale can help with this; you can also look at the source code of nodeca.core.
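As a rough illustration of what such parsing involves, here is a self-contained sketch that does not use the locale module. The function name pickLocale and its arguments are assumptions for this example, and the matching is deliberately simple: exact tag first, then language only.

```javascript
// Hypothetical helper: picks the best supported locale for an Accept-Language header.
function pickLocale(header, supported, defaultLocale) {
  const wanted = String(header || '')
    .split(',')
    .map((item) => {
      const [tag, ...params] = item.trim().split(';');
      const q = params.map((p) => p.trim()).find((p) => p.startsWith('q='));
      const weight = q ? parseFloat(q.slice(2)) : 1;
      return { tag: tag.trim().toLowerCase(), weight: Number.isNaN(weight) ? 0 : weight };
    })
    .filter((item) => item.tag && item.weight > 0)
    .sort((a, b) => b.weight - a.weight); // stable sort keeps header order for ties

  for (const { tag } of wanted) {
    // Exact match first ("ru-ru"), then language-only match ("ru" -> "ru-RU").
    const exact = supported.find((loc) => loc.toLowerCase() === tag);
    if (exact) return exact;
    const lang = tag.split('-')[0];
    const partial = supported.find((loc) => loc.toLowerCase().split('-')[0] === lang);
    if (partial) return partial;
  }
  return defaultLocale;
}
```

A production setup would more likely rely on a maintained module, since real headers contain wildcards and other edge cases that this sketch simply skips.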

Fallback to another locale

Babelfish maintains a list of fallback rules for cases where the desired phrase is missing from the current locale.
For example, for the Belarusian locale we want to take data, in order of priority, from the Belarusian, Russian and English locales:

babelfish.setFallback( 'be-BY', [ 'ru-RU', 'en-US' ] );
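The lookup order this implies can be sketched as a small dictionary with fallback chains. The code below is an assumption about the general technique, not BabelFish's internals; createDictionary and lookup are names invented for the example.

```javascript
// Hypothetical dictionary with fallback chains; not BabelFish's internals.
function createDictionary() {
  const phrases = {};   // locale -> { key: text }
  const fallbacks = {}; // locale -> [locale, ...]

  return {
    addPhrase(locale, key, text) {
      (phrases[locale] = phrases[locale] || {})[key] = text;
    },
    setFallback(locale, chain) {
      fallbacks[locale] = chain.slice();
    },
    lookup(locale, key) {
      // Try the requested locale first, then each fallback in order.
      for (const candidate of [locale, ...(fallbacks[locale] || [])]) {
        const table = phrases[candidate];
        if (table && Object.prototype.hasOwnProperty.call(table, key)) {
          return table[key];
        }
      }
      return undefined;
    },
  };
}
```

With the chain from the example above, a key missing in be-BY is looked up in ru-RU and then in en-US, and a phrase added to be-BY later takes priority over both.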

Localization

In addition to internationalization, we also need to localize the application. In particular, we need to format currencies, dates, and time ranges in a locale-sensitive way.

Localization of dates

Let’s use the slightly modified date formatting data from Rails:

# config/locales/formatting.ru-RU.yaml
date:
  abbr_day_names:
    - Sun
    - Mon
    - Tue
    - Wed
    - Thu
    - Fri
    - Sat
  abbr_month_names:
    - # empty first entry, so that month numbers start at 1
    - Jan.
    - Feb.
    - March
    - Apr.
    - May
    - June
    - July
    - Aug.
    - Sept.
    - Oct.
    - Nov.
    - Dec.
  day_names:
    - Sunday
    - Monday
    - Tuesday
    - Wednesday
    - Thursday
    - Friday
    - Saturday
  formats:
    default: '%d.%m.%Y'
    long: '%-d %B %Y'
    short: '%-d %b'
  month_names:
    - # empty first entry, so that month numbers start at 1
    - January
    - February
    - March
    - April
    - May
    - June
    - July
    - August
    - September
    - October
    - November
    - December
  order:
    - day
    - month
    - year
time:
  am: until noon
  formats:
    default: '%a, %d %b %Y, %H:%M:%S %z'
    long: '%d %B %Y, %H:%M'
    short: '%d %b, %H:%M'
  pm: in the afternoon

# assets/coffee/babelfish-init.coffee
strftime = require 'strftime'
l10n.datetime = ( dt, format, options ) ->
  return null unless dt and format
  # numbers are treated as unix timestamps in seconds
  dt = new Date(dt * 1000) if 'number' == typeof dt
  # a format like "date.short" is resolved through the dictionary
  m = /^([^.%]+)\.([^.%]+)$/.exec format
  format = t("formatting.#{m[1]}.formats.#{m[2]}", options) if m
  # substitute localized names before handing the rest to strftime
  format = format.replace /(%[aAbBpP])/g, (id) ->
    switch id
      when '%a'
        t("formatting.date.abbr_day_names", { format: format })[dt.getDay()] # wday
      when '%A'
        t("formatting.date.day_names", { format: format })[dt.getDay()] # wday
      when '%b'
        t("formatting.date.abbr_month_names", { format: format })[dt.getMonth() + 1] # mon
      when '%B'
        t("formatting.date.month_names", { format: format })[dt.getMonth() + 1] # mon
      when '%p'
        t((if dt.getHours() < 12 then "formatting.time.am" else "formatting.time.pm"), { format: format }).toUpperCase()
      when '%P'
        t((if dt.getHours() < 12 then "formatting.time.am" else "formatting.time.pm"), { format: format }).toLowerCase()
  strftime.strftime format, dt

Now we have a helper:

window.l10n.datetime( unix timestamp or Date object, format_string_or_config )

Similarly, you can build helpers for currencies and other localizable values.
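One possible shape for a currency helper is sketched below. It relies on the standard Intl.NumberFormat API instead of dictionary entries, which is a substitution made for brevity; the article does not say how its authors implemented theirs, and the name formatMoney is hypothetical.

```javascript
// Hypothetical helper built on the standard Intl.NumberFormat API.
// locale is a BCP 47 tag such as 'ru-RU'; currency is an ISO 4217 code such as 'RUB'.
function formatMoney(amount, locale, currency) {
  return new Intl.NumberFormat(locale, { style: 'currency', currency }).format(amount);
}
```

Such a helper could be attached next to the date helper, for example as l10n.money, with the locale taken from the same place as for t.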

Other implementations

The Babelfish parser is built on PEG.js. With some modifications, its grammar can be used in other PEG parsers as well. Since the syntax is not tied to JavaScript and is easy to use, we can expect Babelfish implementations for other platforms to appear.
As I mentioned above, we implemented the Babelfish 1.0 dialect for the Perl language.

Conclusion

To illustrate Babelfish’s capabilities, we have published a small demonstration project that uses marked and jade.
I should add that while we were using Babelfish in our project, some of its capabilities were significantly expanded at our request. For example, storing complex data structures actually came to Babelfish from our Perl project.
As is usually the case with nodeca, the result is a thoughtful, high-quality, forward-looking library. As a reminder, they have also developed such hits as js-yaml, mincer, argparse, pako and markdown-it.
Special thanks to the author of the module, Vitaly Puzrin (@puzrin). This article was prepared with the active participation of the REG.RU development department, in particular: IgorMironov, dreamworker, nugged, TimurN.
