Automated localization build tool

From CDOT Wiki
Revision as of 17:48, 13 October 2007 by Armenzg (talk | contribs) (regular expressions information)

Project Name

Automated localization tool - Given a locale and a set of changes (how these will be specified is not yet known), this tool should produce a build of the same language for a different region (e.g. en-IN from en-GB).

Project Description

  • THE BUG - Bug 399014 – we need an l10n-merge tool
  • Learn Python
  • Understand the scripts from the test l10n tools
  • Understand the l10n build system
  • Reproduce en-IN from en-GB
  • Determine what our Python-based system will "do" in the 0.1 release

Tool's 0.1 Release Functionality & Features

  • Should be able to accept a localization file
  • Should be able to accept a Firefox build (e.g. en-GB or en-US)
  • Stores the file's data locally (plain file I/O for a .txt file, or an XML parser for an .xml file). That is, it takes the key/value pairs and stores them in a dictionary (Python's equivalent of a Map in Java) or something along those lines.
  • Reads through every DTD and Properties file in the current directory with the "Parser.py" file
  • Changes the word "color" to "colour" in every DTD file and saves the result
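
As a rough illustration of the features listed above, the following sketch finds the DTD and Properties files in a directory, loads their key/value pairs into dictionaries, and rewrites "color" as "colour" in DTD entity values only, so that entity names such as colorsDialog.title stay intact. It is not our Parser.py: the function names (find_l10n_files, load_entities, load_properties, britishize_dtd, convert_directory) are hypothetical, the entity pattern is the one quoted from Parser.py further down this page, the Properties handling assumes simple key=value lines without continuation lines or escapes, and the code is written for Python 3.

```python
import os
import re

# Entity pattern as quoted from Parser.py on this page (written here as a raw string).
ENTITY = re.compile(r'<!ENTITY\s+([\w\.]+)\s+(\"(?:[^\"]*\")|(?:\'[^\']*)\')\s*>', re.S)


def find_l10n_files(directory="."):
    """Return the sorted .dtd and .properties file names found in `directory`."""
    return sorted(name for name in os.listdir(directory)
                  if name.endswith((".dtd", ".properties")))


def load_entities(dtd_text):
    """Map each DTD entity name to its value, with the surrounding quotes removed."""
    return {m.group(1): m.group(2)[1:-1] for m in ENTITY.finditer(dtd_text)}


def load_properties(text):
    """Map key to value for simple `key=value` lines; '#' comment lines are skipped."""
    pairs = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, value = line.split("=", 1)
        pairs[key.strip()] = value.strip()
    return pairs


def britishize_dtd(dtd_text):
    """Replace color -> colour inside entity values, leaving entity names untouched."""
    def fix(match):
        value = re.sub(r'([Cc])olor', r'\1olour', match.group(2))
        start, end = match.span(2)
        offset = match.start(0)
        whole = match.group(0)
        # Rebuild the declaration around the rewritten value.
        return whole[:start - offset] + value + whole[end - offset:]
    return ENTITY.sub(fix, dtd_text)


def convert_directory(directory="."):
    """Rewrite every .dtd file in `directory` in place."""
    for name in find_l10n_files(directory):
        if not name.endswith(".dtd"):
            continue
        path = os.path.join(directory, name)
        with open(path, encoding="utf-8") as handle:
            original = handle.read()
        with open(path, "w", encoding="utf-8") as handle:
            handle.write(britishize_dtd(original))
```

Restricting the substitution to the quoted value is a design choice of this sketch: a plain search and replace over the whole file would also rename the entities themselves.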

Project Leader(s)

Project Details

  • Our script for now; we will also be posting it on bug 399014
  • We are also waiting for some code that dynamis has been working on in Japan
  • Notes from Axel(pike) about the project
  • Team notes - we collect notes related to the project
  • Armen's MozDev process - diary - You can read notes of what Armen has been trying
  • The l10n tools are in mxr.m.o/mozilla/source/testing/l10n
  • The file we have been using: Parser.py, specifically its DTDParser()
  • mozilla/tools/l10n/l10n.py might give you an idea of what it takes to copy existing data over to a new location
  • To get the l10n tools type: $> cvs -z3 -d:pserver:anonymous@cvs-mirror.mozilla.org:/cvsroot co mozilla/tools/l10n mozilla/testing/tests/l10n


  • Some notes from trying to get en-GB (read more):
* make -f client.mk l10n-checkout MOZ_CO_PROJECT=browser MOZ_CO_LOCALES=en-GB LOCALES_CO_TAG=HEAD
* An option for the .mozconfig: mk_add_options MOZ_OBJDIR=@TOPSRCDIR@/../en-GB
  • To get en-US:
* 
  • To check the completeness of your localization (this takes a long time)
* make -f tools/l10n/l10n.mk check-l10n
  • To fill in what is missing in the source of your localization
* MOZ_CO_PROJECT=browser make -f tools/l10n/l10n.mk create-en-GB

Regular expressions

>> color(s) -> colour(s) -- re.sub(r'([Cc])olor', r'\1olour', instring)
>> dialogue  -> dialog -- re.sub(r'([Dd])ialogue', r'\1ialog', instring)
>> Go forward -> Go forwards -- 
>> Minimize -> Minimise
>> Center -> Centre
>> Organize -> Organise
>> Customize -> Customise
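
The substitutions above can be collected into one table and applied in order. This is a sketch only: the patterns for the entries that have no expression yet (Go forward, Minimize, Center, Organize, Customize) are assumptions rather than tested expressions from the project, written with word boundaries so that "Go forwards" is not changed a second time, and the names SUBSTITUTIONS and apply_substitutions are hypothetical.

```python
import re

# (pattern, replacement) pairs; \1 re-inserts the captured first letter so case is kept.
SUBSTITUTIONS = [
    (r'([Cc])olor', r'\1olour'),                 # from the list above
    (r'([Dd])ialogue', r'\1ialog'),              # from the list above
    (r'\b([Gg])o forward\b', r'\1o forwards'),   # assumption
    (r'\b([Mm])inimize', r'\1inimise'),          # assumption
    (r'\b([Cc])enter\b', r'\1entre'),            # assumption
    (r'\b([Oo])rganize', r'\1rganise'),          # assumption
    (r'\b([Cc])ustomize', r'\1ustomise'),        # assumption
]


def apply_substitutions(text):
    """Apply every substitution in order and return the rewritten text."""
    for pattern, replacement in SUBSTITUTIONS:
        text = re.sub(pattern, replacement, text)
    return text


print(apply_substitutions("Customize the Colors dialogue"))  # Customise the Colours dialog
```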

DTD regular expression analysis

  • An analysis of the regular expression used for DTD entities (from Parser.py)
self.key = re.compile('<!ENTITY\s+([\w\.]+)\s+(\"(?:[^\"]*\")|(?:\'[^\']*)\')\s*>', re.S)
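\s+ - one or more whitespace characters (spaces, tabs, newlines and so on)
([\w\.]+) - group 1, the entity name: one or more characters, each of which is a word character (letter, digit or underscore) or a dot
\s+ - one or more whitespace characters again
(\"(?:[^\"]*\")|(?:\'[^\']*)\') - group 2, the quoted value: the alternatives are tried from left to right, so if the part to the left of '|' matches, the part to the right is not tried
 * \"(?:[^\"]*\") - matches a value enclosed in " and "
  * (?:...) is a non-capturing group, and [^\"]* matches zero or more characters that are not a double quote
 * (?:\'[^\']*)\' - the same, for a value enclosed in ' and '
\s* - zero or more whitespace characters
re.S - makes the dot match newlines as well; it is the short name for the re.DOTALL flag. This pattern has no dot outside a character class, so the flag has no effect here; multi-line values still match because [^\"] and [^\'] also match newlines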
A matching line:
* <!ENTITY  colorsDialog.title              "Colors">
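
To check the analysis above, the pattern can be run against the matching line. The sketch below copies the expression from Parser.py as a raw string (our change; the original is a plain string) and prints the two groups. The second entity, some.label, is made up for illustration; it shows that a single-quoted value spanning two lines also matches, with or without re.S.

```python
import re

PATTERN = r'<!ENTITY\s+([\w\.]+)\s+(\"(?:[^\"]*\")|(?:\'[^\']*)\')\s*>'
key = re.compile(PATTERN, re.S)

line = '<!ENTITY  colorsDialog.title              "Colors">'
match = key.search(line)
print(match.group(1))   # colorsDialog.title
print(match.group(2))   # "Colors"  (the quotes are part of group 2)

# A single-quoted value that spans two lines (made-up entity for illustration).
multi = "<!ENTITY some.label 'first line\nsecond line'>"
print(key.search(multi).group(2) == "'first line\nsecond line'")   # True

# Compiling without re.S gives the same result, because the negated
# character classes match newlines on their own.
no_flag = re.compile(PATTERN)
print(no_flag.search(multi) is not None)   # True
```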

Related regular expressions theory

* (...) - Whatever is inside the parentheses forms a group; the contents of a group can be retrieved after a match has been performed, and can be matched later in the string with the \number special sequence
* (?...) - This is an extension notation. Extensions usually do not create a new group; (?P<name>...) is the only exception to this rule.
* List of supported extensions: (?iLmsux), (?:...), (?P<name>...), (?P=name), (?#...), (?=...), (?!...), (?<=...), (?<!...), (?(id/name)yes-pattern|no-pattern)
* \number - Matches the contents of the group of the same number. Groups are numbered starting from 1.
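
A few small cases may make these rules easier to follow. The strings and patterns below are made up for illustration and are not taken from the l10n tools.

```python
import re

# (...) creates a group whose contents can be read back after the match.
m = re.search(r'<!ENTITY\s+(\w+)', '<!ENTITY title "Colors">')
print(m.group(1))                                    # title

# \1 matches the same text that group 1 matched: here, the same quote character.
quoted = re.compile(r'''(["'])[^"']*\1''')
print(quoted.search('say "Colors" now').group(0))    # "Colors"
mixed = 'mixed "Colors\''                            # opens with " but closes with '
print(quoted.search(mixed) is None)                  # True

# (?:...) groups without capturing, so it does not take a group number.
m = re.search(r'(?:Go )?(forward|back)', 'Go forward')
print(m.groups())                                    # ('forward',)

# (?P<name>...) captures under a name; (?P=name) refers back to it.
m = re.search(r'(?P<q>["\'])(?P<value>.*?)(?P=q)', "'Colours'")
print(m.group('value'))                              # Colours
```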

Project news

News that is common to all collaborators should be written here rather than split between the collaborators:

  • Sep. 24, 2007 - We are going to have a conference call with Michal from the Toronto office
  • Oct. 05, 2007 - Python will be our language of choice for this project. This is a great opportunity to learn it thoroughly, since it will be our first time using it. We determined some main tasks ahead of us before the 0.1 release (the tasks listed in the Project Description).
  • Oct. 07, 2007 - Added a 0.1 Release Functionality & Features section to the wiki so we have a clear description of what our project's 0.1 release should be able to do.
  • Oct. 12, 2007 - Updated 0.1 Release Functionality & Features section. A lot of the 0.1 code has been done.

Bugs

- There are no bugs related specifically to the project, but the following were mentioned in our conversations

Project Contributor(s)