
Re: [all] suggestions for consolidating requirements

From: Shaun McCance <shaunm@gnome.org>
Date: Fri, 27 Apr 2012 18:20:22 -0400
To: public-multilingualweb-lt@w3.org
Message-ID: <1335565222.2136.923.camel@recto>

On Thu, 2012-04-26 at 13:23 +0100, David Lewis wrote:
> DaveL, Moritz: Could localeSpecificContent be consolidated with
> dropRule, e.g. specifying the content should be dropped for specific
> locales, for every translation, or for every translation except
> specified locales?

I'll weigh in on this, as the person behind dropRule. I think dropRule
might be redundant with localeSpecificContent, depending on how the
processing expectations are defined. In fact, I'd considered proposing
exactly this earlier.

localeSpecificContent is more powerful, of course, and can be used to
exclude something only from certain translations.

To understand how dropRule is implemented in itstool, and how I would
implement localeSpecificContent, let me outline the workflow:

A) People write XML source content, possibly augmented with custom
   ITS rules and attributes.
B) We generate a POT file from the source content using built-in ITS
   rules plus any custom rules or attributes. A POT file is not tied
   to any language. It just contains the strings to translate along
   with some metadata about the strings.
C) Translators either create a PO file from the POT file, or they
   merge the POT file with their existing PO file. PO files are where
   actual translations are kept. There's one PO file per language.
   They do whatever they do to make good translations.
D) We extract the translated strings from the PO files and rewrite
   the source XML files, again applying ITS rules to see what to
   apply where.

I know a lot of the new stuff being worked on by the working group
deals with what happens in (C), but everything for this (and really,
for ITS 1.0 in general) is in (B) and (D). Here's what happens with
dropRule:

B) Dropped content is effectively treated as non-translatable content,
   and is not included in the POT file.
D) Dropped content is actually dropped from the info set when content
   is merged from the PO files.
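
As a rough illustration of those two steps, here is a minimal Python
sketch. It is not itstool's actual code: the DROPPED_TAGS set is a
hypothetical stand-in for whatever a drop rule's selector matches, the
sample document is made up, and only leading element text is handled.

```python
import xml.etree.ElementTree as ET

# Made-up sample document; "comment" plays the role of dropped content.
SOURCE = """<page>
  <title>Install</title>
  <comment>Reviewer note: check the screenshots.</comment>
  <p>Run the installer.</p>
</page>"""

# Hypothetical stand-in for the nodes selected by a drop rule.
DROPPED_TAGS = {"comment"}


def extract_messages(elem, messages=None):
    """Step (B): collect strings for the POT file, skipping dropped subtrees."""
    if messages is None:
        messages = []
    if elem.tag in DROPPED_TAGS:
        return messages
    text = (elem.text or "").strip()
    if text:
        messages.append(text)
    for child in elem:
        extract_messages(child, messages)
    return messages


def merge_translations(elem, translations):
    """Step (D): remove dropped elements and translate what remains."""
    for child in list(elem):
        if child.tag in DROPPED_TAGS:
            elem.remove(child)
        else:
            merge_translations(child, translations)
    text = (elem.text or "").strip()
    if text in translations:
        elem.text = translations[text]
    return elem
```

Calling extract_messages(ET.fromstring(SOURCE)) leaves the comment text
out of the message list, and merge_translations() removes the comment
element from the tree entirely.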

With localeSpecificContent, I can't do the same thing in (B), because
I'm generating a template file used by all translations. The most I
can do there is put a comment on the string to the effect of "This
string will only be used by ... translations." In (D), I can do the
same thing as with dropRule, because by then I know which language
I'm merging.
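
To make the comment idea concrete, here is a small Python sketch that
formats a POT entry with an extracted comment (the "#." prefix in PO
files). The function name and its locales parameter are hypothetical,
and multi-line msgids are not handled.

```python
def format_pot_entry(msgid, locales=None):
    """Format one POT entry, noting any language restriction as a comment."""
    lines = []
    if locales:
        # "#." marks a comment extracted for translators in PO/POT files.
        lines.append(
            "#. This string will only be used by %s translations."
            % ", ".join(locales)
        )
    # Escape backslashes and double quotes inside the msgid string.
    escaped = msgid.replace("\\", "\\\\").replace('"', '\\"')
    lines.append('msgid "%s"' % escaped)
    lines.append('msgstr ""')
    return "\n".join(lines)
```

For example, format_pot_entry("Hello", ["de", "fr"]) yields an entry
whose first line tells translators that only de and fr will use it.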

*However*, one of the messy bits of reality I've had to deal with in
itstool is that the language codes used in XML are not the same as
the POSIX locale codes used on GNU/Linux and Unix-like systems. I get
PO files with a POSIX locale specified, and I have to convert that to
a BCP47 language tag for e.g. xml:lang attributes.
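
A minimal sketch of that kind of conversion, in Python, might look like
the following. It is an assumption about the general shape of the
problem, not itstool's code: the MODIFIER_TO_SCRIPT table is a
hypothetical, deliberately tiny mapping, and real data has more corner
cases than this covers.

```python
# Hypothetical mapping from POSIX @modifiers to BCP 47 script subtags.
MODIFIER_TO_SCRIPT = {"latin": "Latn", "cyrillic": "Cyrl"}


def posix_to_bcp47(locale):
    """Convert a POSIX locale name such as 'sr_RS@latin' to a language tag."""
    if locale in ("C", "POSIX"):
        # These names do not identify a language at all.
        raise ValueError("%r does not name a language" % locale)
    # POSIX form: language[_territory][.codeset][@modifier]
    base, _, modifier = locale.partition("@")
    base = base.split(".", 1)[0]  # drop the codeset, e.g. '.UTF-8'
    language, _, territory = base.partition("_")
    parts = [language.lower()]
    script = MODIFIER_TO_SCRIPT.get(modifier.lower())
    if script:
        parts.append(script)
    if territory:
        parts.append(territory.upper())
    return "-".join(parts)
```

The result is what would go into something like an xml:lang attribute
when the translated file is written out.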

As a result, I already have special handling for the identifier "C",
which in POSIX locales means "whatever the source language is". So
at least for itstool users, I know that none of the translations are
in the "C" locale. That means I can reliably drop strings from the
POT files in (B) with this:

<its:localeSpecificContent selector="//mal:comment" locale="C"/>

So basically, if the specification for localeSpecificContent is
worded in a way that makes what I've outlined here conformant,
then I don't need dropRule.
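
Sketched in Python, the processing I have in mind could look roughly
like this. The function names are hypothetical, and two details are
assumptions rather than anything the requirement currently says: that
the locale attribute is a comma-separated list, and that a tag matches
a target language exactly or as a prefix.

```python
def _tags(locale_attr):
    """Split a locale attribute value into its individual tags."""
    return [tag.strip() for tag in locale_attr.split(",") if tag.strip()]


def include_in_pot(locale_attr):
    """Step (B): leave a string out of the POT file only if it is
    restricted to 'C' alone, since no translation is in that locale."""
    return set(_tags(locale_attr)) != {"C"}


def applies_to(locale_attr, target):
    """Step (D): decide whether restricted content is kept for target."""
    target = target.lower()
    return any(
        target == tag.lower() or target.startswith(tag.lower() + "-")
        for tag in _tags(locale_attr)
    )
```

With locale="C", include_in_pot() returns False and applies_to()
returns False for every real target language, which is the dropRule
behavior.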

Postscript: Language and locale often get jumbled up, but I don't
personally think any of this has anything to do with locales. This
is about language-specific content, and the locale attribute holds
a list of language tags.

Received on Friday, 27 April 2012 22:20:48 UTC
