
RE: [ACTION 107] Locale Filter

From: Yves Savourel <ysavourel@enlaso.com>
Date: Thu, 12 Jul 2012 09:08:06 +0200
To: "'Felix Sasaki'" <fsasaki@w3.org>
CC: "'Shaun McCance'" <shaunm@gnome.org>, <public-multilingualweb-lt@w3.org>
Message-ID: <assp.0540ba31be.assp.05404c4813.004401cd5ffd$18a12130$49e36390$@com>
+1 for Basic Filtering then.

It should be easy to implement, and powerful enough to not need to bring up a regex discussion.

 

-ys

 

From: Felix Sasaki [mailto:fsasaki@w3.org] 
Sent: Thursday, July 12, 2012 8:52 AM
To: Yves Savourel
Cc: Shaun McCance; public-multilingualweb-lt@w3.org
Subject: Re: [ACTION 107] Locale Filter

 

To go for filtering, one would probably need to change only a small bit in the definition; see below.

2012/7/12 Yves Savourel <ysavourel@enlaso.com>

Hi Shaun, all,

Thanks for the nice definition.
I have just a few notes:


1- We define 'sublocale' as "A sublocale is any locale that can be formed only by adding subtags.", but we don't use the term anywhere in the text other than in the definition. Maybe we could get rid of the definition? (Definitions are often controversial.)

2- The text "The Locale Filter data category is only valid on element nodes" seems to restrict the selector to pointing at element nodes. Is there a reason for that? I think we should be able to define a locale filter for attributes as well: the translatable ones may need to be excluded just like an element.

3- A bit later we have "...and associates with each node..." I think it associates only with the selected nodes. Maybe the sentence "The Locale Filter data category is only valid on element nodes, and associates with each node a filter type and a locale list." could be something like: "The locale filter data category associates the selected nodes with a filter type and a locale list".

4- (Very minor): I wonder if the paragraph "If the Locale Filter data category is specified multiple times for an element, the normal precedence rules apply, and the value of the locale list with the highest precedence is applied. The locale list is not a combination of values from multiple rules." is helpful: it repeats what section 6.1 says and just adds the bit about not combining rules. Maybe it could be reduced to just state that? Note that we don't warn about combining rules in other data categories like locNote.

5- I agree with "include" and "exclude" rather than "positive" and "negative": it's much better.

6- And now, just for fun: All that same functionality could be defined with a single attribute that holds a simple regular expression. You would try to match a given target locale code against the expression: if it matches it's included, if it doesn't it's not. "all" would be ".*", "none" would be "". Such a mechanism is used in SRX, for example, to select which rules to apply to which languages.
I'm not really saying we should go that way, since a regex has its problems too. But it may be worth a thought, because what I'm really getting at is an implementation question: if I see "fr" does it mean "fr-ca" as well?
This may be answered by the use or not of the BCP47 filtering definition. In any case, we probably want something explicit about this.
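A minimal sketch of the regex idea from point 6, in Python. The function name `locale_matches` is hypothetical, and two behaviours are assumptions rather than anything stated in the thread: the expression must match the whole locale code, and the comparison ignores case.

```python
import re

def locale_matches(pattern: str, target_locale: str) -> bool:
    """Return True if the entire target locale code matches the pattern.

    Assumption: whole-string, case-insensitive matching.
    """
    return re.fullmatch(pattern, target_locale, flags=re.IGNORECASE) is not None

print(locale_matches(".*", "de-CH"))        # "all"  -> True
print(locale_matches("", "de-CH"))          # "none" -> False
print(locale_matches("fr", "fr-CA"))        # bare "fr" -> False
print(locale_matches("fr(-.*)?", "fr-CA"))  # explicit sublocales -> True
```

Under these assumptions a bare "fr" does not cover "fr-ca"; the author of the expression has to spell that out, which is exactly the ambiguity raised above.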

Cheers,
-yves



-----Original Message-----
From: Shaun McCance [mailto:shaunm@gnome.org]
Sent: Thursday, July 12, 2012 5:52 AM
To: public-multilingualweb-lt@w3.org
Subject: [ACTION 107] Locale Filter

NB: I know I'm supposed to write the text for the Locale Filter data category, and I thought action 107 was for that, but I now see it's associated with issue 10, which is different. This is the proposal for Locale Filter.

----------------------

= Locale Filter

== Definition

The Locale Filter data category specifies that a node is only applicable to certain locales, or that it is not applicable to certain locales.

This data category can be used for several purposes, including, but not limited to:

 * Include a legal notice only in locales for certain regions.
 * Drop editorial notes from all localized output.

The Locale Filter data category is only valid on element nodes, and associates with each node a filter type and a locale list.

 

change "locale list" to "basic language range as defined in RFC 4647, section 2.1".

 

The locale filter type can take the following values:

 * "all": The element is included in all locales.
 * "none": The element is included in no locales.
 * "include": The element is only included in locales in the
   locale list, or sublocales of locales in the locale list.
 * "exclude": The element is included in all locales except
   those in the locale list, or sublocales of locales in the
   locale list.

The locale list is a comma-separated list of locale tags from BCP 47. The list MAY contain whitespace, which MUST be ignored.
A sublocale is any locale that can be formed only by adding subtags.

 

Change "The locale list .. adding subtags." to 

"The basic language range is applied for filtering locales, using the basic filtering approach as defined in RFC 4647, sec. 3.3.1."
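
A minimal sketch of basic filtering as described in RFC 4647, sec. 3.3.1, in Python: a range matches a tag if, compared case-insensitively, it equals the tag or equals a prefix of the tag that is immediately followed by "-"; the range "*" matches any tag. The function name `basic_filter_match` is hypothetical.

```python
def basic_filter_match(language_range: str, tag: str) -> bool:
    """Basic filtering: compare one basic language range to one language tag."""
    rng = language_range.strip().lower()
    tag = tag.strip().lower()
    if rng == "*":
        # The wildcard range matches every tag.
        return True
    # Exact match, or a prefix match that ends on a subtag boundary.
    return tag == rng or tag.startswith(rng + "-")

print(basic_filter_match("fr", "fr-CA"))  # True: "fr" covers "fr-CA"
print(basic_filter_match("fr", "fra"))    # False: prefix does not end at "-"
print(basic_filter_match("fr-CA", "fr"))  # False: the range is more specific
```

With this rule, "fr" does include "fr-ca", which settles the implementation question in point 6.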

 

Change "locale list" as needed below.

 

- Felix

 

 

 


If the locale filter type is "all" or "none", a locale list SHOULD NOT be provided. If one is, it MUST be ignored. If the locale filter type is "include" or "exclude", a locale list SHOULD be provided. If one is not, it MUST default to the empty list.
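
A minimal sketch of how the four filter types and the defaulting rules above could be evaluated for one target locale, in Python. The names `parse_locale_list`, `covers`, and `is_included` are hypothetical. The sketch assumes case-insensitive comparison and treats a sublocale as a tag that extends a listed tag by "-" plus further subtags.

```python
def parse_locale_list(raw):
    """Split a comma-separated locale list, ignoring all whitespace."""
    if raw is None:
        return []  # a missing list defaults to the empty list
    compact = "".join(raw.split())
    return [tag for tag in compact.split(",") if tag]

def covers(listed, target):
    """True if target equals the listed locale or is one of its sublocales."""
    listed, target = listed.lower(), target.lower()
    return target == listed or target.startswith(listed + "-")

def is_included(filter_type, raw_list, target):
    """Decide whether a node is included for the given target locale."""
    if filter_type == "all":
        return True   # any locale list is ignored
    if filter_type == "none":
        return False  # any locale list is ignored
    listed = any(covers(tag, target) for tag in parse_locale_list(raw_list))
    if filter_type == "include":
        return listed
    if filter_type == "exclude":
        return not listed
    raise ValueError(f"unknown locale filter type: {filter_type!r}")

print(is_included("include", "en-CA, fr-CA", "fr-ca"))  # True
print(is_included("exclude", "fr", "fr-CA"))            # False
print(is_included("include", None, "fr"))               # False
```

One consequence of the empty-list default: "include" with no list behaves like "none", and "exclude" with no list behaves like "all".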

== Implementation

The Locale Filter data category can be expressed with global rules, or locally on an individual element. The information applies to the textual content of the element, including child elements and attributes. The default is that the locale filter type is "all".

If the Locale Filter data category is specified multiple times for an element, the normal precedence rules apply, and the value of the locale list with the highest precedence is applied. The locale list is not a combination of values from multiple rules.

GLOBAL: The localeFilterRule element contains the following:

 * A required selector attribute. It contains an XPath expression
   which selects the nodes to which this rule applies.
 * A required localeFilterType attribute with the value "all",
   "none", "include", or "exclude".
 * An optional localeFilterList attribute with a comma-separated
   list of locales.

LOCAL: The following local markup is available for the Locale Filter data category:

 * A localeFilterType attribute with the value "all", "none",
   "include", or "exclude".
 * A localeFilterList attribute with a comma-separated list of
   locales.
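
A minimal sketch of reading the local markup and applying the inheritance and default described above, in Python. It handles local attributes only, not global rules. Several things here are assumptions, not part of the proposal: that the attributes are named localeFilterType and localeFilterList and live in the ITS namespace, the sample document itself, and the helper name `effective_filters`.

```python
import xml.etree.ElementTree as ET

# Assumption: the local attributes are in the ITS namespace.
ITS_NS = "http://www.w3.org/2005/11/its"
TYPE_ATTR = f"{{{ITS_NS}}}localeFilterType"
LIST_ATTR = f"{{{ITS_NS}}}localeFilterList"

def effective_filters(root):
    """Map each element to (filter_type, locale_list), inheriting from ancestors."""
    result = {}

    def walk(elem, inherited):
        filter_type = elem.get(TYPE_ATTR)
        if filter_type is None:
            current = inherited  # no local markup: inherit from the parent
        else:
            raw = "".join(elem.get(LIST_ATTR, "").split())  # ignore whitespace
            tags = [t for t in raw.split(",") if t]
            if filter_type in ("all", "none"):
                tags = []  # a list given with "all" or "none" is ignored
            current = (filter_type, tags)
        result[elem] = current
        for child in elem:
            walk(child, current)

    walk(root, ("all", []))  # the default filter type is "all"
    return result

# Hypothetical sample document, for illustration only.
SAMPLE = """
<doc xmlns:its="http://www.w3.org/2005/11/its">
  <legal its:localeFilterType="include" its:localeFilterList="en-CA, fr-CA">
    <p>Notice text</p>
  </legal>
  <note its:localeFilterType="none">Editorial remark</note>
  <p>Body text</p>
</doc>
"""

root = ET.fromstring(SAMPLE)
filters = effective_filters(root)
for elem, (ftype, tags) in filters.items():
    print(elem.tag, ftype, tags)
```

The sample mirrors the two use cases in the definition: a legal notice limited to certain regions, and an editorial note dropped from all localized output.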

--------------------
NB: The requirements document uses "positive" and "negative" instead of "include" and "exclude". I think "include" and "exclude" suggest the functionality more clearly. I added "all", because that is the default behavior, and I think it's important for a rule to be able to reset things to the default.

The requirements document uses a semicolon as a list delimiter rather than a comma. The example given in section 4.3 of BCP 47 uses commas, though it doesn't seem to actually specify it. Commas are also used by the HTTP Accept-Language header. Is there a reason to prefer a semicolon?











 

-- 
Felix Sasaki

DFKI / W3C Fellow

 
Received on Thursday, 12 July 2012 07:08:44 UTC
