Re: New ITS syntax

From: Felix Sasaki <fsasaki@w3.org>
Date: Thu, 30 Mar 2006 01:41:05 +0900
Message-ID: <442AB8A1.9000809@w3.org>
To: "Lieske, Christian" <christian.lieske@sap.com>
Cc: Yves Savourel <ysavourel@translate.com>, public-i18n-its@w3.org
Hi Christian, all

Although I made the proposal for the "locale" category, I'll take it
back again.

The reason: As Yves pointed out, the concept "locale" is not yet well
understood in the XML world (and elsewhere). I talked a lot about this
to Addison Phillips. He is co-editor of the RFC 3066bis document for
language tags, see
http://www.ietf.org/internet-drafts/draft-ietf-ltru-registry-14.txt .

He and Mark Davis (the other co-editor) pointed me to various problems:

- relation between language and locale?
- should a locale identifier be a stable, registered unit, like a language
identifier, or an adaptable set of properties?
- what properties should be included in the unit / the set?

etc.

The funny thing: the two of them do not yet agree on a single point,
and neither do many others.

Addison warned me strongly that ITS should not try to define something in
this area, and I agree.

Let's hope that Addison and the others will soon have a solution with a
broad base of consensus. Maybe we can point to it in ITS 2.0 ...

A comment below.

Lieske, Christian wrote:
> Hello everyone,
> 
> On the issue of "locale" information:
> 
> From my understanding, ITS should provide a data category which captures the
> source locale and possibly even the target locale(s). I would derive the
> requirement related to target locales for example from the comments which
> Felix got during his visit to Xerox. Furthermore, I guess that 
> http://esw.w3.org/topic/its0503ReqLangLocale really is about that flavour
> of locale requirement (which corresponds to what I have in mind):
> 
> If a localizer does not know that a certain bit of content initially was meant
> for a certain locale, or now has to go into a different locale, then the localizer
> cannot work the way he would need to work.
> 
> Example: If a certain bit of content only was created for Germany in German, then
> a localizer possibly would need to adapt that content if it were to be used in 
> Austria (since in that country people might be more familiar with the term "Mistkübel" 
> than with the term "Papierkorb" which from my understanding roughly correspond to "recycle bin").

You can express that difference with a language identifier, since it
provides a region subtag (for example "de-DE" vs. "de-AT"); see sec. 2.2.4 in
http://www.ietf.org/internet-drafts/draft-ietf-ltru-registry-14.txt .
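
As a rough illustration only: the sketch below uses a hypothetical helper,
split_tag (not part of ITS or of the draft), to pull the language and region
subtags out of a tag so that German for Germany and German for Austria can be
told apart. It assumes well-formed tags and covers only a simplified subset of
the syntax in the draft (no validation against the registry, no grandfathered
tags).

```python
def split_tag(tag):
    """Return (language, region) for a simple language tag.

    Simplified assumption: the region is the first subtag after the
    language that is either two letters or three digits. Scanning stops
    at a one-character subtag, since that starts an extension or a
    private-use sequence.
    """
    parts = tag.split("-")
    language = parts[0].lower()
    region = None
    for sub in parts[1:]:
        if len(sub) == 1:
            break  # extension or private-use section begins here
        if (len(sub) == 2 and sub.isalpha()) or (len(sub) == 3 and sub.isdigit()):
            region = sub.upper()
            break
    return language, region


# Same language, different region: a localizer can see the difference.
source = split_tag("de-DE")
target = split_tag("de-AT")
needs_regional_adaptation = source[0] == target[0] and source[1] != target[1]
```

With these two tags, needs_regional_adaptation is true: the language subtag
is the same and only the region differs, which is the Germany/Austria case
Christian describes.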

> 
> I would argue that the availability of this type of locale information would be valuable
> even if it cannot yet be provided by means of standard values.
> 
> I am not sure whether Richard had this kind of information in mind for the language information
> http://www.w3.org/International/its/itstagset/itstagset.html#datacat-lang

I agree it would be nice to have a means to differentiate between
language and locale more clearly. It is just not possible yet :( , and
I'm afraid we are not in a position to break new ground here.

Regards, Felix.

> 
> Best regards,
> Christian
> -----Original Message-----
> From: public-i18n-its-request@w3.org [mailto:public-i18n-its-request@w3.org] On Behalf Of Felix Sasaki
> Sent: Wednesday, 29 March 2006 03:52
> To: Yves Savourel
> Cc: public-i18n-its@w3.org
> Subject: Re: New ITS syntax
> 
> Hi Yves,
> 
> Many thanks!
> 
> Yves Savourel wrote:
>> Hi Felix, all,
>>
>> Looking at 
>> http://lists.w3.org/Archives/Public/public-i18n-its/2006JanMar/0301.html
>> For tomorrow item #4 of the agenda.
>>
>> Here are some comments:
>>
>>> #About rubyRule:
>>> ... 
>>> <its:rubyRule its:selector="//span[@class='ruby']"
>>>  its:rubyBaseMap="span[@class='rubyBase']"
>>>  its:rubyTextMap="span[@class='rubyText']"/>
>> Just a reminder: Don't forget the <rp> element that exists in the W3C ruby module.
>> (And what about complex ruby constructs?)
> 
> thanks for the reminder. I have integrated
> http://www.w3.org/TR/ruby/#abstract-def below. I used "xyzPointer", see
> discussion at http://www.w3.org/Bugs/Public/show_bug.cgi?id=3017 .
> 
> I also created localRuby, see below.
> 
>>
>>> #About langRule: The element langRule is used to express 
>>> that a given piece of content (selected by the attribute langMap)
>>> is used to express language information as defined by RFC 3066 
>>> or its successor. Example:
>>> <its:langRule its:selector="//p" its:langMap="@mylangattribute"/>
>>> ...
>>> #About localeRule: The element localeRule is used to express that
>>> a given piece of content (selected by the attribute localeMap) is 
>>> used to express locale information. Example: <its:localeRule 
>>> its:selector="//p" its:localeMap="@mylocaleattribute"/>.
>> Mmmm... I guess langMap and localeMap stay named like this while the others change to xyzPointer/PassThrough/Etc. 
> 
> no, I would change everything to "pointer", hence: its:langPointer .
> 
> One question: if
>> the content of langMap is always an attribute, why the '@'? Is the value an XPath expression or the name of the equivalent
>> attribute/element?
> 
> an XPath expression. It has the same meaning as all "pointer"
> attributes: an XPath expression relative to the node(s) selected by
> its:selector, e.g.
> 
> <ns1:p myLangAtt=".."> ...
> 
> would be
> <its:langRule its:selector="//ns1:p" its:langPointer="@myLangAtt"/>
> 
>> Something tells me that involving 'locale' before there is a clear consensus in the XML world on what it is, and before there is
>> an RFC 3066-like reference for the values, is a bad idea.
> 
> you are right, I have dropped locale. Btw., Addison Phillips has given
> me the same feedback :(
> 
>> When you say "The value of @mylocaleattribute might be compliant to RFC 3066 bis, but this is not mandatory." then it means
>> basically "use whatever value you want", and that makes it non-interoperable. I can understand <langRule> because it maps to the
>> XML/ITS-recommended way to specify language and there is a value set defined for it. But what is the use case for mapping a
>> user-defined locale to ...nothing interoperable? Knowing the name of the attribute used for specifying the locale is not enough: one
>> needs a defined set of values. To me, having localeMap may raise the false hope that ITS provides some kind of interoperable locale
>> concept.
>>
>>
>> Cheers,
>> -yves
>>
>>
> 
> Below the corrected syntax. A question: What is the latest state for
> withinTextRule?
> 
> cheers,
> 
> Felix
> 
> 
> namespace its = "http://www.w3.org/2005/11/its"
> 
> # having itsGlobal as the entry point of the schema serves as a wrapper
> # schema for an external rules file.
> 
> start = itsGlobal
> 
> itsGlobal = element its:rules { ns*, rule+ }
> 
> ns = element its:ns { attribute its:prefix { xsd:NCName }, attribute
> its:uri {
> xsd:anyURI } }
> 
> selector = attribute its:selector { text }
> 
> rule = translateRule | locInfoRule | dirRule | termRule |
> langRule | rubyRule | withinTextRule
> 
> translateRule = element its:translateRule { selector, attribute
> its:translate {
> "yes" | "no" } }
> 
> #About locInfoRule: At the locInfoRule element, there must be either a
> #locInfo element [not attribute] or a locInfoRef attribute. If neither is
> #present, there must be either a locInfoPointer attribute or a
> #locInfoRefPointer attribute. There is an optional locInfoType attribute.
> 
> locInfoRule = element its:locInfoRule { selector, attribute its:locInfoRef {
> xsd:anyURI }?, attribute its:locInfoRefPointer { text }?, attribute
> its:locInfoPointer {
> text }?, attribute its:locInfoType { "alert" | "description" }?, element
> locInfo { text }? }
> 
> dirRule = element its:dirRule { selector, attribute its:dir { "ltr" |
> "rtl" |
> "lro" | "rlo" } }
> 
> #About termRule: In an instance document, we would need an attribute
> #term="yes" to indicate a term. In the global rule, "being" a term is
> #expressed via the name of the element termRule, hence the attribute
> #term="yes" is not necessary any more. The attributes termRef and
> #termRefPointer are alternatives. It is an error if they occur at the same
> #termRule element.
> 
> termRule = element its:termRule { selector, attribute its:termRef {
> xsd:anyURI
> }?, attribute its:termRefPointer { text }? }
> 
> #About langRule: The element langRule is used to express that a given
> #piece of content (selected by the attribute langPointer) is used to express
> #language information as defined by RFC 3066 or its successor. Example:
> #<its:langRule its:selector="//p" its:langPointer="@mylangattribute"/>
> #expresses that all p elements (including attributes and textual content
> #of child elements) have a language value conformant to RFC 3066 or its
> #successor. The value is given by the @mylangattribute attached to the p
> #elements.
> langRule = element its:langRule { selector, attribute its:langPointer {
> text } }
> 
> 
> #About rubyRule: The element rubyRule is used (1) to map existing ruby
> #markup to ITS ruby, which itself is defined in terms of the W3C ruby
> #specification, or (2) to add ruby text to attribute values. Example for
> #(1): <its:rubyRule its:selector="//span[@class='ruby']"
> #its:rbPointer="span[@class='rubyBase']"
> #its:rtPointer="span[@class='rubyText']"/> . Example for (2):
> #<its:rubyRule its:selector="/body/img[1]/@alt" its:rbPointer="."
> #its:rt="World Wide Web Consortium"/> . It is an error if both an
> #its:rt attribute and an its:rtPointer attribute occur at the
> #same <its:rubyRule> element.
> rubyRule = element its:rubyRule { selector, attribute its:rubyPointer {
> text }?,
> attribute its:rbPointer { text }?, attribute its:rtPointer { text }?,
> attribute its:rpPointer { text } ?,
> attribute its:rbcPointer { text } ?, attribute its:rtcPointer { text }?,
> attribute its:rubyText { text }? }
> 
> 
> 
> #About withinTextRule: withinTextRule is based on Yves / AZ proposal,
> #see http://www.w3.org/Bugs/Public/show_bug.cgi?id=2878
> withinTextRule = element its:withinTextRule { selector }
> 
> #Local usage of ITS is via itsLocalAttributes, or ruby. Just for
> #convenience, the span element contains the itsLocalAttributes.
> itsLocal = element its:span { itsLocalAttributes, text } | rubyLocal
> 
> itsLocalAttributes = translateLocal | locInfoLocal | dirLocal | termLocal
> translateLocal = attribute its:translate { "yes" | "no" }?
> 
> #About locInfoLocal: There must be either a locInfo attribute or a
> #locInfoRef attribute. There is an optional locInfoType attribute.
> locInfoLocal =  attribute its:locInfo { text }?, attribute its:locInfoRef {
> xsd:anyURI }?, attribute its:locInfoType { "alert" | "description" }?
> 
> dirLocal = attribute its:dir { "ltr" | "rtl" | "lro" | "rlo" }?
> 
> #About termLocal: the attribute term is mandatory, the attribute termRef
> #is optional.
> termLocal =  attribute its:term { "yes" }, attribute its:termRef {
> xsd:anyURI }?
> 
> #On ruby: todo: still need to write the ruby content model, which is
> #identical to w3c ruby, and global ruby rules.
> 
> # rubyLocal is defined in terms of
> # http://www.w3.org/TR/ruby/#definition. The (rbc, rtc, rtc?) alternative
> # of the content model for the ruby element corresponds to complex ruby
> # markup. The minimal content model for the ruby element is (rb, (rt |
> # (rp, rt, rp))).
> rubyLocal = element its:ruby { RubyCommonAtts, ((rb, (rt | (rp, rt,
> rp))) | (rbc, rtc, rtc?)) }
> rbc = element its:rbc { RubyCommonAtts, rb+ }
> rtc = element its:rtc { RubyCommonAtts, rt+ }
> rb = element its:rb { RubyCommonAtts, inline* }
> rt = element its:rt { RubyCommonAtts, attribute its:rbspan { text },
> inline* }
> rp = element its:rp { RubyCommonAtts, text }
> inline = text
> RubyCommonAtts = itsLocalAttributes
> 
> 



Received on Wednesday, 29 March 2006 16:41:25 UTC
