- From: Felix Sasaki <fsasaki@w3.org>
- Date: Wed, 01 Mar 2006 20:24:51 +0900
- To: public-i18n-its@w3.org
- Message-ID: <44058483.9090500@w3.org>
Hi all,

This is the summary of the ITS f2f, Mandelieu, February / March 2006. It encompasses our change proposals and mostly "cleared" minutes (some topics which I found hard to summarize are here "as is").

Cheers,

Felix

---------------------------------------------
action items
---------------------------------------------

The action items from the f2f and the call yesterday:

- all: discuss the change proposals from the f2f within the next two weeks. I would propose that we give ourselves a deadline for this discussion, e.g. until the ITS call on 15 March. If we don't agree on a proposal, we should just drop it. On Friday this week I will make a bugzilla entry for each proposal which needs more discussion.
- action: editors of the techniques document to give examples of how to use its:locInfoRef (see below)
- action: decide if we need the distinction between "alert" and "description" for localization information.
- action: Richard to describe an additional level of conformance for Ruby.
- action: Felix to update bugzilla with open issues, "ITS 1.1 / 2.0" proposals, the change proposals above
- action: Christian and Felix need to update the spec with the result of their conformance discussion.
- action: All to think about f2f (April & June), see mail from Yves

I have marked all change proposals with "proposal-xx". There are 9 proposals. The discussion from yesterday's call is marked as "Discussion during the call on Tuesday:". Could you please go through the proposals by Friday and write a mail with s.t. like

proposal-01: agree
proposal-02: agree
...

or instead of "agree": "needs more discussion", or some comment with agreement.

Note: most of the proposals concern syntactic simplifications or "making clearer" of global rules; they add *no* new functionality. The main functionality-related proposal is proposal-05.
---------------------------------------------
Simplification and clarification of global markup for selection
---------------------------------------------

************
proposal-01: have only one data category per <documentRule> element
************

We observed that there is no need for selector attributes with data category specific names in a global position. We propose to define that each <documentRule> element is used for only *one* data category at a time. Hence, we can simplify the definition of <documentRule> from

<documentRule> contains data category + various data category specific selector attributes

to

<documentRule> contains one data category attribute + *the* its:selector attribute.

Examples:

1. translatability: <its:documentRule its:selector="//p" its:translate="yes"/>
2. localization information: <its:documentRule its:selector="//*" its:dir="ltr"/>
3. terminology: <its:documentRule its:selector="//qterm" its:term="yes"/>
4. directionality: <its:documentRule its:selector="//*" its:dir="rtl"/>
5. ruby: <its:documentRule its:selector="/body/img[1]/@alt" its:rubyText="Some ruby text"/>

************
proposal-02: instead of <documentRule>, use elements with data category specific names
************

We would propose to replace <its:documentRule> with a set of elements: *one* element for each data category. Example:

1. translatability: <its:translateRule its:selector="//p" its:translate="yes"/>
2. localization information: see below.
3. terminology: <its:termRule its:selector="//qterm" its:term="yes"/>
4. directionality: <its:dirRule its:selector="//*" its:dir="rtl"/>
5. ruby: <its:rubyRule its:selector="/body/img[1]/@alt" its:rubyText="Some ruby text"/>

In this way, it is easier to validate global rules (e.g. to make sure that @its:translate only occurs on the <its:translateRule> element).
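The validation benefit claimed for proposal-02 can be illustrated with a minimal sketch. It is not part of the proposal: the wrapper element <its:documentRules>, the ALLOWED table and the function check_rule are assumptions made for illustration, and the namespace URI is the one used in the test suite example further down in this mail.

```python
import xml.etree.ElementTree as ET

# Namespace URI as used in the test suite example in this mail.
ITS = "http://www.w3.org/2005/11/its"

# Hypothetical table: rule element name -> the one data category
# attribute it may carry under proposal-01 / proposal-02.
ALLOWED = {
    "translateRule": "translate",
    "termRule": "term",
    "dirRule": "dir",
    "rubyRule": "rubyText",
}

def check_rule(rule):
    """Return a list of problems found on one rule element."""
    local = rule.tag.split("}")[-1]
    if local not in ALLOWED:
        return [f"unknown rule element: {local}"]
    problems = []
    if f"{{{ITS}}}selector" not in rule.attrib:
        problems.append(f"{local}: missing its:selector")
    for name in rule.attrib:
        attr = name.split("}")[-1]
        if attr != "selector" and attr != ALLOWED[local]:
            problems.append(f"{local}: unexpected attribute its:{attr}")
    return problems

rules = ET.fromstring(f'''
<its:documentRules xmlns:its="{ITS}">
  <its:translateRule its:selector="//p" its:translate="yes"/>
  <its:dirRule its:selector="//*" its:dir="rtl"/>
  <its:termRule its:selector="//qterm" its:translate="yes"/>
</its:documentRules>''')

for rule in rules:
    for problem in check_rule(rule):
        print(problem)
```

Only the third rule is reported, because its:translate sits on a termRule element. With a single generic <its:documentRule> the same check would have to inspect combinations of attributes instead of one element name.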
************
proposal-03: create a child element <locInfo> for global localization information
************

For the expression of localization information in global rules, we would propose an element with a child element:

<its:locInfoRule its:selector="//p">
<its:locInfo>Some localization information</its:locInfo>
</its:locInfoRule>

In this way, we avoid natural language text as attribute content, at least for global rules (that was the case before with @its:locInfo).

************
proposal-04: have an attribute @its:locInfoRef for localization information globally / locally
************

In addition to having the localization information in local position in an attribute (bad for translatability!), we propose to have an attribute @its:locInfoRef which contains a URI. This allows for very different usage scenarios: localization information can be in a database, in an external XML file, on a web site, or in the same document.

action: to give examples in the techniques document of how to use this.

Example:

<text>
<joke its:locInfoRef="http://www.example.com/klingons#humor">Three men went to a pub: a Klingon, ... </joke>
</text>

The URI in @its:locInfoRef is resolved to: "In Star Trek, Klingons are known for having no sense of humor. (note: Germans might be more appropriate here)"

************
proposal summary 01-04:
************

The new content model for documentRule, reflecting proposals 01-04, is:

documentRule = {translate | (locInfo,locInfotype?(maybe optional)) | (term,termRef?) | dir | ruby}
translate = element translateRule { attribute selector {...}, attribute translate {"yes"|"no"}}
locInfo = element locInfoRule { attribute selector {...}, attribute locInfoRef { xsd:anyURI}?, element locInfo { text }}
term = element termRule { attribute selector {...}, attribute term {"yes"} }
termRef = element termRefRule { attribute selector {...}, attribute termRef {xsd:anyURI} }
dir = element dir { attribute selector {...}, attribute dir {"ltr"|"rtl"|"lro"|"rlo"}}

************
proposal-05: Separating the task of globally identifying+adding ITS information to XML nodes from the task of globally identifying+mapping data categories to XML nodes. Having a set of "map" attributes (e.g. "@its:translateMap") for the mapping task.
************

Background: Yves requested at some point the @its:locInfoContent attribute, to be able to refer to existing localization information in a document, rather than "adding" this information to the document. We would propose to generalize this requirement and distinguish between:

- identifying+adding information to nodes in an XML document (which all existing global rules, except @locInfoContent, do)
- identifying+mapping data categories to nodes in an XML document (e.g. saying "this existing node is mapped to the localization information data category", or "this node has the 'meaning' of the localization information data category").

Purpose of mapping: ITS data categories are used to "normalize" a document, that is, to say "this kind of existing markup has the meaning of this ITS data category". Mapping makes meaning explicit, but does not add information.
Example of the need to separate mapping and adding information:

<its:documentRule its:dir="ltr" its:dirSelector="//*[@dir='ltr']"/>

is not mapping, and it opens the door for errors (via the repetition of "ltr" in both attribute values). Instead, we propose separate mapping attributes, one for each part of a data category:

<its:dirRule its:selector="//*" its:dirMap="@dir"/>

The attribute for mapping contains a relative location path (relative to the nodes which are selected by the its:selector attribute). It would not be enough to have only one XPath expression in the selector attribute, as the case below shows:

<span class="ruby">...
<its:documentRule selector="//span" rubyTextMap="@class='rt'"/>

(The span elements are identified by the XPath expression in the selector attribute. The mapping to its:rubyText is done via the XPath expression in the rubyTextMap attribute.)

Benefit: People who already have ITS-related markup in their schema (e.g. the "translate" attribute in DITA, ruby in opendoc) can be convinced to adopt ITS not by changing their schema, but by making the semantics of their existing markup declarations clear with the separate documentRule element. Example:

<its:documentRule its:selector="//*" its:translateMap="@dita:translate"/>
(saying "dita:translate" has the semantics of "its:translate")

<its:documentRule its:selector="//odf:ruby" its:rubyMap="."/>
(saying "the odf ruby element has the semantics of the its:ruby element")

Widespread adoption of ITS becomes easier.

Influence of this change on the working draft: We propose to change the description of the general mechanisms for global rules (i.e. integrate the difference "adding information" versus "mapping"), and to show with *non-normative examples* for each data category, in the data category sections, how these mechanisms can be used. Input to the non-normative examples: see below.

The following is a walk through all data categories, to see how this proposal works.
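As a side note on the mechanism itself, the two-step evaluation (the selector picks nodes, the map attribute is evaluated relative to each picked node) could be sketched as below. This is only an illustration under assumptions: the function apply_translate_map is hypothetical, only map expressions of the form "@name" are handled, and the selector is written as ".//p" because Python's ElementTree supports only a subset of XPath.

```python
import xml.etree.ElementTree as ET

def apply_translate_map(root, selector, translate_map):
    """Map an existing attribute onto the translatability data category.

    Nothing is added to the document. The result is a side table from
    each selected node to the value the host markup already carries.
    """
    if not translate_map.startswith("@"):
        raise ValueError("this sketch only supports '@attribute' map expressions")
    attr = translate_map[1:]
    result = {}
    for node in root.findall(selector):
        value = node.get(attr)  # evaluated relative to the selected node
        if value is not None:
            result[node] = value
    return result

doc = ET.fromstring(
    '<text><p translation="yes">a</p><p translation="no">b</p><p>c</p></text>'
)
mapped = apply_translate_map(doc, ".//p", "@translation")
for node, value in mapped.items():
    print(node.text, value)
```

The third <p> has no translation attribute, so it gets no entry; what should happen in that case (defaults, inheritance) is exactly the precedence question discussed later in this mail.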
Note: the markup change proposal to have different element names for global rules is not implemented below, so you still have e.g. its:documentRule instead of its:translateRule.

---------------------------------------------
Single data categories: Translatability
---------------------------------------------

- Scenario: there is no translatability information in the document. Tasks for ITS: identify and add information. Example:

<p>...</p>
<documentRule its:selector="//p" its:translate="yes" />

- Scenario: there is no translatability information in the document, but there is different markup which you can identify. Tasks for ITS: identify and add information. Example:

<p class="translate">... </p>
<documentRule its:selector="//p[@class='translate']" its:translate="yes" />

- Scenario: there are the same values, but a different name which does not match ITS. Tasks for ITS: identify, map, add information. Example:

<p translation="yes">...</p>
<p translation="no">...</p>
<documentRule its:selector="//p" its:translateMap="@translation"/>

Discussion during the call on Tuesday:
[[Yves: why do you still have the selector? Why not a translateMap attribute only?
Richard: translateMap allows you to specify several attributes
Christian: no more qualified names?
Richard: qualified names are still possible
.. as for mapping:
.. mapping only works if the semantics are really identical
Sebastian: it makes a specific assertion that it is really identical
.. useful e.g. if you just want to use the elements / attributes in your own namespace
.. as it stands, you can make formally clear that this is identical
.. processing of "mapping" does not mean adding extra nodes
Christian: if my host vocabulary has an attribute 'translation'
.. if we have a discrepancy with values?
Richard: that is the next scenario:]]

- Scenario: there are different attribute names and values, but with the same semantics as ITS. Tasks for ITS: identify, map, add information.
Example:

<p translation="true">...</p>
<p translation="false">...</p>
<documentRule its:selector="//p[@translation='true']" its:translate="yes"/>
<documentRule its:selector="//p[@translation='false']" its:translate="no"/>

If we only want to identify and add information, we would have:

<p translate="true">...</p>
<p translate="false">...</p>
<documentRule its:selector="//p[@translate='true']" its:translate="yes"/>
<documentRule its:selector="//p[@translate='false']" its:translate="no" />

Benefit of adding in this case: an ITS-aware editor could use the information to process the non-ITS markup in the same way as ITS markup, e.g. highlighting translatable text.

- Scenario: there is a different vocabulary, with different values and different semantics. Tasks for ITS: identify, add information, but no mapping. Example:

<p translation="true">...</p>
<p translation="false">...</p>
<p translation="maybe">...</p>
<documentRule its:selector="//p[@translation='true']" its:translate="yes" />
<documentRule its:selector="//p[@translation='false']" its:translate="no" />
<documentRule its:selector="//p[@translation='maybe']" its:translate="yes" />
(it has to be decided whether 'maybe' should be 'yes' or 'no')

This scenario does not work if the values cannot be enumerated. Examples:

<p translation="0.2">...</p>
<p translation="0.235">...</p>
<p translation="0.9">...</p>
<p translation="If I have time">...</p>

- Scenario: People want to say "my markup relates to an ITS data category", but they do not want to use ITS values. Tasks for ITS: identify, map. Example:

<xyz:p xyz:translate="yes">...</xyz:p>
<xyz:p xyz:translate="no">...</xyz:p>
<documentRule its:selector="//xyz:p" its:translateMap="@xyz:translate"/>
(this means "xyz:translate has the meaning of the translatability data category; I 'trust' that the values of xyz:translate fit as well")

Benefit: There is no need to "pollute" your namespace with ITS markup.
This usage of mapping just passes the information via the map attributes.

---------------------------------------------
Single data categories: Localization Information
---------------------------------------------

- General examples:

ex. locInfo 1: YR_QUERY(year, month)
DNote: Only the words inside the parentheses should be translated. Leave the rest in upper case.

ex. locInfo 2: Shift
DNote: This refers to Image Shift. A single word has been used because of space restrictions.

ex. locInfo 3: enabled
This refers to 'stapler options'.

- Scenario: there is no localization information in the document. Tasks for ITS: identify, add information. Example: a rule which says "identify the jokes".

<joke>three klingons ....</joke>
<its:documentRule its:selector="//joke" its:locInfoRef="http://www.myExample.com/klingon#humor"/>

- Scenario: the localization information is available as an attribute value in the instance. Tasks for ITS: identify, map. Example:

<its:documentRule its:selector="//joke" its:locInfoMap="@note"/>
<text>
<joke note="In Star Trek, Klingons are known for having no sense of humor. (note: Germans might be more appropriate here)">Three men went to a pub: a Klingon </joke>
</text>

- Scenario: there is no localization information in the document. Tasks for ITS: add the information in the instance. Examples:

- if there is no localization information in the instance:

<text>
<joke its:locInfo="In Star Trek, Klingons are known for having no sense of humor. (note: Germans might be more appropriate here)">Three men went to a pub: a Klingon </joke>
</text>

or

<text>
<p its:locInfoRef="http://www.example.com/klingon#humor">Three men went to a pub: a Klingon </p>
</text>

- Scenario: "same values, but different name in existing vocabulary which does not match ITS". This scenario from the translatability data category does not apply, because there are no enumerated lists of values for localization information.

- Scenario: there is an existing locInfoRef attribute.
Tasks for ITS: identify, map. Example:

<xyz:p xyz:dnote="someURI">..
<its:documentRule its:selector="//xyz:p" its:locInfoRefMap="@xyz:dnote"/>

action: Distinction between "alert" and "description": still to be discussed, if we need a mapping here.

---------------------------------------------
Single data categories: Directionality
---------------------------------------------

- Scenario: no directionality information at all, but we can isolate elements which imply directionality (e.g. an <arabic> element). Tasks for ITS: identify, add information. Examples:

<arabic>...</arabic>
<its:documentRule its:selector="//arabic" its:dir="rtl"/>

<its:documentRule its:selector="//span[@xml:lang='ar']" its:dir="rtl"/>
<span xml:lang="ar">...

- Scenario: there is already directionality information in the document, but not in the ITS namespace. Tasks for ITS: identify, add information. Example:

<bdo dir="rtl"> ...</bdo>
<its:documentRule its:selector="//bdo[@dir='rtl']" its:dir="rlo"/>
(case for XHTML 1 or e.g. an old version of xmlspec)

<someElement dir="rtl">...</someElement>
<its:documentRule its:selector="//*[@dir='rtl']" its:dir="rtl"/>

Sebastian: resolve this by order of documentRule elements
.. or say "//*[not(self::bdo)][@dir='rtl']"

- other scenarios: to follow.

---------------------------------------------
Single data categories: Ruby
---------------------------------------------

************
proposal-06: Use the existing conformance levels of W3C Ruby, and have one additional one.
************

- On conformance: We agreed to refer to the W3C ruby specification and to cite its existing levels of conformance (simple and complex ruby markup). We propose to have another level of conformance (working title "intermediate ruby markup"), which will be contributed by Richard.

action: Richard to describe an additional level of conformance.
Example from opendoc:

<odf:ruby>
<odf:rubyBase>W3C</odf:rubyBase>
<odf:rubyText>World Wide Web Consortium</odf:rubyText>
</odf:ruby>

Example of simple ruby from the W3C ruby spec:

<its:ruby>
<its:rb>W3C</its:rb>
<its:rt>World Wide Web Consortium</its:rt>
</its:ruby>

We can use the mapping mechanism to describe that both have the same meaning:

<its:documentRule its:selector="//odf:ruby" its:rubyMap="."/>
<its:documentRule its:selector="//odf:rubyBase" its:rubyBaseMap="."/>
<its:documentRule its:selector="//odf:rubyText" its:rubyTextMap="."/>

The global rules attributes for ruby stay as they are, that is:

<its:documentRule its:rubyText="World Wide Web Consortium" its:selector="/body/img[1]/@alt"/>

Mapping to different realizations of ruby:

<span class="ruby">...
<its:documentRule selector="//span" rubyTextMap="@class='rt'"/>
(takes the value of the span, not of the attribute)
<span class="rb">...
<span class="rt">...

<p rubytext="s.t.">...
<its:documentRule selector="//p/@rubytext" rubyTextMap="."/>

---------------------------------------------
Terminology data category
---------------------------------------------

TBD. TODO.

-----------------------------------------------------
Visit from Paul Nelson and Markus Scherer (Microsoft)
-----------------------------------------------------

- feedback from Paul Nelson:
- dir is not necessary for existing formats.

Sebastian: a new format could just pull in the tag.
Paul, Markus: why not use HTML directly?
Felix: we don't try to invent s.t. new, only cite existing practice.
Paul: often people use global styling.
Sebastian: a mixed document with Arabic, English. It is not explicit that directionality should be taken into account. Could use "dir" for that purpose, like "xml:id".
Paul: for translation, if you would translate pieces of a string, e.g. "filename.jpg". You would have a pattern called "filepattern":

<its:documentRule its:selector="//scr/text()[match(.,*.jpg)]" its:translate="yes"/>
imagefile.jpg .
Paul: people have to be prevented from translating ".jpg".

<entry its:translate="yes">filename<its:span its:translate="no">.jpg</its:span></entry>

Paul: aiming at documents?
Felix: textual documents, software related documents.
Paul: example: an RSS feed needs some regular expression to figure out what is being translated.
Felix: not tie the spec to one version of XPath, but say "XPath and its successor".
Markus: if you map the selectors back to css? So that people see how it works? non normative?
Markus: yes, so that people just see how it works.
Sebastian: is it possible to replace xpath with css?
Markus: you should give input to us (CSS working group); it might be difficult to target every attribute. xpath is good for tools, css is good if you just want to translate a given page. So give an example of what a CSS stylesheet would look like that has the functionality you want to achieve (*not* creating additional documentRule elements in CSS). Two user scenarios: having ad hoc localizability information for a web page (with CSS), versus having information for a tool (with XPath).
Felix: what does MS do about this topic?
Paul: you have html-like dialogues, where everything is parsed on an ID basis. Then an external file which processes the expression. There's a lot of software which does it already.
Sebastian: So real life is "give absolute IDs"?
Paul: yes, an author does not track that.
Markus: so complex selector mechanisms are not necessary?
Paul: from what I see other vendors doing, yes.
Felix: ITS is for the engineers who have to adopt a great variety of formats with no (ID or other) localizability information.
Paul: yes, for that user scenario ITS seems to be quite useful. And there is a large open source effort to have a standard localization process, e.g. in developing countries.

----------------------
Test suite discussion
----------------------

- Sebastian's implementation of the proposal for mapping (see above).
<documentRules xmlns="http://www.w3.org/2005/11/its">
<ns its:prefix="t" its:uri="http://www.tei-c.org/ns/1.0"/>
<documentRule its:translate="yes" its:selector="//t:body/t:p[1]/*"/>
<documentRule its:translate="no" its:selector="//t:body/t:p"/>
<documentRule its:selector="//t:p[starts-with(@rend,'translate(')]" its:translateMap="substring-before(substring-after(@rend,'translate('),')')"/>
</documentRules>
</teiHeader>
<text>
<body>
<p rend="normal">Hello <hi>world</hi></p>
<p rend="special">Goodbye</p>
<p its:translate="yes">translate me</p>
<p>Don't translate me</p>
<p rend="translate(yes)">I want to be translated</p>

Sebastian: we need to specify the precedence between inherited and mapped information. Example: precedence between its:translate="yes" and rend="translate(yes)". We need to say that if something is mapped, it then also has to be processed like an attribute with that semantics.

************
proposal-07: add to the precedence rules a rule for "Selections inherited from other local usages of ITS markup"
************

- Felix's implementation (without mapping yet):

Question: Where to put the precedence of inherited local information:

1. Implicit local selection in instance documents (data category attributes on a specific element)
2. Local selections in instance documents (using a documentRules element)
3. Global selections in an external file (using a documentRules element)
4. Global selections in a schema, expressed with a documentRules element
5. Selections expressed with schemaRule (See also the note in Section 5.1.2: Global, Rule-based Selection)
6. > Selections inherited from other local usages of ITS markup
Selections via defaults for data categories, see Section 6.1: Position and Default Selections of Data Categories

Sebastian: the implementation is very expensive, since all xpath expressions have to be evaluated again and again. Comparing two node sets to see if there is some overlap is the best way of doing it.
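Sebastian's remark about comparing node sets can be illustrated with a small sketch. It is not taken from either implementation: the function name overlap and the sample document are made up for illustration, and the selectors use the XPath subset that Python's ElementTree supports.

```python
import xml.etree.ElementTree as ET

def overlap(root, selector_a, selector_b):
    """Return the nodes selected by both selectors, in document order."""
    selected_b = set(root.findall(selector_b))
    return [node for node in root.findall(selector_a) if node in selected_b]

doc = ET.fromstring("<body><p>Hello <hi>world</hi></p><p>Goodbye</p></body>")

# Two rules whose selections partly coincide: every p versus the first p.
both = overlap(doc, ".//p", "./p[1]")
print([ET.tostring(node, encoding="unicode") for node in both])
```

Nodes that land in the overlap are the ones for which a precedence rule has to decide which rule wins; nodes outside it need no conflict handling.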
Sebastian: Error in my implementation: if you have a rule saying s.t. about translatability and directionality: That would produce 2 templates
Sebastian: Why this output?
Felix: to make basic conformance clear
Sebastian: my implementation: could have different modes, one for each data category.
Felix: that is not as expensive as my way, since you only have to go through the doc one time.

----------------
Test Files
---------------

Sebastian: other data categories in your implementation?
Felix: only dir and translate.
Sebastian: Felix's approach is easier to check.
Felix: it checks whether information comes from local, global or inheritance.
.. do we need to check where information comes from?
Sebastian: no.
.. XLIFF from Yves is possibly the easiest example.
.. if we publish the implementations, we need to prove that they do the same thing
.. we just discovered that inheritance is right in each of our implementations
.. what to do after last call?
Felix: we go to candidate recommendation
.. and then, when we have implementations and tests, we can reach proposed recommendation
Sebastian: why make mapping explicit?
.. it does not give new power?
Felix: mapping gives you a possibility to make semantics clear, useful e.g. for editing ruby from different sources
Sebastian: you can specify the "meaning" of markup, without the need to extract it.
.. it is also necessary to distinguish mapping existing "locInfoContent" (in Yves' old terminology) versus adding locInfo to a node.
.. we came to the conclusion to allow it everywhere.
Felix: how about child elements versus attributes?
Sebastian: allow to reinforce the structure in all schema languages
.. e.g. the new restriction, that a documentRule can have only one data category, is easier to check with a rule.

---------------------------
conference call
---------------------------

http://www.w3.org/Guide/1998/08/teleconference-calendar.html#s_2031

- action items

christian: to make bug on relation between markup and data category
..
done
felix: bring xml:lang question to the core group
.. made a request, this is done,
felix: close bug ..., write a mail to Francois
.. the same for other bugs
yves: bug 2890:
.. not open, so closed somehow
.. that is the terminology thing
felix: done
yves: action 2808, felix to write s.t. and close the comment
felix: done
yves: tag set editors to integrate discussion results into spec
.. the changes we discussed ...

action: Yves to enter bug on section three

felix: how about giving the last word on wordsmithing to English speakers?
christian: would be great

- next f2f:

Richard: 30 / 31 of May I have a f2f:
Yves: the f2f without last call?
Felix: general way of working is more important
Yves: how about doing this in England? week of the 5th of June would be fine?
Richard: fine the whole week
Yves: so let's keep this week as tentative
Yves: would it make sense to have a f2f before May?
Richard: what is the purpose of this f2f?
.. it is o.k. to finalize the draft
.. or to discuss the last call
Yves: how about April?
Sebastian: first week of April is holiday
Felix: how about 18, 19, 20
Richard: I may be in Shanghai
Yves: fine for me
Richard: It might take some time until I know
Sebastian: we could have the meeting at his office

************
proposal-08: add xml:lang as a data category, to be able to map various language attributes to xml:lang
************

Richard: do we need an its:lang attribute to say "this is the same as xml:lang"?
Sebastian: it is easier to provide "its:langMap"
.. to give the bridge
.. it is like ruby, where we don't give new markup, but map to existing markup
Richard: e.g. for translation tools, it would be useful
.. this is a very extensible mechanism
Felix: we should also keep time in mind when doing things
Richard: but let's have xml:lang
.. by just referring to the spec.

<its:langRule its:selector="//*" its:langMap="@myLangAttribute"/>

says: "@myLangAttribute" has the meaning of xml:lang.
Sebastian: this asserts that people use the same values as xml:lang.

------------------------
Structure of the working draft
------------------------

Felix: How about just adding a new subsection on "adding (or s.t. else?) versus mapping"
Richard: basic content of that section would be:
Felix: describing the difference between the two methodologies, and giving non-normative examples
Richard: for me it is two types say:

************
proposal-09: Give up the schemaRule element
************

Proposal from Felix: the schemaRule element gives no new functionality compared to global rules and is in some cases not possible to process (see RELAX NG example below).

<element="p">
<its:documentRule> instead of schemaRule
<its:schemaRule=".." ..</element>

"//p"
p in footnote != p in div
"//footnote/p"
"//div/p"

element div = p1?, p2
p1 = element p { attribute id?, text }[translate="yes"]
p2 = element p { text }[translate="no"]
Received on Wednesday, 1 March 2006 11:25:33 UTC