Re: issue-41 (mtconfidence), issue-42 (mtConfidence, textAnalysisAnnotation, quality)

P.S.: Here is another example showing that non-complete overriding would
mess things up: a slight modification of a file from the ITS 1.0 test suite,

http://www.w3.org/International/its/tests/inputdata/LocNote3.xml


<Res xmlns:its="http://www.w3.org/2005/11/its" its:version="1.0">
 <body>
  <msg id="FileNotFound" its:locNote="{1}  and {2}  are filenames"
       its:locNoteType="alert">
   <data its:locNote="The variable {1} is the name of the host.">{1} not found.</data>
   <data its:locNote="The variable {2} is the name of the host.">{2} not found.</data>
  </msg>
 </body>
</Res>

With non-complete overriding semantics, the locNote on the "data" element
would pick up the locNoteType "alert" from "msg". But that is not intended:
only the "msg" element carries the alert.

I think similar examples can easily be created for the more complex data
categories in ITS 2.0.
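
For the LocNote example above, the difference between the two semantics can be sketched roughly as follows. This is only an illustration in Python, not code from any ITS processor: the function name resolve_loc_note, the dict representation, and the "description" default for a missing locNoteType are assumptions made for the sketch.

```python
# Hypothetical sketch; not taken from any ITS processor.
DEFAULT_LOC_NOTE_TYPE = "description"  # assumed default when locNoteType is absent


def resolve_loc_note(inherited, local, complete=True):
    """Return the Localization Note information for one element.

    inherited: dict with the information coming from the parent (may be empty)
    local:     dict with the its:* attributes set on the element itself
    complete:  True for complete overriding, False for per-attribute merging
    """
    if not local:
        # Nothing local: the inherited information applies as a whole.
        result = dict(inherited)
    elif complete:
        # Complete overriding: local information replaces everything inherited.
        result = dict(local)
    else:
        # Non-complete overriding: missing local pieces leak in from the parent.
        result = {**inherited, **local}
    if result:
        result.setdefault("locNoteType", DEFAULT_LOC_NOTE_TYPE)
    return result


# The "msg" element and its first "data" child from the example.
msg = {"locNote": "{1}  and {2}  are filenames", "locNoteType": "alert"}
data = {"locNote": "The variable {1} is the name of the host."}

complete_result = resolve_loc_note(msg, data, complete=True)
partial_result = resolve_loc_note(msg, data, complete=False)

print(complete_result["locNoteType"])  # description
print(partial_result["locNoteType"])   # alert
```

With complete=False the "data" note is reported as an alert although only "msg" declared one, which is the unintended result described above.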

- Felix

2012/9/19 Felix Sasaki <fsasaki@w3.org>

> Hi Yves, all,
>
> 2012/9/19 Yves Savourel <ysavourel@enlaso.com>
>
>> Hi Felix, all,
>>
>> > This creates problems. As Dave and Declan ask at
>> > http://lists.w3.org/Archives/Public/public-multilingualweb-lt/2012Sep/0085.html
>> > http://lists.w3.org/Archives/Public/public-multilingualweb-lt/2012Sep/0087.html
>> > overriding semantics in ITS 1.0 is always complete, and ITS 2.0 so far
>> > is the same.
>> > I would have to change my whole "artificial output" implementation to
>> > change that, so I would probably object.
>>
>> Actually, I think the bit "Override semantics are always complete, that
>> is all information that is specified in one rule element is overridden by
>> the next one." has been added in 2.0. It's not in 1.0 (
>> http://www.w3.org/TR/its/#selection-precedence).
>>
>
> That's correct - I added it after a question from Dave, see
>
>
> http://lists.w3.org/Archives/Public/public-multilingualweb-lt/2012Jul/0228.html
>
>
>>
>> That may have been the intent, but I even wonder if it was important with
>> the initial data categories.
>> Note also that the wording is not as specific
>>
>> If I understand this correctly you are saying that:
>>
>> - If we have a data category with 3 pieces of information: AAA, BBB and CCC.
>> - If there is a global rule that defines AAA='a' and BBB='b' for a node N
>> - and the same node N has a local attribute that specifies CCC='c', then
>> the 3 pieces of information for N will be AAA=undefined, BBB=undefined,
>> CCC='c' and not AAA='a', BBB='b' and CCC='c'?
>>
>
> Correct.
>
>
>>
>> If I misunderstood, then forget the rest of this email.
>>
>> If not:
>>
>> This is not very natural: how can something undefined (the local AAA and
>> BBB) override anything: they don't exist.
>>
>
> It is "natural" because we define precedence on a "per data category"
> basis. Even if that was implicit for ITS 1.0, I can prove easily that we
> did the same in ITS 1.0, via the ITS 1.0 test suite. See e.g.
>
> http://www.w3.org/International/its/tests/test2/EX-locNotePointer-attribute-1-result.xml
> There is a test result for each element node / attribute node *per data
> category*. Several values are captured in that manner. We even had
> "outputType" attributes making clear where the values came from
> (local, global, inheritance, default). These attributes only make sense if
> the overriding semantics is complete.
>
>
>>
>> This also prevents the user from defining some information globally using
>> pointers and complementing it with ITS local attributes, like
>> this:
>>
>> <doc xmlns:i='http://www.w3.org/2005/11/its' i:version='2.0'>
>> <i:rules version='2.0'>
>> <i:locQualityIssueRule selector='//z' locQualityIssueTypePointer='@type'
>> locQualityIssueSeverityPointer='@score' />
>> </i:rules>
>> <p>Text with <z type='other' score='1'
>> i:locQualityIssueComment='comment'>error</z></p>
>> </doc>
>>
>> An example where not overriding undefined local information would be
>> useful is the Storage Size data category: often the encoding and the line
>> break type of the storage will be the same for the whole document, but the
>> size constraint will be different locally. Having to repeat everything over
>> and over is rather inefficient.
>>
>
> But we did the same for ITS 1.0: e.g. its:term="yes" is the same for each
> term, and a term reference is additional information. We didn't allow
> having just a term reference. We even make that clear in the definition of
> the local markup, interrelating "term" and "termInfoRef" by saying the
> latter is optional.
>
> "
>
>    - A term <http://www.w3.org/International/multilingualweb/lt/drafts/its20/its20.html#att.local.no-ns.attribute.term>
>      attribute with the value "yes" or "no".
>    - An optional termInfoRef <http://www.w3.org/International/multilingualweb/lt/drafts/its20/its20.html#att.local.no-ns.attribute.termInfoRef>
>      attribute that contains a URI referring to the resource providing
>      information about the term.
>
> "
>
> If you introduce the new approach to overriding semantics, this gets messed
> up: you will have situations where you look at a given node and ask "where
> does this termInfoRef come from - locally set or globally set?".
> Imagine also that as a user of an ITS processor you want to debug ITS
> local markup and rules, because information for a given node doesn't look
> right. Without complete overriding that can be a real challenge.
>
>
>
>> It seems 2.0 has several data categories with more than a single piece
>> of information.
>
>
> Yes - but I object (object in the W3C formal objection sense, if needed)
> to giving up the overriding semantics; I would rather not fulfill every
> need in these data categories. Simplicity here is much more important than
> expressivity IMO.
>
> One reason is backwards compatibility with ITS 1.0, see above. Another is
> the implementation strategy I (and I think Sebastian Rahtz) used for 1.0.
> Note that this strategy is also mentioned in the spec, and this comes from
> ITS 1.0:
> "The precedence order fulfills the same purpose as the built-in template
> rules of [XSLT 1.0]."
> Now, in XSLT you would create a real mess if you had templates for
> each piece of information of a data category - you'd rather have a template
> *per data category precedence*, e.g.
>
> <xsl:template match="*[@its:term]" priority="1000" mode="translate">...
> </xsl:template>
>
> This template says: local "term" attribute has the highest precedence. The
> template doesn't even check for termInfoRef, since that is optional (see
> above).
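
The same "one decision per data category" idea can be sketched outside XSLT. The following Python is a minimal illustration, not an existing implementation: the function select_terminology, the dict layout, and the example.com URI are hypothetical, and the source order is simply the (local, global, inheritance, default) list mentioned earlier in the thread.

```python
# Hypothetical sketch of per-data-category precedence; names are illustrative.
PRECEDENCE = ("local", "global", "inheritance", "default")


def select_terminology(candidates):
    """Pick the Terminology information for a node.

    candidates maps a source name from PRECEDENCE to the dict of information
    that source supplies for the node (missing or empty means "nothing").
    The first source with any information wins, and its dict is used whole.
    """
    for source in PRECEDENCE:
        info = candidates.get(source)
        if info:
            return source, dict(info)
    return None, {}


# A node with a local its:term and a global rule that also sets termInfoRef.
node = {
    "local": {"term": "yes"},
    "global": {"term": "yes", "termInfoRef": "http://example.com/termdb#x1"},
    "default": {"term": "no"},
}

source, info = select_terminology(node)
print(source, info)  # local {'term': 'yes'}
```

Because the winning source is returned together with its information, the question of where a termInfoRef came from always has a single answer; here the globally set termInfoRef is dropped because the local markup takes over the whole data category.
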
>
>
>> And obliterating existing information defined globally because one
>> *other* piece of information is set locally may prove challenging.
>>
>
>
> I rather see huge benefits in going that way, in addition to compatibility
> with 1.0. With complete overriding, it is very clear for each node in a
> document what ITS information pertains to it. You gave storage size as an
> example, but think about quality issue, precise or disambiguation: with
> the masses of attributes we have here, non-complete overriding will
> create a real mess when people want to understand where information comes
> from.
>
> Best,
>
> Felix
>
>
>>
>> Cheers,
>> -yves
>>
>>
>>
>
>
> --
> Felix Sasaki
> DFKI / W3C Fellow
>
>


-- 
Felix Sasaki
DFKI / W3C Fellow

Received on Wednesday, 19 September 2012 05:10:48 UTC