Re: action-221 summary of overriding discussion

Hi Felix,
Thanks for the summary. I'd agree partial overriding will be complex and 
probably cause more problems than it solves.

Taking the direction of a 'compound data category' may be a better 
solution - but modified slightly.

I think the root of the issue is the combination of having the single 
data category as the unit of conformance while also being the unit of 
override completeness. The former carries the implication (though not 
explicitly stated) that because any single data category implementation 
imparts ITS conformance, then each and every data category must satisfy 
an independent use case.

However, if we relax this assumption in a controlled way we can simply 
avoid partial override by designing certain data categories to be used 
in _combination_ (the subtle difference from a single data category 
being 'compound'). In this event we can then either just live with the 
fact that there may be one or two data categories that may impart 
conformance individually even though they are not useful by themselves, 
or we add them as a specific exclusion to the 
single-data-category-for-conformance rules.
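
To make the difference concrete, here is a rough sketch in Python. It is only an illustration under my own assumptions: the function names, the "toolInfo" category, the "exampleMT" tool name and the precedence order (local over global over inherited over default) are all made up for the example and are not taken from the spec text.

```python
# Hypothetical illustration only; none of these names come from the ITS spec.
# Assumed precedence, lowest to highest.
PRECEDENCE = ["default", "inherited", "global", "local"]


def resolve(sources):
    """Complete override: the highest-precedence source wins wholesale.

    `sources` maps an origin ("local", "global", ...) to the values that
    origin supplies for ONE data category. Nothing is merged across origins.
    """
    for origin in reversed(PRECEDENCE):
        if origin in sources:
            return dict(sources[origin])
    return {}


def resolve_combined(node, categories):
    """Resolve each data category independently, then use them together."""
    return {cat: resolve(node.get(cat, {})) for cat in categories}


# One 'compound' category: the local score overrides completely,
# so the globally supplied tool information is lost.
compound = resolve({
    "global": {"score": 0.5, "tool": "exampleMT"},
    "local": {"score": 0.8},
})

# Two categories used in combination: each is still overridden completely,
# but the tool information survives because it is a separate category.
node = {
    "mtConfidence": {"local": {"score": 0.8}},
    "toolInfo": {"global": {"tool": "exampleMT"}},
}
combined = resolve_combined(node, ["mtConfidence", "toolInfo"])
```

The point is that override stays complete per data category, so no partial-override logic is needed; the cost is that "toolInfo" on its own would count as a data category even though it is not useful by itself.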

cheers,
dave



On 20/09/2012 18:04, Felix Sasaki wrote:
> Hi all,
>
> during today's call we discussed the overriding issue. The summary 
> below does not take into account who said what, just the main points. 
> Below that there is a proposal for a solution.
>
> 1)
> In ITS 1.0 we specified overriding of ITS information (given via one 
> of: local markup, global rules, inheritance or defaults) to be 
> "complete". That can be shown by the ITS 1.0 test suite files. They 
> specify for each node the approach (local, global etc.) by which ITS 
> information has been created. Example:
>
> <o:node path="/{}msgList/{}body[1]/{}msg[3]" 
> outputType="new-value-global">
>  <o:output o:locNoteType="alert">
>   <o:locNoteText>The variable <code>{0}</code> has three possible 
> values: 'printer','stacker' and 'stapler options'.</o:locNoteText>
>  </o:output>
> </o:node>
>
> That only makes sense with complete overriding.
>
> 2) In ITS 2.0 there are data categories like mtConfidence or 
> textAnalysisAnnotation that would benefit a lot from not having to 
> specify information like "which tool produced this annotation or which 
> tool created this machine translation" in sync with other information, 
> e.g. MT confidence score or the actual named entity annotation.
>
> 3) Partial overriding might be a solution to the requirement of 2). 
> One could provide once, e.g. at the top of a document, what tools have 
> been used, and take that for the complete document into account.
>
> 4) Partial overriding would create backwards compatibility issues (see 
> 1) above) and problems for other data categories. E.g. for 
> localization quality issue, there are a lot of mutually exclusive 
> options for using metadata. Partial overriding can get very 
> complex if we start to use pointers (e.g. 
> localizationQualityTypePointer), standoff (again for localization 
> quality issue). Also, sometimes an inherited value might not be what 
> the creator of the metadata intended, e.g. having an inherited "alert" 
> localization note type.
>
> 5) Another solution might be a compound data category. That was 
> mentioned during the call, but we ran out of time to discuss it.
>
> 6) For the data categories in question (the "tool" specific part of 
> mtConfidence and text analysis annotation), it seems that there is no 
> need for global rules or inline markup. It would be sufficient to have 
> a means to say: "for a given document the following tooling was used".
>
> Any comments on that summary?
>
>
> SOLUTION PROPOSAL.
>
> My proposal would be that we do not change overriding behavior at all. 
> From 6), it seems that the tool specific information does not have a 
> need for selection at all, so no need for local markup, global rules, 
> inheritance or defaults. It is sufficient to express: "When processing 
> data category X, the ITS aware processor MUST pass the information 
> given about tooling to the application".
> So we could define a description of tooling information, based on 
> Christian's proposal, and say each description MUST make clear what 
> data category it relates to. The effect would be:
> - Dropping the tool specific part of text analysis annotation
> - Dropping the tool specific part of mtConfidence
> - Having a tool description format that is not part of the data to be 
> processed or of global rules, but that is independent. In addition to 
> the tooling information in Christian's proposal, the format should 
> target what data category it refers to.
> Influence on processing ITS: processing would have a 
> parameter providing the tool specific information. Note that this is 
> different to its:param, which is a parameter for XPath processing.
> Influence on conformance: an application implementing the data 
> categories in question MUST be able to ship the information to an 
> application consuming ITS information. There is no influence on the 
> selection mechanisms, that is existing conformance clauses stay "as is".
> Influence on test suite: in addition to the current format or as a 
> kind of "header" in it, we would define a block with the tool specific 
> info.
>
> Thoughts? If people agree on this, I'd be happy to draft a section and 
> some test cases.
>
> Best,
>
> Felix
>
>
> -- 
> Felix Sasaki
> DFKI / W3C Fellow
>

Received on Thursday, 20 September 2012 23:41:30 UTC