Re: Tool info specification (Re: action-221 summary of overriding discussion)

Lovely definition. I had always understood the geometric explanation but this usage is nice.

I'm liking Felix's proposal and wondering (allowing for the fact that I've not had caffeine yet this morning) whether it could harmonise the locQualityPrécis category.

Phil



On 21 Sep 2012, at 19:39, "Arle Lommel" <arle.lommel@dfki.de> wrote:

> Orthogonal usually means that one issue intersects with another but is not otherwise determined by it. The term comes from the idea of a two-axis system in which the axes are at right angles (orthogonal).
> 
> So they are not unrelated—their combination may be extremely important—but they can vary independently and the value of one does not entail any particular value for the other.
> 
> Orthogonal is different from parallel, where there is a correspondence of value, and also from totally independent, where there is no meaningful intersection or relationship between two things. 
> 
> Hope that helps explain the academic mumbo-jumbo.
> 
> Arle
> 
> --
> Arle Lommel
> Berlin, Germany
> Skype: arle_lommel
> Phone (US): +1 707 709 8650
> 
> Sent from a mobile device. Please excuse any typos.
> 
> On Sep 21, 2012, at 18:22, Yves Savourel <ysavourel@enlaso.com> wrote:
> 
>>> I think the issues you mention can be resolved, but first we'd need 
>>> to agree on the following:
>>> ...
>>> Information about tools used for producing metadata (+content) 
>>> is orthogonal to data categories
>> 
>> Shockingly some of us don't have PhDs and, not being completely familiar with the academic lingo, may need a specific definition of what 'orthogonal' exactly means in this context :)
>> 
>> For me, I agree that the information about the tool that was used to annotate the document is unrelated to the information in the data category itself.
>> With one exception: somewhere in the data category information there should be a way to point to the tool information.
>> 
>> -yves
>> 
>> 
>> 
>> From: Felix Sasaki [mailto:fsasaki@w3.org] 
>> Sent: Friday, September 21, 2012 8:13 AM
>> To: Yves Savourel
>> Cc: public-multilingualweb-lt@w3.org
>> Subject: Re: Tool info specification (Re: action-221 summary of overriding discussion)
>> 
>> Hi Yves,
>> 2012/9/21 Yves Savourel <ysavourel@enlaso.com>
>> Thanks for the example Felix,
>> 
>>> ... All tool specifications allow for identifying the relevant
>>> data categories. In that way it becomes explicit that e.g. a
>>> certain MT tool is relevant for mt-confidence.
>>> 
>>> the tool specifications have "id" attributes, e.g. "t-2" for "bing" translator.
>>> Yves' requirement of referring to tool info from a piece of XLIFF could be
>>> realized by referring to the ID attribute.
>> How exactly is the relationship between the local data category markup and the tool expressed?
>> 
>> Currently not at all.
>> 
>> 
>> It seems you are saying: the ITS way is to look at the itsDataCategoryIdentifer element in the tool info.
>> That's clumsy IMO, but it is indeed preventing any tool-specific data on the data category side.
>> 
>> Correct, that's a huge benefit IMO: to separate the metadata itself from information about the production of metadata or, in some cases, the production of content+metadata.
>> 
>> 
>> But the case for several tools used for the same data category is not really catered for.
>> 
>> Correct.
>> 
>> When you say "referring to tool info from a piece of XLIFF could be realized by referring to the ID attribute" who is defining the attribute that does the referring? ITS or XLIFF?
>> 
>> Good question :) In my mind it was XLIFF, but obviously you are pushing for a mechanism on the ITS side.
>> 
>> 
>> If it's XLIFF, then I disagree: I think the ITS mechanism must have provision for both cases. (Actually I even think the MT case would tend to favor that multi-tool case: knowing which tool produced a given MT is probably more relevant when you have several candidates).
>> 
>> Having such provision probably means some kind of tool-ref attribute in each data category using the tool information.
>> Which means it probably needs to be specified for each local occurrence over and over again.
>> We're back to square one, admittedly now with only one attribute referring to the tool info rather than with all the tool info... I suppose that's progress :)
>> 
>> Yes, that's progress :)
>> 
>> I think the issues you mention can be resolved, but first we'd need to agree on the following:
>> - Partial inheritance is out of scope
>> - Information about tools used for producing metadata (+content) is orthogonal to data categories
>> 
>> 
>> Now, if we agree on that, I think it would be OK to have a data category "ITS Tool information" which is available both locally and globally. Locally, it would have the tool references you mention, e.g.
>> 
>> <span its:tool-ref="#t1" ...> (in tool-ref there might be a comma-separated list of "ref" values)
>> meaning Enrycher and the "disambiguation" data category have been used to create metadata for the content of "span". We could also have a global rule like
>> 
>> <its:toolInfoRule selector="trans-unit/target" tool-ref="#t-2"/>
>> meaning that the "Bing" translator has been used to create the translated content and the mt-confidence score information.
>> 
>> What is the difference from previous approaches? With the above we don't change selection at all and actually don't say anything about the relation between data categories. E.g. there might be no disambiguation or mt-confidence annotation at all. The "toolInfo" data category allows applications to interrelate the annotations, if they are available - but we don't require testing that and don't create new conformance claims. That's a huge benefit IMO.
>> 
>> If in the above approach there is a "local" tool-ref attribute, it would be inherited within the document. So since Declan and Tadej need a "document only" solution without XPath, that global approach would accommodate that.
>> 
>> The "new" ITS mechanism of referencing is actually not new: we already do that with standoff in localization quality issue. And it seems that in the new draft of Provenance, standoff would also be much more appropriate than heavy use of pointer attributes.
>> 
>> Best,
>> 
>> Felix 
>> 
>> 
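
To make Felix's local tool-ref idea in the quoted message concrete, here is a minimal sketch of how an application might resolve the references. It is only an illustration under assumptions: the <tools>/<tool> elements and their "name" attribute are hypothetical stand-ins for whatever the tool specifications end up looking like, and the inheritance rule (a child takes its parent's tool-ref unless it sets its own) is one reading of "that would inherit in the document". Only the "id" values, the "#" reference form and the comma-separated list come from the thread.

```python
import xml.etree.ElementTree as ET

ITS_NS = "http://www.w3.org/2005/11/its"
TOOL_REF = "{%s}tool-ref" % ITS_NS  # its:tool-ref, as proposed in the thread

# Hypothetical document: the <tools>/<tool> markup is an assumption.
DOC = """\
<doc xmlns:its="http://www.w3.org/2005/11/its">
  <tools>
    <tool id="t1" name="Enrycher"/>
    <tool id="t-2" name="Bing"/>
  </tools>
  <p>Visit <span its:tool-ref="#t1">Dublin</span> today.</p>
  <p its:tool-ref="#t1,#t-2">Both <b>tools</b> apply here.</p>
</doc>
"""


def load_tools(root):
    """Map each tool id to its name."""
    return {tool.get("id"): tool.get("name") for tool in root.iter("tool")}


def parse_tool_ref(value):
    """Split a comma-separated list of '#id' references into bare ids."""
    return [ref.strip().lstrip("#") for ref in value.split(",") if ref.strip()]


def tools_in_effect(elem, tools, inherited=()):
    """Return (tag, tool names) for every element with a tool-ref in effect.

    An element without its own tool-ref inherits the refs of its parent
    (assumed inheritance behaviour, not confirmed by any specification).
    """
    value = elem.get(TOOL_REF)
    refs = tuple(parse_tool_ref(value)) if value is not None else inherited
    found = []
    if refs:
        found.append((elem.tag, [tools[ref] for ref in refs]))
    for child in elem:
        found.extend(tools_in_effect(child, tools, refs))
    return found


if __name__ == "__main__":
    root = ET.fromstring(DOC)
    for tag, names in tools_in_effect(root, load_tools(root)):
        print(tag, names)
```

Note that nothing here looks at the data category annotations themselves, which matches the point that tool information stays orthogonal to the data categories: the lookup works whether or not any disambiguation or mt-confidence markup is present.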



Received on Sunday, 23 September 2012 08:53:38 UTC