
[ISSUE-42] Wording for the tool information markup

From: Yves Savourel <ysavourel@enlaso.com>
Date: Mon, 1 Oct 2012 07:35:31 -0600
To: <public-multilingualweb-lt@w3.org>
Message-ID: <assp.062172df87.assp.06211e069d.005e01cd9fd9$a2425c30$e6c71490$@com>
Hi all,

Here is a proposed wording for the Tool Information elements and the toolRefs attribute.

This is, I believe, the result of the discussion we had in Prague.
However, while how the toolRefs attribute would work seems clear to me, I'm still not very clear on the tool information part: how common and re-usable it can be across tools.


===== 

In some cases, it may be important for instances of data categories to be associated with the processor that generated them. For example, the score of the MT Confidence data category is most meaningful when the consumer of the information also knows what processor produced it.

ITS provides a way to store processor information independently from the data categories using the toolInformation element.

The toolRefs attribute associates the tool that generated the information for a given data category with the content of the element on which that information is set.

The value of toolRefs is a space-separated list of references where each reference is composed of two parts: a data category identifier and a URI pointing to the relevant toolInformation element.

The data category identifier can be either: one of the pre-defined identifiers, or a user-defined value with a prefix.

[[TODO: need a grammar production to define the value here]]
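
Pending that grammar, here is a minimal Python sketch of how a consumer might split a toolRefs value. It assumes, based only on the example in this message, that the first "/" separates the data category identifier from the URI; the function name parse_tool_refs is hypothetical and nothing here is normative.

```python
def parse_tool_refs(value):
    """Split a toolRefs value into {data category identifier: tool URI}."""
    refs = {}
    for reference in value.split():  # references are space-separated
        # Assumption: the first "/" ends the data category identifier;
        # everything after it is the URI (which may itself contain "/").
        category, separator, uri = reference.partition("/")
        if not separator or not category or not uri:
            raise ValueError(f"malformed toolRefs reference: {reference!r}")
        refs[category] = uri
    return refs


refs = parse_tool_refs("mtConfidence/file:///tools.xml#T1")
print(refs)  # {'mtConfidence': 'file:///tools.xml#T1'}
```

If the final grammar picks a different separator, only the partition step would need to change.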
 
The URI pointing to a tool for a given data category is overridden only when a new toolRefs attribute is defined with a new URI for the same data category.

Document:

<doc its:toolRefs="mtConfidence/file:///tools.xml#T1" xmlns:its="http://www.w3.org/2005/11/its">
 <p its:mtConfidenceScore="0.78">Text translated with tool T1</p>
 <p its:mtConfidenceScore="0.34" its:toolRefs="mtConfidence/file:///tools.xml#T2">Text translated with tool T2</p>
</doc>
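
The override rule can be sketched in Python as follows. This is only an illustration under assumptions: the "/" separator is inferred from the example, the namespace declaration is spelled xmlns:its so that the sample parses, and the helper names (parse_refs, resolve) are hypothetical.

```python
import xml.etree.ElementTree as ET

ITS_NS = "http://www.w3.org/2005/11/its"
TOOL_REFS = f"{{{ITS_NS}}}toolRefs"

DOC = """<doc its:toolRefs="mtConfidence/file:///tools.xml#T1" xmlns:its="http://www.w3.org/2005/11/its">
 <p its:mtConfidenceScore="0.78">Text translated with tool T1</p>
 <p its:mtConfidenceScore="0.34" its:toolRefs="mtConfidence/file:///tools.xml#T2">Text translated with tool T2</p>
</doc>"""


def parse_refs(value):
    """Map each data category identifier to its URI (assumed "/" separator)."""
    refs = {}
    for reference in value.split():
        category, _, uri = reference.partition("/")
        refs[category] = uri
    return refs


def resolve(element, inherited=None):
    """Yield (element, effective refs) for an element and its descendants."""
    effective = dict(inherited or {})
    # A local toolRefs only replaces the entries for the categories it names.
    effective.update(parse_refs(element.get(TOOL_REFS, "")))
    yield element, effective
    for child in element:
        yield from resolve(child, effective)


root = ET.fromstring(DOC)
resolved = [(element.tag, refs) for element, refs in resolve(root)]
for tag, refs in resolved:
    print(tag, refs)
```

With this reading, the first p inherits T1 from doc, while the second p overrides it with T2 for mtConfidence only.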

Separate document with the list of tools (tools.xml):
 
<its:processInfo xmlns:its="http://www.w3.org/2005/11/its">
 <its:toolInfo xml:id="T1" dataCategory="mtConfidence">
  <its:toolName>Bing Translator</its:toolName>
  <its:toolVersion>123</its:toolVersion>
  <its:toolValue></its:toolValue>
 </its:toolInfo>
 <its:toolInfo xml:id="T2" dataCategory="mtConfidence">
  <its:toolName>myMT</its:toolName>
  <its:toolVersion>456</its:toolVersion>
  <its:toolValue>FR-to-EN-General</its:toolValue>
 </its:toolInfo>
</its:processInfo>
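
To show how a consumer might follow a toolRefs URI into such a list, here is a rough Python sketch. It assumes the tool list is well-formed XML (closing tags, and the its namespace declared on the root) and that the URI fragment matches an xml:id; the function name lookup_tool and the returned dictionary shape are hypothetical.

```python
import xml.etree.ElementTree as ET
from urllib.parse import urldefrag

ITS = "http://www.w3.org/2005/11/its"
XML_ID = "{http://www.w3.org/XML/1998/namespace}id"

TOOLS_XML = """<its:processInfo xmlns:its="http://www.w3.org/2005/11/its">
 <its:toolInfo xml:id="T1" dataCategory="mtConfidence">
  <its:toolName>Bing Translator</its:toolName>
  <its:toolVersion>123</its:toolVersion>
  <its:toolValue></its:toolValue>
 </its:toolInfo>
 <its:toolInfo xml:id="T2" dataCategory="mtConfidence">
  <its:toolName>myMT</its:toolName>
  <its:toolVersion>456</its:toolVersion>
  <its:toolValue>FR-to-EN-General</its:toolValue>
 </its:toolInfo>
</its:processInfo>"""


def lookup_tool(tools_root, uri):
    """Return the toolInfo whose xml:id equals the URI fragment, or None."""
    _, fragment = urldefrag(uri)  # "file:///tools.xml#T2" -> fragment "T2"
    for info in tools_root.iter(f"{{{ITS}}}toolInfo"):
        if info.get(XML_ID) == fragment:
            return {
                "dataCategory": info.get("dataCategory"),
                "name": info.findtext(f"{{{ITS}}}toolName"),
                "version": info.findtext(f"{{{ITS}}}toolVersion"),
                "value": info.findtext(f"{{{ITS}}}toolValue") or "",
            }
    return None


tools_root = ET.fromstring(TOOLS_XML)
tool = lookup_tool(tools_root, "file:///tools.xml#T2")
print(tool)
```

The sketch ignores the document part of the URI and only matches the fragment; a real consumer would first load the referenced file.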

=====

The part I'm still not quite sure about is the actual tool information.

The example above is loosely based on Felix' proposal (http://lists.w3.org/Archives/Public/public-multilingualweb-lt/2012Sep/0134.html)

I suppose the tool information element should:
- allow other vocabularies
- have a default set of properties common to most tools, like name, version, etc.
- have some basic property re-usable across tools implementing different data categories. For example <toolValue> in the example above. For mtConfidence it would hold the engine, for Text Annotation it would hold something specific to Text Annotation, etc. (hence: one data category per tool)

But I'm not sure how efficient this would be.

Cheers,
-yves
Received on Monday, 1 October 2012 13:53:33 UTC
