
Re: [ISSUE-42] Wording for the tool information markup

From: Felix Sasaki <fsasaki@w3.org>
Date: Tue, 2 Oct 2012 13:15:51 +0200
Message-ID: <CAL58czq81Zut+VE8a7axJZdJ2aoormTaMM=p3uswBB9-STL_Xw@mail.gmail.com>
To: Yves Savourel <ysavourel@enlaso.com>
Cc: public-multilingualweb-lt@w3.org
Hi Yves, all,

2012/10/1 Yves Savourel <ysavourel@enlaso.com>

> Hi all,
> Here is a proposed wording for the Tool Information elements and the
> toolRefs attribute.
> This is, I believe, the result of the discussion we had in Prague.
> However, while how the toolRefs attribute would work seems clear to me, I'm
> still not very clear on the tool information part: how common and
> re-usable it can be across tools.
> =====
> In some cases, it may be important for instances of data categories to be
> associated with the processor that generated them. For example, the score
> of the MT Confidence data category is most meaningful when the consumer of
> the information also knows what processor produced it.
> ITS provides a way to store processor information independently from the
> data categories using the toolInformation element.
> The attribute toolRefs is used to associate a given tool that generated a
> given data category with the content of the element where the information
> for that data category is set.
> The value of toolRefs is a space-separated list of references where each
> reference is composed of two parts: a data category identifier and a URI
> pointing to the relevant toolInformation element.
> The data category identifier can be either: one of the pre-defined
> identifiers, or a user-defined value with a prefix.
> [[TODO: need a grammar production to define the value here]]
> The URI pointing to a tool for a given data category is overridden only
> when a new toolRefs attribute is defined with a new URI for the same data
> category.
> Document:
> <doc its:toolRefs="mtConfidence/file:///tools.xml#T1"
>      xmlns:its="http://www.w3.org/2005/11/its">
>  <p its:mtConfidenceScore="0.78">Text translated with tool T1</p>
>  <p its:mtConfidenceScore="0.34"
>     its:toolRefs="mtConfidence/file:///tools.xml#T2">Text translated with tool T2</p>
> </doc>

Would it make sense to use a different delimiter? "/" may conflict with "/"
in paths.
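
To make the delimiter question concrete, here is a minimal Python sketch of how a consumer might parse a toolRefs value. The function name parse_tool_refs is hypothetical, and since the grammar is still a TODO in the draft, the sketch assumes that a data category identifier never contains "/". Under that assumption, splitting each reference at the first "/" only is unambiguous even though the URI part contains further slashes.

```python
def parse_tool_refs(value):
    """Parse a toolRefs value into a {data_category: uri} mapping.

    Assumption (not confirmed by the draft): the data category
    identifier never contains "/", so the first "/" is the delimiter.
    """
    refs = {}
    # References are space-separated; URIs cannot contain raw spaces.
    for ref in value.split():
        category, sep, uri = ref.partition("/")
        if not sep or not category or not uri:
            raise ValueError(f"malformed toolRefs reference: {ref!r}")
        refs[category] = uri
    return refs


print(parse_tool_refs("mtConfidence/file:///tools.xml#T1"))
# {'mtConfidence': 'file:///tools.xml#T1'}
```

A delimiter that cannot occur in either part would remove the need for the "first slash" rule altogether.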

> Separate document with the list of tools (tools.xml):
> <its:processInfo xmlns:its="http://www.w3.org/2005/11/its">
>  <its:toolInfo xml:id="T1" dataCategory="mtConfidence">
>   <its:toolName>Bing Translator</its:toolName>
>   <its:toolVersion>123</its:toolVersion>
>   <its:toolValue></its:toolValue>
>  </its:toolInfo>
>  <its:toolInfo xml:id="T2" dataCategory="mtConfidence">
>   <its:toolName>myMT</its:toolName>
>   <its:toolVersion>456</its:toolVersion>
>   <its:toolValue>FR-to-EN-General</its:toolValue>
>  </its:toolInfo>
> </its:processInfo>

Do you need the "dataCategory" attribute? It seems the data category is
already made explicit via the reference mechanism in "its:toolRefs". Also,
dropping the "dataCategory" attribute would allow the same tool to be
referenced from several data categories, e.g. OKAPI used for quality issues
as well as for creating translation metadata.
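
As a rough illustration of why the reference alone can carry the data category, here is a small Python sketch that resolves the effective tool URI per data category for each element of the example document. It follows the override rule from the proposed wording: a toolRefs attribute on a descendant replaces the inherited URI only for the data categories it names. The function name effective_tool_refs is hypothetical, and the "first slash is the delimiter" rule is an assumption, since the grammar is not defined yet.

```python
import xml.etree.ElementTree as ET

ITS = "http://www.w3.org/2005/11/its"
TOOL_REFS = f"{{{ITS}}}toolRefs"

DOC = """<doc xmlns:its="http://www.w3.org/2005/11/its"
     its:toolRefs="mtConfidence/file:///tools.xml#T1">
 <p its:mtConfidenceScore="0.78">Text translated with tool T1</p>
 <p its:mtConfidenceScore="0.34"
    its:toolRefs="mtConfidence/file:///tools.xml#T2">Text translated with tool T2</p>
</doc>"""


def effective_tool_refs(elem, inherited=None):
    """Yield (element, {category: uri}) pairs in document order."""
    current = dict(inherited or {})
    for ref in elem.get(TOOL_REFS, "").split():
        # Assumption: the first "/" separates category and URI.
        category, _, uri = ref.partition("/")
        current[category] = uri  # overrides this category only
    yield elem, current
    for child in elem:
        yield from effective_tool_refs(child, current)


root = ET.fromstring(DOC)
resolved = [(e.tag, refs) for e, refs in effective_tool_refs(root)]
for tag, refs in resolved:
    print(tag, refs)
```

Nothing in this lookup reads a "dataCategory" attribute from tools.xml; the category comes entirely from the toolRefs value.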

> =====
> The part I'm still not quite sure about is the actual tool information.
> The example above is loosely based on Felix' proposal (
> http://lists.w3.org/Archives/Public/public-multilingualweb-lt/2012Sep/0134.html
> )
> I suppose the tool information element should:
> - allow other vocabularies
> - have a default set of properties common to most tools, like name,
> version, etc.
> - have some basic property re-usable across tools implementing different
> data categories. For example <toolValue> in the example above. For
> mtConfidence it would hold the engine, for Text Annotation it would hold
> something specific to Text Annotation, etc. (hence: one data category per
> tool)

Isn't there something we could re-use for this? My proposal was based on
Christian's input / OLIF, but tool information seems to be commonly needed
e.g. in build systems.


> But I'm not sure how efficient this would be.
> Cheers,
> -yves

Felix Sasaki
DFKI / W3C Fellow
Received on Tuesday, 2 October 2012 11:16:21 UTC
