Re: [ISSUE-42] Wording for the tool information markup

From: Tadej Štajner <tadej.stajner@ijs.si>
Date: Fri, 05 Oct 2012 13:42:31 +0200
Message-ID: <506EC7A7.2090406@ijs.si>
To: public-multilingualweb-lt@w3.org
Hi, Yves, Felix,
from the text analysis (TA) tool standpoint, the information in toolName, 
toolVersion and toolAddInfo covers all the use cases I've encountered so 
far with regard to figuring out what was generated where.

I have a feeling that toolAddInfo risks becoming a 'kitchen sink' 
attribute, and I'd pre-empt that by prescribing what kind of things 
'should' be there (for instance, for MT: language pairs, engine; 
for TA: inner engine, model parameters). Structuring the toolInfo data 
model any further is likely overkill; the current proposal looks like a 
good sweet spot.
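
To make this concrete, here is a minimal Python sketch of how a consumer might read such markup. It is an illustration under assumptions, not anything defined in the draft: it splits an its:toolRefs value at the first "/" (the rule Yves describes in the quoted text below) and collects the toolAddInfo entries of one tool by data category. The function names split_tool_ref and add_info_by_category are hypothetical, and the XML is a well-formed variant of Felix's example.

```python
import xml.etree.ElementTree as ET

ITS_NS = "http://www.w3.org/2005/11/its"
XML_ID = "{http://www.w3.org/XML/1998/namespace}id"

# Well-formed variant of the toolInfo example from this thread (assumed shape).
TOOLS_XML = f"""
<its:processInfo xmlns:its="{ITS_NS}">
  <its:toolInfo xml:id="T1">
    <its:toolName>Bing Translator</its:toolName>
    <its:toolVersion>123</its:toolVersion>
    <its:toolAddInfo datacategory="mtconfidence">ja-t-it</its:toolAddInfo>
  </its:toolInfo>
</its:processInfo>
"""


def split_tool_ref(value):
    """Split 'dataCategory/reference' at the FIRST '/' only.

    Later slashes (for example in 'file:///tools.xml') stay in the reference.
    """
    category, sep, ref = value.partition("/")
    if not sep or not category or not ref:
        raise ValueError(f"not a 'category/reference' value: {value!r}")
    return category, ref


def add_info_by_category(xml_text, tool_id):
    """Return {data category (lowercased): toolAddInfo text} for one tool."""
    root = ET.fromstring(xml_text)
    for tool in root.iter(f"{{{ITS_NS}}}toolInfo"):
        if tool.get(XML_ID) == tool_id:
            return {
                info.get("datacategory", "").lower(): (info.text or "").strip()
                for info in tool.findall(f"{{{ITS_NS}}}toolAddInfo")
            }
    raise KeyError(tool_id)


category, ref = split_tool_ref("mtConfidence/file:///tools.xml#T1")
tool_id = ref.partition("#")[2]  # fragment identifier after '#'
info = add_info_by_category(TOOLS_XML, tool_id)
print(category, tool_id, info.get(category.lower()))
```

Note that the sketch compares data category names case-insensitively, because the thread writes both "mtConfidence" and "mtconfidence"; whether that is appropriate would be for the spec to decide.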

-- Tadej


On 02. 10. 2012 23:28, Felix Sasaki wrote:
> Hi Yves, all,
>
> no opinion on my side on the delimiter topic, sorry for bringing it 
> up. A comment on the tool specific aspect below.
>
> 2012/10/2 Yves Savourel <ysavourel@enlaso.com>
>
>     > <doc its:toolRefs="mtConfidence/file:///tools.xml#T1"
>     > xmlns:its="http://www.w3.org/2005/11/its">
>     >
>     > Would it make sense to use a different delimiter? "/" may
>     > conflict with "/" in paths.
>
>     Hmm... almost any ASCII delimiter may also be in the path. The
>     first occurrence is the delimiter.
>     But I suppose '|' could be used instead. It just doesn't look as
>     graceful for some reason.
>
>
>     > Do you need the "dataCategory" attribute? It seems the
>     > data category is made explicit via the reference mechanism in
>     > "its:toolRefs".
>     > Also, dropping the "dataCategory" attribute allows then to refer to
>     > the same tools from various data categories - e.g. OKAPI used
>     > for quality issue versus for creating translation metadata etc.
>
>     I'm not sure we can go from many data category instances to one
>     tool information. And this is where I'm having trouble with tool
>     information:
>
>     The mtConfidence needs to have a defined way to specify the engine used
>
>
> Is there really a defined way? The current version of the draft at
> http://www.w3.org/International/multilingualweb/lt/drafts/its20/its20.html#mtconfidence-implementation
> says:
>
> "Some examples of values are:
> A BCP 47 language tag with t-extension, e.g. ja-t-it for an Italian to 
> Japanese MT engine
> A Domain as per the Section 6.9: Domain
> A privately structured string, eg. Domain:IT-Pair:IT-JA, 
> IT-JA:Medical, etc."
>
> To me that is the same as saying: you can use anything. Of course we 
> can wrap the "anything" in a field saying "here is MT engine 
> information". Is that what you mean?
>
>     , the Text analysis may need something else
>
>
> I actually doubt that the text analysis "anything" will be more 
> specific. My prediction is that there will be no more interop than 
> saying "in this field there is data category specific information: ...".
>
> So you could achieve that by changing your proposal like this
> <its:processInfo>
>   <its:toolInfo xml:id="T1">
>    <its:toolName>Bing Translator</its:toolName>
>    <its:toolVersion>123</its:toolVersion>
>    <its:toolAddInfo datacategory="mtconfidence">ja-t-it</its:toolAddInfo>
>   </its:toolInfo>
>   <its:toolInfo xml:id="T2">
>    <its:toolName>myMT</its:toolName>
>    <its:toolVersion>456</its:toolVersion>
>    <its:toolAddInfo datacategory="mtconfidence">Domain:IT-Pair:IT-JA</its:toolAddInfo>
>   </its:toolInfo>
> </its:processInfo>
>
> and allow for several toolAddInfo elements in one "toolInfo". You won't 
> gain a lot from these, but no less than with "FR-to-EN-General" inside 
> "toolValue" at
> http://lists.w3.org/Archives/Public/public-multilingualweb-lt/2012Oct/0000.html
>
> Best,
>
> Felix
>
>     , etc. It seems each data category will need one or two entries that
>     mean different things depending on the data category. We can use a
>     common element for this, but then we need to have one tool
>     information per data category.
>
>     Maybe the examples people are working on (action items 239 to 243
>     for Arle, Phil, Declan and Tadej) will help in defining this.
>
>     Cheers
>     -yves
>
>
>
>
>
>
> -- 
> Felix Sasaki
> DFKI / W3C Fellow
>
Received on Friday, 5 October 2012 11:43:41 UTC