Re: [ISSUE-42] Wording for the tool information markup

Hi Yves, all,

I have no opinion on the delimiter topic; sorry for bringing it up. A
comment on the tool-specific aspect is below.

2012/10/2 Yves Savourel <ysavourel@enlaso.com>

> > <doc its:toolRefs="mtConfidence/file:///tools.xml#T1"
> > xmlns:its="http://www.w3.org/2005/11/its">
> >
> > Would it make sense to use a different delimiter? "/" may conflict with
> > "/" in paths.
>
> Hmm... almost any ASCII delimiter may also be in the path. The first
> occurrence is the delimiter.
> But I suppose '|' could be used instead. It just doesn't look as graceful
> for some reason.
>
>
> > Do you need the "dataCategory" attribute? It seems the
> > data category is made explicit via the reference mechanism in
> > "its:toolRefs".
> > Also, dropping the "dataCategory" attribute allows then to refer to
> > the same tools from various data categories - e.g. OKAPI used for quality
> > issue versus for creating translation metadata etc.
>
> I'm not sure we can go from many data category instances to one tool
> information. And this is where I'm having trouble with tool information:
>
> The mtConfidence needs to have a defined way to specify the engine used


Is there really a defined way? The current version of the draft at
http://www.w3.org/International/multilingualweb/lt/drafts/its20/its20.html#mtconfidence-implementation
says:

"Some examples of values are:
A BCP 47 language tag with t-extension, e.g. ja-t-it for an Italian to
Japanese MT engine
A Domain as per the Section 6.9: Domain
A privately structured string, eg. Domain:IT-Pair:IT-JA, IT-JA:Medical,
etc."

To me that is the same as saying: you can use anything. Of course we can
wrap the "anything" in a field saying "here is MT engine information". Is
that what you mean?



> , the Text analysis may need something else


I actually doubt that the text analysis "anything" will be more specific.
My prediction is that there will be no more interop than from saying "in this
field there is data category specific information: ...".

So you could achieve that by changing your proposal like this

<its:processInfo>
 <its:toolInfo xml:id="T1">
  <its:toolName>Bing Translator</its:toolName>
  <its:toolVersion>123</its:toolVersion>
  <its:toolAddInfo datacategory="mtconfidence">ja-t-it</its:toolAddInfo>
 </its:toolInfo>
 <its:toolInfo xml:id="T2">
  <its:toolName>myMT</its:toolName>
  <its:toolVersion>456</its:toolVersion>
  <its:toolAddInfo datacategory="mtconfidence">Domain:IT-Pair:IT-JA</its:toolAddInfo>
 </its:toolInfo>
</its:processInfo>


and allow for several toolAddInfo elements in one "toolInfo". You won't gain a
lot from these, but no less than with "FR-to-EN-General" inside "toolValue"
at
http://lists.w3.org/Archives/Public/public-multilingualweb-lt/2012Oct/0000.html
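To make the idea a bit more concrete, here is a rough Python sketch of how a
consumer might read such markup. It is only an illustration under several
assumptions: the sample is a well-formed rendering of the proposal above with
an xmlns:its declaration added so that it parses on its own; the second
toolAddInfo on T2 (datacategory="domain") is made up to show several entries
per tool; the function names split_tool_ref and add_info_by_category are
hypothetical; the its:toolRefs value is split at the first "/" as Yves
describes; and matching the data category name case-insensitively is my own
guess, not something the draft says.

```python
import xml.etree.ElementTree as ET

ITS_NS = "http://www.w3.org/2005/11/its"
XML_NS = "http://www.w3.org/XML/1998/namespace"

# Well-formed rendering of the proposal; the namespace declaration and the
# "domain" entry on T2 are assumptions added for this illustration.
SAMPLE = """\
<its:processInfo xmlns:its="http://www.w3.org/2005/11/its">
 <its:toolInfo xml:id="T1">
  <its:toolName>Bing Translator</its:toolName>
  <its:toolVersion>123</its:toolVersion>
  <its:toolAddInfo datacategory="mtconfidence">ja-t-it</its:toolAddInfo>
 </its:toolInfo>
 <its:toolInfo xml:id="T2">
  <its:toolName>myMT</its:toolName>
  <its:toolVersion>456</its:toolVersion>
  <its:toolAddInfo datacategory="mtconfidence">Domain:IT-Pair:IT-JA</its:toolAddInfo>
  <its:toolAddInfo datacategory="domain">IT</its:toolAddInfo>
 </its:toolInfo>
</its:processInfo>
"""


def split_tool_ref(value):
    """Split 'category/uri#id' at the FIRST '/', so slashes in the URI survive."""
    category, _, uri = value.partition("/")
    document, _, fragment = uri.partition("#")
    return category, document, fragment


def add_info_by_category(xml_text):
    """Map each tool id to {data category: [opaque toolAddInfo strings]}."""
    root = ET.fromstring(xml_text)
    result = {}
    for tool in root.iter(f"{{{ITS_NS}}}toolInfo"):
        tool_id = tool.get(f"{{{XML_NS}}}id")
        per_category = result.setdefault(tool_id, {})
        for info in tool.findall(f"{{{ITS_NS}}}toolAddInfo"):
            category = info.get("datacategory")
            # The content stays an uninterpreted string: "anything" goes.
            per_category.setdefault(category, []).append((info.text or "").strip())
    return result


category, document, tool_id = split_tool_ref("mtConfidence/file:///tools.xml#T1")
infos = add_info_by_category(SAMPLE)
# Case folding of the category name is an assumption of this sketch.
print(infos[tool_id].get(category.lower(), []))
```

Note that the consumer learns nothing beyond "this string belongs to that data
category for that tool", which is the limited level of interop I would expect
in any case.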

Best,

Felix



> , etc. It seems each data category will need one or two entries that mean
> different things depending on the data category. We can use a common
> element for this, but then we need to have one tool information per data
> category.
>
> Maybe the examples people are working on (action items 239 to 243 for
> Arle, Phil, Declan and Tadej) will help in defining this.
>
> Cheers
> -yves
>
>
>
>


-- 
Felix Sasaki
DFKI / W3C Fellow

Received on Tuesday, 2 October 2012 21:29:10 UTC