- From: Tadej Štajner <tadej.stajner@ijs.si>
- Date: Fri, 05 Oct 2012 13:42:31 +0200
- To: public-multilingualweb-lt@w3.org
- Message-ID: <506EC7A7.2090406@ijs.si>
Hi Yves, Felix,

From the text analysis (TA) tool standpoint, the information present in toolName, toolVersion and toolAddInfo covers all the use cases I've encountered so far with regard to figuring out what was generated where.

I have a feeling that toolAddInfo risks becoming a 'kitchen sink' attribute, and I'd pre-empt that by prescribing what kinds of things 'should' go there (for instance, for MT: language pairs and engine; for TA: inner engine and model parameters).

Structuring the toolInfo data model even more is likely overkill, and this looks like a good sweet spot.

-- Tadej

On 02. 10. 2012 23:28, Felix Sasaki wrote:
> Hi Yves, all,
>
> no opinion on my side on the delimiter topic, sorry for bringing it
> up. A comment on the tool specific aspect below.
>
> 2012/10/2 Yves Savourel <ysavourel@enlaso.com
> <mailto:ysavourel@enlaso.com>>
>
> > > <doc its:toolRefs="mtConfidence/file:///tools.xml#T1"
> > > xmlns:its="http://www.w3.org/2005/11/its">
> > >
> > > Would it make sense to use a different delimiter? "/" may
> > > conflict with "/" in paths.
> >
> > Hmm... almost any ASCII delimiter may also be in the path. The
> > first occurrence is the delimiter.
> > But I suppose '|' could be used instead. It just doesn't look as
> > graceful for some reason.
> >
> > > Do you need the "dataCategory" attribute? It seems the
> > > data category is made explicit via the reference mechanism in
> > > "its:toolRefs".
> > > Also, dropping the "dataCategory" attribute allows then to refer to
> > > the same tools from various data categories - e.g. OKAPI used
> > > for quality
> > > issue versus for creating translation metadata etc.
> >
> > I'm not sure we can go from many data category instances to one
> > tool information. And this is where I'm having trouble with tool
> > information:
> >
> > The mtConfidence need to have a defined way to specify the engine used
>
> Is there really a defined way?
> The current version of the draft at
> http://www.w3.org/International/multilingualweb/lt/drafts/its20/its20.html#mtconfidence-implementation
> says:
>
> "Some examples of values are:
> A BCP 47 language tag with t-extension, e.g. ja-t-it for an Italian to
> Japanese MT engine
> A Domain as per the Section 6.9: Domain
> A privately structured string, eg. Domain:IT-Pair:IT-JA,
> IT-JA:Medical, etc."
>
> To me that is the same as saying: you can use anything. Of course we
> can wrap the "anything" in a field saying "here is MT engine
> information". Is that what you mean?
>
> > , the Text analysis may need something else
>
> I actually doubt that the text analysis "anything" will be more
> specific. My prediction is that there will be not more interop than
> saying "in this field there is data category specific information: ...".
>
> So you could achieve that by changing your proposal like this
> <its:processInfo>
> <its:toolInfo xml:id="T1">
> <its:toolName>Bing Translator</its:toolName>
> <its:toolVersion>123</its:toolVersion>
> <its:toolAddInfo datacategory="mtconfidence">ja-t-it</its:toolAddInfo>
> </its:toolInfo>
> <its:toolInfo xml:id="T2">
> <its:toolName>myMT</its:toolName>
> <its:toolVersion>456</its:toolVersion>
> <its:toolAddInfo datacategory="mtconfidence">Domain:IT-Pair:IT-JA</its:toolAddInfo>
> </its:toolInfo>
> </its:processInfo>
>
> and allow for several addInfo elements in one "toolInfo". You won't
> gain a lot from these, but not less as with "FR-to-EN-General" inside
> "toolValue" at
> http://lists.w3.org/Archives/Public/public-multilingualweb-lt/2012Oct/0000.html
>
> Best,
>
> Felix
>
> > , etc. It seems each data category will need one or two entry that
> > mean different things depending on the data category. We can use a
> > common element for this, but then we need to have one tool
> > information per data category.
> >
> > Maybe the examples people are working on (action items 239 to 243
> > for Arle, Phil, Declan and Tadej) will help in defining this.
> >
> > Cheers
> > -yves
>
> --
> Felix Sasaki
> DFKI / W3C Fellow
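
For illustration only, here is a minimal Python sketch of the two mechanisms discussed in this thread: splitting an its:toolRefs value at the first occurrence of the delimiter, and reading a toolInfo record that carries one or more toolAddInfo children. The markup is the proposal quoted above, with the closing tags written out so that it parses; it is a draft under discussion and should not be read as what the ITS 2.0 specification finally defines. The names split_tool_ref, read_tool_info and SAMPLE are hypothetical and exist only for this example.

```python
import xml.etree.ElementTree as ET

ITS_NS = "http://www.w3.org/2005/11/its"
# ElementTree expands the predeclared xml: prefix to this namespace.
XML_ID = "{http://www.w3.org/XML/1998/namespace}id"

# Proposed markup from the thread (assumption: closing tags written out).
SAMPLE = """<its:processInfo xmlns:its="http://www.w3.org/2005/11/its">
  <its:toolInfo xml:id="T1">
    <its:toolName>Bing Translator</its:toolName>
    <its:toolVersion>123</its:toolVersion>
    <its:toolAddInfo datacategory="mtconfidence">ja-t-it</its:toolAddInfo>
  </its:toolInfo>
  <its:toolInfo xml:id="T2">
    <its:toolName>myMT</its:toolName>
    <its:toolVersion>456</its:toolVersion>
    <its:toolAddInfo datacategory="mtconfidence">Domain:IT-Pair:IT-JA</its:toolAddInfo>
  </its:toolInfo>
</its:processInfo>"""


def split_tool_ref(value, delimiter="/"):
    """Split 'dataCategory<delimiter>URI' at the FIRST delimiter only,
    so later occurrences inside the URI path are left untouched."""
    data_category, sep, uri = value.partition(delimiter)
    if not sep or not data_category or not uri:
        raise ValueError(f"not a 'category{delimiter}URI' pair: {value!r}")
    return data_category, uri


def read_tool_info(xml_text, tool_id):
    """Return name, version and all toolAddInfo entries for one tool,
    or None if no toolInfo element has the given xml:id."""
    root = ET.fromstring(xml_text)
    for tool in root.iter(f"{{{ITS_NS}}}toolInfo"):
        if tool.get(XML_ID) == tool_id:
            return {
                "name": tool.findtext(f"{{{ITS_NS}}}toolName"),
                "version": tool.findtext(f"{{{ITS_NS}}}toolVersion"),
                # A list keeps several addInfo elements per tool apart.
                "add_info": [
                    (el.get("datacategory"), el.text)
                    for el in tool.findall(f"{{{ITS_NS}}}toolAddInfo")
                ],
            }
    return None


if __name__ == "__main__":
    category, uri = split_tool_ref("mtConfidence/file:///tools.xml#T1")
    tool_id = uri.rpartition("#")[2]  # fragment identifier after '#'
    print(category, uri, read_tool_info(SAMPLE, tool_id))
```

Because only the first delimiter is significant, the slashes in file:///tools.xml do no harm, and passing delimiter="|" would work the same way. Note that the content of toolAddInfo comes back as an opaque string: nothing in this sketch can tell ja-t-it from Domain:IT-Pair:IT-JA, which is the interoperability concern raised in the thread.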
Received on Friday, 5 October 2012 11:43:41 UTC