- From: Felix Sasaki <fsasaki@w3.org>
- Date: Thu, 20 Sep 2012 19:04:54 +0200
- To: "public-multilingualweb-lt@w3.org" <public-multilingualweb-lt@w3.org>
- Message-ID: <CAL58czrUWa2=AWpCd2tcXR4FPfD72q36jUh-a+XGTQziYDUopg@mail.gmail.com>
Hi all,

during today's call we discussed the overriding issue. The summary below does not take into account who said what, just the main points. Below that there is a proposal for a solution.

1) In ITS 1.0 we specified overriding of ITS information (given via one of: local markup, global rules, inheritance or defaults) to be "complete". That can be shown by the ITS 1.0 test suite files. For each node, they specify the approach (local, global, etc.) by which the ITS information was created. Example:

  <o:node path="/{}msgList/{}body[1]/{}msg[3]" outputType="new-value-global">
    <o:output o:locNoteType="alert">
      <o:locNoteText>The variable <code>{0}</code> has three possible values: 'printer','stacker' and 'stapler options'.</o:locNoteText>
    </o:output>
  </o:node>

That only makes sense with complete overriding.

2) In ITS 2.0 there are data categories like mtConfidence or textAnalysisAnnotation that would benefit a lot from not having to specify information like "which tool produced this annotation" or "which tool created this machine translation" in sync with other information, e.g. the MT confidence score or the actual named entity annotation.

3) Partial overriding might be a solution to the requirement of 2). One could state once, e.g. at the top of a document, which tools have been used, and apply that to the complete document.

4) Partial overriding would create backwards compatibility issues (see 1) above) and problems for other data categories. E.g. for localization quality issue, there are a lot of mutually exclusive options for using metadata. Partial overriding can get very complex once we start to use pointers (e.g. localizationQualityTypePointer) or standoff markup (again for localization quality issue). Also, sometimes an inherited value might not be what the creator of the metadata intended, e.g. having an inherited "alert" localization note type.

5) Another solution might be a compound data category.
That was mentioned during the call, but we ran out of time to discuss it.

6) For the data categories in question (the tool-specific parts of mtConfidence and text analysis annotation), it seems that there is no need for global rules or inline markup. It would be sufficient to have a means to say: "for a given document the following tooling was used".

Any comments on that summary?

SOLUTION PROPOSAL

My proposal would be that we do not change overriding behavior at all. From 6), it seems that the tool-specific information does not need selection at all, so there is no need for local markup, global rules, inheritance or defaults. It is sufficient to express: "When processing data category X, the ITS-aware processor MUST pass the information given about tooling to the application". So we could define a description of tooling information, based on Christian's proposal, and say each description MUST make clear what data category it relates to.

The effect would be:

- Dropping the tool-specific part of text analysis annotation
- Dropping the tool-specific part of mtConfidence
- Having a tool description format that is not part of the data to be processed or of global rules, but that is independent. In addition to the tooling information in Christian's proposal, the format should state which data category it refers to.

Influence on processing ITS: ITS processing would have a parameter providing the tool-specific information. Note that this is different from its:param, which is a parameter for XPath processing.

Influence on conformance: an application implementing the data categories in question MUST be able to pass the tool information on to an application consuming ITS information. There is no influence on the selection mechanisms, that is, existing conformance clauses stay "as is".

Influence on test suite: in addition to the current format, or as a kind of "header" in it, we would define a block with the tool-specific info.

Thoughts?
If people agree on this, I'd be happy to draft a section and some test cases.

Best,

Felix

--
Felix Sasaki
DFKI / W3C Fellow
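
To make the difference between complete overriding (point 1) and partial overriding (points 3 and 4) of the summary concrete, here is a minimal Python sketch. It is an illustration only, not ITS processor code: the function names, the dictionary layout, the field values and the simplified precedence list are all assumptions made for this example.

```python
# Simplified precedence of ITS information sources, lowest to highest.
# This ordering and all names below are assumptions for illustration.
PRECEDENCE = ["default", "inherited", "global", "local"]


def complete_override(sources):
    """Use only the highest-precedence source; everything else is discarded."""
    for name in reversed(PRECEDENCE):
        if name in sources:
            return dict(sources[name])
    return {}


def partial_override(sources):
    """Merge field by field; higher-precedence sources win per field."""
    merged = {}
    for name in PRECEDENCE:
        merged.update(sources.get(name, {}))
    return merged


# Hypothetical node: a note type arrives via inheritance, while local
# markup supplies only the note text.
sources = {
    "inherited": {"locNoteType": "alert", "locNoteText": "Check terminology."},
    "local": {"locNoteText": "The variable has three possible values."},
}

complete = complete_override(sources)
partial = partial_override(sources)
print(complete)
print(partial)
```

With complete overriding the local information replaces the inherited information as a whole, so the inherited "alert" type is gone. With partial overriding the "alert" type survives next to the local note text, which is the possibly unintended effect mentioned in point 4).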
Received on Thursday, 20 September 2012 17:05:19 UTC