
Re: [All] its-tool-ref vs. its-tools-ref

From: Felix Sasaki <fsasaki@w3.org>
Date: Sun, 02 Dec 2012 13:41:01 +0100
Message-ID: <50BB4C5D.2050601@w3.org>
To: "public-multilingualweb-lt@w3.org" <public-multilingualweb-lt@w3.org>
Hi Dave, Phil, Yves, all,

I think both Phil and Yves come from the "data versus ITS annotation" 
distinction. We had discussed that before, and it always leads to the 
following three cases:
1) sometimes you want to provide tool information just about the textual 
data
2) sometimes you want to provide tool information about the textual data 
+ annotations
3) sometimes you want to provide tool information about just annotations

Now, 1) is no problem: just use tool-ref from provenance, and you are fine.

2) and 3) have the problem that the ITS annotation expressing tool 
information gets interrelated with the data category annotation: 
confidence from disambiguation, MT, or terminology. That leads to 
inheritance problems, and that was the reason why we invented the 
separate tool annotation mechanism.

If we now drop the data category independent tool mechanism, we will 
force users of disambiguation, MT, or terminology to implement provenance.

If we drop toolRef/revToolRef from provenance (as suggested by Yves 
below), we will not respond to use case 1). That is, if e.g. you created 
a text via machine translation and want to express the tool, you could 
not say this (I'm assuming here we would not have toolRef/revToolRef 
from provenance but the independent tool mechanism, and the attribute 
its:annotator is renamed to its:originator):

<p its:originators="mt-engine-xyz">Some text</p>

You would need to say

<p its:originators="xyz|mt-engine-xyz">Some text</p>

The question then is: what data category is "xyz"? In case 1) no data 
category is involved, so there is no suitable data category identifier.
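
To illustrate why case 1) does not fit this syntax, here is a minimal 
Python sketch. It assumes the attribute value is a space-separated list 
of "identifier|tool" pairs, as in the second example above; the function 
name parse_tool_refs is hypothetical and not taken from any ITS draft.

```python
def parse_tool_refs(value):
    """Split a value such as 'xyz|mt-engine-xyz' into {data_category: tool}.

    Assumption: entries are separated by whitespace and each entry has the
    form '<data category identifier>|<tool reference>'.
    """
    refs = {}
    for entry in value.split():
        category, sep, tool = entry.partition("|")
        if not sep or not category or not tool:
            # Case 1): a bare tool reference carries no data category.
            raise ValueError(f"no data category identifier in {entry!r}")
        refs[category] = tool
    return refs


print(parse_tool_refs("xyz|mt-engine-xyz"))
```

With this reading, a bare value such as "mt-engine-xyz" is rejected, 
because there is nothing to put in front of the "|".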

For me the solution would be to say:

- annotatorsRef is for responding to use cases 2) and 3)
- we'd need to make clear that the tool identified via its:annotatorsRef 
can be the tool actually inserting the annotation or one closely related 
to it. MT engines would fall under this category.
- we need to make clear that provenance is only meant for cases in which 
no ITS annotation is involved at all, that is scenario 1).

If we follow that line of reasoning, we would keep annotatorsRef (just 
insert the "s", as suggested by Yves), but make clear that this 
mechanism always involves ITS annotations.
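
As a rough sketch of how a consumer might look up the tool for one data 
category under this proposal, the Python below walks from an element up 
to its ancestors and returns the nearest matching entry. Everything here 
is an assumption for illustration: the HTML-style attribute name 
its-annotators-ref, the identifiers "mt-confidence" and "terminology", 
the tool names, and the rule that the nearest ancestor wins per data 
category.

```python
import xml.etree.ElementTree as ET

ATTR = "its-annotators-ref"  # assumed attribute name, not confirmed by the draft


def annotator_for(root, target, category):
    """Return the tool recorded for `category` on `target` or its nearest ancestor."""
    # ElementTree has no parent pointers, so build a child -> parent map first.
    parents = {child: parent for parent in root.iter() for child in parent}
    node = target
    while node is not None:
        for entry in node.get(ATTR, "").split():
            cat, _, tool = entry.partition("|")
            if cat == category and tool:
                return tool
        node = parents.get(node)
    return None  # no ITS annotation tool recorded for this data category


doc = ET.fromstring(
    '<div its-annotators-ref="mt-confidence|engine-a terminology|term-tool">'
    '<p its-annotators-ref="mt-confidence|engine-b">'
    '<span>Some text</span></p></div>'
)
span = doc.find(".//span")
print(annotator_for(doc, span, "mt-confidence"))
print(annotator_for(doc, span, "terminology"))
```

A lookup for a category that is not listed anywhere returns None, which 
corresponds to scenario 1): no ITS annotation is involved, so any tool 
information would have to come from provenance instead.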

I don't know what people think, but trying to move things forward I did 
some edits offline (without changes in CVS). See an update proposal with 
change marks here


Comments welcome.



Am 01.12.12 23:40, schrieb Phil Ritchie:
> Yves, All
> In trying to clarify the situation for myself:
> There are two sets of data:
>  A. The content (of primary importance) and the agents that have
>     created and interacted with the content;
>  B. The container of the content - metadata - and the agents that have
>     created and modified it.
> To me anything that pertains to A is the realm of Provenance; that 
> which pertains to B is the realm of ITS Tools Annotation.
> With this view, MT Confidence should use Provenance (tool, toolRef, 
> revTool, revToolRef). This is how locQuality* would have to record any 
> tools as it does not have its own tool related attributes.
> Now, this makes me realise that we then have data categories which are 
> related to each other. This would seem to require people to use Global 
> markup in order to capture this relation:
> <its:rules>
> <its:mtConfidenceRule....
>         <its:provRule....
> <its:locQualityIssueRule
> </its:rules>
> In Local markup can attributes from different data categories be mixed?
> <span its-mt-confidence="0.785" its-loc-quality-issue-comment="Even as 
> consumable as raw mt output this is bad!">This text was produced by 
> machine translation engine, is for gisted output but has been rated by 
> an end user.</span>
> Worse still, at this point in the proceedings it makes me realise 
> that I need a local attribute for locQuality* called 
> "locQualityIssueConformance". In fairness this is in the original 
> Requirements Specification 
> (https://www.w3.org/International/multilingualweb/lt/wiki/Requirements#Quality_Assurance_.28QA.29) 
> but fell through the cracks.
> Phil.
> From: Yves Savourel <ysavourel@enlaso.com>
> To: <public-multilingualweb-lt@w3.org>,
> Date: 01/12/2012 20:43
> Subject: RE: [All] its-tool-ref vs. its-tools-ref
> ------------------------------------------------------------------------
> Hi Felix, Jörg, Phil, Dave, all,
> === a) if we had toolsRef, wouldn't it be logical to have annotatorsRef 
> (rather than annotatorRef)?
> === b) Now I'm starting to wonder what the annotatorRef exactly is 
> pointing to. Reading this sentence:
> "...For example, the score of the MT Confidence data category 
> (provided via the mtConfidence attribute) is meaningful only when the 
> consumer of the information also knows what MT engine produced it,"
> It clearly refers to the MT engine, which may or may not be the actual 
> tool that does the annotation (i.e. adds the markup).
> I know it sounds like nitpicking, but if annotatorRef is about the tool 
> that created the information for the data category, as opposed to the 
> tool that introduced the actual ITS markup to hold that information 
> (and I know in some cases it can be the same tool), then:
> -- 'annotator' seems like a wrong choice. originatorsRef or originsRef 
> or processorsRef may be closer to the function of the attribute.
> -- And this brings me back to the Provenance's toolRef/revToolRef: It 
> seems then that annotatorRef holds the same information as those two 
> attributes. Therefore:
> 1) What does its:annotatorRef for provenance hold?
> and 2) Can't we remove toolRef/revToolRef to use annotatorRef with 
> 'provenance' and 'provenance-rev'?
> I think currently annotatorRef does not define clearly which tool it 
> addresses: The MT confidence example indicates the 'originator' of the 
> information, but the sentence: "The attribute annotatorRef provides a 
> way to associate all the annotations of a given data category within 
> the element with information about the processor that generated those 
> data category annotations." indicates the 'annotator' of the 
> information. Which is it?
> -ys
> -----Original Message-----
> From: Felix Sasaki [mailto:fsasaki@w3.org]
> Sent: Saturday, December 01, 2012 12:29 PM
> To: public-multilingualweb-lt@w3.org
> Subject: Re: [All] its-tool-ref vs. its-tools-ref
> Jörg, Phil, Yves, all,
> thanks for the feedback. I have changed this now to its:annotatorRef 
> (HTML its-annotator-ref). See the diff for the spec, examples and the 
> schemas attached. We can discuss this on the Monday call. If possible 
> I'd like to make the final change to this before the call, so please 
> send feedback before, if needed.
> Thanks,
> Felix
> Am 01.12.12 17:22, schrieb Jörg Schütz:
> > What about "its-annotator-ref" or "its:annotatorRef" for the ITS 
> > annotation?
> >
> > Cheers -- Jörg
> >
> > On Dec 01, 2012 at 14:07 (UTC+1), Yves Savourel wrote:
> >>> Any suggestions?
> >>
> >> agentsRef if we change toolsRef
> >> or agentRef/revAgentRef if we change toolRef/revToolRef
> >>
> >> -ys
> >>
> >>
> >> -----Original Message-----
> >> From: Felix Sasaki [mailto:fsasaki@w3.org]
> >> Sent: Saturday, December 01, 2012 4:33 AM
> >> To: public-multilingualweb-lt@w3.org
> >> Subject: [All] its-tool-ref vs. its-tools-ref
> >>
> >> Hi all,
> >>
> >> while working on
> >>
> >> http://www.w3.org/International/multilingualweb/lt/drafts/its20/its20.html#list-of-elements-and-attributes
> >>
> >>
> >> I realized that the provenance "reference to tools" attribute is very
> >> similar to the its tool annotation attribute:
> >>
> >> - in provenance: its-tool-ref or its:toolRef
> >> - for ITS annotation: its-tools-ref or its:toolsRef
> >>
> >> I think we should rename its-tools-ref (that is, the annotation
> >> mechanism, including the XML counterpart its:toolsRef) to avoid
> >> confusion. Since that is a normative change we should get this done
> >> on Monday before the call. Any suggestions?
> >>
> >> - Felix
> >>
> >>
> >>
> >
Received on Sunday, 2 December 2012 12:41:30 UTC
