- From: Felix Sasaki <fsasaki@w3.org>
- Date: Tue, 29 Jan 2013 18:26:57 +0100
- To: Mārcis Pinnis <marcis.pinnis@Tilde.lv>
- CC: "public-multilingualweb-lt@w3.org" <public-multilingualweb-lt@w3.org>
- Message-ID: <51080661.1020300@w3.org>
Hi Mārcis,

as stated at http://lists.w3.org/Archives/Public/public-multilingualweb-lt/2013Jan/0237.html - if we follow the "trust is important and the main comment is not about introducing multi-layer annotations to ITS2" reasoning, then we can close this discussion. Hence I'm providing a few more arguments in the 237 mail, please have a look.

Best,

Felix

On 29.01.13 17:50, Mārcis Pinnis wrote:
>
> Hi Felix,
>
> My comments are inline.
>
> Best regards,
>
> Mārcis ;o)
>
> -----Original Message-----
> From: Felix Sasaki [mailto:fsasaki@w3.org]
> Sent: Tuesday, January 29, 2013 6:35 PM
> To: Mārcis Pinnis
> Cc: public-multilingualweb-lt@w3.org
> Subject: Re: issue-68 from an annotation representation point of view, with potential implications for annotatorsRef and standoff markup
>
> Hi Mārcis,
>
> On 29.01.13 16:02, Mārcis Pinnis wrote:
> > > Hi Felix,
> > >
> > > Could you please explain this in a bit more detail. I am not sure I follow your idea how the stand-off mechanism won't be hierarchical and won't overlap anymore?
> > >
> > > So ... (I think it gets very confusing and difficult to follow what the current proposal is) if I understand correctly the ref and id places have been switched so that one could refer from a single span only to a single textAnalyticsAnnotation, right?
>
> Correct.
>
> > > Then if I understand correctly, the following would not resolve to having "University of London" as an organisation and "London" as a place, right? (probably not, therefore, I think I have lost the idea).
> > >
> > > <span tanRefs="id1">University of <span tanRefs="id2">London</span></span>
>
> It would. The point is: ITS annotations marked as not inheriting (like terminology or disambig) always refer to an element (or attribute) *text content* - but excluding nested elements. Now, in your example you have separate annotations.
> So the one with tanRefs="id1" would refer to text content "University of London", and the one with tanRefs="id2" would refer to London.
>
> Mārcis: OK, then I guess I do not understand what was meant with (see below, I marked it in red ... if you get the e-mail as HTML; if not, here is a quote from below):
>
> Mārcis: I quote: „it would also mean - that's different from my proposal - that annotations would not be hierarchical and they would not overlap, since they always - both in the inline and standoff case - are anchored at the same span of text”
>
> Mārcis: actually I am also a bit confused about overlapping - we keep mentioning it, but would that even be possible with the stand-off mark-up? For overlapping you need to specify a range (like the example given by Felix from NIF or TEI ... cannot remember right now, but that described ranges), but here we can only get nested annotations (hierarchical and also contradicting).
>
> > > <its:textAnalyticsAnnotations id="id1">
> > >   <its:textAnalyticsAnnotation
> > >     its-tan-type="entity"
> > >     its-tan-ident-ref="http://dbpedia.org/resource/UniversityOfLondon"
> > >     its-tan-class-ref="http://nerd.eurecom.fr/ontology#Organisation"
> > >     its-tan-confidence="0.7" annotatorsRef="tan|annotator-1"/>
> > > </its:textAnalyticsAnnotations>
> > > <its:textAnalyticsAnnotations id="id2">
> > >   <its:textAnalyticsAnnotation
> > >     its-tan-type="entity"
> > >     its-tan-ident-ref="http://dbpedia.org/resource/London"
> > >     its-tan-class-ref="http://nerd.eurecom.fr/ontology#Place"
> > >     its-tan-confidence="0.7" annotatorsRef="tan|annotator-1"/>
> > > </its:textAnalyticsAnnotations>
> > >
> > > Also ... is there a reason why the ref and id places have been switched?
>
> Yes - so that there is the same approach as for localization quality issue and provenance.
>
> > > In the initial Felix proposal the textAnalyticsAnnotation elements could refer to one span in text (thus allowing actually easy cleanup of the content).
> > > This IMO makes content management slightly more difficult (or am I missing something)?
>
> I'm not sure whether this is more difficult?
>
> Mārcis: In the other scenario, where you had the id in the span, you could remove the textAnalyticsAnnotation elements easily (or even maybe keep them apart, having a physical stand-off solution). This is not possible if you have a "ref" attribute in the span, because you would have to run through the whole document and remove all ref attributes that link to the section you would want to delete.
>
> Mārcis: I am not sure whether there is really a need for that, but where there are multi-layer multi-annotator solutions this might in some cases turn out to be handy...
>
> Mārcis: This is, however, just a comment ... maybe there is a really justified reason why for LQI and Provenance it has been decided to have the refs in the span...?!
>
> Best,
>
> Felix
>
> > > Best regards,
> > > Mārcis ;o)
> > >
> > > -----Original Message-----
> > > From: Felix Sasaki [mailto:fsasaki@w3.org]
> > > Sent: Tuesday, January 29, 2013 4:39 PM
> > > To: public-multilingualweb-lt@w3.org
> > > Subject: Re: issue-68 from an annotation representation point of view, with potential implications for annotatorsRef and standoff markup
> > >
> > > On 29.01.13 10:56, Tadej Štajner wrote:
> > >> Hi, Felix, Phil,
> > >> maybe 'tanRefs' was misleading. The intention was to point to an its:textAnalysisAnnotations element, which could in turn contain several its:textAnalysisAnnotation elements that all describe the same fragment.
> > >
> > > Thanks for the clarification, Tadej - that makes things clearer to me.
> > >
> > > I think it also means that we could - instead of "my" standoff proposal - have standoff markup for a joint terminology + disambiguation data category, to allow for both kinds of annotations to be represented for the same fragment.
> > > @Mārcis: it would also mean - that's different from my proposal - that annotations would not be hierarchical and they would not overlap, since they always - both in the inline and standoff case - are anchored at the same span of text.
> > >
> > > Best,
> > >
> > > Felix
> > >
> > >> Is this valid usage of the its:textAnalysisAnnotations, or was it only meant to be a container for the individual rules? I was looking at this example for inspiration:
> > >> http://www.w3.org/International/multilingualweb/lt/drafts/its20/examples/xml/EX-locQualityIssue-local-2.xml
> > >>
> > >> Alternatively, having multiple values would also work equivalently; then we could point to individual textAnalysisAnnotation statements.
> > >> -- Tadej
> > >>
> > >> On 29. 01. 2013 10:41, Felix Sasaki wrote:
> > >>> Thanks, Phil. Tadej, was the intention of its:tanRefs at
> > >>> http://lists.w3.org/Archives/Public/public-multilingualweb-lt/2013Jan/0212.html
> > >>> to have several pointers, e.g. allow for
> > >>> its:tanRefs="tan1 tan2 tan3"
> > >>> or just one, that is only "tan1"?
> > >>>
> > >>> Best,
> > >>>
> > >>> Felix
> > >>>
> > >>> On 29.01.13 10:34, Phil Ritchie wrote:
> > >>>> All
> > >>>>
> > >>>> @Felix: "But while doing that a question on the LQI/Provenance implementers: is it a feature that you point to just one external standoff unit, or an oversight, and could it be several ones?"
> > >>>>
> > >>>> My current thinking is that stand-off stores many annotations for one segment. This is because if several segments are linked to one stand-off block, then if one of those segments needs to have another unique issue registered against it, you have to copy the stand-off, add the unique annotation and change the reference ids so that the link is between the segment with the additional annotation and the copied stand-off. Complex.
> > >>>>
> > >>>> Another argument for pointing to a single stand-off is that although the "classification" attributes of the markup might be identical (e.g. loc-quality-issue-type="style" loc-quality-issue-severity="75"), each may have a different loc-quality-issue-comment to highlight the specific nature of the error.
> > >>>>
> > >>>> Hmm. The benefit of the id being on the segment/element and the idRefs being on the stand-off really comes into its own if you want to have multiple annotations across many data categories for the same segment/element.
> > >>>>
> > >>>> <span id="loaded">blah</span>
> > >>>>
> > >>>> <its:prov ref="loaded"...
> > >>>> <its:locQualityIssues ref="loaded"...
> > >>>> <its:textAnalysis ref="loaded"
> > >>>> (on the train, I know this is not valid markup.)
> > >>>>
> > >>>> Phil
> > >>>>
> > >>>> On 28 Jan 2013, at 19:57, "Felix Sasaki" <fsasaki@w3.org> wrote:
> > >>>>
> > >>>>> But while doing that a question on the LQI/Provenance implementers: is it a feature that you point to just one external standoff unit, or an oversight, and could it be several ones?
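
A minimal Python sketch of the lookup discussed in this thread, assuming the draft syntax quoted above: a tanRefs attribute on a span points at the id of an its:textAnalyticsAnnotations block. The wrapper element "doc", the function names resolve and drop_block, and the placement of the standoff blocks as direct children of the root are assumptions made for illustration only; this is not final ITS 2.0 markup. The second function illustrates the cleanup concern raised above: with the reference on the span, deleting a standoff block also means scanning the content for dangling references.

```python
import xml.etree.ElementTree as ET

ITS = "http://www.w3.org/2005/11/its"
BLOCK = f"{{{ITS}}}textAnalyticsAnnotations"

# Hypothetical document using the draft attribute names from the thread.
DOC = """<doc xmlns:its="http://www.w3.org/2005/11/its">
  <p><span tanRefs="id1">University of <span tanRefs="id2">London</span></span></p>
  <its:textAnalyticsAnnotations id="id1">
    <its:textAnalyticsAnnotation its-tan-type="entity"
      its-tan-class-ref="http://nerd.eurecom.fr/ontology#Organisation"/>
  </its:textAnalyticsAnnotations>
  <its:textAnalyticsAnnotations id="id2">
    <its:textAnalyticsAnnotation its-tan-type="entity"
      its-tan-class-ref="http://nerd.eurecom.fr/ontology#Place"/>
  </its:textAnalyticsAnnotations>
</doc>"""


def resolve(root):
    """Return (span text, [class refs]) for every span carrying tanRefs."""
    blocks = {b.get("id"): b for b in root.iter(BLOCK)}
    result = []
    for span in root.iter("span"):
        ref = span.get("tanRefs")
        if ref is None:
            continue
        block = blocks.get(ref)
        # Each child of a block is one annotation of the same fragment.
        classes = []
        if block is not None:
            classes = [a.get("its-tan-class-ref") for a in block]
        # Full text of the span, including text inside nested spans.
        result.append(("".join(span.itertext()), classes))
    return result


def drop_block(root, block_id):
    """Remove a standoff block and strip the span references that pointed to it.

    Assumes standoff blocks are direct children of the root element.
    Returns the number of spans that had to be touched.
    """
    for block in root.findall(BLOCK):
        if block.get("id") == block_id:
            root.remove(block)
    stripped = 0
    for span in root.iter("span"):
        if span.get("tanRefs") == block_id:
            del span.attrib["tanRefs"]
            stripped += 1
    return stripped


if __name__ == "__main__":
    tree_root = ET.fromstring(DOC)
    for text, classes in resolve(tree_root):
        print(text, classes)
```

In this sketch the two nested spans resolve independently, so the outer span maps to the organisation annotation and the inner one to the place annotation.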
Received on Tuesday, 29 January 2013 17:27:30 UTC