RE: issue-68 from an annotation representation point of view, with potential implications for annotatorsRef and standoff markup

From: Yves Savourel <ysavourel@enlaso.com>
Date: Mon, 28 Jan 2013 07:08:06 -0700
To: <public-multilingualweb-lt@w3.org>
Message-ID: <assp.0740c82129.assp.074040a703.01e201cdfd60$e53b2a00$afb17e00$@com>
>> b) Links to term DB have been put in termInfoRef so far,
>> so we would basically split the termInfoRef features into two.
>> I'm not against, I'm just pointing out that conversion from 
>> ITS1 to ITS2 won't be automatic.
>
> Not sure if I understand: if we set "tan-type" to "term" and have 
> the "ident-ref" attribute: wouldn't that be the link to the termDB?

What I meant is: ITS1 has a single attribute that represents two things, while ITS2 would have two distinct attributes. Converting ITS1 markup to ITS2 would not be a simple matter of renaming or auto-rewriting the markup. A human would have to make a choice in some cases.


>> It seems we would have two very different ways to use standoff 
>> markup: LQI and Provenance use a reference to the standoff enclosing 
>> element, here we would have the reference in the enclosing element 
>> to the local inline content. It would probably be better to be consistent.
>
> Not sure - if the term+disambig standoff is adopted, I actually would rather 
> call it differently, e.g. "multilayer annotation" - and keep everything else 
> as is. The rationale is that the other mechanisms seem to work fine. So I 
> wouldn't make the multilayer approach for disambig+term a big deal and new 
> architectural principle, to be deployed for other data categories - but rather 
> a means (to be defined in the respective sub section) trying to resolve a 
> last call comment.

I don't see a difference between what the LQI/Provenance standoff markup does and what this standoff for Term+Disambiguation does. Both allow you to assign several instances of the same data category to a single node. As a user I would not understand why there would be two different ways to do it. Or am I missing something?


>> Also, for LQI and Provenance we also have either local attributes or standoff 
>> markup, not both at the same time like in this proposal. It would be 
>> also probably better to be consistent.
>
> See above - I think the proposed algorithm for fetching the multilayer info for 
> disambig+term described in the previous mail (fetch for each node the annotation 
> wrapper and resolve the IDs) is fairly simple - and we wouldn't need to 
> change other parts of the spec IMO.

And we should not. 
But, IMO, we should be consistent in how standoff markup is done, so we should adjust the Term+Disambiguation proposal to match the existing standoff mechanism we already have.

In other words: why should Term+Disambiguation standoff work by pointing from the standoff data into the inline content, while the current standoff mechanism works the other way around? I don't see any difference in what we are trying to achieve. So why a different mechanism? (Just trying to understand.)
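
To make the two directions concrete, here is a minimal sketch, not spec text. The first sample follows the existing LQI standoff pattern (inline content points to the standoff element through its:locQualityIssuesRef). The second sample uses purely hypothetical names (annotations, ann, target, kind) to stand in for a standoff block that points into the inline content; they are assumptions for illustration and are not taken from the proposal. Both resolvers end up with the same kind of result: several annotations attached to one node.

```python
import xml.etree.ElementTree as ET

ITS = "http://www.w3.org/2005/11/its"
XML_ID = "{http://www.w3.org/XML/1998/namespace}id"

# Direction A (existing LQI style): inline content references the standoff element.
DOC_A = f"""
<doc xmlns:its="{ITS}">
  <its:locQualityIssues xml:id="lq1">
    <its:locQualityIssue locQualityIssueType="misspelling"/>
    <its:locQualityIssue locQualityIssueType="typographical"/>
  </its:locQualityIssues>
  <p><span its:locQualityIssuesRef="#lq1">c'es</span> la vie</p>
</doc>
"""

# Direction B (HYPOTHETICAL element/attribute names): standoff entries
# reference the inline content by its xml:id.
DOC_B = """
<doc>
  <annotations>
    <ann target="#s1" kind="term"/>
    <ann target="#s1" kind="disambiguation"/>
  </annotations>
  <p><span xml:id="s1">Dublin</span> is a city</p>
</doc>
"""


def index_by_id(root):
    """Map every xml:id value in the tree to its element."""
    return {el.get(XML_ID): el for el in root.iter() if el.get(XML_ID)}


def inline_to_standoff(root):
    """Direction A: start at the inline node, follow its reference to the standoff."""
    by_id = index_by_id(root)
    result = {}
    for el in root.iter():
        ref = el.get(f"{{{ITS}}}locQualityIssuesRef")
        if ref:
            group = by_id[ref.lstrip("#")]
            result[el.text] = [issue.get("locQualityIssueType") for issue in group]
    return result


def standoff_to_inline(root):
    """Direction B: start at each standoff entry, follow its target to the inline node."""
    by_id = index_by_id(root)
    result = {}
    for ann in root.iter("ann"):
        node = by_id[ann.get("target").lstrip("#")]
        result.setdefault(node.text, []).append(ann.get("kind"))
    return result


if __name__ == "__main__":
    print(inline_to_standoff(ET.fromstring(DOC_A.strip())))
    print(standoff_to_inline(ET.fromstring(DOC_B.strip())))
```

Either way a consumer gets a list of annotations per node; the only thing that changes is which side carries the pointer, which is why a single direction for all data categories seems easier to explain to users.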


>> (From Dave)
>> iii) you lose the ability to associate standoff elements and 
>> content through global ITS rules, and hence lose the ability 
>> to annotate content in attributes.
>
> True - but that ability is not needed for disambiguation and terminology 
> anyway: as I understand it most annotation tools in both areas work on 
> text content. Also, I haven't seen global rules for terminology 
> working on attribute content. Others, have you?

Not so far on attribute content. But global rules for terminology, yes: dfn, for example.


Best,
-yves
Received on Monday, 28 January 2013 14:08:32 UTC