
Re: [Issue-68] Disambiguation (and term)

From: Felix Sasaki <fsasaki@w3.org>
Date: Tue, 26 Feb 2013 10:41:54 +0100
Message-ID: <512C8362.2000704@w3.org>
To: Dave Lewis <dave.lewis@cs.tcd.ie>
CC: Tadej Stajner <tadej.stajner@ijs.si>, public-multilingualweb-lt@w3.org
Hi Dave, all,

On 26.02.13 01:20, Dave Lewis wrote:
> Hi Tadej, Guys,
> A typographical question, in the word file you have:
> "
> *Note:*
> Text Analysis is primarily intended for textual content. Nevertheless, 
> the data category can also be used in multi-media contexts. Example: 
> objects on an image could be annotated with DBpedia IRIs.
> When serializing the ITS Text Analysis 
> <http://www.w3.org/International/multilingualweb/lt/drafts/its20/its20.html#Disambiguation> data 
> category markup in HTML, the preferred way is to serialize in RDFa 
> Lite or Microdata due to the existing search and crawling 
> infrastructure that is able to consume these formats.
> "
> ------------------------------------------------------------------------
> But the two paragraphs seem to be separate topics, so they should be 
> under separate Note headings I think - assuming the issue about RDFa 
> mapping is a note?

Both are part of one note - we could e.g. have bullet points in the note 
to make clear that these are separate topics. Also, Christian had in the 
word doc a comment on the multi-media context, saying that if we have 
this we should add a reference, e.g. to

> Further, I don't know if I agree with the wording of the second 
> paragraph. Sure, the example makes the valid point that Text Analysis 
> can do the same job as the RDFa lite. But these are potentially 
> different use cases. For example it may be that the authors add text 
> annotation to feed into a subsequent terminology process, but use ITS 
> so that the ta annotation can easily be stripped out again later, 
> i.e. they have no intention to add the equivalent 
> markup into the HTML. Also, if the confidence score is a relevant 
> piece of information, then the RDFa solution should be 'preferred'.
> I'd suggest either moving this note to some best practice, or change 
> the wording just so it highlights the equivalence of the mapping, but 
> doesn't try to go into detail of when one approach is preferred over 
> the other.

Just FYI, that paragraph is in the last call draft too. How about 
re-formulating it like this:
"When serializing the ITS Text Analysis data category markup in HTML, 
one way to serialize the markup is RDFa Lite or Microdata. This 
serialization benefits from the existing search and crawling infrastructure 
that is able to consume these formats. For other usage scenarios, e.g. 
adding text annotation to feed into a subsequent terminology process, using 
ITS Text Analysis data category markup natively is preferred. In this 
way, the markup can easily be stripped out again later."
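
To illustrate the "stripped out again later" scenario, here is a minimal sketch in Python. It assumes the HTML attributes all carry the "its-ta-" prefix (e.g. its-ta-ident-ref, its-ta-confidence), following the renaming to "ta" proposed in this thread. The function name and the sample markup are made up for illustration, and the regular expression is a simplification that does not parse HTML properly.

```python
import re

# Assumption: Text Analysis attributes in HTML all start with "its-ta-".
# Matches one such attribute with a quoted value, plus the whitespace before it.
ITS_TA_ATTR = re.compile(r"""\s+its-ta-[a-z-]+\s*=\s*(?:"[^"]*"|'[^']*')""")


def strip_text_analysis(html: str) -> str:
    """Remove every its-ta-* attribute and leave all other markup untouched."""
    return ITS_TA_ATTR.sub("", html)


# Hypothetical annotated fragment, used only for this example.
annotated = (
    '<p>Welcome to <span class="city" '
    'its-ta-ident-ref="http://dbpedia.org/resource/Dublin" '
    'its-ta-confidence="0.7">Dublin</span>!</p>'
)

print(strip_text_analysis(annotated))
```

Because the annotation lives in its own attributes, removing it leaves the remaining markup as it was. With an RDFa Lite serialization the annotation would share attributes such as "resource" or "typeof" with other metadata, so a blanket removal like this would probably not be safe.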



> cheers,
> Dave
> On 22/02/2013 14:45, Tadej Stajner wrote:
>> Hi, all,
>> after some discussion, this is the proposal we came up with. Felix already 
>> summarized it, now I'm attaching the actual document and the examples.
>> -- Tadej
>> On 2/22/2013 12:31 PM, Felix Sasaki wrote:
>>> Hi David,
>>> Marcis, Tadej, Christian and I worked on a text proposal. Tadej will 
>>> send that out later today. Summary of the change:
>>> - renaming disambiguation
>>> - changing attribute prefixes to "ta"
>>> - dropping the "level"
>>> - rewriting the disambiguation section - but not changing behaviour.
>>> So I hope that we can close the issue next week, and implementers 
>>> can start to change the attribute names in their tests and drop 
>>> "level".
>>> Best,
>>> Felix
>>> On 22.02.13 12:17, Dr. David Filip wrote:
>>>> Felix, Dave,
>>>> It seems, based on the discussions to date, that this issue has been
>>>> resolved and even addressed by Tadej in the test suite
>>>> http://lists.w3.org/Archives/Public/public-multilingualweb-lt/2013Feb/0027.html 
>>>> However, editorial actions to change the spec accordingly do not seem
>>>> to have been assigned so far
>>>> [agenda:] 
>>>> http://lists.w3.org/Archives/Public/public-multilingualweb-lt/2013Feb/0089.html
>>>> [minutes:] http://www.w3.org/2013/02/18-mlw-lt-minutes.html
>>>> Tadej is drafting the changes based on the above minutes
>>>> @Tadej, will those changes be ready to be assigned to a co-editor by
>>>> the Monday call?
>>>> Assigning these should be made an agenda point on Monday in order to
>>>> be able to close Issue-68
>>>> Thanks for your attention
>>>> dF
>>>> Dr. David Filip
>>>> =======================
>>>> LRC | CNGL | LT-Web | CSIS
>>>> University of Limerick, Ireland
>>>> telephone: +353-6120-2781
>>>> cellphone: +353-86-0222-158
>>>> facsimile: +353-6120-2734
>>>> mailto: david.filip@ul.ie
Received on Tuesday, 26 February 2013 09:42:29 UTC
