
Re: [Issue-68] Disambiguation (and term)

From: Tadej Stajner <tadej.stajner@ijs.si>
Date: Tue, 26 Feb 2013 15:28:04 +0100
Message-ID: <512CC674.6010307@ijs.si>
To: Felix Sasaki <fsasaki@w3.org>
CC: Dave Lewis <dave.lewis@cs.tcd.ie>, public-multilingualweb-lt@w3.org
Yes, this paragraph has been here for a while. It does feel like something
for a 'best practices' document or subsection. In any case, Felix's
wording sounds better.
-- Tadej

On 2/26/2013 10:41 AM, Felix Sasaki wrote:
> Hi Dave, all,
>
> On 26.02.13 01:20, Dave Lewis wrote:
>> Hi Tadej, Guys,
>>
>> A typographical question, in the word file you have:
>> "
>>
>> *Note:*
>>
>> Text Analysis is primarily intended for textual content. 
>> Nevertheless, the data category can also be used in multi-media 
>> contexts. Example: objects on an image could be annotated with 
>> DBpedia IRIs.
>>
>> When serializing the ITS Text Analysis
>> <http://www.w3.org/International/multilingualweb/lt/drafts/its20/its20.html#Disambiguation>
>> data category markup in HTML, the preferred way is to serialize in RDFa
>> Lite or Microdata due to the existing search and crawling 
>> infrastructure that is able to consume these formats.
>> "
>>
>> ------------------------------------------------------------------------
>> But the two paragraphs seem to be separate topics, so they should be
>> under separate Note headings I think - assuming the issue about the RDFa
>> mapping is a note?
>
> Both are part of one note - we could e.g. have bullet points in the
> note to make clear that these are separate topics. Also, Christian had 
> in the word doc a comment on the multi-media context, saying that if 
> we have this we should add a reference, e.g. to
> http://archive.xmlprague.cz/2013/presentations/Local_Knowledge_for_In_Situ_Services/index.xhtml#%2812%29
> Thoughts?
>
>>
>> Further, I don't know if I agree with the wording of the second 
>> paragraph. Sure, the example makes the valid point that Text Analysis 
>> can do the same job as the RDFa lite. But these are potentially 
>> different use cases. For example, it may be that the authors add text
>> annotation to feed into a subsequent terminology process, but use ITS
>> so that the ta annotation can easily be stripped out again later, i.e.
>> they have no intention to add the equivalent markup into the HTML.
>> Also, if the confidence score is a relevant piece of information,
>> then the RDFa solution should be 'preferred'.
>>
>> I'd suggest either moving this note to some best practice, or changing
>> the wording just so it highlights the equivalence of the mapping, but
>> doesn't try to go into detail about when one approach is preferred over
>> the other.
>
> Just FYI, that paragraph is in the last call draft too. How about 
> re-formulating like this
> "When serializing the ITS Text Analysis data category markup in HTML, 
> one way to serialize the markup is RDFa Lite or Microdata. This 
> serialization is due to the existing search and crawling 
> infrastructure that is able to consume these formats. For other usage 
> scenarios, e.g. adding text annotation to feed into a subsequent
> terminology process, using ITS Text Analysis data category markup
> natively is preferred. In this way, the markup can easily be stripped
> out again later."
>
> Best,
>
> Felix
>
>>
>> cheers,
>> Dave
>>
>>
>>
>> On 22/02/2013 14:45, Tadej Stajner wrote:
>>> Hi, all,
>>> after some discussion, this is the proposal we came up with. Felix
>>> already summarized it, now I'm attaching the actual document and the 
>>> examples.
>>> -- Tadej
>>>
>>> On 2/22/2013 12:31 PM, Felix Sasaki wrote:
>>>> Hi David,
>>>>
>>>> Marcis, Tadej, Christian and I worked on a text proposal. Tadej 
>>>> will send that out later today. Summary of the change:
>>>> - renaming disambiguation
>>>> - changing attribute prefixes to "ta"
>>>> - dropping the "level"
>>>> - rewriting the disambiguation section - but not changing behaviour.
>>>>
>>>> So I hope that we can close the issue next week, and implementers 
>>>> can start to change the attribute names in their tests and drop 
>>>> "level".
>>>>
>>>> Best,
>>>>
>>>> Felix
>>>>
>>>> On 22.02.13 12:17, Dr. David Filip wrote:
>>>>> Felix, Dave,
>>>>>
>>>>> It seems, based on the discussions to date, that this issue has been
>>>>> resolved and even addressed by Tadej in the test suite
>>>>> http://lists.w3.org/Archives/Public/public-multilingualweb-lt/2013Feb/0027.html 
>>>>>
>>>>>
>>>>> However, editorial actions to change the spec accordingly do not seem
>>>>> to have been assigned so far
>>>>> [agenda:] 
>>>>> http://lists.w3.org/Archives/Public/public-multilingualweb-lt/2013Feb/0089.html
>>>>> [minutes:] http://www.w3.org/2013/02/18-mlw-lt-minutes.html
>>>>> Tadej is drafting the changes based on the above minutes
>>>>> @Tadej, will those changes be ready to be assigned to a co-editor by
>>>>> the Monday call?
>>>>> Assigning these should be made an agenda point on Monday in order to
>>>>> be able to close Issue-68
>>>>>
>>>>> Thanks for your attention
>>>>> dF
>>>>>
>>>>>
>>>>> Dr. David Filip
>>>>> =======================
>>>>> LRC | CNGL | LT-Web | CSIS
>>>>> University of Limerick, Ireland
>>>>> telephone: +353-6120-2781
>>>>> cellphone: +353-86-0222-158
>>>>> facsimile: +353-6120-2734
>>>>> mailto: david.filip@ul.ie
>>>>>
>>>>
>>>>
>>>
>>
>
Received on Tuesday, 26 February 2013 14:28:46 UTC
