Re: NLP Interchange Format was: Re: Let's drop RDFa in the requirements ! from Felix Sasaki on 2012-05-11 (public-multilingualweb-lt@w3.org from May 2012)

From: Felix Sasaki <fsasaki@w3.org>
Date: Fri, 11 May 2012 09:33:08 +0200
To: Sebastian Hellmann <hellmann@informatik.uni-leipzig.de>
Cc: Dave Lewis <dave.lewis@cs.tcd.ie>, Maxime Lefrançois <maxime.lefrancois@inria.fr>, Multilingual Web LT Public List <public-multilingualweb-lt@w3.org>, public-ontolex@w3.org
Message-ID: <CAL58czqwVCaOdk2rNpEv2s-QRPRcf4w6Y4G0v4gk=fqqvZKPDw@mail.gmail.com>
Thanks a lot for this, Sebastian.

As we describe at
http://www.w3.org/International/multilingualweb/lt/wiki/Requirements#Implementation_Approach
and in
https://www.w3.org/International/multilingualweb/lt/track/issues/2
we envisage ITS attributes (its-*) for HTML5 and an automatic conversion to
RDFa:

"

   - the working group will provide an algorithm to convert its- attributes
   into RDFa and Microdata markup, to serve the needs of the Semantic Web
   community and of search engine optimization.
   - The conversion to RDFa will add URIs to each metadata item in an HTML5
   document. This is needed as reference points for the metadata items after
   extraction of RDF.

"

Tadej is likely to work on describing that conversion algorithm (which I
guess will be pretty straightforward). Sebastian or others, how would NIF fit
into this picture? What alignment between the conversion to RDFa and
potentially to NIF is needed?
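
To make the second bullet above concrete, here is a minimal sketch of the "add a URI to each metadata item" step. It is not the working group's algorithm, which had not been written at this point. It only collects triples and does not emit RDFa markup. The names ItsCollector, its_triples, the placeholder namespace ITS_NS and the generated "its-item-N" ids are all assumptions made for illustration.

```python
from html.parser import HTMLParser

# Placeholder namespace, assumed for illustration only.
ITS_NS = "http://example.org/its#"


class ItsCollector(HTMLParser):
    """Collect (subject URI, property URI, value) triples from its-* attributes."""

    def __init__(self, base_uri):
        super().__init__()
        self.base_uri = base_uri
        self.counter = 0
        self.triples = []

    def handle_starttag(self, tag, attrs):
        its_attrs = [(name, value) for name, value in attrs
                     if name.startswith("its-")]
        if not its_attrs:
            return
        # Reuse an existing id, otherwise mint one (hypothetical naming scheme).
        elem_id = dict(attrs).get("id")
        if elem_id is None:
            self.counter += 1
            elem_id = f"its-item-{self.counter}"
        subject = f"{self.base_uri}#{elem_id}"
        for name, value in its_attrs:
            self.triples.append((subject, ITS_NS + name[len("its-"):], value))


def its_triples(html, base_uri):
    """Return the triples found in an HTML string, in document order."""
    collector = ItsCollector(base_uri)
    collector.feed(html)
    collector.close()
    return collector.triples
```

Each subject here is a fragment URI of the annotated element, which is the kind of reference point the quoted requirement asks for once the RDF has been extracted.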

Felix


2012/5/10 Sebastian Hellmann <hellmann@informatik.uni-leipzig.de>

> Dear all,
> I was following the conversation about RDFa and would like to draw your
> attention to the NLP Interchange Format (NIF), which we are still
> developing within LOD2. Although I am not 100% up-to-date with all your
> requirements, I would assume that NIF tackles some of the issues you are
> having, e.g. the "no literals as subject" problem or a general uncertainty
> about how to handle things.
>
> Please find the latest document (one week old) about it here:
> http://svn.aksw.org/papers/2012/WWW_NIF/public/string_ontology.pdf
>
> We are currently gathering requirements for NIF version 2.0. We will
> prepare a draft within the next two months and then hold a community
> review phase.
> I will be in Dublin, so please feel free to ask me any questions.
>
> NIF is already compatible with the lemon model and NERD.
>
> To compare it with Tadej's example, I made one here.
> It concerns the first occurrence of "Semantic Web" on
> http://www.w3.org/DesignIssues/LinkedData.html, highlighted here:
> http://pcai042.informatik.uni-leipzig.de/~swp12-9/vorprojekt/index.php?annotation_request=http%3A%2F%2Fwww.w3.org%2FDesignIssues%2FLinkedData.html%23hash_10_12_60f02d3b96c55e137e13494cf9a02d06_Semantic%2520Web
>
> Here is the NIF example for it (sso:oen is probably the same as
> itsx:mentions):
> <http://www.w3.org/DesignIssues/LinkedData.html#offset_717_729>
>      a str:StringInContext ;
>      itsx:mentions <http://dbpedia.org/resource/Semantic_Web> ;
>      sso:oen <http://dbpedia.org/resource/Semantic_Web> .
>
> Additionally, "semantic" could have a lexical entry. Note that (1) the
> span is 4 characters shorter and (2) the DBpedia Wiktionary link already
> works and is of type lemon:LexicalEntry.
>
> <http://www.w3.org/DesignIssues/LinkedData.html#offset_717_725>
>    a str:StringInContext ;
>    sso:hasLexicalEntry <http://wiktionary.dbpedia.org/resource/semantic> .
>
>
> All the best,
> Sebastian
>
>
>
> On 05/08/2012 03:46 PM, Dave Lewis wrote:
>
>> Hi Maxime,
>> Thank you for this further clarification.
>>
>> I think the formulation you define, where the literal would be the
>> _object_ of the triple while the span is the subject, may be sufficient for
>> what ITS is looking for. We only want to mark the literal for further
>> processing, rather than wanting to make direct assertions about it as a
>> subject.
>>
>> The question of whether we should be using RDFa for this at all is a
>> broader one. It would be good to get other views on this, especially from
>> potential implementors of ITS2.0.
>>
>> Also, to reinforce Maxime's point, the ontolex members and their
>> expertise would be very welcome at the upcoming Dublin workshop. On 11
>> June we are looking at future roadmaps for convergence of the multilingual
>> web with LOD. On the 12th and 13th we will be focussing directly on the
>> requirements for the ITS2.0 recommendation that the MLW-LT WG is currently
>> producing. We've not finalised the schedule yet, but I imagine that these
>> RDFa issues would be examined early on the 12th in the context of
>> terminology management and its tool support in localization.
>>
>> Kind Regards,
>> Dave
>>
>>
>> On 02/05/2012 11:08, Maxime Lefrançois wrote:
>>
>>> Hi Dave, The MSW-CG and MLW-LT-XG members,
>>> my answers below
>>>
>>> ------------------------------------------------------------------------
>>>
>>>    From: "David Lewis" <dave.lewis@cs.tcd.ie>
>>>    To: public-multilingualweb-lt@w3.org
>>>    Sent: Tuesday, 1 May 2012 02:23:47
>>>    Subject: Re: Let's drop RDFa in the requirements !
>>>
>>>    Hi Maxime,
>>>    Some comments below:
>>>
>>>    On 27/04/2012 15:57, Maxime Lefrançois wrote:
>>>
>>>        Hi,
>>>
>>>        in mail
>>>        http://lists.w3.org/Archives/Public/public-multilingualweb-lt/2012Apr/0131.html,
>>>        I wrote a possible RDFa markup to represent the fact that "a
>>>        fragment of text is identified as a named entity". I stressed
>>>        that there is a shift of meaning: the meaning using RDFa is
>>>        "there is a resource in the document that its:lexicalizes a
>>>        named entity, and that has for its:value in English some
>>>        fragment of text".
>>>
>>>        Actually, there will always be a shift of meaning if we are to
>>>        use RDFa, and this is a strong conceptualization
>>>        incompatibility between ITS and RDF. In fact, in ITS one
>>>        annotates fragments of text (literals), but in RDF literals
>>>        cannot be the subject of a triple. As simple as that.
>>>
>>>
>>>    But does wrapping the literal in a span and then adding an id
>>>    attribute to that not make it dereferenceable and therefore
>>>    the potential subject of a triple?
>>>
>>> Yes and no:
>>>  - the URI could be the subject of a triple anywhere on the web, but the
>>> URI refers to the span, and not to the text fragment that the span
>>> contains.
>>>  - if you want to add a triple in the very same document, you need RDFa,
>>> and in RDF/RDFa there is no mechanism to use a literal as a subject; it is
>>> forbidden. In RDFa Lite, the minimal triple needs a property="" attribute
>>> to define the property of the triple, and the text fragment is the object
>>> of the triple:
>>> <span id="myid" property="its:property">mytext</span> -----> [:myid
>>> its:property "mytext"]
>>>
>>
>>
>>
>
> --
> Dipl. Inf. Sebastian Hellmann
> Department of Computer Science, University of Leipzig
> Projects: http://nlp2rdf.org , http://dbpedia.org
> Homepage: http://bis.informatik.uni-leipzig.de/SebastianHellmann
> Research Group: http://aksw.org
>
>
>
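
The offset-based URIs in Sebastian's quoted NIF example (#offset_717_729 for "Semantic Web", #offset_717_725 for the first word alone) can be illustrated with a small sketch. It assumes that the two numbers are the begin and end character positions of the string in the plain text of the document, counted from 0 with an exclusive end; the thread does not confirm this. The function name offset_uri and the sample text are made up for illustration.

```python
def offset_uri(document_uri, text, phrase, start=0):
    """Build an offset-style fragment URI for the first occurrence of phrase.

    Assumption: offsets are 0-based character positions in the plain text,
    with the end offset exclusive.
    """
    begin = text.index(phrase, start)  # raises ValueError if phrase is absent
    end = begin + len(phrase)
    return f"{document_uri}#offset_{begin}_{end}"


doc = "http://example.org/doc.html"
text = "The Semantic Web is a web of data."

phrase_uri = offset_uri(doc, text, "Semantic Web")
word_uri = offset_uri(doc, text, "Semantic")
print(phrase_uri)
print(word_uri)
```

Under these assumptions both URIs share the same begin offset, and the end offset of the single word is 4 smaller than that of the two-word phrase, which matches the "4 shorter" remark in the quoted message.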


-- 
Felix Sasaki
DFKI / W3C Fellow
Received on Friday, 11 May 2012 07:33:40 UTC