Re: Let's drop RDFa in the requirements !

Hi, Maxime,
you are correct about the impedance mismatch: RDFa would have to treat 
literals as subjects, which RDF does not allow. The solution that worked 
for us was to prescribe a URI generation scheme that allows 
dereferencing individual fragments. This comes with its own baggage 
around URI stability, so RDFa wouldn't be practical as a primary medium, 
but it can still be a viable transformation target.
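
To make the idea concrete, here is a minimal sketch of such a URI 
generation scheme. The "char=start,end" fragment syntax and the 
fragment_uri helper are assumptions made up for this illustration, not 
the scheme we actually deployed:

```python
from urllib.parse import urldefrag


def fragment_uri(document_uri: str, start: int, end: int) -> str:
    """Build a URI naming the character span [start, end) of a document.

    Hypothetical scheme: the document URI plus a "char=start,end" fragment.
    """
    if start < 0 or end < start:
        raise ValueError("invalid character range")
    base, _ = urldefrag(document_uri)  # drop any fragment already present
    return f"{base}#char={start},{end}"


# Example document URI is a placeholder.
print(fragment_uri("http://example.org/doc.html", 10, 16))
```

The stability problem is visible here: any edit to the document that 
shifts the offsets silently changes what such a URI points at.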

The data categories overlapping with the ontolex domain aren't only 
namedEntity and term, but also very likely mtDisambiguation. They are 
all variants of the notion of linking a fragment of text to some entity. 
While the lemon model chain is a very complete representation of that, 
it includes several concepts that are outside the domain we operate 
within, so strictly using that model would be a leaky abstraction. 
What I recommend in order to limit the complexity of the domain is to 
define an itsx:mentions predicate that would basically be a shortcut 
for that chain, glossing over the various lexical forms, entries and senses.

That is, given a fragment f and an entity e:

f itsx:mentions e .

should be defined as an equivalent assertion to:

lf rdf:type lemon:LexicalForm .
le rdf:type lemon:LexicalEntry .
ls rdf:type lemon:LexicalSense .
lf lemon:writtenRep f .
le lemon:form lf .
ls lemon:isSenseOf le .
ls lemon:isReferenceOf e .

provided that there exist some lf, le and ls that match this pattern. What 
do you think of this approach - am I missing anything?
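
As a rough sketch of the intended equivalence, the following derives 
itsx:mentions triples from a set of triples containing the chain 
written above. Triples are plain Python tuples of prefixed names, and 
the ex: and dbpedia: identifiers are invented example data; a real 
implementation would more likely be a SPARQL query or a rule:

```python
RDF_TYPE = "rdf:type"


def derive_mentions(triples):
    """Return (f, 'itsx:mentions', e) for every complete chain lf/le/ls."""
    g = set(triples)
    derived = set()
    for lf, p1, f in g:
        if p1 != "lemon:writtenRep" or (lf, RDF_TYPE, "lemon:LexicalForm") not in g:
            continue
        for le, p2, lf2 in g:
            if p2 != "lemon:form" or lf2 != lf:
                continue
            if (le, RDF_TYPE, "lemon:LexicalEntry") not in g:
                continue
            for ls, p3, le2 in g:
                if p3 != "lemon:isSenseOf" or le2 != le:
                    continue
                if (ls, RDF_TYPE, "lemon:LexicalSense") not in g:
                    continue
                for ls2, p4, e in g:
                    if p4 == "lemon:isReferenceOf" and ls2 == ls:
                        derived.add((f, "itsx:mentions", e))
    return derived


# Invented example data following the pattern in this mail.
graph = {
    ("ex:lf1", RDF_TYPE, "lemon:LexicalForm"),
    ("ex:le1", RDF_TYPE, "lemon:LexicalEntry"),
    ("ex:ls1", RDF_TYPE, "lemon:LexicalSense"),
    ("ex:lf1", "lemon:writtenRep", "ex:doc#char=10,16"),
    ("ex:le1", "lemon:form", "ex:lf1"),
    ("ex:ls1", "lemon:isSenseOf", "ex:le1"),
    ("ex:ls1", "lemon:isReferenceOf", "dbpedia:Dublin"),
}
print(derive_mentions(graph))
```

If any link of the chain is missing, nothing is derived, which matches 
the "provided that there exist" condition.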

-- Tadej

On 5/2/2012 4:13 PM, Felix Sasaki wrote:
> Thank you for your mail, Maxime. Your analysis below is correct, I 
> think, also with respect to the data categories. I assume that you and 
> Tadej are in a good position to ensure that this is well coordinated 
> between Ontolex and MLW-LT.
>
> Felix
>
> 2012/5/2 Maxime Lefrançois <maxime.lefrancois@inria.fr 
> <mailto:maxime.lefrancois@inria.fr>>
>
>     Hi Felix,
>
>     After the resolution of issue-2, I understood that microdata and
>     RDFa were to play a very secondary role, and that custom HTML5
>     attributes were going to be the main metadata mechanism for HTML5.
>     This is the main reason why I suggested dropping RDFa from the
>     MLW-LT requirements and adding it to the MSW requirements, as the
>     MSW-CG deals with SemWeb formalisms.
>
>     I understand now, rereading the charter, that a microdata and RDFa
>     description of the metadata is wanted. In any case, I'll be happy to
>     contribute to the definition of a model for ITS 2.0 that is compatible
>     with the MSW-CG model, and to the mapping between its-* attributes
>     and RDFa/microdata markup.
>
>     The data categories targeted by MLW-LT are indeed different from
>     the goals of ontolex. The only data categories we need to be
>     careful with are namedEntity and terminology, because the link that
>     exists between a concept (potentially taken from an ontology) and
>     a text fragment that mentions this concept is complex in the lemon
>     model:
>     Ontology Entity <-> Lexical Sense <-> Lexical Entry -> Lexical
>     Form -> (Written) Representation
>
>     Kind regards,
>     Maxime Lefrançois
>     Ph.D. Student, INRIA - WIMMICS Team
>     http://maxime-lefrancois.info <http://maxime-lefrancois.info/>
>     @Max_Lefrancois <http://twitter.com/Max_Lefrancois>
>
>     ------------------------------------------------------------------------
>
>         *From: *"Felix Sasaki" <fsasaki@w3.org <mailto:fsasaki@w3.org>>
>         *To: *"Maxime Lefrançois" <maxime.lefrancois@inria.fr
>         <mailto:maxime.lefrancois@inria.fr>>
>         *Cc: *"David Lewis" <dave.lewis@cs.tcd.ie
>         <mailto:dave.lewis@cs.tcd.ie>>, "Multilingual Web LT Public
>         List" <public-multilingualweb-lt@w3.org
>         <mailto:public-multilingualweb-lt@w3.org>>,
>         public-ontolex@w3.org <mailto:public-ontolex@w3.org>
>         *Sent: *Wednesday 2 May 2012 14:10:53
>
>         *Subject: *Re: Let's drop RDFa in the requirements !
>
>         Hi Maxime,
>
>         have a look at our charter
>         http://www.w3.org/2011/12/mlw-lt-charter.html
>         which requires that we develop an RDFa serialization and a
>         microdata version of our metadata. We do not say that we will
>         provide an XML version. Of course many people here discuss XML
>         issues since this is the "legacy" of ITS 1.0, which will
>         continue IMO - but it will be brought to other serializations
>         as well.
>
>         There is already a good level of coordination between the
>         Ontolex group and MLW-LT - just have a look at the overlap in
>         participants
>         https://www.w3.org/2000/09/dbwg/details?group=53116&public=1
>         <https://www.w3.org/2000/09/dbwg/details?group=53116&public=1>
>         including you, me, Dave, Paul, ...
>
>         Also, I think the data categories targeted by MLW-LT are quite
>         different from the goals of ontolex - MLW-LT does not plan to
>         define lexicon models at all. Note also that Paul is
>         co-chairing the Dublin workshop.
>
>         Felix
>
>         2012/5/2 Maxime Lefrançois <maxime.lefrancois@inria.fr
>         <mailto:maxime.lefrancois@inria.fr>>
>
>             Hi Dave, MSW-CG and MLW-LT-XG members,
>             my answers are below.
>
>             ------------------------------------------------------------------------
>
>                 *From: *"David Lewis" <dave.lewis@cs.tcd.ie
>                 <mailto:dave.lewis@cs.tcd.ie>>
>                 *To: *public-multilingualweb-lt@w3.org
>                 <mailto:public-multilingualweb-lt@w3.org>
>                 *Sent: *Tuesday 1 May 2012 02:23:47
>                 *Subject: *Re: Let's drop RDFa in the requirements !
>
>
>                 Hi Maxime,
>                 Some comments below:
>
>                 On 27/04/2012 15:57, Maxime Lefrançois wrote:
>
>                     Hi,
>
>                     in mail
>                     http://lists.w3.org/Archives/Public/public-multilingualweb-lt/2012Apr/0131.html,
>                     I wrote a possible RDFa markup to represent the
>                     fact that "a fragment of text is identified as a
>                     named entity". I stressed that there is a shift of
>                     meaning: the meaning using RDFa is "there is a
>                     resource in the document that its:lexicalizes a
>                     named entity, and that has for its:value in
>                     English some fragment of text".
>
>                     Actually, there will always be a shift of meaning
>                     if we are to use RDFa, and this is a strong
>                     conceptual incompatibility between ITS and
>                     RDF. In ITS one annotates fragments of
>                     text (literals), but in RDF literals can't be
>                     the subject of a triple. As simple as that.
>
>
>                 But does wrapping the literal in a span and then
>                 adding an id attribute to it not make it
>                 dereferenceable and therefore the potential
>                 subject of a triple?
>
>             Yes and no:
>              - the URI could be the subject of a triple anywhere on
>             the web, but the URI refers to the span, and not to
>             the text fragment that the span contains.
>              - if you want to add a triple in the very same document,
>             you need RDFa, and in RDF/RDFa there is no mechanism to
>             use a literal as a subject; it is forbidden. In RDFa
>             Lite, the minimal triple needs a property="" attribute to
>             define the property of the triple, and the text fragment
>             is the object of the triple:
>             <span id="myid" property="its:property">mytext</span>
>             -----> [:myid its:property "mytext"]
>
>
>                     So other RDF models could exist to represent the
>                     simple fact that "a fragment of text is identified
>                     as a named entity", depending on the model chosen
>                     to represent ITS 2.0 with semantic web formalisms.
>                     What is the desirable semantic web model for ITS
>                     2.0? What are the pros and cons of each?
>
>                     I think that the MLW-LT XG should not bother with
>                     RDFa at all, for three main reasons:
>
>                     1- I don't see any requirement that explicitly
>                     asks for semantic web
>                     2- It may be extremely confusing to have different
>                     conceptualizations in the same recommendation
>                     3- This is typically the kind of conceptualization
>                     decision about lexical resources that the
>                     Multilingual Semantic Web Community Group will
>                     shortly have to face, and I don't think it's a
>                     really good idea to choose a semantic web model
>                     for ITS 2.0 too early as it might be incompatible
>                     with their requirements.
>
>
>                 I agree that the objective of ITS isn't to add
>                 knowledge to the semantic web per se. Neither is it
>                 clear that OWL-based reasoning, or even RDFS inference
>                 addresses any real use cases in the ITS problem area.
>                 However, RDFa is an established model for annotating
>                 HTML with meta-data and for using such meta-data to
>                 make meaningful links to external resources. These are
>                 both recurring ITS requirements.
>
>                 So the question is why would we introduce a different
>                 meta-data mechanism for HTML if RDFa is sufficient and
>                 possibly already benefiting from existing tools and
>                 data management support?
>
>                 However, we should definitely engage with the MLSW
>                 community on this. Are there some key representatives
>                 that we should be aiming to attract for the MLW-LOD
>                 workshop?
>
>             I have added the public-ontolex@w3.org
>             <mailto:public-ontolex@w3.org> mailing list as a
>             recipient of this mail; Paul Buitelaar and Philip Cimiano
>             are the chairs of the community group. People from the
>             MSW, are you going to the MLW-LOD (linked open data)
>             workshop in Dublin on 11 June? The
>             registration form is open until 2012-05-09 here:
>             http://www.multilingualweb.eu/en/documents/dublin-workshop/dublin-cfp.
>
>
>             As I understand it, the community behind ITS 1.0 is strongly
>             based on XML, so the needs and expertise of the
>             members are mostly XML oriented...
>             Using RDFa will lead to the design of two incompatible
>             models for ITS 2.0. Put simply, one based on XML that
>             annotates text fragments, and another based on RDF where
>             text fragments can only be the object of triples.
>             I don't think the MLW-LT community would immediately
>             benefit from a model in RDFa, and it might interfere with
>             the job that is being done by the MSW community group.
>             Once the working drafts of the MLW-LT and the MSW are
>             submitted, it will be fairly straightforward to propose a
>             model for ITS 2.0 that extends the one that MSW will produce.
>
>                 cheers,
>                 Dave
>
>                     So I suggest we drop RDFa from the requirements
>                     (delete the two lines that mention RDFa), and
>                     let the Multilingual Semantic Web Community
>                     Group deal with the semantic web, the mapping of
>                     ITS annotated XML documents into RDF, and the
>                     mapping between its-* attributes and RDFa.
>
>                     Regards,
>                     Maxime Lefrançois
>
>             Kind regards,
>             Maxime Lefrançois
>             Ph.D. Student, INRIA - WIMMICS Team
>             http://maxime-lefrancois.info <http://maxime-lefrancois.info/>
>             @Max_Lefrancois <http://twitter.com/Max_Lefrancois>
>
>
>
>
>         -- 
>         Felix Sasaki
>         DFKI / W3C Fellow
>
>
>
>
>
> -- 
> Felix Sasaki
> DFKI / W3C Fellow
>

Received on Wednesday, 2 May 2012 17:21:36 UTC