Re: LIME proposal for the OntoLex W3C Community Group

Hi,

Sorry, it seems that what I said was a bit unclear. I have no problem with
metadata properties that are also equivalent to SPARQL queries (in fact,
even some of the things I proposed in the last email could be formulated as
SPARQL queries). My concern is rather with the current focus of this
discussion on the idea of "language coverage" as described in the use case
here:
https://www.w3.org/community/ontolex/wiki/Specification_of_Use_Cases
In particular, I have two main concerns:

   - The actual SAOM negotiation can be done with SPARQL.
   - The idea of language coverage is too specific even for this use case.
   For example, this use case also covers the dialogue below, which could
   be achieved without knowledge of 'language coverage':

*Merlin*: Hi, I’m Merlin the Wizard. I see you are a Genie, so I suppose
we can talk about magic.

*Djinni*: Oh yes, I like talking about magic. My reference ontology for
magic is: Xxxxx/magic.owl

*Merlin*: Erm…sorry, mine is: YYYYY/mana.owl

*Djinni*: Well, ok, what’s (are) your language(s)?

*Merlin*: Actually, I’m a good English speaker [ontology natively filled
with English terms]

*Djinni*: Mmm… I just speak Arabic, and I’m able to express some of my
ideas in very simple English [ :djinni_resource a lime:LexicalResource ;
lime:hasLexicon :djinni_lexicon_eng , :djinni_lexicon_ara . ]

*Merlin*: Do you know the words "mana", "spell" and "fireball"? [SELECT *
FROM :djinni_lexicon_eng WHERE { VALUES ?w { "mana"@en "spell"@en
"fireball"@en } ?x ontolex:canonicalForm ?f . ?f ontolex:writtenRep ?w }]

*Djinni*: Yes I do

*Merlin*: That is all I need to communicate.

OK, I’ll stop with the wizards... but my point is, I am generally reluctant
to see *any vocabulary at all* included in the model. The case of language
coverage appears to me to introduce a lot of complexity (in the sense of
many vocabulary elements) for little gain (optimizing a single narrow part
of one use case that could be achieved by other means, as the sketch below
shows).
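
To make "other means" concrete, here is a minimal sketch of how an agent
could derive a lexicon's language distribution on demand, with no
lime:languageCoverage metadata at all. (The ontolex prefix IRI and the idea
of querying the lexicon graph directly are my assumptions; only
ontolex:canonicalForm and ontolex:writtenRep come from the exchange above.)

PREFIX ontolex: <http://www.w3.org/ns/lemon/ontolex#>
# Count entries per language, using the language tags of the
# written representations of canonical forms.
SELECT ?lang (COUNT(DISTINCT ?entry) AS ?entries)
WHERE {
  ?entry ontolex:canonicalForm ?form .
  ?form  ontolex:writtenRep    ?rep .
  BIND (LANG(?rep) AS ?lang)
}
GROUP BY ?lang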

Regards,
John


On Sat, Mar 8, 2014 at 8:44 PM, Armando Stellato
<stellato@info.uniroma2.it> wrote:

> Dear John,
>
>
>
> well, I’m a bit puzzled, in that this is surely worth discussing, but it’s
> a completely orthogonal topic again. The fact that Philipp mentioned the
> possibility of defining their semantics through SPARQL does not change
> anything about the nature of these properties; so, if you found them
> useless because of their redundancy with the data, they were
> useless/redundant even before.
>
> Maybe we should synthesize a few aspects and discuss them on a page of the
> wiki. What do you think? My impression is that in these emails we are
> opening new topics instead of closing the open ones, so it may be worth
> having separate threads. Please let us know; if you feel we are almost at
> the end, we may even carry on with emails (maybe with specific threads).
>
>
>
> Btw, to reply to your specific question:
>
>
>
> The point of metadata is not to optimize commonly run SPARQL queries, for
> two primary reasons: firstly, it bulks up the model and instances of the
> model with triples for these 'pre-compiled' queries, and secondly, it is
> very hard to predict what queries an end-user will want to run. It seems
> that the kind of metadata we are proposing to model is nearly entirely
> pre-compiled queries, of questionable practical application. That is, I
> ask a simple question: *if we can achieve resource interoperability for
> OntoLex already with SPARQL, why the heck do we need metadata anyway?*
>
>
>
> Personally, as an engineer, I’m biased towards considering "redundancy the
> evil" and keeping information to its minimum (so I would tend to agree
> with your point). But the Engineering 101 manual tells you that you may
> sometimes give up orthodoxy on the above principle if doing so greatly
> improves performance, scalability, etc.
>
> Furthermore, instead of trivially giving up, you should designate how,
> when and where the redundancy points are defined (whatever system you are
> speaking about).
>
>
>
> Now, narrowing down to our case, we have a clear point: the VoID file,
> which is a surrogate of a dataset, contains its metadata and is always
> updated following updates to its content: no danger of dangling,
> out-of-date redundant information then.
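>
> As a minimal sketch (the dataset name, endpoint URL and counts here are
> of course invented; the properties are plain VoID), such a surrogate
> could look like:
>
> @prefix void: <http://rdfs.org/ns/void#> .
> # a VoID description carrying pre-computed statistics for the dataset
> :djinni_dataset a void:Dataset ;
>     void:sparqlEndpoint <http://example.org/sparql> ;
>     void:triples 100000 ;
>     void:distinctSubjects 12000 .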
>
> We also have a clear scenario: packs of spiders roaming around the web,
> getting plenty of useful information from tons of different datasets
> without stressing their SPARQL endpoints; mediators examining metadata
> from multiple resources and making decisions very quickly, etc.
>
>
>
> But I’m just a poor guy :) so, beyond my personal view, let me mention
> some notable predecessors:
>
>
>
> Already mentioned by Manuel in his email of today, we have VOAF:
> http://lov.okfn.org/vocab/voaf/v2.3/index.html
>
> …but VOAF is not a standard…
>
>
>
> …talking about standards, ladies and gentlemen, here is VoID itself and
> its many SPARQL-deducible properties!
>
> https://code.google.com/p/void-impl/wiki/SPARQLQueriesForStatistics
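>
> To give the flavor, one of the simplest of those: void:triples can be
> deduced, for instance, with
>
> SELECT (COUNT(*) AS ?triples) WHERE { ?s ?p ?o }
>
> while others on that page need considerably more machinery.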
>
>
>
> …and to happily close my defense: well, in any case, Manuel just
> confirmed in his email that I should have thought one second more about
> the SPARQL deducibility of LIME’s properties :-)
>
> Some of them are in fact SPARQL-deducible, but it seems the one we took
> as an example (lime:languageCoverage,
> <http://art.uniroma2.it/ontologies/lime#languageCoverage>) is exactly one
> of those that are not so trivial to write (maybe I’m not an expert with
> CONSTRUCT queries, but I would say not possible at all).
>
> In the LIME module, we used an RDF API and plain Java post-processing to
> compute them, so I did not recall which ones were simple SPARQL
> constructs and which ones needed more processing.
>
>
>
> Cheers,
>
>
>
> Armando
