Re: thoughts on the profile issue from Ivan Herman on 2011-08-03 (public-rdfa-wg@w3.org from August 2011)

From: Ivan Herman <ivan@w3.org>
Date: Wed, 3 Aug 2011 07:25:52 +0200
To: Gregg Kellogg <gregg@kellogg-assoc.com>
Cc: W3C RDFWA WG <public-rdfa-wg@w3.org>
Message-Id: <28D9DA87-B58C-46D0-A51D-03A88B2BDB17@w3.org>
On Aug 3, 2011, at 03:09 , Gregg Kellogg wrote:

> On Aug 2, 2011, at 6:48 AM, "Ivan Herman" <ivan@w3.org> wrote:
> 
>> Wow:-) Many things have happened while I was on vacation...
>> 
>> This mail is a set of slightly random thoughts on the profile discussion. Instead of answering each individual mail, I would rather gather my thoughts in one place; it may also help in triggering new discussions. I am sorry it is fairly long... Bear with me!
>> 
>> ...
>> 
>> 4. The RDFa-Sem alternative _is_ interesting. What Niklas is saying (if my understanding is correct) is that a URI used for a @vocab _may_ be a reference to an RDFS vocabulary; so an RDFa processor may pick up all the RDFS vocabularies in the file, merge all these graphs, and do an RDFS reasoning on the merged graph. Just follow the RDFS semantics' document! In this sense, the usage of map:ProxyProperty is actually superfluous: by virtue of the RDFS semantics subPropertyOf, for example, should suffice. There are some details to handle (which version of the RDFS reasoning would one use), but that can be done.
> 
> Niklas and I had in mind to create a member submission on this topic, given the time to do so.
> 
> One difference between general RDFS reasoning, and proxy reasoning, is the potentially open-ended nature of general RDFS reasoning. That is, inference steps may need to be repeated until no further triples are generated.

I would not call this 'open ended', but yes, that is the idea. And some measures should be taken to avoid infinity, due to, e.g., the rdf:_i type properties. (One simple measure in our case would be not to include axiomatic triples in the first place!) But it is all doable and it does stay finite.
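
As a rough illustration of "repeat until no further triples are generated" (a sketch only, not code from this thread): the snippet below applies just the two transitivity rules of the RDF Semantics document (rdfs5 and rdfs11) to a set of plain (subject, predicate, object) tuples and stops when a pass adds nothing. The function names and the tuple representation are assumptions made for the example. Because no new terms are ever minted and no axiomatic triples are added, the set of possible triples is finite and the loop has to terminate.

```python
# Triples are plain (subject, predicate, object) tuples of strings.
SUBCLASS = "rdfs:subClassOf"
SUBPROP = "rdfs:subPropertyOf"


def transitive_step(graph):
    """One pass of the transitivity rules (rdfs5 and rdfs11).

    Returns only the triples that are not already in the graph.
    """
    new = set()
    for (a, p, b) in graph:
        if p in (SUBCLASS, SUBPROP):
            for (b2, p2, c) in graph:
                if p2 == p and b2 == b:
                    new.add((a, p, c))
    return new - graph


def closure(graph):
    """Repeat the step until a pass adds nothing (a fixed point).

    Returns the expanded graph and the number of passes that were run.
    """
    graph = set(graph)
    passes = 0
    while True:
        added = transitive_step(graph)
        passes += 1
        if not added:
            return graph, passes
        graph |= added


chain = {
    ("ex:A", SUBCLASS, "ex:B"),
    ("ex:B", SUBCLASS, "ex:C"),
    ("ex:C", SUBCLASS, "ex:D"),
}
expanded, passes = closure(chain)
```

For a chain of three subClassOf statements, the number of passes depends on the length of the chain rather than being fixed in advance, which is the difference from a "once per rule" update.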

>  The proxy mechanism should be defined in terms of SPARQL updates that only need to be executed once per rule. Also, I'm not clear on how range inference might set the datatype of a literal without generating a new triple.

Well, we are getting into details here, but I would not want this group to start some sort of a different 'semantics' as an alternative. It would then become an issue how this relates to the general RDF Semantics... I do not think we should go there.

> 
>> Note that Niklas had a very reduced RDFS handling in mind, essentially exploiting subPropertyOf and subClassOf only. But why stop there, why not exploit, for example, range and domain statements? (Ok, I may ask too much here:-)
> 
> owl:sameAs, equivalentProperty, ... might be more appropriate. But, in general, I think this is correct, as long as it is in a fixed number of processing steps.

Again, my approach would be to stay as close as possible to the RDF Semantics document and not try to invent some sort of an 'own' inference mechanism. That is why we have other people working on standards (and some aspects of the RDF Semantics will be refreshed by the RDF WG anyway).

My approach would be to simply state what vocabulary RDFa-Sem understands, ie, if needed, carving out a sub-vocabulary of RDFS. This is exactly what the RDF Semantics document

http://www.w3.org/TR/rdf-mt/

does when defining Simple entailment, RDFS entailment, etc. We are in an easy position because we do not have to add new terms to RDFS, we simply define (possibly) a sub-vocabulary of RDFS, so the only thing we define is that sub-vocabulary and let the generic semantics work its way. Based on what Niklas originally wrote this vocabulary might be as simple as

{rdfs:subPropertyOf, rdfs:subClassOf, rdf:Property, rdfs:Class}

and no axiomatic triples.

Then a small subset of the entailment rules (section 7) of the Semantics document gives you the rules to follow and we are done. Note that, in theory, a small level of forward chaining might be necessary if the vocabulary author adds a subProperty for his/her own property, but I do not think that is a real problem.
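
To make the "small subset of the entailment rules" concrete, here is a minimal sketch (not code from this thread) of a forward-chaining expansion restricted to rdfs5, rdfs7, rdfs9 and rdfs11, with no axiomatic triples. The function name `expand`, the tuple representation of triples, and the example terms (ex:mainAuthor, ex:author, dc:creator, ex:Book, ex:Document) are all assumptions made for the illustration.

```python
TYPE = "rdf:type"
SUBCLASS = "rdfs:subClassOf"
SUBPROP = "rdfs:subPropertyOf"


def expand(triples):
    """Forward-chain rdfs5, rdfs7, rdfs9 and rdfs11 until nothing new appears."""
    graph = set(triples)
    while True:
        sub_prop = {(s, o) for (s, p, o) in graph if p == SUBPROP}
        sub_class = {(s, o) for (s, p, o) in graph if p == SUBCLASS}
        new = set()
        # rdfs5: subPropertyOf is transitive
        for (a, b) in sub_prop:
            for (b2, c) in sub_prop:
                if b == b2:
                    new.add((a, SUBPROP, c))
        # rdfs11: subClassOf is transitive
        for (a, b) in sub_class:
            for (b2, c) in sub_class:
                if b == b2:
                    new.add((a, SUBCLASS, c))
        # rdfs7: (s p o) and (p subPropertyOf q) entail (s q o)
        for (s, p, o) in graph:
            for (p1, q) in sub_prop:
                if p == p1:
                    new.add((s, q, o))
        # rdfs9: (x type C) and (C subClassOf D) entail (x type D)
        for (s, p, o) in graph:
            if p == TYPE:
                for (c, d) in sub_class:
                    if o == c:
                        new.add((s, TYPE, d))
        if new <= graph:
            return graph
        graph |= new


# Vocabulary statements, as they might be found behind a @vocab URI.
vocab = {
    ("ex:author", SUBPROP, "dc:creator"),
    # A sub-property the vocabulary author adds for his/her own property.
    ("ex:mainAuthor", SUBPROP, "ex:author"),
    ("ex:Book", SUBCLASS, "ex:Document"),
}
# Triples as a processor might extract them from the page itself.
data = {
    ("ex:doc", "ex:mainAuthor", "ex:ivan"),
    ("ex:doc", TYPE, "ex:Book"),
}
result = expand(vocab | data)
```

The ex:mainAuthor statement is the case where a small level of forward chaining is needed: the dc:creator triple only appears once the sub-property chain has been followed.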

Of course, alternatively, we can say that RDFa-Sem just follows RDFS and we do not say anything more. We would probably define the entailment as being 'Horst' (or, rather, the RDFS entailment as defined in the SPARQL entailment document:

http://www.w3.org/TR/sparql11-entailment/

which ensures finiteness in any case).

My main direction here is: let us _not_ engage in the definition of our own semantics. A.k.a. let us not complicate our lives!

> 
>> So yes, that is an interesting line. Of course... implementing the full RDFS, though possible, is comparable in complexity to the management of profiles (though, with the expected size of an RDF graph in an RDFa file, a very simple, straightforward forward chaining reasoner would do the trick. But handling blank nodes in literals in an RDFS reasoner is still a bit tricky). I am wondering: how many implementations will there be around that would produce not only the basic RDF graph, but the extended one as well? (We have several implementers around!) Note that a similar caching mechanism as the one discussed for profiles would be necessary to really do a good job.
>> 
>> (I would do it. O.k., I have the advantage of having implemented an RDFS reasoner in the past in Python, so...)
>> 
>> 5. If I am a user doing, say, SPARQL on the output of an RDFa processor, what would I query?
>> 
>> - If RDFa uses @profile, would my query rely on all terms and prefixes/uris that are defined through @profiles? I think yes. Indeed, the question here is: what is the probability of failing to get a profile file and therefore missing out triples? Here comes Shane's argument: the probability is very small, in fact. There won't be many profiles around, kosher RDFa processors would cache those anyway, so in a majority of the cases we could rely on all, expanded terms and URI-s.
>> - If RDFa uses RDFa-Sem, would my query rely not only on the core terms with the @vocab value but _also_ on the RDFS expanded terms? Well... not unless managing the RDFS reasoning is mandatory in the RDFa processor! Of course, if it is, then the same arguments apply as for profiles: there won't be that many @vocab-s with RDFS statements out there, kosher RDFa-Sem processors would cache those anyway, etc.
>> 
>> So: is RDFa-Sem mandatory? Because if not, then users may rely on those terms only if they use their own RDFa processors, or environments that have RDFS processing built in. And here is the catch: unfortunately, at the moment, not many RDF environments have RDFS processing built in, out of the box (e.g., RDFLib does not have that).
> 
> IMO, not mandatory, and simple applications will operate on the @vocab namespace.
> 
> Adding this as a SHOULD behavior might actually get RDFS reasoning to become more mainstream, and provide a better direction for ontology developers who are tempted to re-create their own versions of standard properties.
> 
> 

If we do that, SHOULD is the minimum in my view...

Ivan


> Gregg
> 
>> at the moment, I find Niklas' RDFa-Sem proposal appealing and it might be considered as a possible improvement of @vocab that may make @profile unnecessary. Actually, it might make a bridge to the microdata discussion, too; after all, the mechanism would be an extension to what an RDF mapping of microdata does, and that might be good... But it is still unclear to me whether it is realistic to go down that line in practice. If this does not work, though, then I would be fairly uneasy about dropping profiles.
>> 
>> Sorry for this looooong mail
>> 
>> B.t.w.: I think fully using RDFa-Sem this way would really require community feedback. I wonder whether somebody (Niklas? Manu?) could do a, say, Google+ or a blog entry somewhere with the explicit goal of asking for feedback (Google+ seems to be the most active community discussion place these days...)
> 
> Good idea.
> 
>> Ivan
>> 
>> 
>> [1] http://lists.w3.org/Archives/Public/public-rdfa-wg/2011Jul/0048.html
>> [2] https://plus.google.com/u/0/112095156983892490612/posts/aUqGQSLzDPv
>> 
>> 
>> ----
>> Ivan Herman, W3C Semantic Web Activity Lead
>> Home: http://www.w3.org/People/Ivan/
>> mobile: +31-641044153
>> PGP Key: http://www.ivan-herman.net/pgpkey.html
>> FOAF: http://www.ivan-herman.net/foaf.rdf


Received on Wednesday, 3 August 2011 05:24:17 UTC