Ontology Mixing (was: Another way other than @profile, @vocab or @map) from Manu Sporny on 2010-03-21 (public-rdfa-wg@w3.org from March 2010)

From: Manu Sporny <msporny@digitalbazaar.com>
Date: Sun, 21 Mar 2010 15:19:54 -0400
To: RDFa WG <public-rdfa-wg@w3.org>
Message-ID: <4BA6715A.1020407@digitalbazaar.com>

On 03/20/2010 06:43 PM, Toby Inkster wrote:
> Bundling multiples ontologies/vocabularies is a great idea, but @profile
> is not the only way to do it, and not necessarily the best way to do it.

That is correct; I don't think anybody would argue that RDFa Profiles
are the /only/ way to do ontology mixing. The /best/ way to do ontology
mixing is exactly what we're debating at the moment.

So, before we go too far, does that mean that this has been addressed as
far as you are concerned:

Toby wrote:
> Personally I don't think we've had enough debate on whether
> profile documents are needed at all ...
> What exactly are the use cases that show this to be insufficient?
> Personally, I don't think I've seen any yet.

Your response seems to imply that you think ontology mixing is a good
goal and you didn't seem to deny the use cases that were raised (UK
govt. consultations, simplifying music commerce).

Toby wrote:
> Yes, I realise RDFS reasoning is not necessarily a simple thing to
> implement, but given a choice between:
>
> 1. default prefixes: perform very RDFa1.0-like parsing, then, if you
> want to, perform (perhaps only limited) RDFS reasoning later on in the
> toolchain.
>
> 2. profiles: add recursive HTTP fetching, parsing and RDF querying to
> the RDFa parser itself.
>
> I'm still not convinced that #2 is really the simplest option. I'm not
> saying that I can't be convinced, just that I'm not convinced so far.

You may have noticed that the recent Deferred Resolution Graph
proposal[1] is similar to what you proposed for #1. So, if we are going
to compare apples to apples (triples to triples? :P ), what we're really
discussing are the following two solutions:

1. Default prefixes (your #1 above) with some sort of Deferred
   Resolution Graph implementation.
2. RDFa Profiles (your #2 above) implementation.

I say this because we can't just say "we support mixing ontologies via
xmlns: or default prefix". The Default prefix proposal, by itself, just
pushes the problem further up the application stack. At some point, the
default prefix solution (the RDFa Processor/Application stack) must:

1. Dereference a profile/default-prefix document to understand the
   mappings.
2. Convert previously unresolved triples to their final form.

The fact that we get an intermediate form of triples via solution #1
doesn't do anything other than push the problem to a later time, or
further up the technology stack. In other words, we're saying
"resolving triples generated via a default-prefix-only solution isn't
the RDFa Processor's job... it's the application's job". We're only
simplifying the RDFa Processor, not the problem, since we're pushing the
complexity of resolution to the application layer.

The Real Problem: Resolving Unresolved Triples
----------------------------------------------

At the end of the day, the default-prefix generated triples MUST be
resolved, in some way, to be useful to an application. If a web browser
plugin is triggering off of a rdf:type of foaf:Person, no amount of
deferred resolution is going to help the browser. Put another way,
if we have rdf:type == example:Person^^UNRESOLVED, at some point
example:Person^^UNRESOLVED MUST be resolved to foaf:Person to be useful
to the application. Default prefix with RDFS reasoning doesn't help us
do that - it just helps us note that the object still needs to be resolved.
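
To make the resolution step concrete, here is a minimal Python sketch.
It assumes the processor hands the application triples whose terms may
be flagged as unresolved, plus a mapping table obtained by dereferencing
the profile/default-prefix document. The Term type, the resolve_triples
function and the example:Person token are hypothetical names used only
for illustration; none of them come from any of the proposals.

```python
from typing import NamedTuple

class Term(NamedTuple):
    value: str
    resolved: bool = True

FOAF = "http://xmlns.com/foaf/0.1/"
RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"

def resolve_triples(triples, mappings):
    """Replace unresolved terms using mappings; unknown terms stay flagged."""
    def resolve(term):
        if term.resolved:
            return term
        target = mappings.get(term.value)
        return Term(target, True) if target is not None else term
    return [tuple(resolve(t) for t in triple) for triple in triples]

# What a deferred-resolution processor might emit (assumed shape).
deferred = [(Term("#toby"), Term(RDF_TYPE), Term("example:Person", resolved=False))]

# Without the mapping document, the object is still unresolved.
unresolved = resolve_triples(deferred, {})

# With the mapping document, it becomes foaf:Person.
resolved = resolve_triples(deferred, {"example:Person": FOAF + "Person"})
```

Whichever layer runs resolve_triples, it needs the mapping table, and
that table only exists once the external document has been dereferenced.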

A great amount of hand-wringing over the RDFa vocabulary proposal and
@token proposal revolves around the question of what to do when the
@profile document is not available. However, we seem to be missing the
fact that the default prefix proposal with RDFS reasoning doesn't help
us in this situation either. It just allows us to note that there are
certain items that cannot be resolved. So, we shouldn't kid ourselves
that the default prefix proposal makes the application utilizing RDFa
any more resistant to @profiles disappearing or cut-paste issues.

Don't Design Around Minority Cases
----------------------------------

By and large, we should assume that @profile documents are going to be
resolvable most of the time. If they are not, RDFa authors will most
likely stop using them because they will cause application headaches at
run-time. If you can't dereference a profile document, and you don't
have one cached or an ontology document back-up service, nothing we do
will result in a useful set of triples.

Similarly, as Shane mentioned, some of us are saying that we shouldn't
be afraid to generate junk triples. If that's true, then why are we
afraid of not generating triples or not generating all triples and
issuing a warning that there were profiles that could not be
dereferenced? After all, these are all edge cases that have effectively
the same outcome as not being able to dereference the profile document:
useful triples are not generated by the RDFa Processor layer for the
application layer.

Empowering RDFa Processor Implementers
--------------------------------------

We may be more successful if we tell RDFa Processor implementers that
they should use any mechanism available to them to resolve @profile
document mappings. This includes ad-hoc downloading, caching, using
ontology backup services, and hard-coding well-known profile mappings.
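
As a rough sketch of what "any mechanism available" could look like,
the Python below tries a list of mapping sources in order and returns
the first answer it gets. Everything here is an assumption made for
illustration: the resolve_profile function, the source functions, and
the http://example.org/ profile URIs are hypothetical, and the download
source simply simulates an outage rather than touching the network.

```python
from typing import Callable, Dict, Iterable, Optional

Mappings = Dict[str, str]
Source = Callable[[str], Optional[Mappings]]

# Hypothetical hard-coded table of well-known profiles (illustrative only).
WELL_KNOWN: Dict[str, Mappings] = {
    "http://example.org/profiles/foaf": {
        "Person": "http://xmlns.com/foaf/0.1/Person",
    },
}

def from_hardcoded(profile_uri: str) -> Optional[Mappings]:
    return WELL_KNOWN.get(profile_uri)

def make_cached_source(cache: Dict[str, Mappings]) -> Source:
    # A cache is just another lookup table that may or may not have the profile.
    return cache.get

def failing_download(profile_uri: str) -> Optional[Mappings]:
    # Stand-in for an ad-hoc download while the profile host is unreachable.
    raise OSError("profile document unreachable")

def resolve_profile(profile_uri: str, sources: Iterable[Source]) -> Optional[Mappings]:
    """Try each source in order; return the first mappings found, else None."""
    for source in sources:
        try:
            mappings = source(profile_uri)
        except OSError:
            continue  # this source failed; fall through to the next one
        if mappings is not None:
            return mappings
    return None

sources = [failing_download, make_cached_source({}), from_hardcoded]
found = resolve_profile("http://example.org/profiles/foaf", sources)
missing = resolve_profile("http://example.org/profiles/unknown", sources)
```

Here found comes from the hard-coded table even though the download
failed and the cache was empty, while missing is None because no source
knows that profile.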

Doing this may shift the focus from worrying about @profile documents
not being available, to accepting that they will not be available at all
times and ensuring there are alternative ways to resolve the triples
when the documents are not available.

Default Prefixes (without RDFS Reasoning)
-----------------------------------------

There are two sub-proposals to the default prefix proposal:

1. Default prefixes with RDFS reasoning.
2. Default prefixes without RDFS reasoning.

Toby covered #1 here[2]. The other approach, #2, is to skip RDFS
reasoning entirely and simply tack the keyword value onto whatever is
specified via @vocab. So, for example, this:

<p vocab="http://xmlns.com/foaf/0.1/"
   about="#toby" typeof="Person">
   <span property="name">Toby Inkster</span> has an e-mail address
   with a SHA-1 signature of
   <span property="mbox_sha1sum">3593fab87352e2a06c9fc7f291ac38093cec1b89</span>.
</p>

would generate these triples:

<#toby>
   rdf:type
      foaf:Person .
<#toby>
   foaf:name
      "Toby Inkster" .
<#toby>
   foaf:mbox_sha1sum
      "3593fab87352e2a06c9fc7f291ac38093cec1b89" .

I can get behind this idea as a mechanism to simplify markup for
authors... I can even be okay with the idea that this may generate
spurious triples for those that do stuff like rel="me" or other bits
that may accidentally generate extra triples. We had the same "spurious
triples" concerns when discussing chaining and triple generation there,
and we came to live with and accept the consequences of authoring
mistakes in relation to chaining.
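
The tack-on rule is small enough to sketch in a few lines of Python.
This is only an illustration of plain concatenation against @vocab; the
expand and triples_for functions are hypothetical names, and the sketch
ignores everything else an RDFa Processor does (chaining, CURIEs,
datatypes and so on).

```python
RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"
FOAF = "http://xmlns.com/foaf/0.1/"

def expand(vocab, keyword):
    """Tack the keyword onto the @vocab value: no lookup, no reasoning."""
    return vocab + keyword

def triples_for(subject, vocab, typeof=None, properties=()):
    """Build (subject, predicate, object) tuples by plain concatenation."""
    out = []
    if typeof:
        out.append((subject, RDF_TYPE, expand(vocab, typeof)))
    for keyword, value in properties:
        out.append((subject, expand(vocab, keyword), value))
    return out

triples = triples_for(
    "#toby",
    FOAF,
    typeof="Person",
    properties=[
        ("name", "Toby Inkster"),
        ("mbox_sha1sum", "3593fab87352e2a06c9fc7f291ac38093cec1b89"),
    ],
)

# A keyword such as rel="me" is expanded the same way, whether or not
# the vocabulary defines it; that is where spurious triples come from.
spurious = expand(FOAF, "me")
```

Note that nothing in expand can ever point a keyword at a second
vocabulary, which is why this rule alone does not mix ontologies.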

The one thing we shouldn't do with this default-prefix proposal without
RDFS reasoning is mistake it for a mechanism that allows us to mix
ontologies. The current mechanism in RDFa 1.0 that allows us to mix
ontologies is xmlns:. The mechanism that we're discussing for RDFa 1.1,
if mappings are stored in an external document, MUST dereference the
external document at some point to generate triples that are useful to
an application.

There is no way around that requirement, so the questions that we may
want to ask ourselves are:

* Are we okay with the RDFa Processor generating warnings for the
  application layer if @profiles are inaccessible?
* Are we okay with the RDFa Processor generating the wrong triples if
  @profiles are inaccessible and the application ignores the
  RDFa Processor warnings?

At this moment, I would answer "Yes" to both of those questions.
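
For what it's worth, here is one way the warning behaviour could look
from the application's side, as a small Python sketch. It assumes one
possible policy (skip the statement and record a warning when a profile
is inaccessible); the process function, the statement tuples and the
http://example.org/ profile URIs are all hypothetical.

```python
RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"

def process(statements, profiles):
    """Return (triples, warnings) instead of failing on a missing profile.

    statements: (subject, predicate, token, profile_uri) tuples.
    profiles:   profile_uri -> {token: full URI}, for accessible profiles only.
    """
    triples, warnings = [], []
    for subject, predicate, token, profile_uri in statements:
        mappings = profiles.get(profile_uri)
        if mappings is None:
            warnings.append("profile not accessible: " + profile_uri)
            continue
        if token not in mappings:
            warnings.append("no mapping for token: " + token)
            continue
        triples.append((subject, predicate, mappings[token]))
    return triples, warnings

statements = [
    ("#toby", RDF_TYPE, "Person", "http://example.org/profile-a"),
    ("#toby", RDF_TYPE, "Agent", "http://example.org/profile-b"),
]
# Only profile-a could be dereferenced in this run.
profiles = {
    "http://example.org/profile-a": {"Person": "http://xmlns.com/foaf/0.1/Person"},
}
triples, warnings = process(statements, profiles)
```

An application that ignores warnings here ends up with an incomplete
graph, which is exactly the trade-off the two questions above ask about.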

-- manu

[1]http://lists.w3.org/Archives/Public/public-rdfa-wg/2010Mar/0176.html
[2]http://lists.w3.org/Archives/Public/public-rdfa-wg/2010Mar/0174.html

-- 
Manu Sporny (skype: msporny, twitter: manusporny)
President/CEO - Digital Bazaar, Inc.
blog: PaySwarming Goes Open Source
http://blog.digitalbazaar.com/2010/02/01/bitmunk-payswarming/
Received on Sunday, 21 March 2010 19:20:24 UTC