RDFa profile microsyntax, proxy vocabularies, GRDDL from Niklas Lindström on 2011-07-20 (public-rdfa-wg@w3.org from July 2011)

From: Niklas Lindström <lindstream@gmail.com>
Date: Wed, 20 Jul 2011 02:41:12 +0200
To: public-rdfa-wg <public-rdfa-wg@w3.org>
Message-ID: <CADjV5jc9p8=3-kgwFMK3yY3mE7pgQfEt4R1XW6CjuEUeRbYUog@mail.gmail.com>
Hello all!

I've considered the case of profiles for a while, and I've come up
with some alternatives. This became rather lengthy, but I hope you'll
find it informative.

My perspective here concerns our goal of making RDFa 1.1 really easy
for people (with varying levels of knowledge) to read and write, while
supporting mixed use of vocabularies and the fundamental RDF model.

We are exploring means of reducing the need for prefixes and CURIEs
in simple cases. Currently, the @vocab attribute and the use of
profiles handle this. But profiles come at a potential cost, which is
under debate.

## Complexity of profiles ##

Profiles arguably *are* complex, using "out of band" indirection to
support the URI-based solution for mixing vocabularies.

The profile solution in its current form may also come across as
convoluted. It provides means of *syntactic* shortcuts, leaving
certain forms of expressions in RDFa attributes *dependent* on them. I
find it a bit awkward to parse RDFa in profiles in order to enable the
parsing of *some* of the RDFa in the main document. That also puts a
cognitive burden on anyone who wants to understand what's really going
on.

I get a general sense that profiles may be halfway towards somewhere.
I see some different paths ahead:

1. A common microsyntax supporting prefixes, default vocab and terms;
used inline in @prefix or linked to with @profile.
2. Remove profiles, promote "proxy vocabularies" with @vocab for the
simplest scenarios.
3. Turn profiles into a full GRDDL mechanism.


## 1. A profile microsyntax ##

One concern I have with profiles now is that they are directives given
as RDF for how to syntactically parse references. (I believe you've
debated this a lot already, but I can't help finding it awkward.) A
solution to this can be quite simple:

* Extend @prefix to support declaration of default vocab and terms.
* Define the profile syntax to be, verbatim, the syntax used in @prefix.

These extensions to @prefix can be really simple too. For instance, how about:

    : http://www.w3.org/1999/xhtml/vocab#

to declare the default vocab; and:

    :describedby http://www.w3.org/2007/05/powder-s#describedby

to declare a term. (Or perhaps another token than ":" for these, such
as "@", to make it more distinguishable.)
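
As a rough illustration of how little parsing this would need, here is
a minimal Python sketch. It assumes the declarations are plain
whitespace-separated key/IRI pairs, as in @prefix today; the function
name and the returned structure are my own invention for this example,
not anything specified by RDFa.

```python
def parse_prefix_microsyntax(text):
    """Split 'key IRI' pairs into prefixes, terms and a default vocab.

    Assumed (hypothetical) rules, following the suggestion above:
      'foaf: IRI'  declares a prefix,
      ': IRI'      declares the default vocab,
      ':term IRI'  declares a term.
    """
    tokens = text.split()
    if len(tokens) % 2 != 0:
        raise ValueError("expected key/IRI pairs")
    prefixes, terms, vocab = {}, {}, None
    for key, iri in zip(tokens[0::2], tokens[1::2]):
        if key == ":":
            vocab = iri                # default vocab
        elif key.startswith(":"):
            terms[key[1:]] = iri       # term declaration
        elif key.endswith(":"):
            prefixes[key[:-1]] = iri   # ordinary prefix declaration
        else:
            raise ValueError(f"unrecognised key: {key!r}")
    return prefixes, terms, vocab


declarations = """
    foaf: http://xmlns.com/foaf/0.1/
    : http://www.w3.org/1999/xhtml/vocab#
    :describedby http://www.w3.org/2007/05/powder-s#describedby
"""
prefixes, terms, vocab = parse_prefix_microsyntax(declarations)
```

The same function could be applied both to the value of an inline
@prefix and to the content of a linked profile document, which is the
point of using one syntax for both.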

If external @profile references are to be supported (barring issues of
callbacks and broken links), I'd suggest that these (small) documents
should follow a same-origin policy (i.e. be placed locally on the
server along with the documents using them).

(I just noticed that this looks a lot like Mark Birbeck's previous
suggestions in e.g. [1].)


## 2. Mapping on the semantic level ##

This alternative is quite different from profiles. It uses the power
of linked data in its own right.

As I considered the use of profiles, I got the impression of being
halfway towards custom vocabularies. So why not define *proxy
vocabularies* which map (i.e. *link*) to other (better known) terms?
We already have what we need for that in RDF Schema, namely
subPropertyOf and subClassOf. Stéphane also mentioned this, suggesting
that we should use vocabularies with term mappings. (I'm not sure if
this is really what you meant Stéphane, but it is what I will go for.)

In some of the scenarios we're coming across now in RDFa, it seems
there may be an advantage to "importing" recommended terms into your
own vocabulary. By doing so, users don't need to mix vocabularies.
(Although mixing is arguably a good thing, it takes some experience to
see that.)

(Neither Facebook OpenGraph nor schema.org even *link* their terms to
precursors (FOAF, GoodRelations etc.), much to the dismay of RDF users
who wish to integrate such data with already existing terms.)

One way to take this into consideration is to define a mapping
vocabulary which defines a set of classes and properties intended to
work as "aliases" or "proxies" (e.g. ProxyClass/ProxyProperty). That
is, they are sub-concepts of some property or class, but define no
additional semantics, apart from the intent of using them within the
domain context of a specific vocabulary.

Such a vocabulary could be *just* a "proxy vocabulary", or it could be
a regular one which chooses to "import" certain terms important to
them.

Such vocabularies are to be used in a way reminiscent of RDFa
profiles, but *not* within the parsing context. That is, from an RDFa
point of view, the data is ready and in RDF form. The vocabulary will
be used by consumers of RDF who wish to process and integrate it with
other data, in any way deemed usable.

This will be a very cheap solution for publishing in many cases. It
*will* require RDF-savvy users to interpret the data (the used
properties and classes), but it will do so using established RDF
concepts. That is, on a *semantic* level, not a syntactic one! (Granted,
we may invent the mapping/proxy concept, but by using
subPropertyOf/subClassOf, any system supporting direct RDFS inference
will work as expected. And this mapping vocabulary may turn out to be
generally beneficial.)
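
To make the idea concrete, here is a minimal Python sketch of a
consumer resolving proxy terms. It is only an illustration: triples
are plain tuples rather than a real RDF library's objects, the
`http://example.org/` IRIs are hypothetical, and only
rdfs:subPropertyOf and rdfs:subClassOf are used (no dedicated "map"
vocabulary). It follows a single mapping step and *replaces* the proxy
term, whereas full RDFS inference would add the inferred triples
alongside the original ones.

```python
RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"
SUBPROP = "http://www.w3.org/2000/01/rdf-schema#subPropertyOf"
SUBCLASS = "http://www.w3.org/2000/01/rdf-schema#subClassOf"

SITE = "http://example.org/vocab#"  # hypothetical site vocabulary
FOAF = "http://xmlns.com/foaf/0.1/"

# The proxy vocabulary: site terms linked to better known ones.
vocabulary = [
    (SITE + "name", SUBPROP, FOAF + "name"),
    (SITE + "Person", SUBCLASS, FOAF + "Person"),
]

# Data as an RDFa parser would produce it with only @vocab in play.
data = [
    ("http://example.org/people/1", RDF_TYPE, SITE + "Person"),
    ("http://example.org/people/1", SITE + "name", "Alice"),
]


def resolve_proxies(data, vocabulary):
    """Rewrite proxy properties and classes to their mapped targets."""
    prop_map = {s: o for s, p, o in vocabulary if p == SUBPROP}
    class_map = {s: o for s, p, o in vocabulary if p == SUBCLASS}
    mapped = []
    for s, p, o in data:
        if p == RDF_TYPE:
            o = class_map.get(o, o)   # proxied class
        else:
            p = prop_map.get(p, p)    # proxied property
        mapped.append((s, p, o))
    return mapped


mapped = resolve_proxies(data, vocabulary)
```

Note that nothing here touches the RDFa parser: the mapping happens
entirely after the data is in RDF form.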

### Mapping vocabulary example ###

I've put together a small example of how this could look at:
<https://gist.github.com/1092350>

It includes:

* some example site data,
* a (site specific) proxy vocabulary (using the suggested "map" vocabulary),
* an implementation using SPARQL 1.1 CONSTRUCT,
* new, mapped data with proxy terms resolved to terms from other vocabularies.

I've also done an extended (functional) experiment of this at
<https://github.com/niklasl/rdf-sparql-lab/tree/master/curation>. It
goes even further by also:

* "fixing" values by coercing them to expected datatypes or IRIs,
* handling "proxied" inverses of properties (?s ?p ?o => ?o ?invP ?s ).
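
The inverse handling can be sketched in a few lines of Python. This is
not the code from the repository above (which uses SPARQL); it is a
simplified stand-in where triples are tuples, the inverse declarations
are given as a plain dict, and the `http://example.org/` terms are
hypothetical.

```python
FOAF = "http://xmlns.com/foaf/0.1/"
SITE = "http://example.org/vocab#"  # hypothetical site vocabulary

# Hypothetical declaration: site:madeBy is a proxy for the inverse
# of foaf:made.
inverse_of = {SITE + "madeBy": FOAF + "made"}


def apply_inverses(triples, inverse_of):
    """Turn (?s ?p ?o) into (?o ?invP ?s) for proxied inverses."""
    out = []
    for s, p, o in triples:
        if p in inverse_of:
            out.append((o, inverse_of[p], s))  # flip subject and object
        else:
            out.append((s, p, o))              # leave other triples alone
    return out


site_data = [
    ("http://example.org/docs/1", SITE + "madeBy",
     "http://example.org/people/1"),
]
flipped = apply_inverses(site_data, inverse_of)
```

Value coercion is left out of this sketch, since it would need a
richer model of literals and IRIs than bare strings.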

It is important to understand that this example illustrates *use* of
the data. With this alternative, RDFa would just have to support
@vocab. Nothing more. The rest is in the semantics of the links.

(And I'm not suggesting to remove @prefix at all. That'd be the
mechanism of choice for anyone mixing vocabularies which don't use any
mapping/proxy semantics to include other terms.)

### Remaining: the "default profile" ###

One thing still remains though. We still have the predefined set of
prefix declarations and term mappings available by default. It could
be kept as a hard-wired, built-in set tied to the host language.

Prefixes and default vocab can reasonably have defaults, but terms
would have no customisable equivalent in the RDFa syntax. And there is
an issue, which Shane points out, where a predefined term overrides a
term intended to be resolved against a given @vocab.

One way to solve that could be to remove the notion of predefined
terms, and state that the default vocab is
<http://www.w3.org/1999/xhtml/vocab#> *unless overridden*. If one
uses @vocab, one has to use e.g. "xhv:license" explicitly. Or the
other way around, if one wants e.g. rel="license" to resolve against a
custom @vocab, one'd have to use rel=":license"...
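
A minimal Python sketch of the first of these two variants (no
predefined terms, and the XHTML vocabulary acting as default vocab
unless overridden) could look like the following. The function name
and signature are made up for illustration, and the sketch ignores
absolute IRIs and other details of real CURIE processing. The
rel=":license" variant is not shown.

```python
XHV = "http://www.w3.org/1999/xhtml/vocab#"


def resolve(token, vocab=None, prefixes=None):
    """Resolve a term or CURIE with XHV as the fallback default vocab."""
    prefixes = prefixes or {}
    if ":" in token:
        prefix, reference = token.split(":", 1)
        if prefix not in prefixes:
            raise KeyError(f"undeclared prefix: {prefix!r}")
        return prefixes[prefix] + reference
    # A bare term always resolves against the vocab in scope; there is
    # no table of predefined terms that could override it.
    return (vocab or XHV) + token


custom = "http://example.org/vocab#"  # hypothetical custom @vocab

no_vocab = resolve("license")
with_vocab = resolve("license", vocab=custom)
explicit = resolve("xhv:license", vocab=custom, prefixes={"xhv": XHV})
```

With this rule, the conflict Shane points out cannot arise, at the
price of having to write "xhv:license" once a custom @vocab is in
scope.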


## 3. Profiles revisited as GRDDL ##

There may of course be more complex scenarios where a multitude of
vocabularies intermix directly, and a declarative indirection (even
"macros") may be warranted.

There is already a W3C standard to explore though: GRDDL [2]. It can
be used in any complex scenario. I'd suggest that those who actually
need this complexity explore whether GRDDL as it stands will suffice.

If not, I can imagine a refined declarative mechanism for GRDDL, based
on the current profile mechanism. It could use either @class and @id,
the "app specific" data-attributes of HTML5, or... microdata (depending
on how the current schism is resolved). Think "microdata + profiles =
GRDDL 2.0". I'm not saying I'd prefer this, but it is an option (one I
actually gravitated towards prior to the notion of "proxy
vocabularies").

Such a mechanism could do more than simply map terms/tokens to URIs.
One can imagine coercing values, converting strings and id tokens to
URIs, creating RDF Lists from elements, etc. That would provide
app-specific, declaratively indirected means of expressing RDF for use
by experts with demanding requirements.

But that is reasonably beyond what the RDF Web Applications Group
should do now. Such a path would become work on some new GRDDL
mechanism (say hGRDDL, as Ben Adida thought of some years ago [3]).


## Summary ##

To repeat, my suggested alternatives are:

1. An extended @prefix microsyntax with support for prefixes, default
vocab and terms. Support for linking to such declarations using
@profile.
2. Removal of profiles in favour of "proxy vocabularies" for really
simple "one vocab fits all" scenarios. Use nested @vocab or @prefix as
they stand now for regular vocabulary mixing.
3. Turn profiles into a GRDDL mechanism, ideally not overloading the
RDFa attributes but instead working on a declarative level upon either
@class and @id, data-attributes or microdata attributes.

Currently I lean towards proxy vocabularies (alternative 2). But I'd
love to discuss this further.

Best regards,
Niklas Lindström
Valtech AB

[1]: http://lists.w3.org/Archives/Public/public-rdfa-wg/2010Oct/0238.html
[2]: http://www.w3.org/2003/g/data-view
[3]: http://lists.w3.org/Archives/Public/public-rdf-in-xhtml-tf/2006Apr/0069.html

--
<http://neverspace.net/>
<http://valtech.se/>
Received on Wednesday, 20 July 2011 00:42:09 UTC