Re: RDFa and RDF

Elias, Max,

In the RDF-in-HTML task force, we've focused mostly on "how to publish
RDF in an HTML page," and we will probably not address this storage
issue directly, as it depends on many other moving parts (like SPARQL).

That said, I see two different cases to consider, depending on which
source of data is authoritative for you:

1) a structured/semantic data source is authoritative

Think of either a SQL or triple-store backend, whose contents are
rendered as XHTML by some application. Your SPARQL engine then runs off
this backend, and the XHTML+RDFa becomes just another serialization of
the RDF: it gets rendered from this backend just like a normal
database-backed web application, with XHTML for humans to interpret.

2) the XHTML itself, or some closely related markup, is authoritative

This is the case for a semantic wiki, or for Creative Commons, or for
Queso: the actual authoritative data is XHTML+RDFa (either as standalone
pages, or delivered via something like Atom).

Then, if you want to manage raw triples (e.g. for SPARQL), you'll need
to extract the triples and store them in a triple store (exactly what
Elias described). I believe SweetWiki stores both the XHTML+RDFa and the
extracted triples. The first is for presentation, the second is
extracted from the first and is used for indexing.
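
As a rough illustration of this second case, here is a minimal Python
sketch that keeps the XHTML+RDFa as the authoritative copy and maintains
an extracted triple index next to it. Everything in it is my own
assumption for illustration, not how Queso or SweetWiki actually work:
the names extract_triples and DualStore are hypothetical, and the
extractor is a toy that only understands @about, @property and @content
and does not expand prefixes, so it is nowhere near a conforming RDFa
parser.

```python
import xml.etree.ElementTree as ET


def extract_triples(xhtml, base_uri):
    """Toy extractor (hypothetical): handles only @about, @property, @content.

    Prefixed names such as "dc:title" are kept as-is rather than expanded.
    """
    root = ET.fromstring(xhtml)
    triples = []

    def walk(element, subject):
        # An @about attribute sets the subject for this element and its children.
        subject = element.get("about", subject)
        prop = element.get("property")
        if prop is not None:
            # Prefer @content; otherwise fall back to the element's text.
            value = element.get("content")
            if value is None:
                value = "".join(element.itertext()).strip()
            triples.append((subject, prop, value))
        for child in element:
            walk(child, subject)

    # Without any @about, statements are about the document itself.
    walk(root, base_uri)
    return triples


class DualStore:
    """Keeps the authoritative XHTML+RDFa and a derived triple index in parallel."""

    def __init__(self):
        self.documents = {}  # URL -> authoritative XHTML+RDFa source
        self.index = {}      # URL -> triples extracted from that document

    def put(self, url, xhtml):
        # Store the page for presentation, then re-derive its triples for querying.
        self.documents[url] = xhtml
        self.index[url] = extract_triples(xhtml, url)

    def match(self, s=None, p=None, o=None):
        """Simple triple-pattern lookup; None acts as a wildcard."""
        return [
            t
            for triples in self.index.values()
            for t in triples
            if (s is None or t[0] == s)
            and (p is None or t[1] == p)
            and (o is None or t[2] == o)
        ]
```

The point of the design is that put() always re-derives the index from
the stored page, so the triples can never drift away from the
authoritative XHTML+RDFa. A real setup would hand the extracted triples
to an actual triple store with a SPARQL endpoint instead of the
in-memory match() lookup used here.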

To put it another way, what do you do with RDF/XML? Either you're
publishing RDF/XML from a triple store (or from a SQL backend), in which
case you're just serializing triples as RDF/XML, and your storage is
just triples. Or, the RDF/XML comes from many different sources and is,
for you, the authoritative triple source. You must then extract the raw
triples from it if you want to serve them up via SPARQL.

The difference is that RDF/XML doesn't give you much more than the raw
triples themselves, while XHTML+RDFa also gives you a styled,
human-readable presentation that can't always be re-created from the raw
triples. That's
why, if it's your authoritative source of data, you'll want to keep the
XHTML+RDFa around, but extract and index the triples in parallel for
things like SPARQL. If it's just another way for you to serialize your
pre-existing triples, then it doesn't need to be integrated into your
SPARQL setup.
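
For the first case, where XHTML+RDFa is just one more serialization of
triples you already hold elsewhere, the rendering step could look
roughly like the Python sketch below. The function name
render_xhtml_rdfa and the choice of a <ul> layout are my own
assumptions for illustration; it only covers literal-valued properties
of a single subject.

```python
from xml.sax.saxutils import escape, quoteattr


def render_xhtml_rdfa(subject, properties):
    """Render (property, literal) pairs for one subject as an XHTML fragment.

    Hypothetical helper: the backend stays authoritative, and this output
    is only a view of it, so nothing here needs to be indexed for SPARQL.
    """
    items = "\n".join(
        # quoteattr() escapes the value and adds the surrounding quotes.
        f"  <li><span property={quoteattr(prop)}>{escape(value)}</span></li>"
        for prop, value in properties
    )
    return f"<ul about={quoteattr(subject)}>\n{items}\n</ul>"
```

Since the fragment is generated from the backend on every request, the
SPARQL engine keeps querying the backend directly and never has to
parse this markup.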


Elias wrote:
> Max,
> At IBM we have been doing some work in the area and we are exploring
> possible answers to your questions in Queso [1]
> We give access for read/write via an Atom Publishing Protocol endpoint.
> The contents are converted using AtomOWL to RDF (i.e. the entire
> contents of the Atom entry XML) and if the content is XHTML, we extract
> RDF triples encoded in RDFa and make them available as triples.
> Yes, XML-elements become XMLLiterals, subjects are URIs??? (not sure
> what else it could be) and yes it's working great for SPARQL queries.
> -Elias
> [1]
> Max Völkel wrote:
>> Hi RDFa'ers,
>>   I still have a tiny but (to me) important question about RDFa.
>>   First,  it  seems a great idea that I can annotate my XHTML document
>>   with  RDF,  even better, I can annotate each individual element with
>>   RDF,  can  make  elements  being  the  subject  or  object  of  RDF
>>   statements.  Wow.  I  even  can  make  RDF  statements  unrelated to
>>   elements on the page. Wow.
>>   Okay, how do I represent such structures in an API/triple store?
>>   Will the XML-elements become XML-Literals? What for the subjects?
>>   Are   XPointer-expressions  using the  document  as  their  root  an
>>   identifier  for  subjects?  I worry, because I need to represent the
>>   XML+RDFa  somehow  in  a  triple  store  in  order to do e.g. SPARQL
>>   queries.
>>   Maybe  I  overlooked  something,  but  I  never  found this relation
>>   explained.
>> Kind regards,
>>   Max Völkel
>> --
>> Dipl.-Inform. Max Völkel, Universität Karlsruhe / FZI
>>   +49 721 9654-854

Received on Thursday, 7 September 2006 20:20:53 UTC