Re: RDFa + RDF/XML Considered Harmful? (was RE: Ordnance Survey data as Linked Data) from Mark Birbeck on 2008-07-14 (semantic-web@w3.org from July 2008)

From: Mark Birbeck <mark.birbeck@webbackplane.com>
Date: Mon, 14 Jul 2008 17:38:47 +0100
To: "Richard Cyganiak" <richard@cyganiak.de>
Cc: "Tom Heath" <Tom.Heath@talis.com>, "Kingsley Idehen" <kidehen@openlinksw.com>, public-lod@w3.org, semantic-web@w3.org
Message-ID: <ed77aa9f0807140938m5872f3faoe4d155c710b3bee8@mail.gmail.com>
Hi Richard,

> Well, RDFa has made life simpler for those publishers whose requirements are
> met well by RDFa. It has made life more complicated for client developers,
> since they have to support yet another RDF syntax.
>
> I think RDFa is an important piece of the SemWeb technology puzzle, but your
> claim that it "makes the situation an order of magnitude less complicated"
> is unfounded, IMO.

Whatever... :)

It's really not worth arguing about, and it certainly doesn't bother
me. I don't see everyone backing up their opinions with research and
statistics all the time, and the world doesn't fall apart.

But, no worries...I'll withdraw my claim.


>> as Kingsley said, increasing the number of
>> ways to publish metadata increases the number of possible clients that
>> might consume the data:
>
> Kingsley was talking about a situation where the publisher offers *all*
> different methods of publishing RDF. Sure this increases the number of
> theoretically possible clients, but it also increases the cost of publishing
> RDF.

To paraphrase your comment above...it actually only increases the cost
of publishing RDF for those who are already publishing RDF.

But for those not yet publishing RDF, it's a *new* cost whatever way
you look at it (RDFa or RDF/XML).

Now, my *feeling* is that a *lot* of people will simply not incur that
cost when dealing with RDF/XML, but they *may* incur that cost if it's
as easy as publishing XHTML.

Coming at this from a different direction, metadata publishers are
probably a very small percentage of the people publishing web pages.
(Yes...I know...unfounded...)


> [snip]
>
> There is no rdf:Graph type, and Tom didn't say there is one.

I probably didn't explain myself very well. I wasn't using the
sarcastic "Ah...Tom...I have you here because I don't believe there is
such a thing as rdf:Graph" mode of expression. :)

I was using the "oh...I recall a while ago someone proposing
rdf:Graph...was it Jeremy?...and maybe I missed that it has gone
further than I thought" mode of expression.

As I said, I do think it's a good idea, and my comments were mainly
about how even if this type existed, it could work with RDFa.


> Tom used the
> words "RDF document" and "RDF graph" synonymously, which is a bit sloppy.
>
>> However, is not an rdf:Graph a type of information resource?
>
> Depends on how you squint at it. Technically speaking, a graph, as a
> mathematical entity, is a fixed, immutable thing. An information resource,
> on the other hand, can (and often does) vary over time. That is, tomorrow it
> might have a different representation than today.

Ok. But we were talking about 'named graphs', I believe, which are not
at all required to be static.

My understanding of Tom's point was, if we use (X)HTML to carry RDF,
via RDFa, then how do we know when we are talking about the named
graph of triples, or the HTML document that *contains* the named
graph.

I think it's a very good question, and I was trying to address it.


> One could say that, in the case where we look at the Web through RDF
> goggles, the concrete representation that we get back at any time from an
> information resource is an RDF graph. The information resource itself is not
> an RDF graph, but rather a function that returns an RDF graph.

Yes...one could say that, I guess. But if we deal with *named graphs*
I think things are different. I think you could say 'this is a graph'
in the same way you might say 'this is a web-page'.


> Now, if we ignore time and pretend that we just deal with a static,
> frozen-in-time snapshot of the Web, then it's probably okay to pretend that
> information resources are RDF graphs (because the function is constant).
> This is what the RDF browsers out there do in practice, they treat the Web
> as a set of named graphs, where the URIs of RDF documents are the graph
> names.

That's right. But my point is that I don't think there is anything
wrong with a document having multiple types; of course we know that
saying that a web-page is both a web-page and a car is problematic
(see below), but saying that a web-page is both a web-page and a
*named* graph doesn't have the same issues.


>> An
>> RDF/XML document delivered from a web server is both a document and a
>> graph,
>
> Yes and no, see above. It's true if you ignore time, but architecturally
> speaking, Web documents (information resources) change over time, while
> graphs are immutable.

Sure...but the graph corresponding to a particular 'name' can change.
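
A minimal Python sketch of this distinction, assuming a graph is
modelled as an immutable set of triples and a 'named graph' as a name
that can be rebound to a different set over time. The class and method
names (NamedGraphStore, publish, get) and the example URIs are
hypothetical, chosen only for illustration:

```python
class NamedGraphStore:
    """Maps graph names (URIs) to immutable graphs (frozensets of triples)."""

    def __init__(self):
        self._graphs = {}

    def publish(self, name, triples):
        # Rebinding the name replaces the graph; the old graph is untouched.
        self._graphs[name] = frozenset(triples)

    def get(self, name):
        # The graph currently associated with this name.
        return self._graphs[name]


store = NamedGraphStore()
name = "http://example.com/my_rdf_document.rdf"  # hypothetical graph name

store.publish(name, [(name, "dc:publisher", "Richard")])
monday = store.get(name)  # a snapshot: a fixed, immutable graph

store.publish(name, [(name, "dc:publisher", "Richard"),
                     (name, "dc:date", "2008-07-14")])
tuesday = store.get(name)  # same name, different graph
```

Each snapshot is fixed in Richard's sense, while the name behaves like
an information resource whose representation varies over time.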


>> but we have chosen to ignore that in the RDF architecture; it's
>> not possible to say 'this graph was published by', in RDF/XML, i.e.,
>> to talk about the information resource itself, because you will always
>> be talking about whatever the RDF/XML itself is about.
>
> Huh? Of course it is possible to talk about information resources in RDF.
> Assume that this is the content of an RDF document published at
> http://example.com/my_rdf_document.rdf :
>
> <rdf:Description rdf:about="http://example.com/my_rdf_document.rdf">
>    <dc:publisher>Richard</dc:publisher>
>    <dc:date>2008-07-14</dc:date>
>    ...
> </rdf:Description>
>
> This is an RDF document talking about itself. This is standard practice, you
> can find examples like this in the RDF spec, and everywhere in the wild.
> (Note that in the example, I could have used rdf:about="", because an empty
> URI is expanded to the URI of the document.)
>
>> But there is no reason that we could not enable this, and if we wanted
>> to go this route, RDFa+HTML allows it.
>
> It's equally possible in RDFa and RDF/XML, today.

Mmm...I don't agree. I agree that you can talk about information
resources, but I see RDF/XML as an odd type of information resource
that is 'pure' triples.

Of course, I could well be wrong. :)

But if I am, your scenario seems to fall foul of the whole information
resource/resource dilemma, so I'm having trouble seeing how that is
resolved. I'll illustrate why I think that RDF/XML cannot talk about
itself (in a kind of 'Gödel's Incompleteness Theorem' sort of way).

Let's flip the whole thing on its head, to begin with. Let's say that
our RDF/XML document is referring to a car:

  <rdf:Description rdf:about="http://cars.org/123456">
    <dc:creator>BMW</dc:creator>
    <dc:date>2008-07-14</dc:date>
    ...
  </rdf:Description>

That's fine, and we could have hundreds of cars in our RDF/XML
document. However, let's stick to one car, and serve it from the same
URL as its subject. As you rightly say we can abbreviate the @about to
refer to the 'current document':

At http://cars.org/123456:

  <rdf:Description rdf:about="">
    <dc:creator>BMW</dc:creator>
    <dc:date>2008-07-14</dc:date>
    ...
  </rdf:Description>

Hopefully so far, so good. Now let's add your publisher
information:

At http://cars.org/123456:

  <rdf:Description rdf:about="">
    <dc:creator>BMW</dc:creator>
    <dc:publisher>Richard</dc:publisher>
    <dc:date>2008-07-14</dc:date>
    ...
  </rdf:Description>

Why does dc:publisher suddenly apply to the document and not the car?

If it does, how did you (or your software) know to apply it that way?

And even if you knew that dc:publisher applies to documents, (a) how
do you know to apply it to the RDF/XML document, and not
<http://cars.org/123456> (which might be a document), and (b) even if
you can resolve that, how would you decide what to apply the dc:date
to? Should it apply to the car or to the RDF/XML document?

For these reasons I have always viewed RDF/XML as 'scaffolding' that
holds triples which can be about *anything* you like...anything, that
is, except itself. :)
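
The ambiguity above can be sketched in Python using only the standard
library. This is not a real RDF/XML parser: the helper simple_triples
is hypothetical and handles only flat rdf:Description elements with
literal values. It assumes the last snippet is wrapped in an rdf:RDF
element with the usual namespace declarations, with the elided "..."
left out:

```python
import xml.etree.ElementTree as ET
from urllib.parse import urljoin

RDF_NS = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"

# The URL the document is served from, as in the example above.
BASE = "http://cars.org/123456"

DOC = """<rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
         xmlns:dc="http://purl.org/dc/elements/1.1/">
  <rdf:Description rdf:about="">
    <dc:creator>BMW</dc:creator>
    <dc:publisher>Richard</dc:publisher>
    <dc:date>2008-07-14</dc:date>
  </rdf:Description>
</rdf:RDF>"""


def simple_triples(xml_text, base):
    """Return (subject, predicate, literal) tuples for flat descriptions."""
    root = ET.fromstring(xml_text)
    triples = []
    for desc in root.findall(f"{{{RDF_NS}}}Description"):
        # An empty rdf:about resolves to the URI of the document itself.
        subject = urljoin(base, desc.get(f"{{{RDF_NS}}}about", ""))
        for prop in desc:
            # ElementTree tags look like "{namespace}local"; join the parts.
            predicate = prop.tag[1:].replace("}", "")
            triples.append((subject, predicate, prop.text))
    return triples


triples = simple_triples(DOC, BASE)
subjects = {s for s, _, _ in triples}
```

All three triples come out with the single subject
http://cars.org/123456, so nothing in the triples themselves says
whether dc:creator, dc:publisher or dc:date describes the car or the
document.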

Regards,

Mark

-- 
Mark Birbeck, webBackplane

mark.birbeck@webBackplane.com

http://webBackplane.com/mark-birbeck

webBackplane is a trading name of Backplane Ltd. (company number
05972288, registered office: 2nd Floor, 69/85 Tabernacle Street,
London, EC2A 4RR)
Received on Monday, 14 July 2008 16:39:26 UTC