Re: Identification of RDFa content from Mark Birbeck on 2006-11-24 (public-rdf-in-xhtml-tf@w3.org from November 2006)

From: Mark Birbeck <mark.birbeck@x-port.net>
Date: Fri, 24 Nov 2006 23:17:04 +0000
To: "Ivan Herman" <ivan@w3.org>
Cc: "Ben Adida" <ben@mit.edu>, "public-rdf-in-xhtml task force" <public-rdf-in-xhtml-tf@w3.org>
Message-ID: <640dd5060611241517o5e9b39b4haea959a8fb75a5ed@mail.gmail.com>

Hi Ivan,

This is an interesting issue, and the problem has nothing to do with
modules, XHTML 1.2, XHTML 2, or anything like that. The key difficulty
is that RDFa has been specifically designed to beef up the metadata
features that HTML already has, and as a consequence, all HTML
documents are already RDFa-compliant.

Take something like this (in HTML):

  <head>
    <title>My site</title>
    <link rel="next" href="...">
    <link rel="previous" href="...">
  </head>

This tells us that the current document has 'next' and 'previous'
documents, and is simple, standard HTML. Now, there's no reason at
all why some processing software shouldn't store the following
information about that document:

  <> h:next <...> .
  <> h:previous <...> .
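
As a rough illustration of the kind of processing software meant
here, the following is a minimal sketch in Python that reads the
<link> elements and emits one triple per rel value. The class and
function names, the example.org URIs and the choice to treat 'h:' as a
plain string prefix are assumptions made for the example; this is not
the RDFa processing model, and it does not restrict itself to <head>.

```python
from html.parser import HTMLParser


class LinkTripleExtractor(HTMLParser):
    """Collect (subject, predicate, object) triples from <link> elements.

    Hypothetical helper: the subject is the document's own URI, the
    predicate is the rel value behind an assumed 'h:' prefix, and the
    object is the href value.
    """

    def __init__(self, base_uri, prefix="h:"):
        super().__init__()
        self.base_uri = base_uri
        self.prefix = prefix
        self.triples = []

    def handle_starttag(self, tag, attrs):
        if tag != "link":
            return
        attributes = dict(attrs)
        rel = attributes.get("rel")
        href = attributes.get("href")
        if not rel or not href:
            return  # nothing usable on this element; skip it quietly
        # rel may carry several space-separated values
        for value in rel.split():
            predicate = self.prefix + value.lower()
            self.triples.append((self.base_uri, predicate, href))


def extract_link_triples(html, base_uri):
    """Return the triples found in html, in document order."""
    parser = LinkTripleExtractor(base_uri)
    parser.feed(html)
    parser.close()
    return parser.triples


# Made-up URIs standing in for the elided href values above.
SAMPLE = (
    '<head><title>My site</title>'
    '<link rel="next" href="http://example.org/page3">'
    '<link rel="previous" href="http://example.org/page1"></head>'
)

for triple in extract_link_triples(SAMPLE, "http://example.org/page2"):
    print(triple)
```

A document with no <link> elements simply yields an empty list, so
nothing special has to be done for pages with little or no metadata.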

Now we can use SPARQL to find all documents that refer to some other
document, and even documents that are the last in a chain. The fact
that this document 'contains' RDFa is down to the processor, and not
down to the author--it's in the eye of the beholder :).
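
To make those two queries concrete without assuming any particular
triple store or SPARQL engine, here is a small sketch over plain
Python tuples. The function names, the 'h:' predicates as strings and
the example.org URIs are all assumptions for illustration, and "last
in a chain" is taken to mean "has a 'previous' but no 'next'".

```python
def documents_referring_to(triples, target):
    """Return every subject that has any link to the target document."""
    return {s for (s, p, o) in triples if o == target}


def last_in_chain(triples):
    """Return documents with an h:previous link but no h:next link."""
    has_previous = {s for (s, p, o) in triples if p == "h:previous"}
    has_next = {s for (s, p, o) in triples if p == "h:next"}
    return has_previous - has_next


# Hypothetical three-page chain: page1 -> page2 -> page3.
CHAIN = [
    ("http://example.org/page1", "h:next", "http://example.org/page2"),
    ("http://example.org/page2", "h:previous", "http://example.org/page1"),
    ("http://example.org/page2", "h:next", "http://example.org/page3"),
    ("http://example.org/page3", "h:previous", "http://example.org/page2"),
]

print(sorted(documents_referring_to(CHAIN, "http://example.org/page2")))
print(sorted(last_in_chain(CHAIN)))
```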

I've said this before, and it's generally been met with the claim that
we're 'hijacking' people's data. Hopefully, an example like this shows
that we're certainly not 'forcing' documents to be RDFa when people
don't want them to be; what we're doing is saying that HTML documents
already have metadata, and RDFa defines some rules about how to treat
that metadata from an RDF standpoint. The fact that these rules are
entry-level RDFa is of course fortuitous, since it means that you can
also add far richer metadata later on.

So, what I'm interested to hear is a use case for something that
indicates the presence of RDFa in a document. It's pointless having it
*in* the document, since as I've shown, an RDFa parser is not going to
'fail' if it processes an HTML document with limited metadata, so you
could just process all documents that way.

But you could say that we don't want to process such documents, and
therefore the indicator would need to be outside, since you want to
save the cost of retrieval. (If you have to retrieve it to find out
whether to process it, as Ben says you might as well just go ahead and
process it.) But then you're into the problem that FOAF has--how do
you bootstrap the whole thing? Do we maintain a list of
RDFa-conformant documents?

To put this another way--indicating that a document 'is' RDFa is
pointless, since all documents 'just are', which means that any
indicator we devise is only playing the role of pointing out that some
document was intended to be part of some community of
specially-prepared documents that have been crafted to contain useful
metadata. That may or may not be useful--I couldn't say, but I just
wanted to clarify that some stamp of approval is not the same as
indicating the class of the document (the latter being unnecessary).

All the best,

Mark


On 23/11/06, Ivan Herman <ivan@w3.org> wrote:
> Hm. If we want a quick usage and spread of RDFa, then this may not be
> fully satisfactory at least in my view. Nobody knows when XHTML 1.2 will
> be published as a Rec, let alone XHTML 2.0 (the group's charter has just
> been sent to the AC, ie, there is no group yet!). What happens in the
> meantime?
>
> My hope is that the XHTML1.x RDFa module, as well as the final technical
> spec, will be published way before the full XHTML1.2, and that we can
> start using RDFa big time and quickly. Using a (possibly optional)
> profile tag might help that.
>
> Of course, we could rely on GRDDL and, say, Fabien's XSLT script [as an
> aside: we should have a clear test set; Fabien's script, for example,
> does not produce the same result as Elias' one, I think there are
> missing features...]. However, if I take an environment like Redland,
> that means that it would have to go and execute an 'outsider' script
> every time it wants to retrieve RDFa content (which also means that it
> would not work off-line) whereas if it knew via a profile that this is
> RDFa, it could parse the file right away and locally.
>
> Bottom line: I am still not convinced:-(; and I do not see harm in
> declaring a separate profile...
>
> Ivan
>
> Ben Adida wrote:
> > Ivan,
> >
> > Sorry for the delayed response here.
> >
> > RDFa is meant to be a natural part of XHTML. In other words, declaring a
> > document to be XHTML 1.2 or 2.0 is enough to make a parser look for
> > RDFa. This may be done by specifying a GRDDL profile in the XHTML 1.2
> > and 2.0 namespace documents.
> >
> > Of course, parsers may choose to be more promiscuous than that and look
> > inside XHTML 1.1 and 1.0 if they so choose...
> >
> > -Ben
> >
> > Ivan Herman wrote:
> >
> >>This may have been discussed before, in which case apologies. I have not
> >>seen a reference to it in the latest draft.
> >>
> >>The question: how does one discover that an XHTML file is 'RDFa-d'? The
> >>issue struck me as a result of some discussions lately around the
> >>Tabulator[1] and Chris Bizer's announcement[2]. In both cases one can
> >>see engines that are able to make an indirect step, so to say; ie, they
> >>get a URI to a traditional site, but they can deduce the presence of
> >>corresponding RDF data, which they can add to the graph they build and
> >>explore. Examples are the <link> references to RDF data, or the GRDDL
> >>profile.
> >>
> >>Hence the question again: how does an automatic procedure 'know' that an
> >>XHTML file contains RDFa encoded extra RDF data? Of course, a processor
> >>could RDFa process *all* XHTML files it gets hold of, but it may be worth
> >>adding some standard notification. Also, if such identification was
> >>around, the same URI could be used both for human consumption and for an
> >>RDFa-aware RDF environment.
> >>
> >>One would think of a profile attribute or some sort of special,
> >>predefined <link>... whichever. Something would be good.
> >>
> >>Any thoughts?
> >>
> >>Ivan
> >>
> >>
> >>[1] http://dig.csail.mit.edu/breadcrumbs/node/165
> >>[2] http://lists.w3.org/Archives/Public/semantic-web/2006Oct/0065.html
> >>
> >
> >
>
> --
>
> Ivan Herman, W3C Semantic Web Activity Lead
> URL: http://www.w3.org/People/Ivan/
> PGP Key: http://www.cwi.nl/%7Eivan/AboutMe/pgpkey.html
> FOAF: http://www.ivan-herman.net/foaf.rdf
>
>
>


-- 
Mark Birbeck
CEO
x-port.net Ltd.

e: Mark.Birbeck@x-port.net
t: +44 (0) 20 7689 9232
w: http://www.formsPlayer.com/
b: http://internet-apps.blogspot.com/

Download our XForms processor from
http://www.formsPlayer.com/
Received on Friday, 24 November 2006 23:17:13 UTC