Re: Blank nodes & classes from Mark Birbeck on 2007-06-22 (public-rdf-in-xhtml-tf@w3.org from June 2007)

From: Mark Birbeck <mark.birbeck@x-port.net>
Date: Fri, 22 Jun 2007 17:04:47 +0100
To: "Cédric Mesnage" <cedric.mesnage@lu.unisi.ch>
Cc: public-rdf-in-xhtml-tf@w3.org
Message-ID: <640dd5060706220904s27077b66ocab61ed50a0f84e4@mail.gmail.com>

Hi Cédric,

I'm open to leaving this entire issue until a future version, if
people have trouble with it. But since I do actually have an action
item pending to work through the whole rdfs:label issue more generally
(i.e., not just with <img> but in other places too), hopefully you'll
forgive me if I use your post as a sounding board for some thoughts.
;)

> >> I am strongly against using URIs to identify resources...
> >
> > I think that's a discussion for another list. :)
>
> all right, let's not discuss it further then.

Whatever... :)


> > In other words, if you believe that we shouldn't take anything from
> > the structure of the document, then we really should spell everything
> > out.
>
> It's not what I meant, we should make use of the structure of the
> document, but we shouldn't infer too much, when you say that :
>
> <span about="http://whatever.org"> food book</span>
>
> corresponds to the triple
>
> <http://whatever.org> rdfs:label "food book".
>
> it is an interpretation...

With respect, that's the point of RDFa. It's mostly about
interpretation. If an author who is unfamiliar with RDFa writes this:

  <a rel="p" href="o">Some link</a>

then we've agreed that from RDFa's perspective, it is safe to
interpret this as a triple, as follows:

  <> p o .

Why is it safe? Because the HTML spec itself comes about as close as
you could get to saying so. It says that @rel represents a
relationship between the current document and some target document. Of
course it doesn't mention RDF, but it just so happens that RDF is the
best-known and most fully developed way of expressing statements such
as "this document stands in an 'x' relationship to that document".
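
To make that reading concrete, here is a minimal sketch in Python of
how a consumer might pull such triples out of a page. It is only an
illustration under stated assumptions, not anything the RDFa drafts
prescribe: the names RelTripleParser and rel_triples are hypothetical,
"<>" is used as a stand-in for the current document, and the only
library used is the standard html.parser module.

```python
from html.parser import HTMLParser


class RelTripleParser(HTMLParser):
    """Hypothetical helper: collects (subject, predicate, object)
    tuples from <a rel="..." href="..."> elements."""

    def __init__(self):
        super().__init__()
        self.triples = []

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "a" and attrs.get("rel") and attrs.get("href"):
            # @rel may carry several space-separated values;
            # each one is treated as a separate predicate.
            for predicate in attrs["rel"].split():
                # "<>" stands for the current document (an assumption
                # of this sketch, mirroring the triple notation above).
                self.triples.append(("<>", predicate, attrs["href"]))


def rel_triples(html):
    """Return the triples implied by @rel/@href in an HTML fragment."""
    parser = RelTripleParser()
    parser.feed(html)
    parser.close()
    return parser.triples


print(rel_triples('<a rel="p" href="o">Some link</a>'))
# [('<>', 'p', 'o')]
```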

Now, given that what we're doing is largely about interpretation of
the HTML specification--putting aside new attributes like @about and
@datatype--it's perfectly legitimate to ask what else is the author
doing when they do this:

  <a rel="p" href="o">Some link</a>

Since every browser will display (or speak) this text in a way that is
different to the surrounding text, and further, given that this text
is invariably a link via which a user can navigate to a new resource,
I think it's pretty safe to say that this is a "human readable label
for the resource 'o'", which is, as luck would have it, the definition
of rdfs:label. :)
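
As a rough sketch of that second interpretation, the following Python
treats the text inside a link as an rdfs:label for the link's target.
Again this is only my illustration of the idea: LinkLabelParser and
link_labels are hypothetical names, whitespace is collapsed as a
simplifying assumption, and nothing here is a rule from any spec.

```python
from html.parser import HTMLParser


class LinkLabelParser(HTMLParser):
    """Hypothetical helper: emits (href, "rdfs:label", link text)."""

    def __init__(self):
        super().__init__()
        self.labels = []
        self._href = None   # target of the <a> currently open, if any
        self._text = []     # text fragments seen inside that <a>

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            self._href = dict(attrs).get("href")
            self._text = []

    def handle_data(self, data):
        if self._href is not None:
            self._text.append(data)

    def handle_endtag(self, tag):
        if tag == "a" and self._href is not None:
            # Collapse runs of whitespace (an assumption of this sketch).
            label = " ".join("".join(self._text).split())
            if label:
                self.labels.append((self._href, "rdfs:label", label))
            self._href = None


def link_labels(html):
    """Return label triples for every link with an href and some text."""
    parser = LinkLabelParser()
    parser.feed(html)
    parser.close()
    return parser.labels


print(link_labels('<a rel="p" href="o">Some link</a>'))
# [('o', 'rdfs:label', 'Some link')]
```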

Continuing the point, but returning now to the example using @about,
let's say I have the following text in my document:

  The Prime Minister today flew to Russia.

As it stands we don't know which Prime Minister, or on which day. If
our author now does this:

  The
  <span about="http://people.org/tony-blair">Prime Minister</span>
  <span content="2007-06-22" datatype="xsd:date">today</span>
  flew to
  <span about="http://countries.org/russia">Russia</span>.

they have made quite an explicit connection between fragments of text
and a resource. It seems to be really stretching things to say that we
might cause problems if we interpret those fragments of text as 'human
readable labels', since it's clear that they are.
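
For what it's worth, the same reading of @about can be sketched in a
few lines of Python. The assumptions are mine: only <span> elements
are considered, the names AboutLabelParser and about_labels are
hypothetical, and elements without @about (such as the date span)
simply contribute no label.

```python
from html.parser import HTMLParser


class AboutLabelParser(HTMLParser):
    """Hypothetical helper: emits (about, "rdfs:label", text) for
    each <span about="..."> in a fragment."""

    def __init__(self):
        super().__init__()
        self.triples = []
        self._open = []  # one [about, text_parts] entry per open <span>

    def handle_starttag(self, tag, attrs):
        if tag == "span":
            self._open.append([dict(attrs).get("about"), []])

    def handle_data(self, data):
        # Text belongs to every enclosing span that names a resource.
        for about, parts in self._open:
            if about is not None:
                parts.append(data)

    def handle_endtag(self, tag):
        if tag == "span" and self._open:
            about, parts = self._open.pop()
            label = " ".join("".join(parts).split())
            if about is not None and label:
                self.triples.append((about, "rdfs:label", label))


def about_labels(html):
    parser = AboutLabelParser()
    parser.feed(html)
    parser.close()
    return parser.triples


SENTENCE = """The
<span about="http://people.org/tony-blair">Prime Minister</span>
<span content="2007-06-22" datatype="xsd:date">today</span>
flew to
<span about="http://countries.org/russia">Russia</span>."""

for triple in about_labels(SENTENCE):
    print(triple)
# ('http://people.org/tony-blair', 'rdfs:label', 'Prime Minister')
# ('http://countries.org/russia', 'rdfs:label', 'Russia')
```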

To illustrate, imagine for a moment that Google started indexing the
RDFa in these kinds of documents. If I search in Google for "Prime
Minister", would it not be legitimate to ask me if I'm looking for
articles specifically about a particular Prime Minister, and if so,
show me a list of them from different countries, and from the past?
Given the mark-up above, it would seem legitimate for that list of
Prime Ministers to include the person identified by the resource
"http://people.org/tony-blair". But you could only draw that
conclusion if we agree with this proposition:

  <http://people.org/tony-blair> rdfs:label "Prime Minister" .

Note that this statement does not rule out other labels which our
'Semantic Google' might have picked up as it scoured documents across
the web:

  <http://people.org/tony-blair> rdfs:label "Tony Blair" .
  <http://people.org/tony-blair> rdfs:label "the Leader of the Labour Party" .

etc.

But note that none of these are 'harmful' statements, since they are
true; someone, somewhere, used one of these labels as a way of
identifying the resource "http://people.org/tony-blair".
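
To show how harmlessly such labels accumulate, here is a small Python
sketch of the lookup a 'Semantic Google' might do. The index
structure and the names build_label_index and find_resources are
purely hypothetical; the triples are just the three given above, and
case-insensitive matching is an assumption of the sketch.

```python
from collections import defaultdict

# The three label statements from the discussion above.
TRIPLES = [
    ("http://people.org/tony-blair", "rdfs:label", "Prime Minister"),
    ("http://people.org/tony-blair", "rdfs:label", "Tony Blair"),
    ("http://people.org/tony-blair", "rdfs:label",
     "the Leader of the Labour Party"),
]


def build_label_index(triples):
    """Map each label (case-folded) to the set of resources using it."""
    index = defaultdict(set)
    for subject, predicate, obj in triples:
        if predicate == "rdfs:label":
            index[obj.casefold()].add(subject)
    return index


def find_resources(index, query):
    """Return the resources that someone, somewhere, labelled `query`."""
    return sorted(index.get(query.casefold(), set()))


index = build_label_index(TRIPLES)
print(find_resources(index, "prime minister"))
# ['http://people.org/tony-blair']
```

Adding a further label for the same resource only adds another key to
the index; it never invalidates the ones already there.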


So, the key point is that if the author attaches some metadata to a
particular piece of mark-up then they have not only made statements
via the metadata that they have added, but they have also made a
statement by choosing one place to put their mark-up over another.


However, as I said to Ivan, there is no big deal in leaving this open
for discussion in a future version of RDFa. All I want to stress here
is that such interpretations *do* make sense, since many (not all, of
course) are given by HTML itself.


>... even worse an automatic interpretation which
> by definition will be wrong in many cases or if not wrong, will be
> different than what the developer/publisher meant. Moreover, the
> piece of information which semantics were not explicitly defined by
> the publisher does not seem to be of interest, it might be more
> important to automatically infer the triple :
>
> <> rdf:li <http://whatever.org>
>
> or  <> dcel:relation <http://whatever.org>
>
> as the only thing that was said by the publisher is that this
> resource is related to whatever :-)

I guess you could say that, but you might then be in danger of putting
the RDFa cart before the HTML horse. If I place the following in a
document:

  <img src="my-picture.jpg" alt="Me on holiday" />

then I *really* have added an image. We might not (yet) know what
triples to generate to indicate that we have an image--should we say
xh:img, or dcmitype:Image, or foaf:Image, or whatever--and because of
that I'm happy to leave the detail of the question to a future
version. But the point is that the author's intent is pretty clear,
and we might as well preserve that information in some form or
another.


One final thing: it's worth pointing out that most of the discussion
on this list seems to come from people consuming or generating RDFa on
servers. However, when you start to try to do things with RDFa in the
client you find that you often need some of the context information,
and rdfs:label is a fairly simple, non-invasive means of retaining
some context that does not go against what can be derived from how
the author has chosen to mark up a document.

Regards,

Mark

-- 
  Mark Birbeck, formsPlayer

  mark.birbeck@x-port.net | +44 (0) 20 7689 9232
  http://www.formsPlayer.com | http://internet-apps.blogspot.com

  standards. innovation.
Received on Friday, 22 June 2007 16:04:56 UTC