Re: XHTML-RDFa draft made public from Mark Birbeck on 2007-04-11 (public-rdf-in-xhtml-tf@w3.org from April 2007)

From: Mark Birbeck <mark.birbeck@x-port.net>
Date: Wed, 11 Apr 2007 14:00:08 +0100
To: "Elias Torres" <elias@torrez.us>
Cc: "Ivan Herman" <ivan@w3.org>, "Shane McCarron" <shane@aptest.com>, public-rdf-in-xhtml-tf@w3.org, "HTML WG" <w3c-html-wg@w3.org>
Message-ID: <640dd5060704110600j55dd4ed9n6b81561a2c182d32@mail.gmail.com>

Hi Elias,

> I have not had time to attend the meetings, let alone do my todo on
> investigating many ontologies to find the most used. I wasn't worried
> because I too thought that we were aiming at consensus on the new style
> and not the current. Where are we/you on this? current or new hybrid
> approach?

The situation is this: there is no immediately obvious 'rule' that we
can identify when parsing the document to help us determine whether to
use a plain literal or a typed literal.

For example, let's say we go the route of the hybrid, and decide that
if we see mark-up we use an XML literal, and if we don't we use a plain
literal. This is fine for the example of E = mc<sup>2</sup>, but what
do we do here:

  <tr>
    <td>First name</td>
    <td>Surname</td>
  </tr>
  <tr property="foaf:name">
    <td>Elias</td>
    <td>Torres</td>
  </tr>
  <tr property="foaf:name">
    <td>Ivan</td>
    <td>Herman</td>
  </tr>

In this situation the mark-up is purely structural, and plays no role
in the actual metadata itself.

So the hybrid view is not as good as it first appeared, I'm afraid.
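
To make the problem concrete, here is a minimal Python sketch of the
naive hybrid rule described above. The function name `hybrid_literal`,
the `<wrapper>` element and the (kind, value) return shape are
assumptions made for illustration; none of them come from the draft,
and namespaces are ignored for brevity.

```python
import xml.etree.ElementTree as ET


def hybrid_literal(fragment: str):
    """Naive hybrid rule: any child element means XML literal, else plain.

    `fragment` is the inner content of the element carrying `property`.
    """
    # Wrap the fragment so that mixed content parses as one document.
    root = ET.fromstring(f"<wrapper>{fragment}</wrapper>")
    has_markup = len(root) > 0  # number of direct child elements
    if has_markup:
        return ("XMLLiteral", fragment)
    return ("plain", root.text or "")


# The inline case behaves as intended:
print(hybrid_literal("E = mc<sup>2</sup>"))
# ('XMLLiteral', 'E = mc<sup>2</sup>')

# The table row is also treated as an XML literal, although its
# mark-up is purely structural:
print(hybrid_literal("<td>Elias</td><td>Torres</td>"))
# ('XMLLiteral', '<td>Elias</td><td>Torres</td>')
```

Both calls take the same branch, which is exactly the difficulty: the
rule cannot tell meaningful mark-up from structural mark-up.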

On the two telecons where we have discussed this, we decided that the
choices seemed to be:

 * remove the mark-up completely and parse to plain literals;

 * go with the existing approach of using XML literals;

 * modify the 'hybrid' solution so that different elements have
different behaviour.

It was felt that removing all mark-up (option 1) was a last resort,
and we should try to avoid it if we can, since we'd be losing
information.

It was also felt that having different behaviour based on the mark-up
(option 3) could get too complicated, although I'm not so sure it
would be that difficult; we could say that only inline elements such
as <em>, <sup>, etc. trigger parsing as XML literal, whilst others
don't.
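
A rough sketch of how such an element-sensitive rule might look, in
Python. The contents of `INLINE_ELEMENTS` are an assumed, incomplete
list; which elements would actually count as inline is exactly what
would need to be decided. The function name and the whitespace
normalisation of the plain-literal case are likewise illustrative
assumptions, and namespaces are ignored.

```python
import xml.etree.ElementTree as ET

# Assumed set of elements whose presence is taken to be meaningful.
INLINE_ELEMENTS = {"em", "strong", "sup", "sub"}


def element_aware_literal(fragment: str):
    """Option 3 sketch: only inline elements trigger an XML literal."""
    root = ET.fromstring(f"<wrapper>{fragment}</wrapper>")
    # Collect the tag names of every descendant element.
    tags = {el.tag for el in root.iter() if el is not root}
    if tags & INLINE_ELEMENTS:
        return ("XMLLiteral", fragment)
    # Otherwise drop the mark-up and collapse runs of whitespace.
    text = "".join(root.itertext())
    return ("plain", " ".join(text.split()))


row = """
    <td>Elias</td>
    <td>Torres</td>
"""
print(element_aware_literal(row))
# ('plain', 'Elias Torres')
print(element_aware_literal("E = mc<sup>2</sup>"))
# ('XMLLiteral', 'E = mc<sup>2</sup>')
```

Note that this simplification treats any fragment containing at least
one inline element as an XML literal, even if structural elements are
mixed in; a real rule would have to say what happens in that case.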

And it was generally agreed that although it may have other problems,
the current 'XML literal' approach at least had the benefit of
preserving all mark-up, and so allowing the users of the triples to
decide whether the mark-up was significant or not. (This is easily
done using functions provided in SPARQL.)

So the point we are at is that we need to know whether leaving the
decision to the SPARQL query writer is really such a problem as people
are saying it is, since if it's not, it remains the easiest way to go.
And that's what I thought you'd agreed to look at, Elias.

Regards,

Mark

-- 
  Mark Birbeck, formsPlayer

  mark.birbeck@x-port.net | +44 (0) 20 7689 9232
  http://www.formsPlayer.com | http://internet-apps.blogspot.com

  standards. innovation.

Received on Wednesday, 11 April 2007 13:00:20 UTC