Re: XHTML-RDFa draft made public from Ivan Herman on 2007-04-11 (public-rdf-in-xhtml-tf@w3.org from April 2007)

From: Ivan Herman <ivan@w3.org>
Date: Wed, 11 Apr 2007 15:16:30 +0200
To: mark.birbeck@x-port.net
Cc: Elias Torres <elias@torrez.us>, Shane McCarron <shane@aptest.com>, public-rdf-in-xhtml-tf@w3.org, HTML WG <w3c-html-wg@w3.org>
Message-ID: <461CDFAE.8080202@w3.org>

Mark,

Mark Birbeck wrote:
> 
> Hi Elias,
> 
>> I have not had time to attend the meetings, let alone do my todo on
>> investigating many ontologies to find the most used. I wasn't worried
>> because I too thought that we were aiming at consensus on the new style
>> and not the current. Where are we/you on this? current or new hybrid
>> approach?
> 
> The situation is this: there is no immediately obvious 'rule' that we
> can identify when parsing the document to help us determine whether to
> use a plain literal or a typed literal.
> 
> For example, let's say we go the route of the hybrid, and decide that
> if we see mark-up we use an XML literal, and if we don't we use a plain
> literal. This is fine for the example of E = mc<sup>2</sup>, but what
> do we do here:
> 
>  <tr>
>    <td>First name</td>
>    <td>Surname</td>
>  </tr>
>  <tr property="foaf:name">
>    <td>Elias</td>
>    <td>Torres</td>
>  </tr>
>  <tr property="foaf:name">
>    <td>Ivan</td>
>    <td>Herman</td>
>  </tr>
> 
> In this situation the mark-up is purely structural, and plays no role
> in the actual metadata itself.
> 

I am not sure I understand. In Ben's review of the issues:

http://lists.w3.org/Archives/Public/public-rdf-in-xhtml-tf/2007Mar/0096.html

this was the datatype="plain" case. I.e., I would write

<tr property="foaf:name" datatype="plain">
 <td>Elias</td>
 <td>Torres</td>
</tr>

that would result in

<....> foaf:name "Elias Torres"

I may be missing something, but this seems to be covered by the hybrid view,
and I do not see any disagreement we might have had on it in the subsequent
thread...
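
A minimal sketch of the datatype="plain" behaviour described above, in
Python for illustration only. It assumes that the plain literal is the
concatenated text content of the element with runs of whitespace collapsed;
that normalisation step and the name plain_literal are assumptions made
for this example, not something taken from the draft.

```python
import xml.etree.ElementTree as ET

# The table row from the example above.
ROW = """<tr property="foaf:name" datatype="plain">
 <td>Elias</td>
 <td>Torres</td>
</tr>"""


def plain_literal(xml_fragment: str) -> str:
    """Drop all mark-up and return the text content as one string.

    Assumption: whitespace between the cells is collapsed to single spaces.
    """
    element = ET.fromstring(xml_fragment)
    text = "".join(element.itertext())  # text of every descendant, in order
    return " ".join(text.split())       # collapse newlines and indentation


print(plain_literal(ROW))  # prints: Elias Torres
```

Without the whitespace step, the literal would also carry the newlines and
indentation that sit between the <td> elements.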

> So the hybrid view is not as good as it first appeared, I'm afraid.
> 
> On the two telecons where we have discussed this, we decided that the
> choices seemed to be:
> 
> * remove the mark-up completely and parse to plain literals;
> 
> * go with the existing approach of using XML literals;
> 
> * modify the 'hybrid' solution so that different elements have
> different behaviour.
> 

You mean different HTML elements? That would be awful. But, as I say, it
looks to me that the hybrid view could work well.

> It was felt that removing all mark-up (option 1) was a last resort,
> and we should try to avoid it if we can, since we'd be losing
> information.
> 
> It was also felt that having different behaviour based on the mark-up
> (option 3) could get too complicated, although I'm not so sure it
> would be that difficult; we could say that only inline elements such
> as <em>, <sup>, etc. trigger parsing as XML literal, whilst others
> don't.
> 
> And it was generally agreed that although it may have other problems,
> the current 'XML literal' approach at least had the benefit of
> preserving all mark-up, and so allowing the users of the triples to
> decide whether the mark-up was significant or not. (This is easily
> done using functions provided in SPARQL.)
> 
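
To make option 3 concrete, here is a rough sketch of the rule Mark outlines,
in Python for illustration only. Only <em> and <sup> come from his message;
the other members of INLINE_ELEMENTS, the function name literal_kind, and
the <span property="dc:title"> wrapper around the formula are hypothetical
choices made for this example.

```python
import xml.etree.ElementTree as ET

# Assumption: only <em> and <sup> are named in the thread; the remaining
# entries are illustrative guesses at what "etc." might cover.
INLINE_ELEMENTS = {"em", "sup", "sub", "strong"}


def literal_kind(xml_fragment: str) -> str:
    """Return "XMLLiteral" if any descendant is an inline element, else "plain"."""
    element = ET.fromstring(xml_fragment)
    descendants = list(element.iter())[1:]  # skip the element itself
    if any(child.tag in INLINE_ELEMENTS for child in descendants):
        return "XMLLiteral"
    return "plain"


FORMULA = '<span property="dc:title">E = mc<sup>2</sup></span>'
ROW = '<tr property="foaf:name"><td>Elias</td><td>Torres</td></tr>'

print(literal_kind(FORMULA))  # prints: XMLLiteral
print(literal_kind(ROW))      # prints: plain
```

Under this rule the purely structural <td> cells no longer force an XML
literal, while the <sup> in the formula still does.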

I do not want to reopen the thread here :-) I guess you know my opinion...

Ivan


-- 

Ivan Herman, W3C Semantic Web Activity Lead
URL: http://www.w3.org/People/Ivan/
PGP Key: http://www.cwi.nl/%7Eivan/AboutMe/pgpkey.html
FOAF: http://www.ivan-herman.net/foaf.rdf

Received on Wednesday, 11 April 2007 13:16:24 UTC