Re: GRDDL and HTML

Harry, hello.

On 2007 Jan 9, at 04.22, Harry Halpin wrote:

>     We've added a use-case to the GRDDL Use-Case document[1] that we
> believe addresses both the use of GRDDL transformations on non-XML HTML
> (i.e., as you rightly pointed out, how it is possible) and then presents
> the case for why XML (XHTML) is preferred.
>
> Tell us if you find it satisfactory!

http://www.w3.org/2001/sw/grddl-wg/doc43/scenario-gallery.htm#html_tidy_use_case

The case I'm pushing is the first one mentioned there, `If the  
tidying is simple (e.g. a <BR> is replaced by a <BR/>)'.  That seems  
to cover the case where the author intends a document to be GRDDLed,  
but for one reason or other (because a tool hiccups, say, or because  
the author wants to target HTML 3.2) they don't produce well-formed XML.
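
To make the `simple tidying' case concrete, here is a minimal Python
sketch. It assumes the only defect is a bare <BR>, and the helper names
(is_well_formed, simple_tidy) are made up for illustration; they are not
part of GRDDL or of any tidying tool.

```python
import re
import xml.etree.ElementTree as ET


def is_well_formed(markup: str) -> bool:
    """Return True if the markup parses as XML."""
    try:
        ET.fromstring(markup)
        return True
    except ET.ParseError:
        return False


def simple_tidy(markup: str) -> str:
    """Close bare <br> tags and nothing else (hypothetical helper)."""
    # Case-insensitive match; the original capitalisation is preserved.
    return re.sub(r"<(br)\s*>", r"<\1/>", markup, flags=re.IGNORECASE)


page = "<html><body><p>line one<BR>line two</p></body></html>"
tidied = simple_tidy(page)
print(is_well_formed(page), is_well_formed(tidied))
```

A real tidier such as JTidy does a good deal more than this; the point is
only that the repair is mechanical and doesn't guess at the author's intent.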

Myself, I'm rather nervous about the suggestion that `The script also  
systematically calls some classic transformations on the document in  
case these were not explicitly referenced in the page'.  That strikes  
me as a tool being too clever for its own good.

It also sits oddly with the use case's talk of `licensing a  
transformation'.  If I don't put in a GRDDL declaration, then I am  
not licensing any transformation at all, and if you happen to find a  
GRDDL script that will produce output, that's nothing to do with me,  
and I can't be held responsible for it.

Perhaps there are three cases here:

1. I (as an author) produce well-formed XML and a GRDDL declaration.   
I license a transformation, and expect/require you to get all of the  
metadata (that is, if there were a CreativeCommons licence statement  
in the GRDDLed RDF, then you can't deny having seen it).

2. I produce mildly ill-formed XML and a GRDDL declaration.  I  
license any transformation, but if you don't bother, or try and fail  
(for one of the reasons mentioned in the use case), then I can't  
object.  You're allowed to rely on any RDF you extract, but if that  
RDF is incorrect, then it's my fault, and I'm still held to it.

3. I produce well- or ill-formed XML and no GRDDL declaration.  You  
can do what you like, but I didn't license the transformation, and  
you can't blame me for any libellous remarks you deduce.  Scraping  
isn't pretty -- I don't see any real need for GRDDL to go this far.

The distinction between (1) and (2) was what I was getting at in the  
suggested text in [1].
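
As a rough sketch of how a tool might tell the three cases apart (Python;
the enum and function names are hypothetical, and reducing `mildly
ill-formed' to a single boolean is a simplifying assumption of mine):

```python
from enum import Enum


class Case(Enum):
    LICENSED_STRICT = 1       # case 1: well-formed XML, GRDDL declaration
    LICENSED_BEST_EFFORT = 2  # case 2: ill-formed XML, GRDDL declaration
    UNLICENSED = 3            # case 3: no GRDDL declaration at all


def classify(well_formed: bool, has_grddl_declaration: bool) -> Case:
    """Map a document's two observable properties to one of the cases."""
    if not has_grddl_declaration:
        # Without a declaration the author licensed nothing,
        # whatever the state of the markup.
        return Case.UNLICENSED
    if well_formed:
        return Case.LICENSED_STRICT
    return Case.LICENSED_BEST_EFFORT


def author_is_responsible(case: Case) -> bool:
    """The author answers for extracted RDF only when a declaration exists."""
    return case is not Case.UNLICENSED
```

Note that the declaration is checked first: well-formedness only matters
once the author has licensed a transformation.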

The `see also' at the end links to JTidy; you might also want to add  
<http://home.ccil.org/~cowan/XML/tagsoup/>

All the best

Norman


[1] http://lists.w3.org/Archives/Public/public-grddl-comments/2006OctDec/0031.html
-- 
----------------------------------------------------------------------------
Norman Gray  /  http://nxg.me.uk
eurovotech.org  /  University of Leicester, UK

Received on Tuesday, 9 January 2007 12:20:32 UTC