Re: GRDDL from Julian Reschke on 2008-08-06 (public-html@w3.org from August 2008)

From: Julian Reschke <julian.reschke@gmx.de>
Date: Wed, 06 Aug 2008 10:56:18 +0200
To: Henri Sivonen <hsivonen@iki.fi>
CC: Toby A Inkster <tai@g5n.co.uk>, Justin James <j_james@mindspring.com>, public-html@w3.org
Message-ID: <48996732.4050806@gmx.de>

Henri Sivonen wrote:
> ...
>> (well, except for the hook pointing to the transform, if you want to 
>> count that).
> 
> It seems to me that it would be more useful for the extractor to contain 
> a catalog of transforms keying off well-known Content-Types (or root 
> namespaces for */*+xml types) and targeting the kind of RDF vocabularies 
> that the user of the extractor is interested in, since under such a 
> model the extractor would work even when the author isn't providing the 
> hook. There will always be more HTML pages labeled text/html without 
> GRDDL hooks than HTML pages with GRDDL hooks.

Yes. So that would be useful as well.
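
As a rough sketch of what such a catalog could look like (the media types 
and namespaces are real, but the transform file names and the function 
names are hypothetical and only there for illustration):

```python
import xml.etree.ElementTree as ET

# Hypothetical catalog: the keys are real media types and namespaces, but the
# transform file names are made up for illustration.
BY_CONTENT_TYPE = {
    "text/html": "html-to-rdf.xsl",
    "application/xhtml+xml": "xhtml-to-rdf.xsl",
}
BY_ROOT_NAMESPACE = {
    "http://www.w3.org/2005/Atom": "atom-to-rdf.xsl",
    "http://www.w3.org/2000/svg": "svg-to-rdf.xsl",
}


def root_namespace(xml_text):
    """Return the namespace URI of the root element, or None if it has none."""
    tag = ET.fromstring(xml_text).tag  # ElementTree spells it '{ns}local'
    return tag[1:].split("}", 1)[0] if tag.startswith("{") else None


def pick_transform(content_type, body):
    """Choose a transform without relying on any hook in the document."""
    media_type = content_type.split(";", 1)[0].strip().lower()
    if media_type in BY_CONTENT_TYPE:
        return BY_CONTENT_TYPE[media_type]
    # For generic XML types, fall back to the root element's namespace.
    if media_type.endswith("+xml") or media_type in ("application/xml", "text/xml"):
        return BY_ROOT_NAMESPACE.get(root_namespace(body))
    return None
```

The lookup never inspects the document for an author-provided hook, which 
is why it would also cover the pages that don't have one.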

> ...
>>> I'm not sure if abusing HTML is the right characterization, but the 
>>> GRDDL setup violates the Rule of Least Power TAG Finding.
>>> http://www.w3.org/2001/tag/doc/leastPower
>>
>> I'm not sure how using XSLT 1.0 violates that finding (please elaborate);
> 
> XSLT is in a more powerful language category than (scriptless) HTML or 
> any of the notations for RDF triples.
> ...

That's true. But how do you use RDF triples or scriptless HTML to 
extract RDF out of (X)HTML?

>> but it's interesting to see TAG findings quoted here.
> 
> My motivation to bring this up is pointing out that it's not only the 
> browsable Web (aka. the Web) that violates Architecture but the Semantic 
> Web violates it as well, which hopefully puts HTML5's Architecture 
> violations into perspective down the road when we will no doubt be 
> discussing Architecture violations.

I'm sure that we'll find violations of TAG findings outside the 
"browsable web" if we look for them. I disagree that it is the case here.

...
>>> Would it be an abuse of SVG if an SVG image wasn't served directly, 
>>> but instead a script that fetched the SVG file using XHR and rendered 
>>> it to <canvas> was served?
>>
>> Yes, that would be bad.
>>
>> Not sure what your point is, though.
> 
> If "run this PostScript/JavaScript program to see an image" (as opposed 
> to serving an image that isn't a program) is "bad", surely "run this 
> XSLT program to get triples" is also "bad".

RDF is not XML, nor HTML. Thus, if you have data in XML or HTML, and 
also want RDF, you can:

- Extend the host language so that RDF can be embedded, and extracting 
RDF is a simple process that doesn't require a transformation/scripting 
language. RDFa seems to be that solution.

- Extend the host language so that data inherent to that language can be 
transformed *to* RDF in a generic way (using a mapping language). That 
seems to be GRDDL.

- Document the mapping between the host language and RDF, do not touch 
the host language, and have transformers for each of the languages, 
triggered by content type, doctype, or XML namespace.

- Serve both (in which case any of the options mentioned above could be 
applied server-side).

All of these do work and have advantages and disadvantages.

The advantage of GRDDL is that it's flexible and scales. The downside is 
that it requires applying XSLT. You may call that "bad"; I wouldn't.
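
To make the trade-off concrete: in (X)HTML the GRDDL hook is just the 
GRDDL profile on <head> plus a link with rel="transformation", and the 
cost is running the XSLT it points to. A minimal sketch of finding that 
hook follows. The class and function names are hypothetical, and since 
Python's standard library has no XSLT processor, the sketch stops at 
locating the transform and does not apply it.

```python
from html.parser import HTMLParser

# Profile URI that marks a document as carrying GRDDL transformation links.
GRDDL_PROFILE = "http://www.w3.org/2003/g/data-view"


class GrddlHookFinder(HTMLParser):
    """Collect rel="transformation" links and note the GRDDL profile."""

    def __init__(self):
        super().__init__()
        self.has_profile = False
        self.transforms = []

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "head":
            # The profile attribute is a space-separated list of URIs.
            profiles = (attrs.get("profile") or "").split()
            self.has_profile = GRDDL_PROFILE in profiles
        elif tag in ("link", "a"):
            # rel is a space-separated list of link types.
            rels = (attrs.get("rel") or "").lower().split()
            if "transformation" in rels and attrs.get("href"):
                self.transforms.append(attrs["href"])


def find_transforms(markup):
    """Return transform URLs, or [] if the GRDDL profile is absent."""
    finder = GrddlHookFinder()
    finder.feed(markup)
    finder.close()
    return finder.transforms if finder.has_profile else []
```

A document without the profile yields an empty list here, which is the 
case where only a catalog-style extractor would still produce triples.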

That being said, I'd be excited if we could make progress with the 
RDFa-in-HTML issue (which *is* in our charter).

BR, Julian

Received on Wednesday, 6 August 2008 08:57:08 UTC