Re: Validation Update: success! from Keith Alexander on 2007-06-25 (public-rdf-in-xhtml-tf@w3.org from June 2007)

From: Keith Alexander <k.j.w.alexander@gmail.com>
Date: Mon, 25 Jun 2007 19:30:22 +0100
To: "Ben Adida" <ben@adida.net>
Cc: Cédric Mesnage <cedric.mesnage@lu.unisi.ch>, RDFa <public-rdf-in-xhtml-tf@w3.org>
Message-ID: <op.tuho51wrzdej1c@keith-alexanders-computer.local>

On Mon, 25 Jun 2007 17:01:05 +0100, Ben Adida <ben@adida.net> wrote:

> I think each route would be an acceptable way to extract triples, and if
> you want all triples, you should probably do both.

The problem is quality and not quantity. If you parse in a different way  
than I intended, you risk getting factually and/or syntactically incorrect  
RDF. So it's important that it be clear whether I am publishing bad RDF,  
or you are interpreting it badly.

>> Personally, I think that if a @profile has actually been declared, it is
>> safer for a triplr/sponger to assume that that's *all* it should parse
>> for - despite other apparent indications.
>
> So you're saying that HTML can have no built-in RDF semantics?

Not can't, but doesn't in its current incarnation (so I think anyway),
because it doesn't say anything about RDF in the HTML spec. So it seems  
unfair on HTML authors to change the rules on them after the fact. As a  
developer, you are always free to screen-scrape of course, but I don't  
think the semantics are built in. I don't think you could get a lot  
further than <document> html:title /html/head/title/text() without running  
into trouble.
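
To make that one safe triple concrete, here is a minimal sketch in Python (standard library only). It assumes the predicate is written as the shorthand string "html:title" used above, and it approximates the /html/head/title/text() path by tracking the head and title elements; the names TitleParser and title_triple are hypothetical, invented for illustration.

```python
from html.parser import HTMLParser


class TitleParser(HTMLParser):
    """Collects the text of <title> when it appears inside <head>."""

    def __init__(self):
        super().__init__()
        self._in_head = False
        self._in_title = False
        self.title_parts = []

    def handle_starttag(self, tag, attrs):
        if tag == "head":
            self._in_head = True
        elif tag == "title" and self._in_head:
            self._in_title = True

    def handle_endtag(self, tag):
        if tag == "title":
            self._in_title = False
        elif tag == "head":
            self._in_head = False

    def handle_data(self, data):
        if self._in_title:
            self.title_parts.append(data)


def title_triple(doc_uri, html):
    """Return (subject, predicate, object) for the document title, or None."""
    parser = TitleParser()
    parser.feed(html)
    title = "".join(parser.title_parts).strip()
    if not title:
        return None
    # "html:title" is the shorthand predicate from the message, not a real URI.
    return (doc_uri, "html:title", title)
```

Anything beyond this (headings, links, tables) would need rules that the HTML spec itself does not supply, which is the point at which a declared profile becomes useful.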

> A GRDDL-only approach to parsing discards so much DOM
> information about where the structure appears that it is simply not a
> sufficient technique for bridging the clickable and semantic webs.

I totally agree with you about this, and I'm by no means saying that RDFa
should merely be a GRDDL profile, but using GRDDL profiles doesn't mean  
that you have to discard the DOM information. GRDDL is important as a  
mechanism for letting authors state precisely what triples they are  
publishing.

> I don't see why there's a problem if the same triples are generated
> twice...

1/ It might not be the same triples - and some of them might be wrong.
2/ In this case, two rdf:IDs with the same value is syntactically  
incorrect RDF/XML.
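
As an illustration of point 2, here is a minimal sketch in Python (standard library only) of how a consumer might notice the clash after combining the output of two extraction routes. It assumes the combined result is a single RDF/XML document with one base URI and ignores xml:base scoping; the function name duplicate_rdf_ids is hypothetical.

```python
import xml.etree.ElementTree as ET
from collections import Counter

RDF_NS = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"
RDF_ID = "{%s}ID" % RDF_NS  # ElementTree's spelling of the rdf:ID attribute


def duplicate_rdf_ids(rdfxml):
    """Return the sorted rdf:ID values that occur more than once."""
    root = ET.fromstring(rdfxml)
    counts = Counter(
        el.attrib[RDF_ID] for el in root.iter() if RDF_ID in el.attrib
    )
    return sorted(value for value, n in counts.items() if n > 1)
```

An empty list means no clash was found under those assumptions; it says nothing about whether the triples themselves are the ones the author intended.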

> if we *have* to choose between a profile and a DTD, then we'd
> go with the DTD because we need things to validate, and it is fairly
> important to define what those new attributes mean semantically.
>

I don't think the profile and the DTD are really (or ought to be) doing
the same thing: the DTD defines the syntax, but how the triples are
generated is defined by the profile. The DTD can let you determine whether
the document is valid against that DTD, but can it define the triple
generation rules and tell you whether the RDF you derive from the document
according to those rules is valid?

I'm inclined to agree with Dan's comments on DTDs not conveying semantics.

Yours,

Keith


Received on Monday, 25 June 2007 18:30:23 UTC