- From: Harry Halpin <hhalpin@ibiblio.org>
- Date: Tue, 27 Feb 2007 12:28:30 -0500
- To: Dave Beckett <dave@dajobe.org>
- Cc: public-grddl-comments@w3.org
Actually, as the author of the offending web-page (sorry, it was hacked together by hand in my attempt to learn Embedded RDF - I'll fix it up and package it up with the VCard/RDF note after we get GRDDL to Last Call...), I think the answer is that Raptor is right and GRDDL.py is off.

The reason is that while DanC correctly notes we underspecified lots of things, we did not underspecify that a GRDDL transformation maps XPath nodes to graphs:

"If an information resource ([WEBARCH] <http://www.w3.org/2004/01/rdxh/spec#WEBARCH>, section 2.2) IR is represented by an XML document with an XPath root node R, and R has a GRDDL transformation with a transformation property TP, and TP applied to R gives an RDF Graph G, then G is a GRDDL result of IR."

I believe that in order to get XPath nodes, one must get an XPath data model:

"XPath operates on the abstract, logical structure of an XML document, rather than its surface syntax. This logical structure, known as the *data model*, is defined in [XQuery/XPath Data Model (XDM)] <http://www.w3.org/TR/xpath20/#datamodel>." [1]

Therefore, if something is not a valid XML document, and Raptor claims that VCard Table is not, then it should not produce any GRDDL results.

However, we do have a use case [2] that shows how tidy can be used to get well-formed XML out of tagsoup, and therefore get the Infoset. That said, the paragraph DanC mentions notes that this should be a feature of the transform itself, although clients may try to do this at their own risk.

[1] http://www.w3.org/TR/xpath20/#id-introduction
[2] http://www.w3.org/2001/sw/grddl-wg/doc43/scenario-gallery.htm#html_tidy_use_case

Dave Beckett wrote:
> http://chatlogs.planetrdf.com/swig/2007-02-10#T03-28-23
> onwards:
>
> <chimezie> .grddl
> http://www.ibiblio.org/hhalpin/homepage/notes/vcardtable.html "SELECT
> ?homeSyn WHERE { ?homeSyn owl:equivalentProperty foaf:homePage }"
>
> <Emeka> Querying against 98 triples
> ...
>
> Raptor failed on this document.
>
> Checking I found:
>
> $ xmllint --valid --noout
> http://www.ibiblio.org/hhalpin/homepage/notes/vcardtable.html
> http://www.ibiblio.org/hhalpin/homepage/notes/vcardtable.html:29: element
> div: validity error : ID v.Address already defined
> w3.org/2006/vcard/ns#Address">v:Address</a></td><td></td><td><div id="v.Address"
> ^
> http://www.ibiblio.org/hhalpin/homepage/notes/vcardtable.html:48: element
> tr: validity error : Element tr content does not follow the DTD, expecting
> (th | td)+, got (td td a td td )
> </tr><tr id="v.url">
> ^
> http://www.ibiblio.org/hhalpin/homepage/notes/vcardtable.html:134: element
> tr: validity error : ID v.role already defined
> </tr><tr id="v.role">
>
>
> However GRDDL.py was generating triples. It was not obvious to me
> that you are assuming the GRDDL process runs in WF-only XML mode.
>
> I shall change Raptor's use of libxml accordingly, if this is
> the case.
>
> Is XML validation of the profile/namespace URIs, XSLT documents
> also ignored? I would assume not, since they are somebody else's
> mime type, spec. RDF/XML aka application/rdf+xml does use validation.
>
> Dave
>
>

--
-harry

Harry Halpin, University of Edinburgh
http://www.ibiblio.org/hhalpin 6B522426
Received on Tuesday, 27 February 2007 17:28:49 UTC
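
The distinction at issue in this thread (a document that parses in well-formed-only mode yet fails DTD validation) can be illustrated with a minimal sketch. It is not how Raptor or GRDDL.py work. It assumes Python's standard-library `xml.etree.ElementTree`, which does not validate against a DTD, and it hand-checks a single validity constraint (unique ID values) by treating every attribute named `id` as ID-typed, as the XHTML DTD does. `SAMPLE`, `TAG_SOUP`, `is_well_formed`, and `duplicate_ids` are hypothetical names introduced only for this illustration.

```python
import xml.etree.ElementTree as ET
from collections import Counter

# Hypothetical sample: well-formed XML that repeats an id value,
# mirroring the "ID v.role already defined" error quoted in the thread.
SAMPLE = """<table>
  <tr id="v.role"><td>first</td></tr>
  <tr id="v.role"><td>second</td></tr>
</table>"""

# Hypothetical tag soup: the tr and td elements are never closed.
TAG_SOUP = "<table><tr><td>unclosed</table>"


def is_well_formed(text):
    """Return True if a non-validating parser accepts the text."""
    try:
        ET.fromstring(text)
        return True
    except ET.ParseError:
        return False


def duplicate_ids(text):
    """Return id values used more than once.

    This stands in for one validity constraint only; it is an
    assumption-laden shortcut, not real DTD validation.
    """
    root = ET.fromstring(text)
    counts = Counter(
        el.get("id") for el in root.iter() if el.get("id") is not None
    )
    return sorted(value for value, n in counts.items() if n > 1)
```

Under these assumptions, `is_well_formed(SAMPLE)` is true while `duplicate_ids(SAMPLE)` reports `v.role`: a WF-only client would go on to build a tree and could apply a transformation, whereas a validating client would reject the same input. `TAG_SOUP` fails even the well-formedness check, which is the case the tidy use case addresses.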