- From: Chimezie Ogbuji <ogbujic@bio.ri.ccf.org>
- Date: Fri, 10 Nov 2006 12:57:33 -0500 (EST)
- To: GRDDL Working Group <public-grddl-wg@w3.org>
- Message-ID: <Pine.GSO.4.60.0611101253520.21289@joplin.bio.ri.ccf.org>
As an exercise, I wrote (from scratch) a GRDDL implementation for RDFLib and 4Suite and ported testHarness.py to work with it. The harness uses RDFLib to process the test manifest and a graph isomorphism mechanism to properly check non-lean graphs for equivalence. Both are attached and get through the test suites (including the RDFa test DanC added recently). I plan on modifying the ported testHarness to output test results using the EARL vocabulary [1].

Below are some notes along the way that I thought were relevant:

## Using the GRDDL source URI as the Base URI ##

The GRDDL source URI is used as the base URI when parsing the source document as well as when parsing the resulting RDF syntax. The APIs for both scenarios allow an explicit base URI to be passed as a parameter. This properly accommodated the use of empty relative URI references within the result of one of the test cases (I forget which). The base was also used to resolve references to transformation URIs (some of which were relative URI references).

## NS Dispatch Termination ##

I set up a list of namespace URIs that are known not to be GRDDLable (to avoid any unnecessary attempt to glean from them). Currently the XHTML namespace is the only item in this list.

In addition, to help avoid circular namespace dispatch processing, the implementation maintains a list of applied transforms. Perhaps it should also maintain a list of visited namespace URIs to avoid that kind of redundancy as well? Is guidance in the spec appropriate for such a scenario?

## Guidance in parsing a GRDDL result (@method or @media-type?) ##

Currently, the implementation keys off xsl:output/@method to determine how it parses the resulting RDF syntax. This seems to provide sufficient coverage but, of course, doesn't accommodate specific mime-types (which can also be specified via xsl:output/@media-type).
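
To make the @method versus @media-type question concrete, here is a minimal sketch (not the attached implementation) that reads both attributes from a stylesheet's xsl:output and reports when they disagree. The function names, the syntax labels ("rdfxml", "n3", "rdfa") and the mapping table are assumptions made up for illustration; they are not RDFLib or 4Suite identifiers, and preferring @media-type over @method is just one possible policy.

```python
import xml.etree.ElementTree as ET

XSL_NS = "http://www.w3.org/1999/XSL/Transform"

# Hypothetical mapping from media type to a syntax label (labels are placeholders).
MEDIA_TYPE_TO_SYNTAX = {
    "application/rdf+xml": "rdfxml",
    "text/n3": "n3",
    # Hypothetical RDFa media type floated in this message, not a registered one.
    "application/xhtml+xml+rdfa": "rdfa",
}

# Syntaxes that are serialized as XML, so @method="xml" is consistent with them.
XML_SYNTAXES = {"rdfxml", "rdfa"}


def output_hints(stylesheet_text):
    """Return (method, media_type) from the top-level xsl:output, or (None, None)."""
    root = ET.fromstring(stylesheet_text)
    out = root.find("{%s}output" % XSL_NS)
    if out is None:
        return None, None
    return out.get("method"), out.get("media-type")


def choose_syntax(method, media_type):
    """Return (syntax, clash).

    Assumed policy: a recognized media type wins over @method, and a clash is
    flagged when @method is 'xml' but the media type names a non-XML syntax.
    """
    by_method = "rdfxml" if method == "xml" else None
    by_media = MEDIA_TYPE_TO_SYNTAX.get(media_type)
    if by_media is None:
        # No usable media type: fall back to @method alone.
        return by_method, False
    clash = method == "xml" and by_media not in XML_SYNTAXES
    return by_media, clash
```

A caller could feed the pair from output_hints() into choose_syntax() and decide separately what to do when the clash flag is set (warn, skip, or trust one attribute); the sketch deliberately leaves that decision open.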
For example, RDFLib has a built-in RDFa parser, but the client needs some *specific* indication of when to try parsing the resulting RDF syntax as RDFa. If there is a specific media-type for RDFa (application/xhtml+xml+rdfa, or some such), the only way to guide the parser appropriately is to use xsl:output/@media-type; otherwise the parser would only know that the result was XML, but not whether it is RDFa or RDF/XML. Currently it will only try to parse a GRDDL result identified as xml (via @method) as RDF/XML.

I guess a more comprehensive approach would be to check the media-type as a secondary indication to @method, but what if they 'clash', i.e., the @method is xml but the media-type is text/n3?

Of course, if the resulting XHTML/RDFa had GRDDL hooks that pointed to an RDFa2RDFXML transform, this would be a non-issue, as the glean process would pick this up (as long as the RDFa/XHTML @method was 'xml').

## Mime-types of GRDDL source URIs ##

The implementation has a (disabled) mechanism for only attempting to parse a GRDDL source URI as XML if the content-type in the HTTP response header is appropriate:

    (?:text|application)/.*\+?xml

Should a glean not be attempted if a GRDDL source document is served as text/plain? The documents in the test suite, for instance, are served as text/plain.

[1] http://www.w3.org/TR/EARL10-Schema/

Chimezie Ogbuji
Lead Systems Analyst
Thoracic and Cardiovascular Surgery
Cleveland Clinic Foundation
9500 Euclid Avenue/ W26
Cleveland, Ohio 44195
Office: (216)444-8593
ogbujic@ccf.org
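
As a footnote to the mime-type notes above, a content-type gate of that kind might look roughly like the sketch below. The pattern is the one quoted in the message; the function name and the handling of parameters such as charset are assumptions for illustration, not the attached code.

```python
import re

# Pattern quoted in the message; re.match anchors it at the start only.
XML_CONTENT_TYPE = re.compile(r"(?:text|application)/.*\+?xml")


def looks_like_xml(content_type_header):
    """Return True if a Content-Type header value names an XML-ish media type."""
    if not content_type_header:
        return False
    # Drop parameters such as '; charset=utf-8' before matching (assumed behavior).
    media_type = content_type_header.split(";", 1)[0].strip().lower()
    return XML_CONTENT_TYPE.match(media_type) is not None
```

With this gate enabled, a source document served as text/plain would be skipped, which is exactly the situation the test suite documents are in.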
Attachments
- TEXT/PLAIN attachment: GRDDL Implementation (4Suite/RDFLib)
- TEXT/PLAIN attachment: test harness - ported to 4Suite/RDFLib
Received on Friday, 10 November 2006 17:57:59 UTC