- From: Manu Sporny <msporny@digitalbazaar.com>
- Date: Thu, 20 Nov 2008 17:54:31 -0500
- To: RDFa mailing list <public-rdf-in-xhtml-tf@w3.org>
During the telecon today, the question of how a URL with two fragment identifiers should be resolved was raised. For example, given the following URL:

http://example.org/index.xhtml#people#shane

When used as an object in a triple, should the RDFa parser output:

1. <http://example.org/index.xhtml#people#shane>, or
2. <http://example.org/index.xhtml#people>, or
3. <http://example.org/index.xhtml#people%23shane>

RFC 3986 specifically disallows the use of '#' in a fragment identifier [1]. Note that the 'pchar' set does not contain the '#' character. However, in Appendix B, the document defines a regular expression for parsing a URI [2]. The fragment part of that regular expression is:

(#(.*))?

This means that any character after the first '#' is accepted. Is this a contradiction in the spec? If so, how do we resolve it?

Shane noted something during the call that seems to be a good compromise.

Option #1: Translate every '#' character after the initial '#' to '%23' (the percent-encoded hex value for '#'), and translate every other reserved character that is not allowed in a fragment identifier to its %HEX equivalent.

Or we could copy the value through unchanged and leave it up to the application:

Option #2: Leave the fragment as-is and pass it through to the application to deal with the double-hashed URL.

If we do Option #1, we will also have to ensure that other reserved characters are encoded properly. The gen-delims that are valid in a fragment ID (":", "@", "?", "/") can stay as they are, and the sub-delims are also allowed there via 'pchar'. The remaining reserved characters ("#", "[", "]") would have to be encoded:

reserved   = gen-delims / sub-delims
gen-delims = ":" / "/" / "?" / "#" / "[" / "]" / "@"
sub-delims = "!" / "$" / "&" / "'" / "(" / ")"
           / "*" / "+" / "," / ";" / "="

Option #2 would be simpler from an implementation standpoint, but I can't tell if the spec allows that sort of behavior.
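To make Option #1 concrete, here is a minimal Python sketch of what a parser might do. It is only an illustration under stated assumptions, not something taken from the RDFa spec: the function names (encode_fragment, normalize_uri) are made up for this example, the splitting uses the Appendix B regular expression [2], and the set of characters left unescaped follows the fragment rule in RFC 3986 [1] (unreserved, sub-delims, ":", "@", "/", "?"). It also assumes that any '%' already present starts a valid percent-escape and leaves it alone.

```python
import re
from urllib.parse import quote

# Generic URI-splitting regex from RFC 3986, Appendix B.
# Group 8 is "#fragment" (with the hash), group 9 is the fragment itself.
URI_RE = re.compile(
    r"^(([^:/?#]+):)?(//([^/?#]*))?([^?#]*)(\?([^#]*))?(#(.*))?"
)

# Characters left unescaped inside a fragment:
#   fragment = *( pchar / "/" / "?" )
#   pchar    = unreserved / pct-encoded / sub-delims / ":" / "@"
# quote() never escapes letters, digits, "-", "." and "_"; "~" is listed
# explicitly. "%" is kept on the ASSUMPTION that it begins an existing escape.
FRAGMENT_SAFE = "~!$&'()*+,;=:@/?%"


def encode_fragment(fragment: str) -> str:
    """Percent-encode characters that the fragment ABNF does not allow."""
    return quote(fragment, safe=FRAGMENT_SAFE)


def normalize_uri(uri: str) -> str:
    """Apply Option #1: escape everything after the first '#' as needed."""
    match = URI_RE.match(uri)  # every part is optional, so this always matches
    fragment = match.group(9)
    if fragment is None:
        return uri  # no fragment, nothing to do
    before_hash = uri[: match.start(8)]
    return before_hash + "#" + encode_fragment(fragment)


print(normalize_uri("http://example.org/index.xhtml#people#shane"))
# prints: http://example.org/index.xhtml#people%23shane
```

Option #2 would amount to returning the input unchanged, which is why it is the simpler of the two to implement.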
If we choose to do the percent-encoded hex value, this is what TC 119 would become:

-------------------------------------------------------------------
Purpose: This test ensures that RDFa parsers strip the fragment identifier from [base] when resolving subjects and objects. It also ensures that proper URL resolution is performed for URLs with multiple fragment identifiers.

====================== Test Case 119 =============================
---------------------Test Case 119 XHTML--------------------------
<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML+RDFa 1.0//EN"
 "http://www.w3.org/MarkUp/DTD/xhtml-rdfa-1.dtd">
<html xmlns="http://www.w3.org/1999/xhtml"
      xmlns:dc="http://purl.org/dc/elements/1.1/">
  <head>
    <base href="http://www.example.org/tc119.xhtml#fragment"></base>
    <title>Test 0119</title>
  </head>
  <body>
    <p>
      <div id="#manu"
           about="#tc-119"
           rel="dc:contributor"
           property="dc:creator"
           href="#manu#sporny">Manu Sporny</div>
      wrote this test.
    </p>
  </body>
</html>
-----------------------------------------------------------------
---------------------Test Case 119 SPARQL -----------------------
ASK WHERE {
  <http://www.example.org/tc119.xhtml#tc-119>
    <http://purl.org/dc/elements/1.1/contributor>
    <http://www.example.org/tc119.xhtml#manu%23sporny> .
  <http://www.example.org/tc119.xhtml#tc-119>
    <http://purl.org/dc/elements/1.1/creator>
    "Manu Sporny" .
}
-----------------------------------------------------------------

-- manu

[1] http://tools.ietf.org/html/rfc3986#section-3.5
[2] http://tools.ietf.org/html/rfc3986#appendix-B

--
Manu Sporny
President/CEO - Digital Bazaar, Inc.
blog: POSIX Threads Don't Scale Past 100K Concurrent Web Requests
http://blog.digitalbazaar.com/2008/09/30/scaling-webservices-part-1
blog: Fibers are the Future: Scaling Past 100K Concurrent Web Requests
http://blog.digitalbazaar.com/2008/10/21/scaling-webservices-part-2
Received on Thursday, 20 November 2008 22:55:15 UTC