- From: Julian Reschke <julian.reschke@gmx.de>
- Date: Fri, 21 Nov 2008 07:37:30 +0100
- To: Manu Sporny <msporny@digitalbazaar.com>
- CC: RDFa mailing list <public-rdf-in-xhtml-tf@w3.org>
Manu Sporny wrote:
> During the telecon today, the question of how a URL with two fragment
> identifiers should be resolved was raised. For example, given the
> following URL:
>
> http://example.org/index.xhtml#people#shane
>
> When used as an object in a triple, should the RDFa parser output:
>
> 1. <http://example.org/index.xhtml#people#shane>, or
> 2. <http://example.org/index.xhtml#people>, or
> 3. <http://example.org/index.xhtml#people%23shane>
>
> RFC-3986 specifically disallows the use of '#' in a fragment
> identifier[1]. Note that the 'pchar' set does not contain the '#' character.
>
> However, in Appendix B, the document defines a regular expression for
> parsing a URI[2]. This regular expression specifies the fragment part of
> the regular expression as:
>
> (#(.*))?
>
> This means that any character after a '#' is allowed. Is this a
> contradiction in the spec? If so, how do we resolve it?

No, it's not a contradiction, because the regexp is not normative.

> Shane noted something during the call that seems to be a good compromise.
>
> Option #1: Translating all '#' characters after the initial '#' to '%23'
> (the percent-encoded hex value for '#'). Translating all
> reserved values that are not accepted fragment identifiers
> to their %HEX equivalent.
>
> or we could just do a straight copy-paste up to the application:
>
> Option #2: Leave the fragment as-is and pass it through to the
> application to deal with the double-hashed URL.
> ...

An alternative is to leave the handling unspecified, as the input is invalid.

Best regards, Julian
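
The difference between the non-normative Appendix B regular expression and the normative fragment ABNF discussed above can be illustrated with a small Python sketch. The function names (split_fragment, is_valid_fragment, escape_extra_hashes) are hypothetical and chosen only for this illustration. The last function covers only the '#' to '%23' part of Option #1, not the encoding of other disallowed characters, and it is not taken from any RDFa parser.

```python
import re

# Non-normative URI-splitting regular expression from RFC 3986, Appendix B.
# Group 8 is "#fragment" (with the leading '#'), group 9 is the fragment itself.
URI_SPLIT = re.compile(
    r"^(([^:/?#]+):)?(//([^/?#]*))?([^?#]*)(\?([^#]*))?(#(.*))?"
)

# Normative ABNF from RFC 3986: fragment = *( pchar / "/" / "?" )
# pchar = unreserved / pct-encoded / sub-delims / ":" / "@"
FRAGMENT_ABNF = re.compile(
    r"(?:[A-Za-z0-9\-._~!$&'()*+,;=:@/?]|%[0-9A-Fa-f]{2})*"
)


def split_fragment(uri):
    """Split a URI at the first '#', as the Appendix B regexp does.

    Returns (uri_without_fragment, fragment), where fragment is None
    if the URI contains no '#'.
    """
    m = URI_SPLIT.match(uri)
    if m.group(8) is None:
        return uri, None
    return uri[:m.start(8)], m.group(9)


def is_valid_fragment(fragment):
    """Check a fragment against the normative ABNF production."""
    return FRAGMENT_ABNF.fullmatch(fragment) is not None


def escape_extra_hashes(uri):
    """Sketch of Option #1, limited to '#': keep the first '#' as the
    fragment delimiter and percent-encode every later '#' as '%23'."""
    base, fragment = split_fragment(uri)
    if fragment is None:
        return uri
    return base + "#" + fragment.replace("#", "%23")


uri = "http://example.org/index.xhtml#people#shane"
base, fragment = split_fragment(uri)
print(base)                         # part before the first '#'
print(fragment)                     # everything after the first '#'
print(is_valid_fragment(fragment))  # ABNF check on the raw fragment
print(escape_extra_hashes(uri))     # output form 3 from the question
```

The regexp accepts the second '#' as part of the fragment while the ABNF check rejects it, which is why the two do not contradict each other: only the ABNF defines validity.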
Received on Friday, 21 November 2008 06:38:10 UTC