- From: Manu Sporny <msporny@digitalbazaar.com>
- Date: Thu, 20 Nov 2008 17:54:31 -0500
- To: RDFa mailing list <public-rdf-in-xhtml-tf@w3.org>
During the telecon today, the question of how a URL with two fragment
identifiers should be resolved was raised. For example, given the
following URL:
http://example.org/index.xhtml#people#shane
When used as an object in a triple, should the RDFa parser output:
1. <http://example.org/index.xhtml#people#shane>, or
2. <http://example.org/index.xhtml#people>, or
3. <http://example.org/index.xhtml#people%23shane>
RFC 3986 specifically disallows the use of '#' in a fragment
identifier [1]: the 'pchar' set does not contain the '#' character.
However, in Appendix B, the document defines a regular expression for
parsing a URI [2]. That expression matches the fragment part with:
(#(.*))?
Taken literally, this allows any character after the first '#',
including another '#'. Is this a contradiction in the spec? If so, how
do we resolve it?
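To make the tension concrete, here is a minimal Python sketch that runs
the Appendix B expression against the example URL and then checks the
captured fragment against the ABNF from section 3.5. The names URI_RE,
FRAGMENT_ABNF, split_fragment and is_valid_fragment are hypothetical
helpers for illustration only. Note that Appendix B describes its
expression as a way to break a well-formed URI reference into its
components, so it may not be intended as a validator.

```python
import re

# Regular expression copied from RFC 3986, Appendix B.
URI_RE = re.compile(
    r"^(([^:/?#]+):)?(//([^/?#]*))?([^?#]*)(\?([^#]*))?(#(.*))?"
)

# fragment = *( pchar / "/" / "?" ), written out as a character class
# plus the pct-encoded alternative (assumption: ASCII input only).
FRAGMENT_ABNF = re.compile(
    r"(?:[A-Za-z0-9\-._~!$&'()*+,;=:@/?]|%[0-9A-Fa-f]{2})*"
)

def split_fragment(uri_ref):
    """Return (text before the first '#', fragment or None)."""
    m = URI_RE.match(uri_ref)
    fragment = m.group(9)
    if fragment is None:
        return uri_ref, None
    # Group 8 starts at the first '#'.
    return uri_ref[: m.start(8)], fragment

def is_valid_fragment(fragment):
    """True if the fragment matches the section 3.5 grammar."""
    return FRAGMENT_ABNF.fullmatch(fragment) is not None

before, fragment = split_fragment("http://example.org/index.xhtml#people#shane")
print(before, fragment, is_valid_fragment(fragment))
```

The expression happily captures "people#shane" as the fragment, while
the grammar check rejects it, which is exactly the mismatch in question.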
Shane suggested something during the call that seems to be a good
compromise:
Option #1: Translate every '#' after the initial '#' to '%23'
(the percent-encoded hex value for '#'), and translate any
other reserved characters that are not allowed in a fragment
identifier to their %HEX equivalents.
Alternatively, we could copy the value through unchanged and leave it
up to the application:
Option #2: Leave the fragment as-is and pass the double-hashed URL
through to the application to deal with.
If we do Option #1, we will also have to ensure that other reserved
characters are encoded properly. The fragment production is
*( pchar / "/" / "?" ), and pchar already covers the sub-delims plus
":" and "@", so of the reserved set below only "#", "[" and "]" would
have to be encoded:
reserved = gen-delims / sub-delims
gen-delims = ":" / "/" / "?" / "#" / "[" / "]" / "@"
sub-delims = "!" / "$" / "&" / "'" / "(" / ")"
/ "*" / "+" / "," / ";" / "="
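As a rough illustration of Option #1, the Python sketch below keeps the
first '#' as the delimiter and percent-encodes anything in the fragment
that the section 3.5 grammar does not permit. The function name
encode_fragment and the constant FRAGMENT_SAFE are made up for this
example. It assumes that existing '%' characters already begin valid
percent-escapes, so it leaves them alone; a real parser would need to
decide how to treat a stray '%'.

```python
from urllib.parse import quote

# Reserved characters that may appear unescaped in a fragment:
# sub-delims plus ":" "@" "/" "?". "%" is listed so that existing
# percent-escapes are passed through unchanged (an assumption).
FRAGMENT_SAFE = "!$&'()*+,;=:@/?%"

def encode_fragment(uri_ref):
    """Percent-encode disallowed characters after the first '#'."""
    head, sep, fragment = uri_ref.partition("#")
    if not sep:
        # No fragment at all, nothing to do.
        return uri_ref
    # quote() always leaves unreserved characters alone and encodes
    # everything else that is not listed in `safe`.
    return head + "#" + quote(fragment, safe=FRAGMENT_SAFE)

print(encode_fragment("http://example.org/index.xhtml#people#shane"))
```

With this approach the example URL comes out as the third candidate
listed at the top of this message.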
Option #2 would be simpler from an implementation standpoint... but I
can't tell if the spec allows that sort of behavior.
If we choose the percent-encoding approach (Option #1), this is what
TC 119 would become:
-------------------------------------------------------------------
Purpose:
This test ensures that RDFa parsers strip the fragment identifier
from [base] when resolving subjects and objects. It also ensures
that proper URL resolution is performed for URLs with multiple
fragment identifiers.
====================== Test Case 119 =============================
---------------------Test Case 119 XHTML--------------------------
<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML+RDFa 1.0//EN"
"http://www.w3.org/MarkUp/DTD/xhtml-rdfa-1.dtd">
<html xmlns="http://www.w3.org/1999/xhtml"
xmlns:dc="http://purl.org/dc/elements/1.1/">
<head>
<base href="http://www.example.org/tc119.xhtml#fragment"></base>
<title>Test 0119</title>
</head>
<body>
<p>
<div id="#manu" about="#tc-119" rel="dc:contributor"
property="dc:creator"
href="#manu#sporny">Manu Sporny</div>
wrote this test.
</p>
</body>
</html>
-----------------------------------------------------------------
---------------------Test Case 119 SPARQL -----------------------
ASK WHERE {
<http://www.example.org/tc119.xhtml#tc-119>
<http://purl.org/dc/elements/1.1/contributor>
<http://www.example.org/tc119.xhtml#manu%23sporny> .
<http://www.example.org/tc119.xhtml#tc-119>
<http://purl.org/dc/elements/1.1/creator>
"Manu Sporny" .
}
-----------------------------------------------------------------
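For anyone who wants to check the expected triples by hand, here is a
small Python sketch of the resolution the test describes: strip the
fragment from [base], resolve the reference against it, then apply the
Option #1 encoding to the resulting fragment. The function name
resolve_rdfa_uri is hypothetical, and the use of Python's urllib.parse
is just a convenient stand-in for whatever URL library a parser
actually uses.

```python
from urllib.parse import quote, urldefrag, urljoin

# Characters allowed unescaped in a fragment, plus "%" so that
# existing percent-escapes survive (an assumption of this sketch).
FRAGMENT_SAFE = "!$&'()*+,;=:@/?%"

def resolve_rdfa_uri(base, ref):
    """Resolve ref against base with the base fragment removed."""
    base_without_fragment = urldefrag(base).url
    absolute = urljoin(base_without_fragment, ref)
    head, sep, fragment = absolute.partition("#")
    if not sep:
        return absolute
    # Encode any '#' (or other disallowed character) after the first '#'.
    return head + "#" + quote(fragment, safe=FRAGMENT_SAFE)

BASE = "http://www.example.org/tc119.xhtml#fragment"
print(resolve_rdfa_uri(BASE, "#tc-119"))
print(resolve_rdfa_uri(BASE, "#manu#sporny"))
```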
-- manu
[1] http://tools.ietf.org/html/rfc3986#section-3.5
[2] http://tools.ietf.org/html/rfc3986#appendix-B
--
Manu Sporny
President/CEO - Digital Bazaar, Inc.
Received on Thursday, 20 November 2008 22:55:15 UTC