[whatwg] Application deployment

From: Christoph Päper <christoph.paeper@crissov.de>
Date: Sun, 3 Aug 2008 14:35:03 +0200
Message-ID: <B3123FEE-FEFC-4078-8DF9-3C4E6C93D65C@crissov.de>
Robert O'Callahan:
>> http://www.example.com/site.jar#/path/inside/foo.html#heading1
>
> URL parsing doesn't support multiple fragment identifiers

I'm surprised that RFC 3986 (like RFC 2396) keeps '#' reserved inside
fragment identifiers (along with only '[' and ']'). After all, the
fragment ID is terminated only by the end of the URI. The one reason
for disallowing '#' that I can think of is tokenization starting from
the end of the string, but as far as I know that could fail for other
components anyway.

   fragment    = *( pchar / "/" / "?" )
   pchar       = unreserved / pct-encoded / sub-delims / ":" / "@"
   unreserved  = ALPHA / DIGIT / "-" / "." / "_" / "~"
   pct-encoded = "%" HEXDIG HEXDIG
   sub-delims  = "!" / "$" / "&" / "'" / "(" / ")" / "*" / "+" /
                 "," / ";" / "="

<http://www.example.com/site.jar#/path/inside/foo.html%23heading1>  
should work fine, though.
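
As a rough illustration of the point above, here is a minimal Python
sketch (not anything RFC 3986 prescribes). It transcribes the quoted
fragment grammar into a regular expression and percent-encodes the
inner reference so that the nested '#' becomes %23. The helper names
is_valid_fragment and encode_inner_reference are made up for this
example; quote, unquote and urlsplit come from the standard
urllib.parse module.

```python
import re
from urllib.parse import quote, unquote, urlsplit

# RFC 3986: fragment = *( pchar / "/" / "?" ), transcribed as a regex.
# The character class is unreserved + sub-delims + ":" "@" "/" "?".
FRAGMENT_RE = re.compile(r"(?:[A-Za-z0-9\-._~!$&'()*+,;=:@/?]|%[0-9A-Fa-f]{2})*")


def is_valid_fragment(fragment: str) -> bool:
    """Return True if the string matches the fragment production (hypothetical helper)."""
    return FRAGMENT_RE.fullmatch(fragment) is not None


def encode_inner_reference(inner: str) -> str:
    """Percent-encode every character the fragment grammar does not allow literally.

    Hypothetical helper: quote() always leaves the unreserved characters alone;
    the safe list adds "/", "?", ":", "@" and the sub-delims.
    """
    return quote(inner, safe="/?:@!$&'()*+,;=")


# The reference inside the archive, including its own fragment.
inner = "/path/inside/foo.html#heading1"

# A literal second '#' is rejected by the grammar ...
print(is_valid_fragment(inner))

# ... but the percent-encoded form is accepted.
outer = "http://www.example.com/site.jar#" + encode_inner_reference(inner)
parts = urlsplit(outer)
print(outer)
print(is_valid_fragment(parts.fragment))

# A consumer that understands the archive format (an assumption here)
# would decode the fragment and split it again at the first '#'.
inner_path, _, inner_fragment = unquote(parts.fragment).partition("#")
print(inner_path, inner_fragment)
```

The second decoding step is the part that no generic URI parser will
do for you; it has to be defined by whatever handles the archive
type.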

-----8<--------8<--------8<--------8<--------8<--------8<--------8<-----

I'm also surprised that RFC 3986 (unlike RFC 2396) lacks a section on
US-ASCII characters that are deliberately excluded, i.e. <C0> and
'"<>{}|\`^ ', previously also '[]'. I think

   reserved    = gen-delims / sub-delims
   gen-delims  = ":" / "/" / "?" / "#" / "[" / "]" / "@"
   ...

should be something like

   reserved    = delims / enclosing / unwise / controls
   delims      = gen-delims / sub-delims
   enclosing   = DQUOTE / "<" / ">" / SP
   unwise      = "{" / "}" / "|" / "\" / "`" / "^"
   controls    = %x00-1F / %x7F
   ...
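
To check what such a grammar would cover, here is a small Python
sketch that sorts all 128 US-ASCII characters into the classes named
above. The sets for gen-delims, sub-delims and unreserved follow RFC
3986; enclosing, unwise and controls follow the suggestion in this
message and are not part of the RFC. The classify function and the
CLASSES table are names invented for this example.

```python
import string

# Classes defined by RFC 3986.
GEN_DELIMS = set(":/?#[]@")
SUB_DELIMS = set("!$&'()*+,;=")
UNRESERVED = set(string.ascii_letters + string.digits + "-._~")

# Classes suggested in this message (not part of RFC 3986).
ENCLOSING = set('"<> ')                                   # DQUOTE / "<" / ">" / SP
UNWISE = set("{}|\\`^")
CONTROLS = {chr(c) for c in range(0x20)} | {chr(0x7F)}    # %x00-1F / %x7F

CLASSES = {
    "unreserved": UNRESERVED,
    "gen-delims": GEN_DELIMS,
    "sub-delims": SUB_DELIMS,
    "enclosing": ENCLOSING,
    "unwise": UNWISE,
    "controls": CONTROLS,
}


def classify(ch: str) -> str:
    """Return the name of the class containing ch, or "unclassified"."""
    for name, members in CLASSES.items():
        if ch in members:
            return name
    return "unclassified"


# Which US-ASCII characters fall outside every class?
unclassified = [chr(c) for c in range(128) if classify(chr(c)) == "unclassified"]
print(unclassified)
```

With these sets the only character left over is '%', which RFC 3986
handles separately as the introducer of pct-encoded triplets.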
Received on Sunday, 3 August 2008 05:35:03 UTC
