
"Web Address processing" (ABNF, processing proposal)

From: Julian Reschke <julian.reschke@gmx.de>
Date: Fri, 12 Feb 2010 13:51:32 +0100
Message-ID: <4B754ED4.7030107@gmx.de>
To: "public-iri@w3.org" <public-iri@w3.org>

Hi,

I was looking at 
<http://tools.ietf.org/html/draft-ietf-iri-3987bis-00#section-7.2>, 
starting with the ABNF:

      href-ucschar  = " " / "<" / ">" / '"' / "{" / "}" / "|"
                       / "\" / "^" / "`" / %x0-1F / %x7F-D7FF
                       / %xE000-FFFD / %x10000-10FFFF
      href-pct-form = pct-encoded | "%"
      href-path-sep = "/" | "\"
      href-strip    =

Nits:

- it mixes RFC2616- and RFC5234-style ABNF ("|" vs "/")

- '"' doesn't work in RFC 5234 syntax, it needs to be the character 
code, or DQUOTE

- href-strip is undefined: it's not clear to me that it's actually going 
to be used (more below)

If we adopt the RFC 5234 predefined rules, href-ucschar can be rewritten as:

  CTL / SP / DQUOTE / "<" / ">" / "\" / "^" / "`" / "{" / "|" / "}" / 
%x80-D7FF / %xE000-FFFD / %x10000-10FFFF

...we might even want to name the production for

  %x80-D7FF / %xE000-FFFD / %x10000-10FFFF

globally.
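
As a sanity check on the rewrite above, here is a minimal Python sketch that compares the two productions as sets of code points. It assumes the last range is meant to be %x10000-10FFFF, as in the draft's own ABNF; the names (expand, singles, same_language, DRAFT, REWRITTEN) are made up for this illustration and do not come from the draft.

```python
def expand(ranges):
    """Expand inclusive (lo, hi) code point pairs into a set of ints."""
    points = set()
    for lo, hi in ranges:
        points.update(range(lo, hi + 1))
    return points


def singles(chars):
    """Turn each character into a one-element (lo, hi) range."""
    return [(ord(c), ord(c)) for c in chars]


# href-ucschar as written in the draft (section 7.2).
DRAFT = singles(' <>"{}|\\^`') + [
    (0x00, 0x1F),
    (0x7F, 0xD7FF),
    (0xE000, 0xFFFD),
    (0x10000, 0x10FFFF),
]

# RFC 5234 core rules: CTL = %x00-1F / %x7F, SP = %x20, DQUOTE = %x22.
CTL = [(0x00, 0x1F), (0x7F, 0x7F)]
SP = [(0x20, 0x20)]
DQUOTE = [(0x22, 0x22)]

# Candidate for a globally named production (assumed upper range).
NON_ASCII = [(0x80, 0xD7FF), (0xE000, 0xFFFD), (0x10000, 0x10FFFF)]

# The rewritten production using the predefined rules.
REWRITTEN = CTL + SP + DQUOTE + singles('<>\\^`{|}') + NON_ASCII


def same_language(a, b):
    """True if both range lists match exactly the same code points."""
    return expand(a) == expand(b)


if __name__ == "__main__":
    print(same_language(DRAFT, REWRITTEN))
```

Under that assumption the two lists cover the same code points; a final range starting at %x100000 instead would leave out most of the supplementary planes.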


Moving away from editorial issues:

I'd really like to discuss whether we can collapse more of LEIRI and 
HREF into a single definition.

- the ABNFs do not look different (yet)

- preprocessing (dropping leading and trailing whitespace) IMHO doesn't 
need to be part of the definition of the protocol element

- preprocessing (stripping certain characters): is this really needed? 
Not convinced about that.

This would leave us with:

- special handling of non-ASCII characters in the query part

...which should be manageable.

Best regards, Julian
Received on Friday, 12 February 2010 12:52:15 GMT
