Proposal to re-open 4.3.3.1 (Re: ACTION-97 - Raise [] issue on public list [Fwd: RFC 2396 + RFC 2732 vs. RFC 3986 (XMLDSIG section 4.3.3.1)])

Thanks Konrad.  Based on your drilling down (and some more of my
own), I'd actually like to propose that we re-open 4.3.3.1; I hope
this message counts as new information.


The URI attribute is defined in the schema to be of the anyURI
primitive type.

[1] defines the mapping of that type to actual URIs by normative
reference to XLink, section 5.4 [2].  Lexically, an anyURI value is
*not* constrained to conform to the grammar in the URI spec.
(However, it must be a URI *after* escaping.)

If you read [2], the language will ring familiar from xmldsig-core's
original material, even though it is not 100% identical to what was
there.  Specifically, in XLink, it is clear that that language
describes how to turn an anyURI value into a URI that conforms to
the URI spec's requirements.
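
To make that mapping concrete, here is a minimal sketch of the escaping step as I read XLink 5.4: non-ASCII characters and the RFC 2396 section 2.4 excluded characters are converted to UTF-8 and written as %HH, while "#", "%", "[" and "]" are left alone. The function name and the exact table of excluded characters are my own illustration, not text from either spec.

```python
# Sketch only: escape an anyURI lexical value into a URI reference,
# following the rule described in XLink section 5.4 (assumed reading).

# ASCII characters excluded by RFC 2396 section 2.4.3 (space, delims, unwise),
# minus "#", "%", "[" and "]", which are not escaped.
EXCLUDED_ASCII = set(' <>"{}|\\^`')


def escape_any_uri(value: str) -> str:
    """Return value with disallowed characters replaced by %HH escapes."""
    out = []
    for ch in value:
        code = ord(ch)
        is_control = code <= 0x1F or code == 0x7F
        if code > 0x7F or is_control or ch in EXCLUDED_ASCII:
            # Convert the character to UTF-8 and escape every resulting byte.
            out.extend(f"%{byte:02X}" for byte in ch.encode("utf-8"))
        else:
            out.append(ch)
    return "".join(out)
```

A string that already conforms to the URI grammar passes through unchanged, which is why applying this step to an existing URI reference achieves nothing.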

There are three issues here:


1. We might have gotten the meaning of some of the language in
4.3.3.1 backwards.

Indeed, if the string that we want to put into a URI attribute is a
URI (reference) as constrained in whatever relevant RFC, then
applying the escaping mechanism described in 4.3.3.1 is deeply
pointless, as that string will already conform to the constraints.

On the other hand, if we want to take a string that is permissible
as an anyURI value, but not permissible according to the URI spec of
the day, and create a URI from it, then the escaping indeed needs to
be applied (leaving aside the oddball remarks about % characters).
However, in this case, there is no point in re-specifying what
schema part 2 already says.

If this interpretation of things is correct, I propose that we
replace the first paragraph of 4.3.3.1 (including the enumerated
list) by the following:

	The URI attribute identifies a data object using a URI
	Reference [URI].  The mapping from this attribute's value to
	a URI reference MUST be performed as specified in part 2,
	section 3.2.17 of [XMLSCHEMA].

Further, I propose striking the following sentence:

 	XML signature applications MUST be able to parse URI syntax. 

... on the grounds that "be able to parse URI syntax" is 
ill-defined, and that more relevant requirements are actually
present where they are needed (specifically, the ability to properly
support certain same-document URI references).

The next sentence should then be changed to read as follows:

	We RECOMMEND that XML signature applications be able to
	dereference URIs in the HTTP scheme. 

2. We might want to bump our normative references to various
recommendations to the current editions of these recommendations.
That includes but is not limited to Schema.

3. We might want to inquire if there are any plans regarding
updating schema's and xlink's normative references for the URI spec
toward RFC 3986, and whether there are any caveats with "consuming"
specs using RFC 3986 at this time.

Thoughts?

1. http://www.w3.org/TR/xmlschema-2/#anyURI
2. http://www.w3.org/TR/2001/REC-xlink-20010627/#link-locators
-- 
Thomas Roessler, W3C  <tlr@w3.org>





On 2007-10-10 00:39:10 +0200, Konrad Lanz wrote:
> From: Konrad Lanz <Konrad.Lanz@iaik.tugraz.at>
> To: XMLSec <public-xmlsec-maintwg@w3.org>
> Date: Wed, 10 Oct 2007 00:39:10 +0200
> Subject: ACTION-97 - Raise [] issue on public list [Fwd: RFC 2396 + RFC 2732  vs. RFC 3986 (XMLDSIG
> 	section 4.3.3.1)]
> List-Id: <public-xmlsec-maintwg.w3.org>
> X-Spam-Level: 
> Archived-At: <http://www.w3.org/mid/470C030E.800@iaik.tugraz.at>
> X-Bogosity: Ham, tests=bogofilter, spamicity=0.000000, version=1.1.5
> 
> ACTION-97,
> has gone to wrong address ....
> 
> -------- Original Message --------
> Subject: 	RFC 2396 + RFC 2732 vs. RFC 3986 (XMLDSIG section 4.3.3.1)
> Date: 	Fri, 28 Sep 2007 04:05:25 +0200
> From: 	Konrad Lanz <Konrad.Lanz@iaik.tugraz.at>
> To: 	public-xmlsec-maintwg-request@w3.org
> References: 	<5661610A-3D6E-4982-9A35-B7208906A763@nokia.com>
> 
> 
> 
> Abstract:
> Discussion on RFC 2396 + RFC 2732 vs. RFC 3986.
> 
> A Potential Conclusion:
> There should be no issue with changing to RFC 3986, with the caveat that
> we may want to allow verification of references whose "fragment-only URI
> reference" actually contains unescaped square brackets.
> 
> 
> Discussion:
> ======================================================================
> XMLDSIG 2002  4.3.3.1
> ======================================================================
> > The URI attribute identifies a data object using a URI-Reference, as
> > specified by RFC2396 [URI]. The set of allowed characters for URI
> > attributes is the same as for XML, namely [Unicode]. However, some
> > Unicode characters are disallowed from URI references including all
> > non-ASCII characters and the excluded characters listed in RFC2396
> > [URI, section 2.4]. However, the number sign (#), percent sign (%),
> > and square bracket characters re-allowed in RFC 2732 [URI-Literal]
> > are permitted.
> 
> 
> RFC 2396
> ========
> 
> fragment      = *uric
> uric          = reserved | unreserved | escaped
> reserved      = ";" | "/" | "?" | ":" | "@" | "&" | "=" | "+" |
>                  "$" | ","
> unreserved    = alphanum | mark
> mark          = "-" | "_" | "." | "!" | "~" | "*" | "'" |
>                  "(" | ")"
> 
> 
> -->
> 
> fragment      = *(
>                  ";" | "/" | "?" | ":" | "@" | "&" | "=" | "+" |
>                  "$" | "," |
>                   alphanum |
>                  "-" | "_" | "." | "!" | "~" | "*" | "'" |
>                  "(" | ")"
>                   )
> 
> -->
> 
> a..zA..Z0..9-._~!$&'()*+,;=/?:@
> 
> 
> XMLDSIG 2002 allowed square brackets ([]), as in RFC 2732.
> 
> 
> RFC 2732
> ========
> 
> > This document incudes an update to the generic syntax for Uniform
> > Resource Identifiers defined in RFC 2396 [URL].  It defines a syntax
> > for IPv6 addresses and allows the use of "[" and "]" within a URI
> > explicitly for this reserved purpose.
> 
>        reserved    = ";" | "/" | "?" | ":" | "@" | "&" | "=" | "+" |
>                      "$" | "," | "[" | "]"
> 
> -->
> 
> fragment      = *(
>                  ";" | "/" | "?" | ":" | "@" | "&" | "=" | "+" |
>                  "$" | "," | "[" | "]" |
>                   alphanum |
>                  "-" | "_" | "." | "!" | "~" | "*" | "'" |
>                  "(" | ")"
>                   )
> 
> -->
> 
> a..zA..Z0..9-._~!$&'()*+,;=/?:@[]
> 
> 
> Although RFC 2732 changed the grammar in a way that allows
> "[" | "]" in the fragment, the prose in RFC 2732 says:
> 
> > It defines a syntax
> > for IPv6 addresses and allows the use of "[" and "]" within a URI
> > explicitly for this reserved purpose.
> 
> 
> That indicates that the prose overrules the grammar, which is also
> consistent with the current RFC 3986 grammar.
> 
> @@@ Konrad Lanz 28. Sept. 2007 @@@:
> > I propose, however, that despite this we allow implementations to dereference fragment-only URI references containing unescaped square brackets, since the grammar in RFC 2732 (as opposed to its prose) would have allowed this.
> 
> 
> XMLDSIG 2002 allowed number sign (#), percent sign (%)
> ======================================================
> Here the only valid interpretation is that the number sign (#) and the
> percent sign (%) are allowed (in their non-percent-encoded form) to
> separate the fragment and to initiate a percent encoding, respectively,
> because RFC 2396 says the following:
> 
> > The character "#" is excluded
> > because it is used to delimit a URI from a fragment identifier in URI
> > references (Section 4). The percent character "%" is excluded because
> > it is used for the encoding of escaped characters.
> 
> This is also consistent with RFC 3986 and the latest draft XMLDSIG 2007.
> 
> 
> +========+ The interpretation above makes the mention of number sign (#)
> |        | and percent sign (%) in 4.3.3.1 redundant.
> | BEWARE | Some implementations may have wrongly interpreted 4.3.3.1
> |        | to allow number sign (#) and percent sign (%) in their
> |        | non-percent-encoded form in the fragment, which however
> |        | contradicts the grammar in RFC 2396 and the prose in
> +========+ RFC 2732 and is inconsistent with RFC 3986.
> 
> If such a misinterpretation caused the production of signatures
> containing an xpointer like the following
> 
> #xpointer(//*[@authenticate='true']) (cf. EBICS-Standard in Germany)
> 
> then that reference does not comply with the grammar in RFC 3986, and the
> interpretation of RFC 2732 above does not allow square brackets in the
> fragment either.
> 
> The correct form would be the following:
> 
> #xpointer(//*%5B@authenticate='true'%5D)
> 
> 
> However, since square brackets wrongly appear to be allowed in fragments
> by the RFC 2732 grammar, but are prohibited by the prose in RFC 2732,
> we may want to allow implementations to verify such signatures and
> advocate against the creation of new signatures that fail to escape the
> gen-delims characters of RFC 3986 (unless they really delimit the
> components of the URI).
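
A rough sketch of what "lenient on verification, strict on generation" could look like, assuming the RFC 3986 fragment grammar quoted further down in this message. Both function names are hypothetical and not taken from any implementation.

```python
import re

# Characters RFC 3986 permits in a fragment, besides pct-encoded triplets:
# unreserved, sub-delims, ":", "@", "/" and "?".
_FRAGMENT_CHAR = r"[A-Za-z0-9\-._~!$&'()*+,;=:@/?]"
_FRAGMENT_RE = re.compile(rf"(?:{_FRAGMENT_CHAR}|%[0-9A-Fa-f]{{2}})*")


def is_rfc3986_fragment(fragment: str) -> bool:
    """True if fragment (without the leading '#') matches RFC 3986."""
    return _FRAGMENT_RE.fullmatch(fragment) is not None


def escape_brackets(reference: str) -> str:
    """Percent-encode '[' and ']' in the fragment part of a URI reference.

    Only the text after the first '#' is touched, so an IPv6 literal in
    the authority component would be left alone.
    """
    head, sep, fragment = reference.partition("#")
    fragment = fragment.replace("[", "%5B").replace("]", "%5D")
    return head + sep + fragment
```

A signer would run escape_brackets before writing the URI attribute; a verifier could accept the unescaped form as well and treat the two as equivalent.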
> 
> 
> The text in the current draft correctly follows RFC 3986, but maybe we
> would like to add a note pointing to this mail.
> 
> ======================================================================
> XMLDSIG 2007 4.3.3.1
> ======================================================================
> 
> RFC 3986
> 
> fragment      = *( pchar / "/" / "?" )
> pchar         = unreserved / pct-encoded / sub-delims / ":" / "@"
> unreserved    = ALPHA / DIGIT / "-" / "." / "_" / "~"
> sub-delims    = "!" / "$" / "&" / "'" / "(" / ")"
>                   / "*" / "+" / "," / ";" / "="
> 
> -->
> 
> fragment      = *( pct-encoded / ALPHA / DIGIT / "-" / "." / "_" / "~"
>                   / "!" / "$" / "&" / "'" / "(" / ")"
>                   / "*" / "+" / "," / ";" / "="
>                   / ":" / "@" / "/" / "?" )
> 
> -->
> 
> a..zA..Z0..9-._~!$&'()*+,;=/?:@
> 
> 
> ==>
> 
> The allowed characters are equal using the interpretation in this mail.
> 
> RFC 2396 fragment chars are : a..zA..Z0..9-._~!$&'()*+,;=/?:@
> RFC 3986 fragment chars are : a..zA..Z0..9-._~!$&'()*+,;=/?:@
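
The equality claimed above can be checked mechanically. The sketch below builds the literal fragment characters from the three grammars as quoted in this message (percent-escapes are left out on both sides); the constant names are illustrative only.

```python
import string

ALNUM = set(string.ascii_letters + string.digits)

# RFC 2396: fragment = *uric; uric = reserved | unreserved | escaped
RFC2396_RESERVED = set(";/?:@&=+$,")
RFC2396_MARK = set("-_.!~*'()")
RFC2396_FRAGMENT = RFC2396_RESERVED | ALNUM | RFC2396_MARK

# RFC 2732 adds "[" and "]" to the reserved set.
RFC2732_FRAGMENT = RFC2396_FRAGMENT | set("[]")

# RFC 3986: fragment = *( pchar / "/" / "?" )
RFC3986_UNRESERVED = ALNUM | set("-._~")
RFC3986_SUB_DELIMS = set("!$&'()*+,;=")
RFC3986_FRAGMENT = RFC3986_UNRESERVED | RFC3986_SUB_DELIMS | set(":@/?")

# Differences between the character sets, sorted for readable output.
only_in_2396 = sorted(RFC2396_FRAGMENT - RFC3986_FRAGMENT)
only_in_2732 = sorted(RFC2732_FRAGMENT - RFC3986_FRAGMENT)
print("RFC 2396 minus RFC 3986:", only_in_2396)
print("RFC 2732 minus RFC 3986:", only_in_2732)
```

Under these definitions the RFC 2396 and RFC 3986 sets coincide, and the RFC 2732 grammar differs from RFC 3986 only by the two square brackets.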
> 
> 
> regards
> 
> Konrad Lanz
> 
> P.S.: Non-percent-encoded Unicode characters that can live in URI
> references inside XML are disjoint from the set of characters in
> the RFC 2396 and RFC 3986 grammars and hence do not need to be
> discussed here further.
> 
> -- 
> Konrad Lanz, IAIK/SIC - Graz University of Technology
> Inffeldgasse 16a, 8010 Graz, Austria
> Tel: +43 316 873 5547
> Fax: +43 316 873 5520
> https://www.iaik.tugraz.at/aboutus/people/lanz
> http://jce.iaik.tugraz.at
> 
> Certificate chain (including the EuroPKI root certificate):
> https://europki.iaik.at/ca/europki-at/cert_download.htm
> 
> 

> begin:vcard
> fn:Konrad Lanz
> n:Lanz;Konrad
> org:IAIK/SIC;Java Security
> adr:http://jce.iaik.tugraz.at;;Inffeldgasse 16a;Graz;Styria;8010;Austria
> email;internet:Konrad.Lanz@iaik.tugraz.at
> title:Bakk.techn.,BSc Hons
> tel;work:+43 316 873 5547
> x-mozilla-html:FALSE
> url:http://www.iaik.tu-graz.ac.at/aboutus/people/lanz
> version:2.1
> end:vcard
> 

Received on Tuesday, 16 October 2007 15:13:04 UTC