Re: [Integrity] Revisit RFC6920 URIs for benefit of servers and Semantic Web applications from Austin William Wright on 2015-03-11 (public-webappsec@w3.org from March 2015)

From: Austin William Wright <aaa@bzfx.net>
Date: Wed, 11 Mar 2015 11:38:05 -0700
To: Devdatta Akhawe <dev.akhawe@gmail.com>
Cc: Semantic Web <semantic-web@w3.org>, "public-webappsec@w3.org" <public-webappsec@w3.org>
Message-ID: <CANkuk-WR+uhouP2aMN+8OsntB=Hx13y4TbBrp2O3s+P=D4jOXw@mail.gmail.com>
Understood that it doesn't need to support everything right from the first
version.

The spec, as currently defined, could be adjusted in minor ways to enhance
forward compatibility. I do not believe these adjustments would add any cost
for user agent implementors.

If RFC6920, for whatever reason, is completely off the table, then I would
suggest changing the format of the token so it doesn't look like a valid
RFC3986 URI, by replacing use of the colon and dash with equals and comma,
respectively:

; proposed changes
option-expression  =  option-name "=" [ option-value ]
hash-expression    = hash-algo "," base64-value

This should be unambiguous, because neither a comma nor an equals sign will
appear in option-name or hash-algo. And the lack of a colon prevents
RFC3986 parsing.
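
A minimal sketch of why the first delimiter is enough to tell the two
productions apart, assuming tokens have already been split on whitespace.
The function name classify_token and the sample tokens are hypothetical and
not taken from the draft:

```python
def classify_token(token):
    """Classify one whitespace-delimited token under the proposed grammar.

    Assumption (from the argument above): neither "," nor "=" occurs in
    option-name or hash-algo, so the first delimiter found decides the kind.
    Any later "=" (for example base64 padding) belongs to the value.
    """
    for i, ch in enumerate(token):
        if ch == "=":
            # option-expression = option-name "=" [ option-value ]
            return ("option", token[:i], token[i + 1:])
        if ch == ",":
            # hash-expression = hash-algo "," base64-value
            return ("hash", token[:i], token[i + 1:])
    raise ValueError("no delimiter in token: %r" % (token,))


print(classify_token("SHA-256,Cg=="))
print(classify_token("ct=text/css"))
```

Note that the base64 padding in "Cg==" does not cause confusion here,
because the comma is seen before any equals sign.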

This enhancement would also allow the TR to give tokens to hashes matching
those in the IANA registry ("SHA-256" instead of "sha256"), further
supporting forward compatibility.

Aside: The ABNF in the current Editor's Draft seems to allow the following
string:

option-name:option-valuesha256-Cg==

i.e. without a WS between option-expression and hash-expression. Am I
reading this correctly? /aside

I don't think RFC6920 needs to be off the table, however; merely define an
ABNF subset compatible with RFC6920 `NI-URI` production. For all intents
and purposes, it would be just a coincidence that it happens to be a valid
NI-URI production (if funny-looking). The only significant difference (not
formatting/parsing related) would seem to be handling of ct= (appearing in
every hash production, instead of just once).
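
For readers unfamiliar with the format, here is a rough sketch of building
an RFC6920 ni: URI (empty authority, sha-256, base64url value without
padding) from the bytes of a resource. The helper name ni_uri is
hypothetical, and the ct= parameter is left out:

```python
import base64
import hashlib


def ni_uri(content):
    """Return an ni: URI (sha-256, no authority) naming the given bytes."""
    digest = hashlib.sha256(content).digest()
    # RFC6920 uses the URL-safe base64 alphabet with the "=" padding removed.
    value = base64.urlsafe_b64encode(digest).rstrip(b"=").decode("ascii")
    return "ni:///sha-256;" + value


print(ni_uri(b"Hello World!"))
```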

I don't see anything wrong with mixing option-expression and NI-URI
productions; that might be an ideal solution to explore. I'm thinking that
every option-expression present must match, and at least one
hash-expression must match.
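
The matching rule suggested above could be sketched as follows. This is
only an illustration under stated assumptions: the ALGOS table, the function
name integrity_ok, and the choice to skip unrecognised hash algorithms are
all hypothetical, and hash values are taken to be standard base64:

```python
import base64
import hashlib

# Hypothetical mapping from token names to hashlib algorithm names.
ALGOS = {"SHA-256": "sha256", "SHA-384": "sha384", "SHA-512": "sha512"}


def integrity_ok(content, metadata, options, hashes):
    """Check a resource against parsed option and hash expressions.

    options: list of (option-name, option-value) pairs
    hashes:  list of (hash-algo, base64-value) pairs
    """
    # Every option-expression that is present must match the resource.
    for name, value in options:
        if metadata.get(name) != value:
            return False
    # At least one hash-expression must match.
    for algo, b64_value in hashes:
        if algo not in ALGOS:
            continue  # assumption: unknown algorithms are ignored
        digest = hashlib.new(ALGOS[algo], content).digest()
        if base64.b64encode(digest).decode("ascii") == b64_value:
            return True
    return False
```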

Finally, the current option-value production prohibits the expression of
arbitrary media types. I regularly use quoted-strings (containing spaces
and special characters) in media types, including in Accept headers, for
instance with JSON Schema, e.g.:

application/json; charset=utf-8; profile="http://example.com/book"

(Note that charset isn't a registered parameter for application/json, which
is utf-8 by definition; but it's not prohibited, and I've had difficulty
getting some Web browsers to play nicely without it.)

The current option-value production appears to prohibit a media type like
this, whereas the ct= named information parameter can carry it in
urlencoded form.
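
As a small illustration, a media type with a quoted-string parameter
survives a round trip through urlencoding. The helper name ct_param is
hypothetical; quote and unquote are from the Python standard library:

```python
from urllib.parse import quote, unquote


def ct_param(media_type):
    """Encode a media type as a ct= query parameter for an ni: URI."""
    # safe="" percent-encodes every reserved character, including "/".
    return "ct=" + quote(media_type, safe="")


media_type = 'application/json; charset=utf-8; profile="http://example.com/book"'
param = ct_param(media_type)
print(param)

# Decoding the value gives back the original media type unchanged.
assert unquote(param[len("ct="):]) == media_type
```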

Thanks,

Austin.

On Wed, Mar 11, 2015 at 9:45 AM, Devdatta Akhawe <dev.akhawe@gmail.com>
wrote:

> Hi Austin
>
> These are all great points, but we are not really trying to address these
> in the first version of SRI. The goal in version 1 is only to be able to
> check the hashes of scripts and links. That said, I tend to agree with you
> that this should be on our radar; do you think the spec, as currently
> defined, makes it impossible to address these concerns in future
> iterations? For example, the parser is intentionally forgiving of formats
> it is not aware of, so as to allow such changes in the future.
>
> cheers
> Dev
>
>
> On Mar 9, 2015 11:26 PM, "Austin William Wright" <aaa@bzfx.net> wrote:
>
>> [CCing swig because I know many of you are also using ni: URIs. Let me
>> know if there's anything to add!]
>>
>> I understand that the ni: URI was removed in a recent update to the SRI
>> draft. I would like to ask this be reconsidered.
>>
>> Using the ni: URI for SRI is important to Semantic Web applications as it
>> allows us to treat the assertion as a link relation. This distinction might
>> not be significant to many user-agents (and thus many on this list), but in
>> Semantic Web applications, especially Web servers, this is of great
>> significance: it allows us to make useful relationships between resources.
>>
>> It also confers benefits to servers and application designers, as RFC6920
>> defines a mapping between ni: URIs and a </.well-known/> URL. In a
>> corporate project under development, we're already using ni: URIs to keep a
>> content-addressable database of files, making them accessible through this
>> mapping. I intend to use Subresource Integrity when serving assets from
>> this store.
>>
>> It also provides an intuitive abstraction: If we think of the ni: URI as
>> identifying a resource (the definition of the URI), the integrity=
>> attribute is performing an assertion: "These two URIs must identify the
>> same information resource, otherwise abort!"
>>
>> For additional support for this use case, I'd like to propose making the
>> "integrity" attribute a link-extension for RFC5988 Web Linking, suitable
>> for use on any declaration of a link.
>>
>> User agents do not need to think of the ni: URI as a URL if they do not
>> need to; they can just follow the ABNF defined in the RFC. There are many cases
>> where URIs are used as identifiers in Web applications; in namespaces [1],
>> schemas (e.g. JSON Schema), DTDs, RDFa [2], and in rel= attributes in all
>> sorts of tags and HTTP headers.
>>
>> Additionally, RFC6920 defines a registry of hashes [3], to ensure forward
>> compatibility (e.g. SHA-3, when standardized later this year). I would like
>> to avoid duplication of effort defining a database of hashes.
>>
>> In short, (1) signing was an explicit goal of the ni: URI, along with
>> other features. Not using ni: would mean servers being unable to take
>> advantage of these other features; and, (2) don't forget about the HTTP
>> Link: header.
>>
>> Thanks for your consideration,
>>
>> Austin Wright.
>>
>> [1] http://www.w3.org/TR/html5/infrastructure.html#xml
>> [2] http://www.w3.org/TR/html-rdfa/
>> [3]
>> http://www.iana.org/assignments/named-information/named-information.xhtml
>>
>>
Received on Wednesday, 11 March 2015 18:38:35 UTC