Re: [Integrity] Revisit RFC6920 URIs for benefit of servers and Semantic Web applications from Devdatta Akhawe on 2015-03-12 (public-webappsec@w3.org from March 2015)

From: Devdatta Akhawe <dev.akhawe@gmail.com>
Date: Wed, 11 Mar 2015 20:35:14 -0700
To: Austin William Wright <aaa@bzfx.net>
Cc: Semantic Web <semantic-web@w3.org>, "public-webappsec@w3.org" <public-webappsec@w3.org>
Message-ID: <CAPfop_0N7E4T=keK=LwpEehHJ6hrjYSPN3QTc4sRE0FHZb03yQ@mail.gmail.com>
Hi Austin

We should definitely fix the whitespace issue. Re the proposed changes: we
want to reuse the deployed CSP hash-src format, so it is unlikely that we
can change that format. I don't see that it explicitly makes it impossible
to use RFC6920 URIs in the future, if we deem that necessary. An ni:// URI
can just be a token without spaces, and we can define its semantics as:
ignore any content type specified earlier if one exists in the ni:// URI.
While not ideal, there is value in reusing the CSP hash-src format.
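
To make the idea concrete, here is a rough sketch (not spec text) of a
forgiving parser that accepts both hash-src style tokens and ni: URIs in the
same attribute value. The function name parse_integrity, the "type" option
name, and the returned dictionary layout are assumptions made only for this
illustration; the "name:value" option shape is taken from the example in
Austin's aside quoted below.

```python
from urllib.parse import parse_qs, urlsplit


def parse_integrity(metadata: str) -> dict:
    """Split integrity metadata into options and (algorithm, value) hashes.

    Hypothetical sketch: tokens are whitespace separated, unknown tokens are
    skipped, and a ct= parameter on an ni: URI overrides an earlier
    content-type option.
    """
    result = {"options": {}, "hashes": []}
    for token in metadata.split():
        if token.startswith("ni:"):
            # RFC 6920 shape: ni://authority/alg;value?ct=media-type
            parts = urlsplit(token)
            alg, _, value = parts.path.lstrip("/").partition(";")
            result["hashes"].append((alg, value))
            ct = parse_qs(parts.query).get("ct")
            if ct:
                # Assumed semantics: ct= in the URI wins over an earlier option.
                result["options"]["type"] = ct[0]
        elif ":" in token:
            # option-expression, e.g. type:text/css (option name is assumed)
            name, _, value = token.partition(":")
            result["options"][name] = value
        elif "-" in token:
            # CSP hash-src style, e.g. sha256-Cg==
            alg, _, value = token.partition("-")
            result["hashes"].append((alg, value))
        # Anything else is ignored, in keeping with a forgiving parser.
    return result
```

With this sketch, "type:text/css ni:///sha-256;abc?ct=text/javascript" would
yield a single hash and a content type of text/javascript, since the ct=
parameter replaces the earlier option.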

cheers
Dev

On 11 March 2015 at 11:38, Austin William Wright <aaa@bzfx.net> wrote:

> Understood that it doesn't need to support everything right from the first
> version.
>
> The spec, as currently defined, could make minor adjustments to enhance
> forward compatibility, that I do not believe would add additional cost to
> user agent implementors.
>
> If RFC6920, for whatever reason, is completely off the table, then I would
> suggest changing the format of the token so it doesn't look like a valid
> RFC3986 URI, by replacing use of the colon and dash with equals and comma,
> respectively:
>
> ; proposed changes
> option-expression  =  option-name "=" [ option-value ]
> hash-expression    = hash-algo "," base64-value
>
> This should be unambiguous, because neither a comma nor an equals sign can
> appear in option-name or hash-algo. And the lack of a colon prevents
> RFC3986 parsing.
>
> This enhancement would also allow the TR to give tokens to hashes matching
> those in the IANA registry ("SHA-256" instead of "sha256"), further
> supporting forward compatibility.
>
> Aside: The ABNF in the current Editor's Draft seems to allow the following
> string:
>
> option-name:option-valuesha256-Cg==
>
> i.e. without a WS between option-expression and hash-expression. Am I
> reading this correctly? /aside
>
> I don't think RFC6920 needs to be off the table, however; merely define an
> ABNF subset compatible with RFC6920 `NI-URI` production. For all intents
> and purposes, it would be just a coincidence that it happens to be a valid
> NI-URI production (if funny-looking). The only significant difference (not
> formatting/parsing related) would seem to be handling of ct= (appearing in
> every hash production, instead of just once).
>
> I don't see anything wrong with mixing option-expression and NI-URI
> productions; that might be an ideal solution to explore. I'm thinking all
> option-expressions must be matched, if any, and at least one
> hash-expression must be matched.
>
> Finally, the current option-value production prohibits the expression of
> arbitrary media types. I regularly use quoted-strings (containing spaces
> and special characters) in media types, including in Accept headers, for
> instance with JSON Schema, e.g.:
>
> application/json; charset=utf-8; profile="http://example.com/book"
>
> (Note charset isn't a registered parameter for application/json, utf-8 is
> by definition, but it's not prohibited, and I've had difficulty getting
> some Web browsers to play nicely without it.)
>
> The current option-value production appears to prohibit a media type like
> this, whereas the ct= named information parameter uses URL encoding and
> so can carry it.
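
As a small illustration of the point above, the media type from Austin's
example can be percent-encoded so that it contains no spaces or quotes and
then recovered unchanged. The helper names encode_ct and decode_ct are
hypothetical; only the ct= parameter name comes from RFC 6920.

```python
from urllib.parse import quote, unquote

MEDIA_TYPE = 'application/json; charset=utf-8; profile="http://example.com/book"'


def encode_ct(media_type: str) -> str:
    """Return a ct= parameter with every reserved character percent-encoded."""
    return "ct=" + quote(media_type, safe="")


def decode_ct(param: str) -> str:
    """Recover the media type from a ct= parameter produced by encode_ct."""
    _, _, value = param.partition("=")
    return unquote(value)


encoded = encode_ct(MEDIA_TYPE)
```

The encoded form has no whitespace, so it survives a whitespace-delimited
token format, and decoding returns the original string including the quoted
profile parameter.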
>
> Thanks,
>
> Austin.
>
> On Wed, Mar 11, 2015 at 9:45 AM, Devdatta Akhawe <dev.akhawe@gmail.com>
> wrote:
>
>> Hi Austin
>>
>> These are all great points, but we are not really trying to address these
>> in the first version of SRI. The goal in version 1 is only to be able to
>> check the hashes of scripts and links. That said, I tend to agree with you
>> that this should be on our radar; do you think the spec, as currently
>> defined, makes it impossible to address these concerns in future
>> iterations? For example, the parser is intentionally forgiving of formats
>> it is not aware of, so as to allow such changes in the future.
>>
>> cheers
>> Dev
>>
>>
>> On Mar 9, 2015 11:26 PM, "Austin William Wright" <aaa@bzfx.net> wrote:
>>
>>> [CCing swig because I know many of you are also using ni: URIs. Let me
>>> know if there's anything to add!]
>>>
>>> I understand that the ni: URI was removed in a recent update to the SRI
>>> draft. I would like to ask this be reconsidered.
>>>
>>> Using the ni: URI for SRI is important to Semantic Web applications as
>>> it allows us to treat the assertion as a link relation. This distinction
>>> might not be significant to many user-agents (and thus many on this list),
>>> but in Semantic Web applications, especially Web servers, it is of great
>>> significance, as it allows us to make useful relationships between
>>> resources.
>>>
>>> It also confers benefits to servers and application designers, as
>>> RFC6920 defines a mapping between ni: URIs and a </.well-known/> URL. In a
>>> corporate project under development, we're already using ni: URIs to keep a
>>> content-addressable database of files, making them accessible through this
>>> mapping. I intend to use Subresource Integrity when serving assets from
>>> this store.
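
The mapping mentioned above can be sketched roughly as follows: hash the
content, build an ni: URI with an unpadded base64url digest, and rewrite it
to the /.well-known/ni/ form. The function names ni_uri and ni_to_well_known
and the default of https are assumptions for this illustration, not part of
the thread or the SRI draft.

```python
import base64
import hashlib
from urllib.parse import urlsplit


def ni_uri(data: bytes, authority: str = "") -> str:
    """Build an ni: URI for data using sha-256 and unpadded base64url."""
    digest = hashlib.sha256(data).digest()
    value = base64.urlsafe_b64encode(digest).rstrip(b"=").decode("ascii")
    return f"ni://{authority}/sha-256;{value}"


def ni_to_well_known(uri: str, scheme: str = "https") -> str:
    """Map ni://authority/alg;value to scheme://authority/.well-known/ni/alg/value."""
    parts = urlsplit(uri)
    if parts.scheme != "ni" or not parts.netloc:
        raise ValueError("need an ni: URI with an authority to build a URL")
    alg, sep, value = parts.path.lstrip("/").partition(";")
    if not sep:
        raise ValueError("expected 'alg;value' in the ni: URI path")
    url = f"{scheme}://{parts.netloc}/.well-known/ni/{alg}/{value}"
    if parts.query:
        # Keep parameters such as ct= when mapping.
        url += "?" + parts.query
    return url
```

A content-addressable store could then serve each file at the URL returned
by ni_to_well_known(ni_uri(content, "example.com")).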
>>>
>>> It also provides an intuitive abstraction: If we think of the ni: URI as
>>> identifying a resource (the definition of the URI), the integrity=
>>> attribute is performing an assertion: "These two URIs must identify the
>>> same information resource, otherwise abort!"
>>>
>>> For additional support for this use case, I'd like to propose making the
>>> "integrity" attribute a link-extension for RFC5988 Web Linking, suitable
>>> for use on any declaration of a link.
>>>
>>> User agents do not need to think of the ni: URI as a URL if they do not
>>> need to; they just follow the ABNF defined in the RFC. There are many
>>> cases where URIs are used as identifiers in Web applications: in
>>> namespaces [1], schemas (e.g. JSON Schema), DTDs, RDFa [2], and in rel=
>>> attributes in all sorts of tags and HTTP headers.
>>>
>>> Additionally, RFC6920 defines a registry of hashes [3], to ensure
>>> forward compatibility (e.g. SHA-3, when standardized later this year). I
>>> would like to avoid duplication of effort defining a database of hashes.
>>>
>>> In short, (1) signing was an explicit goal of the ni: URI, along with
>>> other features. Not using ni: would mean servers being unable to take
>>> advantage of these other features; and, (2) don't forget about the HTTP
>>> Link: header.
>>>
>>> Thanks for your consideration,
>>>
>>> Austin Wright.
>>>
>>> [1] http://www.w3.org/TR/html5/infrastructure.html#xml
>>> [2] http://www.w3.org/TR/html-rdfa/
>>> [3]
>>> http://www.iana.org/assignments/named-information/named-information.xhtml
>>>
>>>
>
Received on Thursday, 12 March 2015 03:36:09 UTC