Re: [Integrity] Revisit RFC6920 URIs for benefit of servers and Semantic Web applications

Pull request #220 issued:

<https://github.com/w3c/webappsec/pull/220>

Cheers,

Austin.

On Thu, Mar 12, 2015 at 5:48 PM, Devdatta Akhawe <dev.akhawe@gmail.com>
wrote:

> I don't have a problem with that; I actually like the equals and I think
> it makes more sense if we want to support more options.
>
> That said, I am not the one who implemented SRI in the browser. I don't
> know what everyone else thinks. Can you submit a pull request to
> https://github.com/w3c/webappsec ?
>
> cheers
> Dev
>
> (and yes .. I meant hash-source)
>
> On 12 March 2015 at 15:01, Austin William Wright <aaa@bzfx.net> wrote:
>
>> Alright, thanks.
>>
>> I understand the CSP usage is already in the wild, otherwise I would make
>> the same submission to CSP. Perhaps future editions can add an option to
>> refer to the IANA registry (or even use ni: outright)?
>>
>> It's difficult to prove that forward compatibility would be "impossible".
>> Looking forward I imagine older user-agents might try to parse a URI as
>> an `option-expression`, and so I'd prefer the alternative that looks less
>> like a URI, other things being equal. Additionally, use of the equals sign
>> has precedent in other contexts, such as the `parameter` production for
>> media types (RFC2045), the HSTS `directive` production (RFC6797), the
>> HTTP Link header (RFC5988), and more.
>>
>> As a maintainer of libraries that parse many of these productions, and
>> author of Web robots (that will likely, in the distant future, have to grok
>> CSP and SRI), I would find re-using an existing production less complex
>> than implementing a new one.
>>
>> Since I don't find any option-expression production in CSP, would the WG
>> be amenable to using the equals sign character in place of colon, as
>> follows, or an existing similar production?
>>
>> ; proposed changes
>> option-expression  = option-name "=" option-value
>> option-name        = 1*option-name-char
>> option-value       = *option-value-char
>> option-name-char   = ALPHA / DIGIT / "-"
>> option-value-char  = ALPHA / DIGIT / "-" / "+" / "." / "/"
>>
>> (I noticed the current ABNF only allows a single character; here I've
>> allowed multiple characters. As was probably intended before, this
>> modification leaves open the possibility of unambiguously expanding the
>> range of option-value later to allow a quoted-string or urlencoded value,
>> or both.)
>>
>> With regards to re-using productions from CSP, I still see a significant
>> distinction between CSP and SRI: The former is making a series of
>> assertions about an HTTP request, the latter is making an assertion about a
>> link relation, and has implications beyond HTTP. If the WG wants to define
>> a subset of behavior as it applies to just HTTP, that's good, but I still
>> would like to be wary of future re-use and generalization. I don't believe
>> importing the CSP syntax satisfies this. Compared to the benefit of being
>> able to re-use quoted-string/urlencoded/etc, over defining a new one (as
>> the Editor's Draft currently does), I believe the introduced complexity of
>> having a different style of token, or even using ni: outright, is minimal.
>>
>> Finally, do you refer to the `hash-source` production [1], but without
>> the single quotes? I can't find a literal `hash-src` production like you
>> refer to, idk if it might be unreleased somewhere, or something like that.
>>
>> Cheers,
>>
>> Austin.
>>
>> [1] <https://w3c.github.io/webappsec/specs/CSP2/#hash_source>
>> (Unfortunately, I can't find a canonical URL for this version of the
>> editor's draft, but the most recent change appears to be
>> commit 7fe5ce1e2e54130702b0a678a40966f39fab1bab)
>>
>> On Wed, Mar 11, 2015 at 8:35 PM, Devdatta Akhawe <dev.akhawe@gmail.com>
>> wrote:
>>
>>> Hi Austin
>>>
>>> We should definitely fix the whitespace issue. Re the proposed changes:
>>> we want to reuse the deployed CSP hash-src format, so it is unlikely that
>>> we can change that format. I don't see that it explicitly makes it
>>> impossible to use RFC6920 URIs in the future, if we deem it necessary. A
>>> ni:// URI can just be a token without spaces, and we can define its
>>> semantics as: ignore the content type specified earlier if one exists in
>>> the ni:// URI. While not ideal, there is value in reusing the CSP hash-src
>>> format.
>>>
>>> cheers
>>> Dev
>>>
>>> On 11 March 2015 at 11:38, Austin William Wright <aaa@bzfx.net> wrote:
>>>
>>>> Understood that it doesn't need to support everything right from the
>>>> first version.
>>>>
>>>> The spec, as currently defined, could make minor adjustments to enhance
>>>> forward compatibility that I do not believe would add cost for
>>>> user agent implementors.
>>>>
>>>> If RFC6920, for whatever reason, is completely off the table, then I
>>>> would suggest changing the format of the token so it doesn't look like a
>>>> valid RFC3986 URI, by replacing use of the colon and dash with equals and
>>>> comma, respectively:
>>>>
>>>> ; proposed changes
>>>> option-expression  =  option-name "=" [ option-value ]
>>>> hash-expression    = hash-algo "," base64-value
>>>>
>>>> This should be unambiguous, because neither a comma nor an equals sign
>>>> will appear in option-name or hash-algo. And the lack of a colon prevents
>>>> RFC3986 parsing.
>>>>
>>>> This enhancement would also allow the TR to give tokens to hashes
>>>> matching those in the IANA registry ("SHA-256" instead of "sha256"),
>>>> further supporting forward compatibility.
>>>>
>>>> Aside: The ABNF in the current Editor's Draft seems to allow the
>>>> following string:
>>>>
>>>> option-name:option-valuesha256-Cg==
>>>>
>>>> i.e. without a WS between option-expression and hash-expression. Am I
>>>> reading this correctly? /aside
>>>>
>>>> I don't think RFC6920 needs to be off the table, however; merely define
>>>> an ABNF subset compatible with RFC6920 `NI-URI` production. For all intents
>>>> and purposes, it would be just a coincidence that it happens to be a valid
>>>> NI-URI production (if funny-looking). The only significant difference (not
>>>> formatting/parsing related) would seem to be handling of ct= (appearing in
>>>> every hash production, instead of just once).
>>>>
>>>> I don't see anything wrong with mixing option-expression and NI-URI
>>>> productions, that might be an ideal solution to explore. I'm thinking all
>>>> option-expressions must be matched, if any, and at least one
>>>> hash-expression must be matched.
>>>>
>>>> Finally, the current option-value production prohibits the expression
>>>> of arbitrary media types. I regularly use quoted-strings (containing spaces
>>>> and special characters) in media types, including in Accept headers, for
>>>> instance with JSON Schema, e.g.:
>>>>
>>>> application/json; charset=utf-8; profile="http://example.com/book"
>>>>
>>>> (Note that charset isn't a registered parameter for application/json,
>>>> which is UTF-8 by definition, but it's not prohibited, and I've had
>>>> difficulty getting some Web browsers to play nicely without it.)
>>>>
>>>> The current option-value production appears to prohibit a media type
>>>> like this, whereas the ct= named information parameter uses
>>>> urlencoding.
>>>>
>>>> Thanks,
>>>>
>>>> Austin.
>>>>
>>>> On Wed, Mar 11, 2015 at 9:45 AM, Devdatta Akhawe <dev.akhawe@gmail.com>
>>>> wrote:
>>>>
>>>>> Hi Austin
>>>>>
>>>>> These are all great points, but we are not really trying to address
>>>>> these in the first version of SRI. The goal in version 1 is only to be able
>>>>> to check the hashes of scripts and links. That said, I tend to agree with
>>>>> you that this should be on our radar; do you think the spec, as currently
>>>>> defined, makes it impossible to address these concerns in future
>>>>> iterations? For example, the parser is intentionally forgiving of formats
>>>>> it is not aware of, so as to allow such changes in the future.
>>>>>
>>>>> cheers
>>>>> Dev
>>>>>
>>>>>
>>>>> On Mar 9, 2015 11:26 PM, "Austin William Wright" <aaa@bzfx.net> wrote:
>>>>>
>>>>>> [CCing swig because I know many of you are also using ni: URIs. Let
>>>>>> me know if there's anything to add!]
>>>>>>
>>>>>> I understand that the ni: URI was removed in a recent update to the
>>>>>> SRI draft. I would like to ask this be reconsidered.
>>>>>>
>>>>>> Using the ni: URI for SRI is important to Semantic Web applications
>>>>>> as it allows us to treat the assertion as a link relation. This distinction
>>>>>> might not be significant to many user-agents (and thus many on this list),
>>>>>> but in Semantic Web applications, especially Web servers, it is of great
>>>>>> significance, as it allows us to express useful relationships between
>>>>>> resources.
>>>>>>
>>>>>> It also confers benefits to servers and application designers, as
>>>>>> RFC6920 defines a mapping between ni: URIs and a </.well-known/> URL. In a
>>>>>> corporate project under development, we're already using ni: URIs to keep a
>>>>>> content-addressable database of files, making them accessible through this
>>>>>> mapping. I intend to use Subresource Integrity when serving assets from
>>>>>> this store.
>>>>>>
>>>>>> It also provides an intuitive abstraction: If we think of the ni: URI
>>>>>> as identifying a resource (the definition of the URI), the integrity=
>>>>>> attribute is performing an assertion: "These two URIs must identify the
>>>>>> same information resource, otherwise abort!"
>>>>>>
>>>>>> For additional support for this use case, I'd like to propose making
>>>>>> the "integrity" attribute a link-extension for RFC5988 Web Linking,
>>>>>> suitable for use on any declaration of a link.
>>>>>>
>>>>>> User agents do not need to think of the ni: URI as a URL if they do
>>>>>> not need to; they can just follow the ABNF defined in the RFC. There are
>>>>>> many cases where URIs are used as identifiers in Web applications: in
>>>>>> namespaces [1], schemas (e.g. JSON Schema), DTDs, RDFa [2], and in rel=
>>>>>> attributes in all sorts of tags and HTTP headers.
>>>>>>
>>>>>> Additionally, RFC6920 defines a registry of hashes [3], to ensure
>>>>>> forward compatibility (e.g. SHA-3, when standardized later this year). I
>>>>>> would like to avoid duplication of effort defining a database of hashes.
>>>>>>
>>>>>> In short, (1) signing was an explicit goal of the ni: URI, along with
>>>>>> other features. Not using ni: would mean servers being unable to take
>>>>>> advantage of these other features; and, (2) don't forget about the HTTP
>>>>>> Link: header.
>>>>>>
>>>>>> Thanks for your consideration,
>>>>>>
>>>>>> Austin Wright.
>>>>>>
>>>>>> [1] http://www.w3.org/TR/html5/infrastructure.html#xml
>>>>>> [2] http://www.w3.org/TR/html-rdfa/
>>>>>> [3]
>>>>>> http://www.iana.org/assignments/named-information/named-information.xhtml
>>>>>>
>>>>>>
>>>>
>>>
>>
>
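
Illustration (not part of the thread): the "; proposed changes" ABNF quoted above, with "=" as the separator, can be transcribed into a regular expression fairly directly. The sketch below is only one possible reading of that grammar. The function name `parse_option_expression` and the option name `ct` used in the calls are assumptions chosen for illustration; neither comes from the SRI draft.

```python
import re

# Hypothetical helper: a regex transcription of the proposed ABNF
#   option-expression = option-name "=" option-value
#   option-name-char  = ALPHA / DIGIT / "-"                    (one or more)
#   option-value-char = ALPHA / DIGIT / "-" / "+" / "." / "/"  (zero or more)
OPTION_EXPRESSION = re.compile(r"([A-Za-z0-9-]+)=([A-Za-z0-9+./-]*)")


def parse_option_expression(token):
    """Return (name, value) if token matches option-expression, else None."""
    match = OPTION_EXPRESSION.fullmatch(token)
    if match is None:
        return None
    return match.group(1), match.group(2)


# An option with a simple media type value is accepted.
print(parse_option_expression("ct=application/json"))
# A hash token with base64 padding is not an option-expression.
print(parse_option_expression("sha256-Cg=="))
# Quoted-string parameters are outside the current option-value range.
print(parse_option_expression('profile="http://example.com/book"'))
```

Under this reading, a padded hash token cannot be mistaken for an option, and a quoted media type parameter is rejected until option-value is extended, which matches the limitation raised in the thread.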

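Illustration (not part of the thread): the first message mentions that RFC6920 defines a mapping between ni: URIs and a /.well-known/ URL. A minimal sketch of that mapping follows, assuming SHA-256 only, an ni: URI with no query parameters, and an authority supplied by the caller. The function names `ni_uri` and `well_known_url` are hypothetical, and `example.com` is a placeholder host.

```python
import base64
import hashlib


def ni_uri(data, authority=""):
    """Build an ni: URI for data using sha-256 and unpadded base64url."""
    digest = hashlib.sha256(data).digest()
    value = base64.urlsafe_b64encode(digest).rstrip(b"=").decode("ascii")
    return f"ni://{authority}/sha-256;{value}"


def well_known_url(uri, scheme="https"):
    """Map an ni: URI that carries an authority to a .well-known URL.

    Assumption: the URI has no query parameters (such as ct=); this
    sketch rejects them rather than guessing how to carry them over.
    """
    prefix = "ni://"
    if not uri.startswith(prefix) or "?" in uri:
        raise ValueError("expected an ni: URI without query parameters")
    authority, _, path = uri[len(prefix):].partition("/")
    algorithm, separator, value = path.partition(";")
    if not authority or not separator or not value:
        raise ValueError("need an authority, an algorithm, and a hash value")
    return f"{scheme}://{authority}/.well-known/ni/{algorithm}/{value}"


uri = ni_uri(b"Hello World!", "example.com")
print(uri)
print(well_known_url(uri))
```

A content-addressable store like the one described in the thread could serve each file at the URL returned by `well_known_url`. How the same digest would be written in an integrity attribute depends on the token syntax still under discussion above.
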
Received on Thursday, 19 March 2015 15:57:03 UTC