Re: [Integrity] Revisit RFC6920 URIs for benefit of servers and Semantic Web applications

Understood that it doesn't need to support everything right from the first
version.

The spec, as currently defined, could make minor adjustments to enhance
forward compatibility. I do not believe these would add any cost for
user agent implementors.

If RFC6920, for whatever reason, is completely off the table, then I would
suggest changing the format of the token so it doesn't look like a valid
RFC3986 URI, by replacing use of the colon and dash with equals and comma,
respectively:

; proposed changes
option-expression  =  option-name "=" [ option-value ]
hash-expression    = hash-algo "," base64-value

This should be unambiguous, because neither a comma nor an equals sign
will appear in option-name or hash-algo, and the lack of a colon prevents
RFC3986 parsing.

This enhancement would also allow the TR to give tokens to hashes matching
those in the IANA registry ("SHA-256" instead of "sha256"), further
supporting forward compatibility.
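
As a rough illustration of why the proposed syntax stays unambiguous, here
is a minimal Python sketch that classifies each token by whichever of ","
or "=" appears first. The function names and the "type" option are
hypothetical, and "Cg==" is only the placeholder value used in this thread,
not a real SHA-256 digest:

```python
import base64
import re

# Illustrative only: these names are not part of any draft.
_DELIMITER = re.compile(r"[,=]")


def classify_token(token):
    """Classify one token by whichever delimiter appears first.

    Returns ("hash", algo, digest_bytes) or ("option", name, value).
    """
    match = _DELIMITER.search(token)
    if match is None or match.start() == 0:
        raise ValueError("not an option or hash expression: %r" % token)
    head, tail = token[:match.start()], token[match.end():]
    if match.group() == ",":
        # hash-expression: any "=" in the tail is just base64 padding.
        return ("hash", head, base64.b64decode(tail, validate=True))
    # option-expression: the value may be empty and may contain commas.
    return ("option", head, tail)


def parse_integrity(value):
    """Split an attribute value on whitespace and classify each token."""
    return [classify_token(token) for token in value.split()]


print(parse_integrity("type=text/css SHA-256,Cg=="))
```

Because the first delimiter decides the token type, the "=" padding in a
base64 value is never mistaken for the option separator.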

Aside: The ABNF in the current Editor's Draft seems to allow the following
string:

option-name:option-valuesha256-Cg==

i.e. without a WS between option-expression and hash-expression. Am I
reading this correctly? /aside

I don't think RFC6920 needs to be off the table, however; the spec could
merely define an ABNF subset compatible with the RFC6920 `NI-URI`
production. For all intents
and purposes, it would be just a coincidence that it happens to be a valid
NI-URI production (if funny-looking). The only significant difference (not
formatting/parsing related) would seem to be handling of ct= (appearing in
every hash production, instead of just once).
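
For readers unfamiliar with the format, the following is a minimal Python
sketch of how a value matching the NI-URI production can be built for
sha-256. The helper name ni_uri is hypothetical; the encoding (URL-safe
base64 with the padding removed) follows RFC6920:

```python
import base64
import hashlib


def ni_uri(content, authority=""):
    """Build a sha-256 ni: URI for the given bytes (illustrative helper)."""
    digest = hashlib.sha256(content).digest()
    # RFC6920 uses the URL-safe base64 alphabet with "=" padding removed.
    value = base64.urlsafe_b64encode(digest).rstrip(b"=").decode("ascii")
    return "ni://" + authority + "/sha-256;" + value


print(ni_uri(b"Hello World!"))
```

With an empty authority the result starts with "ni:///sha-256;", which is
the form a restricted ABNF subset would most likely need to accept.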

I don't see anything wrong with mixing option-expression and NI-URI
productions; that might be an ideal solution to explore. I'm thinking all
option-expressions must be matched, if any, and at least one
hash-expression must be matched.

Finally, the current option-value production prohibits the expression of
arbitrary media types. I regularly use quoted-strings (containing spaces
and special characters) in media types, including in Accept headers, for
instance with JSON Schema, e.g.:

application/json; charset=utf-8; profile="http://example.com/book"

(Note that charset isn't a registered parameter for application/json,
which is UTF-8 by definition, but it's not prohibited, and I've had
difficulty getting some Web browsers to play nicely without it.)

The current option-value production appears to prohibit a media type like
this, whereas the ct= named information parameter can carry it by using
urlencoding.
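
To make the contrast concrete, here is a minimal Python sketch of
urlencoding the media type above for a ct= parameter and recovering it
again. The helper names encode_ct and decode_ct are hypothetical; only the
standard library's urllib.parse is assumed:

```python
from urllib.parse import quote, unquote


def encode_ct(media_type):
    """Percent-encode a media type for use as a ct= parameter."""
    # safe="" also encodes "/", so spaces, quotes, ";" and "=" all
    # become %XX escapes and cannot collide with delimiters.
    return "ct=" + quote(media_type, safe="")


def decode_ct(parameter):
    """Recover the original media type from a ct= parameter."""
    name, _, value = parameter.partition("=")
    if name != "ct":
        raise ValueError("not a ct= parameter: %r" % parameter)
    return unquote(value)


media_type = 'application/json; charset=utf-8; profile="http://example.com/book"'
encoded = encode_ct(media_type)
print(encoded)
print(decode_ct(encoded) == media_type)
```

The encoded form contains no whitespace or quotes, so the quoted-string
profile parameter survives the round trip intact.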

Thanks,

Austin.

On Wed, Mar 11, 2015 at 9:45 AM, Devdatta Akhawe <dev.akhawe@gmail.com>
wrote:

> Hi Austin
>
> These are all great points, but we are not really trying to address these
> in the first version of SRI. The goal in version 1 is only to be able to
> check the hashes of scripts and links. That said, I tend to agree with you
> that this should be on our radar; do you think the spec, as currently
> defined, makes it impossible to address these concerns in future
> iterations? For example, the parser is intentionally forgiving of formats
> it is not aware of, so as to allow such changes in the future.
>
> cheers
> Dev
>
>
> On Mar 9, 2015 11:26 PM, "Austin William Wright" <aaa@bzfx.net> wrote:
>
>> [CCing swig because I know many of you are also using ni: URIs. Let me
>> know if there's anything to add!]
>>
>> I understand that the ni: URI was removed in a recent update to the SRI
>> draft. I would like to ask this be reconsidered.
>>
>> Using the ni: URI for SRI is important to Semantic Web applications as it
>> allows us to treat the assertion as a link relation. This distinction might
>> not be significant to many user-agents (and thus many on this list), but in
>> Semantic Web applications, especially Web servers, this is of great
>> significance, as it allows us to make useful relationships between resources.
>>
>> It also confers benefits to servers and application designers, as RFC6920
>> defines a mapping between ni: URIs and a </.well-known/> URL. In a
>> corporate project under development, we're already using ni: URIs to keep a
>> content-addressable database of files, making them accessible through this
>> mapping. I intend to use Subresource Integrity when serving assets from
>> this store.
>>
>> It also provides an intuitive abstraction: If we think of the ni: URI as
>> identifying a resource (the definition of the URI), the integrity=
>> attribute is performing an assertion: "These two URIs must identify the
>> same information resource, otherwise abort!"
>>
>> For additional support for this use case, I'd like to propose making the
>> "integrity" attribute a link-extension for RFC5988 Web Linking, suitable
>> for use on any declaration of a link.
>>
>> User agents do not need to think of the ni: URI as a URL if they do not
>> need to; they just follow the ABNF defined in the RFC. There are many cases
>> where URIs are used as identifiers in Web applications; in namespaces [1],
>> schemas (e.g. JSON Schema), DTDs, RDFa [2], and in rel= attributes in all
>> sorts of tags and HTTP headers.
>>
>> Additionally, RFC6920 defines a registry of hashes [3], to ensure forward
>> compatibility (e.g. SHA-3, when standardized later this year). I would like
>> to avoid duplication of effort defining a database of hashes.
>>
>> In short, (1) signing was an explicit goal of the ni: URI, along with
>> other features. Not using ni: would mean servers being unable to take
>> advantage of these other features; and, (2) don't forget about the HTTP
>> Link: header.
>>
>> Thanks for your consideration,
>>
>> Austin Wright.
>>
>> [1] http://www.w3.org/TR/html5/infrastructure.html#xml
>> [2] http://www.w3.org/TR/html-rdfa/
>> [3]
>> http://www.iana.org/assignments/named-information/named-information.xhtml
>>
>>

Received on Wednesday, 11 March 2015 18:38:36 UTC