Re: Submitted new I-D: Cache Digests for HTTP/2 from Kazuho Oku on 2016-01-27 (ietf-http-wg@w3.org from January to March 2016)

From: Kazuho Oku <kazuhooku@gmail.com>
Date: Wed, 27 Jan 2016 10:11:47 +0900
To: Stefan Eissing <stefan.eissing@greenbytes.de>
Cc: Julian Reschke <julian.reschke@gmx.de>, Martin Thomson <martin.thomson@gmail.com>, Ilya Grigorik <ilya@igvita.com>, Amos Jeffries <squid3@treenet.co.nz>, HTTP Working Group <ietf-http-wg@w3.org>
Message-ID: <CANatvzwbZAwWNYK-D3bvvcUY9fam7ML4ehZuQzU7GsSgs9BT7w@mail.gmail.com>
Stefan,

Thank you for your response.

2016-01-21 20:08 GMT+09:00 Stefan Eissing <stefan.eissing@greenbytes.de>:
> Kazuho,
>
> thanks for your detailed feedback. I see that you have things in mind for cache digest that I had not understood.
>
> I would still argue for the standalone digest form and define a new parameter that specifies the digest type. For example:
>    Cache-Digest: base64urldigest;type=fresh, base64digest;type=if-modified-since
> with a default of "fresh".

To me it seems that we have a trade-off between simplicity now and
possibly better efficiency in future extensions.  Since we cannot
tell for certain what will happen in the future, I will not argue
further for carrying the digest value as a parameter.
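
For illustration, here is a rough Python sketch of how a server might
read the standalone form with a `type` parameter defaulting to "fresh".
It is not taken from the draft: the names parse_cache_digest,
usable_digests and KNOWN_TYPES are hypothetical, the splitting is naive
(it assumes no quoted strings containing "," or ";"), and the digest is
left undecoded.  Values whose type is not understood are skipped rather
than treated as an empty digest.

```python
from typing import Dict, List, NamedTuple

# Hypothetical: the digest types this particular server understands.
KNOWN_TYPES = {"fresh"}


class DigestValue(NamedTuple):
    digest: str             # base64url text, left undecoded in this sketch
    params: Dict[str, str]  # lower-cased parameter names -> values


def parse_cache_digest(header_value: str) -> List[DigestValue]:
    """Split a Cache-Digest field value into digest-values (naive split)."""
    values = []
    for element in header_value.split(","):
        parts = [p.strip() for p in element.split(";")]
        if not parts[0]:
            continue  # skip empty list elements
        params = {"type": "fresh"}  # default type when none is given
        for part in parts[1:]:
            if not part:
                continue
            name, _, value = part.partition("=")
            params[name.strip().lower()] = value.strip().strip('"')
        values.append(DigestValue(parts[0], params))
    return values


def usable_digests(values: List[DigestValue]) -> List[DigestValue]:
    """Keep only digests of a known type; unknown ones are ignored."""
    return [v for v in values if v.params["type"] in KNOWN_TYPES]
```

With this, a server that only knows "fresh" would use the first digest
of `AAAA;type=fresh, BBBB;type=if-modified-since` and behave as if the
second had not been sent.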

An early draft that incorporates your ideas, as I understand them,
can be found at https://gist.github.com/kazuho/238de4ab8a964c9c927c
(FYI the source code is https://github.com/mnot/I-D/pull/167).  Please
note that the version on the Gist reflects my ideas only; Mark might
want to make modifications before updating the I-D.

That said, it would be great if you could check whether it looks fine
to you.

FWIW I see two potential pitfalls when _upgrading_ to this draft:

* the sizes of `N` and `P` have been changed from 8 bits to 5 bits

* scope is expressed via parameters named `host` and `path`

Note that the `host` parameter is not named `domain`.  Please refer to
https://lists.w3.org/Archives/Public/ietf-http-wg/2016JanMar/0132.html
for the reasoning behind this.
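
As a rough sketch of how a server might check a push candidate against
the `host` and `path` scope of a digest before consulting the digest
itself, consider the Python below.  Everything in it is an assumption
made for illustration: the function names are hypothetical, the
wildcard handling is a simplified reading of RFC 6125 6.4.3 (a leading
"*." stands for exactly one label), ports are not treated specially,
and the path test is a plain string prefix.

```python
def host_matches(scope_host: str, authority: str) -> bool:
    """Compare a candidate's :authority with the digest's `host` scope."""
    scope_host = scope_host.lower()
    authority = authority.lower()
    if scope_host.startswith("*."):
        # Simplified wildcard: "*" replaces exactly one left-most label.
        labels = authority.split(".")
        return (
            len(labels) > 1
            and labels[0] != ""
            and ".".join(labels[1:]) == scope_host[2:]
        )
    return authority == scope_host


def path_matches(scope_path: str, candidate_path: str) -> bool:
    """The scope path must be equal to, or a prefix of, the candidate path."""
    return candidate_path.startswith(scope_path)


def in_scope(scope_host: str, scope_path: str,
             authority: str, candidate_path: str) -> bool:
    """Only candidates in scope should be tested against the digest content."""
    return (host_matches(scope_host, authority)
            and path_matches(scope_path, candidate_path))
```

For example, in_scope("*.example.com", "/", "www.example.com",
"/style.css") holds, while the same scope covers neither "example.com"
nor "a.b.example.com" under this reading.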

> To your other points, I agree.
>
>> Am 21.01.2016 um 02:26 schrieb Kazuho Oku <kazuhooku@gmail.com>:
>>
>> 2016-01-20 21:57 GMT+09:00 Stefan Eissing <stefan.eissing@greenbytes.de>:
>>> I make another attempt, based on the feedback (Kazuho, please stop me if you prefer to work on this):
>>
>> Sorry.  I thought I had responded to your email.  It has always been
>> my pleasure (and fun) to discuss how we should design and implement
>> cache digests.
>>
>> I think we mostly agree on how the header should be designed.
>>
>> Below, I will cover the semantic differences between yours and mine
>> (laid out in https://lists.w3.org/Archives/Public/ietf-http-wg/2016JanMar/0120.html),
>> and my thoughts.
>>
>>> Cache-Digest      = "Cache-Digest" ":" #digest-value
>>>
>>> digest-value    = base64url-encoded-digest *( ";" digest-param )
>>
>> I would like to have the encoded digest stored as a name-value pair,
>> thereby allowing the client to store several types of digests within a
>> single digest-value.
>>
>> At the moment, the draft only covers how the digests of fresh
>> resources in cache should be encoded.  However, if all goes well, I
>> anticipate that we would extend the spec to support stale resources
>> as well.  For that purpose, being able to store more than one type of
>> digest in a single digest-value would make the header less redundant.
>>
>> As an example, the header conveying a digest for fresh resources and a
>> digest for stale resources with if-modified-since headers can look
>> like:
>>
>>  cache-digest: fresh=xxxxx; if-modified-since=yyyyy; authority=*.example.com
>>
>> but only if we use name-value pairs to encode the digest.
>>
>>> A digest replaces any existing digest if its domain parameter
>>> matches the domain of an existing digest (with '*' matching any
>>> domain) and the path parameter is a prefix of or equal to the
>>> existing path.
>>
>> I believe that a server should be able to determine the domain(s) the
>> client considers the server to be authoritative for.
>>
>> Therefore, I think that a) the default scope should be the value
>> passed in by the `:authority` pseudo header, and b) otherwise the
>> client should explicitly specify the scope using the language defined
>> in RFC 6125 6.4.3.
>>
>> Unless the scope is transferred over HTTP, HTTP servers need to have
>> knowledge of the common name in the TLS certificate being used.
>> However, that might not always be easy, considering that in some
>> cases TLS terminators (run by a different entity) might be placed
>> in front of the HTTP server, or that we might want to perform a
>> rolling update of the certificates of a TLS terminator cluster from
>> a host-level certificate to a wildcard certificate.
>>
>> Requiring the client to always specify the scope of the digest aligns
>> with the fact that in the HTTP/1.1 and HTTP/2 specifications the Host
>> header (or the :authority pseudo header) is transferred even if a TLS
>> certificate is used (and the fact that the value of the header, not
>> the common name stored in the certificate, IS used as the name of the
>> authority).
>>
>>> If the codec of a digest is unknown, the digest MUST be assumed
>>> to be empty
>>
>> If the codec of a digest is unknown, a server must behave as if it did
>> not receive that digest.
>>
>> Assuming it to be empty would mean that a server will consider the
>> cache of the client to be empty.  Instead, it should mean to the
>> server that the client's cache state is unknown for that category of
>> resources (e.g. fresh).
>>
>> And this definition is important for extending the spec to cover
>> stale resources.  For example, if a client sends a digest of fresh
>> resources and a digest of stale resources, a server knowing only how
>> to handle the former should respect the former and ignore the latter.
>>
>>> A candidate of a server push is matched against a cache digest
>>> by comparing its :authority against the digest domain (port number?)
>>> and its :path against the digest path (equal or prefix). Only
>>> then is the candidate matched against the digest content.
>>
>> Agreed.
>>
>>>
>>>> Am 18.01.2016 um 15:46 schrieb Kazuho Oku <kazuhooku@gmail.com>:
>>>>
>>>> 2016-01-18 19:32 GMT+09:00 Julian Reschke <julian.reschke@gmx.de>:
>>>>> On 2016-01-18 11:19, Stefan Eissing wrote:
>>>>>>
>>>>>> +1 generic header parameter form.
>>>>>>
>>>>>> If I may make a proposal:
>>>>>>
>>>>>> Cache-Digest      = "Cache-Digest" ":" #digest-value
>>>>>>  digest-value    = "<" base64url encoded digest ">" *( ";" digest-param )
>>>>>>  digest-param    = ( ( "domain" "=" domain-value )
>>>>>>                  | ( "path" "=" path-value )
>>>>>>                  | ( "codec" "=" codec-value )
>>>>>>                  | ( "update" ( "=" update-param ) )
>>>>>>                  | ( digest-extension ) )
>>>>>>
>>>>>> digest-extension = ( parmname [ "=" ( ptoken | quoted-string ) ] )
>>>>>>
>>>>>> domain-value     =
>>>>>>       authority              # defined in RFC3986
>>>>>>       | wildcard-identifier  # defined in RFC 6125 6.4.3
>>>>>>
>>>>>> path-value       = path-absolute    # defined in RFC3986 3.3
>>>>>>
>>>>>> codec-value      = ( "GCS-SHA256" | ( ptoken | quoted-string ) )
>>>>>> ...
>>>>>
>>>>>
>>>>> I wouldn't use angle brackets here; let's leave them for use in places where
>>>>> the value is a URI (Link header field and several WebDAV header fields).
>>>>>
>>>>> Just use token/quoted-string here.
>>>>>
>>>>> Also, do not wire parameter names into the ABNF; this mixes syntax with
>>>>> semantics.
>>>>
>>>> Thank you for the advice.
>>>>
>>>> Obviously I have used an old-fashioned RFC as a reference when writing
>>>> the ABNF (that Stefan probably modified).  I should have looked into a
>>>> newer one, such as the Alt-Svc draft.
>>>>
>>>>> Best regards, Julian
>>>>
>>>>
>>>>
>>>> --
>>>> Kazuho Oku
>>>>
>>>
>>
>>
>>
>> --
>> Kazuho Oku
>



-- 
Kazuho Oku
Received on Wednesday, 27 January 2016 01:12:17 UTC