Re: draft-bryan-metalinkhttp-18.txt

From: Tatsuhiro Tsujikawa <tatsuhiro.t@gmail.com>
Date: Wed, 19 Jan 2011 23:49:12 +0900
Message-ID: <AANLkTikNB2o4YJHZkWJ375ai=b42uPPxiK5pRt=F3hb=@mail.gmail.com>
To: Julian Reschke <julian.reschke@gmx.de>
Cc: Anthony Bryan <anthonybryan@gmail.com>, ietf-http-wg@w3.org, draft-bryan-metalinkhttp@tools.ietf.org
2011/1/19 Julian Reschke <julian.reschke@gmx.de>:
> On 19.01.2011 10:33, Anthony Bryan wrote:
>>
>> ...
>>>
>>>   ...
>>>
>>>   [[ Discussion of this draft should take place on IETF HTTP WG mailing
>>>   list at ietf-http-wg@w3.org or the Metalink discussion mailing list
>>>   located at metalink-discussion@googlegroups.com.  To join the list,
>>>   visit http://groups.google.com/group/metalink-discussion . ]]
>>>
>>> This should go on the front page as "Editorial Note".
>>
>> in <abstract>?
>>
>> I've moved it to there where it says:
>>
>> [[ Discussion of this draft should take place on IETF HTTP WG mailing
>> list at ietf-http-wg@w3.org although this draft is not a WG item. ]]
>> ...
>
> No, add a <note> element after abstract. (Example:
> <http://tools.ietf.org/id/draft-ietf-httpbis-p2-semantics-12.xml>)
>
>>>   ...
>>>
>>> 1.2.  Examples
>>>
>>>   A brief Metalink server response with ETag, mirrors, .metalink,
>>>   OpenPGP signature, and a cryptographic hash of the whole file:
>>>
>>>   Etag: "thvDyvhfIqlvFe+A9MYgxAfm1q5="
>>>   Link: <http://www2.example.com/example.ext>; rel="duplicate"
>>>   Link: <ftp://ftp.example.com/example.ext>; rel="duplicate"
>>>   Link: <http://example.com/example.ext.torrent>; rel="describedby";
>>>   type="application/x-bittorrent"
>>>   Link: <http://example.com/example.ext.metalink>; rel="describedby";
>>>   type="application/metalink4+xml"
>>>   Link: <http://example.com/example.ext.asc>; rel="describedby";
>>>   type="application/pgp-signature"
>>>   Digest: SHA-256=MWVkMWQxYTRiMzk5MDQ0MzI3NGU5NDEyZTk5OWY1ZGFmNzgyZTJlO
>>>   DYzYjRjYzFhOTlmNTQwYzI2M2QwM2U2MQ==
>>>
>>> Note: there's no need to quote the relation type here.
>>
>> here or throughout? they are quoted in the RFC 5988 examples.
>
> Not for single relation names that just use token characters.
>
>>    Link: <http://example.com/TheBook/chapter2>; rel="previous";
>>          title="previous chapter"
>>
>> or the MIME type in the type parameter doesn't need to be quoted?
>
> RFC 5988 allows it to be non-quoted, but I would advise against it as it
> violates the RFC 2616 grammar for tokens...
>
>>> Note: it's unfortunate that there doesn't seem to be a registered media
>>> type
>>> for torrent files.
>>
>> true. I'd fix that if I could. :)
>
> Go ahead. You can :-)
>
>>>   ...
>>>
>>>   Metalink resources include a Link header [RFC5988] to present a list
>>>   of mirrors in the response to a client request for the resource.
>>>   Metalink servers MUST include the cryptographic hash of a resource
>>>   via Instance Digests in HTTP [RFC3230].  Valid algorithms are found
>>>   in the IANA registry named "Hypertext Transfer Protocol (HTTP) Digest
>>>   Algorithm Values" at
>>>   http://www.iana.org/assignments/http-dig-alg/http-dig-alg.xhtml .
>>>
>>> Surplus whitespace. Maybe put the URI into angle brackets.
>>
>> I've also seen registries cited in the references. is that better?
>
> I think in-lining has a better chance to survive the RFC Editor changes (I
> believe they don't want to have the URIs in the document at all).
>
>> ...
>>>
>>> As the term "Etag policy" is important, it might make sense to introduce
>>> it
>>> more formally.
>>
>> "Metalink servers and their associated mirror servers SHOULD all share
>> the same ETag policy, meaning ETags are synchronized across servers,
>> i.e. byte-for-byte identical files will have the same ETag on all
>> servers. ETags could be based on the file contents (cryptographic
>> hash) and not server-unique filesystem metadata."
>> ...
>
> Maybe "To have the same ETag policy means..."...?
>
> Also, it appears that you require more than what's needed.
>
> For instance, why would it be a problem when byte-for-byte identical files
> have different ETags on the same server? I think what's relevant is the
> requirement for the mirror files, not any additional requirements for other
> files^h^h^h^h^hresources on the server.
>
>>>   ...
>>>
>>>   There are two types of mirror servers: preferred and normal.
>>>   Preferred mirror servers are HTTP mirror servers that MUST share the
>>>   same ETag policy as the originating Metalink server.  Preferred
>>>   mirrors make it possible to detect early on, before data is
>>>   transferred, if the file requested matches the desired file.
>>>
>>> Note: that also could be achieved by introducing a new conditional header
>>> for the digest, or by using the extension points in the WebDAV "If"
>>> header.
>>
>> I don't know much about WebDAV but I read section 10.4 of RFC 4918. I
>> still don't understand.
>> ...
>
> The "If" header can use arbitrary state tokens, and you could make the
> instance digests one of those. Theoretically.
>
>> we use If-Match in the examples. is that wrong?
>
> No, this was just a suggestion that you *could* define to make the request
> conditional based on the digest, in which case the dance about ETag policies
> wouldn't be needed.
>

So we could define a state token for the instance digest and use the "If"
header with it. Do you think it is better to use only the instance digest
and not use ETag at all?
In my opinion, the advantage of ETag in the current I-D is that it can be
used with the "If-Match" header to detect a mismatch early. With an "If"
header carrying the instance digest, we could use the instance digest for
this purpose too.
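
To make the early-detection point concrete, here is a minimal sketch of how
a client might build a conditional request for a mirror with "If-Match" and
interpret the status code. The function names are hypothetical and are not
taken from the draft or from any client; the ETag value is the one from the
example in section 1.2. No equivalent is shown for the "If" header, since
no state token syntax for instance digests has been defined.

```python
from typing import Dict, Optional, Tuple


def conditional_headers(etag: str,
                        byte_range: Optional[Tuple[int, int]] = None) -> Dict[str, str]:
    """Build request headers that ask a mirror to serve the resource only
    if its current ETag equals the one given by the Metalink server."""
    headers = {"If-Match": etag}
    if byte_range is not None:
        first, last = byte_range
        # Byte ranges are inclusive on both ends.
        headers["Range"] = f"bytes={first}-{last}"
    return headers


def mirror_matches(status: int) -> bool:
    """Interpret the mirror's response status.

    412 (Precondition Failed) means the ETag did not match, so the
    mismatch is detected before any body data is transferred.
    200 or 206 means the precondition held.
    """
    return status in (200, 206)


headers = conditional_headers('"thvDyvhfIqlvFe+A9MYgxAfm1q5="', (0, 1048575))
```

This only works if the mirror shares the ETag policy of the Metalink
server, which is why preferred mirrors are required to do so.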

>>>   ...
>>> Wow. This appears to ignore the overhead of Range requests on the
>>> *server*.
>>> Note that sometimes, content is not served directly from the filesystem,
>>> and
>>> implementing Range may not be possible using seeks. Now one could argue
>>> that
>>> servers suffering from the problem should not support this in the first
>>> place, but still...
>>>
>>> Given the file sizes for which parallel downloads make any sense today,
>>> is
>>> it *really* a good idea to recommend 10K segments?
>>
>> nope. how about removing the last 3 sentences & adding
>> "Note that Range requests impose an overhead on servers and clients
>> need to be aware of that and not abuse them. "
>
> +1
>

+1

Our Metalink client aria2 (http://aria2.sourceforge.net) uses a minimum
segment length of 1MB. Usually, though, it requests ranges much larger than
that to avoid reconnections and the overhead of HTTP headers.
Actually, even 1MB is too small for today's fast network connections.
A segment size that is too small just slows the download down.
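
As an illustration only, a client could enforce such a lower bound when
splitting a file into ranges roughly as follows. This is a sketch under
assumptions, not aria2's actual algorithm: the names plan_ranges and
MIN_SEGMENT are made up for this example, and 1MB is taken to mean
1024 * 1024 bytes.

```python
from typing import List, Tuple

MIN_SEGMENT = 1024 * 1024  # assumed reading of the 1MB minimum mentioned above


def plan_ranges(length: int, max_connections: int,
                min_segment: int = MIN_SEGMENT) -> List[Tuple[int, int]]:
    """Split a file of `length` bytes into inclusive (first, last) byte
    ranges, using at most `max_connections` ranges and never producing a
    range shorter than `min_segment` unless the whole file is shorter."""
    if length <= 0:
        return []
    # Fewer, larger segments when the file is small relative to min_segment.
    count = max(1, min(max_connections, length // min_segment))
    base, extra = divmod(length, count)
    ranges = []
    start = 0
    for i in range(count):
        # Spread the remainder over the first `extra` segments.
        size = base + (1 if i < extra else 0)
        ranges.append((start, start + size - 1))
        start += size
    return ranges


def range_header(first: int, last: int) -> str:
    """Format one planned range as an HTTP Range header value."""
    return f"bytes={first}-{last}"
```

With these assumptions a 10MB file and 4 connections gives four ranges of
2.5MB each, while a file smaller than 1MB is fetched in a single request.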

>> ...
>> yes :)
>>
>> thanks for the thorough review!
>> ...
>
> You're welcome.
>
> Best regards, Julian
>
>
Received on Wednesday, 19 January 2011 14:56:51 GMT