Re: The getcontentlength property from Werner Donné on 2007-06-23 (w3c-dist-auth@w3.org from April to June 2007)

From: Werner Donné <werner.donne@re.be>
Date: Sat, 23 Jun 2007 11:10:48 +0200
To: "Mr. Demeanour" <mrdemeanour@jackpot.uk.net>
Cc: w3c-dist-auth@w3.org
Message-ID: <467CE398.8050505@re.be>
Mr. Demeanour wrote:
> 
> Werner Donné wrote:
>>
>> Julian Reschke wrote:
>>> Werner Donné wrote:
>>>>
>>>> Hi,
>>>>
>>>> RFC 2518 says that this property must be returned if the 
>>>> Content-Length header is returned in a GET response for the
>>>> resource at hand. I assume that it is allowed to return this
>>>> property if the Content-Length header is never returned, for
>>>> example, because the chunked transfer encoding is used.
>>>
>>> Yes.
>>>
>>>> Can a client rely on a previously retrieved getcontentlength to
>>>> read only part of the chunked response stream of the following
>>>> GET response?
>>>
>>> If it has reason to believe that the entity didn't change
>>> (ETag...), then yes.
>>>
>>>> The particular case I have is where I detect the client is on a
>>>> Windows machine and is retrieving a text/plain resource. In that
>>>> case carriage returns may be added on the fly. This increases the
>>>> physical length of the resource.
>>>
>>> By doing this you're serving two different variants, depending on
>>> some part of the request ("User-Agent" header?)? So you'll have to
>>> specify "Vary" header in the response, and assign different entity
>>> tags. In that case a properly written client shouldn't have any
>>> problems because it can detect that the entity served by the second
>>> request is really different.
>>
>> Yes, the "User-Agent" header was used.
>>
>> The problem is when the resource is going to be fetched for the first
>> time by the client. It does a PROPFIND to retrieve the contents
>> of the parent collection and then a GET. The client I'm trying it
>> with doesn't ask for the "getetag" property in the PROPFIND, so
>> whatever ETag is returned, it can't draw any conclusions from it. It just
>> uses the "getcontenttype" property to decide how many bytes to read,
>> even if the response stream is chunked.
> 
> I take it you meant "uses the getcontentlength property to decide".

Indeed. Sorry for the confusion.

> 
> If this client is going to ignore the HTTP 1.1 protocol, and instead
> decides to read from the wire the number of bytes given in the response
> to an earlier request, then I wouldn't expect it to work at all with a
> chunked stream; because the length of the stream depends on the
> transfer-encoding; and in particular, chunked encoding changes the
> number of bytes that comprise the stream.
> 
> I must say, I'm rather puzzled by this question; the whole point of the
> chunked encoding is that it is suited to transferring content for which
> it is inconvenient to compute a length in advance. If you can know the
> length in advance, can't you just get your server to transfer unencoded
> data, and use a Content-Length: header?
> 
> I presume I've missed the point.
> 
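For reference, the remark that chunked encoding changes the number of bytes on the wire can be illustrated with a minimal Python sketch. The function names and the chunk size of 16 are assumptions made for this example only; they are not taken from any particular server or client. It ignores trailers, which a real implementation would have to handle.

```python
def encode_chunked(body: bytes, chunk_size: int = 16) -> bytes:
    """Frame a body using HTTP/1.1 chunked transfer coding (no trailers)."""
    out = bytearray()
    for start in range(0, len(body), chunk_size):
        chunk = body[start:start + chunk_size]
        # Each chunk: size in hex, CRLF, the data, CRLF.
        out += f"{len(chunk):x}\r\n".encode("ascii") + chunk + b"\r\n"
    out += b"0\r\n\r\n"  # zero-size last chunk, then the empty trailer section
    return bytes(out)


def decode_chunked(wire: bytes) -> bytes:
    """Recover the body from a chunked stream (trailers are ignored)."""
    body = bytearray()
    pos = 0
    while True:
        eol = wire.index(b"\r\n", pos)
        # Drop any chunk extensions after ';' before parsing the hex size.
        size = int(wire[pos:eol].split(b";")[0], 16)
        pos = eol + 2
        if size == 0:
            break
        body += wire[pos:pos + size]
        pos += size + 2  # skip the chunk data and its trailing CRLF
    return bytes(body)


body = b"abcd\r\n" * 10       # a 60-byte entity body
wire = encode_chunked(body)   # longer than 60 bytes because of the framing
```

The decoded length is what a client should compare against any length it knows about; the wire length depends on how the sender happens to split the chunks.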

This client does seem to read the chunked stream correctly, but after
reading it, it only retains as many bytes as indicated by the
"getcontentlength" property, which was fetched earlier. So if I have a
text/plain resource, for example, with ten lines of five characters each
and I add a CR to each of the lines, the last ten characters will not
be returned. The "getcontentlength" property returned 50, but because of
the CRs the response is 60 bytes long (after decoding the chunked
transfer encoding, that is). The result I see is that the returned file
is cut off.
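
To make the arithmetic concrete, here is a minimal Python sketch of the truncation. The function name and the sample data are made up for illustration, and it assumes that the LF counts as one of the five characters on each line; it is not the actual server or client code.

```python
def add_carriage_returns(body: bytes) -> bytes:
    """Turn bare LF line endings into CRLF (assumes the input has no CRs yet)."""
    return body.replace(b"\n", b"\r\n")


# Ten stored lines of five bytes each (four characters plus LF).
stored = b"abcd\n" * 10
getcontentlength = len(stored)         # the value PROPFIND would report: 50

# Entity body the client sees after decoding the chunked transfer encoding.
served = add_carriage_returns(stored)  # 60 bytes

# What the client appears to do: keep only getcontentlength bytes.
kept = served[:getcontentlength]
lost = served[getcontentlength:]

print(len(stored), len(served), len(kept), len(lost))  # 50 60 50 10
```

The ten bytes in `lost` are the tail of the file, which matches the cut-off I observe.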

Werner.
-- 
Werner Donné  --  Re
Engelbeekstraat 8
B-3300 Tienen
tel: (+32) 486 425803	e-mail: werner.donne@re.be
Received on Saturday, 23 June 2007 09:07:20 UTC