- From: Silvia Pfeiffer <silviapfeiffer1@gmail.com>
- Date: Wed, 15 Oct 2008 22:36:53 +1100
- To: "Media Fragment" <public-media-fragment@w3.org>
Hi all,

This is to address my ACTION item on how far we can go with URI fragment specifications and the communication between users, user agent, proxies, and origin server. I've had to do some reading up on the URI standard, on how proxies work, and on HTTP headers, so I hope we can have an interesting discussion about this next week in Cannes.

Let me start by listing the side conditions under which I believe we are working:

1. The URI specification is fixed and we should work within its boundaries.
2. If possible, we should avoid requiring any changes to the software that runs on Web proxies.
3. If possible, we should not require more than one set of changes to user agents, and these changes should be independent of the media type.
4. Since the origin server needs to implement heavy handling of media files to deliver media fragments, and these changes are already dependent on the media type, we should try to focus all necessary changes to the resource delivery chain over HTTP at the origin server end.
5. Since we are trying to accommodate all types of media resources, our model of a media resource needs to be as generic as possible, and we need to assume there is a unique map of time ranges to byte ranges (the map may however be surjective in the mathematical sense, http://mathworld.wolfram.com/Surjection.html).

I have tried to address 5. through the media resource description given at http://www.w3.org/2008/WebVideo/Fragments/wiki/Glossary#Video_Resource . As for the rest, I am working with the following model of a user/UA - proxy - origin server communication: http://www.w3.org/2008/WebVideo/Fragments/wiki/Image:Http_sequence.jpg . Can you check if you agree with that model?

Now to the actual communication. When I originally said that we cannot use URI fragments, I was referring to the fact that they are not guaranteed to go beyond the User Agent. This means we can use them on links 1 and 9 (between user and UA), e.g. for browser history purposes, but we cannot rely on them to exist anywhere else in the communication.

Incidentally, I have had a long discussion with my colleague John Ferlito, who is a network guru, and he reckons we should avoid using both query ("?") and fragment ("#"). The reason is that both are already being used massively around the Web and we may break some existing Web resources in this way, in particular with query ("?").

Even if we are trying to use them only for specific media types, the problem is that it is impossible to tell from a URI what the media type is (e.g. http://example.com/resource#t=50-70 - how do you tell this is a video?) - only the server knows it and can communicate it. Therefore, the UA will always have to apply the fragment ("#") to the resource only after it has received the resource - unless it generically puts the media fragment request into another place in the request, namely into HTTP headers.

The thinking here goes along similar lines to what we discussed for temporal URIs at http://annodex.net/TR/draft-pfeiffer-temporal-fragments-03.txt (search for byte range). The idea is that if we don't want to change the way in which Web proxies work, we have to work within their given resource fragment caching functionality. The simple fact is that byte ranges are the only way for Web proxies to deal with subparts of a Web resource. And since the Web server is the only one that can determine the byte range, we need a four-way handshake protocol to make media fragment URI requests cacheable over HTTP. In the first path, the UA is told which byte range to request, and in the second path the UA can request the byte range that corresponds to the requested time range.

Now, we can choose URI fragments ("#") to specify the time segment(s) to the UA, or we can use any one of the reserved delimiting characters from the URI spec (i.e. sub-delims = "!" / "$" / "&" / "'" / "(" / ")" / "*" / "+" / "," / ";" / "=").
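
To make the splitting step concrete, here is a minimal Python sketch of how a UA might separate a time specification from the rest of the URI before issuing a request. The function name `split_time_fragment`, the `t=` key, and the `start-end` syntax with comma-separated ranges are assumptions for illustration only; nothing here is a decided syntax. The `urlsplit` call simply shows that a generic URI parser already treats "#..." as a component separate from the path that is sent to the server.

```python
from urllib.parse import urlsplit


def split_time_fragment(uri, delimiter="#"):
    """Split a URI into (resource, [(start, end), ...]).

    Hypothetical syntax: <resource><delimiter>t=<start>-<end>[,<start>-<end>...]
    The delimiter is a parameter so that "#" and ";" can be compared.
    """
    resource, sep, spec = uri.partition(delimiter + "t=")
    if not sep:
        # No time specification present: the whole URI is the resource.
        return uri, []
    ranges = []
    for part in spec.split(","):
        start, end = part.split("-")
        ranges.append((float(start), float(end)))
    return resource, ranges


if __name__ == "__main__":
    uri = "http://example.com/resource#t=50-70"
    # A generic parser keeps the fragment apart from the path.
    parts = urlsplit(uri)
    print(parts.path, parts.fragment)
    print(split_time_fragment(uri))
    # The same time spec behind ";" instead of "#" (hypothetical URI).
    print(split_time_fragment("http://example.com/resource;t=50-70", ";"))
```

Whichever delimiter is chosen, the UA ends up with the same two pieces: a plain resource URI that proxies already understand, and a list of time ranges that has to travel some other way.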
In essence the choice boils down to the probability of clashing with somebody else's already defined URI scheme. It may be an idea to ask Google to do us a special search on all the URIs they have stored and check if whatever scheme we come up with clashes with an existing URI scheme. In general I agree with others: "#" and ";" probably make the most sense. So, should we decide on something like http://example.com/resource#t=12-30,50-90 then we might want to ask Google for all the URIs it can find that have "#t=" in them.

Anyway .. moving on to the four-way handshake. The way it is specified in the temporal URI spec is one way to do it - it requires a resource redirection by the origin server such that the second path accesses the correct resource. After having thought about time range requests, John and I came up with the following alternative (explained using the example of an Ogg video resource):

Initial request from a user in a Web browser:

  User -> UA (1):
    http://example.com/resource.ogv;t=20-30

The UA chops off the fragment and turns it into an HTTP GET request with a time range header (which can incidentally also be cached by a proxy):

  UA -> Proxy (2) -> Origin Server (3):
    GET http://example.com/resource.ogv
    Range: time 20-30

The Origin Server converts the time range to a byte range and puts all additional data that cannot be cached, but is required by the UA to receive a fully functional media resource, into the HTTP response:

  Origin Server -> Proxy (7) -> UA (8):
    RESPONSE 200
    <...ogg header + skeleton...>
    Content-Range: time 20-30
    Content-Type: video/ogg; codecs=theora,vorbis
    Time-Range: bytes 50000-200000/FILESIZE (this is a new HTTP header)

The UA buffers the data it receives for hand-over to the media subsystem.
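
The origin server's part of this first exchange can be sketched in Python as follows. This is only an illustration under stated assumptions: the seek index `SEEK_INDEX`, the value of `FILESIZE`, and the function names are all made up, and a real server would derive the mapping from the media file itself (e.g. from Ogg skeleton data) in a media-type-specific way. The `Range: time ...` unit and the `Time-Range` header are the proposals from this mail, not existing HTTP features.

```python
import bisect

# Hypothetical seek index: (time in seconds, byte offset where that time starts).
SEEK_INDEX = [(0, 5000), (10, 30000), (20, 50000), (30, 200001), (40, 260000)]
FILESIZE = 300000  # hypothetical stand-in for the FILESIZE placeholder


def time_to_byte_range(start, end, index=SEEK_INDEX, filesize=FILESIZE):
    """Map a time range to an inclusive byte range using the seek index."""
    times = [t for t, _ in index]
    # Last index point at or before the start time.
    lo = max(bisect.bisect_right(times, start) - 1, 0)
    # First index point at or after the end time.
    hi = bisect.bisect_left(times, end)
    first = index[lo][1]
    # Byte ranges are inclusive, so stop one byte before the next index point.
    last = index[hi][1] - 1 if hi < len(index) else filesize - 1
    return first, last


def first_response_headers(range_value):
    """Build the headers of the first response from a 'time 20-30' Range value."""
    unit, _, spec = range_value.partition(" ")
    if unit != "time":
        raise ValueError("expected a time range, got unit %r" % unit)
    start, end = (float(x) for x in spec.split("-"))
    first, last = time_to_byte_range(start, end)
    return {
        "Content-Range": "time %s" % spec,
        "Content-Type": "video/ogg; codecs=theora,vorbis",
        "Time-Range": "bytes %d-%d/%d" % (first, last, FILESIZE),
    }


if __name__ == "__main__":
    for name, value in first_response_headers("time 20-30").items():
        print("%s: %s" % (name, value))
```

With the made-up index above, a request for `time 20-30` yields `Time-Range: bytes 50000-200000/300000`, matching the shape of the response in the example exchange.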
The UA then proceeds to put the actual fragment request through:

  UA -> Proxy (2) -> Origin Server (3):
    GET http://example.com/resource.ogv
    Range: bytes 50000-200000

The Origin Server puts the data together and sends it to the UA:

  Origin Server -> Proxy (7) -> UA (8):
    RESPONSE 200
    <... bytes of video data ...>
    Content-Range: bytes 50000-200000/FILESIZE

The UA hands over the header and video data to the media subsystem, which displays it to the user (9).

If we want to make media fragment resources cacheable on the Web, we don't have many choices. We can however optimise the process for specific media types, e.g. for the QuickTime streams that Dave Singer talked about. I can't however see a way to avoid a four-way handshake at least once per resource.

I think we will have a nice discussion next week. :-)

Cheers,
Silvia.
Received on Wednesday, 15 October 2008 11:37:30 UTC