URL syntax to return byte ranges from files

I'm working on a (commercial) project to allow random access into files
stored on the Web.  We have it all working, using a simple CGI program that
takes a filename and byte range as arguments and returns the correct block
of data from the file.  Before shipping, however, I'd like to get group
input on the syntax we choose, with an eye towards keeping it cleanly
extensible and interoperable.

Executive Summary:
  Generalize support for WN's byte range syntax, ";bytes=<start>-<end>".
  See http://hopf.math.nwu.edu/docs/range.html for more info.

All comments and feedback are welcome.  In particular, if anyone knows of
any other already-deployed 'standards', or a more appropriate forum for
discussion, please say so (and/or forward this message).

Thanks for your time,
  dG

David Glazer
dglazer@best.com
  
===========================================================================
Problem statement:
  Define a URL syntax to return a specified byte range from a specified
  file living on the server.  Servers should be able to support the syntax
  via either a built-in extension or a CGI script.

Implementation tradeoffs (CGI vs. built-in):

 Option 1: Use a CGI script
     Pros: simple to implement - doesn't require any server code changes
           provides the same syntax on all servers that support CGI
     Cons: requires an extra system call for each block requested
           requires the server to have CGI execution turned on
           requires a copy of the CGI script in each served directory
             (to avoid some scary security holes)
           only works with physical files located on the server's filesystem

 Option 2: Build support into the server
     Pros: more efficient
           easier to administer
           leverages all server name-mapping and security features
     Cons: requires agreement on syntax to be portable across servers
           even after agreement, newly modified servers need to be deployed

 Conclusion:
    Both options make sense and should be supported, ideally with a single
    URL syntax.  Hopefully, the CGI version will become less and less
    necessary as updated servers are rolled out.
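
    To make Option 1 concrete, here is a rough Python sketch of the core of
    such a CGI script, using the ";bytes=<start>-<end>" syntax from the
    Executive Summary.  It is not our shipping code.  The function name
    read_block and the rule of rejecting any filename that contains a path
    separator are assumptions made for illustration; the rule is one way to
    keep the script confined to its own directory.

```python
import os
import re

# Assumed query form: "<filename>;bytes=<start>-<end>".  The filename may not
# contain a path separator, so requests cannot escape the script's directory.
_QUERY_RE = re.compile(r"^(?P<name>[^/\\;]+);bytes=(?P<start>\d+)-(?P<end>\d+)$")


def read_block(query, directory):
    """Return the requested inclusive byte range of a file in `directory`."""
    m = _QUERY_RE.match(query)
    if m is None or m.group("name") in (".", ".."):
        raise ValueError("malformed or unsafe request")
    start, end = int(m.group("start")), int(m.group("end"))
    if end < start:
        raise ValueError("end offset precedes start offset")
    path = os.path.join(directory, m.group("name"))
    with open(path, "rb") as f:
        f.seek(start)
        # Offsets are inclusive, so the block is end - start + 1 bytes long.
        return f.read(end - start + 1)
```

    A real script would pass in the query string supplied by the server
    (normally the QUERY_STRING environment variable) and write the result
    to standard output after a Content-type header.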

Proposed Syntax:
  Given a base URL that identifies a document, append a modifier string to
  select a range.  The syntax of that string is ";bytes=<start>-<end>",
  where <start> and <end> are inclusive byte offsets.  The base URL can
  either be the normal document URL (for servers with built-in support) or
  a CGI URL.
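
  As an illustration only, the modifier could be split off a URL roughly
  as follows.  The function names split_byte_range and slice_inclusive are
  made up for this sketch, and it assumes the modifier is always the last
  thing in the URL.

```python
import re

# Assumption: the ";bytes=<start>-<end>" modifier ends the URL.
_RANGE_RE = re.compile(r"^(?P<base>.*);bytes=(?P<start>\d+)-(?P<end>\d+)$")


def split_byte_range(url):
    """Return (base_url, start, end), or (url, None, None) if no modifier."""
    m = _RANGE_RE.match(url)
    if m is None:
        return url, None, None
    start, end = int(m.group("start")), int(m.group("end"))
    if end < start:
        raise ValueError("end offset precedes start offset")
    return m.group("base"), start, end


def slice_inclusive(data, start, end):
    """Both offsets are inclusive, so the slice covers end - start + 1 bytes."""
    return data[start:end + 1]
```

  The same function handles both forms of base URL, since the CGI form
  simply carries the filename after the "?".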

Example:
  The file foo.doc is available via URL http://www.a.com/docs/foo.doc
  We want to get 512 bytes from foo.doc, starting at offset 1024.
  If the server has built-in support, the URL would be
    http://www.a.com/docs/foo.doc;bytes=1024-1535
  If using a CGI script (installed in foo.doc's directory), the URL would be
    http://www.a.com/docs/my.cgi?foo.doc;bytes=1024-1535
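
  Since the offsets are inclusive, the ending offset is start + length - 1,
  which is where 1535 comes from above.  A client-side helper might look
  something like this (range_url is a hypothetical name, not part of any
  existing library):

```python
def range_url(base_url, offset, length):
    """Append ";bytes=<start>-<end>" for `length` bytes starting at `offset`."""
    if offset < 0 or length < 1:
        raise ValueError("offset must be >= 0 and length must be >= 1")
    end = offset + length - 1  # inclusive ending offset
    return f"{base_url};bytes={offset}-{end}"


# The two forms from the example: built-in support, then the CGI script.
builtin = range_url("http://www.a.com/docs/foo.doc", 1024, 512)
via_cgi = range_url("http://www.a.com/docs/my.cgi?foo.doc", 1024, 512)
```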

Note on Syntax:
  We considered several alternative syntaxes, such as:
     http://www.a.com/docs/my.cgi?foo.doc;bytes=1024+512
     http://www.a.com/docs/my.cgi?foo.doc+1024+1535
     http://www.a.com/docs/foo.doc?1024+512

  They differ mainly in punctuation and in whether they supply a length
  instead of an ending offset.  I don't see any intrinsic benefit to most
  of the choices, so the facts that WN is shipping with a syntax, and that
  the same syntax can also be used cleanly in CGI scripts, carry the day.
===========================================================================

Received on Tuesday, 14 March 1995 15:45:30 UTC