
Re: more 'file' suggestions for draft-hoffman-file-uri

From: Mike Brown <mike@skew.org>
Date: Tue, 21 Sep 2004 20:06:00 -0600 (MDT)
Message-Id: <200409220206.i8M260ZX071767@chilled.skew.org>
To: uri@w3.org

Paul Hoffman / VPNC wrote:
> ...I believe that most people would say that the morass of 
> conflicting ad-hoc interpretations had more to do with the 
> understatement of the syntax, not the semantics of "what is a file".

OK, I agree. If trying to establish a standard interpretation is a goal, then 
of course the syntax should be better stated. I guess I am inclined to plug 
holes that aren't leaking.

I think a better statement of syntax for purposes of establishing a standard 
interpretation involves both

 1. a clear expression of the lexical aspects of a file URI, and

 2. a clear expression of the semantic aspects of the
    URI components: what do they represent / "mean"?

The lexical syntax shouldn't even be in question. It will be inherited from 
rfc2396bis. We don't need to know what a 'file' is in order to define this 
aspect of the syntax, and in fact, having an example of what is "usually" in a 
file URI just confuses things and IMHO can be excised from the spec entirely,
saving us all a lot of headache.

As for the semantic aspects of the syntax, maybe you're right, maybe it 
doesn't matter what a 'file' is there, either, because at that level we only 
care about the file identification conventions on the various file systems
that are out there.


So, how does this sound?

- A file URI represents a file that is associated with a host.

- The syntax of a file URI is that of absolute-URI, except that
  its scheme component must be 'file', case-insensitively.

- The *typical* syntax of a file URI is more restrictive
  (no query component, authority is usually empty,
   path usually starts with "/").

- The authority component of a file URI is considered by this
  specification to contain a host component exactly as defined
  by the rfc2396bis grammar. (I don't want there to be any
  ambiguity about what the "host component" is.)

- The host component of a file URI represents the host
  associated with the identified file.

- The path component of a file URI represents an identifier for the file
  as would be used in the host's principal file system interface
  (i.e., the path component of a file URI usually represents a file's
  "local path" on the host's file system). "File system interface" is
  assumed to be a well-understood concept.

- Other components of a file URI, if defined, are not defined
  by this standard as necessarily representing anything in particular,
  but they do contribute to the identification of the file represented
  by the URI. Thus, a query component present in a file URI may or may
  not affect how the URI is dereferenced on a particular platform,
  but even when it does not affect anything, it cannot be assumed,
  in the absence of a standard stating otherwise, that a file URI with
  a query component is equivalent to a file URI without one.

- The manner in which a host component represents a host is
  this: If the component is empty or is "localhost" (what if it is
  the percent-encoded equivalent of "localhost"?), the component
  represents the host on which the URI is being interpreted. No
  guidelines are given for the interpretation of any other values;
  they may take the form of IP addresses, DNS names, or any other
  identifier. No guidelines are given for how to dereference such
  identifiers (hey, I'm just describing current practice).

- The manner in which a path component represents a file
  identifier in a file system interface is this: If the file
  system interface implies hierarchical containment, then...
  (and you can go on to whatever level of detail you want)
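
To make the points above concrete, here is a rough sketch in Python of how a consumer might apply them. It is only an illustration under stated assumptions, not proposed spec text: the names describe_file_uri, may_assume_equivalent and LOCAL_HOSTS are hypothetical, the standard library's urllib.parse.urlsplit stands in for the rfc2396bis grammar, and the host is compared literally, so the percent-encoded "localhost" question is left open.

```python
from urllib.parse import urlsplit

# Host values the proposal treats as "the host on which the URI is
# being interpreted". Percent-encoded forms are deliberately not handled.
LOCAL_HOSTS = {"", "localhost"}


def describe_file_uri(uri):
    """Hypothetical helper: split a file URI into the components discussed."""
    parts = urlsplit(uri)
    # urlsplit lowercases the scheme, so this check is case-insensitive.
    if parts.scheme != "file":
        raise ValueError(f"not a file URI: {uri!r}")
    # hostname is None when the authority is empty or absent.
    host = parts.hostname or ""
    return {
        "host": host,
        "is_local": host in LOCAL_HOSTS,
        # The path is passed through untouched; how it maps onto a file
        # system interface is platform-specific and not modeled here.
        "path": parts.path,
        "query": parts.query,
    }


def may_assume_equivalent(a, b):
    """Conservative check: any differing component, including the query,
    means equivalence cannot be assumed."""
    return describe_file_uri(a) == describe_file_uri(b)
```

With this sketch, "FILE:///etc/hosts" and "file://localhost/etc/hosts" are both reported as local, "file://example.org/share/a.txt" is reported with host "example.org" and no further interpretation, and "file:///a.txt" is not assumed equivalent to "file:///a.txt?v=1".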

Thoughts?
Received on Wednesday, 22 September 2004 02:05:58 GMT
