- From: Mike Brown <mike@skew.org>
- Date: Mon, 20 Sep 2004 22:28:20 -0600
- To: Paul Hoffman / VPNC <paul.hoffman@vpnc.org>
- Cc: uri@w3.org
Paul Hoffman / VPNC wrote:
> Please review the file URI draft and let me know if this is
> sufficient.

I have a few suggestions that I'm sure I'll be sorry I posted.

First, an easy one. In section 2, change all "URL scheme" to
"URI scheme".

Then, please give this extremely vague statement further consideration:

    "The file URL scheme is used to designate files accessible on a
    particular host computer."

Honestly, in this age of distributed filesystems, OS UIs that don't
distinguish between local and remote resources, and the like, I don't
even know for sure what a "file" is anymore, or exactly what kinds of
access you have in mind when you say "accessible". Lots of finite bit
sequences are accessible on a computer by a variety of means. I think
at best we can only say the file is "associated" with the host.

As for what makes a particular entity a "file", I don't know exactly.
A file may or may not exist on a particular physical storage medium
such as a disk or tape, and it may or may not be accessed via a
network, which makes even the "on a particular host computer" open to
boundless interpretation.

So how about we acknowledge the ambiguities with something like this...

    The file URI scheme is used to identify a "file" resource
    associated with a particular host computer. The scheme emerged
    when the term file was relatively well-understood as implying
    certain typical characteristics of a resource, such as being a
    finite bit sequence manipulated as a unit, stored on a relatively
    non-volatile storage device, organized with other files in a
    hierarchical or record-based "file system", and being "local" to
    a single physical computer by virtue of being stored on the
    computer's closely-attached physical storage devices and
    accessible primarily via the ordinary means native to file
    management on that host.

    The infusion of networking technologies into nearly every aspect
    of computing has since rendered such distinctions less and less
    relevant, so the file URI scheme likewise makes no attempt to
    imply a particular access mechanism or any other characteristics
    of the identified resource, aside from the fact that the resource
    is associated with a particular host; implementations of this
    scheme typically define their own concept of "file" in a manner
    that is appropriate for their platform.

And then continue on with the paragraph about poor interoperability.

Typos to fix in the 2nd paragraph: syntaxt, docoument

In the first paragraph, I think this can be safely deleted:

    This scheme, unlike most other URL schemes, does not designate a
    resource that is universally accessible over the Internet.

...the reason being, once you swap URL with URI, one must ask if
"universal accessibility over the Internet" is really implied by most
other resource identification schemes.

Any references to "access" should be scrutinized, now that we
distinguish between resource identification and resource
representation retrieval.

The thought that went into rfc2396bis's careful avoidance of requiring
specific semantics in the authority component should be applied here
(e.g. from the point of view of the syntax, a hostname doesn't *have*
to be DNS based nor does it even necessarily rely on the idea that
there's a network involved).

And maybe change this...

    A file URL takes the form:

        file://<host>/<path>

    where <host> is the fully qualified domain name of the system on
    which the <path> is accessible, and <path> is a hierarchical
    directory path of the form <directory>/<directory>/.../<name>.

...to something like this?

    Any URI having a scheme component consisting of "file",
    case-insensitively, is a file URI.

    A file URI usually takes the form:

        file://<host><abs-path>

    where <host> matches the host syntax rule from the rfc2396bis
    grammar and is usually either empty, "localhost", or a fully
    qualified domain name for the host to which <path> applies; and
    where <abs-path> matches either the path-abempty or
    path-absolute syntax rule from the rfc2396bis grammar and whose
    nonterminal segments represent "directories" or "folders" in a
    hierarchical file system. A file URI is not restricted to this
    syntax or interpretation, however.

I'd also want to go one further and make this important statement,
even if it is redundant by virtue of the fact that what's not in the
spec doesn't need to be pointed out as not being in the spec:

    This standard does not mandate any particular mapping between the
    components of a file URI and the file itself, nor any means of
    accessing the file.

The consequences of this could then be discussed:

    Consequently, no single component of a URI alone is necessarily
    an unambiguous identifier of anything; the host in a file URI may
    or may not directly correlate to the actual host associated with
    the file, and the path in a file URI may or may not directly
    correlate to a file system's mechanisms for file identification,
    be it a file system path, inode number, or other.

    Of course, it is customary for there to be no surprises; the host
    component usually identifies the actual host, and the path
    component usually bears some resemblance to the file's path on
    that host's hierarchical file system.

    Accordingly, producers and consumers of file URIs should document
    their expectations and what explicit mappings they assume between
    file URI components and file system-specific identifiers.
    Implementations that use file URIs for resource representation
    retrieval should document what access mechanisms are supported.

I think I'll stop there. :)

-Mike
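
To make the proposed syntax concrete, here is a minimal sketch in Python of how a consumer might split a file URI into its host and path components and note which of the three usual host forms it carries (empty, "localhost", or some other name). The function name describe_file_uri and the sample host names are hypothetical and are not taken from the draft. The sketch only separates components; in keeping with the suggestion above, it assumes no mapping from those components to an actual file and no access mechanism.

```python
from urllib.parse import urlsplit


def describe_file_uri(uri):
    """Split a file URI into host and path without interpreting either.

    Hypothetical helper for illustration only. It says nothing about
    where the file lives or how it could be retrieved.
    """
    parts = urlsplit(uri)
    # urlsplit() lowercases the scheme, so "FILE:" and "file:" both match.
    if parts.scheme != "file":
        raise ValueError("not a file URI: %r" % (uri,))

    # hostname is None when the authority is absent or empty.
    host = parts.hostname or ""
    if host == "":
        host_kind = "empty"
    elif host == "localhost":
        host_kind = "localhost"
    else:
        # Could be a fully qualified domain name, but the syntax alone
        # does not require the name to be DNS-based.
        host_kind = "named"

    return {"host": host, "host_kind": host_kind, "path": parts.path}


if __name__ == "__main__":
    for example in ("FILE:///etc/hosts",
                    "file://localhost/etc/hosts",
                    "file://fs.example.org/pub/readme.txt"):
        print(example, describe_file_uri(example))
```

What a consumer does with the returned host and path is exactly the part the suggested wording leaves to each implementation to document.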
Received on Tuesday, 21 September 2004 04:28:20 UTC