Re: file: URI scheme from Mike Brown on 2004-10-11 (uri@w3.org from October 2004)

From: Mike Brown <mike@skew.org>
Date: Sun, 10 Oct 2004 21:52:06 -0600 (MDT)
To: Kitchen Pages <jrobinson@kitchenpages.com>
CC: uri@w3.org
Message-Id: <200410110352.i9B3q6hm006303@chilled.skew.org>
Jason-

A number of your points have been topics of discussion already.
I'll address and expand on those here.

>    This scheme, unlike most other URL schemes, does not designate
>    a resource that is universally accessible over the Internet.
> 
> *I am just a little un-easy about the 'universally' bit

Everyone seems to agree that the phrase needs work. Roy came up with a more
accurate way to express the conceptual incongruity between 'file' URIs and
URIs of other schemes. See my summary of the discussion in the first part of
http://lists.w3.org/Archives/Public/uri/2004Sep/0075.html
(up through the quoted text from Roy).

> *URL? vs URI?

I assume "URL" was just left over from the previous draft; I'm sure Paul will 
fix it in the next version.

>   Further, implementers on a single platform have often disagreed
>   on the syntax to use for a particular filesystem.
>
> *I kind of dislike the use of syntax because it relates to part of grammar
> in treating of arrangement of words in a sentence (meaning of n)

I don't feel strongly about it, but would you feel better if it were rephrased 
to something like "Further, implementations on a single platform often differ 
in how they relate file system paths to file URI components."?

>    A file URL takes the form:
> 
>    file://<host>/<path>
> 
>    where <host> is the fully qualified domain name of the system on
>    which the <path> is accessible, and <path> is a hierarchical
>    directory path of the form <directory>/<directory>/.../<name>.
> 
> *I think the <path> should be more over related to 'local service' 
> as in if DNS is accessible and the 'local' system is then resolved 
> to a FQDN; only then is the <path> is accessible as a FQDN URI with
> permissions (share/user/both) or other security settings and even 
> symbolic links on linux/unix**

I don't fully grasp what you're getting at, but I think it's safe to leave 
those qualifications in the realm of implementation-defined behavior.

As for how to describe <path>, as I mentioned before, I feel strongly that we 
need to clearly separate the statements we make about what is being 
represented by a file URI versus what lexical syntax rules apply in the file 
URI itself. Currently it's all mixed up and I think this is part of the
problem.

>    Some systems allow URLs to point to directories.  In this case, there
>    is usually (but not always) a terminating "/" character, such as in:
> 
>    file://usr/local/bin/
> 
> 
> *file is for file as defined in this document.  I do not see how this can
> make a jump to explain how a linux path is known as a file as any such
> uri like the above in my view should be treated as a directory booting
> up another interface to handle the requests (IE swaps to explorer mode).

Hmm. I was about to argue that there is a well-known concept of what a
hierarchical 'file system' embodies. Part of that concept is a containment
structure in which certain files are actually 'directories' that have a
parent-child relationship with other directories and files and that, unlike
regular files, happen NOT to have a well-defined byte stream that can serve
as their retrievable representation. Neither of those properties precludes
calling directories Resources that can be Uniformly Identified by a 'file'
URI. However, perhaps it is worth explicitly mentioning something to this
effect.

Your other concern, about terminating slashes, is best explained in two
parts. For purposes of identification, a trailing slash may or may not be
necessary: some file systems can have both a directory and a file with the
same name, in which case the only way to distinguish between them with a
straightforwardly mapped URI is via a trailing slash. For resolution of
relative references occurring in the representation of the directory
resource, however, the trailing slash is necessary; hence the common
practice of HTTP servers issuing a redirect to force the browser to append a
trailing slash.
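
As an illustration only, here is a minimal Python sketch of why the trailing
slash matters for relative reference resolution. It assumes the standard
library's urllib.parse.urljoin as the resolver; the relative reference 'ls'
is a made-up example, not something taken from the draft.

```python
from urllib.parse import urljoin

# The same directory, identified without and with a terminating '/'.
without_slash = "file:///usr/local/bin"
with_slash = "file:///usr/local/bin/"

# A relative reference as it might appear in a representation of the
# directory (hypothetical example).
relative_ref = "ls"

# Without the slash, the last segment 'bin' is treated as a file name and
# is replaced by the relative reference.
resolved_without = urljoin(without_slash, relative_ref)

# With the slash, the reference is resolved inside the directory.
resolved_with = urljoin(with_slash, relative_ref)

print(resolved_without)  # file:///usr/local/ls
print(resolved_with)     # file:///usr/local/bin/ls
```

Only the base with the terminating '/' yields a URI inside the directory,
which is the behavior the HTTP redirect is meant to guarantee.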

How about this:

"A file URI may identify a special kind of file that is actually a directory 
or other containment resource (e.g. a 'folder') in a file system. A file URI
that represents a directory usually, but not always, has a terminating '/'
character, such as in

  file:///usr/local/bin/  [note I added a slash so 'usr' is not a host]

The use of a terminating '/' is recommended in the canonical form of a URI 
representing a directory, in order to disambiguate a directory from a file of 
the same name, and for the proper resolution of relative references that might 
appear in a representation as obtained by dereferencing the URI. If a file URI 
without a terminating '/' is dereferenced and the resource is found to be a 
directory, implementations should append a '/', if possible."

I'm not especially thrilled about the ending there, so if anyone has any other
suggestions, please speak up.
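
To make that last sentence concrete, here is one possible reading of "append
a '/', if possible", sketched in Python. This is an assumption about how an
implementation might behave, not a proposal for the draft: the function name
with_directory_slash is hypothetical, and it only handles URIs whose host is
'' or 'localhost'.

```python
import os
import tempfile
from pathlib import Path
from urllib.parse import urlsplit, urlunsplit
from urllib.request import url2pathname


def with_directory_slash(uri):
    """Return uri with a terminating '/' if it names a local directory.

    Hypothetical helper: anything that is not a local file URI, or that
    already ends in '/', is returned unchanged.
    """
    parts = urlsplit(uri)
    if parts.scheme != "file" or parts.netloc not in ("", "localhost"):
        return uri
    if parts.path.endswith("/"):
        return uri
    # Map the URI path to a local file system path and check what it is.
    if os.path.isdir(url2pathname(parts.path)):
        return urlunsplit(parts._replace(path=parts.path + "/"))
    return uri


# Usage: a freshly created temporary directory gains a terminating '/'.
with tempfile.TemporaryDirectory() as tmp:
    dir_uri = Path(tmp).resolve().as_uri()
    print(dir_uri, "->", with_directory_slash(dir_uri))
```

Whether an implementation should rewrite the URI at all, or merely treat the
two forms as equivalent when dereferencing, is exactly the open question.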

> Perhaps adding more URI/URL examples or a table to the
> draft could solve this issue without it becoming one; even
> if its a history list - its just a something to work from**

I think the more documentation, the better, but I also feel that How To 
Interpret A File URI involves just

1. expressing that a file URI represents a resource (file, directory, pipe, 
symlink, device node...) in a file system, by representing (a) the host 
associated with that resource (by a name, '', or 'localhost'), and (b) the 
resource's name (path) in the file system;

and

2. expressing the relationship between the components of a file URI and the 
components of a file system path.
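
The two points above can be sketched roughly as follows, in Python. This is
illustrative only: interpret_file_uri and is_local are hypothetical names,
the parsing is delegated to the standard library's urllib.parse, and
treating '' and 'localhost' as "the local host" is an assumption made for
the example.

```python
from urllib.parse import unquote, urlsplit


def interpret_file_uri(uri):
    """Split a file URI into (host, path). Hypothetical helper."""
    parts = urlsplit(uri)
    if parts.scheme != "file":
        raise ValueError("not a file URI: " + uri)
    # (a) the host associated with the resource: a name, '', or 'localhost'
    host = parts.hostname or ""
    # (b) the resource's name (path), with percent-encoding undone
    path = unquote(parts.path)
    return host, path


def is_local(host):
    """Assumption for this sketch: '' and 'localhost' mean the local host."""
    return host in ("", "localhost")


print(interpret_file_uri("file:///usr/local/bin/"))  # ('', '/usr/local/bin/')
print(interpret_file_uri("file://usr/local/bin/"))   # ('usr', '/local/bin/')
```

The second call shows why the example needs three slashes: with only two,
'usr' is read as the host and the path becomes '/local/bin/'.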

Rules of equivalence, How To Construct A File URI, and How To Dereference A 
File URI are related topics that can/should be distinguished from the 
statements related to interpretation. IMHO.

-Mike
Received on Monday, 11 October 2004 03:52:13 UTC