- From: Larry Masinter <LMM@acm.org>
- Date: Thu, 19 Aug 2004 23:13:12 -0700
- To: "'Paul Hoffman / IMC'" <phoffman@imc.org>
- Cc: uri@w3.org
Here's some text which might appear in a description of the 'file:' URI scheme. It's missing most of the important and interesting details; think of it as a proposal for organizing the material. References to particular implementations are given by citing the implementation (or its documentation, if available), but the RFC explains the range of behavior for the areas where there are differences.

What do you think of this approach?

=====================================================

- Hierarchical structure

Most implementations of the 'file:' URI scheme do a reasonable job of mapping the hierarchical part of a directory structure into the '/'-delimited hierarchy of the URI syntax, independent of what the 'native' platform delimiter is. For example, on Windows platforms, the file system typically presents backslash '\' as the delimiter in file names, yet the URI's forward slash '/' can be used in file: URIs. Similarly, on (some) Macintosh OS versions, at least in some contexts, the colon (':') is used as the delimiter in the native presentation of file path names. Unix systems natively use the same forward slash '/' delimiter for hierarchy, so there is a closer mapping between URI paths and native path names.

- 'Drives, drive letters, mount points, file system root'

There is considerable difference, in practice, in the handling of the syntax for the 'top' of the hierarchy. The 'file:' URI syntax provides no simple place for designating the root of the file hierarchy, and implementations have diverged, even on the same platform, sometimes even within a single application.

For example, DOS and Windows based systems support the notion of a "drive letter", a single character which represents a (virtual) drive, mount point, or device. Native representations of file paths start with the drive letter, a colon, and then the path; e.g., "c:\tmp\test.txt".
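As a rough illustration of the delimiter mapping described above, here is a minimal Python sketch. The helper name native_to_slash_path is hypothetical, and the sketch assumes Python's standard pathlib module; it only swaps the native delimiter for '/', and does not percent-encode anything or decide how the drive letter should appear in a URI.

```python
from pathlib import PurePosixPath, PureWindowsPath


def native_to_slash_path(native, windows):
    """Return the path with '/' as the hierarchy delimiter.

    Hypothetical helper: 'windows' selects which native syntax to parse.
    """
    cls = PureWindowsPath if windows else PurePosixPath
    # as_posix() renders the parsed path using forward slashes only.
    return cls(native).as_posix()


# Windows native syntax uses backslashes; the drive letter and colon are kept.
print(native_to_slash_path(r"c:\tmp\test.txt", windows=True))   # c:/tmp/test.txt
# Unix native syntax already uses '/', so the path is unchanged.
print(native_to_slash_path("/tmp/test.txt", windows=False))     # /tmp/test.txt
```

Note that the Windows result still begins with "c:"; how that prefix is carried into the URI is exactly where implementations differ.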
Drive letters can be mapped into the top of a 'file:' URI in various ways; some applications substitute vertical bar "|" for the ":" after the drive letter, yielding file:///c|/tmp/test.txt. In some cases, the ":" is left unchanged [a][b][d]; some applications omit it [f].

- UNC path names

....

- Use of hostname, host name checking

The 'file:' URI specification calls for using the actual host name as the naming authority, as in

   file://myhostname/path

and allows it to be omitted. This practice is rarely followed, and the host name frequently is not checked.

- Omitting authority

Some applications generate URIs with no authority component at all, e.g.,

   file:/this/is/the/path

[x][y][z]

- Using native paths

Some applications accept (and even generate) 'file:' URIs which use the native syntax instead of the canonical '/'-delimited one [p][d][q].

- Character sets and encodings

Local file systems of course may use many different encodings for representing file names. For interoperability's sake, it would be preferable for file: URI libraries to translate the native character encoding for file names to and from Unicode, using URI /

- References

[lwp-perl] LWP perl library
[java-net] Java.net.URI
[ms-net-lib] Microsoft .NET library
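To make the drive-letter and authority variants above concrete, here is a minimal Python sketch of how a consumer might fold two of them into one form. The function name normalize_file_uri is hypothetical, and the choice of file:///c:/... as the target form is an assumption for illustration, not something any specification or the implementations cited above mandate. The sketch does not attempt the omitted-colon variant, since a bare "c" cannot be told apart from a directory name, and it drops any query or fragment.

```python
import re
from urllib.parse import urlsplit


def normalize_file_uri(uri):
    """Rewrite some common 'file:' URI variants into one form (illustrative only)."""
    parts = urlsplit(uri)
    if parts.scheme != "file":
        raise ValueError(f"not a file: URI: {uri!r}")
    # Some applications write "c|" instead of "c:" after the drive letter.
    path = re.sub(r"^/([A-Za-z])\|(?=/|$)", r"/\1:", parts.path)
    # An omitted authority (file:/path) is emitted as an empty one (file:///path);
    # a host name, if present, is passed through without being checked.
    return f"file://{parts.netloc}{path}"


print(normalize_file_uri("file:///c|/tmp/test.txt"))  # file:///c:/tmp/test.txt
print(normalize_file_uri("file:/this/is/the/path"))   # file:///this/is/the/path
print(normalize_file_uri("file://myhostname/path"))   # file://myhostname/path
```

Passing the host name through unchecked mirrors the common practice noted above; an implementation that wanted to verify it would need to compare it against the local host's name.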
Received on Friday, 20 August 2004 06:13:35 UTC