Re: file:yyy and file:ddd/yyy

Jeremy,

RFC 3986 regards this as an abnormal case: see the example at the end of
section 5.4.2 (page 37):
[[
   Some parsers allow the scheme name to be present in a relative
   reference if it is the same as the base URI scheme.  This is
   considered to be a loophole in prior specifications of partial URI
   [RFC1630].  Its use should be avoided but is allowed for backward
   compatibility.

      "http:g"        =  "http:g"         ; for strict parsers
                      /  "http://a/b/c/g" ; for backward compatibility
]]

(My own code for this allows either, but defaults to the first case, IIRC.)
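
To make the two readings concrete, here is a minimal Python sketch of the
reference resolution algorithm from RFC 3986 section 5.2, with a switch
between strict and backward-compatible handling. It is illustrative only: it
is not the Haskell code mentioned above, and the names split_ref, merge,
remove_dot_segments and resolve are assumptions made up for this example.

```python
import re

# Component-splitting regex adapted from RFC 3986, Appendix B.
_URI_RE = re.compile(
    r"^(?:([^:/?#]+):)?(?://([^/?#]*))?([^?#]*)(?:\?([^#]*))?(?:#(.*))?$"
)


def split_ref(ref):
    """Return (scheme, authority, path, query, fragment); absent parts are None."""
    return _URI_RE.match(ref).groups()


def remove_dot_segments(path):
    """RFC 3986 section 5.2.4."""
    out = []
    while path:
        if path.startswith("../"):
            path = path[3:]
        elif path.startswith("./") or path.startswith("/./"):
            path = path[2:]
        elif path == "/.":
            path = "/"
        elif path.startswith("/../") or path == "/..":
            path = "/" + path[4:]
            if out:
                out.pop()
        elif path in (".", ".."):
            path = ""
        else:
            # Move the first segment (with its leading "/", if any) to the output.
            i = path.find("/", 1)
            if i == -1:
                out.append(path)
                path = ""
            else:
                out.append(path[:i])
                path = path[i:]
    return "".join(out)


def merge(base_authority, base_path, ref_path):
    """RFC 3986 section 5.2.3."""
    if base_authority is not None and base_path == "":
        return "/" + ref_path
    return base_path[: base_path.rfind("/") + 1] + ref_path


def resolve(ref, base, strict=True):
    """RFC 3986 section 5.2.2; strict=False ignores a scheme equal to the base's."""
    r_scheme, r_auth, r_path, r_query, r_frag = split_ref(ref)
    b_scheme, b_auth, b_path, b_query, _ = split_ref(base)
    if not strict and r_scheme == b_scheme:
        r_scheme = None  # the backward-compatibility loophole
    if r_scheme is not None:
        scheme, auth, query = r_scheme, r_auth, r_query
        path = remove_dot_segments(r_path)
    elif r_auth is not None:
        scheme, auth, query = b_scheme, r_auth, r_query
        path = remove_dot_segments(r_path)
    elif r_path == "":
        scheme, auth, path = b_scheme, b_auth, b_path
        query = r_query if r_query is not None else b_query
    else:
        scheme, auth, query = b_scheme, b_auth, r_query
        if r_path.startswith("/"):
            path = remove_dot_segments(r_path)
        else:
            path = remove_dot_segments(merge(b_auth, b_path, r_path))
    result = "" if scheme is None else scheme + ":"
    if auth is not None:
        result += "//" + auth
    result += path
    if query is not None:
        result += "?" + query
    if r_frag is not None:
        result += "#" + r_frag
    return result
```

With the RFC's example base, resolve("http:g", "http://a/b/c/d;p?q") gives
"http:g", and the same call with strict=False gives "http://a/b/c/g". For the
cases in the quoted message, resolve("file:ddd/yyy", "file:ddd/yyy",
strict=False) gives "file:ddd/ddd/yyy", while the strict default leaves
"file:ddd/yyy" unchanged.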

BTW, as part of the URI spec discussions, I collected several test cases from
various sources and encoded them as RDF, here:
  http://www.ninebynine.org/Software/HaskellUtils/Network/UriTest.n3 , .rdf

(These were programmatically generated from an Excel spreadsheet via a CSV file:
  http://www.ninebynine.org/Software/HaskellUtils/Network/UriTest.xls, .csv)

#g
--

Jeremy Carroll wrote:
> 
> Summary:
> ========
> 
> What is the correct reading of file:yyy and file:ddd/yyy?
> 
> Is it that these are relative URIs, to be interpreted against the current
> working directory (as an absolute file: URI) of the application using
> them? Or is it better to treat them as absolute URIs?
> Treating them as errors would seem to go against too much deployed
> practice.
> 
> (note, the remainder of this message is merely background to this
> question, FYGI, and not required to engage with the above issue).
> 
> 
> Background
> ==========
> 
> In my team (Jena Semantic Web project), we are trying to improve our
> URI/IRI handling code. For years, we have used a variety of third party
> code that has always been problematic in the detail, and hard to
> support. We have also had a very tolerant contract, where we accept as a
> URI any string. This has caused difficulty when, for instance, we accept
> a bad URI on input but then can't output it again, because it is too
> badly formed for us to tell how it interacts with, say, an xml:base
> declaration in a document.
> 
> We are considering having a much stricter contract (optionally; default
> behaviour [strict/lax] to be decided). Particularly, since in Semantic
> Web, URIs are primarily treated as identifiers, rather than operational
> instructions, strictness seems more appropriate. (e.g. a mistake in a
> URI that is an identifier and an instruction is often detected when you
> do a GET; in a browsing context, these errors are detected very quickly,
> because GETs are done soon. In a SemWeb context, the first GET might not
> be applied for months, and the URI might have been through many systems.)
> 
> file: URIs are particularly problematic.
> Our command-line tools accept file: URIs as URIs (typically ones which
> locate documents to process). In particular, file:foo.rdf is used to
> locate a file in the current directory. We want to continue supporting
> this behaviour; but it seems hard to account for it with the RFCs
> defining URIs and the file: scheme (although it works with the Java URL
> class).
> 
> We have a particular problem with file:ddd/yyy because, applying the
> resolution algorithm from RFC 3986 with backward-compatible behaviour
> enabled for the file: scheme, we have
> 
> file:ddd/yyy resolves against file:ddd/yyy as file:ddd/ddd/yyy
> whereas
> file:xxx resolves against file:xxx as file:xxx
> 
> This is significant when reading a file in, when it includes its own (or
> a related) URI (in this form). Since we located it using a file: URI we
> use that as the base when reading it. Our current behaviour just treats
> these URIs as absolute and leaves them unchanged, and everything works,
> but ... our behaviour cannot be justified from the RFCs.
> 
> Jeremy
> 

-- 
Graham Klyne
For email:
http://www.ninebynine.org/#Contact

Received on Wednesday, 8 February 2006 15:36:56 UTC