- From: Graham Klyne <GK@ninebynine.org>
- Date: Wed, 08 Feb 2006 15:28:15 +0000
- To: Jeremy Carroll <jjc@hpl.hp.com>
- CC: uri@w3.org
Jeremy,

RFC3986 regards this as an abnormal case: see the example at the end of
page 37 (section 5.4.2):

[[
Some parsers allow the scheme name to be present in a relative reference
if it is the same as the base URI scheme. This is considered to be a
loophole in prior specifications of partial URI [RFC1630]. Its use
should be avoided but is allowed for backward compatibility.

   "http:g"        =  "http:g"         ; for strict parsers
                   /  "http://a/b/c/g" ; for backward compatibility
]]

(My own code for this allows either, but defaults to the first case, IIRC.)

BTW, as part of the URI spec discussions, I collected several test cases
from various sources and encoded them as RDF, here:
  http://www.ninebynine.org/Software/HaskellUtils/Network/UriTest.n3 , .rdf

(These were programmatically generated from an Excel spreadsheet via a
CSV file:
  http://www.ninebynine.org/Software/HaskellUtils/Network/UriTest.xls, .csv)

#g
--

Jeremy Carroll wrote:
>
> Summary:
> ========
>
> What is the correct reading of file:yyy and file:ddd/yyy?
>
> Is it that these are relative URIs to be interpreted against the current
> working directory (as an absolute file: URI) of an application using
> them. Or is it better to treat them as absolute URIs?
> Treating them as errors would seem to go against too much deployed
> practice.
>
> (note, the remainder of this message is merely background to this
> question, FYGI, and not required to engage with the above issue).
>
>
> Background
> ==========
>
> In my team (Jena Semantic Web project), we are trying to improve our
> URI/IRI handling code. For years, we have used a variety of third party
> code that has always been problematic in the detail, and hard to
> support. We have also had a very tolerant contract, where we accept as a
> URI any string. This has been the cause of difficulty, when for
> instance, we accept a bad URI on input, but then can't output it again,
> because we can't tell how it interacts with say an xml:base declaration
> in a document, because it is too badly formed.
>
> We are considering having a much stricter contract (optionally; default
> behaviour [strict/lax] to be decided). Particularly, since in Semantic
> Web, URIs are primarily treated as identifiers, rather than operational
> instructions, strictness seems more appropriate. (e.g. a mistake in a
> URI that is an identifier and an instruction is often detected when you
> do a GET; in a browsing context, these errors are detected very quickly,
> because GETs are done soon. In a SemWeb context, the first GET might not
> be applied for months, and the URI might have been through many systems.)
>
> file: URIs are particularly problematic.
> Our command-line tools accept file: URIs as URIs (typically ones which
> locate documents to process). In particular, file:foo.rdf is used to
> locate a file in the current directory. We want to continue supporting
> this behaviour; but it seems hard to account for it with the RFCs
> defining URIs and the file: scheme (although it works with the Java URL
> class).
>
> We have particular problem with file:ddd/yyy because applying the
> resolution algorithm from RFC 3986, with backward compatible behaviour
> enabled for file: scheme, we have
>
> file:ddd/yyy resolves against file:ddd/yyy as file:ddd/ddd/yyy
> whereas
> file:xxx resolves against file:xxx as file:xxx
>
> This is significant when reading a file in, when it includes its own (or
> a related) URI (in this form). Since we located it using a file: URI we
> use that as the base when reading it. Our current behaviour just treats
> this URIs as absolute and leaves them unchanged, and everything works,
> but .... our behaviour cannot be justified from the RFCs.
>
> Jeremy
>

-- 
Graham Klyne
For email: http://www.ninebynine.org/#Contact
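
To make the two readings concrete, here is a minimal Python sketch of the RFC 3986 section 5.2 resolution steps with a strict/backward-compatible switch. It is an illustration only, not the Haskell code mentioned above and not Jena's implementation; the function names (split_uri, resolve, and so on) and the strict parameter are hypothetical, and scheme comparison is done case-sensitively for brevity.

```python
import re

# Component-splitting regex adapted from RFC 3986, Appendix B.
_URI_RE = re.compile(
    r"^(?:([^:/?#]+):)?(?://([^/?#]*))?([^?#]*)(?:\?([^#]*))?(?:#(.*))?$"
)

def split_uri(ref):
    """Return (scheme, authority, path, query, fragment); absent parts are None."""
    return _URI_RE.match(ref).groups()

def remove_dot_segments(path):
    """RFC 3986 section 5.2.4."""
    out = []
    while path:
        if path.startswith("../"):
            path = path[3:]
        elif path.startswith("./") or path.startswith("/./"):
            path = path[2:]
        elif path == "/.":
            path = "/"
        elif path.startswith("/../") or path == "/..":
            path = "/" + path[4:]
            if out:
                out.pop()
        elif path in (".", ".."):
            path = ""
        else:
            # Move the first segment (with its leading "/", if any) to the output.
            idx = path.find("/", 1)
            if idx == -1:
                idx = len(path)
            out.append(path[:idx])
            path = path[idx:]
    return "".join(out)

def merge(base_auth, base_path, ref_path):
    """RFC 3986 section 5.2.3."""
    if base_auth is not None and base_path == "":
        return "/" + ref_path
    # Drop everything after the last "/" of the base path (all of it if no "/").
    return base_path[: base_path.rfind("/") + 1] + ref_path

def resolve(ref, base, strict=True):
    """RFC 3986 section 5.2.2; strict=False enables the backward-compatible loophole."""
    r_scheme, r_auth, r_path, r_query, r_frag = split_uri(ref)
    b_scheme, b_auth, b_path, b_query, _ = split_uri(base)
    if not strict and r_scheme == b_scheme:
        r_scheme = None  # treat "scheme:rel" as a relative reference
    if r_scheme is not None:
        scheme, auth, query = r_scheme, r_auth, r_query
        path = remove_dot_segments(r_path)
    elif r_auth is not None:
        scheme, auth, query = b_scheme, r_auth, r_query
        path = remove_dot_segments(r_path)
    elif r_path == "":
        scheme, auth, path = b_scheme, b_auth, b_path
        query = r_query if r_query is not None else b_query
    else:
        scheme, auth, query = b_scheme, b_auth, r_query
        merged = r_path if r_path.startswith("/") else merge(b_auth, b_path, r_path)
        path = remove_dot_segments(merged)
    # Recomposition, RFC 3986 section 5.3.
    result = scheme + ":" if scheme is not None else ""
    if auth is not None:
        result += "//" + auth
    result += path
    if query is not None:
        result += "?" + query
    if r_frag is not None:
        result += "#" + r_frag
    return result

print(resolve("http:g", "http://a/b/c/d;p?q"))                      # http:g
print(resolve("http:g", "http://a/b/c/d;p?q", strict=False))        # http://a/b/c/g
print(resolve("file:ddd/yyy", "file:ddd/yyy", strict=False))        # file:ddd/ddd/yyy
print(resolve("file:xxx", "file:xxx", strict=False))                # file:xxx
```

With strict=True (the default here, matching the "first case" above), a reference that carries a scheme is left alone, so file:ddd/yyy resolves to itself; the doubled ddd/ddd/yyy only appears once the backward-compatible path is enabled.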
Received on Wednesday, 8 February 2006 15:36:56 UTC