- From: Andy Seaborne <andy.seaborne@epimorphics.com>
- Date: Mon, 27 May 2013 11:53:10 +0100
- To: public-rdf-comments@w3.org
David,

You seem to have a different processing model from the one I have in mind. You seem to believe the base is exactly the characters used for IRIREF; as I understand it, URI resolution is applied first, and the output is then passed to whatever is doing base URI processing.

For context, see XML Base, http://www.w3.org/TR/xmlbase/#syntax, and its example of a relative URI "/hotpicks/" used as xml:base for an element. Turtle (and SPARQL) are just doing what everything else does here.

What triples do you expect from these Turtle documents, and what sequence of processing steps would you expect a processor to take? In each case they are obtained by GET http://example/location/file.ttl

Document1::
----
<s> <p> <#o> .
----

Document2::
----
@base <http://example/base2> .
<s> <p> <#o> .
----

Document3::
----
<s> <p> <#o> .
@base <http://example/base2> .
<s> <p> <#o> .
----

Document4::
----
@base <base2/> .
<s> <p> <#o> .
----

Document5:: corner case:
----
@base <base2/> .
@prefix ns1: <ns#> .
ns1:s <p> <#o> .
----

After resolution, and before it is used as the base, the URI is absolute - all URIs in RDF are absolute. This absolute URI, possibly with a fragment, is then given to whatever machinery is doing further URI resolution. That code is responsible for determining the right base URI given the inputs.

Hence, I see that "If the base URI is obtained from a URI reference, ..." applies.

        Andy

On 27/05/13 04:36, David Booth wrote:
> Hi Markus,
>
> On 05/26/2013 06:37 PM, Markus Lanthaler wrote:
>> On Sunday, May 26, 2013 7:17 PM, David Booth wrote:
>>>> The syntax has
>>>>
>>>> @base IRIREF .
>>>>
>>>> and the @base is no different to other URIs - it is subject to URI
>>>> resolution.
>>>
>>> But I don't see anything there that explicitly requires IRIREF to be an
>>> absolute-IRI as defined in RFC3987. Other parts of the Turtle syntax
>>> (such as the @prefix production) also use the IRIREF syntax production
>>> without requiring it to be an absolute-IRI.
>>> That's why it isn't clear
>>> that in the case of @base it must be an absolute-IRI.
>>
>> It can be a relative IRI as well. In that case it gets resolved against the
>> currently active base IRI.
>>
>>
>>>> @base <relURI> .
>>>>
>>>> is also legal as is
>>>>
>>>> @base <../sibling> .
>>>>
>>>> which might be occasionally useful.
>>>
>>> Huh? Are you saying that @base can recursively specify the base URI
>>> using a *relative* URI? Then there would have to be a base URI of the
>>> @base URI?
>>
>> Yes, not recursively though but sequentially.
>>
>>
>>> I'm very surprised to hear you say that a relative @base URI would be
>>> legal. I don't think that should be allowed. That seems too
>>> mysterious and error prone to me.
>>
>> HTML allows that as well e.g.
>>
>>
>>> That would require a relative URI specified in
>>> @base to be resolved using "Reference Resolution", which is specified
>>> in
>>> section 5 of RFC 3986. But the result of "Reference Resolution" is "a
>>> string matching the <URI> syntax rule of Section 3", and the <URI>
>>> production *allows* a fragment identifier.
>>
>> And why should that be a problem?
>
> Because a base URI as defined in RFC 3986 does not permit a fragment
> identifier. Therefore, if @base specified a relative URI which was
> resolved using RFC3986 "Reference Resolution" then the result could
> contain a fragment identifier. Thus, a Turtle "base URI" could contain
> a fragment identifier, whereas an RFC 3986 "base URI" does not permit a
> fragment identifier.
>
>>
>>
>>> I think it would be better to align directly with SPARQL and RFC 3986
>>> and RFC 3987 by explicitly requiring @base to specify an absolute-IRI.
>>
>> It is aligned with the two RFCs. There might be a case where you can't
>> resolve a relative @base as the document itself has no IRI but that's the
>> same problem as not being able to resolve relative IRIs anywhere else in
>> such a document.
>
> If it is aligned with RFC 3986 and 3987 then the alignment certainly is
> not very visible. I spent quite a lot of time trying to track it down,
> and finally concluded that nothing in the Turtle spec requires Turtle's
> notion of a base URI (which AFAICT is specified using @base) to be an
> absolute-IRI as defined in those RFCs. Can you please point me to the
> exact wording that requires a Turtle base URI to be an absolute-IRI?
>
> The Turtle EBNF certainly does not require it.
>
> Turtle section 6.3 has two paragraphs. The first says:
> http://www.w3.org/TR/turtle/#sec-iri-references
> [[
> Relative IRIs are resolved with base IRIs as per Uniform Resource
> Identifier (URI): Generic Syntax [RFC3986] using only the basic
> algorithm in section 5.2. Neither Syntax-Based Normalization nor
> Scheme-Based Normalization (described in sections 6.2.2 and 6.2.3 of
> RFC3986) are performed. Characters additionally allowed in IRI
> references are treated in the same way that unreserved characters are
> treated in URI references, per section 6.5 of Internationalized Resource
> Identifiers (IRIs) [RFC3987].
> ]]
> That paragraph only talks about resolving relative URIs. It does not
> specify the base URI.
>
> The first sentence of the second paragraph says:
> [[
> The @base directive defines the Base IRI used to resolve relative IRIs
> per RFC3986 section 5.1.1, "Base URI Embedded in Content".
> ]]
> and RFC3986 section 5.1.1 says: "Within certain media types, a base URI
> for relative references can be embedded within the content itself".
> Since the Turtle directive is called "@base" (or "BASE") and the Turtle
> spec often uses the term "base URI", this would strongly suggest that
> the @base directive is used to specify a base URI that is "embedded
> within the content itself". But if you and Andy are telling me that
> @base may provide a relative URI, then the actual base URI is *not*
> actually "embedded within the content itself".
> Rather, it is
> (recursively) determined by resolving that relative URI against some
> other base URI.
>
> The rest of the second paragraph in Turtle section 6.3 says:
> [[
> Section 5.1.2, "Base URI from the Encapsulating Entity" defines how the
> In-Scope Base IRI may come from an encapsulating document, such as a
> SOAP envelope with an xml:base directive or a mime multipart document
> with a Content-Location header. The "Retrieval URI" identified in 5.1.3,
> Base "URI from the Retrieval URI", is the URL from which a particular
> Turtle document was retrieved. If none of the above specifies the Base
> URI, the default Base URI (section 5.1.4, "Default Base URI") is used.
> Each @base directive sets a new In-Scope Base URI, relative to the
> previous one.
> ]]
> Notice that it only references RFC3986 sections 5.1.2 and 5.1.3, which
> only talk (vaguely) about where the base URI might come from. Those
> sections do not constrain the base URI to be an absolute-URI. It is the
> beginning of RFC3986 section 5.1 that constrains a base URI to be an
> absolute-URI, and that portion is *not* referenced by the Turtle spec.
>
> The last sentence of that second paragraph in Turtle section 6.3 does
> say "Each @base directive sets a new In-Scope Base URI, relative to the
> previous one", and I guess that sentence is the justification for why
> you and Andy are saying that @base can specify a relative URI. But
> knowing that RFC3986 requires a base URI to be an absolute-URI, I had
> understood that sentence to mean "Each @base directive sets a new
> In-Scope Base URI, [in relation to] the previous one", i.e., it is
> new in relation to the previous one. I had no idea it was suggesting
> that @base could specify a relative URI.
>
> Bottom line:
>
> - This stuff is not at all clear in the current wording.
>
> - If @base is permitted to specify a relative IRI then: (a) an
> explanation should be added to explain how that relative IRI is
> converted into an absolute-IRI (including what happens to any fragment
> identifier that the relative IRI contains); and (b) Turtle will not be
> aligned with SPARQL in this regard.
>
> - If @base is NOT permitted to specify a relative IRI then the Turtle
> spec should make clear that @base must specify an absolute-IRI, in
> alignment with SPARQL.
>
> I was not aware that HTML allowed base URIs to be relative, but, it
> seems more important to align Turtle with SPARQL than with HTML. Plus
> it would also be simpler.
>
> David
>
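
The processing model described at the top of this message (resolve the @base IRIREF against the in-scope base, strip any fragment per RFC 3986 section 5.1, then use the result as the new base) can be sketched roughly as below. This is an illustrative sketch only, not text from the Turtle spec or any parser: the helper names new_base, resolve and triple are hypothetical, and Python's urllib.parse is assumed to be an adequate stand-in for the RFC 3986 section 5.2 algorithm on these simple inputs.

```python
from urllib.parse import urldefrag, urljoin

# Retrieval URI taken from the GET in the message above.
RETRIEVAL_URI = "http://example/location/file.ttl"


def new_base(in_scope_base: str, iriref: str) -> str:
    """Hypothetical helper: resolve an @base IRIREF against the in-scope
    base, then drop any fragment before using the result as the new base."""
    return urldefrag(urljoin(in_scope_base, iriref)).url


def resolve(base: str, iriref: str) -> str:
    """Hypothetical helper: resolve an ordinary IRIREF against the base."""
    return urljoin(base, iriref)


def triple(base: str) -> tuple:
    """The triple <s> <p> <#o> with every term resolved against base."""
    return tuple(resolve(base, ref) for ref in ("s", "p", "#o"))


# Document1: no @base, so the retrieval URI is the base.
doc1 = triple(RETRIEVAL_URI)

# Document2: an absolute @base replaces the retrieval URI.
doc2 = triple(new_base(RETRIEVAL_URI, "http://example/base2"))

# Document4: a relative @base is first resolved against the retrieval URI.
base4 = new_base(RETRIEVAL_URI, "base2/")
doc4 = triple(base4)

for name, value in (("Document1", doc1), ("Document2", doc2), ("Document4", doc4)):
    print(name, value)
```

Under this model Document3 would yield the Document1 triple followed by the Document2 triple, since each @base only affects what comes after it. Document5 is left out of the sketch because urljoin may drop the empty fragment in <ns#>, so it is not a reliable way to check the prefix case.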
Received on Monday, 27 May 2013 10:54:04 UTC