RE: Turtle syntax: Please align base URI with RFC 3986 & 3987

From: Markus Lanthaler <markus.lanthaler@gmx.net>
Date: Wed, 29 May 2013 00:57:08 +0200
To: "'David Booth'" <david@dbooth.org>
Cc: <public-rdf-comments@w3.org>
Message-ID: <000101ce5bf6$c0c6a790$4253f6b0$@lanthaler@gmx.net>

On Monday, May 27, 2013 5:36 AM, David Booth wrote:
> >> That would require a relative URI specified in @base to be resolved
> >> using "Reference Resolution", which is specified in section 5 of
> >> RFC 3986.  But the result of "Reference Resolution" is "a string
> >> matching the <URI> syntax rule of Section 3", and the <URI>
> >> production *allows* a fragment identifier.
> >
> > And why should that be a problem?
> Because a base URI as defined in RFC 3986 does not permit a fragment
> identifier.  Therefore, if @base specified a relative URI which was
> resolved using RFC3986 "Reference Resolution" then the result could
> contain a fragment identifier.  Thus, a Turtle "base URI" could contain
> a fragment identifier, whereas an RFC 3986 "base URI" does not permit a
> fragment identifier.

No, that's not correct. Even if the base contains a fragment identifier,
resolving any relative IRI against it (even the empty string "") yields a
URI which does not contain that fragment identifier. Thus it really doesn't
matter. The base's fragment identifier will be ignored in any case.
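To make that concrete, here is a rough Python sketch of the reference
resolution algorithm from RFC 3986 section 5.2.2. The function names
(split_ref, remove_dot_segments, merge, resolve) and the example.org IRIs
are made up for illustration, and this is only a sketch, not a validating
implementation. Note that the base's fragment is parsed but never used when
the target is assembled. (Python's own urllib.parse.urljoin is not used
here because it returns the base unchanged for an empty reference.)

```python
import re

# Component-splitting regex adapted from RFC 3986, Appendix B.
_URI_RE = re.compile(
    r"^(?:([^:/?#]+):)?(?://([^/?#]*))?([^?#]*)(?:\?([^#]*))?(?:#(.*))?$",
    re.DOTALL,
)


def split_ref(ref):
    """Return (scheme, authority, path, query, fragment); None = undefined."""
    return _URI_RE.match(ref).groups()


def remove_dot_segments(path):
    """Sketch of RFC 3986 section 5.2.4."""
    out = []
    while path:
        if path.startswith("../"):
            path = path[3:]
        elif path.startswith("./"):
            path = path[2:]
        elif path.startswith("/./"):
            path = path[2:]
        elif path == "/.":
            path = "/"
        elif path.startswith("/../"):
            path = path[3:]
            if out:
                out.pop()
        elif path == "/..":
            path = "/"
            if out:
                out.pop()
        elif path in (".", ".."):
            path = ""
        else:
            # Move the first segment (with its leading "/") to the output.
            idx = path.find("/", 1)
            if idx == -1:
                out.append(path)
                path = ""
            else:
                out.append(path[:idx])
                path = path[idx:]
    return "".join(out)


def merge(base_authority, base_path, ref_path):
    """Sketch of RFC 3986 section 5.2.3."""
    if base_authority is not None and base_path == "":
        return "/" + ref_path
    return base_path[: base_path.rfind("/") + 1] + ref_path


def resolve(base, ref):
    """Sketch of RFC 3986 section 5.2.2 (non-strict details omitted)."""
    bs, ba, bp, bq, _base_fragment = split_ref(base)  # fragment is dropped
    rs, ra, rp, rq, rf = split_ref(ref)
    if rs is not None:
        ts, ta, tp, tq = rs, ra, remove_dot_segments(rp), rq
    else:
        if ra is not None:
            ta, tp, tq = ra, remove_dot_segments(rp), rq
        else:
            if rp == "":
                tp = bp
                tq = rq if rq is not None else bq
            elif rp.startswith("/"):
                tp, tq = remove_dot_segments(rp), rq
            else:
                tp, tq = remove_dot_segments(merge(ba, bp, rp)), rq
            ta = ba
        ts = bs
    # Recomposition (section 5.3); only the reference's fragment survives.
    out = ""
    if ts is not None:
        out += ts + ":"
    if ta is not None:
        out += "//" + ta
    out += tp
    if tq is not None:
        out += "?" + tq
    if rf is not None:
        out += "#" + rf
    return out


print(resolve("http://example.org/doc#frag", ""))       # http://example.org/doc
print(resolve("http://example.org/doc#frag", "other"))  # http://example.org/other
```

In both calls the "#frag" part of the base never reaches the result, which
is the point made above.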

> > It is aligned with the two RFCs. There might be a case where you
> > can't resolve a relative @base as the document itself has no IRI
> > but that's the same problem as not being able to resolve relative
> > IRIs anywhere else in such a document.
> If it is aligned with RFC 3986 and 3987 then the alignment certainly is
> not very visible.  I spent quite a lot of time trying to track it down,
> and finally concluded that nothing in the Turtle spec requires Turtle's
> notion of a base URI (which AFAICT is specified using @base) to be an
> absolute-IRI as defined in those RFCs.  Can you please point me to the
> exact wording that requires a Turtle base URI to be an absolute-IRI?

@base enables the establishment of the base URI; it is not necessarily the
final URI. If @base contains a relative IRI, it is resolved against the
document's URI or the application-supplied base to obtain the final base URI.
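As a small illustration of that, here is a Python sketch that folds a
sequence of @base values into the final in-scope base. The function name
in_scope_base and the example.org/example.com IRIs are hypothetical, and
urllib.parse.urljoin stands in for the RFC 3986 resolution algorithm, which
is good enough for the non-empty references used here.

```python
from urllib.parse import urljoin


def in_scope_base(initial_base, base_directives):
    """Resolve each @base value against the base that was in scope before it.

    initial_base: the document's URI or an application-supplied base.
    base_directives: the IRIs given in successive @base directives, in order.
    """
    base = initial_base
    for iri in base_directives:
        # A relative IRI is resolved against the current base;
        # an absolute IRI simply replaces it.
        base = urljoin(base, iri)
    return base


doc = "http://example.org/data/doc.ttl"
print(in_scope_base(doc, ["subdir/"]))
print(in_scope_base(doc, ["subdir/", "../other/"]))
print(in_scope_base(doc, ["subdir/", "http://example.com/x/"]))
```

Each @base is handled exactly like any other relative IRI in the document,
so no extra machinery is needed.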

> [...]
> Notice that it only references RFC3986 sections 5.1.2 and 5.1.3, which
> only talk (vaguely) about where the base URI might come from.  Those
> sections do not constrain the base URI to be an absolute-URI.  It is
> the beginning of RFC3986 section 5.1 that constrains a base URI to be
> an absolute-URI, and that portion is *not* referenced by the Turtle
> spec.

Yes, in the end you need an absolute URI, otherwise you can't resolve
relative ones. There are a number of "layers" where the base might come from:
@base -> document URI -> application supplied. I'm writing this mail offline
so I can't give you the exact section in the RFC, but that's explained there
as well.
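Roughly, the layering could be sketched like this in Python. The function
establish_base, its parameter names and the example IRIs are all
assumptions made for illustration; neither the RFC nor the Turtle spec
defines such an API.

```python
from urllib.parse import urljoin, urlsplit


def establish_base(base_directive=None, document_uri=None,
                   application_default=None):
    """Pick the base from the innermost layer that is available.

    A relative @base is resolved against the next layer out; if no layer
    yields an absolute URI, relative references cannot be resolved at all.
    """
    outer = document_uri or application_default
    if base_directive is None:
        if outer is None:
            raise ValueError("no base URI available")
        return outer
    if urlsplit(base_directive).scheme:
        # @base already has a scheme, so no outer layer is needed.
        return base_directive
    if outer is None:
        raise ValueError("relative @base but no outer base to resolve against")
    return urljoin(outer, base_directive)


print(establish_base("sub/", document_uri="http://example.org/a/doc.ttl"))
print(establish_base("sub/", application_default="http://app.example/"))
```

The failure case (a relative @base in a document without any IRI) is the
same problem as any other unresolvable relative IRI in that document.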

> The last sentence of that second paragraph in Turtle section 6.3 does
> say "Each @base directive sets a new In-Scope Base URI, relative to the
> previous one", and I guess that sentence is the justification for why
> you and Andy are saying that @base can specify a relative URI.  But
> knowing that RFC3986 requires a base URI to be an absolute-URI, I had
> understood that sentence to mean "Each @base directive sets a new
> In-Scope Base URI, [in relation to] to the previous one", i.e., it is
> new in relation to the previous one.  I had no idea it was suggesting
> that @base could specify a relative URI.
> Bottom line:
>   - This stuff is not at all clear in the current wording.

I find that quite clear and in line with what, e.g., HTML does. Can you
suggest some concrete wording which would make it clearer?

>   - If @base is permitted to specify a relative IRI then: (a) an
> explanation should be added to explain how that relative IRI is
> converted into an absolute-IRI (including what happens to any fragment
> identifier that the relative IRI contains); and (b) Turtle will not be
> aligned with SPARQL in this regard.

The RFCs explain how a relative IRI can be resolved against a base to an
absolute IRI. @base does nothing special here. Isn't referencing the RFC
enough?

>   - If @base is NOT permitted to specify a relative IRI then the Turtle
> spec should make clear that @base must specify an absolute-IRI, in
> alignment with SPARQL.

That's not the case.

> I was not aware that HTML allowed base URIs to be relative, but, it
> seems more important to align Turtle with SPARQL than with HTML.  Plus
> it would also be simpler.

What's the advantage of such a restriction? If someone wants to use absolute
URIs, that's fine. Allowing relative ones doesn't add any complexity because
the URI resolution algorithm has to be implemented in any case.

Markus Lanthaler
Received on Tuesday, 28 May 2013 22:58:01 UTC
