Re: Does expanding a CURIE into an IRI always succeed

From: <akrasner@riseup.net>
Date: Thu, 03 Jan 2019 04:08:16 -0800
To: www-html@w3.org
Message-ID: <fbeb2d78f56fe633aa476d8688e24fef@riseup.net>
On 2019-01-03 12:17, akrasner@riseup.net wrote:
> Hello!
> 
> tl;dr if a base URI and a CURIE suffix both have a fragment, expanding
> the CURIE results with an invalid IRI, am I correct?
> 
> 
> I'm implementing CURIE[1] expansion, i.e. turning a CURIE into an IRI by
> concatenating its suffix with some given base URI. That working group
> note linked in [1] suggests that "In all cases a parsed CURIE will
> produce an IRI". I'm not sure what it means, but I started implementing
> expansion, and I was wondering, when you concatenate a base URI and a
> CURIE's suffix, is the result *always* a valid IRI?
> 
> And I noticed one case (I didn't find more such cases, but, possibly
> missed them, idk) in which it isn't. If your base IRI and CURIE suffix
> both have a fragment, then the result is an invalid IRI, because a
> literal '#' character isn't allowed to be present anywhere in an IRI
> except to start its fragment. For example:
> 
> [...]

Looks like there are more cases like that; clearly CURIE expansion isn't
guaranteed at all to produce a valid IRI. I suppose that sentence in the
document is just confusing. The only other meaning I can think of is that
the CURIE itself, as-is, can be parsed as a valid IRI, but that isn't true
either (for example, a CURIE prefix may contain characters that are invalid
in an IRI's scheme part).
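
To illustrate the point about prefixes, here is a rough Python sketch,
assuming the scheme grammar from RFC 3986 (a letter followed by letters,
digits, "+", "-" or "."). The function name and the sample CURIEs are made
up for illustration. An underscore is allowed in a CURIE prefix (it is an
NCName) but not in a scheme, so such a CURIE cannot be read as an IRI as-is.

```python
import re

# Scheme grammar per RFC 3986: ALPHA *( ALPHA / DIGIT / "+" / "-" / "." )
SCHEME_RE = re.compile(r"[A-Za-z][A-Za-z0-9+.-]*")


def prefix_is_valid_scheme(curie: str) -> bool:
    """Return True if the part before the first ':' would be a legal scheme.

    Hypothetical helper for illustration only; it does not validate the
    rest of the CURIE.
    """
    prefix, sep, _reference = curie.partition(":")
    return bool(sep) and SCHEME_RE.fullmatch(prefix) is not None


print(prefix_is_valid_scheme("ru:x.y.z"))     # prefix "ru" fits the scheme grammar
print(prefix_is_valid_scheme("my_prefix:x"))  # "_" is not allowed in a scheme
```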

Just to give an example, one of the weird cases is when the base IRI has
an authority but an empty path, and the CURIE suffix is a relative path:

Base: https://riseup.net
CURIE: ru:x.y.z
Result: https://riseup.netx.y.z

Base: https://
CURIE: ru:x.y.z.org
Result: https://x.y.z.org

(These happen to be valid IRIs, just really weird ones; invalid ones can
be produced in the same way.)
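
The two cases above can be reproduced with a small Python sketch, assuming
expansion is plain string concatenation of the base and the CURIE's
reference part. The function name `expand_curie` is hypothetical, and
`urllib.parse.urlsplit` is used only to show where the suffix ends up, not
to validate the result.

```python
from urllib.parse import urlsplit


def expand_curie(base: str, curie: str) -> str:
    """Naively expand a CURIE: base + everything after the first ':'.

    Hypothetical helper; the prefix is assumed to already map to `base`.
    """
    _prefix, sep, reference = curie.partition(":")
    if not sep:
        raise ValueError(f"not a prefixed CURIE: {curie!r}")
    return base + reference


for base, curie in [("https://riseup.net", "ru:x.y.z"),
                    ("https://", "ru:x.y.z.org")]:
    result = expand_curie(base, curie)
    parts = urlsplit(result)
    # The suffix was meant as a relative path, but it lands in the authority.
    print(result, "authority:", repr(parts.netloc), "path:", repr(parts.path))
```

In both cases the suffix is absorbed into the authority component and the
path stays empty, which is why the results look so odd.
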
Received on Thursday, 3 January 2019 12:08:39 UTC
