Re: Does expanding a CURIE into an IRI always succeed

From: Shane McCarron <shane@aptest.com>
Date: Thu, 3 Jan 2019 18:19:22 -0500
Message-ID: <CAOk_reFKF6OpjX_G=1xe7oj1hz9j-EPJd4NCntdk_PRQ0YSYSQ@mail.gmail.com>
To: akrasner@riseup.net, Ivan Herman <ivan@w3.org>
Cc: www-html@w3.org
I am not certain this will help you, but the note you are referencing is
somewhat out of date. The real published definition of CURIE is in the
RDFa Standard. The latest version of that is at
https://www.w3.org/TR/rdfa-core/ - see the section on CURIE Syntax
Definition and the section on CURIE and IRI Processing.

Perhaps Ivan has other guidance?

On Thu, Jan 3, 2019 at 7:12 AM <akrasner@riseup.net> wrote:

> On 2019-01-03 12:17, akrasner@riseup.net wrote:
> > Hello!
> >
> > tl;dr if a base URI and a CURIE suffix both have a fragment, expanding
> > the CURIE yields an invalid IRI, am I correct?
> >
> >
> > I'm implementing CURIE[1] expansion, i.e. turning a CURIE into an IRI by
> > concatenating its suffix with some given base URI. That working group
> > note linked in [1] suggests that "In all cases a parsed CURIE will
> > produce an IRI". I'm not sure what it means, but I started implementing
> > expansion, and I was wondering, when you concatenate a base URI and a
> > CURIE's suffix, is the result *always* a valid IRI?
> >
> > And I noticed one case (I didn't find more such cases, but, possibly
> > missed them, idk) in which it isn't. If your base IRI and CURIE suffix
> > both have a fragment, then the result is an invalid IRI, because a
> > literal '#' character isn't allowed to be present anywhere in an IRI
> > except to start its fragment. For example:
> >
> > [...]
>
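
The fragment case described above can be reproduced with a minimal sketch.
Everything here is an assumption for illustration: the function names, the
"ex" prefix and the example.org base are hypothetical, and this is not the
example elided from the quoted message. Expansion is modelled as plain
concatenation, and the check relies only on the rule that a literal '#' may
appear once in an IRI, as the fragment delimiter.

```python
from __future__ import annotations


def expand_curie(curie: str, prefixes: dict[str, str]) -> str:
    """Expand a CURIE by concatenating the mapped base with the suffix."""
    prefix, _, reference = curie.partition(":")
    return prefixes[prefix] + reference


def has_at_most_one_hash(iri: str) -> bool:
    """Necessary (not sufficient) validity check: '#' only starts the fragment."""
    return iri.count("#") <= 1


# Hypothetical prefix mapping whose base already carries a fragment.
prefixes = {"ex": "http://example.org/doc#sec-"}

expanded = expand_curie("ex:a#b", prefixes)
print(expanded)                        # http://example.org/doc#sec-a#b
print(has_at_most_one_hash(expanded))  # False
```

A full validity check would need a complete IRI parser; counting '#' only
catches this one class of failure.
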
> Looks like there are more cases like that; clearly CURIE expansion isn't
> guaranteed at all to produce a valid IRI. I suppose that sentence in the
> document is confusing. I mean, the only other meaning I can think of is
> that the CURIE itself, as-is, can be parsed as a valid IRI, but that isn't
> true either (for example, a CURIE prefix may contain characters that are
> invalid in an IRI's scheme part).
>
> Just to give an example, one of the weird cases is when the base IRI has
> an authority but empty path, and the CURIE suffix is a relative path:
>
> Base: https://riseup.net
> CURIE: ru:x.y.z
> Result: https://riseup.netx.y.z
>
> Base: https://
> CURIE: ru:x.y.z.org
> Result: https://x.y.z.org
>
> (These happen to be valid IRIs, just really weird, and similarly one can
> produce invalid ones in the same way)
>
>
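
For the authority cases quoted above, one possible guard is to compare the
authority component before and after concatenation. This is only a sketch:
the helper name authority_changed is hypothetical, and it uses Python's
urllib.parse.urlsplit, which splits URIs generically and does not validate
IRIs. The bases and suffixes are the ones from the quoted examples, plus one
base with a trailing slash for contrast.

```python
from urllib.parse import urlsplit


def authority_changed(base: str, reference: str) -> bool:
    """Return True if appending `reference` to `base` alters the authority."""
    return urlsplit(base).netloc != urlsplit(base + reference).netloc


# The suffix is absorbed into the host name when the base has an empty path.
print(authority_changed("https://riseup.net", "x.y.z"))   # True
print(authority_changed("https://", "x.y.z.org"))         # True

# With a non-empty path in the base, the suffix stays in the path.
print(authority_changed("https://riseup.net/", "x.y.z"))  # False
```

An implementation could warn about, or reject, a prefix mapping for which
this returns True, since the expanded IRI then names a different host than
the base did.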

-- 
Shane McCarron
Managing Director, Applied Testing and Technology, Inc.
Received on Thursday, 3 January 2019 23:19:56 UTC
