Re: Does expanding a CURIE into an IRI always succeed from Ivan Herman on 2019-01-04 (www-html@w3.org from January 2019)

From: Ivan Herman <ivan@w3.org>
Date: Fri, 4 Jan 2019 09:20:23 +0100
To: Shane McCarron <shane@aptest.com>
Cc: akrasner@riseup.net, www-html@w3.org
Message-Id: <85009A40-8CD4-4F7C-8FA3-42E2C70D1330@w3.org>
> On 4 Jan 2019, at 00:19, Shane McCarron <shane@aptest.com> wrote:
> 
> I am not certain this will help you, but the note you are referencing is somewhat out of date. The real published definition of CURIE is in the RDFa standard. The latest version of that is at https://www.w3.org/TR/rdfa-core/ - see the section on CURIE Syntax Definition and the section on CURIE and URI Processing.
> 
> Perhaps Ivan has other guidance? 

Not really. The RDFa version is the only CURIE definition that has Recommendation status. To be precise, we are talking about these two sections:

https://www.w3.org/TR/rdfa-core/#s_curies
https://www.w3.org/TR/rdfa-core/#s_curieprocessing

(This is not necessarily good, and it would be worthwhile to move CURIEs into a Rec of their own, but that is another question.)

Ivan

> 
> On Thu, Jan 3, 2019 at 7:12 AM <akrasner@riseup.net> wrote:
> On 2019-01-03 12:17, akrasner@riseup.net wrote:
> > Hello!
> > 
> > tl;dr if a base URI and a CURIE suffix both have a fragment, expanding
> > the CURIE results in an invalid IRI, am I correct?
> > 
> > 
> > I'm implementing CURIE[1] expansion, i.e. turning a CURIE into an IRI by
> > concatenating its suffix with some given base URI. That working group
> > note linked in [1] suggests that "In all cases a parsed CURIE will
> > produce an IRI". I'm not sure what it means, but I started implementing
> > expansion, and I was wondering, when you concatenate a base URI and a
> > CURIE's suffix, is the result *always* a valid IRI?
> > 
> > And I noticed one case (I didn't find more such cases, but, possibly
> > missed them, idk) in which it isn't. If your base IRI and CURIE suffix
> > both have a fragment, then the result is an invalid IRI, because a
> > literal '#' character isn't allowed to be present anywhere in an IRI
> > except to start its fragment. For example:
> > 
> > [...]
> 
> Looks like there are more cases like that; clearly CURIE expansion isn't
> guaranteed at all to produce a valid IRI. I suppose that sentence in the
> document is confusing. I mean, the only other meaning I can think of is
> that the CURIE itself, as-is, can be parsed as a valid IRI, but that
> isn't true either (for example, a CURIE prefix may contain characters
> that are invalid in an IRI's scheme part).
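
A minimal sketch of the prefix point above, in Python. The function name `prefix_is_valid_scheme` and the CURIE `my_prefix:x` are illustrative assumptions, not taken from any spec or from this thread. It assumes the RFC 3986 scheme grammar (a letter followed by letters, digits, "+", "-" or "."), whereas a CURIE prefix is an NCName and so may contain, for instance, an underscore.

```python
import re

# RFC 3986 scheme grammar: ALPHA *( ALPHA / DIGIT / "+" / "-" / "." )
SCHEME_RE = re.compile(r"[A-Za-z][A-Za-z0-9+.-]*")


def prefix_is_valid_scheme(curie: str) -> bool:
    """Return True if the part before the first ':' would parse as a scheme.

    Hypothetical helper: it only looks at the prefix and does not check
    the rest of the string.
    """
    prefix, sep, _reference = curie.partition(":")
    return bool(sep) and SCHEME_RE.fullmatch(prefix) is not None


print(prefix_is_valid_scheme("ru:x.y.z"))     # True
print(prefix_is_valid_scheme("my_prefix:x"))  # False: "_" is not allowed in a scheme
```

So a CURIE read as-is is not guaranteed to be a valid IRI, independent of what its expansion looks like.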
> 
> Just to give an example, one of the weird cases is when the base IRI has
> an authority but empty path, and the CURIE suffix is a relative path:
> 
> Base: https://riseup.net
> CURIE: ru:x.y.z
> Result: https://riseup.netx.y.z
> 
> Base: https://
> CURIE: ru:x.y.z.org
> Result: https://x.y.z.org
> 
> (These happen to be valid IRIs, just really weird ones, and one can
> produce invalid ones in the same way.)
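
The concatenation discussed above can be sketched in Python as follows. This is only an illustration under stated assumptions: `expand_curie` and `has_at_most_one_hash` are hypothetical names, the prefix map is a plain dict, and the "#" count is a single necessary condition for validity, not a full IRI validator. The `ex:` mapping and `http://example.org/ns#` are made up for the double-fragment case; the two `ru:` cases are the ones from the message.

```python
def expand_curie(curie: str, prefixes: dict[str, str]) -> str:
    """Expand a CURIE by plain concatenation of the mapped IRI and the suffix."""
    prefix, sep, reference = curie.partition(":")
    if not sep or prefix not in prefixes:
        raise ValueError(f"no mapping for the prefix of {curie!r}")
    return prefixes[prefix] + reference


def has_at_most_one_hash(iri: str) -> bool:
    """A literal '#' may only appear once, to start the fragment."""
    return iri.count("#") <= 1


# The two cases from the message: syntactically fine, but surprising.
print(expand_curie("ru:x.y.z", {"ru": "https://riseup.net"}))  # https://riseup.netx.y.z
print(expand_curie("ru:x.y.z.org", {"ru": "https://"}))        # https://x.y.z.org

# Hypothetical double-fragment case: both parts carry a '#'.
expanded = expand_curie("ex:a#b", {"ex": "http://example.org/ns#"})
print(expanded)                        # http://example.org/ns#a#b
print(has_at_most_one_hash(expanded))  # False
```

Nothing in the concatenation step itself rejects such a result, so any validity check has to be applied to the expanded string afterwards.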
> 
> 
> 
> -- 
> Shane McCarron
> Managing Director, Applied Testing and Technology, Inc.


----
Ivan Herman, W3C 
Publishing@W3C Technical Lead
Home: http://www.w3.org/People/Ivan/
mobile: +31-641044153
ORCID ID: https://orcid.org/0000-0003-0782-2704
Received on Friday, 4 January 2019 08:20:29 UTC