- From: <akrasner@riseup.net>
 - Date: Thu, 03 Jan 2019 02:17:48 -0800
 - To: www-html@w3.org
 
Hello!
tl;dr if a base URI and a CURIE suffix both have a fragment, expanding
the CURIE results in an invalid IRI, am I correct?
I'm implementing CURIE[1] expansion, i.e. turning a CURIE into an IRI by
concatenating its suffix with some given base URI. That working group
note linked in [1] suggests that "In all cases a parsed CURIE will
produce an IRI". I'm not sure what it means, but I started implementing
expansion, and I was wondering, when you concatenate a base URI and a
CURIE's suffix, is the result *always* a valid IRI?
I noticed one case in which it isn't (I didn't find others, but I may
have missed some). If your base IRI and CURIE suffix
both have a fragment, then the result is an invalid IRI, because a
literal '#' character isn't allowed to be present anywhere in an IRI
except to start its fragment. For example:
Base IRI: https://riseup.net/some/path#
CURIE: ru:something#xyz
Result: https://riseup.net/some/path#something#xyz
The result's fragment part is "something#xyz", which contains a '#',
and that makes the result an invalid IRI.
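To make the case concrete, here is a minimal Python sketch of naive expansion by concatenation. The names `expand_curie` and `has_single_fragment_delimiter` and the prefix map are hypothetical, made up for illustration. The check covers only the '#' rule described above and is not a full IRI validator.

```python
def expand_curie(curie: str, prefixes: dict[str, str]) -> str:
    """Naively expand a CURIE: look up the prefix, append the suffix."""
    prefix, sep, suffix = curie.partition(":")
    if not sep or prefix not in prefixes:
        raise ValueError(f"unknown or missing prefix in CURIE: {curie!r}")
    return prefixes[prefix] + suffix


def has_single_fragment_delimiter(iri: str) -> bool:
    """Check only the '#' rule: a literal '#' may appear at most once."""
    return iri.count("#") <= 1


# Hypothetical prefix map matching the example above.
prefixes = {"ru": "https://riseup.net/some/path#"}

result = expand_curie("ru:something#xyz", prefixes)
print(result)                                 # https://riseup.net/some/path#something#xyz
print(has_single_fragment_delimiter(result))  # False
```

A real implementation would presumably validate the result against the full IRI grammar rather than just counting '#' characters.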
What I'm wondering is:
(1) Am I observing correctly that CURIE expansion can indeed produce
invalid IRIs, so I should be prepared to return an error (or an invalid
IRI) when implementing this expansion?
(2) When expanding, should I / may I percent-encode that '#' character
so that I do always get a valid IRI? My use case is JSON-LD and RDF, in
which the exact IRI string is the ID of something, so it has to be
precise.
(3) I was thinking (just sharing the thought) of having 3 variants of
expansion:
    (a) Regular CURIE, expansion may produce invalid IRI
    (b) CURIE can't have a fragment, always produces valid IRI
    (c) Base IRI can't have a fragment, always produces valid IRI
For a given base IRI, it seems to me that in practice, 99% of the time
(if not 100%) it's one of the latter two cases, i.e. either your
XML/RDF/whatever base IRI contains a fragment and your CURIEs get
appended to it (so they probably don't contain a fragment), or your base
IRI has no fragment, and possibly your CURIEs have fragments.
Example for (b):
Base IRI: https://riseup.net/some/path#
CURIE: ru:abc
Example for (c):
Base IRI: https://riseup.net/some/path
CURIE: ru:#abc
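The three variants could be sketched roughly as follows in Python. The `Mode` enum and the `expand` function are hypothetical names, not taken from the CURIE note. The sketch assumes the CURIE suffix is itself well formed (so it contains at most one '#'), and it only guards against the double-'#' problem, not other ways an IRI can be invalid.

```python
from enum import Enum


class Mode(Enum):
    REGULAR = "a"            # (a) no restriction, result may be invalid
    NO_CURIE_FRAGMENT = "b"  # (b) CURIE suffix must not contain '#'
    NO_BASE_FRAGMENT = "c"   # (c) base IRI must not contain '#'


def expand(base: str, suffix: str, mode: Mode) -> str:
    """Concatenate base and suffix, enforcing the restriction of the mode.

    Assumption: the suffix is well formed, i.e. has at most one '#'.
    """
    if mode is Mode.NO_CURIE_FRAGMENT and "#" in suffix:
        raise ValueError("variant (b): CURIE suffix must not contain '#'")
    if mode is Mode.NO_BASE_FRAGMENT and "#" in base:
        raise ValueError("variant (c): base IRI must not contain '#'")
    return base + suffix


# Example for (b): the fragment comes from the base IRI.
print(expand("https://riseup.net/some/path#", "abc", Mode.NO_CURIE_FRAGMENT))
# https://riseup.net/some/path#abc

# Example for (c): the fragment comes from the CURIE.
print(expand("https://riseup.net/some/path", "#abc", Mode.NO_BASE_FRAGMENT))
# https://riseup.net/some/path#abc
```

With variants (b) and (c), the double-'#' case is rejected up front instead of surfacing later as an invalid IRI, while variant (a) leaves any validation to the caller.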
I'd love to hear thoughts, especially about whether I'm observing
correctly that CURIEs may produce invalid IRIs if both CURIE and base
have fragments :)
-- a.k.
[1]: https://www.w3.org/TR/curie/
Received on Thursday, 3 January 2019 10:49:20 UTC