Re: A real problem with CURIEs and a proposal

From: Ivan Herman <ivan@w3.org>
Date: Thu, 26 Jan 2012 13:03:46 +0100
Cc: public-rdfa-wg <public-rdfa-wg@w3.org>, Gavin Carothers <gavin@carothers.name>
Message-Id: <3B6056B2-FA3E-4D46-9DC6-44D80DBDD828@w3.org>
To: Niklas Lindström <lindstream@gmail.com>

Niklas,

If I see

@property="http://a.b.c/"

in RDFa 1.1, what difference does it make whether 

- there is a @prefix="http: http:" definition somewhere, in which case this is treated as a CURIE, resulting in... the same URI
- there is no @prefix definition, in which case this is treated as a URI

The generated triples are identical.

Yes, of course, the user can do a 

@prefix="http: someotherschema:"

in which case this will become a strange and probably useless URI. So yes, we give a rope to the user for hanging himself/herself, but that is true for any kind of other prefixes as well.
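
The three cases above can be sketched in a few lines of Python. This is only an illustration, not the processing rule from the spec: the expand() helper and the plain dict of prefix mappings are assumptions made for the example, and a real RDFa processor does considerably more.

```python
# Hypothetical helper (not taken from any RDFa processor): expand a
# CURIEorIRI value against the prefix mappings currently in scope.
def expand(value, prefixes):
    prefix, sep, reference = value.partition(":")
    if sep and prefix in prefixes:
        # A mapping for this prefix is in scope: concatenate it with
        # the rest of the value.
        return prefixes[prefix] + reference
    # No mapping in scope: the value is used unchanged.
    return value

value = "http://a.b.c/"

with_identity_prefix = expand(value, {"http": "http:"})
without_prefix = expand(value, {})
with_other_prefix = expand(value, {"http": "someotherschema:"})

print(with_identity_prefix)  # http://a.b.c/
print(without_prefix)        # http://a.b.c/
print(with_other_prefix)     # someotherschema://a.b.c/
```

The first two calls give the same result, so the generated triples would not differ; only the third mapping changes the outcome.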

I really fail to see what the problem is, I must admit, apart from a perception issue.

Let us take that this evening and make a final decision. This has been dragging on for too long...

Ivan


On Jan 26, 2012, at 12:53 , Niklas Lindström wrote:

> Ivan,
> 
> I am fully aware that preventing "//" doesn't disambiguate CURIEs from
> IRIs with schemes not using the authority construct (i.e. "opaque path
> IRI schemes", such as urn, tag, mailto, etc.). I elaborated on that in
> depth in the start of this thread [1].
> 
> But, as I have also explicitly pointed out, ISSUE-125 [2] raised by
> the RDF WG speaks of "normal IRIs", and the example plus reasoning
> used by Gavin indicates that it is these, IRIs using the
> "scheme://authority/path?query#fragment" form, that are of primary
> concern (Again I quote: "These are very easy to confuse with normal
> IRIs. In general it seems that the intent of CURIEs was to limit the
> right hand side to relative references"). This is explicitly what my
> suggestion addresses.
> 
> It is obvious that due to the requirements for CURIEs like:
> 
>   schema:Person/Doctor
>   og:video:height
>   db:resource/Albert_Einstein
> 
> we cannot prevent CURIEs from looking like IRIs using schemes without
> an authority part. We discussed this in ISSUE-90 and came to the
> conclusion that preventing the forms above is not viable, nor is it
> viable to require only SafeCURIEs where we now allow the CURIEorIRI
> construct. We accepted this and I am not revisiting it. (And while an
> alternate reality where "|" had been used at the inception of prefixes
> would solve everything, this is unfortunately not an option now.)
> 
> But if you read my original post [1] you see my arguments for why
> preventing "//" makes for a better situation in practice. We can
> prevent those forms, which is very good since they are the dominant
> forms of IRIs which people have expectations on. As for the other
> forms we can easily explain how to recognize them and which measures
> are needed for managing that remaining collision risk.
> 
> Consider the CURIEorIRI expressions below. Not knowing which prefixes
> are in effect, I argue that there are four here which by a quick scan
> should stand out as unambiguously being IRIs (and given my
> redefinition of CURIEs will be). The rest cannot be categorized a
> priori without knowing the effective prefix mappings.
> 
>    schema:Person/Engineer
>    og:video:height
>    db:resource/Albert_Einstein
>    /Engineer
>    http://dbpedia.org/resource/Albert_Einstein
>    spotify:track:21Phj46KeUHOWyZW9A9b7P
>    ./og:video:height
>    tag:dbpedia.org,2012:resource/Albert_Einstein
>    widget://c13c6f30-ce25-11e0-9572-0800200c9a66/index.html
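
The quick scan described above can be approximated in Python. This is a rough sketch under stated assumptions: could_be_curie() is a made-up name, the prefix pattern is an ASCII-only stand-in for NCName, and the rule applied is the proposed one (a reference may not start with "//").

```python
import re

# ASCII-only approximation of "[ prefix ] ':' reference" (assumption:
# real NCName prefixes also allow many non-ASCII characters).
CURIE_SHAPE = re.compile(r"^([A-Za-z_][A-Za-z0-9._-]*)?:(.*)$")

def could_be_curie(value):
    """True if some prefix mapping could make this value a CURIE,
    assuming references that start with '//' are disallowed."""
    match = CURIE_SHAPE.match(value)
    if match is None:
        return False  # no "prefix:" shape at all
    return not match.group(2).startswith("//")

expressions = [
    "schema:Person/Engineer",
    "og:video:height",
    "db:resource/Albert_Einstein",
    "/Engineer",
    "http://dbpedia.org/resource/Albert_Einstein",
    "spotify:track:21Phj46KeUHOWyZW9A9b7P",
    "./og:video:height",
    "tag:dbpedia.org,2012:resource/Albert_Einstein",
    "widget://c13c6f30-ce25-11e0-9572-0800200c9a66/index.html",
]

# Values that can never be CURIEs, whatever prefixes are in scope.
always_iri = [e for e in expressions if not could_be_curie(e)]
for e in always_iri:
    print(e)
```

Under these assumptions four of the nine expressions are reported as always being IRIs; the other five depend on the prefix mappings in effect.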
> 
> (Reading Shane's comment on my issues with the new note in the Core
> spec on how CURIE prefixes overshadow schemes, it appears that the
> assumptions about what people, authors vs. consumers, will believe are
> CURIEs or IRIs are different. Alas, I have a huge backlog of work to
> do today and have no time prior to our telecon to address this
> further. I hope that I've explained my case better now, and if not,
> that you have time to revisit the argumentation I presented at the
> beginning of this thread.)
> 
> Best regards,
> Niklas
> 
> [1]: http://lists.w3.org/Archives/Public/public-rdfa-wg/2012Jan/0045.html
> [2]: http://www.w3.org/2010/02/rdfa/track/issues/125
> 
> 
> 
> 2012/1/26 Ivan Herman <ivan@w3.org>:
>> Niklas,
>> 
>> I think the only disagreement between the two of us is the starting '//' issue, ie, whether we want to allow it or not. And yes, your modification of the CURIE definition makes it relatively easy to define it without the '//'. Comments on that:
>> 
>> - I cannot tell you how widespread its usage is out there. Clearly, the only reasonable(?) usage in RDFa 1.0 was to mimic, say, http and it is correct that this became moot in RDFa 1.1. Which also means that any RDFa 1.0 _content_ that used it would be interpreted in 'your version' of RDFa 1.1, through a different parsing route, yielding the same triples. This is certainly in favour of your proposal. As you say, due to the kindness of Facebook, we probably MUST change the CURIE definition anyway, so this change could be made at the same time.
>> 
>> - However, this does _not_ address the issue of a possible clash of a registered scheme and prefix. What it achieves is... well, I am not even sure in practice. If I see a
>> 
>> http://www.example.org
>> 
>> in a @property, and the user can define a @prefix="http: http:" then, well, this is mapped on a HTTP URI. If it is not allowed or that prefix is not used then it will be mapped on... the same HTTP URI. No difference in practice.
>> 
>> On the other hand, the issue of leading '//' is completely irrelevant for a number of URI schemes. For example, consider:
>> 
>> prefix="tag: http://www.w3.org/"
>> @property="tag:ivan@ivan-herman.net,2012-01-26:abcd"
>> 
>> This would be expanded, through the CURIE mechanism, to http://www.w3.org/ivan@ivan-herman.net,2012-01-26:abcd, but...
>> 
>> tag:ivan@ivan-herman.net,2012-01-26:abcd
>> 
>> is a perfectly valid URI, using the (little used) 'tag' scheme of RFC 4151 [1]! We have not avoided this clash.
>> 
>> Ie, the restriction on '//' does _not_ handle the scheme/prefix clash issue in general. That problem is simply the result of the fact that the XML WG, in the distant past, and in their infinite wisdom, chose the ':' character as a namespace separator character, and this choice has rippled around all the other documents that use a similar mechanism. The ONLY way to properly solve this issue is to use a different character at that place, say '|' as the CSS guys do, but that would create an unacceptable discrepancy between RDFa and the other, established and widely used serializations.
>> 
>> If this is so, I am in favour of the smallest possible change which does not require the RDFa processors to do one more (albeit trivial) check while parsing.
>> 
>> Cheers
>> 
>> Ivan
>> 
>> 
>> 
>> [1] http://www.ietf.org/rfc/rfc4151.txt
>> 
>> 
>> 
>> 
>> 
>> On Jan 26, 2012, at 03:47 , Niklas Lindström wrote:
>> 
>>> Hi Ivan,
>>> 
>>> Yes, I believe your proposal takes care of the new OpenGraph issue
>>> without touching anything else. But it doesn't seem to address any of
>>> the RDF WG concerns.
>>> 
>>> Given that we now need to change the CURIE definition anyway, I think
>>> it is prudent to clarify what our situation is.
>>> 
>>> With your proposal, the lexical space of CURIEs is now a perfect
>>> superset of the IRI lexical space. I.e. every full IRI can be a CURIE.
>>> It was nearly that before of course, but it should be noted.
>>> 
>>> (Interestingly, the use of CURIEs is then entirely equated with just
>>> using IRIs plus the prefix expansion mechanism, apart from the more
>>> permissive syntax of prefixes compared to IRI schemes.)
>>> 
>>> As for backwards incompatibility, there may be two sides to that coin:
>>> 
>>> 1. RDFa 1.1 allows IRIs and CURIEs to mix. For @about and @resource
>>> this means that in RDFa 1.1, as opposed to 1.0 (and as opposed to
>>> RDF/XML, Turtle and SPARQL), prefixes in scope (defined in parent
>>> elements or in an out-of-band host language based initial context)
>>> override anything that looks like a prefix. This expands, without
>>> distinction, the prefix of any CURIE -- including schemes on those
>>> lexically identical to IRIs -- to create longer IRIs. This we are all
>>> aware of. I wonder though, isn't this an incompatibility, since there
>>> are forms of RDFa 1.0 where IRIs in @about and @resource would be
>>> changed into other IRIs in RDFa 1.1 (e.g. if there is an
>>> @xmlns:http="..." declared)? Just to ensure we're on stable ground.
>>> 
>>> 2. Currently the local part of CURIEs are allowed to start with "//".
>>> If this was not allowed in RDFa 1.1, it would be formally
>>> backwards-incompatible. But are there *any* existing or desirable uses
>>> for this that would be blocked? The only example I've ever seen is in
>>> RDFa 1.0, where if one binds a prefix to itself (@prefix="http:
>>> http:"), one can use CURIEs in @typeof, @property, @rel and @rev.
>>> Which is moot in RDFa 1.1 since full IRIs are allowed there anyway.
>>> 
>>> Whether or not to now take the opportunity to disallow CURIEs from
>>> starting with "//" should be informed by answers to these questions,
>>> directed to everybody:
>>> 
>>> * Would that change address the concerns of the RDF WG regarding
>>> CURIEs being very easy to confuse with normal IRIs? I'd say yes in so
>>> far as it disambiguates any normal, authority-based IRI from being a
>>> potential CURIE.
>>> 
>>> * Does forbidding "//" from following the first ":" in CURIEs block
>>> any actual or desirable CURIE usage? Gavin also said: "In general it
>>> seems that the intent of CURIEs was to limit the right hand side to
>>> relative references". I cannot find any use case. What is the general
>>> opinion here?
>>> 
>>> * Everyone agrees on the existing, albeit arguably improbable, danger
>>> of confusion and undesirable expansion of schemes. Is this problem so
>>> minute in all conceivable scenarios that some prevention of this is
>>> not reasonable, even if it doesn't prevent any current CURIE usage?
>>> 
>>> Now, I understand your feeling about addressing this at this point of
>>> the process. My feeling is that this is the last chance we have to
>>> reduce the danger of conflation (eliminating it in the case of IRIs of
>>> the "scheme://" form). Of course, we should not let feelings dictate
>>> what we do.
>>> 
>>> I've revised my proposal (C) below to mirror yours (B) in syntax, and
>>> I altered it to only prevent the "//", nothing more (by adding
>>> "ipath-absolute" to the choices). Thus it is equivalent to yours with
>>> the exception of the construct choice: "//" iauthority ipath-abempty.
>>> 
>>> This makes our main options:
>>> 
>>> A. CURIEs today:
>>> 
>>>    curie       ::= [ [ prefix ] ':' ] reference
>>>    reference   ::= irelative-ref
>>> 
>>> B. CURIEs supporting OpenGraph:
>>> 
>>>    curie       ::= [ [ prefix ] ':' ] reference
>>>    reference   ::= ihier-part [ "?" iquery ] [ "#" ifragment ]
>>> 
>>> C. CURIEs supporting OpenGraph but not "prefix://":
>>> 
>>>    curie       ::= [ [ prefix ] ':' ] reference
>>>    reference   ::= ( ipath-absolute / ipath-rootless / ipath-empty )
>>>                        [ "?" iquery ] [ "#" ifragment ]
>>> 
>>> For comparison, this is the definition of IRI:
>>> 
>>>   IRI         = scheme ":" ihier-part [ "?" iquery ]
>>>                        [ "#" ifragment ]
>>> 
>>>   ihier-part  = "//" iauthority ipath-abempty
>>>               / ipath-absolute
>>>               / ipath-rootless
>>>               / ipath-empty
>>> 
>>> Best regards,
>>> Niklas
>>> 
>>> 
>>> 2012/1/25 Ivan Herman <ivan@w3.org>:
>>>> (Gavin, welcome in our midst:-)
>>>> 
>>>> Niklas,
>>>> 
>>>> first of all, your issue with OpenGraph is indeed compelling. Whether we like it or not, it so happens that we have a major customer out there that uses an illegal CURIE (illegal in RDFa 1.0, that is) and we cannot ignore that nor can we force them to change that. That train has left, so to say. Ie, I agree that RDFa 1.1 should try to accommodate this.
>>>> 
>>>> However. As you yourself say below, your proposed changes conflate various different issues that are unrelated to the Facebook issue, namely the now notorious '//' starting character issue. As I said many times, at this point in the process I am not really happy touching that (though I can see the, albeit improbable, danger of confusion there), let alone the issue of incompatibilities with RDFa 1.0 (our charter obligation is to create incompatibilities only when there are really pressing issues or major market/user push to do so).
>>>> 
>>>> Looking at the RFC [1], there is a simpler way to amend the definition of CURIE-s in our document, where the _only_ change is the fact that the ':' character would be allowed in the reference part. Indeed, here is a possible alternative for the CURIE definition:
>>>> 
>>>> curie       ::=   [ [ prefix ] ':' ] reference
>>>> reference   ::=   ihier-part [ "?" iquery ] [ "#" ifragment ] ; ('ihier-part', 'iquery' and 'ifragment' as defined in [RFC3987])
>>>> 
>>>> By doing that change, the _only_ difference between the old definition of CURIE-s and the new one is the fact that ':' characters are allowed in the reference, ie, the OpenGraph CURIE-s become valid. I have put more details on this derivation in the Post Scriptum, if you want to check this (and somebody should, to be sure about it!).
>>>> 
>>>> My conclusion is therefore that (a) yes, we have a problem, Niklas is right; (b) my proposed change addresses this and only this issue and does not create backward compatibility issues (and is also simpler in the spec :-). I would therefore prefer to go along this alternative.
>>>> 
>>>> As for the discrepancy with Turtle: yes, I believe that this will be yet another difference between Turtle and RDFa, but Gavin should tell me whether I am wrong. However, the Facebook RDFa usage is, I am afraid, compelling enough that we have to do that and accept the differences.
>>>> 
>>>> (Gavin, the other example that did come up on our call, is the fact that, for example, Schema.org has type URI-s of the form A/B/C, and it is imperative that RDFa can do something like schema:A/B/C...)
>>>> 
>>>> Cheers
>>>> 
>>>> Ivan
>>>> 
>>>> P.S. Just to save you guys from going to the RFC document, here are the major points.
>>>> 
>>>> The previous CURIE definition was
>>>> 
>>>> curie       ::=   [ [ prefix ] ':' ] reference
>>>> reference   ::=   irelative-ref ; (as defined in [RFC3987])
>>>> 
>>>> and the RFC says:
>>>> 
>>>> irelative-ref  = irelative-part [ "?" iquery ] [ "#" ifragment ]
>>>> irelative-part = "//" iauthority ipath-abempty
>>>>                 / ipath-absolute
>>>>                 / ipath-noscheme  <-- !
>>>>                 / ipath-empty
>>>> 
>>>> Remember that arrow that I have put there (this is not in the original RFC).
>>>> 
>>>> The new alternative refers to ihier-part, which is defined as:
>>>> 
>>>> ihier-part     = "//" iauthority ipath-abempty
>>>>                 / ipath-absolute
>>>>                 / ipath-rootless   <-- !
>>>>                 / ipath-empty
>>>> 
>>>> Again the arrow is mine; this is indeed the only difference between ihier-part (used in the proposed new version of CURIE) and irelative-part (used, via irelative-ref, in the current version).
>>>> 
>>>> Going down in the RFC, one finds:
>>>> 
>>>> ipath-noscheme = isegment-nz-nc *( "/" isegment )
>>>> ipath-rootless = isegment-nz *( "/" isegment )
>>>> 
>>>> ie, the only difference is that '-nc' stuff.
>>>> 
>>>> The definitions of these two are not completely symmetric, because isegment-nz uses yet another indirection:
>>>> 
>>>> isegment-nz    = 1*ipchar
>>>> ipchar         = iunreserved / pct-encoded / sub-delims / ":" / "@"
>>>> 
>>>> whereas isegment-nz-nc is defined directly:
>>>> 
>>>> isegment-nz-nc = 1*( iunreserved / pct-encoded / sub-delims / "@" )
>>>> 
>>>> But, as you can see, the _only_ difference between the two, at the end of the day, is that isegment-nz-nc disallows the ':' character that isegment-nz allows. QED, as they put it in mathematical proofs :-)
>>>> 
>>>> 
>>>> [1] http://tools.ietf.org/html/rfc3987
>>>> 
>>>> 
>>>> On Jan 25, 2012, at 01:50 , Niklas Lindström wrote:
>>>> 
>>>>> Hi Ivan!
>>>>> 
>>>>> (I'm CC:ing Gavin Carothers who raised ISSUE-125, since we're now
>>>>> discussing whether this addresses those concerns at all.)
>>>>> 
>>>>> 2012/1/24 Ivan Herman <ivan@w3.org>:
>>>>>> Niklas,
>>>>>> 
>>>>>> I think your analysis on the Open Graph protocol issue is correct.
>>>>> 
>>>>> Good. Do you think that we should fix this? (I've been assuming that
>>>>> we do want that, and even that most(?) of us thought it was already
>>>>> supported.) Of course, it is already the case today that since RDFa
>>>>> 1.0 defines CURIEs like this as well, the OG usage is in fact invalid.
>>>>> I don't know how many RDFa processors actually break on that though.
>>>>> Since one could not mix (unsafe) CURIEs and IRIs in RDFa 1.0 I'd
>>>>> expect most of them to just split on ":" and expand the prefix part.
>>>>> 
>>>>>> My issue, however, is: if we go along the lines you propose, we are getting even further away from a compatibility with Turtle/SPARQL, an issue that has already been raised by the RDF WG. I am not sure what the best forum is for that.
>>>>> 
>>>>> That was not my intention. :( The change *does* allow colon ":" in the
>>>>> first segment of the local part now of course; explicitly in order to
>>>>> support the OG form of CURIEs. This admittedly gets us further away.
>>>>> 
>>>>> But by disallowing CURIEs to start with "prefix://", I hoped that it
>>>>> would mitigate (if not fully address) one of the concerns that the
>>>>> RDF-WG expressed, of confusing them with normal IRIs. As Gavin said:
>>>>> "These are very easy to confuse with normal IRIs. In general it seems
>>>>> that the intent of CURIEs was to limit the right hand side to relative
>>>>> references but that is not accomplished by using the "irelative-ref"
>>>>> production from the IRI RFC."
>>>>> 
>>>>> So I set out to fulfill the goals of supporting CURIEs like:
>>>>> 
>>>>>    og:video:width
>>>>>    schema:Person/Engineer
>>>>>    ex:some?very=special#thing
>>>>> 
>>>>> while not allowing CURIEs of the forms like Gavin's example:
>>>>> 
>>>>>    prefix://user:password[2001:0db8:85a3:0000:0000:8a2e:0370:7334]:8080/
>>>>> 
>>>>> Nor any other IRI using the "//" authority path form (like http and https IRIs).
>>>>> 
>>>>> I find five things to consider:
>>>>> 
>>>>> 1. CURIEs do not currently allow e.g. "og:video:height". PNames don't
>>>>> either. We however have RDFa in the wild using that form (both with
>>>>> the original Open Graph Protocol using RDFa 1.0 and the new Open Graph
>>>>> using RDFa 1.1 with @prefix).
>>>>> 
>>>>> 2. CURIEs support lots of special characters in the local part; PNames
>>>>> don't. The same reasoning as in 1 seems to apply, with our explicit
>>>>> requirements being to support e.g. "schema:Person/Engineer" and
>>>>> "db:resource/Albert_Einstein" (and "ex:some?very=special#thing", I
>>>>> suppose).
>>>>> 
>>>>> 3. CURIEs are allowed to be identical to full IRIs today (PNames most
>>>>> definitely aren't). Gavin expressed concerns about this ("Host parts,
>>>>> IPv4 and IPv6 segments") because they can be confused with normal
>>>>> IRIs. I interpret that as meaning those with "//" and authority after
>>>>> the scheme. I propose to not allow the CURIE local part to start with
>>>>> "/" (and thus not "//").
>>>>> 
>>>>> 4. CURIE prefixes, being defined as NCNames, allow some forms which
>>>>> are not allowed in PName prefixes (e.g. prefixes starting with "_").
>>>>> We may be able to use PN_PREFIX instead without breaking any real use
>>>>> case.
>>>>> 
>>>>> 5. I kept using the ABNF from the IRI RFC because CURIEs are based on
>>>>> IRIs. The RDF WG asked us to use W3C EBNF. Provided that we should
>>>>> address any of the above I'd gather that it is a sound request to do
>>>>> so using EBNF.
>>>>> 
>>>>> My hope is that if we were to address these, the RDF WG would find the
>>>>> results satisfactory, even if the CURIE definition ends up as a
>>>>> superset of PName.
>>>>> 
>>>>> (Note that point 1 and 2 may also be of interest for the RDF WG
>>>>> regarding PNames.)
>>>>> 
>>>>> Best regards,
>>>>> Niklas
>>>>> 
>>>>> PS. You know that point 3 has vexed me, but please believe that I
>>>>> don't want to reopen ISSUE-90. That suggested more invasive changes
>>>>> which don't work with use cases as per above. I've absolutely accepted
>>>>> that. I approached this based on ISSUE-125 along with the observation
>>>>> of the Open Graph issue. Part of that suggested that point 3 is of
>>>>> concern, and that it may be addressed without affecting our needs. I
>>>>> want to keep the changes to a minimum while supporting as many
>>>>> concerns as possible (usability and safety being the primary
>>>>> objectives).
>>>>> 
>>>>> 
>>>>>> Manu: will you be at the Coordination Group tomorrow? Maybe worth raising the issue there?
>>>>>> 
>>>>>> ivan
>>>>>> 
>>>>>> On Jan 24, 2012, at 04:01 , Niklas Lindström wrote:
>>>>>> 
>>>>>>> Hello,
>>>>>>> 
>>>>>>> I've been investigating some of the minute details and issues
>>>>>>> surrounding CURIEs, based on the discussion that recently cropped up
>>>>>>> with ISSUE-125 [1].
>>>>>>> 
>>>>>>> It seems to me that the definition we currently have is flawed in one
>>>>>>> more way, and quite crucially so.
>>>>>>> 
>>>>>>> 
>>>>>>> ## The Problem ##
>>>>>>> 
>>>>>>> As we already know, a bunch of Facebook OpenGraph properties are
>>>>>>> expressed with CURIEs where the parts after the prefix themselves
>>>>>>> contain colons. For instance, "video:actor:role", and
>>>>>>> "my-og-app:podcast:url" as seen in the examples at [2]. (There are
>>>>>>> also 13 such properties defined in <http://ogp.me/ns#>, e.g.
>>>>>>> "og:image:width" and "og:video:height".)
>>>>>>> 
>>>>>>> We currently define CURIEs as:
>>>>>>> 
>>>>>>>    curie       ::=   [ [ prefix ] ':' ] reference
>>>>>>>    reference   ::=   irelative-ref ; (as defined in [RFC3987])
>>>>>>> 
>>>>>>> Now, I may be too tired to see clearly, but if I read the definition
>>>>>>> of irelative-ref in section 2.2 of RFC 3987 [3] correctly, it actually
>>>>>>> prohibits such CURIEs!
>>>>>>> 
>>>>>>> Let me explain. I find these to be the relevant definitions in RFC 3987:
>>>>>>> 
>>>>>>>    irelative-ref  = irelative-part [ "?" iquery ] [ "#" ifragment ]
>>>>>>> 
>>>>>>>    irelative-part = "//" iauthority ipath-abempty
>>>>>>>                   / ipath-absolute
>>>>>>>                   / ipath-noscheme
>>>>>>>                   / ipath-empty
>>>>>>> 
>>>>>>>    ipath-absolute = "/" [ isegment-nz *( "/" isegment ) ]
>>>>>>>    ipath-noscheme = isegment-nz-nc *( "/" isegment )
>>>>>>>    ipath-empty    = 0<ipchar>
>>>>>>> 
>>>>>>>    isegment-nz-nc = 1*( iunreserved / pct-encoded / sub-delims
>>>>>>>                        / "@" )
>>>>>>>                  ; non-zero-length segment without any colon ":"
>>>>>>> 
>>>>>>> If I interpret the ABNF [4] properly, given "og:image:width", I get
>>>>>>> the following:
>>>>>>> 
>>>>>>> * "og:" matches the prefix and ":", so we match "image:width" against
>>>>>>> irelative-ref;
>>>>>>> * there is no "?" or "#" in that, so only irelative-part is considered;
>>>>>>> * it does not start with "//", so we skip the following (iauthority
>>>>>>> ipath-abempty) of the first alternative;
>>>>>>> * it does not start with "/", so it is not an ipath-absolute;
>>>>>>> * it contains a colon ":", so it is not an ipath-noscheme (does not
>>>>>>> match isegment-nz-nc *( "/" isegment ));
>>>>>>> * it is not empty, so it is not an ipath-empty.
>>>>>>> 
>>>>>>> With no more alternatives in irelative-part, I conclude that
>>>>>>> "og:image:width" is not a valid CURIE!
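
The walk-through above can be checked mechanically. The following Python sketch transcribes the two path productions from RFC 3987 into regular expressions, assuming ASCII-only input (the ucschar ranges of iunreserved are left out) and looking only at the path alternatives relevant to a reference that does not start with "/". The variable names are made up for this example.

```python
import re

# ASCII-only character classes (assumption: ucschar is omitted).
UNRESERVED = r"A-Za-z0-9\-._~"
SUB_DELIMS = r"!$&'()*+,;="
PCT_ENCODED = r"%[0-9A-Fa-f]{2}"

# ipchar allows ":"; the "-nc" variant does not.
IPCHAR = rf"(?:[{UNRESERVED}{SUB_DELIMS}@:]|{PCT_ENCODED})"
IPCHAR_NC = rf"(?:[{UNRESERVED}{SUB_DELIMS}@]|{PCT_ENCODED})"

# ipath-noscheme = isegment-nz-nc *( "/" isegment )
IPATH_NOSCHEME = re.compile(rf"{IPCHAR_NC}+(?:/{IPCHAR}*)*")
# ipath-rootless = isegment-nz *( "/" isegment )
IPATH_ROOTLESS = re.compile(rf"{IPCHAR}+(?:/{IPCHAR}*)*")

for reference in ["image:width", "Person/Engineer"]:
    noscheme = IPATH_NOSCHEME.fullmatch(reference) is not None
    rootless = IPATH_ROOTLESS.fullmatch(reference) is not None
    print(reference, "noscheme:", noscheme, "rootless:", rootless)
```

With these simplified patterns, "image:width" fails ipath-noscheme because of the colon in its first segment, while it does match ipath-rootless; "Person/Engineer" matches both.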
>>>>>>> 
>>>>>>> Please correct me if I'm wrong here! If not, it is quite evident that
>>>>>>> we have to fix this (lest we accept to break a widely deployed
>>>>>>> de-facto usage).
>>>>>>> 
>>>>>>> Ironically, we *do* allow for CURIEs to begin with "//". This makes it
>>>>>>> possible to use CURIEs *indistinguishable* from "normal" IRIs (using
>>>>>>> authority and paths), as explained in ISSUE-125 (and in my old (dead
>>>>>>> horse) ISSUE-90 [5]).
>>>>>>> 
>>>>>>> 
>>>>>>> ## The Proposal ##
>>>>>>> 
>>>>>>> We have the opportunity here to fix a lot of things. I propose to
>>>>>>> define CURIEs along the lines of:
>>>>>>> 
>>>>>>>    curie           =   [ prefix ] ':' local
>>>>>>>    prefix          =   PN_PREFIX; as defined in SPARQL 1.1 [6]
>>>>>>>    local           =   (ipath-rootless / ipath-empty)
>>>>>>>                            [ "?" iquery ] [ "#" ifragment ]
>>>>>>> 
>>>>>>>    ipath-rootless  = isegment-nz *( "/" isegment )
>>>>>>>    isegment        = *ipchar
>>>>>>>    isegment-nz     = 1*ipchar
>>>>>>>    ipchar          = iunreserved / pct-encoded / sub-delims / ":"
>>>>>>>                        / "@"
>>>>>>> 
>>>>>>> For comparison, this is the definition of the full IRI:
>>>>>>> 
>>>>>>>    IRI         = scheme ":" ihier-part [ "?" iquery ]
>>>>>>>                         [ "#" ifragment ]
>>>>>>> 
>>>>>>>    ihier-part  = "//" iauthority ipath-abempty
>>>>>>>                / ipath-absolute
>>>>>>>                / ipath-rootless
>>>>>>>                / ipath-empty
>>>>>>> 
>>>>>>> 
>>>>>>> ## The Consequences ##
>>>>>>> 
>>>>>>> This (if I'm awake enough) still allows for *all* the use cases that
>>>>>>> have hitherto been put forward as needed. E.g.:
>>>>>>> 
>>>>>>>    schema:Person/Doctor
>>>>>>>    og:video:height
>>>>>>>    db:resource/Albert_Einstein
>>>>>>>    ex:some?very=special#thing
>>>>>>> 
>>>>>>> (While it is true that it would prevent the "hack" once presented as a
>>>>>>> means of using full IRIs where RDFa 1.0 only allows CURIEs (by using
>>>>>>> @xmlns:http="http:"), isn't that moot? Any processor affected by this
>>>>>>> change in RDFa 1.1 should reasonably use RDFa 1.1 rules, where we now
>>>>>>> allow such IRIs anywhere CURIEs are allowed. (And for that matter, I
>>>>>>> don't recall any reports of actual usage of that.))
>>>>>>> 
>>>>>>> Most importantly, this completely eliminates the risk of confusing
>>>>>>> CURIEs with normal IRIs. That is, IRIs with a scheme followed by "//",
>>>>>>> an authority, and a path of segments (separated with "/"), followed by
>>>>>>> optional "?" query and "#" fragment parts. These are the kinds of IRIs
>>>>>>> that can be expressed in various relative forms and resolved against a
>>>>>>> base IRI.
>>>>>>> 
>>>>>>> Looking at the list of official and common URI schemes at [7], I find
>>>>>>> that of the 137 schemes, 71 (52%) are in the authority+path form. As
>>>>>>> we know, the prevalent two on the web, http and https, are of this
>>>>>>> kind (arguably the only relevant ones). I'd wager that we can expect
>>>>>>> this form to stay prevalent on the web *even* if "http" were to be
>>>>>>> eventually superseded. (I say so because relative paths are immensely
>>>>>>> usable, and there is an abundance of code dealing with hierarchical
>>>>>>> URL/URI resolution. Combined with the DNS-based authority model it's
>>>>>>> reasonably here to stay.)
>>>>>>> 
>>>>>>> Note also the fact that "http" used as prefix has already turned up in
>>>>>>> the wild, due to the HTTP Vocabulary Working Draft [8]. This has even
>>>>>>> been used in the RDFa 1.1 Core spec itself (as I recently reported in
>>>>>>> my review). To my knowledge, we have asked the ERT WG to change this,
>>>>>>> but this has not yet happened. With this change, such a prefix would
>>>>>>> no longer be a (technical) problem.
>>>>>>> 
>>>>>>> The other form is of the "opaque" IRIs (without an authority part and
>>>>>>> possibly no "/" separated segments (i.e. "non-relativizable")).
>>>>>>> Seemingly we've hitherto *unintentionally* prevented some of them
>>>>>>> (e.g. urn: and tag: URIs); but at the price of the OpenGraph CURIEs.
>>>>>>> There are some fairly well-known schemes in this group (official or
>>>>>>> not), e.g.: mailto, tag, urn, doi, geo, tel, callto, news, xmpp, sip,
>>>>>>> sms, bitcoin, gtalk, skype, spotify. Of these, "tag" and "geo" can be
>>>>>>> found in prefix.cc. (I've previously mentioned that "geo" may be of
>>>>>>> some concern for certain RDFa users [9].) But as we've already
>>>>>>> concluded when resolving ISSUE-90, we argue that these will probably
>>>>>>> not be used as prefixes, and will be quite uncommon as schemes of
>>>>>>> subject or object IRIs in RDFa. Also, given that many IRIs using these
>>>>>>> schemes already are reminiscent of CURIEs, and are of a rather
>>>>>>> specialized nature, I'd imagine that it's easier for anyone coming
>>>>>>> across such oddities to recognize the collision risk, should it ever
>>>>>>> happen. We should still be very clear in the section about CURIEs
>>>>>>> though, that prefixes overshadow schemes in IRIs of these forms, and
>>>>>>> that we advise users to monitor the in-scope prefixes for any such
>>>>>>> collision (along with the workaround accomplishable by using e.g.
>>>>>>> @prefix="geo: geo:").
>>>>>>> 
>>>>>>> 
>>>>>>> ## Summary ##
>>>>>>> 
>>>>>>> I sincerely hope that I have interpreted the ABNF correctly and
>>>>>>> haven't raised the issue of OpenGraph CURIEs in error. And that I have
>>>>>>> made a clear and satisfactory draft proposal for fixing both this and
>>>>>>> the problems raised in ISSUE-125 (primarily the risk of confusing
>>>>>>> CURIEs with normal IRIs).
>>>>>>> 
>>>>>>> Best regards,
>>>>>>> Niklas
>>>>>>> 
>>>>>>> [1]: http://www.w3.org/2010/02/rdfa/track/issues/125
>>>>>>> [2]: http://developers.facebook.com/docs/opengraph/objects/builtin/
>>>>>>> [3]: http://tools.ietf.org/html/rfc3987#section-2.2
>>>>>>> [4]: http://en.wikipedia.org/wiki/Augmented_Backus%E2%80%93Naur_Form
>>>>>>> [5]: http://www.w3.org/2010/02/rdfa/track/issues/90
>>>>>>> [6]: http://www.w3.org/TR/2012/WD-sparql11-query-20120105/#rPNAME_LN
>>>>>>> [7]: http://en.wikipedia.org/wiki/URI_scheme
>>>>>>> [8]: http://www.w3.org/TR/HTTP-in-RDF10/
>>>>>>> [9]: http://lists.w3.org/Archives/Public/public-rdfa-wg/2011Aug/0039.html
>>>>>>> 
>>>>>> 
>>>>>> 
>>>>>> ----
>>>>>> Ivan Herman, W3C Semantic Web Activity Lead
>>>>>> Home: http://www.w3.org/People/Ivan/
>>>>>> mobile: +31-641044153
>>>>>> FOAF: http://www.ivan-herman.net/foaf.rdf
>>>>>> 
>>>>>> 
>>>>>> 
>>>>>> 
>>>>>> 
>>>>> 
>>>> 
>>>> 
>>>> ----
>>>> Ivan Herman, W3C Semantic Web Activity Lead
>>>> Home: http://www.w3.org/People/Ivan/
>>>> mobile: +31-641044153
>>>> FOAF: http://www.ivan-herman.net/foaf.rdf
>>>> 
>>>> 
>>>> 
>>>> 
>>>> 
>> 
>> 
>> ----
>> Ivan Herman, W3C Semantic Web Activity Lead
>> Home: http://www.w3.org/People/Ivan/
>> mobile: +31-641044153
>> FOAF: http://www.ivan-herman.net/foaf.rdf
>> 
>> 
>> 
>> 
>> 
> 


----
Ivan Herman, W3C Semantic Web Activity Lead
Home: http://www.w3.org/People/Ivan/
mobile: +31-641044153
FOAF: http://www.ivan-herman.net/foaf.rdf







Received on Thursday, 26 January 2012 12:02:20 UTC
