Re: Several minor problems in the grammar for the functional-style syntax from Peter F. Patel-Schneider on 2009-03-22 (public-owl-wg@w3.org from March 2009)

From: Peter F. Patel-Schneider <pfps@research.bell-labs.com>
Date: Sun, 22 Mar 2009 09:23:56 -0400 (EDT)
To: bparsia@cs.manchester.ac.uk
Cc: boris.motik@comlab.ox.ac.uk, ivan@w3.org, public-owl-wg@w3.org
Message-Id: <20090322.092356.194012045.pfps@research.bell-labs.com>

From: Bijan Parsia <bparsia@cs.manchester.ac.uk>
Subject: Re: Several minor problems in the grammar for the functional-style syntax
Date: Sun, 22 Mar 2009 12:42:14 +0000

[...]

> So, basically, the options are:
> 	1) Forbid [problem] characters in CURIEs, [...]
> 	2) Require percent encoding of problem characters
> 	3) Require safe CURIEs (per the spec)
> 
> The problem with 2 is that URI comparison becomes a bit trickier. What do
> we say now? We'd have to make sure that there was a URI normalization
> phase (or only a CURIE normalization?)
> 
> The problem with 3 is that it adds a bit of logic to the CURIE parse
> phase (i.e., check for a leading [, make sure there's a trailing ])
> 
> The problem with 1 is the burden on serializers.
> 
> I prefer 1. I think it's the smallest change from the status quo.
> 
> Cheers,
> Bijan.

The third alternative (#3) means that in the FS one has to put [] around
the short (and supposedly nicer-to-read) form of identifiers.  Ugh.  I
think that this is the worst alternative, particularly as other SW
serializations don't go this way.

The first alternative (#1) is somewhat better.  However, if we go this
way I would prefer going all the way back to QNames, even though QNames
are unnecessarily restrictive.  Going half-way means that our "CURIEs"
are some entirely new thing.  An advantage of going back to QNames is
that other SW serializations do this or close to this.  (Perhaps we
could slightly generalize QNames as does SPARQL, specifically to allow
leading digits.)
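
As a rough illustration of the leading-digit generalization, here is a
minimal sketch in Python.  The patterns are an ASCII-only approximation
made up for this example; they are not the real XML NCName or SPARQL
productions, which allow many more characters.

```python
import re

# ASCII-only approximation of name characters (an assumption for
# illustration; the real grammars are Unicode-aware and more detailed).
NAME_REST = r"[A-Za-z0-9_.-]*"
PREFIX = r"[A-Za-z_]" + NAME_REST

# QName-like: the local part must not start with a digit.
STRICT = re.compile(rf"(?:{PREFIX})?:[A-Za-z_]{NAME_REST}")

# Relaxed in the SPARQL style: the local part may start with a digit.
RELAXED = re.compile(rf"(?:{PREFIX})?:[A-Za-z0-9_]{NAME_REST}")


def is_strict_name(text: str) -> bool:
    """True if text is an abbreviated name under the strict pattern."""
    return STRICT.fullmatch(text) is not None


def is_relaxed_name(text: str) -> bool:
    """True if text is an abbreviated name under the relaxed pattern."""
    return RELAXED.fullmatch(text) is not None
```

Under these assumed patterns a name such as ex:123abc is rejected by the
strict form and accepted by the relaxed one, while a name containing a
parenthesis is rejected by both.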

I would have preferred the second alternative (#2), *except* that it has
a big problem.  The problem is how to figure out which percent encodings
are done for our purposes, and which are done for purposes of encoding a
character in an IRI that clashes with an IRI delimiter.  I think that
this means that we can't recover the "true IRI" from our encoding.  To
fix this, we *could* have a different method for encoding our problem
characters, but I don't suggest going there.
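
To make the round-trip problem concrete, here is a small sketch in
Python.  The choice of parentheses as the "problem characters" and the
two helper functions are assumptions for illustration only; nothing here
is taken from the spec.

```python
from urllib.parse import quote

# Hypothetical set of characters that clash with functional-syntax
# delimiters (an assumption made for this example).
PROBLEM_CHARS = "()"


def encode_problem_chars(iri: str) -> str:
    """Percent-encode only the assumed problem characters."""
    return "".join(
        quote(ch, safe="") if ch in PROBLEM_CHARS else ch for ch in iri
    )


def decode_problem_chars(text: str) -> str:
    """Naively undo the encoding by decoding every matching escape."""
    out = text
    for ch in PROBLEM_CHARS:
        out = out.replace(quote(ch, safe=""), ch)
    return out


# Two different IRIs: one with literal parentheses, one that already
# carried percent escapes before we touched it.
literal = "http://example.org/f(x)"
escaped = "http://example.org/f%28x%29"

# Both map to the same encoded text, so decoding cannot tell them apart.
same_encoding = encode_problem_chars(literal) == encode_problem_chars(escaped)
round_trip = decode_problem_chars(encode_problem_chars(escaped))
```

Here same_encoding is true, and round_trip comes back as the literal
form rather than the already-escaped IRI we started with, which is the
loss of the "true IRI" described above.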

peter

PS:  Note that this only affects the functional syntax and the
     Manchester syntax.

Received on Sunday, 22 March 2009 13:22:53 UTC