- From: Andy Seaborne <andy.seaborne@epimorphics.com>
- Date: Tue, 21 Feb 2012 19:02:45 +0000
- To: RDF-WG <public-rdf-wg@w3.org>
This comment to the SPARQL-WG is also about wanting unescaped / in
Turtle prefix names.

Andy

On 18/02/12 02:43, David Robillard wrote:
> Hello,
>
> Apologies for sending this past the Last Call, but I have a comment
> about the decision to combine PNames and Property Paths in SPARQL and
> escaping PNames to resolve the problems this causes.
>
> My perspective is mainly that of a Turtle user/implementer. I
> discovered this issue updating my Turtle implementation[1] for the
> latest spec. I discovered that an odd new rule has been added to the
> grammar:
>
> [163s] PN_LOCAL_ESC ::= '\\' ( '_' | '~' | '.' | '-' | '!' | '$' | '&' |
> "'" | '(' | ')' | '*' | '+' | ',' | ';' | '=' | ':' | '/' | '?' | '#' |
> '@' | '%' )
>
> Unhappy with how ugly this is, and puzzled why such a specific seemingly
> arbitrary set of characters has been introduced as escapes in PNames, I
> investigated. It turns out this is from SPARQL, and the escapes are to
> avoid clashing with Property Paths (hereafter just "paths").
>
> This seems like a problem to me: the Turtle specification now has a
> strange and unpleasant grammar rule from a different specification, to
> mesh with a concept that is meaningless in the context of a Turtle
> document. I do agree, though, that copy/paste compatibility between
> statements in both languages is highly desirable.
>
> My main point is about the method: I think escaping is a very poor way
> of achieving this, and quotation is more appropriate. Either Paths, or
> PNames, should be quoted, or have a special leading character, to remove
> this ambiguity.
>
> Some cons of the current escaping scheme:
>
> * Escaping is ugly, and difficult to work with. Paths that include
> pnames with special characters are difficult to read.
>
> * Copying from other data sources that use these characters is
> difficult, so much so that expecting a user to manually do this (i.e.
> escape every character in the above list) is not realistic, and
> error-prone.
>
> * This effectively prevents future revisions of SPARQL from adding
> anything to the path syntax. If both of these specs become
> recommendations, then Turtle (and the corresponding rules in SPARQL
> itself) will have baked-in escapes specifically to work around path
> syntax. Nothing new can be added, because it would break the rules for
> PNames, in both SPARQL and Turtle.
>
> * The very existence of escaping implies there is a need to express
> these characters in PNames. However, this has been made tedious and
> ugly to accommodate paths. In my opinion, this is somewhat backwards.
> Both languages should have a clean PName syntax. Paths are a different
> thing, and should be clearly designated as such. Put another way,
> property paths are not pnames, and crippling the pname syntax for paths
> is a poor design when there are very simple alternative ways of
> differentiating the two.
>
> Some pros of quoting, rather than escaping:
>
> * Much easier to read. Even in a purely SPARQL context, ignoring
> Turtle, having a path be very clearly delineated is much simpler to read
> than navigating a mess of escapes and trying to mentally parse what is
> going on.
>
> * Turtle is not 'infected' by this SPARQL specific grammar
> consideration, and both can use a simpler, more expressive, and more
> friendly PName grammar. SPARQL is not 'locked in' forevermore and is
> free to update the path syntax in the future.
>
> * Copy/paste compatibility with other data sources is much simpler,
> since quoting is easy, unlike escaping. It is also less error prone,
> since only the quote character needs special consideration.
>
> * The grammars become cleaner, since Path rules and PName rules are
> clearly distinct (though the former would refer to the latter). The
> PName rules do not need to take into consideration every character used
> in the Path syntax, which is crucial since the PName rules must be in
> Turtle as well.
> The current PName rule is a symptom that different
> types of tokens have not been properly distinguished.
>
> * The PName rules would be far more (possibly entirely) compatible with
> CURIEs, rather than extremely SPARQL specific.
>
> I am not sure exactly what to suggest in terms of syntax. It seems most
> in line with existing practice to not quote 'top-level' PNames, but
> rather quote paths somehow. This resolves the Turtle problems, but does
> not resolve issues with PNames inside paths. Here, it seems quoting is
> best. One proposal: paths always have a leading '/', and PNames within
> paths are quoted with '[' and ']' (as in the CURIE spec). Thus, the
> example:
>
>     ?x foaf:knows/foaf:name ?name .
>
> Would become:
>
>     ?x /[foaf:knows]/[foaf:name] ?name .
>
> The quoting means the PNames are free to contain extended characters,
> e.g. rather than the unwieldy:
>
>     ?x eg:foo\/bar\/baz/eg:terms\/a\+b ?b .
>
> You would have:
>
>     ?x /[eg:foo/bar/baz]/[eg:terms/a+b] ?b .
>
> Importantly, no quoting of PNames in any other context is necessary, and
> no escaping of PNames is necessary at all, which is a significant win
> for "copy-paste compatibility" (quoting could also be optional in
> paths).
>
> The prefix character is analogous to the '?' used for variables. This
> works well, and is very simple, since a token that starts with a '?' is
> clearly a variable, and there is no clashing. Paths (indeed, any new
> kind of token) should be similarly simple to distinguish. A token that
> starts with a '?' is a variable. A token that starts with a '/' is a
> property path. Simple, consistent, extensible.
>
> Note these are just off-the-cuff examples, I have not thought much about
> the best syntax. Leading slash for paths and [] quoting as above may
> not be the best choices for whatever reason; I am more interested in
> highlighting the problem first.
> If quoting in paths is not popular, I
> wouldn't mind escaping *only in paths* - at least that doesn't wreck
> Turtle.
>
> In my opinion, this is a very serious issue. I have a strong aversion
> to implementing these PName escapes in Turtle, and consider it an
> outright error. Again, apologies for being late, but a more palatable
> resolution to this problem would be a significant improvement, and
> prevent future problems.
>
> Thanks,
>
> -dr
>
> [1] http://drobilla.net/software/serd/
>
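
To make the comparison in the quoted message concrete, here is a minimal Python sketch that renders the same path in both styles: the escaping style of the draft PN_LOCAL_ESC rule and the off-the-cuff leading-slash, bracket-quoted style proposed above. The function names are hypothetical and not part of any specification or library. The sketch assumes every character in the quoted escape list is escaped (which over-escapes '_', '-' and '.'), and it makes no attempt to handle a ']' inside a quoted name.

```python
# Characters listed in the draft PN_LOCAL_ESC rule quoted in the message.
PN_LOCAL_ESC_CHARS = set("_~.-!$&'()*+,;=:/?#@%")


def escape_local(local: str) -> str:
    """Backslash-escape every listed character in a PName local part.

    Assumption: all listed characters are escaped, even those that the
    grammar may also allow unescaped.
    """
    return "".join("\\" + ch if ch in PN_LOCAL_ESC_CHARS else ch for ch in local)


def escaped_path(steps) -> str:
    """Render a sequence path in the escaping style: p1/p2/..."""
    return "/".join(f"{prefix}:{escape_local(local)}" for prefix, local in steps)


def quoted_path(steps) -> str:
    """Render the same path in the proposed style: /[p1]/[p2]...

    Hypothetical syntax taken from the proposal; a ']' inside a local
    part is not handled here.
    """
    return "".join(f"/[{prefix}:{local}]" for prefix, local in steps)


if __name__ == "__main__":
    steps = [("eg", "foo/bar/baz"), ("eg", "terms/a+b")]
    print(escaped_path(steps))
    print(quoted_path(steps))
```

With the steps shown, the two functions reproduce the two forms of the eg: example given in the message. The quoted form leaves the local parts untouched, which is the copy/paste property the proposal is after.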
Received on Tuesday, 21 February 2012 19:03:17 UTC