Re: PrEfIx and BaSe should be removed from Turtle from Sandro Hawke on 2013-05-22 (public-rdf-comments@w3.org from May 2013)

From: Sandro Hawke <sandro@w3.org>
Date: Wed, 22 May 2013 18:31:20 -0400
To: Peter Ansell <ansell.peter@gmail.com>
CC: Gavin Carothers <gavin@carothers.name>, RDF-WG WG <public-rdf-wg@w3.org>, "public-rdf-comments@w3.org" <public-rdf-comments@w3.org>, David Booth <david@dbooth.org>, David Robillard <d@drobilla.net>, Gregory Williams <greg@evilfunhouse.com>
Message-ID: <519D4738.6020807@w3.org>

On 05/22/2013 05:03 PM, Peter Ansell wrote:
>
>
> On 23 May 2013 02:43, Sandro Hawke <sandro@w3.org> wrote:
>
>     On 05/15/2013 10:48 PM, Gavin Carothers wrote:
>>
>>
>>       PrEfIx and BaSe should be removed from Turtle
>>       <https://gist.github.com/gcarothers/5589022#prefix-and-base-should-be-removed-from-turtle>
>>
>>     PREFIX and BASE were added as features at risk to Turtle before
>>     CR. Designed to make it easier to copy and paste between Turtle
>>     and SPARQL they have proven controversial. As they are a
>>     controversial addition to a settled language they should be removed.
>>
>
>     I don't agree.     Removing them would do a disservice to the
>     future users of Turtle.
>
>     To be clear: it's a "coin-toss" non-technical decision whether to
>     require the @-sign.   It doesn't have a significant impact on
>     implementation difficulty; has no impact on expressive power,
>     etc.   It's just a question of style and tradition.
>
>     As a question of style, I think the @ is ugly and annoying.  "@"
>     means a lot of different things, usually somewhat related to the
>     notion of "at".  sandro "at" w3.org.  comment reply directed
>     "at" sandro.  In Turtle it already has another meaning, for
>     language tags: "chat" "at" fr.  Okay, that's not a good fit for
>     "at" -- but it's still another meaning for "@".  How are people
>     supposed to read "@prefix"?  Like it's a chat message directed at
>     the prefix processor?
>
>     Historically, the @ was there because Tim didn't want to claim the
>     barewords in the language for keywords.  He wanted to allow for
>     prefix-less terms, I believe.  But the "a" keyword already
>     violates that, and Turtle isn't going to allow prefix-less terms.
>     (I pushed for that at one point and the WG had zero interest.)
>     So that reason has no weight -- there's no longer a technical
>     reason to have @.
>
>     Now we come to tradition.  Here we have dueling traditions of
>     N3/Turtle with the @-sign and SPARQL (and kind of every other
>     language) without the @-sign.   On the one hand, we have lots of
>     people using N3 and Turtle.  On the other hand, we have lots of
>     people using SPARQL.   SPARQL is a W3C Recommendation while N3 and
>     Turtle are not.   SPARQL is perhaps used more internally, while N3
>     and Turtle are published in the open.
>
>     Some people say, "but they are different" -- SPARQL is a database
>     interface language, Turtle is a data format, so they can and
>     should be arbitrarily different -- but I don't find this
>     compelling at all.  I think it makes perfect sense to say Turtle
>     and SPARQL (and TriG) are composed of the same pieces, put
>     together in the obvious ways for the obvious purposes.  To say
>     that two of the pieces (the directives with and without @-sign)
>     serve the exact same purpose but are not interchangeable seems to
>     me to be an obvious design flaw.
>
>     We can't just remove the @-sign, because there's lots of Turtle
>     using it already, but I see no reason not to allow a slow
>     migration to begin.
>
>     Bottom line: *I think it's a burden on users to have to keep track
>     of arbitrary and meaningless syntactic differences between closely
>     related languages.*  There's a burden to changing, but I believe
>     Turtle will have far more users in the future than it has today.
>
>     As I said on the call, this is not a do-or-die thing, of course.  
>     I don't think we need to spend a lot of time arguing about it.  
>     Folks did ask me to state my reasons, though, so I've done that.  
>     If we require @-signs in Turtle, I believe it will be a
>     mistake.    (Not the first, not the last, ...  :-)
>
>
> In this case the delayed standardisation process doesn't really allow 
> for you to say "If we require", as practically the W3C would be 
> expected to standardise something that fits with existing deployments 
> unless there is a technical hurdle within the format. If the W3C were 
> designing the syntax from scratch within a standardisation track then 
> you would be perfectly able to say that existing deployments were just 
> experimentally using an early draft of the format and they could 
> change their limited, experimental, deployments over to the new 
> syntax. That clearly isn't the case here where there are a large 
> number of static Turtle files and RDF-to-Turtle serialisers that have 
> always used it. As I said to Eric, and as Gavin quoted below, you
> can't use the "future users are more significant than current users"
> argument to remove support for a feature that isn't actually hindering
> anything technically.
>
> If Turtle is not going to remove support for the syntax, then it would 
> be much more confusing for users to have two fairly different syntaxes 
> (given the difference with the dot at the end) than to have one, as 
> machine generated Turtle may still use @prefix and @base for the 
> foreseeable future to maintain backwards compatibility.

We have two possible stories:

1.  In SPARQL you use prefix-no-period and in Turtle you use 
@prefix-with-period.  And that's how it will always be.

-or-

2.  Turtle is transitioning to match SPARQL with prefix-no-period. 
During the transition, you may still want to generate 
@prefix-with-period to support older systems.   In the future, you may 
still encounter old documents with @prefix-with-period; systems MUST 
still accept them.

I, personally, am much more comfortable with the second story.  It 
acknowledges a sort-of mistake and does what we can to repair the 
situation.   The first story seems to me to be denying the mistake, to 
the detriment of the vast majority of users, long term.

But this is more of a personal view than a technical one, so I don't 
expect to change a lot of minds.    The objective thing we could do is 
try to test users to see which is easier for them, but that's a lot of 
work, and it's hard to know which users are the right ones to test.
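
To make the second story concrete, here is a minimal Python sketch of a reader that accepts both directive styles during a transition. It is only an illustration under stated assumptions: `read_prefix`, `AT_PREFIX` and `BARE_PREFIX` are hypothetical names, not part of any Turtle library; the patterns handle one-line directives only; and they are much looser than the real Turtle grammar for prefix names and IRIs.

```python
import re

# Assumption: simplified one-line patterns, not the real Turtle grammar.
# Old Turtle style: "@prefix ex: <http://example.org/> ." (case-sensitive, trailing period).
AT_PREFIX = re.compile(
    r"^\s*@prefix\s+(?P<name>[^\s:]*):\s*<(?P<iri>[^<>\s]*)>\s*\.\s*$"
)
# SPARQL style: "PREFIX ex: <http://example.org/>" (case-insensitive, no period).
BARE_PREFIX = re.compile(
    r"^\s*prefix\s+(?P<name>[^\s:]*):\s*<(?P<iri>[^<>\s]*)>\s*$",
    re.IGNORECASE,
)


def read_prefix(line):
    """Return (prefix_name, iri) if the line is a prefix directive in either style, else None."""
    match = AT_PREFIX.match(line) or BARE_PREFIX.match(line)
    if match is None:
        return None
    return match.group("name"), match.group("iri")
```

In this sketch a mixed form such as `PREFIX ex: <http://example.org/> .` is rejected. That is one possible reading of the trailing-period question raised in this thread, not a settled answer.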

>     (Closely related, I think TriG (beyond the directives) needs to be
>     a subset of the SPARQL quad pattern syntax, allowing the GRAPH
>     keyword and not requiring braces around the default graph.  I
>     mention it here because the arguments are very similar.)
>
>
> Does TriG currently allow @base redefinitions or @prefix definitions
> inside of graphs? If so, we will be in exactly the same position there,
> where there is no copy and paste support anyway, in case someone has
> relied on the ability to redefine the base within their triples for
> some valid purpose.
>
> In the same way people are not allowed to copy and paste Turtle 
> documents into SPARQL if they rely on that ability, or if they are not 
> completely sure that their copy and pasted Turtle file does not rely 
> on that ability or have any redefined prefixes internally. Hence, in 
> both cases, not switching to use the SPARQL style when it is not copy 
> and pastable in the general case doesn't have a technical downside, as 
> the ability was never really there.
>

Copy-paste isn't an argument I'm making.   To me, this is about people 
trying to remember the syntaxes of two languages (and their arbitrary, 
meaningless differences).

For copy-paste to/from SPARQL, one is probably going to have to paste the 
directives in separately, yes.
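
For the "paste the directives separately" case, the rewrite a tool might do can be sketched roughly as below. This assumes one-line directives; `to_sparql_prologue` and `AT_DIRECTIVE` are hypothetical names, and the pattern is far looser than the real grammar. It does not make a document that redefines @base or @prefix part-way through safe to paste, which is the case Peter raises.

```python
import re

# Assumption: matches only one-line "@prefix ex: <iri> ." or "@base <iri> ." directives.
AT_DIRECTIVE = re.compile(
    r"^\s*@(?:(prefix)\s+([^\s:]*:)|(base))\s*(<[^<>\s]*>)\s*\.\s*$"
)


def to_sparql_prologue(turtle_text):
    """Rewrite one-line @prefix/@base directives as SPARQL-style lines; keep other lines unchanged."""
    out = []
    for line in turtle_text.splitlines():
        match = AT_DIRECTIVE.match(line)
        if match is None:
            out.append(line)  # not a directive: leave the line as it is
        elif match.group(1):
            out.append(f"PREFIX {match.group(2)} {match.group(4)}")
        else:
            out.append(f"BASE {match.group(4)}")
    return "\n".join(out)
```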

      -- Sandro



>          -- Sandro
>
>
>>         Rationale
>>         <https://gist.github.com/gcarothers/5589022#rationale>
>>
>>     The FPWD for the Turtle document published by this working group
>>     included the following:
>>
>>         Turtle is already a reasonably settled serialization of RDF.
>>         Many implementations of Turtle already exist, we are hoping
>>         for feedback from those existing implementers and other
>>         people deciding that now would be a good time to support
>>         Turtle. There are still a few rough edges that need
>>         polishing, and better alignment with the SPARQL triple
>>         patterns. The working group does not expect to make any large
>>         changes to the existing syntax.
>>
>>     The laudable goal of reducing copy and paste errors in both
>>     Turtle and SPARQL is not strong enough to change a settled
>>     serialization.
>>
>>     None of the arguments in support of the feature deal with what
>>     serialization should be preferred. Introducing the feature also
>>     creates issues with the existing @prefix and @base syntax and
>>     whether they should or should not require a trailing period.
>>     Again, there is no clear guidance for serializers. This feature
>>     also introduces the first case-insensitive keyword in Turtle; all
>>     other Turtle keywords are case sensitive.
>>
>>     Lastly, the goal of full copy and paste between SPARQL and Turtle
>>     will not be achieved only by adding this feature. A range of
>>     other issues exist preventing this.
>>
>>
>>         Actions
>>         <https://gist.github.com/gcarothers/5589022#actions>
>>
>>       * Remove "Feature at Risk" box
>>       * Remove grammar rules 5s, 6s
>>       * Update grammar rule 3 to read |prefixID | base|
>>       * Change grammar note 1 to "All keywords are case sensitive"
>>
>>     The grammar at that point requires no magical understanding of
>>     case rules as all rules express the case of their keywords.
>>
>>
>>         Ways forward for those that like copying and pasting PREFIX/BASE
>>         <https://gist.github.com/gcarothers/5589022#ways-forward-for-those-that-like-copying-and-pasting-prefixbase>
>>
>>     I'd direct your attention to the conformance section of the
>>     Turtle document. http://www.w3.org/TR/turtle/#conformance "This
>>     specification does not define how Turtle parsers handle
>>     non-conforming input documents." That note is not there by mistake.
>>
>>
>>         Selected Messages and Threads (From outside of WG)
>>         <https://gist.github.com/gcarothers/5589022#selected-messages-and-threads-from-outside-of-wg>
>>
>>     Yes, I read all the threads again, just a few are included here.
>>
>>
>>           Supporting
>>           <https://gist.github.com/gcarothers/5589022#supporting>
>>
>>     David Booth <david@dbooth.org>
>>     http://www.w3.org/mid/516EAFDF.50708@dbooth.org
>>
>>     Copy and paste between Turtle and SPARQL is /very/ common,
>>     particularly in debugging. Having to change the prefix syntax
>>     back and forth is a significant and pointless waste of time.
>>     Please find a path to a single compatible syntax.
>>
>>
>>           In opposition
>>           <https://gist.github.com/gcarothers/5589022#in-opposition>
>>
>>     Gregory Williams <greg@evilfunhouse.com> 2013-03-01:
>>     http://www.w3.org/mid/C52BE515-076D-4D10-82D0-27FD757F2F48@EVILFUNHOUSE.COM
>>
>>     I'd like to take this opportunity to provide feedback on the
>>     inclusion of SPARQL BASE and PREFIX syntax in the new Turtle
>>     grammar. I think this is a mistake, adding complexity for both
>>     users and implementors. I'm sympathetic to the desire to align
>>     syntax for triples between Turtle and SPARQL, but don't believe
>>     the alignment is necessary or recommended for the top-level
>>     language syntax (as the need for backwards compatibility with
>>     pre-REC Turtle means that alignment requires two different
>>     syntaxes for the same declarations).
>>
>>     David Robillard <d@drobilla.net> 2012-10-08:
>>     http://lists.w3.org/Archives/Public/public-rdf-comments/2012Aug/0028.html
>>
>>     For what it's worth, as a Turtle implementer I am opposed to this
>>     change. It is ugly and inconsistent with the language, clearly an
>>     import that does not belong.
>>
>>     While it is unfortunate that SPARQL did not use the @directive
>>     convention from N3, Turtle does not contain SELECT and such
>>     either. While triple copying and pasting between the languages is
>>     desirable (even if it has polluted Turtle with path grammar and
>>     horrific escaping rules), the directives between the two
>>     languages are not compatible.
>>
>>     I do not consider messing up the Turtle specification in this way
>>     appropriate. If implementations want to support this as an
>>     extension, they may. I won't.
>>
>>     Peter Ansell <ansell.peter@gmail.com> 2013-04-28:
>>     http://www.w3.org/mid/CAGYFOCQKfygwogHQj_b7=nW1CrxM4aq5XUdgJg4nx4uaUSKZFw@mail.gmail.com
>>
>>     I am sorry, but I completely disagree that changing a fundamental
>>     part of an established syntax as part of its long-delayed
>>     "standardisation" process can be rationalised by saying that
>>     hypothetical future users will appreciate it and it doesn't
>>     matter what a community of current users think.
>>
>
>
Received on Wednesday, 22 May 2013 22:31:30 UTC