Re: Minor Syntax issues from Eric Prud'hommeaux on 2005-02-14 (public-rdf-dawg@w3.org from January to March 2005)

From: Eric Prud'hommeaux <eric@w3.org>
Date: Mon, 14 Feb 2005 03:58:33 -0500
To: 'RDF Data Access Working Group' <public-rdf-dawg@w3.org>
Message-ID: <20050214085831.GG22068@w3.org>
On Sun, Feb 13, 2005 at 10:56:25AM +0000, Steve Harris wrote:
> 
> On Fri, Feb 11, 2005 at 09:02:01 +0000, Andy Seaborne wrote:
> > 
> > This is about tuning the current syntax, post-WD2 publication, not 
> > redesigning the whole thing.
> > 
> > 1/ Bound
> > 
> > This is special because it tests the variable, not the value.  It's the only
> > case where this happens.
> > 
> > The suggestion (PatH) was to make this different.  In other programming 
> > languages, there is just a plain function like many other library 
> > functions. It returns a value (a boolean) like any other function.
> > 
> > Options:
> > 1a/ BOUND(?x)   -- as the current grammar
> > 1b/ BOUND[?x]   --  different grouping
> > 
> > Anything with a colon in it will look like a qname.
> > 
> > BOUND ?x is dangerous as it does not express the tight binding nature of
> > this operator: "BOUND ?x && ?y" is strange.
> 
> There is a precedent: that is what Perl does. Not that I'm holding up Perl
> as a model of clarity :)
>  
> > I prefer "BOUND(?x)" -- leave as is.
> 
> I have a mild preference for BOUND ?x, but BOUND(?x) is fine too.
> 
> > BOUND[] as a one-off is overdoing it.
> > 
> > 2/ AND
> > 
> > AND is a special keyword that starts constraints (SUCH THAT would be better
> > but it's two words).

As English words go, WHERE is my personal favorite:
  { (?part g:id ?id)
    (?part g:od ?od) WHERE ?id < ?od
    (?part p:name "nut") }
It's nice for mathematicians.

Using the []s syntax is more pleasing to my eyes:
  { (?part g:id ?id)
    (?part g:od ?od) [?id < ?od]
    (?part p:name "nut") }


> >                      Currently in the grammar it is required because ?x-?y 
> > is unclear : can be "?x binary minus ?y" or two expressions "?x" then 
> > "unary minus ?y"
> > 
> > Proposal: use [] to mark constraints (see below).
> 
> I don't like this very much. [] has lots of uses in syntax and programming
> languages, but none of them relate to constraints in my experience.

I think the important one is XPath.
  <xsl:for-each select="/root/element[@foo='bar']"/>
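
To make the analogy concrete, here is a minimal Python sketch (my own
illustration, not part of any proposal) using the limited XPath subset in the
standard library's ElementTree. The sample document and its id attributes are
made up for the example.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample document; only the first <element> has foo='bar'.
doc = ET.fromstring(
    "<root>"
    "<element foo='bar' id='1'/>"
    "<element foo='baz' id='2'/>"
    "</root>"
)

# The [...] predicate filters the nodes selected by the path, much as a
# [] constraint would filter the solutions of a graph pattern.
matches = doc.findall("element[@foo='bar']")
matched_ids = [e.get("id") for e in matches]
print(matched_ids)
```

The path selects candidates and the bracketed part narrows them, which is the
reading I have in mind for [?id < ?od].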

> > 3/ OPTIONALS
> > 
> > There are two syntactic forms "OPTIONAL" and "[]"
> > 
> > Proposal: just the OPTIONAL form, freeing up [] for constraints.
> 
> I have no problem with using the OPTIONAL keyword instead of [].

OPTIONAL++

> > 4/ Functions , casting and specials.
> >     &ex:foo() , xsd:byte(23) , isBlank(?x)
> > 
> > These have different aspects:
> > 
> > Functions act on values.  Currently, they are only filters (boolean valued).

This is essential (and needs some explanation to that effect) for
isBlank as every URI solution implies a bNode solution that isBlank
will match. This is probably not what you had in mind when you asked
the question.

> > The specials (isURI and friends) act on graph elements, not the individuals
> > represented by those elements.  They could be functions if we define their 
> > values - that would require a set of functions that all implementations had 
> > to have.  At the moment, the function mechanism is an extension point and 
> > an implementation can choose not to provide it at all.
> > 
> > Casting is like a function but it returns a value in a constrained way (no
> > assignment to variables, fixed set of casts) and the return is typed.
> 
> We don't have assignment, so the distinction is moot.

But we do have functions calling functions. This is essential for
operators that are not distinguished by the syntax. For instance:

    flt:UA9660 from "TYO"
    flt:UA9660 to "SFO"
    flt:UA9660 arrivalTime "20050215T18:30Z"
    flt:UA174  from "SFO"
    flt:UA174  departureTime "20050215T19:30Z"
    flt:UA174  to "BOS"

    (?flight1 from "TYO")
    (?flight1 to ?layover)
    (?flight1 arrivalTime ?arv)
    (?flight2 from ?layover)
    (?flight2 departureTime ?dpt) [ xs:dateTime(?dpt) > xs:dateTime(?arv) ]
    (?flight2 to "BOS")

In the above data, ?dpt and ?arv are untyped literals (very common in
RDF in the wild).  Without the casts, we have no way to invoke
op:dateTime-greater-than . The best we could do is use the ">" to hint at
op:numeric-greater-than, which will fail as those strings aren't numbers.

I admit this example is slightly contrived, but I can't see any way to
ever invoke date comparison unless the literals are typed.
    "20050215T18:30Z"^^xs:dateTime

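As a rough illustration of why the cast matters, here is a Python sketch. It
is an analogy only: as_datetime is a hypothetical stand-in for the
xs:dateTime(...) cast, and the format string is my assumption about the layout
of the plain literals in the example data.

```python
from datetime import datetime, timezone

# Assumed layout of the untyped literals in the example data.
FMT = "%Y%m%dT%H:%MZ"

def as_datetime(literal: str) -> datetime:
    """Hypothetical stand-in for the xs:dateTime(...) cast."""
    return datetime.strptime(literal, FMT).replace(tzinfo=timezone.utc)

arv = "20050215T18:30Z"  # UA9660 arrivalTime
dpt = "20050215T19:30Z"  # UA174 departureTime

# A numeric comparison is not available: the strings are not numbers.
try:
    float(dpt) > float(arv)
    numeric_ok = True
except ValueError:
    numeric_ok = False

# After the cast, the date comparison is well defined.
connection_ok = as_datetime(dpt) > as_datetime(arv)
print(numeric_ok, connection_ok)
```

The comparison operator is chosen by the types of its operands, so the cast is
what selects the date comparison in the first place.
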
> > A concern I have with general functions (ones that return any value,
> > especially if they can assign to variables) is that we are getting into a
> > second computational system that needs a lot of thinking about, not in 
> > technical terms but in scope and appropriateness terms.
> > 
> > The obvious simplification is to use the same syntax of functions and casts,
> > make functions value-returning then provide a standard set of functions for
> > the casts.

That's consistent with XSLT and XQuery.

> >             The specials like isURI can be functions.
> 
> I like that. Fewer syntactic forms is good.
>  
> > [Not sure about typing of function returns which might be lost in such a
> > scheme.  Does it hurt optimization?]
> 
> I don't think so. Maybe makes it a bit harder, but it's already seriously
> hard.

I guess every dynamic constraint will be expressed in SQL as a set of
value pattern constraints. Wow, I hadn't even gone there.

> > 5/ LOAD => WITH
> > 
> > The word "LOAD" suggests, to some people, a permanent change to the 
> > database which is a wrong implication.  DaveB suggested changing the word 
> > to "WITH". I have done this change (rq23 and the tests).
> 
> OK by me.
>  
> > 6/ Clause order
> > 
> > The current order is:
> > 
> > BASE
> > PREFIX
> > SELECT
> > WITH
> > FROM
> > WHERE
> > LIMIT
> > 
> > which is a mixed style.  It would make sense to have WITH and FROM before 
> > SELECT (declarations first) and have LIMIT before WHERE (modifier to 
> > SELECT).  It has confused some RDQL users that FROM comes after SELECT.
> 
> FROM before SELECT seems fine; it's the other way round in SQL, but the SQL
> FROM is very different. OTOH I prefer LIMIT at the end, as its parallel
> with SQL is direct.
> 
> Incidentally, I don't think of LIMIT as modifying SELECT, I think of it as
> modifying the result set.

I think the same is true in SQL. LIMIT and GROUPing aren't in
relational calculus. I bet SQL defines a solution as the result of
LIMITing/GROUPing/COUNTing performed after the relational part is
done. Anyone know where I can get a copy of, say, the SQL 92 spec?
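
Until I find the spec, a quick experiment gives some evidence. This is a
minimal sketch against SQLite through Python's sqlite3 module; it shows what
one implementation does, not what SQL 92 says. The part table and its rows are
made up for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE part (name TEXT, id_mm REAL, od_mm REAL)")
conn.executemany(
    "INSERT INTO part VALUES (?, ?, ?)",
    [
        ("nut", 5.0, 8.0),
        ("washer", 5.0, 10.0),
        ("bad", 9.0, 4.0),   # fails the id < od constraint
        ("bolt", 0.0, 5.0),
    ],
)

# "bad" sorts first by name, so if LIMIT were applied before the WHERE
# constraint it would use up one of the two slots.
rows = conn.execute(
    "SELECT name FROM part WHERE id_mm < od_mm ORDER BY name LIMIT 2"
).fetchall()
names = [r[0] for r in rows]
print(names)
conn.close()
```

Here LIMIT truncates the already filtered and ordered result, which matches
the "modifies the result set" reading.
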
-- 
-eric

office: +81.466.49.1170 W3C, Keio Research Institute at SFC,
                        Shonan Fujisawa Campus, Keio University,
                        5322 Endo, Fujisawa, Kanagawa 252-8520
                        JAPAN
        +1.617.258.5741 NE43-344, MIT, Cambridge, MA 02144 USA
cell:   +81.90.6533.3882

(eric@w3.org)
Feel free to forward this message to any list for any purpose other than
email address distribution.
Received on Monday, 14 February 2005 09:03:22 UTC