- From: Jeremy Carroll <jeremy@topquadrant.com>
- Date: Mon, 02 Nov 2009 23:30:23 -0800
- To: public-rdf-dawg-comments@w3.org
- CC: Holger Knublauch <holger@topquadrant.com>, Jeremy Carroll <jcarroll@topquadrant.com>
Thank you for your time and attention at the WG meeting today.
TopQuadrant would like Holger's earlier comment [1] to be treated as a
formal comment. (i.e. with an official WG response on this mailing list).
My understanding from today's meeting is that the response is likely to
be that the WG has already considered the LET design and believes the AS
design to be adequate.
(LET is merely an abbreviated form for certain AS constructs). I also do
not believe that TopQuadrant is bringing any new information that was
not considered at your f2f meeting [2].
We however feel strongly about this, and are likely to raise a formal
objection (in the sense that we believe it would be better for the WG to
take a few weeks longer over SPARQL 1.1 and get this right, than to
deliver SPARQL 1.1 on schedule without this feature).
Thinking through particularly Steve's comments, I tried to come up with
an example illustrating how the ordering of operations that is sometimes
required is better articulated with LET than with AS.
This example is not as polished as I would like, since I believe it is
more helpful to contribute during your F2F meeting.
First I wish to clarify that this is not about whether or not assignment
should be in SPARQL 1.1. Assignment is in already, with the AS construct
that was discussed under item 39. This issue is purely about the syntax
and scoping rules for the single assignment capability.
Many of the sorts of processing tasks that we and our customers have
involve mapping several legacy sources together, merging them into one
RDF graph, and then doing some processing.
A frequent problem is that different legacy sources represent the same
data in different ways, e.g. with different case conventions, in
different units, or whatever. In these cases, data laundry of one sort
or another is necessary. One option for laundry is using functions and
assignment within SPARQL.
So for my example, I am taking information about alumni at a college and
trying to find the appropriate year photo for them.
I will simplify the name problem to a name consisting of a first name and
a last name (no middle initial), but people change their last name from
time to time.
The data sources that I have include:
- a current mailing database, with full-names, e-mail addresses, and
addresses
a:fullName a:email a:address
_:w a:fullName "John Smith" .
_:w a:email <mailto:john.smith@example.org>.
- a database with students' first names, last names, and former last names;
to simplify processing I just use two properties
b:firstName
b:lastName
for example:
_:x b:firstName "John" .
_:x b:lastName "Doe" .
_:x b:lastName "Smith".
shows that the person known as John Doe and the person known as John
Smith are one and the same, without clarifying the chronology of the
name change.
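As a rough illustration of what the two-property encoding implies, here is a minimal Python sketch that enumerates every full name a record could have gone by. It is not part of any of the data sources: the function name candidate_full_names and the representation of triples as plain tuples are assumptions made only for this illustration.

```python
from collections import defaultdict

def candidate_full_names(triples):
    """Map each subject to every "first last" combination it may have used."""
    first = defaultdict(set)
    last = defaultdict(set)
    for subject, predicate, value in triples:
        if predicate == "b:firstName":
            first[subject].add(value)
        elif predicate == "b:lastName":
            last[subject].add(value)
    # The chronology of name changes is unknown, so every pairing is kept.
    return {
        s: sorted(f + " " + l for f in first[s] for l in last[s])
        for s in first
    }

# The sample record from source b, written as (subject, predicate, object).
triples = [
    ("_:x", "b:firstName", "John"),
    ("_:x", "b:lastName", "Doe"),
    ("_:x", "b:lastName", "Smith"),
]
names = candidate_full_names(triples)
print(names)
```

Both "John Doe" and "John Smith" come out for the same record, which is what lets the mailing database (current name) be joined to the matriculation database (name at matriculation).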
- a database with date of matriculation, and years of study, by full
name at time of matriculation
c:matriculationDate c:studyYears c:fullName
_:y c:fullName "John Doe" .
_:y c:studyYears "P1Y"^^xsd:yearMonthDuration .
_:y c:matriculationDate "1988-09-01"^^xsd:date.
- and a list of graduation photo names by year.
d:year d:fileName
_:z d:year "1988"^^xsd:date .
_:z d:fileName "classOf88" .
- I have arranged these photos as jpg files on the web at
http://www.example.org/photos
http://www.example.org/photos/classOf88.jpg
SELECT ?eMail ?image
WHERE
{ ?a a:email ?eMail .
?a a:fullName ?fullName .
LET ( ?fullNameSpaceNormalized=normalize-space(?fullName) ) [A]
LET ( ?firstName=substring-before(?fullNameSpaceNormalized," ") [B]
?lastName=substring-after(?fullNameSpaceNormalized," ") )
?b b:firstName ?firstName .
?b b:lastName ?lastName .
?b b:lastName ?altLastName . [C]
LET ( ?altName=concat(?firstName, " ", ?altLastName ) )
?c c:fullName ?altName .
?c c:studyYears ?lengthOfCourse .
?c c:matriculationDate ?matriculate .
LET ( ?endDate=year-from-date(add-yearMonthDuration-to-date(?matriculate,?lengthOfCourse)) )
?d d:year ?endDate .
?d d:fileName ?imageFile .
LET ( ?image = xs:anyURI(concat("http://www.example.org/photos/", ?imageFile, ".jpg" ) ) )
}
Notes:
[A] for robustness against leading/trailing space and/or double space in
the middle
[B] cannot be combined with [A] because of rules discussed under issue 39
[C] ?altLastName can be the same as ?lastName
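To make the order of the assignments easier to follow, the chain of values computed by the query can be traced outside SPARQL. The following is only a sketch in Python, under stated assumptions: the helper names are mine, normalize_space only approximates the XPath function (Python's split() treats a few more characters as whitespace), and end_year assumes the study duration is a whole number of years rather than a general yearMonthDuration.

```python
from datetime import date

def normalize_space(s):
    # Approximates XPath normalize-space: trim, then collapse runs of whitespace.
    return " ".join(s.split())

def substring_before(s, sep):
    # Like XPath substring-before: empty string when sep does not occur.
    head, found, _ = s.partition(sep)
    return head if found else ""

def substring_after(s, sep):
    # Like XPath substring-after: empty string when sep does not occur.
    _, found, tail = s.partition(sep)
    return tail if found else ""

def end_year(matriculation, study_years):
    # Assumption: whole years only, so adding the duration just shifts the year.
    return date.fromisoformat(matriculation).year + study_years

def image_uri(file_name, base="http://www.example.org/photos/"):
    return base + file_name + ".jpg"

# Each value depends on the one before it, mirroring the LET steps in order.
full = normalize_space("  John   Smith ")    # step [A]
first = substring_before(full, " ")          # step [B]
last = substring_after(full, " ")
alt_name = first + " " + "Doe"               # "Doe" plays the role of ?altLastName
print(full, first, last, alt_name)
```

Each line can only be evaluated once the one above it has a value, and that top-to-bottom dependency is what the LET form writes down directly.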
I believe the WG is considering recommending that this query should be
written as follows.
SELECT ?eMail,
       xs:anyURI(concat("http://www.example.org/photos/", ?imageFile, ".jpg" ) ) AS ?image
WHERE {
  SELECT ( *
           year-from-date(add-yearMonthDuration-to-date(?matriculate,?lengthOfCourse))
           AS ?endDate )
  WHERE {
    SELECT ( * concat(?firstName, " ", ?altLastName ) AS ?altName )
    WHERE {
      SELECT (* substring-before(?fullNameSpaceNormalized," ") AS ?firstName,
                substring-after(?fullNameSpaceNormalized," ") AS ?lastName )
      WHERE {
        SELECT (* normalize-space(?fullName) AS ?fullNameSpaceNormalized)
        WHERE {
          ?a a:email ?eMail .
          ?a a:fullName ?fullName .
        }
      }
      ?b b:firstName ?firstName .
      ?b b:lastName ?lastName .
      ?b b:lastName ?altLastName .
    }
    ?c c:fullName ?altName .
    ?c c:studyYears ?lengthOfCourse .
    ?c c:matriculationDate ?matriculate .
  }
  ?d d:year ?endDate .
  ?d d:fileName ?imageFile .
}
(Using the equivalence from [3])
We believe that this is inferior: it is harder to write, harder to read,
and harder to understand. We also believe that the cost of complicating
the language by having two ways to say the same thing is well worth it.
Jeremy Carroll
AC Rep, TopQuadrant.
[1]
http://lists.w3.org/Archives/Public/public-rdf-dawg-comments/2009Oct/0003
[2]
http://www.w3.org/2009/sparql/meeting/2009-05-06#ProjectExpressions___26___20_Assignment
[3]
http://www.w3.org/2009/sparql/wiki/Feature:Assignment#Equivalence_with_SubSelects_and_ProjectExpressions
Received on Tuesday, 3 November 2009 07:30:55 UTC