Re: Mulgara and sameTerm

From: James Leigh <james-nospam@leighnet.ca>
Date: Tue, 29 Jul 2008 11:49:44 -0400
To: public-sparql-dev@w3.org
Cc: "Seaborne, Andy" <andy.seaborne@hp.com>, Arjohn Kampman <arjohn@aduna-software.com>, Andrae Muys <andrae@netymon.com>, Paul Gearon <gearon@ieee.org>
Message-Id: <1217346584.7256.90.camel@jacob>

On Tue, 2008-07-29 at 10:44 -0500, Paul Gearon wrote:
> Because I was being asked to make this "work" with the SPARQL test
> suite, I presumed that duplication was required. I also presumed that
> most applications inserting a non-canonical form of data would stick
> to the same lexical form each time, which would minimize the issue for
> that application.
> 
> Of course, it is always possible to take the easy road and rely on
> RDF-equals. So instead of using:
>   ns:foo ns:bar ?x . ?x ns:baz ns:boo
> 
> You'd instead use:
>   ns:foo ns:bar ?x . ?y ns:baz ns:boo FILTER (?x = ?y)
> 
> However, this is never going to perform as well, and can potentially
> take up significantly more storage, so I'm not for it at all.
> 
If this breaks SPARQL compatibility, would you be against full SPARQL
compatibility in Mulgara?
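
To make the distinction concrete, here is a minimal Python sketch of the two comparisons under discussion: term equality as RDF Concepts 6.5.1 defines it, and value equality as `=` would apply it to two xsd:int literals. This is not Mulgara or Sesame code. The `Literal` class and the functions `same_term` and `value_equal` are hypothetical names, and falling back to term equality for other datatypes is a simplification made only for this sketch.

```python
from dataclasses import dataclass
from typing import Optional

XSD_INT = "http://www.w3.org/2001/XMLSchema#int"


@dataclass(frozen=True)
class Literal:
    """Hypothetical RDF literal: lexical form plus optional datatype/language."""
    lexical: str
    datatype: Optional[str] = None
    language: Optional[str] = None


def same_term(a: Literal, b: Literal) -> bool:
    """Term equality: lexical forms, language tags and datatype URIs all match."""
    return (a.lexical == b.lexical
            and a.language == b.language
            and a.datatype == b.datatype)


def value_equal(a: Literal, b: Literal) -> bool:
    """Value equality for xsd:int only (sketch simplification: anything else
    falls back to term equality)."""
    if a.datatype == XSD_INT and b.datatype == XSD_INT:
        # Python's int() stands in for xsd:int parsing here.
        return int(a.lexical) == int(b.lexical)
    return same_term(a, b)


one = Literal("1", XSD_INT)
plus_one = Literal("+1", XSD_INT)
print(same_term(one, plus_one))    # False
print(value_equal(one, plus_one))  # True
```

A store that keeps only the value can answer the second comparison but has lost the information needed for the first.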

> I'm OK to move this thread onto the SPARQL list.
> 
> Paul
> 
> On Tue, Jul 29, 2008 at 10:28 AM, Seaborne, Andy <andy.seaborne@hp.com> wrote:
> > Does anyone mind if this discussion happens on public-sparql-dev@w3.org?
> >
> >        Andy
> >
> >> -----Original Message-----
> >> From: James Leigh [mailto:james@leighnet.ca]
> >> Sent: 29 July 2008 13:21
> >> To: Arjohn Kampman; Seaborne@domain.invalid; Seaborne, Andy
> >> Cc: Paul Gearon; Andrae Muys
> >> Subject: Re: Mulgara and sameTerm
> >>
> >> Hi all,
> >>
> >> Including Andy to get his interpretation (read on down the page for more
> >> information).
> >>
> >> I spoke with Andrae (he is having email troubles). He thought this was a
> >> very serious problem and wanted to take this up with Andy Seaborne.
> >>
> >> His concerns were:
> >> The problem is that this would prevent us from ever storing nodes
> >> inline; forcing a string-pool lookup on *every* resolution.
> >> What should be the result of joining "1"^^xsd:int and "+1"^^xsd:int ?
> >> Will this mean that they will have different localnodes?
> >>
> >> Paul, what is your take on these concerns/questions?
> >>
> >> I think "1"^^xsd:int should be a different term than "+1"^^xsd:int and
> >> have different localnodes.
> >>
> >> Maybe we could introduce new internal types, instead of just integer, we
> >> could have integer and integer-with-plus-prefix and others to handle all
> >> possible numeric formats?
> >>
> >> James
> >>
> >> On Mon, 2008-07-28 at 13:09 -0400, James Leigh wrote:
> >> > Hi Arjohn, Paul and Andrae,
> >> >
> >> > Mulgara 2.0 was released last week. It includes fixes for some of the
> >> > bugs that were discovered through the Sesame SPARQL test-suite. However, there are
> >> > a few core issues that will prevent us from releasing a stable SPARQL
> >> > compliant RDF store using Mulgara.
> >> >
> >> > The biggest problem is that Mulgara stores only the literal _value_ for
> >> > known datatypes. That means that "+1"^^xsd:int is stored identically to
> >> > "1"^^xsd:int. This has significant consequences for how we implement
> >> > sameTerm as these literals originally have different labels, but are
> >> > collapsed into the same label.
> >> >
> >> > RDF Concepts states that for two literals to be the same "The strings
> >> > of the two lexical forms compare equal, character by character." (see
> >> > below for more context). Mulgara will have to begin storing the original
> >> > label with all literals (at least for unreproducible labels) before we
> >> > can release a stable SPARQL compliant RDF store.
> >> >
> >> >  ** Paul/Andrae, can this change be put into the Mulgara road-map? **
> >> >
> >> > Thanks,
> >> > James
> >> >
> >> > ---%<---
> >> > The SPARQL definition of sameTerm states[1]:
> >> >         Returns TRUE if term1 and term2 are the same RDF term as defined
> >> >         in Resource Description Framework (RDF): Concepts and Abstract
> >> >         Syntax [CONCEPTS]; returns FALSE otherwise.
> >> >
> >> > Here is an excerpt from RDF Concepts[2]:
> >> >         6.5.1 Literal Equality
> >> >         Two literals are equal if and only if all of the following hold:
> >> >
> >> >               * The strings of the two lexical forms compare equal,
> >> >                 character by character.
> >> >               * Either both or neither have language tags.
> >> >               * The language tags, if any, compare equal.
> >> >               * Either both or neither have datatype URIs.
> >> >               * The two datatype URIs, if any, compare equal, character
> >> >                 by character.
> >> >
> >> > [1] http://www.w3.org/TR/rdf-sparql-query/#func-sameTerm
> >> > [2] http://www.w3.org/TR/rdf-concepts/#section-Graph-Literal
> >> >
> >
> >
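
The suggestion quoted above, storing the original label "at least for unreproducible labels", could be sketched roughly as follows. This is a hypothetical Python illustration, not Mulgara's string pool: `StoredInt`, `encode` and `decode` are invented names, Python's `int()` stands in for xsd:int parsing, and the canonical form is assumed to be whatever `str(int(...))` produces.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class StoredInt:
    """Hypothetical stored node for an xsd:int literal."""
    value: int
    label: Optional[str] = None  # kept only when the value cannot reproduce it


def encode(lexical: str) -> StoredInt:
    """Store the value; keep the label only if it is not the canonical form."""
    value = int(lexical)
    canonical = str(value)
    return StoredInt(value, None if lexical == canonical else lexical)


def decode(node: StoredInt) -> str:
    """Recover the exact lexical form that was originally inserted."""
    return node.label if node.label is not None else str(node.value)


print(encode("1"))   # StoredInt(value=1, label=None)
print(encode("+1"))  # StoredInt(value=1, label='+1')
```

Under these assumptions, canonical literals carry no extra label and could still be stored inline, while "+1" and "1" become distinct nodes that share the same value, so sameTerm can tell them apart.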
Received on Tuesday, 29 July 2008 16:01:35 GMT
