trade-offs for equivalence tests from Eric Prud'hommeaux on 2006-08-22 (public-rdf-dawg@w3.org from July to September 2006)

From: Eric Prud'hommeaux <eric@w3.org>
Date: Tue, 22 Aug 2006 12:06:30 +0200
To: public-rdf-dawg@w3.org
Message-ID: <20060822100630.GA1067@w3.org>

The consensus during the 8 Aug telecon was that we should have the =
operator serve for both value equivalence and node equivalence tests:

  value-eq:  "12.0"^^xsd:float = "12"^^xsd:integer
  node-eq:   <foo> = <foo>

The screw case is that one cannot use '=' to test whether two
literals of unknown type are *different*:

  "asdf"^^foo:bar != "qwer"^^foo:bar  => type error

We need to either accept nonmonotonicity or add an operator to allow
simple node-eq checking on literals of unsupported types. The
former seems like a non-starter, so it seems we need an operator for
the latter. Here are some options and their costs:
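To make the screw case concrete, here is a minimal Python sketch of
the two tests. It is an illustration under stated assumptions, not
proposed spec text: Term, EvalTypeError, node_eq, value_eq, value_ne
and the SUPPORTED_NUMERIC set are all hypothetical names, and the toy
engine only understands a few XSD numeric types. Two distinct
lexical forms of an unknown datatype might still denote the same
value, so value-ne cannot answer "true" without risking a different
answer once the datatype becomes known.

```python
from dataclasses import dataclass
from typing import Optional

XSD = "http://www.w3.org/2001/XMLSchema#"
# Assumption: the only datatypes this toy engine understands.
SUPPORTED_NUMERIC = {XSD + "integer", XSD + "decimal",
                     XSD + "float", XSD + "double"}


class EvalTypeError(Exception):
    """Raised when a comparison cannot be decided (hypothetical name)."""


@dataclass(frozen=True)
class Term:
    kind: str                       # "iri", "bnode" or "literal"
    lexical: str                    # IRI string, bNode label or lexical form
    datatype: Optional[str] = None  # datatype IRI, literals only


def node_eq(a: Term, b: Term) -> bool:
    """Same RDF term: same kind, same lexical form, same datatype."""
    return a == b


def value_eq(a: Term, b: Term) -> bool:
    """Compare by value; only defined here for supported numeric literals."""
    if (a.kind == b.kind == "literal"
            and a.datatype in SUPPORTED_NUMERIC
            and b.datatype in SUPPORTED_NUMERIC):
        return float(a.lexical) == float(b.lexical)
    raise EvalTypeError("operands have no known value space")


def value_ne(a: Term, b: Term) -> bool:
    # The error propagates: "cannot decide" is neither true nor false.
    return not value_eq(a, b)


twelve_float = Term("literal", "12.0", XSD + "float")
twelve_int = Term("literal", "12", XSD + "integer")
asdf = Term("literal", "asdf", "http://example.org/foo#bar")
qwer = Term("literal", "qwer", "http://example.org/foo#bar")
```

With these definitions, value_eq(twelve_float, twelve_int) holds while
node_eq(twelve_float, twelve_int) does not, and value_ne(asdf, qwer)
raises EvalTypeError where node_eq(asdf, qwer) simply answers False.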

 -- Conservative --
The syntax for the two operators is entirely distinct; you can't
write "is this IRI the same as that one?" the same way you write "is
this value numerically equivalent (value-eq) to that value?"

Advantages:
- consistent user experience: operators always do the same thing.
- more optimizable: node-eq is treated just as pointer equivalence;
  the heavier value-eq is only invoked when the user specifically
  demands it.

Disadvantages:
- you need to keep more operators in mind
- queries where you want to test *either* node-eq or value-eq need to
  be specially worded with an ||


 -- Liberal (outcome of the 8 Aug telecon) --
The syntax for value-eq does double duty for all the possible (as
limited by monotonicity) operands; you *can* test for IRI-eq, IRI-ne,
bNode-eq, bNode-ne and literal-eq the same way you test for numeric
equivalence. IRI and bNode equivalence can be tested with the
node-eq operator *or* the value-eq operator.

Advantages:
- intuitive: for most cases, the one operator does what you need.

Disadvantages:
- inconsistent: literal-ne behaves differently from the rest.
- less optimizable: every value-eq test needs to do both a value test
  and a node equivalence test.
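The liberal option can be sketched in Python as follows. This is
only one reading of the telecon outcome, under assumptions: Term,
EvalTypeError, liberal_eq, liberal_ne and SUPPORTED_NUMERIC are
hypothetical names, and the toy engine treats every non-numeric
literal datatype as unknown. It shows both disadvantages: each call
does a node test and then possibly a value test, and literal-ne is
the one case that can end in a type error.

```python
from dataclasses import dataclass
from typing import Optional

XSD = "http://www.w3.org/2001/XMLSchema#"
# Assumption: the only datatypes this toy engine understands.
SUPPORTED_NUMERIC = {XSD + "integer", XSD + "decimal",
                     XSD + "float", XSD + "double"}


class EvalTypeError(Exception):
    """Raised when a comparison cannot be decided (hypothetical name)."""


@dataclass(frozen=True)
class Term:
    kind: str                       # "iri", "bnode" or "literal"
    lexical: str                    # IRI string, bNode label or lexical form
    datatype: Optional[str] = None  # datatype IRI, literals only


def liberal_eq(a: Term, b: Term) -> bool:
    # Step 1: node equivalence. Identical terms are equal whatever their type.
    if a == b:
        return True
    # Step 2: value equivalence, for literals the engine understands.
    if a.kind == b.kind == "literal":
        if a.datatype in SUPPORTED_NUMERIC and b.datatype in SUPPORTED_NUMERIC:
            return float(a.lexical) == float(b.lexical)
        # Distinct literals of an unknown datatype might still denote the
        # same value, so answering "different" would not be monotonic.
        raise EvalTypeError("cannot decide literal equality")
    # IRIs, bNodes or mixed kinds: distinct terms are simply different.
    return False


def liberal_ne(a: Term, b: Term) -> bool:
    # A type error from liberal_eq propagates unchanged.
    return not liberal_eq(a, b)


foo = Term("iri", "http://example.org/foo")
bar = Term("iri", "http://example.org/bar")
asdf = Term("literal", "asdf", "http://example.org/foo#bar")
qwer = Term("literal", "qwer", "http://example.org/foo#bar")
```

Here liberal_ne(foo, bar) and liberal_eq(asdf, asdf) both answer True,
but liberal_ne(asdf, qwer) raises EvalTypeError, which is the
literal-ne inconsistency listed above.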

Finally, we should figure out how to spell these two operators in the
query. '=' seems like a popular choice for at least one.

 -- = / sameNode --
  ?num1 = ?num2 && sameNode(?str1, ?str2)

 -- = / == --
  ?num1 = ?num2 && ?str1 == ?str2

 -- == / = --
  ?num1 == ?num2 && ?str1 = ?str2

Andy says that having both eq and = in RDQL led to user confusion.
Given that we want monotonicity, I don't think we can avoid having two
operators. The question is infix/function, how to spell them, and how
liberal to make the value-eq operator.
-- 
-eric

home-office: +1.617.395.1213 (usually 900-2300 CET)
	    +33.1.45.35.62.14
cell:       +33.6.73.84.87.26

(eric@w3.org)
Feel free to forward this message to any list for any purpose other than
email address distribution.

Received on Tuesday, 22 August 2006 10:05:16 UTC