Re: ISSUE-29: Re: When is equal and when is it nonequal (eg, the IRI interface) from Ivan Herman on 2010-08-05 (public-rdfa-wg@w3.org from August 2010)

From: Ivan Herman <ivan@w3.org>
Date: Thu, 5 Aug 2010 06:30:08 +0200
To: nathan@webr3.org
Cc: Manu Sporny <msporny@digitalbazaar.com>, Toby Inkster <tai@g5n.co.uk>, W3C RDFa WG <public-rdfa-wg@w3.org>, Gregg Kellogg <gregg@kellogg-assoc.com>
Message-Id: <E5666618-3266-4D55-8741-B4BECA5A8156@w3.org>
Nathan,

(I also copy Gregg to this mail explicitly, because he referred to something similar)

I am still looking at all this from an implementor's perspective, using an external, sort-of-standard RDF package. As I said, extending an IRI (ie, a URIRef in RDFLib) with additional information through, say, subclassing is possible Python-wise, and it is of course possible to redefine the __eq__ magic method. But RDFLib goes wild if the equality also takes the info into account on top of the URI-s, and the serialization goes wrong. If, on the other hand, equality does not touch the info value, then some of these class instances will be lost and the output will be incorrect.
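
To make the dilemma concrete, here is a minimal sketch in plain Python. It does not use RDFLib: the IRI and StrictIRI classes are hypothetical stand-ins built on str, and "info" is just an attribute holding, say, an origin label. It only illustrates how the two choices of equality behave when instances end up in sets or dictionaries, which is what a package does internally.

```python
class IRI(str):
    """Hypothetical stand-in for a package's URI class (not RDFLib's URIRef).

    Equality and hashing are inherited from str, so info is ignored.
    """

    def __new__(cls, value, info=None):
        obj = super().__new__(cls, value)
        obj.info = info  # extra, non-RDF data such as the origin node
        return obj


class StrictIRI(IRI):
    """Variant whose equality also compares info."""

    def __eq__(self, other):
        if isinstance(other, IRI):
            return str.__eq__(self, other) and self.info == other.info
        return str.__eq__(self, other)

    def __ne__(self, other):
        # str defines its own __ne__, so it must be overridden as well.
        result = self.__eq__(other)
        return result if result is NotImplemented else not result

    # Defining __eq__ drops the inherited hash, so restore it explicitly.
    __hash__ = str.__hash__


URI = "http://example.org/alice"

# Choice 1: equality includes info. The same URI now counts as two nodes.
strict_a = StrictIRI(URI, info="node-1")
strict_b = StrictIRI(URI, info="node-2")
strict_nodes = {strict_a, strict_b}

# Choice 2: equality ignores info. The second instance is silently dropped.
loose_a = IRI(URI, info="node-1")
loose_b = IRI(URI, info="node-2")
loose_nodes = {loose_a, loose_b}
survivor = next(iter(loose_nodes))

print(len(strict_nodes), len(loose_nodes), survivor.info)
```

With the first choice a store sees two distinct nodes for one URI; with the second the set keeps only the first instance, so the info of the other one is gone.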

Though, conceptually, equality for triples raises similar issues, it does not do so in practice, at least in RDFLib, and I suspect a number of other packages will handle it as well. Indeed, at least in RDFLib, there already _is_ a way to handle triples as quads, and to use a triple store that is, in fact, a quadstore. It will do all the right things, ie, it will serialize things properly while still keeping the quads separate if needed. By using a structure containing the origin information for the three nodes, and using instances of that structure as the quad info, things can be done easily. (Yes, this is, sort of, a named-graph-like mechanism in RDFLib.)
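
As a rough illustration of the quad idea, again in plain Python rather than RDFLib's own API: the Triple, Origin and QuadStore names below are hypothetical. The terms stay ordinary strings with ordinary equality, and the origin of the three nodes travels as a fourth component next to the triple.

```python
from collections import namedtuple

# Plain terms: equality is just string equality, as the package expects.
Triple = namedtuple("Triple", "subject predicate object")

# Hypothetical structure holding the origin of each of the three nodes.
Origin = namedtuple("Origin", "subject predicate object")


class QuadStore:
    """Toy store keeping (triple, origin) pairs; a sketch, not RDFLib."""

    def __init__(self):
        self._quads = set()

    def add(self, triple, origin):
        self._quads.add((triple, origin))

    def triples(self):
        # What a serializer would see: each triple once, origins ignored.
        return {triple for triple, _ in self._quads}

    def origins(self, triple):
        # All origins recorded for one triple.
        return {origin for t, origin in self._quads if t == triple}


store = QuadStore()
name = Triple("http://example.org/alice",
              "http://xmlns.com/foaf/0.1/name",
              "Alice")

# The same triple found at two different places in the source document.
store.add(name, Origin("span#1", "span#1", "span#1"))
store.add(name, Origin("div#2", "div#2", "div#2"))

print(len(store.triples()), len(store.origins(name)))
```

The triple is serialized once, yet both origins remain retrievable, and nothing about term equality had to change.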

Cheers

Ivan

On Aug 2, 2010, at 16:17 , Nathan wrote:

> 
> Again, with all due respect, what's the use case for having this additional information at all? Does anybody actually need the origin for anything (even if moved to the triple as Toby originally suggested)?
> 
> If you simply move the extra properties on to the RDFTriple then surely you get the same problem when testing triples for equivalence. Would you then be saying that two triples are not equal if they come from different sources?
> 
> An RDFTriple does not have any of these extra properties as we all know, if you are saying that a triple in RDFa does, then to me that simply means:
>  RDFaTriple extends RDFTriple
> 
> On a more practical implementation level, by subclassing string you can get around the equivalence issues in some (but not all) languages. The other common and simple approach is to add an equivalence method o.equals(o) to interfaces which need custom equivalence testing.
> 
> All in all though, I realise removing origins altogether would be a big change to the spec - but is there any reason not to (or indeed to have them there in the first place, in whatever form)?
> 
> Best,
> 
> Nathan
> 
> Ivan Herman wrote:
>> Manu,
>> I do not think that having 'info' helps in any way. And I am not sure it is a different issue, in the sense that it is so intimately related to equivalence testing that I am not sure how to separate them.
>> My assumption is, as usual, that I try to implement the API on top of some standard RDF package. That is of course a restriction, because I am bound to what the package gives me, but I think it is reasonable to suggest that a number of people will want to do that. That is how RDFLib comes into the picture for me.
>> A URI is a fundamental feature of those packages, and so is the way to compare them for equality. It may be considered as atomic (whereas triples are not); triple comparison, internal cuisine of the package, serialization, etc, are based on that equality. If we touch that atomic unit in any way, then this may have ripple effects all around. In the RDFa API the URI is such an atomic unit, too. If I can map that on an RDFLib URI (call it URIRef) by simple renaming, then I can rely on the package for everything else. If I cannot, then I have to, essentially, throw away RDFLib. I do not think this should be an option.
>> Because RDFLib is in Python it is of course thinkable to create a subclass of RDFLib's URIRef, and re-define equality for that subclass as... hm. What? (Yes, this is the issue you raised.) Is it such that a == b if a.info == b.info and a.value == b.value? That is of course wrong. But if I define a == b <=> a.value == b.value, then the underlying system will ignore the info in a bunch of places (eg, when it creates dictionaries using those class instances as keys) and things go wrong. My first attempts at implementing that in RDFLib failed, and I am not sure what I would have to do. And I am not sure implementing such a subclassing mechanism with equality redefinition is a viable option for all languages in the first place!
>> So yes, the essence of the matter is equality, but that comes from the issue of creating a new type of atomic unit, so to say, which stores more value than just the URI.
>> Nathan's question is absolutely correct here: what is the use case? What is the use case that cannot be solved by having (as Toby suggested) that additional information in a triple and not in the URI?
>> Ivan
>> On Aug 2, 2010, at 03:28 , Manu Sporny wrote:
>>> On 06/09/2010 04:53 PM, Toby Inkster wrote:
>>>> On Wed, 9 Jun 2010 17:44:41 +0200
>>>> Ivan Herman <ivan@w3.org> wrote:
>>>> 
>>>>> The problem is as follows. In a package like RDFLib, the equality is
>>>>> fairly simple, it is based on the equality of the URIs (let us put,
>>>>> for a moment, the issue of IRI vs URI and its encoding aside).
>>>>> However, our version of the IRI has two attributes: the 'value' and
>>>>> the 'origin'. Ie, two IRI instances that have the same value but
>>>>> different origins are different.
>>>> 'origin' as specified needs fixing. Rather than:
>>>> 
>>>> 	triple.subject.origin
>>>> 
>>>> We should have:
>>>> 
>>>> 	triple.subjectOrigin
>>>> 
>>>> That way testing for equivalence between IRIs, Blank nodes and Literals
>>>> becomes obvious.
>>>> 
>>>> <> rdfs:seeAlso
>>>> <http://lists.w3.org/Archives/Public/public-rdfa-wg/2010Apr/0161.html>.
>>> Unfortunately, I don't think the answer to this one is as simple as
>>> that, Toby. It assumes that we're operating in a DOM environment, which
>>> we can't depend on for the RDFa API.
>>> 
>>> I've updated the RDFa API spec to move "origin" into a
>>> developer-modifiable attribute called info. We also had to rename
>>> "origin" to "source" due to Thomas' input on ISSUE-29. So, the way you
>>> access the property at the moment is:
>>> 
>>> triple.subject.info.source
>>> 
>>> That's not great... definitely would like to hear some suggestions on
>>> how we could make this easier on developers. The "info" mechanism is
>>> meant to be a free-form developer-modifiable mechanism for storing
>>> additional information along with subjects, predicates and objects. Mark
>>> had asked for something like this as had Nathan and a few others.
>>> 
>>> However, I think your and Ivan's issue has to do with equivalence
>>> testing, which may be a completely separate discussion (and may need an
>>> ISSUE of its own).
>>> 
>>> AFAIK, there are two types of equivalence testing that we could do with
>>> the RDFa API. The first performs equivalence testing for just the triple
>>> information, that is, purely on the subject/predicate/object data
>>> (including datatype and lang information).
>>> 
>>> The second type performs equivalence testing on the information above,
>>> in addition to the source Element information.
>>> 
>>> Perhaps we should specify a section in the RDFa API that clarifies the
>>> different types of equivalence testing one can do via the RDFa API. I'm
>>> a bit hesitant to specify the "correct" way to do equivalence testing in
>>> the RDFa API spec as there are many ways to do it.
>>> 
>>> -- manu
>>> 
>>> -- 
>>> Manu Sporny (skype: msporny, twitter: manusporny)
>>> President/CEO - Digital Bazaar, Inc.
>>> blog: WebApp Security - A jQuery Javascript-native SSL/TLS library
>>> http://blog.digitalbazaar.com/2010/07/20/javascript-tls-1/
>>> http://blog.digitalbazaar.com/2010/07/20/javascript-tls-2/
>> ----
>> Ivan Herman, W3C Semantic Web Activity Lead
>> Home: http://www.w3.org/People/Ivan/
>> mobile: +31-641044153
>> PGP Key: http://www.ivan-herman.net/pgpkey.html
>> FOAF: http://www.ivan-herman.net/foaf.rdf
> 
> 


----
Ivan Herman, W3C Semantic Web Activity Lead
Home: http://www.w3.org/People/Ivan/
mobile: +31-641044153
PGP Key: http://www.ivan-herman.net/pgpkey.html
FOAF: http://www.ivan-herman.net/foaf.rdf
Received on Thursday, 5 August 2010 04:29:22 UTC