Re: comparing XML and RDF data models

On Jul 2, 2008, at 11:02 PM, Peter Ansell wrote:

> 2008/7/2 Bijan Parsia <bparsia@cs.man.ac.uk>:
>>
>> On 2 Jul 2008, at 12:19, Mark Birbeck wrote:
>>
>>> Hi Tim,
>>>
>>> I'm not sure that this is where the differences lie.
>>>
>>> In my view the key point is that with RDF we have unique identifiers
>>> for concepts--whether that is the things we're talking about, or the
>>> vocabulary we're using to talk about them.
>>
>> [snip]
>>
>> I stop reading here.
>>
>> Here's why. At best we have unambiguous (not unique) identifiers, and we
>> don't have those either.
>
> I don't think it is too hard in scientifically based/real world
> ontology instances to determine whether things are in fact unique or
> common names for a universal thing.

I think you radically underestimate this. Talking with biologists and  
bioinformatics people reveals otherwise.

It's also a basic logico-mathematical fact. We can't get unique and
unambiguous names for *integers*.

So, I guess it is not too hard: for any name, you can say, "it's in
fact not unique" :)

>> This isn't even a coordination issue. In a single ontology it's highly
>> nontrivial to establish formal uniqueness (i.e., two names aren't
>> equivalent/equal; requires lots of reasoning) and even harder to establish
>> intended uniqueness (I might coin a term twice because I didn't recognize
>> them to be the same).
>
> Are you talking about establishing conclusively that two names are
> definitely not referring to the same universal thing?

I don't know what a universal thing is.

But yes, I don't think it's easy to establish that two people
intended their distinct names to refer to different things, and it's
certainly difficult to establish equivalence in a model (OWL
reasoning is difficult!). And even if you declare two classes to be
disjoint...that might be *wrong* (and one might find an isomorphism
between them).

> With the open
> world assumption you can never deny that they might refer to the same
> thing,

Well, you can say they are unequal or disjoint. (Or infer that.) But  
in an alignment situation, you might find that certain structurally  
identical but disjoint classes actually "ought" to denote the same  
thing (i.e., your model is wrong).

> possibly in an inconsistent way according to the ontology, but
> if you establish a consistent ontology and they do not match up given
> your chosen reasoning rules then you have to make the best assumption
> you can and say that they are not likely statistically to be the same
> and run with that. RDF doesn't provide a solution to dirty data,
> although it can be used to trace it better.

Please see:
	<http://www.w3.org/mid/44444506-BC65-4BB9-AF6B-01FE434C6C3A@cs.man.ac.uk>

> The point I think Mark may have been trying to make is that with
> certain property combinations, InverseFunctionalProperty's can be
> utilised in order to determine this uniqueness or non-uniqueness,

? IFPs aren't about identifiers per se. And they can't determine
uniqueness! In fact, generally you are trying to say that two names
refer to the same thing (i.e., aren't unique).
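
To make that concrete, here is a minimal sketch of what an IFP licenses.
It is plain Python, not any real RDF library; the function name `smush`,
the `ex:` names, and the mailbox values are all made up for illustration,
and it does a single pass over the triples rather than full OWL reasoning.

```python
from collections import defaultdict


def smush(triples, ifps):
    """Group subject names that an inverse functional property forces to co-refer.

    Hypothetical helper for illustration only: one pass, no OWL reasoning.
    """
    parent = {}

    def find(x):
        # Union-find lookup with path halving.
        parent.setdefault(x, x)
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    def union(a, b):
        ra, rb = find(a), find(b)
        if ra != rb:
            parent[rb] = ra

    first_subject = {}  # (property, value) -> first subject seen with it
    for s, p, o in triples:
        find(s)  # register every subject, even if nothing merges it
        if p in ifps:
            key = (p, o)
            if key in first_subject:
                # Same value for an IFP: the two names denote the same thing.
                union(first_subject[key], s)
            else:
                first_subject[key] = s

    groups = defaultdict(set)
    for name in parent:
        groups[find(name)].add(name)
    return sorted(sorted(g) for g in groups.values())


# Assumed example data; none of these names or mailboxes are real.
triples = [
    ("ex:bijan", "foaf:mbox", "mailto:a@example.org"),
    ("ex:bparsia", "foaf:mbox", "mailto:a@example.org"),
    ("ex:peter", "foaf:mbox", "mailto:p@example.org"),
]
print(smush(triples, {"foaf:mbox"}))
```

The only conclusion available is that ex:bijan and ex:bparsia are the
same. Nothing here shows that ex:peter is a *different* individual; under
the open world assumption the separate group just means "not yet shown
equal".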

> along with it being easier in RDF to distinguish between your
> acknowledged terms, and outside terms,

This is contentless for me. I literally have no idea what you mean.

> which in XML can result in you
> not acknowledging that two infoset elements are unique because one
> contains an unknown namespaced property against the desires of one
> schema, and you need to validate XML before you can work with it at
> this level.

Again, see:
	<http://www.w3.org/mid/44444506-BC65-4BB9- 
AF6B-01FE434C6C3A@cs.man.ac.uk>

I hope you don't find this offensive, but this just seems like  
*babble* to me. Perhaps a concrete example will help.

Tim Glover really did a heroic job of trying to make a case. I  
commend his effort as a model. The problem with getting clear is that  
one is likely to be wrong :( That sucks, but it's better to get clear  
and find out what we *can* usefully claim.

Again, I urge semanticwebbians to be *ruthless* in our scrutiny of  
the sorts of claims we make.  We have a *bad* reputation for  
Koolaidoisity, and, I'm afraid, it's well deserved. We're not going  
to win over unconvinced people by grandiose and vague magical claims.

...or maybe we will. Plenty of stuff works that way. But I don't like  
it. I'd rather say true and extremely intelligible things.

Cheers,
Bijan.

Received on Wednesday, 2 July 2008 22:47:58 UTC