RE: Reification from Danny Ayers on 2001-04-10 (www-rdf-logic@w3.org from April 2001)

From: Danny Ayers <danny@panlanka.net>
Date: Tue, 10 Apr 2001 19:44:55 +0600
To: "Peter Crowther" <Peter.Crowther@melandra.com>
Cc: <www-rdf-logic@w3.org>
Message-ID: <EBEPLGMHCDOJJJPCFHEFMEFPDBAA.danny@panlanka.net>
<- > <- You cannot control what reads your RDF if you express something
<- > <- in RDF (just as with HTML or with XML).
<- >
<- > I believe you can.
<-
<- OK, I'll go one stage further in my argument: If you embed some RDF in a
<- resource [for example as some annotation on a Web page] and place that
<- resource in a location where it is accessible from the Web, then any user
<- agent accessing that resource can and should get the RDF as well.  Yes, you
<- can play tricks with the server handing out different RDF depending on the
<- user agent, but there are still the usual problems of unknown UAs,
<- pre-caching proxies, and crawlers that use fake UA identifiers.  So I
<- contend that it is not, in the general case, possible to limit what a
<- particular agent sees.
<-
<- I'd be interested to see you justify your belief in a similar way, bearing
<- in mind that we're talking about the Web, not about EDI between two
<- mainframes with a dedicated link between them.

I have information at the URL in my sig that I happily expose to the web,
and to any agents that might be around. I have a credit card number that I
use to order books over the web, but I try not to make this available to
agents other than those on a very short list.

Let's say there's an intelligent web crawler that scans my site, sees
'XML' all over the place, decides this is equivalent to 'XXXX', and
lists my site in a catalogue as porn.
Is this not like the scenario you have been talking about? Whose
responsibility is this? Am I in the wrong for not having 'not smut' in my
robots.txt, or is the owner of the crawler in the wrong for publishing
incorrectly interpreted data?
More to the point, how would I deal with this? I'd either politely request
that the owners of the crawler remove the incorrect entry, ignore it, or take
them to court. Only I wouldn't take them to court unless there was a lot at
stake, something business critical about the whole thing. And if it was
something like this that could potentially be misused, then I wouldn't make
it so public in the first place - like my card number.

There is another point here, perhaps more fundamental - you're making the
assumption that the only alternatives are complete public accessibility
or EDI-style restricted peer-to-peer links. Surely it is a very basic
requirement/prospect that information can be published within constraints,
that channels can be restricted according to the preferences of the
individuals/entities involved? What's this 'web of trust' stuff about
anyway?

OK, but assume my potentially sensitive data is broadcast. Then:

<- The problem here is when a system receives some RDF, tries to do this, and
<- falls flat on its face because it's unaware of some nuance of the way a
<- property is being used within a particular piece of RDF.

(most of the quote snipped except) "agents processing metadata will be
able to trace the origins of schemata they are unfamiliar with back to known
schemata and perform meaningful actions on metadata they weren't originally
designed to process."

This strikes me as an overly optimistic, unrealistic line.  I think in
reality the agent would *attempt* to trace the schemata back to known
schemata, and if it recognised the elements involved then maybe it could
perform meaningful actions. To successfully trace the schemata back it needs
a matching mechanism to tell it when it has reached the end of the chain. If
it can't find an adequate match, then it wouldn't make sense to try to
'perform meaningful actions' because there would be unknowns remaining
(unless of course the mechanism is happy to arbitrarily equate e.g. <a> with
<b> when it only recognises <b>).
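To make that concrete, here is a rough Python sketch of the tracing step. It is only an illustration under assumptions: the property names, the DERIVED_FROM table and the KNOWN set are all made up, and a real agent would read such links from the schemata themselves rather than from a hard-coded dictionary.

```python
from typing import Optional

# Hypothetical "derived from" links: each unfamiliar property points at the
# property it refines. None of these names come from a real schema.
DERIVED_FROM = {
    "ex:penName": "ex:author",
    "ex:author": "dc:creator",
    "ex:mood": "ex:feeling",
}

# Properties this particular agent was designed to process (an assumption).
KNOWN = {"dc:creator", "dc:title"}


def trace_to_known(prop: str) -> Optional[str]:
    """Follow the chain from prop until a known property is reached.

    Returns the known property, or None when the chain ends (or loops)
    without an adequate match.
    """
    seen = set()
    while prop not in KNOWN:
        if prop in seen or prop not in DERIVED_FROM:
            return None  # end of the chain, unknowns remain
        seen.add(prop)
        prop = DERIVED_FROM[prop]
    return prop
```

With these tables, "ex:penName" traces back to "dc:creator", while "ex:mood" ends at "ex:feeling", which the agent does not know, so the result is None and no 'meaningful action' is attempted.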


<- [...]
<- > Ignore the tag would be preferable fallback, I'd have thought.
<-
<- First find your 'tag'.  RDF properties are probably the closest equivalent
<- to *ML tags.
<-
<- The problem is that you can't do this with RDF at all, because *all* the
<- properties carry data with uncertain interpretation.

The definitions in your namespace make them as certain as you like.

<- With RDFS, a limited number of properties have some informal definition; for
<- example, it would be possible to use rdfs:subClassOf to extract hierarchies
<- from any piece of RDFS without understanding the rest.  However, if you wish
<- to produce any system that can answer queries about arbitrary RDF
<- structures, you're still screwed: without a standard definition of (for
<- example) negation, your system may give answers that are quite different to
<- those produced by a system that knows about the particular way a structure's
<- creator chose to express negation.

I don't see where negation comes into this example.

If I ask 'is A a descendant of B?', the query engine looks for a match, or one
it can infer, e.g. from:
A is a child of C
C is a child of B

or a contradiction (B is a descendant of A).

Where no match or contradiction is found, you get the answer back
'unknown'; what you do with that is your own concern.
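As a minimal sketch of such a query engine in Python (the function names and the (child, parent) tuple encoding are my own assumptions, not anything from RDF or RDFS):

```python
# Facts as (child, parent) pairs, taken from the example above.
CHILD_OF = {("A", "C"), ("C", "B")}


def is_descendant(x, y, facts):
    """True if a chain of child-of facts leads from x up to y."""
    frontier, seen = [x], set()
    while frontier:
        node = frontier.pop()
        if node in seen:
            continue
        seen.add(node)
        for child, parent in facts:
            if child == node:
                if parent == y:
                    return True
                frontier.append(parent)
    return False


def ask(x, y, facts):
    """Answer 'is x a descendant of y?' with yes, no or unknown."""
    if is_descendant(x, y, facts):
        return "yes"      # a match, stated or inferred
    if is_descendant(y, x, facts):
        return "no"       # contradiction: y is a descendant of x
    return "unknown"      # neither found; the caller decides what to do
```

Here ask("A", "B", CHILD_OF) gives "yes" by inference through C, ask("B", "A", CHILD_OF) gives "no", and asking about something not mentioned in the facts gives "unknown". Negation never enters into it.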


<- For those of us who like to produce generic solutions that we can re-sell,
<- this is a big problem :-).
<-
<- > I see your point anyway - personally I think it would be better in the case
<- > of uncertainty for the trumpets to sound, the drawbridge to be raised and
<- > the oil put on the stove (or at least the transaction to be rolled back).
<-
<- That's fine, but 'uncertainty' in this case would have to be the discovery
<- of *any* piece of RDF where the system didn't know the meaning of a property,

What's wrong with that?

Uncertainty has to be handled somehow, and I think some approaches
will be considerably more useful than others.

What are you suggesting should be done when an unknown item is encountered?

I don't think it's important whether it's related to negation or whatever -
either you know what to do with something (through a chain of schemas) or
you don't.

I suppose it would be possible for an engine (A) to have a rule like 'if
secondary then author = {alternative}'?

given the data

<author secondary="true">Herman Melville</author>
<alternative>Hubert Melville</alternative>

A would read this as
author = Hubert Melville

Another engine (B), ignorant of 'secondary' decides
author = Herman Melville

Engine C, also ignorant of 'secondary' but unwilling to guess, decides
author = {undefined}
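The three engines can be sketched in Python along these lines. This is an illustration only: the <record> wrapper is a hypothetical root added so the fragment parses, and the reading of engine C as one that refuses to answer when it meets an attribute it doesn't know is my assumption about how it arrives at {undefined}.

```python
import xml.etree.ElementTree as ET

# The two elements from the example, wrapped in a hypothetical <record>
# root so that the fragment is well-formed XML.
DATA = """<record>
<author secondary="true">Herman Melville</author>
<alternative>Hubert Melville</alternative>
</record>"""


def engine_a(record):
    """Knows the rule: if secondary then author = {alternative}."""
    author = record.find("author")
    if author.get("secondary") == "true":
        return record.findtext("alternative")
    return author.text


def engine_b(record):
    """Ignorant of 'secondary': ignores the attribute, takes the text."""
    return record.findtext("author")


def engine_c(record):
    """Ignorant of 'secondary': treats any unknown attribute as fatal."""
    author = record.find("author")
    if author.attrib:
        return None  # stands in for {undefined}
    return author.text


record = ET.fromstring(DATA)
results = (engine_a(record), engine_b(record), engine_c(record))
```

The same data yields 'Hubert Melville' from A, 'Herman Melville' from B and None from C, so the fallback policy for unknowns matters as much as the data.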

I need a cup of tea...
Received on Tuesday, 10 April 2001 09:48:23 UTC