RE: Concise Bounded Descriptions - updated, expanded, stand-alone definition from Patrick.Stickler@nokia.com on 2004-08-20 (www-rdf-interest@w3.org from August 2004)

From: <Patrick.Stickler@nokia.com>
Date: Fri, 20 Aug 2004 17:29:02 +0300
To: <otto@math.fu-berlin.de>
Cc: <www-rdf-interest@w3.org>
Message-ID: <A03E60B17132A84F9B4BB5EEDE57957B026304C8@trebe006.europe.nokia.com>
> -----Original Message-----
> From: ext Karsten Otto [mailto:otto@math.fu-berlin.de]
> Sent: 20 August, 2004 16:57
> To: Stickler Patrick (Nokia-TP-MSW/Tampere)
> Cc: www-rdf-interest@w3.org
> Subject: RE: Concise Bounded Descriptions - updated, expanded,
> stand-alone definition
> 
> 
> On Fri, 20 Aug 2004 Patrick.Stickler@nokia.com wrote:
> 
> > > > A draft of an updated, expanded, stand-alone definition for
> > > > Concise Bounded Descriptions is now available
> > > >
> > > >   http://swdev.nokia.com/uriqa/CBD.html
> > > >
> > > [snip]
> > >
> > > Great to have this on its own page as a point of reference!
> > > However, I have a problem with the new concept of the inverse
> > > functional bounded description: It requires that both the sending
> > > and receiving agents are schema/ontology-aware, and also that they
> > > share the same schema/ontology knowledge, in order to correctly
> > > create and interpret a CBD.
> > >
> > > For one, the sender needs to know that a given predicate is an
> > > owl:InverseFunctionalProperty, so it can pick the "if"-branch of
> > > the IFBD definition for an anonymous resource. However, this
> > > knowledge may not always be available, e.g. in case of a simple
> > > semantic web crawler. AFAIK the issue of finding all
> > > schemata/ontologies for a given RDF graph is not solved in general
> > > yet - or is it?
> >
> > If the sending agent is not aware that a given property is
> > an owl:InverseFunctionalProperty, then it proceeds as if it
> > is not.
> >
> Ok, seems like I read too much into the definition. I took it as
> a MUST, and without getting too philosophical, an IFP *is* an IFP
> whether I know it or not. But from your reply I gather it should
> read something like "where the predicate *is known to be* an
> owl:InverseFunctionalProperty".
> 
> > Then again, if the agent has no knowledge about the property,
> > it (ideally) would be able to submit a URIQA request to the
> > metadata authority and obtain the information it needs.
> >
> Yes, this makes sense for a more complex agent. But I was thinking
> of a simpler case, such as a passive RDF database with a frontend
> for answering queries with CBDs (e.g. via URIQA :-)
> 
> Also, what is the "metadata authority" you mention?

Whoever controls the response to a URIQA request (e.g. MGET)
(per the web authority of the URI). So, e.g., for the resource
denoted by http://www.example.com/blargh the web authority
is www.example.com and thus a description from www.example.com
is the authoritative description (as opposed to, e.g., a description
obtained from any other source, even if via the URIQA servlet
API, e.g. http://www.google.com/uriqa?uri=http://www.example.com/blargh).
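
As a rough illustration of the distinction (a sketch only, not part
of URIQA itself), one could compare the host of the resource URI
with the host of whoever supplied the description. The function
names here are made up for the example, and the sketch assumes that
"web authority" can be approximated by the host name alone, ignoring
port and user info:

```python
from urllib.parse import urlsplit


def web_authority(uri: str) -> str:
    """Return the lower-cased host of a URI (assumed stand-in for its web authority)."""
    return (urlsplit(uri).hostname or "").lower()


def is_authoritative(resource_uri: str, description_source_uri: str) -> bool:
    """True if the description came from the same host that the resource URI names."""
    authority = web_authority(resource_uri)
    return authority != "" and authority == web_authority(description_source_uri)
```

With the URIs above, a description fetched through the
www.google.com servlet would not count as authoritative for
http://www.example.com/blargh, while one served by www.example.com
would.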

> 
> > However, the definition/generation of CBDs still works just
> > fine if such information is not available -- in fact, it is
> > then the same as the original definition of CBDs.
> >
> > > Furthermore, the receiver also needs to know that a given
> > > predicate is an IFP. This is a more serious issue, as it needs
> > > this to determine whether the "if"- or the "else"-branch of the
> > > IFBD definition was picked by the sender. In the "else" case, it
> > > already has all known statements, but in the "if" case it might
> > > need to issue another query (by IFP).
> > >
> > > Consequently, if the IFP is unknown to the receiver, it might
> > > falsely conclude that it already got all information the sender
> > > had on the resource.
> >
> > Well, I think it is fair to presume that if the receiving agent
> > is going to do anything particularly useful with (i.e. make
> > decisions based on) the received knowledge, that it will have to
> > be aware of the vocabularies/ontologies in which that knowledge is
> > expressed.
> >
> Agreed. However, the open nature of RDF implies that an agent does not
> need to understand every statement in a graph. If a receiver does not
> understand the IFP the sender used to "prune" the CBD, it will mistake
> the pruned graph for the whole thing. IMHO there should be a way to
> distinguish the two cases.

I appreciate your point. I'm just not convinced that the subgraph
returned as a CBD needs to contain such process-specific knowledge.

E.g. one could include a statement

   ?x rdf:type cbd:InverseFunctionalPropertyDistinguished .

or some such triple to indicate which anonymous nodes are
uniquely distinguished by inverse functional properties and
which are not.
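
A minimal sketch of what the sending side might do, purely for
illustration: triples are plain (subject, predicate, object) tuples
of strings, blank nodes are recognised by a "_:" prefix, and the
cbd:InverseFunctionalPropertyDistinguished class is only the
suggestion above, not a defined vocabulary term. The function name
and the known_ifps parameter are assumptions of the example.

```python
RDF_TYPE = "rdf:type"
# Hypothetical marker class, taken from the suggestion above.
IFP_MARKER = "cbd:InverseFunctionalPropertyDistinguished"


def mark_ifp_distinguished(triples, known_ifps):
    """Return the triples plus one marker triple for each blank node
    that is the subject of a statement whose predicate is a known IFP."""
    marked = sorted(
        {s for (s, p, o) in triples if s.startswith("_:") and p in known_ifps}
    )
    return list(triples) + [(s, RDF_TYPE, IFP_MARKER) for s in marked]
```

A receiver that does not know the IFP itself could then at least
tell which anonymous nodes were described by an IFP statement, at
the cost of extra triples in every description.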

Or one could include, as you suggested, statements about the
properties themselves.

But I'd need to see a strongly motivating use case for doing
something like that. I.e. the goal is to keep CBDs as, er,
"concise" as possible, so any knowledge to be included needs
to fight hard to win a place in a CBD.

Since I envision a semantic web where agents can obtain
authoritative CBDs via dereferenceable URIs, the inclusion of
information such as above is not IMO sufficiently justified
(in the long term, at least).

> 
> > Also, and again, I personally do not see CBDs as a complete
> > solution to knowledge interchange between semantic web agents.
> > Something equivalent or comparable to URIQA must also exist so
> > that agents can further obtain the knowledge they need.
> >
> Of course. But my point is that the receiver cannot know that it
> needs to send another query in the problematic case. In fact, as it
> does not know the relevant IFP, it does not even have the necessary
> parameters for the query.

It does not have to know the IFPs. It can use the triples it has
been given as the template. If there are IFPs, then all of those
triples will include IFPs. If there are no IFPs, then none will
include IFPs, and the query may identify more than one resource.
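
To make the "template" idea concrete, here is a small sketch under
simplifying assumptions: the store is an in-memory list of
(subject, predicate, object) string tuples, and the helper names
are invented for the example. It is not meant as the way any real
query service works.

```python
def template_from(triples, node):
    """Collect the (predicate, object) pairs stated about `node`."""
    return {(p, o) for (s, p, o) in triples if s == node}


def match(store, template):
    """Return, sorted, every subject in `store` that has all pairs in `template`."""
    by_subject = {}
    for s, p, o in store:
        by_subject.setdefault(s, set()).add((p, o))
    return sorted(s for s, pairs in by_subject.items() if template <= pairs)


store = [
    ("_:a", "rdf:type", "dev:Printer"),
    ("_:a", "foo:location", "bar:Room5"),
    ("_:a", "comp:uniqueDeviceNumber", "42"),
    ("_:b", "rdf:type", "dev:Printer"),
    ("_:b", "foo:location", "bar:Room5"),
    ("_:b", "comp:uniqueDeviceNumber", "43"),
]

# Triples received about an anonymous node, used as-is as the template.
with_ifp = [("_:p", "comp:uniqueDeviceNumber", "42")]
without_ifp = [
    ("_:p", "rdf:type", "dev:Printer"),
    ("_:p", "foo:location", "bar:Room5"),
]
```

Here match(store, template_from(with_ifp, "_:p")) finds a single
subject, while the template built from without_ifp matches both
printers, which is the "more than one resource" case.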

Still, it would make more sense to me for the agent to ask about
the properties it has never seen before, expanding its knowledge
accordingly, before trying to use a half-blind brute-force
series of queries to extract additional knowledge about
resources denoted by anonymous nodes.
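
As a sketch of that first step (names invented for the example,
triples again as plain string tuples), the agent could list the
predicates it has not seen before and ask about each of them, for
instance with a URIQA MGET, before issuing any further queries:

```python
def unknown_predicates(triples, known_vocabulary):
    """Return, sorted, the predicates used in `triples` that the agent does not know."""
    return sorted({p for (_, p, _) in triples} - set(known_vocabulary))
```

How the descriptions of those predicates are then fetched is left
open here; the point is only that the set of things to ask about
falls straight out of the received triples.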

> 
> > > I see two possible solutions to this problem: The CBD could
> > > contain the relevant "ppp rdf:type owl:InverseFunctionalProperty"
> > > statements, or indicate all relevant ontologies by way of
> > > owl:imports. However, neither solution is viable for RDF-only
> > > cases, such as querying the aforementioned simple spider agent.
> >
> > Or, if the receiving agent has no knowledge about those properties,
> > it can either submit a URIQA MGET request, or ask the same source
> > of that knowledge for additional knowledge about those properties.
> > I.e., ask the sending agent what it knows about those properties
> > (by sending the CBD of each, etc.)
> >
> > I see the definition of CBDs as a component of a general
> > bootstrapping mechanism for the semantic web, not as an
> > all-encompassing solution to knowledge interchange.
> >
> Yes, and I don't expect them to be anything else. But I was thinking
> of a scenario where the receiving agent has limited resources
> (memory, bandwidth, CPU power), for example because it resides on a
> tiny embedded device. For that reason it cannot cache all ontologies
> it might encounter, or ask for the precise definition of everything
> it finds. But it can answer simple queries on the triple level, like
> "find me things of rdf:type dev:Printer with foo:location
> bar:Room5". The lookup service has some _:p in its database that
> matches these criteria, but also has the IFP
> comp:uniqueDeviceNumber. The agent does not know the comp ontology,
> so it does not know the information about _:p is pruned, and would
> match if another query were formed. (Sorry for this hasty example)

I certainly am sympathetic to agents running in limited environments
(after all, I'm a semantic web researcher working for Nokia ;-)
but again, my experience has been that applications that deal with
knowledge expressed with ontologies employing IFPs are aware of which
IFPs are important -- and even more so, are highly selective of 
knowledge which syncs with their own limited vocabularies and disregard
the rest (so if the embedded agent doesn't already know it's an IFP,
it doesn't care and won't bother about it anyway).

Again, I appreciate your point, but would like to see some hard
and real use cases and experience demonstrate the need, rather
than expanding the scope of CBDs merely "on a hunch" or as a matter
of esthetics or "just in case" arguments.

> 
> > > By the way this seems to be a more general case of the "crossing
> > > layer boundaries"-problem previously discussed (but not solved)
> > > in another mailing list thread [1].
> >
> > Well, given the examples you present in that referenced document,
> > I would say that those "missing triples" are provided for by the
> > closure rules defined in the RDF model theory. Triples that can be
> > inferred are not the same as triples which are simply not included
> > in a graph, but must be obtained separately.
> >
> Well, I think the cases are similar in that there the receiver is
> supposed to understand rdfs:subClassOf etc, where here it is
> supposed to understand owl:InverseFunctionalProperty. If the
> receiver does not understand these, it will fail to process the
> information in the way the sender intended.

That is a much broader issue and a general challenge for achieving
a critical mass of deployed semantic web solutions.

I don't see that the definition of CBDs directly helps or hinders
that problem (though I see that URIQA most certainly would help
substantially).

Thanks for the engaging questions. I hope you don't feel I'm in
any way blowing you off and not taking your points seriously. I'm
simply not convinced that there is a critical problem that would
be solved by changing the definition of CBDs as opposed to other
approaches/solutions.

Cheers,

Patrick
Received on Friday, 20 August 2004 14:29:17 UTC