Re: KR, meet WWW. was: Clarifying what a URL identifies (Four Uses of a URL) from Sandro Hawke on 2003-01-24 (www-tag@w3.org from January 2003)

From: Sandro Hawke <sandro@w3.org>
Date: Fri, 24 Jan 2003 12:33:30 -0500
To: Tim Berners-Lee <timbl@w3.org>
cc: "Jonathan Borden" <jonathan@openhealth.org>, "David Booth" <dbooth@w3.org>, www-tag@w3.org
Message-Id: <200301241733.h0OHXUL29052@wadimousa.hawke.org>
I agree with the main point here (that "libwww" and OWL each work fine
by themselves, but there's a danger of a collision), but I wanted to
pick a few nits (lest they fester) and point out the fundamental
disagreement we (TimBL and I) still have.  I'm hoping someone who
really understands the two options has ideas for a third option and/or
important use cases to help decide the matter.

> A log:uri B     , for example, means  the string b is the URI of 
> resource A.

[Nit] I'm pretty sure you mean B is _a_ URI of A.  Or do
"http://www.w3.org/" and "http://www.w3.org" (no trailing slash)
really identify different resources?
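
To make the "_a_ URI" point concrete, here is a minimal Python sketch
showing the two strings collapsing to one URI.  It assumes the http
scheme normalization rule that an empty path is equivalent to "/" (as
later written down in RFC 3986); normalize_http_uri is a hypothetical
helper name, not something from cwm or libwww.

```python
from urllib.parse import urlsplit, urlunsplit


def normalize_http_uri(uri: str) -> str:
    """Lowercase scheme and host, and treat an empty path as "/".

    Hypothetical helper: this covers only the part of URI
    normalization needed for the trailing-slash example.
    """
    parts = urlsplit(uri)
    path = parts.path or "/"
    return urlunsplit(
        (parts.scheme.lower(), parts.netloc.lower(), path,
         parts.query, parts.fragment)
    )


with_slash = normalize_http_uri("http://www.w3.org/")
without_slash = normalize_http_uri("http://www.w3.org")

# Two distinct strings, one normalized URI: log:uri relates the
# resource to more than one string.
print(with_slash == without_slash)
```

So a resource can have several URI strings, which is why "the URI"
reads as too strong.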

> A log:semantics B  means that B is the RDF triple set  which can
>   be obtained by looking up the resource A on the web.  Its domain is
>   Conceptual Work  (doc:Work) and its range is N3 formula (log:Formula).

[Nit] I suggest that "Mediated Memory Location" is more accurate than
"Conceptual Work", which has strong connotations of immutability.  The
location holds some information (more formally I think the right term
is more like "a proposition" or "a belief"), much as does a conceptual
work.

[Nit] Also, I think you'd agree this is a many-to-many relation, so B
is _an_ RDF triple set (graph) which can be obtained....

> These functions are built-in. The  log:semantics is calculated
> automatically for resources whose URI starts with "http:" and happen
> to be available on the web at the time.

[Complete Aside] Your implementation of log:semantics reminds me of
Prolog's marking of parameters with "?", "+", or "-".  For instance, the
"is" predicate can be used to do math, like
   is(X, 1+4)        (reports X = 5)
   is(6, 1+4)        (reports "no")
   is(5, 1+4)        (reports "yes")
but you can't say 
   is(5, X)          (reports Error: Unbound variable)
which I think should report an infinite sequence:
                       X=5, X=5+0, X=5+0+0, ..., X=4+1, ...

In the documentation, it's written
   is(?Result,+Expression)

The "?" on the first parameter means it MAY be bound, while the "+" on
the second means it MUST be.  "-" would mean it MUST be an unbound
variable.
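
The mode behavior can be sketched outside Prolog.  The Python below is
only an illustration of is(?Result, +Expression): Unbound, evaluate,
and is_ are hypothetical names, and expressions are written as nested
tuples like ("+", 1, 4) rather than parsed from text.

```python
import operator


class Unbound:
    """Stands in for an unbound logic variable (hypothetical helper)."""

    def __init__(self, name):
        self.name = name


OPS = {"+": operator.add, "-": operator.sub, "*": operator.mul}


def evaluate(expr):
    """Evaluate a "+" mode argument: it must contain no unbound variable."""
    if isinstance(expr, Unbound):
        raise ValueError("Unbound variable: " + expr.name)
    if isinstance(expr, tuple):
        op, left, right = expr
        return OPS[op](evaluate(left), evaluate(right))
    return expr


def is_(result, expr):
    """Mimic is(?Result, +Expression).

    Returns a binding dict when result is unbound, otherwise a bool
    saying whether the bound result equals the computed value.
    """
    value = evaluate(expr)
    if isinstance(result, Unbound):
        return {result.name: value}
    return result == value


print(is_(Unbound("X"), ("+", 1, 4)))  # {'X': 5}
print(is_(6, ("+", 1, 4)))             # False
print(is_(5, ("+", 1, 4)))             # True
```

Calling is_(5, Unbound("X")) raises ValueError here, which mirrors the
"+" restriction on the second parameter.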

I find these distasteful; I wish every parameter were a "?" parameter;
but I guess that's a lot harder to implement and not very useful.

I keep mulling over other approaches here, like how to do this
(log:semantics) in backward chaining, and how to graft it into a
resolution theorem prover.

> If you like, it is as though there is an axiom
> 
> { ?x log:uri ?u.  ?u  string:match "^http://[^#]$" } => { ?x rdf:type doc:Work }.
> 
> (where string:match is a regexp matcher)
> This axioms comes from the URI spec and the specs it references.
> Any semantic web engine can conclude it.  It is not authorized
> by the OWL spec, it is authorized by the URI spec.
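
[Sketch] Read as code, the quoted axiom is a one-line classifier.  A
minimal Python version follows, assuming the intended pattern is
"^http://[^#]*$"; as quoted, without the "*", the pattern matches only
"http://" followed by exactly one character.  is_doc_work is a
hypothetical name, not part of cwm.

```python
import re

# The pattern exactly as quoted, and the assumed intended pattern.
AS_QUOTED = re.compile(r"^http://[^#]$")
ASSUMED_INTENT = re.compile(r"^http://[^#]*$")


def is_doc_work(uri: str) -> bool:
    """Apply the axiom under the assumed pattern: an http URI with no
    "#" is concluded to be a doc:Work."""
    return ASSUMED_INTENT.match(uri) is not None


print(is_doc_work("http://www.w3.org/"))
print(is_doc_work("http://www.w3.org/1999/02/22-rdf-syntax-ns#type"))
print(AS_QUOTED.match("http://www.w3.org/") is None)
```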

[Main Point] This is where we (respectfully) disagree.  I think fragments
should be fragments.  I think a fragment of an http resource (living
document, web page, shared memory location, communication end point,
etc) should be something very like an http resource, not something
suddenly in another domain of discourse.  

RDF should use a mixture of "identifies" and "indicates" as desired,
when it wants to talk about http resources AND other things like 
properties.  It could stick with just "identifies" and use arcs to do
different kinds of indications, as you suggested to Larry, except
we've already put things like
http://www.w3.org/1999/02/22-rdf-syntax-ns#type in http space.  It
could also stick with just indicates(primarySubject), but that might
break too much existing data, including lots of tutorial examples
which talk about web pages.

Of course, I pay for this "fragments are fragments" win (one that
does not break existing RDF documents or software) with the complete
hack of saying RDF uses the presence of "#" in a URI as a flag in the
rule for determining whether a URI is used as an identifier or an
indicator.  My doing this is only marginally better, architecturally,
than saying the presence of "@" somewhere in a URI should be the flag;
it just so happens that "#" does the right thing in a lot of existing
data.
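
The flag rule is small enough to write down.  This Python sketch
assumes the direction of the flag is: "#" present means the URI is
used as an indicator, "#" absent means it is used as an identifier of
the http resource.  rdf_use is a hypothetical name for illustration.

```python
from urllib.parse import urldefrag


def rdf_use(uri: str):
    """Return (mode, document_uri, fragment) for a URI under the
    "#"-as-flag rule.  The mapping of "#" to "indicates" is an
    assumption made for this sketch."""
    document_uri, fragment = urldefrag(uri)
    mode = "indicates" if "#" in uri else "identifies"
    return mode, document_uri, fragment


print(rdf_use("http://www.w3.org/1999/02/22-rdf-syntax-ns#type"))
print(rdf_use("http://www.w3.org/"))
```

Under this rule the fragment stays a fragment of the document; only
RDF's use of the whole URI changes.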

So I see us arguing which is the worse hack: using the "#" character
as a flag, or (your approach) using the content-type semantic
dependency rule to say fragments of RDF documents are not fragments at
all.   For me the deciding use case is that I really want to do
content negotiation between RDF and HTML, which your approach does not
allow.
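
For concreteness, here is roughly what I mean by content negotiation
between RDF and HTML: one URI, two representations, chosen by the
Accept header.  This is a simplified Python sketch; the AVAILABLE
table and its file names are made up, and wildcards such as "*/*" are
deliberately ignored.

```python
# Hypothetical representations of one resource, keyed by media type.
AVAILABLE = {
    "application/rdf+xml": "page.rdf",
    "text/html": "page.html",
}


def parse_accept(header: str):
    """Split an Accept header into (media_type, q) pairs; q defaults to 1.0."""
    prefs = []
    for item in header.split(","):
        parts = [p.strip() for p in item.split(";")]
        q = 1.0
        for param in parts[1:]:
            if param.startswith("q="):
                q = float(param[2:])
        prefs.append((parts[0], q))
    return prefs


def negotiate(header: str, available=AVAILABLE):
    """Return the available media type with the highest q, or None."""
    best, best_q = None, 0.0
    for media_type, q in parse_accept(header):
        if media_type in available and q > best_q:
            best, best_q = media_type, q
    return best


print(negotiate("text/html, application/rdf+xml;q=0.5"))
print(negotiate("application/rdf+xml"))
```

If the meaning of "#foo" depends on which of these comes back, the
same URI reference means two different things depending on the
client's Accept header.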

I think this is more a TAG issue than an RDF issue, which has helped
motivate my discussion here.

> This means that any use of such a URI to identify a car will
> indeed lead automatically to a contradiction the moment the
> ontologies are available and the power is turned on.

[Nit] Oh, if only automated reasoning were so fast and reliable.  :-)

> KR, this is WWW. WWW, this KR.
> 
> Or more strictly, OWL, this is URI. URI, this is OWL.
> Neither an owl reasoner nor libwww has the whole story.

Yeah.  I think I spelled out the areas of overlap (and also the
content-negotiation argument) from a different angle last night [1].

    -- sandro

[1] http://lists.w3.org/Archives/Public/www-tag/2003Jan/0339
Received on Friday, 24 January 2003 12:36:15 UTC