Re: New issue - Meaning of URIs in RDF documents from Sandro Hawke on 2003-07-16 (www-tag@w3.org from July 2003)

From: Sandro Hawke <sandro@w3.org>
Date: Wed, 16 Jul 2003 17:09:30 -0400
To: pat hayes <phayes@ihmc.us>
cc: www-tag@w3.org, timbl@w3.org
Message-Id: <200307162109.h6GL9UtE011183@roke.hawke.org>

Pat, I appreciate your expertise here.  I think TimBL's message
succinctly expresses his intuition about how this can all work out, in
terms familiar to the TAG.  I know you and others see gaping flaws in
his sketch.  (Of course he knows you do, and knows you know he knows,
etc.  See [1] for history links.)  Some of these holes may be
terminological, others architectural.  Some may be repairable, others
fatal.   That's the state of the art, IMHO.

There has been a process question here, which we might express as a
continuum.  At one end, the W3C could convene a series of Semantic Web
Design Workshops to try to come up with a viable design for the Semantic
Web.  At the other end, it could go with TimBL's intuition that the
Semantic Web is deeply coherent with the existing Web, and that URIs
have always been logical constant symbols.  What TimBL's asking the TAG
to do here, I think, is to use his sketch as the initial straw man.
Some of the motivation here is surely the fear that the other end of
the spectrum is a very, very slow road.  More a bog, really.

So if you're willing to help by poking holes in the strawman, offering
new limbs on occasion, or maybe even offering a new strawman, then
great.  Your message tempts one to think you were just trying to burn
the field and barn to the ground; fortunately, I know you better than
that.

I do think the process will work a lot better if the TAG creates a
Task Force of the interested parties here (like you, me, and everyone
else who volunteered in Cambridge and Budapest [1]); I'm pretty sure
they haven't even considered that idea yet, but they will.  Having
such a group be answerable to the TAG seems reasonable to me.
Alternatively, the TAG can keep us all on the outside, where we get
our say via the usual comment mechanisms.  And even more
alternatively, there's nothing to stop anyone else from organizing
workshops and writing papers proposing ways to establish the desired
shared meaning of URIs and/or RDF graphs.

> 1. " each URI
> identify one thing ("Resource": concept, etc)."
> 
> Exactly what is meant by "identify" here is not exactly clear, but if 
> this means something  close to what it usually means then it is 
> simply untenable to claim that all names identify one thing.

I think "identify" == "denote" here, and the "in one interpretation"
is implied.  The main point is that URIs are NOT supposed to be
overloaded to denote more than one thing in any single interpretation.
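
As a rough illustration of that reading (a sketch only, not taken from
TimBL's message or from the RDF semantics documents), one interpretation
can be modeled as a single-valued map from URIs to things.  The example
URIs and the helper name `overloaded_uris` are made up for this sketch.

```python
def overloaded_uris(claims):
    """Return the URIs that are claimed to denote more than one thing.

    `claims` is an iterable of (uri, thing) pairs, all of which are
    meant to hold within ONE interpretation.
    """
    first_seen = {}
    overloaded = set()
    for uri, thing in claims:
        # setdefault records the first thing seen for a URI and returns it;
        # any later, different thing for the same URI is an overload.
        if first_seen.setdefault(uri, thing) != thing:
            overloaded.add(uri)
    return overloaded


# Hypothetical claims: the first URI is used for two different things.
claims = [
    ("http://example.org/car", "the car itself"),
    ("http://example.org/car", "a web page about the car"),
    ("http://example.org/dog", "a dog"),
]
print(overloaded_uris(claims))
```

Two different interpretations may still map the same URI to different
things; the check only applies to claims made within a single one.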

> First, OWL is more than an RDF vocabulary: it is an RDF vocabulary 
> with a particular semantics applied to it. It is the semantics which 
> allows the document (strictly, the RDF graph) to make nontrivial 
> assertions, most of which cannot be made in RDF; so it is the OWL 
> document making the assertions, not the RDF document (true, an OWL 
> document can be described as an RDF document with an OWL semantics, 
> but it is misleading to use the syntactic criterion when we use 
> phrases like "say things about".)

I'm stuck on this problem right now.  I don't know how a user is
supposed to know which semantics they are supposed to use with a given
RDF/XML document.  It seems like the semantics need to be identified
out-of-band, which is a real pain because ... where are you going to put
it when you store the document in a normal filesystem, etc?  This is
about where we are with XML in general these days, with every document
having only application-specific semantics, but I had hoped we could
do better with RDF/XML.  If we can't, we at the very least need some
meta-data protocol for identifying the intended semantics during
transmission via HTTP.
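
To make the out-of-band idea concrete, here is a minimal sketch that
assumes the sender declares the intended semantics as a parameter on the
Content-Type header.  The `semantics` parameter and the example.org URI
are purely hypothetical; no such parameter is defined for
application/rdf+xml.

```python
from email.message import Message


def intended_semantics(content_type):
    """Read a HYPOTHETICAL `semantics` parameter from a Content-Type value.

    Returns None when the sender did not declare one.
    """
    msg = Message()
    msg["Content-Type"] = content_type
    # get_param parses the header's parameters and strips the quotes.
    return msg.get_param("semantics")


# Hypothetical header value naming the semantics by URI.
declared = 'application/rdf+xml; semantics="http://example.org/semantics/owl"'
print(intended_semantics(declared))
print(intended_semantics("application/rdf+xml"))
```

The sketch also shows the storage problem: once the bytes are saved to an
ordinary file, the header and its parameter are gone unless something
else records them.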

I think there are a lot of XML and RDF/XML users who think they have
cross-application semantics provided by XML with namespaces, but there
is conflicting evidence as to whether they actually do, which probably
means they don't.  I'm not really up on the TAG discussions about this
w.r.t. XML in general, though.

> 5. "This information, directly or indirectly acquired, may be
> human-readable and/or machine readable, the latter including for
> example ontological statements in OWL, or rules, or other logical
> expressions."
> 
> This is an extremely contentious and potentially confusing claim. It 
> is *impossible* for software agents to respond to or utilize 
> "information" which is only human-readable: it must be 
> machine-readable. So to lump these categories under the single 
> heading of 'information' is an architectural disaster, if 'recipient' 
> in the previous sentence is supposed to refer to an architectural 
> element (such as an agent of some kind). This point is not new, of 
> course: it has been made already in many intense discussions and 
> debates, many of them archived.

I'm not convinced here.  I think there's a lot of value to a system of
"agents" in which some of them understand a different set of languages
than others.  So some are human, some are machine; of course the
machine ones have different constraints, but some generalities apply
to both -- like the primary idea in TimBL's sketch, that dereferencing
a URI is a good way to get more information.  That may well apply to
both humans and machines, even though the bytes they obtain via
dereference may be different and what they "learn" from those bytes is
probably vastly different.
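
One way this could look in practice, assuming ordinary HTTP content
negotiation is the mechanism (the sketch only builds requests and sends
nothing), is for each kind of agent to ask for a different representation
of the same URI.  The URI and the mapping from agent kind to media type
are assumptions made for illustration.

```python
from urllib.request import Request

# Assumed preferences: a browser-using human wants HTML, an RDF-aware
# program wants RDF/XML.
PREFERRED_TYPE = {
    "human": "text/html",
    "machine": "application/rdf+xml",
}


def dereference_request(uri, agent_kind):
    """Build (but do not send) a GET request for the same URI,
    varying only the Accept header by kind of agent."""
    return Request(uri, headers={"Accept": PREFERRED_TYPE[agent_kind]})


for kind in ("human", "machine"):
    req = dereference_request("http://example.org/ontology", kind)
    print(kind, req.full_url, req.get_header("Accept"))
```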

> 6. "-  the architecture is that a single meaning is given to each URI "
> and
> "- the architecture does not permit the meaning of a URI to be changes
> by consistent misuse by others"
> 
> These are IMPOSSIBLE architectural requirements. There are no precise 
> theories of meaning which make such statements other than fatuous

There may be some wiggle room here.  I've been exploring this as
"namespace distortion" [2], and my best guess right now is to
recommend user agents SHOULD make some (defined) effort to detect
logical inconsistencies between instance data and a (defined) partial
web closure [3] of the instance data, and report such inconsistencies
via warnings to the user.  This might just work, motivating people to
provide suitable retrievable definitional formulas (er, ontologies, in
the general sense), while scaring people away from using URIs
inconsistently.
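
A toy sketch of what such a check might look like, under heavy
assumptions: triples are plain tuples, "retrieval" is a caller-supplied
lookup rather than real network access, the closure is limited to one hop
and a fixed number of documents, and the only inconsistency detected is
an individual typed with two classes declared disjoint.  The function
names and the `ex:` terms are hypothetical and are not taken from [2] or
[3].

```python
import warnings

RDF_TYPE = "rdf:type"
DISJOINT = "owl:disjointWith"


def partial_web_closure(instance, fetch, max_docs=10):
    """Gather triples from documents retrieved for the terms used in
    `instance`.  `fetch(term)` returns a list of triples or None."""
    closure = set()
    fetched = 0
    for term in sorted({t for triple in instance for t in triple}):
        if fetched >= max_docs:
            break
        doc = fetch(term)
        if doc:
            closure |= set(doc)
            fetched += 1
    return closure


def find_inconsistencies(instance, closure):
    """Return (individual, (class_a, class_b)) for every individual
    typed with two classes that the closure declares disjoint."""
    disjoint = {frozenset((s, o)) for s, p, o in closure if p == DISJOINT}
    types = {}
    for s, p, o in instance | closure:
        if p == RDF_TYPE:
            types.setdefault(s, set()).add(o)
    problems = []
    for individual, classes in types.items():
        for pair in disjoint:
            if len(pair) == 2 and pair <= classes:
                problems.append((individual, tuple(sorted(pair))))
    return sorted(problems)


def warn_user(problems):
    """Report each inconsistency as a warning rather than an error."""
    for individual, (a, b) in problems:
        warnings.warn(f"{individual} is typed as both {a} and {b}, "
                      "which are declared disjoint")


# Hypothetical data: the definitions retrievable for ex:Dog say that
# dogs and cats are disjoint, but the instance data types ex:fido as both.
definitions = {"ex:Dog": [("ex:Dog", DISJOINT, "ex:Cat")]}
instance = {("ex:fido", RDF_TYPE, "ex:Dog"), ("ex:fido", RDF_TYPE, "ex:Cat")}

closure = partial_web_closure(instance, definitions.get)
problems = find_inconsistencies(instance, closure)
warn_user(problems)
```

Reporting through warnings rather than exceptions matches the SHOULD
above: the agent keeps working but tells the user something is off.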

> 7. "The community needs
> 1) A concise statement of the above architectural elements from
> different specs in one place, written in terms which the ontology
> community will understand, with pointers to the relevant specifications."
> 
> Maybe, if I could make the suggestion without seeming to commit 
> lese-majesty, it would be a good strategy for the W3C, rather than 
> trying to render nonsense "in terms that the ontology community will 
> understand", to ask if it might possibly learn something from 
> actually *listening* to the ontology community; or at any rate, to 
> anyone with a grasp of basic 20th-century results in linguistic 
> semantics.

It can be hard to know in advance which "basic results" from which
fields are truly applicable.  The IETF culture makes little reference
to philosophical, linguistic, or even mathematical results, beyond
basic engineering.  And there's the web's founding mythos about how
everyone said it could never work, etc, etc.  (This was nicely
rehashed by the non-TimBL keynote speakers at WWW2003 each saying what
they thought of the Web when it first appeared.)  It can be hard to
translate and summarize an entire field of study into a compelling few
test cases, but that might just be the best approach here.   I'll try
to help, though my grasp of some of the issues is still tenuous.

        -- sandro

[1] http://esw.w3.org/topic/SocialMeaningGroup
[2] http://esw.w3.org/topic/NamespaceDistortion
[3] http://esw.w3.org/topic/WebClosure
Received on Wednesday, 16 July 2003 17:09:35 UTC