RE: web proper names redux from Patrick.Stickler@nokia.com on 2004-09-27 (www-rdf-interest@w3.org from September 2004)

From: <Patrick.Stickler@nokia.com>
Date: Mon, 27 Sep 2004 09:41:20 +0300
To: <hhalpin@ibiblio.org>, <www-rdf-interest@w3.org>
Cc: <ht@inf.ed.ac.uk>
Message-ID: <1E4A0AC134884349A21955574A90A7A50ADCDE@trebe051.ntc.nokia.com>
> -----Original Message-----
> From: www-rdf-interest-request@w3.org
> [mailto:www-rdf-interest-request@w3.org]On Behalf Of ext Harry Halpin
> Sent: 26 September, 2004 02:04
> To: www-rdf-interest@w3.org
> Cc: ht@inf.ed.ac.uk
> Subject: web proper names redux
> 
> 
> 
> Thanks everyone on this list who discussed the Web Proper 
> Names proposal. 
> A number of questions have been proposed, and the discussion
> went off to discuss a number of related proposals such as 
> URIQA that have
> differing methods for solving the problem of the potential ambiguity 
> between when a URI is used to represent a referent (such as me) or a 
> representation of a referent (such as a webpage). As put by David 
> Menendez, "As I see it, some URIs identify web pages and 
> other identify 
> abstract, non-web-page things".
> 
> For those interested, I'm going to clarify the options for 
> those wishing 
> to make the distinction (or similar ones, such as that 
> between a resource 
> itself and those representations which it may return). Then I 
> will take
> leave of what many may consider a philosophical rat-hole, and 
> any further
> communication with me on this subject should take place via 
> e-mail since
> I'm sure the www-rdf-interest has more things to discuss at 
> this point:

I don't want this to turn into an endless, circular debate, but
I did want to offer a few final comments/clarifications
on your final comments/clarifications....  ;-)

> 1) The problem could be solved by using a new URI Scheme, such as 
> the contextualized wpn:// or the less contextualized Larry Masinter
> tdb:// proposal (http://larry.masinter.net/duri.html). 

This isn't really a "solution" at the RDF level, since URIs are
fully opaque, and thus, one is not licensed to examine the URI
scheme to make decisions regarding the meaning of a given URI,
insofar as the RDF MT is concerned. True, some people do that, but
that is non-conformant and potentially dangerous behavior for
a SW client.

The explicit URI scheme can help humans, to some degree, but
IMO it offers less utility to automated agents operating
in accordance with the RDF MT as it makes it harder to dereference
the opaque URI to (hopefully) obtain information regarding its
meaning.

So, from the perspective of an automated SW agent, new URI
schemes do not help, but only hinder on-the-fly discovery
of meaning.

> 2) The problem could also be solved by allowing a 
> representation itself
> to be used to denote a referent, and the Expanded Web Proper Name 
> format does that with the goal of interoperability in mind. 
> http://www.cogsci.ed.ac.uk/~ht/webpropernames/
> 3) Jon Hanna had an RDF schema (rough draft) for 
> distinguishing between
> resources and representations:
> http://www.hackcraft.net/rep/rep.xml 
> 4) Thomas Passin had a few RDF predicates that could help:
> 	 subjectIsTheThingReturnedByThisURI,  
> 	theDocumentAtThisUriDescribesTheSubject, 
> 	TheDocumentAtThisUriIsAboutTheSubject - which were not discussed
> much further, although it's another route.

I see both options 3 and 4 above as compatible with the URIQA
approach, in that, one makes explicit RDF statements about the
resources denoted by various URIs, and the relationships between
resources, and that information is obtainable by dereferencing
those URIs -- ideally in a functionally explicit and robust manner,
such as via URIQA (but other approaches are also possible).

Nevertheless, the URIs remain opaque insofar as the SW machinery
is concerned, and the URI scheme should facilitate effective access
to information about the resource denoted by the URI.

Again, a non-http URI hinders, rather than helps, the solution.

> 5) The problem could also be solved by simply grounding a URI in "a 
> RDF graph where the terminal nodes are either URI references, 
> literals, or anonymous nodes not serving as the subject of any 
> statement", which might be even easier with a few new HTTP methods, 
> as suggested by URIQA and Patrick Stickler.
> ( http://swdev.nokia.com/uriqa/URIQA.html)
> 
> I tend towards the human-readable representation viewpoint to 
> solve the 
> "URI-grounding" problem, 

Firstly, I consider the URI-grounding problem to primarily be
one for machines, not humans -- in that, for the SW, if machines
cannot discover the meaning of newly encountered terms, it will
grind to a halt. Having to keep humans in the loop to continually
research new terms and modify the machine-readable knowledge of
such agents, rather than enabling them to expand that knowledge
independently, will mean the SW will never scale, reach critical
mass, or become globally ubiquitous.

Also, there is no reason whatsoever why there cannot be both a human
readable representation and a machine readable representation.

Consider the results of the following two resolution requests based
on the very same URI, the first of which provides a traditionally
GETable human-friendly representation, the latter providing a
URIQA MGETable machine-friendly description:

GET /FN-1/published HTTP/1.1
Host: sw.nokia.com

(e.g. curl -L "http://sw.nokia.com/FN-1/published")

versus

MGET /FN-1/published HTTP/1.1
Host: sw.nokia.com

(e.g. curl -X MGET "http://sw.nokia.com/FN-1/published")

Thus, the deployment of URIQA in no way impacts how representations 
of any kind are published on the web, and simply adds a means
for software agents to obtain machine-readable descriptions
regardless of whatever human-readable (or other) representations
might be published alongside such descriptions.
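
The GET/MGET contrast above can be tried locally with a small Python
sketch. Everything in it is an assumption for illustration: StubHandler
is a hypothetical stand-in for a URIQA-aware server (not the sw.nokia.com
implementation), the response bodies are placeholders, and the content
types are merely plausible choices. It only shows that the same path can
answer two methods with two different kinds of response.

```python
import http.client
import threading
from http.server import BaseHTTPRequestHandler, HTTPServer


class StubHandler(BaseHTTPRequestHandler):
    """Hypothetical stand-in for a URIQA-aware server; bodies are placeholders."""

    def _reply(self, content_type, body):
        data = body.encode("utf-8")
        self.send_response(200)
        self.send_header("Content-Type", content_type)
        self.send_header("Content-Length", str(len(data)))
        self.end_headers()
        self.wfile.write(data)

    def do_GET(self):
        # Ordinary GET: a human-friendly representation.
        self._reply("text/html", "<p>human-readable page for %s</p>" % self.path)

    def do_MGET(self):
        # http.server dispatches on the method name, so MGET requests land here.
        self._reply("application/rdf+xml", "<!-- RDF description of %s -->" % self.path)

    def log_message(self, format, *args):
        pass  # keep the demo quiet


def request(method, host, port, path):
    """Send one request and return (status, content type, body)."""
    conn = http.client.HTTPConnection(host, port, timeout=5)
    try:
        conn.request(method, path)
        resp = conn.getresponse()
        return resp.status, resp.getheader("Content-Type"), resp.read().decode("utf-8")
    finally:
        conn.close()


def demo(path="/FN-1/published"):
    """Start the stub on a free local port and ask for the same path twice."""
    server = HTTPServer(("127.0.0.1", 0), StubHandler)
    worker = threading.Thread(target=server.serve_forever, daemon=True)
    worker.start()
    try:
        port = server.server_address[1]
        return (request("GET", "127.0.0.1", port, path),
                request("MGET", "127.0.0.1", port, path))
    finally:
        server.shutdown()
        server.server_close()


if __name__ == "__main__":
    for result in demo():
        print(result)
```

The client side is the part that matters here: http.client passes the
method token through unchanged, so an agent needs nothing beyond an
ordinary HTTP library to issue an MGET.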

> although I find the URIQA model 
> (minus the new 
> http) and the RDF predicate/schema models also interesting avenues. 
> 
> There are definitely some differing views on how URIs work. 
> In the ideal
> semantic web world, as Patrick Stickler put it, "URIs should 
> not be used 
> to denote more than one thing. Period. 

That part, at least, is defined explicitly by the RDF MT, and
not simply my personal opinion.

> Only the creator of a 
> URI can say 
> what it denotes." Then, as pointed out, this excludes us from
> making statements about documents on the Web, especially when 
> the creator
> does not say what it denotes or I wish to use a  human-readable 
> representation to make clear what is being denoted. 

Not at all, and there were explicit examples of how one could
use RDF, with anonymous nodes and typed literals, to say whatever
one would like about the representations (entities) accessible
via a particular URI, irrespective of what that URI actually
denotes. E.g.

   _:x ex:accessedVia "http://example.com/someThing"^^xsd:anyURI ;
       ex:sizeInBytes "1234"^^xsd:integer ;
       ex:accessedAt "2004-09-26T11:36:09Z"^^xsd:dateTime .

And given

   _:x ex:accessedVia "http://example.com/someThing"^^xsd:anyURI .

we can, based on what we know about the foundational web architecture,
infer that

   _:x a ex:Representation ;
       ex:representationOF <http://example.com/someThing> .

Thus, while we may have no clue what <http://example.com/someThing>
actually denotes (means), we can describe the representation (entity)
obtainable at a given point in time when dereferencing that URI. No
problem at all.
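
As a rough illustration of that inference step, here is a minimal
Python sketch. It assumes triples are modelled as plain tuples and typed
literals as (lexical form, datatype) pairs; infer_representations is a
hypothetical helper, not part of any RDF library, and the ex: terms are
the illustrative ones used above.

```python
XSD_ANYURI = "xsd:anyURI"


def infer_representations(triples):
    """Apply one hypothetical rule to a list of (subject, predicate, object) tuples:

    ?x ex:accessedVia "u"^^xsd:anyURI
      =>  ?x a ex:Representation ; ex:representationOF <u> .
    """
    inferred = set()
    for subject, predicate, obj in triples:
        # Typed literals are modelled as (lexical form, datatype) pairs.
        is_typed_literal = isinstance(obj, tuple) and len(obj) == 2
        if predicate == "ex:accessedVia" and is_typed_literal and obj[1] == XSD_ANYURI:
            lexical_form = obj[0]
            inferred.add((subject, "rdf:type", "ex:Representation"))
            inferred.add((subject, "ex:representationOF", "<" + lexical_form + ">"))
    return inferred


# The three statements about the anonymous node _:x from the example above.
observed = [
    ("_:x", "ex:accessedVia", ("http://example.com/someThing", "xsd:anyURI")),
    ("_:x", "ex:sizeInBytes", ("1234", "xsd:integer")),
    ("_:x", "ex:accessedAt", ("2004-09-26T11:36:09Z", "xsd:dateTime")),
]

if __name__ == "__main__":
    for triple in sorted(infer_representations(observed)):
        print(triple)
```

Note that the rule never inspects the URI itself beyond copying it, so
nothing is assumed about what <http://example.com/someThing> denotes.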

To what extent one might be able to guess about the nature of the
thing denoted by <http://example.com/someThing> based on any information
embodied in _:x is an entirely different issue.

> Phil 
> Dawes suggested
> "They are solved in the same way as they are solved in real 
> life - using
> context." WPNs formalize some context, and so do others, and 
> this problem 
> is notoriously hard. You could always,  as Graham Klyne 
> points out, use a 
> # at the end to make the differentiation. Jon Black notes 
> that "When a URI 
> denotes, it does so because everyone in a group knows what it 
> denotes", 
> which as he notes is difficult if not impossible in a completely open 
> system such as the SW. Hamish returns to the point of 
> representation, "To 
> humans, being able to dereference a URI and find some 
> explanation of what 
> that URI qua symbol is intended to indicate is very valuable 
> indeed, and I was starting from the assumption
> that http URIs would be used, as symbols, to indicate non-web 
> resources.", 
> and David Menendez follows "there's no reason that a software
> agent couldn't do the same." So clearly there's no consensus on this
> issue and exactly how URIs should be used in the SW 

It's probably fair to say that there is not yet a full and mature
consensus about these issues -- but you seem to suggest a far more
chaotic condition with little to no consensus at all.

I think that most folks agree about far more than they disagree.

And I also think that a very large majority of folks feel that
new URI schemes intended to capture semantics of usage are not
the best approach to these kinds of problems.

> - perhaps 
> we can only 
> hope for some Best Practice guidelines from above to clarify 
> the issue, or 
> see empirically how it works in the coming months and years. 

And we have a W3C working group to do precisely that! Woohoo ;-)  
             
(see http://www.w3.org/2001/sw/BestPractices/)

> Again, I would say interoperability between any type of ontology or 
> metadata is going to be difficult, especially where the 
> semantics are unclear, humans may not be too careful about their 
> use of statements (or worse, machines making statements  
> automatically!), and so on. The URIQA proposal has an algorithm for
> grounding RDF in other RDF, while the WPN proposal could ground RDF
> statements in human-readable representations that are easy to build
> and compare. 

I consider the URIQA approach to provide access to machine readable
descriptions of resources *in parallel* with access to human readable
representations of those same resources -- both via the same URI,
and using the presently deployed and proven HTTP infrastructure
(it is not either/or, as you seem to suggest above).

> So one could clarify a statement given in FOAF such as:
> 
>  <foaf:image rdf:about="http://www.ibiblio.org/hhalpin/homepage/images/harrytrain.png">
>    <foaf:depicts rdf:resource="http://www.ibiblio.org/hhalpin.wpn"/>
>  </foaf:image>
> 
> 	Where http://www.ibiblio.org/hhalpin.wpn is a google on myself
> where I've verified and collected useful web-pages with info 
> about me on 
> them, collating them into a EWPN.

And how/where would that statement above be obtained? If either
a human or SW agent encountered the URI 

   <http://www.ibiblio.org/hhalpin/homepage/images/harrytrain.png>

how would it obtain any reliable machine-readable information about
it? How would one find out that that resource foaf:depicts some
other resource? How does WPN offer any solution to that fundamental
bootstrapping problem? With URIQA, one could simply ask

MGET /hhalpin/homepage/images/harrytrain.png HTTP/1.1
Host: www.ibiblio.org

and (if the server is URIQA-enlightened and a description is published)
a formal description about that resource will be provided.

And if such descriptions are obtainable via URIQA requests regardless
of the URI (i.e. whether it is a "web proper name") and one can use
RDF to say anything whatsoever about the resource in question, using
whatever vocabularies/ontologies one likes, where does WPN offer any
explicit utility?

Thus, from my viewpoint, *every* URI is a WPN, and folks simply have
to use the right URI when talking about resources and not be sloppy
and overload or carelessly misuse URIs when communicating.

> 	 We'll revise WPN a bit when we have some working code and have 
> fully digested all the comments we have received. And as 
> pointed out by 
> the "OWL and the Real World" discussion, this is difficult 
> going...but rewarding, since I think these types of proposals show how
> a SW might communicate and grow, and as we put it, to avoid 
> the "real risk 
> that the Semantic Web will consist of a vast number of 
> self-consistent but 
> mutually incommensurable collections of metadata."      

Well, I don't myself expect that to happen, so long as folks follow
the (clearly and strongly emerging) best practice that a URI denotes
one and only one resource and folks should be sure about what a URI
denotes before they use it to make statements.

Given a reasonable majority of statements following that best practice,
and a reasonable number of central ontologies to which more specific,
localized ontologies are related, I think we'll achieve a pretty
fair amount of interoperability.

Cheers,

Patrick

      
>                                                                     
>   
> 			thanks everyone,
> 
> 				-harry
> 
> 
>
Received on Monday, 27 September 2004 06:42:10 UTC