- From: Patrick Stickler <patrick.stickler@nokia.com>
- Date: Fri, 25 Jan 2002 12:55:03 +0200
- To: ext Mark Baker <distobj@acm.org>, <www-archive@w3.org>
On 2002-01-24 6:13, "ext Mark Baker" <distobj@acm.org> wrote: >> Furthermore, I find that far more people that I encounter, >> particularly those working on semantic web applications, >> subscribe to the classical view, particularly software >> engineers building web applications that use URIs extensively, >> and who want/need a logical and formal taxonomy of URI classes. > > That's unfortunate, because now that the W3C and IETF have agreed to > use the contemporary view, the nomenclature, and the direction of > future work on URIs will follow it. Well, I think that there is sufficient contention about the recent "clarification" that both the W3C and IETF will be hard pressed to impose it upon the world at large. It may, unfortunately, flavor some impending work, but I thing that there's going to be alot of friction in moving in the classicial view's direction. >>> I see it this way (which is also the way that Tim Berners-Lee, >>> Dan Connolly, and Roy Fielding see it - the three people most >>> responsible for the Web as we know it); >> >> With all due respect to all three individuals, and many others, >> I think that they are missing something fundamentally important. >> (Even the gods can be wrong now and then) > > That's certainly possible. But when these three agree, I have not yet > *ever* found them to be wrong. There's always a first time ;-) And I find that the contemporary view (and classical view) disregards any notion of non-resolvable identifier (URP) as well as the need by software (not humans) to differentiate between true access problems and non-accessible resources. I.e., the contemporary view does not address the needs of semantic web agents -- which is amazing, since the SW is Tim BL's latest "thing". >> I see the "contemporary view" as a transient fad that >> will pass, leaving the classical view to continue on >> its merry way towards a global semantic web. > > It is doubtful, because all future IETF/W3C work will use it. Only if the "customers" of IETF and W3C agree, and there seems to be a growing population who, now that they are beginning to understand the debate, are not too happy with the fuzzy, ambiguous, and high-overhead approach outlined by the contemporary view. It seems that the only folks these days who are true proponents of the view are either "the three" or their devoted disciples. Hours of discussion at e.g. XML 2001 showed either total ignorance of the issue (just eat what the W3C/IETF feed you) or preferance of the classicial view -- and numerous times folks were "shocked" or "amazed" when I explained what the contemporary view really means for web applications. >> Just create your names using UUIDs, describe them as you like, >> and for those that denote web resources, use DDDS to map them to >> some address. Eh? >> >> After all, a name is just an opaque identifier that "is" what >> you say it is. > > That's right. The only issue is that http: as most of the same > properties as uuid:, plus it's so well deployed and understood. > I don't see a reason to go with an alternative that has no > advantages over the incumbent. Because 'http:' URLs are not temporarally unique. It's that simple. If you want non-fragile, persistent identifiers, you can't rely on 'http:' URLs. Also, it's an issue of whether the resoulution agency is explicit in the URI or whether it is left undefined, for determination at time of interpretation based on contextual information. So, those are two very valid and deciding factors for choosing something other than 'http:' URLs for many applications -- and there are other factors besides. >> Furthermore, are the proponents of the contemporary view completely >> *blind* to the confusion that exists in the larger masses of web >> users regarding 'http:' and other URLs that don't resolve because >> they don't denote digital resources (e.g. XML Namespaces, vocabulary >> terms, etc. etc.) as well as the confusion between vocabulary URLs >> and schema URLs and the total incompatability with such an approach >> for multiple schemas using the same vocabulary?! It appears so. > > I'm part of that community. There is definitely lots of folks wondering > why it is that they're using HTTP URIs, but I've only ever seen one > example of people having problems because of it (that was when > Netscape removed their RSS DTD and the RSS processors were all > validating and didn't cache the DTD - duh!). In the XML world, regarding the (perceived) relation between namespace URIs, schema identity, schema location, vocabularies, etc. it is a *huge* mess. Likewise, in the RDF world, differentiation between web page, owner of web page, and reification of URIs is a big mess. It seems that (and this may be way off base) that most of the participants in the classical versus contemporary debate are working only with browsers -- and that there is little input from folks working in other application areas which have very different (even if partially intersecting) needs. >>> I can do this for more than just markbaker.ca URIs. You and I can >>> have a conversation about http://www.ibm.com without invoking GET. >> >> No. You can only have a conversation about the web resource accessible >> at http://www.ibm.com, which is neither the URI 'http://www.ibm.com' >> nor the company 'IBM Inc.'. You can achieve this fundamental >> distinction by using URI schemes that embody the key semantics: >> >> http://www.ibm.com = a web resource >> uri:http://www.ibm.com = the URI for a web resource >> auth://ibm.com = a (semantic) web authority/entity >> >> Now, and only now, can we actually discuss these three things >> in a clear and consistent manner. > > In plain english, you and I can presumably manage to agree on what we're > talking about. I wasn't clear above, but we could have decided upon > either of those things you listed there. > > Now, the big issue is how is *software* supposed to know what we're > talking about. Using a new URI scheme for these concepts is one way, of > course. The problem with it is that this knowledge has to be hard coded > into URI processors. Absolutely not. That knowledge can be defined externally, by schema (e.g. RDF or otherwise) which can be retrieved and consulted at whatever frequency the application (or human controlling the application) feels is optimal. > So what happens when we want to make additions or > changes to the taxonomy? How do we deploy them? Do we require that > everybody download new software? No. Just fetch the latest schema(s). > What TimBL suggests is that we keep the URI as opaque as possible, and > treat any extended information (assertions) about them as just another > resource on the Web. But without explicit differentiation between URI schemes by common semantics and purpose, such global economy of definition can never be attained -- rather, such knowledge has to be defined for *every* instance of a URI. > Yes, all of this is quite accurate. It's just not practically > extensible. I completely disagree. In fact, the whole point is that it *is* practically extensible and scalable. The contemporary view is like saying that the semantics of every XML instance should be defined for each instance, rather than for a document type for which there is a single global schema. I.e., it's madness and terrible engineering. >> Now, playing devil's advocate to my own arguments, I will >> concede that one could have different 'http:' URLs to >> capture the distinctions provided by my separate URI schemes, >> but there still remains the problem that in such a scenario >> URLs would be used for non-digital, non-accesible resources, >> and thus, the fair and resonable expectation by both a human >> and an application that a URL provides access to a web >> resource is violated. See my definition of the "Neo-Classical View" posted to the URN and URI lists. I think it provides a satisfactory solution for this. >> Again, how is an application (or person) supposed to know >> that a failure to resolve is intended/expected rather than >> due to some actual problem accessing a web resource? > > HTTP response codes say all that needs saying, I believe. But only *if* a resource is expected to be retreivable. There is no response code I am aware of that says "this resource is not a retrievable resource" -- and even if there were, you would have to define that response for *every* non-retrievable URL, a completely unacceptable and untenable overhead. > W3C policy on > this is reasonable; any w3.org URI used in a specification *must* be > resolvable to something that defines what it means. But then you are getting something that is *not* the resource itself, and the application cannot know that -- unless it understands what the returned resource says. I.e., the contemporary view, and the W3C policy is intended for humans, not for software agents. It does not and cannot meet the needs of semantic web agents. > For others, this is a good policy, because you're right, what if > somebody is told that URI means something, but it returns a 404 when > resolved? There's an inconsistency there, but it is easily resolved > by putting a few words at the other end. It's quite a cheap process. No, it is a terribly expensive process -- as you must then define some standardized ontology that expresses the fact that a non-retrievable resource was not returned, but what was returned was not the resource but another resource that explains the non-retrievability of that first resource -- but then, what if you want an agent to actually retrieve that explaination, but not interpret it as an explaination about the first resource! Just define URI Classes for retrievable versus non retrievable resources, and then an application knows right up front, before even trying to resolve, whether that URI denotes a retrievable resource or not. *That* is much cheaper. >>> Is "foo://www-markbaker-ca/James/" an URL or an URN? You don't know, >> >> You *could* know, if you said something like >> >> <rdf:Description rdf:about="foo:"> >> <rdfs:subClassOf rdf:resource="voc://ietf.org/URI-Taxonomy/URI/URN"/> >> </rdf:Description> > > Yes. But until you know more about it, what do you call it? A URI. > And you only know something is an URN if it's in the urn: scheme. I don't accept that. I can state that any URI scheme is a URN scheme, not only the 'urn:' scheme. Neither the W3C nor IETF can forbid the definition of URI schemes that are ascribed the semantics and behavior of URNs. >> Of course, since your son actually *isn't* a web resource, that > > Sure he is. He's a "thing with identity", which is the only > requirement for something to be a Web resource in the http: URI > scheme. No. A resource may be anything. But a *web* resource must be digital and accessible. Of course, the RFC's do not make this either clear nor consistent, contradicting themselves even in the very same RFC, but the distinction is valid and necessary. >> *You* may understand that URI to denote your son. And some other >> human may be able to discern from its mnemonic characteristics >> that it likely denotes a human, but a computer just sees bits. > > Right. And those bits may express an assertion that he's a human: > > <http://www.markbaker.ca/James/> rdfs:subClassOf :Person Again, I'm not saying that you can't use a URL to denote a non-digital resource, but you will confuse both humans and applications which expect URLs to be dereferencable to the resource that is named by them, and you cannot (at least for the present) dereference your son from a URL -- only a description about him, or some other resource that is not your son. Folks misuse URLs every day, but that is why there are so many problems in areas such as XML namespaces and RDF. Patrick -- Patrick Stickler Phone: +358 50 483 9453 Senior Research Scientist Fax: +358 7180 35409 Nokia Research Center Email: patrick.stickler@nokia.com
Received on Friday, 25 January 2002 05:59:55 UTC