- From: Paul Prescod <paul@prescod.net>
- Date: Wed, 28 Jun 2006 20:46:36 -0700
- To: "Pat Hayes" <phayes@ihmc.us>
- Cc: "Harry Halpin" <hhalpin@ibiblio.org>, www-tag@w3.org
- Message-ID: <1cb725390606282046r4f696500md5990b7dceb1f0ed@mail.gmail.com>
On 6/28/06, Pat Hayes <phayes@ihmc.us> wrote:
>
> >Im sure it can often help, but a problem arises when someone insists
> >that there *must* be something there, because there are going to be
> >many cases where it is hard to impossible to provide anything useful,
> >so what will be provided will in fact not be useful, but providing it
> >will nevertheless absorb a lot of effort, the cost of which is a
> >brake on development and deployment.
> >
> >
> >This is the heart of the argument. What examples do you have?
>
> Take almost any URI reference in any OWL ontology, for example
>
> http://www.w3.org/TR/2003/PR-owl-guide-20031209/wine#madeFromGrape
>
> Now, what that *means* is a binary relation
> between wines and grape types. There is no way to
> put that meaning at the other end of a
> dereferencing process.

I don't think it will be possible to put the complete meaning of the referent of an identifier online in my lifetime. What does the Constitution of the United States mean? But you should put as much there as you can express in the time you have available.

And in this particular case, whoever created that document DID put something that they know about it there. I can traverse the link you gave me and see that the identifier is for a binary relation and infer some other stuff from other things on that page. So I don't understand how this is an example of something that it is "hard or impossible to provide a useful description of."

There is a URI. It is used as a name. It is possible to dereference it in a web browser or program and get useful information. The system is working exactly as it should.

Do I misunderstand how this is a counter-example, or do you have a different counter-example that would be better?

> No, that is not the point. I know a lot about it
> (let us suppose) which is why Im writing an
> ontology.
> But the connection between this name
> and its ontology is not that the latter is at the
> other end of an HTTP GET starting with the
> former, it is that the ontology *contains* the
> name itself.

What's wrong with the example you cited (other than that it seems to strengthen my argument)? The ontology contains the name (as you want) AND the ontology is at the other end of a GET (as I want).

But, more important: the system is not closed. That ontology is not the sum total of universal knowledge about grapes. People will make other documents served by other URLs and sent around through emails and written on napkins. If I discover one of these and see the string #madeFromGrape I'd like to get whatever information I can about it. Google may help, but it might also not have indexed the key document that I would want to read: the one that describes what the person who minted the identifier #madeFromGrape thought it was about.

When I find one of those, might it not be helpful for me to copy a URI out of a document, paste it in my web browser address bar and learn something (anything!) about it?

Similarly, might it not be useful if I load one of these documents into my semantic web browser for it to say: "This document indicates that there might be other useful information about #madeFromGrape at a particular URI. I've checked that URI and it turns out that there is information there. Would you like me to incorporate it into the triple database?"

> It is the surrounding text which
> embodies the meaning, not something at the other
> end of a Web dereferencing process. The name, in
> cases like this, gets it meaning from the way it
> is used inside what amounts to a large data
> structure, which is the RDF graph of which the
> OWL/XML text document is a handy rendering
> (representation, in the REST sense?).
> The Web is
> relevant to this only insofar as it allows these
> graphs/texts to be transmitted, combined and
> used, but it adds nothing to the way that the
> graphs/texts determine the *meaning* of the names
> which occur in them.

If you found a print-out of one of these documents by the side of the road, how would you begin the process of reconstructing more meaning than was available in that single document? Google would be one tool. Wouldn't you use a browser's address bar as another?

Recall a few important facts about Google:

* it is owned by a third party: neither the minter of the URI nor the person trying to learn about it.
* it is not designed to be reliable for any invariant definition of reliable -- they constantly tweak the algorithm.
* it is not meant to be used by computer processes like semantic web browsers

> To make the point more forcefully: Imagine an OWL
> ontology located at http://ex.place/foo.html
> which when you look at it you discover that all
> the names in it have the base URI
> http://ex:otherplace/baz. This might be slightly
> discourteous, but it is perfectly legal and would
> not cause any SWeb engines to miss a beat. In
> fact, most of them wouldn't even be aware of it.
> And as for human readers, if they are looking at
> the name, then they are already looking at the
> text which tells them as well as anything can
> tell them what the name means, viz. the text of
> the ontology itself.

Fine, so why not put a copy or a redirect at http://ex:otherplace/baz? When someone emails me the document I will have lost the URI http://ex.place/foo.html . When I want to see whether you've updated it and provided more information, I will go to "otherplace"?

I don't want to argue that the system CANNOT work as you describe. It is just less convenient to the recipient and therefore impolite. You've demonstrated that the web does not depend upon what I might call "link-oriented self-descriptiveness" (described further down).
But you haven't shown that such a property is not valuable _when it is used_.

> > >It helps to make the Web be "self-describing", although the notion of
> >>"self-describing" is something I think is another notion that could
> >>really use some inspection.
> >
> >I'd sure like know what it means, myself :-) Can you elaborate?
> >
> >
> >Self describing means that a reader can start by
> >looking at some data and follow links backwards
> >to the specifications that define the intended
> >meaning of the data.
>
> Yes, I thought that was perhaps what it was
> supposed to mean. Tim BL explained this idea to
> me a few years ago. I don't buy it. First, its
> just not true, and the Web seems to work just
> fine whether its true or not, which suggests it
> is more dogma than theory.

No, the Web has always used SGML-based markup and the term "self-describing" (as used in this context) comes from the markup world (AFAIK). The meaning seems very intuitive to me. You probably have never heard of the document type I am about to reference but you can figure out something about it pretty easily.

<locality xmlns="urn:oasis:names:tc:ciq:xsdschema:xNAL:2.0"/>

That URN is not dereferenceable in any software I know of, but any document embedding it is self-descriptive insofar as the creator of the vocabulary put in information specifically designed to help you find out what the element means. It would be a lot better if there were some straightforward way to get from the namespace to the human description and machine-readable schemas for the vocabulary, but the URN is better than nothing (which is typically what you got in binary file formats). It should only take you an hour or so of poking around the OASIS site to figure out what this locality element is all about.

> Second, are we talking
> here about human readers or software?

Both.
> The SWeb is
> supposed to be usable by software agents which
> are not usually capable of reading a W3C spec
> document and wouldn't be able to do anything with
> it even if they could. In fact, most human
> readers are in the same position most of the time.

"Most" human readers of raw RDF data are in that position? If we put aside those that click on the wrong link and end up somewhere confusing, I would posit that most readers of RDF would appreciate at least the opportunity of reading a specification of what they are looking at. Or at least a web page that would point them towards the spec and other tutorials.

(raises an interesting point...maybe specs should point to pages that index tutorials so that people lost in a spec can back out and find something more helpful for them)

> >With raw XML, the tags are "links" to English word meanings
>
> XML tags are linked to English word meanings???
> Where are these word meanings, and how does one
> link to them? Do they have URIs?

No. Self-describing does not depend on the Web. I'm pretty sure the concept predates the Web. The web enhances self-descriptiveness.

> >which are much more helpful than bit patterns.
> >With (for example) HTTP-identified namespaces
> >you have actual links to resources that might
> >describe the meanings of the words in a human or
> >machine-processable language.
>
> Might, yes. In fact do, only rarely. And as I
> say, it doesn't seem to matter a tinkers toss
> whether they do or not.

I disagree. Going back to the URN example above. I specifically asked the creator of that URN to create a resource that would allow me to efficiently research the meaning of the document. I did that just last week because I was having a tremendous amount of trouble finding the meaning just using the usual tools. As you've probably already noticed, Google gives you very little helpful information. I wasted many hours clicking links, searching specs, unzipping files.
Now the first step would be for a decent resource to exist AT ALL. Then the next logical thing would be for the resource to exist at an HTTP URI pointed to from the referring document. (in the end I was pointed to comments in a schema in a zip file, which is not my idea of a very discoverable resource...)

> >In short, a self-describing message or document
> >points from the message towards the spec whereas
> >most messages or documents require you to find
> >the message or document using some out-of-band
> >mechanism. "This file starts with the characters
> >MZ. I wonder what file type this is?"
>
> I find the best way to find out is usually to try
> Google. So, is this a Web architectural principle
> at work? Is Googling a kind of link following?

Yes. Google is an often inefficient kind of link-following tool appropriate to human beings. Googling for MZ or PK will find you nothing useful because the file formats that start with those characters are not even self-describing in the pre-web standard of "providing enough information for you to research it easily."

I would say that there is kind of a hierarchy of politeness when it comes to self-description:

1. Make it easy to Google or research the message's syntax and meaning
2. Make it possible to learn more about the message's syntax and meaning by following URLs (that's what the Web is for, after all)
3. Make it possible for machines to learn something useful for processing the message (whether it be a schema, or a stylesheet or anything else that can be fairly safely downloaded and evaluated).

One can easily define "self-descriptive" in a way that makes neither XML nor any other language in the universe self-descriptive:

http://72.14.207.104/search?q=cache:ZwMXp9SCsvgJ:www.oceaninformatics.biz/publications/e2.pdf+self-describing&hl=en&ct=clnk&cd=1&client=firefox-a

But why would you want to define a phrase into uselessness? XML is self-descriptive in a clearly defined sense.
The syntax is designed to make it possible to learn more about a document or message's syntax or meaning both through research (1) AND through following URIs (namespace URIs) (2). For some reason the frequent attempts at (3) are continually stymied.

Paul Prescod
Received on Thursday, 29 June 2006 03:46:48 UTC