Re: Terminology (was Re: article on URIs, is this material that can be used by the) from Pat Hayes on 2007-07-05 (www-tag@w3.org from July 2007)

From: Pat Hayes <phayes@ihmc.us>
Date: Thu, 5 Jul 2007 16:14:37 -0500
To: noah_mendelsohn@us.ibm.com
Cc: Dan Brickley <danbri@danbri.org>, "Henry S. Thompson" <ht@inf.ed.ac.uk>, Pat Hayes <phayes@ihmc.us>, Tim Berners-Lee <timbl@w3.org>, www-tag@w3.org
Message-Id: <p06230900c2b2db45ccd9@[192.168.1.2]>
>Pat Hayes wrote:

Sorry this reply is rather delayed.

>
>>  >* Having HTTP GET indicate in the results of an interaction whether what
>>  >has been contacted is in fact an information resource, and thus whether
>>  >the representation stands in the sort of relationship to the resource that
>>  >we expect for information resources (which >can< by definition be
>>  >faithfully sent in message), is useful.
>>
>>  There I disagree. Your locution here reveals the essential point.
>>  "the sort of relationship to the resource that we expect for
>>  information resources". WRONG. In fact, I expect to have at least TWO
>>  distinct relationships to information resources. I expect to be able
>>  to access them, using some kind of xxxTP protocol, AND I expect to be
>>  able to refer to them. Referring to them is exactly like referring to
>>  anything else: the same relationship is involved, the same semantic
>>  theories apply, and the same inference processes can be used for
>>  referential languages. When referring, the nature of thing referred
>>  to is almost irrelevant, in fact. The distinction between kinds of
>>  resource matters only because non-information resources can't be
>>  accessed.
>
>Careful.  I didn't say that either the access or the reference stood in
>some relationship to the resource.

Um... we seem to be even less aligned in our terminology than I had 
realized. I was using 'refer' and 'access' as verbs, indicating two 
different relationships which can hold between a name - in our case, 
a URI or URI reference or IRI - and a thing, which we all agree to 
call a 'resource'. I have no idea what these words mean when used as 
nouns, other than maybe acts of reference or accessing.

Let me try to summarize my viewpoint. Here I am with a name. 
Somewhere (perhaps over the rainbow) there is a thing, or things, 
which is intended to be associated with that name in some way. This 
association between the name and something can be one of at least two 
different kinds. One of them, used by human language since humans 
first used language and having nothing particularly to do with the 
internet as such, is the relation of 'being a name for', aka 'refers 
to', aka 'denotes'. That is one kind of relationship between a name 
and a thing; let us agree to not enquire too deeply into its exact 
nature, which would take us into an entire library of scholarly 
debate. However, it clearly does not require any physical connection 
between the name used and the thing referred to, which indeed might 
not even physically exist. The other, unique to the modern Web, uses 
the name to find a route for information to flow from the thing back 
to the location of the occurrence of the name under consideration. 
This I am calling, for want of a better word, 'access'. This is a 
completely different relationship between a name and a thing, and is 
much newer; indeed, it is only possible because of the planet-wide 
system of electronic pathways that now exists, and a universal system 
of name-to-electronic-thing mappings called HTTP, though obviously 
only the final 'TP' of that really matters here. Also obviously, only 
things that can emit information into this planet-wide internet - 
information resources, I am assuming - can possibly stand in the 
second kind of relationship to a name.

I have deliberately not used the word 'representation' in the above, 
by the way.

>  I said that for information resources
>on the Web in particular, we expect that >representations< stand in a
>certain relationship to the information resource being represented.  I
>agree that this is not currently defined with mathematical precision, but
>I think the gist is clear.

Not to me. In fact, I find this use of the word "represent" so 
unnatural that I have to constantly remind myself that when speaking 
to TAG members I may be talking basically a foreign language. I see 
no reason why what the information resource emits should necessarily 
be considered a representation of anything at all. I can put up a 
blank page onto a website, so that what you see on your browser is a 
blank white page. In my way of talking, that cannot possibly be a 
representation of *anything*.

>  If the resource is the text of the U.S.
>Declaration of Independence, which consists of a sequence of paragraphs
>comprised of sentences comprised of words, then we generally expect a Web
>representation to convey that sequence of paragraphs/sentences/words.

So, is my copy of, say, 'Sense and Sensibility' a >>representation<< 
of the novel? That is not correct English in any technical usage I am 
familiar with. First, what exactly is a representation *of*? (The 
'abstract' novel? Surely the best analogy to the Webpage case would 
be to say that it is a representation of the state of the printing 
platen at the time the paper passed through the roller, or some such 
- after all, it does have a few broken-type glitches in it. But 
nobody says that.) What we say is that what I have is a book, which 
is perhaps an imprint (to emphasize how it was made), or an edition, 
or even a copy. But not a >>representation<<. If 'Sense and 
Sensibility' is a representation of anything, it is the state of 
18th-century English society, or some such: it represents what it 
*describes*. Similarly, I suggest, the natural way to understand what 
'represent' means, when applied to either a website or the version 
(image? copy? imprint? edition?) of it that is sent along in response 
to an HTTP GET, is that the site/page/browser screen/whatever 
represents whatever it is about, describes, refers to or is an image 
of; perhaps all of these. In the example used in the WebArchitecture 
document, what the web page represents is not the Oaxaca weather 
report, but the *actual weather*, at the time, in Oaxaca. That is 
what people reading their browser screen under those circumstances 
understand themselves to be reading >>about<<.

However, I will admit that I have, after beating my head against 
various walls, come to understand that the TAG uses 'Web 
representation' in an idiosyncratic way, and so I do in fact 
understand what you mean (I think). But that does not make the usage 
any less strange or idiosyncratic, or any easier to follow. And it conflates 
this 'image/copy/imprint' notion with other notions of 
representation: in particular, the widely, almost universally, used 
sense in which a >description< of something is a representation of 
it. Since a referring name is a kind of zero or null case of a 
description, the natural thing to say in your case is that the >URI< 
is what 'represents' the web page, or perhaps the image of the page 
which arrives at the browser.

>  It
>can do that in text/plain ASCII, in HTML, perhaps even in an image of the
>characters.  Because the Declaration is an information resource, we may
>aspire to having a representation convey its essence.

I wonder what that means.

>  Were we to have
>instead the resource Mr. Thomas Jefferson (the person), then we would
>assume that any attempt at a representation would be somehow further
>removed from his essence, since he himself could not be sent through a
>wire, even when he was alive.  That was the point I was making.

I understand the point you are making, more or less, I think (though 
I don't find the notion of 'essence' very helpful.)  But it is 
largely irrelevant to the point I was making. Of course, being an 
educated adult in this world, I want to be able to see the kinds of 
'representation' that you are talking about here, whenever it is 
possible to do so.
However, I ALSO want to be able to refer to 'information resources' in 
exactly the same way that I can refer to anything else. And I would 
like to be able to clearly distinguish these two ways to use a name, 
as they have, at first blush, absolutely nothing whatever to do with 
one another.

>You are correct that, somewhat independent of what I've just said about
>representations, the distinction between reference and access is important.
>  I think Dan is correct that the Web as deployed, and the Web Architecture
>Document (if read a bit sympathetically) provide for both.

Really, with the best will in the world: in order to be this 
sympathetic, one would have to be telepathic.

>Here's what I think may be the essence of the confusion:  there are
>certain systems in which it is by definition possible to attempt to access
>anything that can be referenced.

Indeed this may well be part of the confusion. OK, there are a few 
such systems (a VERY few), but the Web is not one of them; or at 
least, not if it's understood as described in the Web Architecture 
document. That document is at pains to explain, quite early, that 
resources identified by URIs can be physical things not connected in 
any way to the Internet, such as books and people. From that point 
on, all talk about attempting to access things that can be referenced 
is obviously crazy. You can't use HTTP or any other xxxTP to access 
people and books. (You can maybe access some kind of description of 
them, which could be called a representation of them, although not in 
the same sense of "representation" used by you and the architecture 
document; and not by getting a TP poke to them and causing them to 
emit a representation in response. But you didn't say 
'representation': you said, access anyTHING that can be referenced.)

>   The Web, at least when URLs are used
>for reference, is one such system.

WRONG. Starkly, in-your-face, obvious-to-a-child wrong. Which is why 
I am tearing my hair about this issue. Given what the architecture 
document says, this is so obviously wrong - in fact, you yourself say 
it is wrong in this very email - that I cannot believe that sane men 
would be saying it. So given that you are both sane and saying it, 
the only reasonable conclusion I can draw is that what the 
architecture document says isn't correct. Which of course I have 
rather suspected since I first read a draft of it.

>Let's consider another.  In many
>computer architectures, memory locations are identified by pointers.  I
>can use those for reference.  For example, I can say: "Hey, Pat, I think
>that crash was due to a bad value in memory at location 0X12345".  I've
>made a reference, but no access.

True, one can use any number of ways to refer to something. This 
technique only works for the locations, note, or maybe their 
contents; either way, a very small sampling of the full range of 
things one can refer to.

>  I haven't gone to find the value in the
>memory, just talked (referenced) about it.  On the other hand, if I'm in
>the world of software running on that computer, >any pointer you give me I
>can try to access<.  That doesn't mean that reference and access are the
>same thing.

Of course not. But it does suggest (which is my main point here) that 
it might be wise to be at pains to be clear whether you mean 'refer 
to' or 'access' when talking of the relationship between addresses and 
what you associate them with, and not get these two ideas mixed up.

>It doesn't even mean the access will always succeed;  I might
>get a protection exception.  Within the computer though, we posit that
>there is some uniform means (probably a LOAD instruction) that I can use
>to attempt to access any item for which I have a reference.

This isn't really relevant to our debate, but I have to remark that 
you are here being rather tendentious. The fact is that in this case 
these really are >addresses<, and accessing them is what they are 
for. Even to call them names is rather stretching a point, and it 
works only because naming and reference is so ubiquitous that almost 
anything symbolic can be treated as a name. But to take this metaphor 
- which is all it is - too literally is a well-known tarpit (as I am 
sure you are aware) in trying to make programming languages coherent. 
If you program (as I once did) in a starkly typefree language like 
BCPL where all operations are on bitstrings and any bitstring can be 
used as an address, then one rapidly loses all sense of 'identifier' 
at all.

>At the very least for http URIs, almost surely for all of what we used to
>call URLs, and arguably for all URIs, the Web is like this.

Not if we are supposed to also use URIs to refer *in general*, as we 
are when using the SWeb standards, and which the web architecture 
document seems to suggest.

But in any case, I don't think the Web is like this, nor has it been 
like this since the very earliest days.

>  I can
>reference something by giving you a URI in this email.

Some things you can. Other things (including most 'resources') you cannot.

>  For example, I can
>tell you that my employer's Web site is http://www.ibm.com.  That's a
>reference.

You had to tell me it was a Web site, though. Yes, it works for Web 
sites. (Surprise, surprise. Even there it's ambiguous between the site 
and its, er, 'representation'. How often have you seen usages like 
(in html) "for more details, see <a href="http://www.ibm.com"> here 
</a>", where the instruction is clearly to read what you see on the 
browser page after tracing the link, not the website itself? In fact 
I don't think it's possible to read a website *itself*, the actual 
resource.) But what about all the other things you might need to 
refer to? If you had just used the URI as a referring name, what 
sense could I have made of it? "My employer is a 
http://www.ex.example.com." "I saw http://www.ex.example.com 
yesterday." What does that URI refer to? I might try going there to 
see, but I'd have to be able to do that, and I'd also have to be 
pretty smart and use implicit knowledge of things like irony and 
analogy, to be able to figure it out. More likely I simply couldn't, 
as I have no idea what aspect of whatever the site was about was 
intended. But note, if URIs really are fundamentally referring names, 
why is that example a problem? It shouldn't be: English can routinely 
absorb referring names from other languages.

>  The Web has the interesting property that, pretty much
>independent of what http URI I give you, you can try to access it.  That
>doesn't mean we're confusing reference and access, but it does make clear
>a sense in which they come together.

Sorry, that's exactly what it FAILS to do. That is exactly why we are 
having this debate. The relationship between reference and access is 
quite unclear exactly in the case where you can do both; and now that the 
SWeb is part of the Web, you have to be able to do both.

>  That's what makes the Web so
>suitable for dynamic exploration.

You really don't need to give me the sales pitch, Noah :-)

>
>Of course, there are many other environments in which this connection
>doesn't hold.  We can with some fidelity reference human beings by their
>names, but in most cases knowing my name doesn't guarantee you a strategy
>for finding (accessing) me.

Quite.

Pat

>
>BTW: I will mostly be off email for a couple of weeks.  If this little
>discussion continues, I'm unlikely to contribute much.
>
>Noah
>
>--------------------------------------
>Noah Mendelsohn
>IBM Corporation
>One Rogers Street
>Cambridge, MA 02142
>1-617-693-4036
>--------------------------------------


-- 
---------------------------------------------------------------------
IHMC		(850)434 8903 or (650)494 3973   home
40 South Alcaniz St.	(850)202 4416   office
Pensacola			(850)202 4440   fax
FL 32502			(850)291 0667    cell
phayesAT-SIGNihmc.us       http://www.ihmc.us/users/phayes
Received on Thursday, 5 July 2007 21:14:53 UTC