Re: Proposed AWWW erratum on "information resources" [was Re: Fwd: Splitting vs. Interpreting] from Xiaoshu Wang on 2009-07-13 (www-tag@w3.org from July 2009)

From: Xiaoshu Wang <xiao@renci.org>
Date: Mon, 13 Jul 2009 11:32:37 -0400
To: David Booth <david@dbooth.org>
CC: "Sean B. Palmer" <sean@miscoranda.com>, www-tag@w3.org
Message-ID: <4A5B5395.2020509@renci.org>
David Booth wrote:
> Hi Sean,
>
> On Mon, 2009-07-13 at 10:55 +0100, Sean B. Palmer wrote:
>   
>> On Mon, Jul 13, 2009 at 3:24 AM, David Booth<david@dbooth.org> wrote:
>>
>>     
>>> By design a URI identifies one resource.  The term "resource" is
>>> used in a general sense for whatever might be identified by a URI.
>>>       
>> What would these sentences look like if they were written with
>> something more like the HTML 5 philosophy?
>>     
>
> I don't think I know well enough what the HTML 5 philosophy is to
> comment.
>
>   
>> When you look at deployed usage, obviously you find that the http
>> scheme is used far more widely than any other scheme. What's the
>> second most commonly used scheme? mailto? file?
>>
>> I'm sure TimBL now rues using mailto instead of mailbox or mbox or
>> some such. If you were to ask someone how many things a file URI
>> identifies, what would they say?
>>
>> file:///tmp/example.txt
>>
>> How many /tmp/example.txt files in the world? Okay, but it only refers
>> to the file on the current system. So in that case, is that a product
>> of behaviour or is it a reference system?
>>     
>
> Yes, that's a good corner case, as that's one that is specifically
> designed to be context dependent. So in theory it identifies a single
> abstract resource -- the idea of *some* file with that name -- but in
> practice context is used to map it to a more concrete, specific file.
> In other words, in theory it can still be viewed as identifying a single
> thing, but in practice the thing it identifies depends on context.  
>
>   
>> When browser manufacturers come up with new schemes, what kind of
>> things are they commonly making and why?
>>
>> Safari redirects any http URI that returns application/atom+xml to the
>> same URI string only with feed in place of the http scheme. What the
>> heck is that all about? What is AWWW teaching us about this, or is
>> this not within AWWW's remit?
>>     
>
> Do you mean Safari makes up a new URI for it?  Something like
>   feed://example/my-feed
> instead of
>   http://example/my-feed 
> ?  
> That seems a little odd, since it already knows that it has an atom
> feed.  Do you know what is the rationale for doing this?
>
>   
>> The most common use of a URI is clicking a link in HTML or pasting it
>> in your browser's address bar. And we're not talking most common by a
>> slight majority, of course.
>>
>> So if we add up all the other uses of URIs, your mailboxes and your
>> file URIs and your XML namespaces and your URNs and tags and all kinds
>> of things, do they amount to the same body of usage as HTTP URIs used
>> in the browser?
>>
>> And if we look at the commonalities amongst how these things are used
>> and implemented, what do we want to derive from that? What can we
>> learn, and what can we teach?
>>     
>
> Not sure what you're getting at.  Are you suggesting a completely
> different approach to the AWWW's treatment of URIs identifying
> resources?
>
>   
>>> An "information resource" is any resource that plays a role
>>> in the hypertext Web by producing "representations"
>>>       
>> What server actually works on a model of resources producing
>> representations? What web framework works in this way? I've just been
>> through the Django tutorial, and I don't see resource being used in
>> there.
>>     
>
> They all work in this way, though not necessarily using that
> terminology.  That's just the terminology chosen by AWWW to describe, at
> an abstract level, what happens.  I've simply tried to be as consistent
> as possible with existing AWWW terminology while suggesting changes that
> would correct the current, flawed definition of "information resource".
> Trying to change the terminology beyond that might be useful, but it
> would be a much bigger undertaking and is outside the scope of my
> suggestion.
>
>   
>> The simple use of current common servers is that files in directories
>> are exposed on the web, and maybe you can leave the file extensions
>> off. More complex use involves scripting. To someone coding the
>> backend to the latest Web 2.0 startup, does "information resources
>> produce representations" mean anything?
>>
>> If not, where are the extents of the remit again?
>>
>> If servers were commonly implemented in Analytica, Lustre, or Prolog,
>> that might be one thing. Heck, when I wrote an HTTP client
>> implementation in Python I tried to use all the right words from RFC
>> 2616. What does your sentence tell me that RFC 2616 doesn't?
>>     
>
> Not very much.  :)   It just states it in AWWW terms.
>
>   
>>> Depending on one's perspective (or application) this may be
>>> viewed as a case in which the URI unambiguously identifies
>>> a resource that has multiple aspects or as a case of ambiguity,
>>> in which the artistic work and the web page are each deserving
>>> of their own distinct URIs.
>>>       
>> Okay this, to me, is a very admirable attempt to resolve the current
>> peculiarities of the situation that we're working on here.
>>
>> But why are you saying this? You're only saying this because of RDF,
>> not because of some common model of the web. And yet this is
>> Architecture of the World Wide Web.
>>
>> So don't say that here. It's the wrong place!
>>     
>
> In some ways I agree, that it would be more appropriate to put the
> material on ambiguity in a separate document on semantic web
> architecture (which builds on web architecture, of course).  The reason
> I included it here is that that is the only way I can see to explain
> what's going on when someone uses the same URI for both a person and a
> web page, and someone else complains that that creates an ambiguity.
>   
The word "ambiguity" is itself ambiguous without an explicitly specified 
ontological ground.  *Who* says, or *where* does it say, that it is 
ambiguous if a URI denotes both a person and a web page (what is a web 
page anyway)?  The semantics is, in fact, quite clear: the URI's 
referent is what it is -- a person and a web page.  It is no different 
if I switch "person" to "parent" and "web page" to "child".  Is it 
ambiguous or clear if I say that I am both a parent and a child?

The Web Architecture does not specify an ontological ground for the 
ambiguity about those things, so let us not use it to imply what is 
ambiguous and what is not.  The only kind of ambiguity that we can speak 
of at the AWWW level is whether, given a URI, the referent is a 
resource, a URI, or a representation.  Those three kinds of things are 
all we know about the Web.

> IMO a critical flaw in the current definition of "information resource"
> is the suggestion that it is a class of things that is disjoint from the
> class of people or cars or dogs.  I guess the erratum could just remove
> the disjointness constraint without further explaining it, but it seemed
> to me that people would likely want more of an explanation.
>   
The critical flaw of the IR is the idea behind it.  Take IR out of AWWW, 
and we are all relieved.

Xiaoshu
Received on Monday, 13 July 2009 15:33:17 UTC