Re: URI for abstract concepts (domain, host, origin, site, etc.) from Xiaoshu Wang on 2009-06-27 (www-tag@w3.org from June 2009)

From: Xiaoshu Wang <wangxiao@musc.edu>
Date: Sat, 27 Jun 2009 17:04:42 -0400
To: Erik Wilde <dret@berkeley.edu>
CC: Dan Brickley <danbri@danbri.org>, Larry Masinter <LMM@acm.org>, "'Pat Hayes'" <phayes@ihmc.us>, "'Eran Hammer-Lahav'" <eran@hueniverse.com>, "'Dan Connolly'" <connolly@w3.org>, "apps-discuss@ietf.org" <apps-discuss@ietf.org>, "www-tag@w3.org" <www-tag@w3.org>, "'URI'" <uri@w3.org>
Message-ID: <4A46896A.8050107@musc.edu>

Erik Wilde wrote:
> hello.
>
> Xiaoshu Wang wrote:
>   
>> There should not be.  I have tried this many times.  A URI, fragmented 
>> or not, denotes one thing and its returned representations another.  The 
>> former is the content of the latter.  The remedy is to define a URI 
>> syntax for representation.
>> A syntax that I have proposed is to insert a (mime-type) after the # sign.
>> Thus,
>> "http://danbri.org/foaf.rdf#danbri" denotes a person.
>> "http://danbri.org/foaf.rdf#(application/rdf+xml)danbri" denotes an RDF 
>> node.
>> "http://danbri.org/foaf.rdf#(application/xhtml+xml)danbri" denotes an 
>> HTML element with the id "danbri".
>>     
>
> interesting. would that be specific for http-identified resources? if 
> not, how would that be supposed to work with URI schemes that do not 
> share HTTP's capabilities for transferring content metadata, and 
> performing content negotiation? a simple example might be FTP, which is 
> similar in nature to HTTP (access to hierarchically organized resources) 
> but has no concept of media types.
>   

I don't think a URI scheme has anything to do with the transport 
protocol.  No matter what URI you use, after dereferencing it you get a 
*representation*, which is a different thing from the *resource* that 
the URI denotes.  And a representation must have a content type, 
regardless of how you retrieved it.
> another thing i am wondering about: aren't fragment identifiers as they 
> are currently defined client-side only and specific to the media type 
> anyway? that might indicate you are talking not about extending the URI 
> syntax, but that of HTTP URI fragment identifiers? 
See the above answer.
> there were other 
> approaches of doing this (with other goals), one of the issues was how 
> to create some framework for fragment identifiers that would be 
> uniformly applied to all fragment identifier syntaxes. 
I am not sure what the purpose of that is: a standard syntax for 
fragment identifiers?  It is just a name, right?  What I am proposing is 
a syntactic notation within the URI syntax, so that we will no longer be 
puzzled by the question of what a URI denotes.
> that one never 
> got anywhere, and the window of opportunity is probably closed by now. 
> but there the idea was that instead of labeling fragments with the media 
> type to which they should be applied (which seems to be what you're 
> suggesting), they should follow some base syntax, and thus could be 
> designed to be less brittle across media types. HTTP's idea of content 
> negotiation (and thus dynamic media type assignment at access time) and 
> URI fragment identifiers and their media type specificity always was one 
> of the areas where web architecture certainly could need a bit of 
> improvement.
>   
Content negotiation does not cause the problem.  It only makes the 
problem obvious.  The Web is based on three fundamental entities: URI, 
Resource, Representation.  But currently, the referential range of a 
URI covers only the Resource, not the URI itself or a Representation.  
What I have proposed in my manuscript to ISWC 2009 is as follows.

1. If a URI's root ends with a "~", it denotes the URI sans the "~".
2. Insert (mime-type) after # to denote a particular type of 
representation retrieved from a URI.
3. If a URI's root ends with a "?", it denotes the list of all 
mime-types supported by the root URI.
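
To make the three rules concrete, here is a rough sketch in Python of how 
a client might classify a URI written in this notation.  It is only an 
illustration, not part of the manuscript: the names `Denotation` and 
`classify` are made up for this example, "root" is assumed to mean the 
URI without its fragment, and rules 1 and 3 are assumed to apply only 
when no fragment is present.

```python
import re
from typing import NamedTuple, Optional


class Denotation(NamedTuple):
    """What a URI is taken to denote (hypothetical type for this sketch)."""
    kind: str                  # "resource", "representation", "uri" or "media-type-list"
    root: str                  # the URI without fragment and without a "~" or "?" marker
    media_type: Optional[str]  # set only for rule 2
    fragment: Optional[str]    # the plain fragment name, if any


# Rule 2: a fragment that starts with "(type/subtype)".
_TYPED_FRAGMENT = re.compile(r"^\(([^()\s]+/[^()\s]+)\)(.*)$")


def classify(uri: str) -> Denotation:
    root, sep, fragment = uri.partition("#")
    if sep:
        match = _TYPED_FRAGMENT.match(fragment)
        if match:
            media_type, name = match.groups()
            return Denotation("representation", root, media_type, name or None)
        # An ordinary fragment still denotes a resource.
        return Denotation("resource", root, None, fragment or None)
    # Assumption: the suffix rules are only checked on fragmentless URIs.
    if root.endswith("~"):
        return Denotation("uri", root[:-1], None, None)              # rule 1
    if root.endswith("?"):
        return Denotation("media-type-list", root[:-1], None, None)  # rule 3
    return Denotation("resource", root, None, None)


for example in (
    "http://danbri.org/foaf.rdf#danbri",
    "http://danbri.org/foaf.rdf#(application/rdf+xml)danbri",
    "http://danbri.org/foaf.rdf~",
    "http://danbri.org/foaf.rdf?",
):
    print(classify(example))
```

How a URI that combines a "~" or "?" root with a fragment should be read 
is left open here; the sketch simply treats it as a fragment case.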

With #1, we can build a URI that is composed of any number of URIs and 
yet still maintain a level of brevity, because the mapping can be 
described in a representation and can be retrieved.  This will solve 
most, if not all, problems raised in the XSD use cases.

The reason for #2 is httpRange-14, which has haunted us for 7+ years.
As a side note to Dan, I think a mime-type should be identified by a URI 
just like any other resource.  I have detailed my reasoning in my 
manuscript.  One of the use cases is that I have one resource but there 
are two available XML schemas, which overlap but neither one subsumes 
the other.  The current media-type specification forces me to choose 
one, which is not what is best for me or for my potential clients.  
Also, there are other advantages to extending mime-types to URIs.  For 
instance, in principle, a client can "follow its nose" to retrieve 
something and parse a novel format.

#3 is in essence transparent content negotiation.  But modeling it 
as a kind of resource allows all kinds of formats to be used to describe 
the list.  Also, at the most fundamental level, a mime-type is semantically 
equivalent to a service, so #3 can be considered a default standard 
mechanism for service discovery.  This essentially solves all the 
so-called "uniform access to metadata" problems.

This URI pattern (TIP; I call it a pattern for now since it is not in 
the URI spec) plus The And Pattern (TAP, i.e., to supply a resource with 
one mime-type And another And another via content negotiation) gives us 
both diversity (in presentation) and uniformity (in organization and 
therefore discovery).  Personally, I think that is all we need at the 
most basic architectural level of the Web.

Xiaoshu

Received on Saturday, 27 June 2009 21:05:26 UTC