AWWSW? from Nathan on 2011-02-02 (public-awwsw@w3.org from February 2011)

From: Nathan <nathan@webr3.org>
Date: Wed, 02 Feb 2011 01:23:57 +0000
To: AWWSW TF <public-awwsw@w3.org>
CC: Jonathan Rees <jar@creativecommons.org>, Tim Berners-Lee <timbl@w3.org>
Message-ID: <4D48B22D.8040102@webr3.org>

Guys,

We've been modelling it all wrong (well I certainly have) - here goes, 
do follow:

   <u> a :Resource .
   [] a :Representation .

...

When a relation is present in an HTTP header (Link) or in a 
representation ( <link href="<y>" rel="stylesheet"> in text/html ) 
then it is always associated with nothing /to start with/.

That is to say, all you know is { ? <foo> <y> } and you have to 
determine what ? is. The normal way of doing this is by looking at the 
domain of <foo>.

Some relations have a domain of :Representation, examples of this include:

   stylesheet
   alternate
   author
   help
   icon

Where we've been going wrong previously is by saying things like:

   <u> a :InformationResource ;
     stylesheet <y> .

When in fact it's:

   [] a :Representation ;
     stylesheet <y> .

We know nothing about <u> other than what we can conclude by looking 
at information about <u>, for example looking at our representation 
and concluding that <u> names a [whatever].

The representation doesn't have a name (it was just pulled from a 
message of some kind).

Each representation is in its own namespace; the closest thing I can 
compare it to is a graph literal / quoted graph in N3.

Each representation is a description.

Each element within a representation is identified by a locally scoped, 
existentially quantified variable, i.e. a blank node (referenced 
non-persistently within the scope of that representation).

  { [] a :Link ; href <y> ; rel <stylesheet> . } a :Representation .

that'd be the following html:

  <link href="<y>" rel="stylesheet">

The link is described within the html, within the representation, and 
as you can plainly see the subject is missing; it has to be inferred 
from its description when creating the link.

The only way to infer the subject is to understand the description: 
the rel and the domain of the rel. In stylesheet's case the domain is 
:Representation, hence it's a relation between the representation 
and the stylesheet.
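
As a rough sketch of that inference in Python: REL_DOMAIN, 
Representation and infer_subject are hypothetical names made up for 
illustration, and the table entries only restate the domains claimed 
above, they are not taken from any spec.

```python
# Hypothetical lookup table: rel name -> class of the link's subject.
# Entries follow the claims in this message; they are not taken from a spec.
REL_DOMAIN = {
    "stylesheet": "Representation",
    "alternate": "Representation",
    "icon": "Representation",
}


class Representation:
    """Anonymous: instances have identity but no URI (like a blank node)."""


def infer_subject(rel, representation):
    """Fill in the ? of { ? rel target } from the domain of rel."""
    if REL_DOMAIN.get(rel) == "Representation":
        return representation
    return None  # domain unknown: the subject stays undetermined


rep = Representation()
print(infer_subject("stylesheet", rep) is rep)    # True
print(infer_subject("made-up-rel", rep) is None)  # True
```

Note that the subject comes back as the unnamed representation object 
itself, never as <u>.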

In another case, the rel 'help':

  { _:b0 a :P ; :child _:b1 .
    _:b1 a :Label ; :child _:b2, _:b3 .
    _:b2 a :Input ; name "topic" ; nextSibling _:b3 .
    _:b3 a :A ; href <h.html> ; rel <help> ; text "(Help)".
  } a :Representation .

in html:

   <p><label> Topic: <input name=topic>
     <a href="h.html" rel="help">(Help)</a></label></p>

The description of the 'help' relation:

   "For a and area elements, the help keyword indicates that the
    referenced document provides further help information for the
    parent of the element defining the hyperlink, and its children."

So the above describes (among other things) a link between the locally 
/ representation scoped Label element (the parent of the A element) 
and <h.html> (relative-uri resolved against the base).
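
For the "resolved against the base" part, a small sketch using 
Python's standard urllib.parse.urljoin. The base URI below is an 
assumption (the message never says where the representation came 
from), so treat the result as illustrative only.

```python
from urllib.parse import urljoin

# Assumed base URI of the representation (not given in the message).
base = "http://example.org/docs/page.html"

# The href as it appears in the markup is relative.
help_target = urljoin(base, "h.html")
print(help_target)  # http://example.org/docs/h.html
```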

That gives us another class of things, those which a representation 
comprises, elements in html for example. Again, these are not named 
with uris.

Each element within a representation can be optionally given a locally 
scoped identifier.

  { [] a Paragraph ; content "bar baz" ; id "foo" } a :Representation .

that'd be the following html:

  <p id="foo">bar baz</p>

the @id "foo" is locally scoped to the representation, within one 
context (presentation) it refers to the presentation of the thing 
described, within another (js) it refers to the memory resident dom 
node which was created as described. Note, in both cases it refers to 
the thing described, it's just being referred to for different 
purposes. Remember "thing" in this case is constrained to those things 
like html elements, the nodes in the description of the representation.

@id's can be referenced by the fragment part of a URI. However, it's 
very important to note that this doesn't mean that <u#foo> is a name 
for the element within the representation. Rather, and only within the 
context of dereferencing, the two components are separated into <u> 
and <#foo>: <u> is dereferenced to get a representation (more about 
this later), and <#foo> is used to refer to the element with the @id 
"foo" within the representation/description (if it exists in there). 
As noted earlier, how that <#foo> reference is actioned depends on the 
context in which the representation is being considered, but within 
that scope it always refers to the same thing, the thing described as 
having an id of "foo".

Outwith the scope of dereferencing, the uri <u#foo> is a (fully 
qualified?) global name in its own right, distinct from any other, 
and can be used to name any resource.

This illustrates the duality of URIs:

  name <u#foo> != dereference( <u> <#foo> )

you can probably find a much better way to write that! but essentially 
a uri w/ a fragment is very different to an @id reference within an 
unnamed representation.
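
One way to sketch the two readings in Python, using the standard 
urllib.parse.urldefrag. The example URI and the dict standing in for 
a fetched representation are assumptions for illustration; nothing is 
actually dereferenced here.

```python
from urllib.parse import urldefrag

# Reading 1: the whole string is a global name, used as-is.
name = "http://example.org/u#foo"

# Reading 2: within dereferencing, the same string is split in two.
base, fragment = urldefrag(name)

# Toy stand-in for the representation obtained from <u>:
# locally scoped @id -> element description.
representation = {"foo": {"type": "Paragraph", "content": "bar baz"}}

# The fragment is looked up inside the representation, if present.
element = representation.get(fragment)

print(base)      # http://example.org/u
print(fragment)  # foo
print(element)   # {'type': 'Paragraph', 'content': 'bar baz'}
```

The name string never enters the lookup; only its two parts do.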

So, representations are anonymous, the elements within are locally 
scoped to that representation for identification purposes, being 
bnodes, or bnodes with an @id.

That still leaves us with Resources, and there's still another level 
of duality to go.

A URI <u> is a name for a resource. When you dereference that name, 
then depending on how you are dereferencing it, the <u> is split into 
its component parts, and each part is resolved individually in order 
to locate some process on a machine (local or over the network); a 
request is then sent and a response is received, sometimes containing a 
representation.

Now, the URI <u> in the scope of dereferencing is a compound 
identifier, where each component refers to something (a scheme, 
another name (domain/ip), a name for a process on a (virtual) machine 
and so forth). Those component parts are often augmented with 
additional information (such as message structure), and sometimes the 
full URI <u> is simply placed in a string (like a query string) to 
do the dereferencing. The actual "process" or "resource" a request is 
sent to when dereferencing is identified by the sum of all the 
components and messages; it's often the case that this process is 
unidentifiable.

Some authority and provenance information can be given in the 
messages, to have some kind of grasp on who, or which machine, is 
responsible for replying to a request, but as for the "resource" or 
"process" it's typically just some short lived worker process on a 
machine that's killed as soon as the response is done.

Networking people, programmers, REST fans all use the term "resource" 
to refer to the abstract concept of one of these accessible processes, 
and they use a URI <u> to refer to that process within the bounds of 
dereferencing. Indeed they normally refer to one or more of the 
component parts, and when they say "http://ex.org/foo/bar?x=y" they 
mean ( "http" "ex.org" "/foo/bar" "x=y" ), not 
<http://ex.org/foo/bar?x=y>.
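
That component view is roughly what a URI parser hands you. A small 
sketch with Python's standard urllib.parse.urlsplit, applied to the 
example above (the variable names are mine):

```python
from urllib.parse import urlsplit

parts = urlsplit("http://ex.org/foo/bar?x=y")

# The pieces that get resolved individually when dereferencing.
components = (parts.scheme, parts.netloc, parts.path, parts.query)
print(components)  # ('http', 'ex.org', '/foo/bar', 'x=y')
```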

This gives us a fourth class of thing, a "NetworkResource" (for lack 
of a better term; many think of IRs as this class of thing). Luckily 
NetworkResources can easily be disambiguated. For example, 
Location: "u" refers to a NetworkResource, since it is within the 
scope of dereferencing; likewise all protocol usage.

As with representations, NetworkResources are not named, it's not:

   <u> a NetworkResource .

but rather:

   [] a NetworkResource ; address "u" .


To summarize where this leaves us at the minute, we have four classes 
of things:

  - Resource (can be named with a URI)
  - NetworkResource (always unnamed)
  - Representation (always unnamed)
  - RepresentationElement (always unnamed)
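
A possible way to model the four classes in Python, purely as a 
sketch. The field names are assumptions, and "unnamed" is 
approximated by giving those classes identity-only equality (like 
blank nodes), while Resource compares by its URI.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass(frozen=True)
class Resource:
    uri: str  # the only class here that carries a global name


@dataclass(eq=False)
class NetworkResource:
    address: str  # a plain string used for dereferencing, not a name


@dataclass(eq=False)
class RepresentationElement:
    kind: str
    id: Optional[str] = None  # locally scoped to the enclosing representation


@dataclass(eq=False)
class Representation:
    elements: List[RepresentationElement] = field(default_factory=list)


# Two named resources with the same URI are the same thing ...
same = Resource("http://example.org/u") == Resource("http://example.org/u")
# ... but two unnamed things are only ever equal to themselves.
distinct = NetworkResource("u") != NetworkResource("u")
print(same, distinct)  # True True
```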

By simply applying this constraint to the web, and setting the correct 
domain on each relation which requires it, we can disambiguate 
everything that causes any confusion. From there it's just a simple 
case of giving different things different names.

Rather amazingly, I also find this to be consistent with web 
architecture, with what HTTP says, what URI says, what REST says, what 
Tim says, what Roy says, what we have all said along this road!

AFAICT, it also clears up some of the other TAG issues such as use of 
fragments on the web: when dereferencing, each fragment simply refers 
to some element with a corresponding @id within the unidentified 
Representation returned. Sure, it makes sense to align @id's when using 
content negotiation so that people don't get surprised, but it isn't a 
problem really, and both man (via knowledge) and machine (via 
knowledge of the above) can disambiguate nigh on instantly.

Questions, disagreements, any cases you want me to apply this to in 
order to prove the theory?

Finally, sorry it took so long to get something like this out, it's 
been a struggle with hundreds of incorrect paths taken and false 
assertions - I hope to hell this one is provably true!

Best,

Nathan
Received on Wednesday, 2 February 2011 01:24:54 UTC