Re: Problem definition from Nathan on 2011-02-17 (public-awwsw@w3.org from February 2011)

From: Nathan <nathan@webr3.org>
Date: Thu, 17 Feb 2011 22:54:06 +0000
CC: AWWSW TF <public-awwsw@w3.org>, Jonathan Rees <jar@creativecommons.org>
Message-ID: <4D5DA70E.2010608@webr3.org>
analysis / explaining web names theory:


Web Names
===================

   ( "http://..." , "" )  /  ( "http://..." , "foo" )

Web names build on the fragment way of deploying data, such that every 
possible web name refers to a thing. However, they remove the constraint 
that dereferenceable absolute-URIs always refer to information 
resources, because web names are names that cannot be dereferenced (see 
the linked data URIs feedback earlier/below). This removes the chimera 
state between the web of data and the network: web names never, by 
default, name a network accessible resource or a representation.

this means that should you GET the namespace part "http://..." and 
receive a representation w/ 200 OK, then you know that you got a 
representation from a network accessible resource, and do not have a 
"name" for either the representation or the network accessible resource. 
Should you wish to name one properly, then you have to give it a web 
name and describe it.

So, up to this point it takes the good parts of the null hypothesis, and 
the best practice for linked data as TimBL does it.

Next, it also removes the /inferred/ chimera state that resides between 
"document" and "primary topic": neither is inferred or implied by the 
presence of a web name which uses the primary-ref (equal to a slash 
URI, or one with a single #). Any single thing, a primary topic or a 
document, can be what the name refers to; metadata authors must pick 
what each name refers to and describe it. If you have both a document 
and a primary topic that is also an "IR" (like a blog post) and you 
need to describe both, then use two web names, and give the primary-ref 
URI to the one most commonly referred to.

The side effects of this are that RDF would need to change to use "Web 
Names", as would linked data, and it would possibly need to be noted in 
web arch. They're IRI compatible, so no tooling changes are required, 
yet standards and docs would still need to change.

This removes the natural / default chimera state such that:

                            _ network accessible resources
   Network / Computers ----|_
          |                   representations
     [hidden by]
          |                 _ sources of information
    Web of Documents ------|_
          |                   documents
     [hidden by]
          |                 _ primary topic of a document
     Web Of Data ----------|_
                              any thing

all web names refer to something described on the web of data, and since 
they can name anything, you can use them to refer to something on the 
different layers in an unconstrained way, as such:

   ("http://domain.com/" , "se123") a :Representation .

(remembering the lexical form is still IRI!)
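
To make the pair notation concrete, here is a minimal Python sketch of 
moving between a web name pair and its IRI lexical form. It assumes the 
pair is split and joined at the "#", which is only one reading of the 
notation above; WebName, parse_web_name and lexical_form are hypothetical 
names for illustration, not part of any spec or library.

```python
from typing import NamedTuple
from urllib.parse import urldefrag


class WebName(NamedTuple):
    """Hypothetical (namespace, ref) pair; not a standard type."""

    namespace: str
    ref: str

    def lexical_form(self) -> str:
        # Assumption: the pair serialises as namespace + "#" + ref,
        # and an empty ref serialises as the bare namespace.
        return f"{self.namespace}#{self.ref}" if self.ref else self.namespace


def parse_web_name(iri: str) -> WebName:
    # urldefrag splits the IRI at "#" and returns (url, fragment).
    base, fragment = urldefrag(iri)
    return WebName(base, fragment)


name = parse_web_name("http://domain.com/#se123")
print(name)                 # the pair form
print(name.lexical_form())  # back to the IRI lexical form
```

Under this reading the namespace part is what you would GET, while the 
pair as a whole is only ever a name.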

Nothing can ever stop people confusing things and making incorrect 
statements, but this approach means that no prior notions of what class 
X of names refers to carry over to the web of data; and when there are 
incorrect statements already out there (like <u> stylesheet <o> ) then 
we can use domain/range to filter those statements out.
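
As a rough illustration of the domain/range filtering idea, the Python 
sketch below keeps only statements whose subject is typed with the class 
declared as the property's domain. It treats domain as a constraint for 
filtering, which is the reading suggested here rather than RDFS 
inference; the triples, the ex: names and filter_by_domain are made up 
for the example.

```python
from typing import Dict, List, Set, Tuple

Triple = Tuple[str, str, str]


def filter_by_domain(
    triples: List[Triple],
    types: Dict[str, Set[str]],
    domains: Dict[str, str],
) -> List[Triple]:
    """Keep statements whose subject has the class required by the property's domain."""
    kept = []
    for s, p, o in triples:
        required = domains.get(p)
        # Properties without a declared domain pass through untouched.
        if required is None or required in types.get(s, set()):
            kept.append((s, p, o))
    return kept


# Hypothetical data: only ex:rep1 is described as a representation.
types = {"ex:rep1": {":Representation"}, "ex:post": {":Document"}}
domains = {"xhv:stylesheet": ":Representation"}
triples = [
    ("ex:rep1", "xhv:stylesheet", "ex:styles.css"),
    ("ex:post", "xhv:stylesheet", "ex:styles.css"),
    ("ex:post", "dc:title", "blog post title"),
]

filtered = filter_by_domain(triples, types, domains)
for triple in filtered:
    print(triple)
```

The stylesheet statement about ex:post is dropped because ex:post is not 
described as a :Representation; the same check could be applied to the 
object using a range.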

It's just information hiding: URIs hide the network/computer layer, web 
names hide the web of documents layer.

Hope that's a fair analysis and explains my "full view" of things.

Best,

Nathan

Nathan wrote:
> 
> Analysis of URI Architectures, working from:
>   http://neurocommons.org/page/WebURIArchitectures
> 
> 
> Null hypothesis
> =======================
> 
>   [
>     a :Representation ;
>     :wasRetrievedFrom [ a :NetworkResource ; :address "http://..." ]
>   ]
> 
> this exactly matches the uniform interface of HTTP. Network accessible 
> resources are not identified by URIs, and neither are representations, 
> because the link between a "resource" (REST/HTTP speak) and network 
> accessible resources is 1-*, and the link between resources and 
> representations is 1-*. Even when the "real" mappings are both 1-1, you 
> cannot prove or tell this to be true.
> 
> unclear what one would use the URI to refer to, but given the presence 
> of those statements one could set the domain/range of common properties 
> like "stylesheet" to be :Representation and thus disambiguate them 
> (change the subject to be the blank node identifier of the representation).
> 
> null hypothesis clears up one level of the web, such that <u> is not 
> used to refer to network accessible resources or representations, but 
> still leaves a partial chimera state where <u> can refer to the 
> "document" or "primary topic"/ "any thing".
> 
> 
> TimBL / httpRange-14 / Cool URIs for Semweb
> ===========================================
> 
>   <u> a :InformationResource .
> 
> this is the current (perceived) status quo, but InformationResource is 
> unclear and often used to refer to representations, documents, and 
> primary topics (where primary topic can itself be considered an 
> information resource)
> 
> this has negative side effects: on the community, of not being 
> understood; on the network, of requiring 2 GETs per URI (+ potential 
> knock on effects on provenance etc); and of people using the wrong URI 
> to refer to the thing named (the 303'd to one)
> 
> 
> Primary topic
> =============
> 
>   <u> a :Thing .
> 
> this design does remove the chimera state, by saying that <u> always 
> names a thing, but this means that <u> doesn't refer to the document and 
> in reality people still use <u> to refer to the :Document (and 
> representation and network accessible resource) - so fails.
> 
> 
> What the page says
> ==================
> 
>   <u> a :Thing .
> 
> this design does remove the chimera state, by saying that <u> always 
> names a thing, but this means that <u> doesn't refer to the document and 
> in reality people still use <u> to refer to the :Document (and 
> representation and network accessible resource) - so fails.
> 
> "what the page says" and "primary topic" are the same, just primary 
> topic has the perspective that a page/graph only primarily describes a 
> single thing.
> 
> 
> Chimera theory
> ===============
> 
>   <u> a :Representation, :NetworkResource, :Thing, :Document .
> 
> this is the current reality, and what we want to get away from!
> 
> 
> Linked data URIs refer to information resources
> ================================================
> 
> I believe the text written about this to be wrong, and understand it to 
> mean that the dereferenceable absolute-URI part of an http URI refers to 
> information resources. Indeed, this is consistent with the way TimBL 
> talks about the web and publishes data:
> 
>   <http://www.w3.org/People/Berners-Lee/card> a :Document .
>   <http://www.w3.org/People/Berners-Lee/card#i> a :Person .
> 
> This approach, the one Tim uses (fragments for things), is the only 
> approach that currently removes the chimera state between the web of 
> data and the web of documents. However, the chimera state between the 
> web of documents and representations still exists when using RDFa and 
> URIs for Documents whenever representation metadata is present.
> 
> 
> Again, I've not mentioned anything about what the "goal" of all of this 
> is. It's certainly to remove the chimera state from the web of data, to 
> get rid of httpRange-14 -or- constrain what IR means in order to solve 
> the cc:license problem. Could probably write up the goal(s) quite easily 
> and use them to check mark solutions.
> 
> Best,
> 
> Nathan
> 
> Nathan wrote:
>> resend (sent from wrong address previously)
>>
>> Hi,
>>
>> Tried to write up the problem, as I see it (keeping away from solutions
>> at the minute).
>>
>> Agree with the following summary(?):
>>
>> Competing uses of dereferenceable absolute http scheme URIs:
>>
>> 1: network accessible resource
>>    - uri as an address for some process on the web
>> 2: representation
>>    - uri as referring to a specific content+content-meta
>>      (fixed resource/simple IR)
>> 3: document / "source of information"
>>    - not well defined, typically "web of documents"
>> 4: primary topic of a document / "source of information"
>>    - when a document is primarily about one thing
>> 5: any thing
>>    - any single thing, whatever a name is commonly used to refer to
>>
>>
>> Ideal Layers of the Web:
>>
>>                            _ network accessible resources
>>   Network / Computers ----|_
>>          |                   representations
>>     [hidden by]
>>          |                 _ sources of information
>>    Web of Documents ------|_
>>          |                   documents
>>     [hidden by]
>>          |                 _ primary topic of a document
>>     Web Of Data ----------|_
>>                              any thing
>>
>>
>> Proof of each URI use:
>>
>> 1: network accessible resource
>>    <link rel="pingback" href="/pinger" />
>>    (treats @href like an address)
>>
>> 2: representation
>>    in a document you GET from <u>
>>    <link rel="stylesheet" href="/styles.css" />
>>    creates triple <u> xhv:stylesheet "<u>/styles.css" in RDFa
>>    (uses <u> to identify representation)
>>
>> 3: document / "source of information"
>>    <a href="/foo.html">foo</a>
>>
>> 4: primary topic
>>    <meta property="og:title" content="blog post title" />
>>
>> 5: any thing
>>    have you met <a href="<u>">my mum</a> ?
>>    <http://xmlns.com/foaf/0.1/name>
>>
>>
>> chimera theory:
>>
>>   "URI refers to a 'chimera' entity that has some of the properties of
>>    the page and some of the properties of either its primary topic or
>>    the entity named on the page by the URI."
>>
>> is chimera theory current reality? yes
>>   many pages, like a doc with open graph, use the same URI to refer to:
>>    - network accessible resource
>>    - representation
>>    - document
>>    - primary topic
>>    - any thing (usually also the primary topic "blog post" or "my mum")
>>
>> problem:
>> web of data / sem web requires one name to be used to refer to one thing.
>>
>> httpRange-14:
>> there exists a class of information resources, and dereferenceable http
>> scheme URIs are used to refer to these
>>
>> httpRange-14 problem, the rel cc:license problem:
>> "information resource" is loosely defined, such that when a GET pulls
>> back HTML which contains a "blog post", things from all three layers are
>> classed as an information resource
>>   - representation
>>   - the document
>>   - primary topic, the blog post.
>> does dc:created refer to representation, to the document, or the blog
>> post? does cc:license apply to the representation, the document or the
>> blog post?
>>
>> does httpRange-14 resolution remove chimera state of the web?
>>   no.
>>
>> potential next steps:
>>  - remove chimera state from web of data
>>     (possible backwards-compatibility breaks)
>>  - live with chimera state and focus on disambiguation
>>     (requires interpretation of statements, only graphs can be
>>      interpreted to the level required to disambiguate chimera state)
>>
>> Best,
>>
>> Nathan
>>
>>
>>
>>
> 
>
Received on Thursday, 17 February 2011 22:55:20 UTC