Re: Proposal, a new class of Web Names from Nathan on 2011-02-16 (public-awwsw@w3.org from February 2011)

From: Nathan <nathan@webr3.org>
Date: Wed, 16 Feb 2011 01:37:17 +0000
To: Jonathan Rees <jar@creativecommons.org>
CC: AWWSW TF <public-awwsw@w3.org>, Tim Berners-Lee <timbl@w3.org>
Message-ID: <4D5B2A4D.2040307@webr3.org>

Hi Jonathan,

I can, but you may not like it:

   for the input ( {namespace} , {name} )
     the referent of {namespace} is unknown
     the referent of {name} is unknown

{namespace} corresponds to a dereferenceable absolute-URI which is used
as an identifier for interactions between components on the network.

{namespace} also provides a scope of resolution for references to things 
mentioned, described, evoked, presented or otherwise referenced by both 
representations and contexts in which those representations or that 
{namespace} are being considered.

{name} corresponds to one in an unbounded set of references within the
scope of resolution provided by the {namespace}. These {names}, when
used within a web name, can be used as a global reference to something
which remains consistently referred to within that {namespace}.
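
A minimal Python sketch of that ( {namespace} , {name} ) split, assuming
the lexical forms given in the proposal quoted further down (optional "#",
empty string as the primary-ref). The names WebName and parse_web_name are
illustrative only and not part of the proposal, and the sketch does not
check that the namespace really is an absolute-IRI:

```python
from typing import NamedTuple


class WebName(NamedTuple):
    namespace: str  # lexical form of an absolute-IRI (not validated here)
    name: str       # "" stands for the primary-ref


def parse_web_name(lexical: str) -> WebName:
    """Split a lexical form at the first '#' into (namespace, name)."""
    # With no '#' present, partition() leaves the name as the empty string,
    # so the bare form and the trailing-'#' form give the same WebName.
    namespace, _, name = lexical.partition("#")
    return WebName(namespace, name)
```

With this, "http://example.com/foo/bar" and "http://example.com/foo/bar#"
both map to the same 2-tuple, and neither says anything about what the
namespace or the name refers to.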

Previously one would have suggested that string URIs were used for this
purpose (as intended), but some people started using dereferenceable
absolute-URIs to refer to "network accessible resources", some kind of
semi-persistent process configured for a certain job on the network;
they used them as "addresses" (even though the universal interface and
layers of information hiding present in the web and http hide this).
Meanwhile other people started using dereferenceable absolute-URIs to
refer to "documents", some representation which was consistently
returned when the URI was dereferenced (even though this too is hidden
by the universal interface and layers of information hiding on the web).

So, the WebNames proposal makes the set of those dereferenceable
absolute-URIs orthogonal for the purpose of naming, lets them be used
as addresses or whatev by people (they're just strings or compound
identifiers, or used incorrectly for the purpose of naming), and creates
a new class of names which always refer to whatever it is agreed that
they name, whatever they consistently reference.

The drawback is that you can't ever assume that X name (or style of
name) refers to Y, but then that's also the main feature: it forces you
to say what you mean, and describe what a name refers to, should you
need to use names in a machine readable way.

Essentially, this deems the whole set of issues orthogonal. If you do
a GET on an absolute-URI and receive a 200, then you got a 200 OK
response message to the message you sent, and whatever conclusions you
want to draw from that you may (in the general case, that you asked for
some info and got it). If you get a 303 then you have to decide if you
want to do another GET on a different URI, and if you get a 404 then
you're wise to conclude that you still don't know anything and your
request failed. Remember of course that every process, machine,
protocol and layer of indirection adds something extra you need to
account for when you want to have some network aware notion of "trust".
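
Roughly, as a Python sketch (the function name next_step and the wording
of the returned strings are made up for illustration; no request is
actually sent, and nothing here concludes what any name refers to):

```python
from typing import Optional


def next_step(status: int, location: Optional[str] = None) -> str:
    """Summarise what one GET response tells the client, and nothing more."""
    if status == 200:
        # A response to the message sent; conclusions are up to the client.
        return "got a response; draw your own conclusions"
    if status == 303:
        # The client chooses whether to follow up on the other URI.
        return f"decide whether to GET {location}"
    if status == 404:
        return "request failed; nothing learned"
    return "not covered by this sketch"
```

For example, next_step(303, "http://example.com/other") only reports that
a decision is pending; it is the caller who decides whether to issue the
second GET.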

But in the common, accepted, web case, the namespace has a domain name
or IP address which corresponds to a naming authority, and things named
in that space generally fall under the domain/account of that naming
authority - although you may have a hard time "proving" it in a court
of law, due to all the intermediaries like processes and machines.

As for the license case:

   :foo a x:AnythingAtAll ;
     xhv:license </license> .

And the "refer to files you don't trust"

   [] a :UntrustedNamespace ;
     :address "http://...." .

or, well however you want to describe and refer to something you don't 
trust.

Finally, to talk about "network accessible resources" and 
"representations", by default they're just blank nodes, but by all means 
give them a name and describe them, why not?

Apologies Jonathan, this way of looking at things is so different from
the "what's the link between a resource and a representation" way that
I just can't use the same "relations" to express it, because afaict
there is no relation, and neither a resource nor a representation is
actually "properly" named.

Best,

Nathan

Jonathan Rees wrote:
> I don't get this at all. I don't see you saying anything that is
> different from Harry and Ian's story. You don't avoid the licensing
> problem (at least not for RDF files), and you don't give a way to
> refer to files you don't trust. So there's no way you can claim
> neutrality.
> 
> Since I don't understand exactly what rule you're following to
> conclude what you say about these various things, maybe it would help
> if you could sketch, in pseudocode, a nose-following script.
> Input: a URI (possibly with #)
> Output: an English sentence of the form
>    The referent of '{URI}' {predicate}.
> that has the property that the sentence is true with the URI
> substituted, but not true when some similar URI is substituted. You
> can assume that an HTTP client and RDF-parsing scripts are available.
> 
> I'm happy to use 'Nathan and JAR agree that x' as a proxy for 'x is true'.
> 
> 'similar' is a judgment call but we can work that out.
> 
> For example, for input http://mumble.net/, if you believe webarch (I'm
> not saying I do), one might have:
>   The referent of 'http://mumble.net/' is an awww:informationresource
> that has an awww:representation containing the letter 'p'.
> For http://iandavis.com/2010/303/toucan, perhaps
>   The referent of 'http://iandavis.com/2010/303/toucan' is an
> awww:informationresource that has an awww:representation that is the
> serialization of an RDF graph having a node labeled with the URI
> 'http://iandavis.com/2010/303/toucan.rdf'.

for the input ( {namespace} , {name} )
   the referent of {namespace} is unknown
   the referent of {name} is unknown

for all {namespace}

an nr:NetworkResource which can be interacted with through the passing
of messages; each interaction will result in a message with a status,
and messages may include an nr:Representation
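
As a sketch of the requested script in Python, using the sentence form
asked for above (the function name nose_follow is invented here, and the
script deliberately makes no network requests, since under this proposal
no response would settle the referent):

```python
from typing import List


def nose_follow(namespace: str, name: str) -> List[str]:
    """Return one sentence per part of the web name; both referents are unknown."""
    return [
        f"The referent of '{namespace}' is unknown.",
        f"The referent of '{name}' is unknown.",
    ]
```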

> Then feed the examples through and see what happens.
> 
> And then we can iterate on this.
> 
> (Do you really mean for RDF in 404 responses to be taken at face
> value?  That seems really weird and a huge security issue.)
> 
> Jonathan
> 
> On Tue, Feb 15, 2011 at 12:00 PM, Nathan <nathan@webr3.org> wrote:
>> Jonathan Rees wrote:
>>> Let me see if I've got this:
>>>
>>> The problem (in my restatement of what you said) is that different
>>> people want to use dereferenceable URIs in different ways. In
>>> interoperability scenarios they would be seen to be fighting over
>>> ownership of linguistic territory, and the poor agent stuck in the
>>> middle attempting to combine artifacts from the two sides (e.g. using
>>> owl:imports) is going to get wrong answers.
>>>
>>> So the solution is to retract httpRange-14 (and all the other specs
>>> that say the same thing) - http://example/x instead of always meaning
>>> the document always means the same thing that http://example/x# does,
>> With WebNames both the URIs you mention above are equal to the webname
>>
>>  ( 'http://example/x' , '')
>>
>> The above uses the primary-ref for the name part, which has the in-built
>> meaning of:
>>
>>  '' isPrimaryThingReferredToInTheNamespace 'http://example/x'
>>
>> and similarly the namespace 'http://example/x' is classed as being an
>> absolute-URI and referring to a (potentially) network accessible resource.
>>
>> It has the inbuilt meaning which caters for the top right and bottom left
>> boxes in http://www.w3.org/2001/tag/2011/02/metadata-arch#slide9
>>
>> So, the webname ( 'http://dig.csail.mit.edu/breadcrumbs/node/166' , '' )
>> would refer to the "document" which we'd understand as being named
>> "Reinventing HTML", that which remains consistent, (not the Representation)
>>
>> and ( 'http://dig.csail.mit.edu/breadcrumbs/themes/spreadfirefox/logo.png' ,
>> '' ) would refer to the image which you see
>>
>> and ( 'http://iandavis.com/2010/303/toucan' , '' ) would refer to the
>> toucan, because the RDF statements say so and because the webname is being
>> consistently used to refer to said toucan.
>>
>> and ( 'http://www.w3.org/2000/01/rdf-schema' , '' ) would refer to the RDF
>> Schema vocabulary, because the RDF statements say so and because the webname
>> is being consistently used to refer to said vocabulary.
>>
>> note, all of the above use the primary-ref as the name part, obviously one
>> can also use any name within a namespace to refer to things, like
>>  ( 'http://www.w3.org/2000/01/rdf-schema' , 'label' ) which refers to a
>> Property defined by the rdf schema vocabulary.
>>
>> etc..
>>
>>> and that's defined... how?
>> Well, it'd need specified as a standard to be "defined" so that it could be
>> referenced by the normative text of specs of course..
>>
>>> According to 3986 you look at the media type and go from there
>> Which I'd suggest is wrong and needs revised to say that the "fragment" part
>> of a URI is used as the primary form of indirect referencing [blah] not
>> related to scheme specific processing [blah] fragments must be used
>> consistently across representations [blah] media types may provide a way to
>> expose locally named things globally so that they can be referred to using
>> fragments [blah] for example @id in HTML [blah].
>>
>>> so for HTML and XML it would mean an element
>> which I'd also suggest is wrong anyway, it normally refers to that which is
>> evoked by giving a certain view of some info described in an html document,
>> the correlation to an "element" is merely an indirection hook for the
>> machine so it can show the correct view - in another case it's the part of a
>> video (media fragments) and in another it's something displayed via ajaxy
>> goodness etc. The important thing is for media types to provide a way to
>> refer to things in memory/serialization that describe or refer in some way
>> to that which is named by the frag, not to constrain it to be "an element"
>> (the domain of fragment cannot be 'element', elements themselves are an
>> abstract concept!)
>>
>> the above two mini rants are orthogonal though, web names are compatible
>> with either view of the URI world, because they're a different class of
>> identifier, and their usage maps how URIs are used by humans in a way that
>> machines can hook in to and take advantage of.
>>
>>> (which could never be defined since the empty string is not a valid
>>> element id), and for RDF would be as 'defined' by the RDF.
>> indeed, that's a reason for the "primary-ref" feature of web names
>>
>>> This is nice for those who don't like # because it gives them a way to
>>> write a # URI without writing the # character, and they don't have to
>>> bother with 303.
>> and that's another reason for the primary-ref feature, and webnames
>> themselves make dereferencing completely orthogonal to the name (303, 200,
>> whatevs doesn't matter), but intrinsic to the namespace, since that part
>> refers to a network accessible resource.
>>
>>> In fact I don't see how it differs from what Ian and Harry are saying at
>>> all.
>> it's a different approach, and allowing 200 OK on a non frag URI is just one
>> of the (positive) side effects. Tis not the same as what they've been
>> saying, rather it's compatible with what they've been saying.
>>
>> but then I hope webnames are compatible with what /everybody/ has been
>> saying, you and I included..
>>
>>> Those of us who have been using dereferenceable URIs to refer to documents
>>> are left in the cold and would have to make up a new notation (see my TAG
>>> slides).
>> not at all, webnames cover that too, as covered by the examples higher up
>> this reply - no new notation needed, the WebName concept handles all of that
>> for everybody, we'd just need RDF, and optionally web architecture, to adopt
>> them..
>>
>>> To me it would seem easier just to keep with httpRange-14 and say that
>>> dereferenceable LOD URIs actually do refer to documents - specifically
>>> nodes in the LOD network.
>> But sadly they don't refer to "documents". Does http://google.com/ refer to
>> a document in the eyes of nigh on everyone on the web? No it doesn't, but
>> does http://neurocommons.org/page/WebURIArchitectures ? Yes it does.
>> Likewise http://dbpedia.org/resource/Toucan refers to a toucan, but
>> ironically if you GET it with a browser you'll think it refers to a document
>> about a toucan (thanks to the 303+conneg) and a linked data RDF client will
>> think it refers to a toucan - httpRange-14 is mismatched w/ reality..
>>
>>> The L in LOD  would be a document-to-document relation which is what I
>>> think that community wants.
>> I'd have to suggest that's the complete opposite of what everybody wants as
>> far as I'm aware.. the whole point of LOD is to be able to have thing to
>> thing relations, likewise with the semantic web, LOD just bolts on the
>> proviso that you publish those statements on the web (which I thought was
>> always the point of the semantic /web/).
>>
>> will reply to RDFa stuff under separate cover..
>>
>> Best,
>>
>> Nathan
>>
>>> Maybe we say it privately among ourselves, but at least we
>>> would have a consistent way to interpret combined OWL/LOD graphs.
>>>
>>> I came up with the following yesterday in conversation with Manu
>>> concerning use of # in RDFa: {Don't write <div id="foo" about="#foo">
>>> if you're at all concerned that the element [see media type reg.] and
>>> the thing you're calling #foo might have different properties - use
>>> different fragids in that case.} If you don't care about inference
>>> (your own or anyone else's) then of course it doesn't matter which you
>>> do, so you're happy. If you do, you use different fragids and again
>>> you're happy. Maybe the same approach would work with dereferenceable
>>> URIs.
>>>
>>> (1/2 cynical here, but want to explore options.)
>>>
>>> Jonathan
>>>
>>> On Mon, Feb 14, 2011 at 6:15 PM, Nathan <nathan@webr3.org> wrote:
>>>> Hi Guys,
>>>>
>>>> Please do read over the following and let me know what you think - might
>>>> be somewhat of a different approach ->
>>>>
>>>> [[[
>>>>
>>>> Problem Statement and Background.
>>>>
>>>> The Web has long since provided names as a way of referring to things.
>>>> From time to time the specification of these names has had to be revised,
>>>> in order to match their usage on the Web as it evolves.
>>>>
>>>> With the rise of the Semantic Web, Media Fragments and Web Applications,
>>>> the usage of these names, especially http names, has changed to become
>>>> either inconsistent with the current URI specification or simply
>>>> unspecified.
>>>>
>>>> A side effect of this new usage is that various communities have
>>>> differing opinions on just what a URI can or does refer to, and on how
>>>> those URIs can be used. This leads to tensions between communities which
>>>> are trying to converge, and in the worst case threatens the evolution of
>>>> those communities and their respective technologies.
>>>>
>>>> The web communities using these URIs share two common requirements: they
>>>> need to use absolute URIs to refer to network accessible resources, and
>>>> they require some form of indirect referencing, frequently turning to
>>>> fragment identifiers for this purpose.
>>>>
>>>> One of the most contended uses of URIs is when they are used to refer to
>>>> abstract concepts or things evoked by the processing of representations,
>>>> for example:
>>>>
>>>>  - A thing which is described within a representation, e.g. a person.
>>>>  - A particular application state or recomposable view provided by the
>>>>    application.
>>>>  - Some particular scene within a movie.
>>>>
>>>> Contentions are usually particularly high when a URI of the absolute-URI
>>>> form is used for this purpose.
>>>>
>>>> In order to address this problem, it is suggested that a new class of Web
>>>> Names is needed. A class which is disjoint with the current set of names
>>>> (URIs/IRIs), fully compatible with those names, and which models existing
>>>> naming conventions.
>>>>
>>>>
>>>> Proposal - Web Names.
>>>>
>>>> Web Names provide a web friendly way of referring to things. Each WebName
>>>> is a 2-tuple comprising a namespace and a name.
>>>>
>>>>  WebName  = ( namespace , name )
>>>>
>>>> The namespace part of a WebName takes the syntactic form of an
>>>> absolute-IRI; the namespace typically refers to a network accessible
>>>> resource.
>>>>
>>>> Each namespace has an infinite pool of locally scoped references. Within
>>>> different contexts there often exists a need to expose one of those
>>>> references, for example:
>>>>  - a reference to something which is described
>>>>  - a reference to a particular state or information view
>>>>  - a reference to a function or a variable
>>>>  - a reference to a particular time sequence and area within a video
>>>>
>>>> The name part of a WebName provides a way to expose these indirect
>>>> references. The name can take the syntactic form of the primary-ref (an
>>>> empty string) or a reference (a string consisting of one or more
>>>> characters); the name provides an anchor to refer to things named within
>>>> a namespace.
>>>>
>>>> WebNames have the following syntax:
>>>>
>>>>  web-name     =  namespace local-name
>>>>
>>>>  namespace    =  absolute-IRI
>>>>
>>>>  local-name   =  [ "#" ] primary-ref / "#" reference
>>>>
>>>>  primary-ref  =  0<ipchar>
>>>>
>>>>  reference    =  1*( ipchar / "/" / "?" )
>>>>
>>>>
>>>> Since WebNames are 2-tuples and IRIs are strings, the value space of
>>>> WebNames is completely disjoint from the value space of IRIs. However,
>>>> the lexical form of each WebName is also a valid IRI, as such:
>>>>
>>>>  IRI          =  http://example.com/foo/bar#baz1
>>>>                 \________________________/ \__/
>>>>                                 |           /
>>>>  WebName      =             ( namespace , name )
>>>>
>>>> By sharing a lexical form which always produces a valid IRI, WebNames are
>>>> fully compatible with the deployed web technologies, require no changes
>>>> to be made, and are backwards compatible with existing IRIs which have
>>>> been minted/used for the purpose of indirect referencing.
>>>>
>>>> Due to WebNames being 2-tuples, they cannot be dereferenced. This serves
>>>> to render null and void many of the most complicated and contentious
>>>> issues outlined earlier. WebNames have been designed in such a way that
>>>> communities can opt in to using them and focus on converging their
>>>> technologies rather than trying to answer unanswerable questions.
>>>>
>>>> It is often the case that a network accessible resource is configured to
>>>> provide information primarily about a single thing; for this purpose a
>>>> WebName consisting of a namespace and a primary-ref can be used.
>>>>
>>>> When the name part of a WebName is the primary-ref, then the hash ("#")
>>>> is optional, such that the WebName:
>>>>
>>>>  ( "http://example.com/foo/bar" , "" )
>>>>
>>>> can be specified using either of the following lexical forms:
>>>>
>>>>  http://example.com/foo/bar#
>>>>  http://example.com/foo/bar
>>>>
>>>> and such that both those lexical forms encode the same WebName.
>>>>
>>>> ]]]
>>>>
>>>> Still needs work, especially on the text, but I think that's enough to
>>>> get across what I'm proposing in the meantime. Thoughts and feedback more
>>>> than appreciated.
>>>>
>>>> Best,
>>>>
>>>> Nathan
>>>>
>>>
>>
> 
>
Received on Wednesday, 16 February 2011 01:38:15 UTC