Re: A Dirk and Nadia story about RDF and URIs and HTTPrange14 from David Booth on 2012-04-02 (www-tag@w3.org from April 2012)

From: David Booth <david@dbooth.org>
Date: Mon, 02 Apr 2012 14:03:40 -0400
To: Larry Masinter <masinter@adobe.com>
Cc: "www-tag@w3.org" <www-tag@w3.org>
Message-ID: <1333389820.2181.111078.camel@dbooth-laptop>

Hi Larry,

On Sun, 2012-04-01 at 22:42 -0700, Larry Masinter wrote:
> Since the TAG has hours scheduled to talk about httpRange-14
> (sigh) ...
> 
> Here's my cut:
> 
> Following the work I was developing earlier, I want to be careful to
> separate out:
> * language (protocol, protocol element)
> * descriptions of languages (dictionaries, specifications)
> * implementations (people, software, instances of HTML)
> 
> And to talk about the issue without making unnecessary (and illogical)
> assumptions, avoiding 
> * owner (URIs don't have owners)

"URI ownership" is a defined term of art in AWWW:
http://www.w3.org/TR/webarch/#uri-ownership
The word "owner" in this context does not mean the same thing as it does
in English.  A URI does not have an "owner" in the English language
sense, but it *does* have a "URI owner" in the AWWW sense.

> * minting (URIs aren't minted)

This is also a term of art in web architecture, but unfortunately it
does not have a standard definition.  Here is a rough definition, though
others may come up with something better:

  In the context of web architecture, "minting" a URI means 
  creating a URI.  It usually also implies that the URI owner 
  (in the AWWW sense) has authorized a particular definition 
  for that URI, which may range from being empty (i.e., no 
  definition at all) to very specific.

> * binding to HTTP (communication using URIs doesn't depend on HTTP
> status codes)

It may or may not.  It depends on the conventions that the communicating
parties use.

> * "information resource" vs. "non-information resource" (an
> interesting concept but no real division)

Agreed, though some choose to make a distinction.
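
One existing convention that both binds to HTTP status codes and makes this distinction is the TAG's 2005 httpRange-14 advice. As a minimal sketch of how a client following that convention might read the response to a GET on an http URI (the function name and the returned strings are illustrative only, not part of any specification):

```python
def nature_of_resource(status_code: int) -> str:
    """Rough reading of the httpRange-14 advice for a GET on an http URI."""
    if 200 <= status_code < 300:
        # 2xx: the URI identifies an information resource.
        return "information resource"
    if status_code == 303:
        # 303 See Other: the URI could identify any resource at all.
        return "any resource"
    if 400 <= status_code < 500:
        # 4xx: nothing can be concluded about the resource.
        return "unknown"
    # Other codes are outside the advice; this sketch leaves them undetermined.
    return "not covered"
```

Parties that have not agreed on this convention are of course free to ignore the status code entirely.
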
> 
> Dirk and Nadia want to have a conversation.
> 
> 1. In the old days before the web, they could communicate in a natural
> language, English or French or some other language, using words and
> syntax they both hopefully knew and understood. There were aids for
> their understanding, dictionaries (OED, American Heritage,
> Dictionnaire de l'Académie française) and other references (oh, for
> literary or historical references), but of course, communication was
> established because they had a common language, not because they were
> using the same dictionary.
> 
> And of course they could use computers and networks, sending text
> through email and instant messaging, or Dirk could leave files for
> Nadia to FTP, download and retrieve.
> 
> 2. The web introduced a great set of enhancements: mark up, in a
> markup language.  This allowed them to mark up text with styling, add
> images (which of course you could do in other ways), but also add
> links, using URLs. So now Dirk could not only say words in natural
> language, but could annotate words and phrases and images with
> hyperlinks which would lead the reader directly to additional
> information.  A communication meant something whether or not the
> links worked (a failure that led to "404 not found" didn't suddenly
> change the meaning of a communication), but the links enhanced the
> communication, to the point where it was just as reasonable to say
> click >here< for additional information and put all the meaning in the
> link itself.
> 
> XML added to the family of languages by providing a framework with
> namespaces, where a URI could indicate a namespace which then became
> the context for communication in that name space. MIME is also used to
> describe the nature of a communication to give the parties a better
> idea of what was intended.
> 
> 3. Now, we wanted to enhance the nature of the communication even
> further by extending the languages of the web to include assertions,
> triples, which might be expressed as <A> <R> <B> such that perhaps
> some kinds of automated reasoning and processing could happen. That
> enhancement (RDF) was in addition to hypertext markup, since Dirk and
> Nadia could exchange more formal expressions than those expressed in a
> natural language... it's a different framework, the links themselves
> were the communication.
> 
> The use of A within <A> <R> <B> has similar properties to the link
> in  
>             Click <a href="A">here</a> for more information
> 
> That is, if Dirk sends <A> <R> <B> to Nadia, the communication can be
> enhanced by having A, R, and B (if they are URIs) actually point to
> real information that Dirk or Nadia could use, if they're not already
> familiar with the terms.

Hmm, kind of like looking up a definition, right?  That sounds like
quite a useful convention.  ;)  But it seems to me that the usefulness
of that convention is highly dependent on the number of parties that
follow it.  After all, if publishers don't place definitions where those
definitions can be found when users like Dirk and Nadia click on the
URIs, then users won't find those definitions.  And conversely, if users
don't know that they *should* click on the URIs to find definitions, or
if they cannot tell whether the information that they retrieved *is* a
definition (because Dirk and Nadia are simply machines rather than
intelligent humans) then those definitions won't help either.  

On the other hand, if some standards organization were to Recommend that
convention, then that could significantly increase the adoption and
hence the usefulness of that convention.  :)
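
To make the "look up a definition" convention concrete, here is a minimal sketch, assuming a triple is just three strings and that only http(s) URIs are worth dereferencing.  The names `Triple`, `is_uri` and `terms_to_look_up`, and the example.org URIs, are hypothetical; no network access is performed.

```python
from typing import List, NamedTuple, Set


class Triple(NamedTuple):
    subject: str
    predicate: str
    obj: str


def is_uri(term: str) -> bool:
    # Crude check for illustration: only http(s) URIs count as "clickable".
    return term.startswith(("http://", "https://"))


def terms_to_look_up(triple: Triple, known: Set[str]) -> List[str]:
    """URIs in the triple that the receiver does not already understand."""
    return [term for term in triple if is_uri(term) and term not in known]


# Nadia already knows R, so only A and B remain to be looked up.
t = Triple("http://example.org/A", "http://example.org/R", "http://example.org/B")
pending = terms_to_look_up(t, known={"http://example.org/R"})
```

Whether dereferencing the remaining URIs actually yields definitions is exactly the part that depends on publishers following the convention.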

> This language of triples has some nice properties, but alas, it
> doesn't provide sufficient context for some purposes. If the intent is
> to talk about copyright or ownership or authorship of a work, there
> are some situations where it's not clear which URI to use in a triple,
> where "R" is "has copyright" or "has title" or "was written by".
> 
>  This is too bad, but we're just at the limit of what can be
> accomplished by this triple language.

That isn't a limit of the triple language.  That is a fundamental limit
of *any* language.  In the flickr/jamendo examples that Jonathan
mentioned, those web sites *could* have been clearer about their "has
copyright" or licensing statements if they had gone to more effort.  The
problem is that some in the LOD community feel that our currently
recommended conventions require too much effort, and we should modify
them to place less of a burden on publishers when they 
> 
> "httpRange-14" tried to invent some mechanism to disambiguate, but the
> problem of ambiguity is intrinsic in the nature of hyperlinks. 

No, it is intrinsic to language itself, regardless of whether the
language uses URIs or any other terms, such as words in English.

> We can't "fix" hyperlinks for the triple language without making them
> more confusing for their other purposes.

There may be some cost, but if we succeed in making the conventions
easier, then there will also be benefits, so we'll have to weigh the costs
and benefits.  I don't think we should make assumptions about the costs
or benefits before we evaluate specific proposals.

> Some notes:
> 
> There are no "owners" of URIs here. 

Not in the English word sense, but there are in the AWWW sense.

> Dirk and Nadia use URIs for communication. Maybe they're both also
> engaged in establishing some web content so that they can use URIs for
> that web content to enhance their conversation, one for the other,
> maybe there are many people engaged in the conversation, but that's
> pretty irrelevant when talking about their communication using those
> URIs.
> 
> There is no process of "mint" here. Dirk and Nadia communicate, and
> they can "mint" words in natural language or in triples but doing so
> is outside of the scope of discussion of their communication.

That depends on the communication conventions that they use.  It may be
quite useful to be able to distinguish between a URI definition that was
issued by the URI's "owner" (in the AWWW sense) and a URI definition
that was issued by some random third party.  If they are using a
convention that says "Let's use the URI owners' definitions, unless
otherwise stated" then the distinction is quite useful.
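
A minimal sketch of such a convention follows.  It assumes, purely for illustration, that the authority (host) part of an http URI can stand in for its "URI owner" and that each definition is keyed by the authority that issued it; `owner_authority`, `pick_definition` and the example hosts are hypothetical names.

```python
from typing import Dict, Optional
from urllib.parse import urlsplit


def owner_authority(uri: str) -> str:
    # Assumption: the host part of the URI stands in for the AWWW "URI owner".
    return urlsplit(uri).netloc.lower()


def pick_definition(uri: str,
                    definitions: Dict[str, str],
                    stated_issuer: Optional[str] = None) -> Optional[str]:
    """Use the owner's definition unless another issuer is explicitly stated.

    `definitions` maps an issuer's authority to the definition it issued
    for `uri`.  Returns None if the chosen issuer supplied no definition.
    """
    issuer = stated_issuer if stated_issuer is not None else owner_authority(uri)
    return definitions.get(issuer)


defs = {
    "example.org": "owner's definition",
    "thirdparty.example": "third-party definition",
}
uri = "http://example.org/term#A"
default_choice = pick_definition(uri, defs)
explicit_choice = pick_definition(uri, defs, "thirdparty.example")
```

Without some notion of who issued a definition, the default branch above would have nothing to select on.
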
> 
> There is no notion of "resource" and "representation" here. It's an
> artificial division useful for talking about content negotiation and
> so on, but unnecessary for this story. 

Probably not, but again, it depends on what conventions Dirk and Nadia
are using.

> There's no need to talk about two resources being the "same", or using
> "different" URIs for the "same" resource. 

But surely Dirk or Nadia may want to say:

  A == B

i.e., A and B refer to the same thing.
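
In RDF, one common way to write that statement is a triple whose predicate is owl:sameAs.  A minimal sketch, assuming triples are plain 3-tuples of strings and treating sameAs as symmetric and transitive (the function name `same_thing` and the example.org URIs are hypothetical):

```python
from collections import defaultdict

OWL_SAME_AS = "http://www.w3.org/2002/07/owl#sameAs"


def same_thing(a, b, triples):
    """True if a and b are linked by a chain of owl:sameAs statements."""
    neighbours = defaultdict(set)
    for s, p, o in triples:
        if p == OWL_SAME_AS:
            # sameAs is symmetric, so record the link in both directions.
            neighbours[s].add(o)
            neighbours[o].add(s)
    # Depth-first search follows chains of sameAs links (transitivity).
    seen, stack = {a}, [a]
    while stack:
        node = stack.pop()
        if node == b:
            return True
        for nxt in neighbours[node] - seen:
            seen.add(nxt)
            stack.append(nxt)
    return False


statements = [
    ("http://example.org/A", OWL_SAME_AS, "http://example.org/B"),
    ("http://example.org/C", OWL_SAME_AS, "http://example.org/B"),
]
```

With these two statements, A and C count as the same thing even though no single triple says so directly.
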
> 
> There's no separation of "information resource" vs. "general
> resource".  

Probably not, but it depends on what Dirk and Nadia are talking about,
and you haven't given enough detail to determine that.

David

> Dirk and Nadia communicate using URIs. Sometimes they use URIs to talk
> about things which cannot be easily captured in a data representation,
> but ... "there is no spoon": the world is also all data.



-- 
David Booth, Ph.D.
http://dbooth.org/

Opinions expressed herein are those of the author and do not necessarily
reflect those of his employer.
Received on Monday, 2 April 2012 18:04:07 UTC