URI declaration versus use (Was: Terminology Question)

Before discussing how to name what you get when you dereference a
non-information resource URI, we should be clearer about what it is that
we wish to name.  In discussions thus far on this list, I think there
has been insufficient differentiation between assertions that are part
of a URI declaration, and normal assertions about a resource.  This
prompted me to write up several thoughts that I've had for some time
that I hope will add value to this discussion:

Title: URI Declaration Versus Use
URL: http://dbooth.org/2007/uri-decl/
Abstract:
[[
It is important to distinguish between a URI declaration and regular
assertions about the URI's associated resource.  This distinction is not
readily apparent in RDF, because URIs are declared implicitly in RDF.
The problem becomes apparent when the URI of a non-information resource
is dereferenced in an attempt to locate related information.  This paper
motivates and explains this distinction, defines the notions of URI
declaration and URI declaration page, and suggests some related best
practices.
]]

Comments on this document are invited.  To facilitate comment on
specific portions, the content is also included below in plain text.
However, the HTML version is easier to read.

=======================================================================

URI Declaration Versus Use

David Booth, Ph.D.
HP Software
  Comments are invited: dbooth@hp.com

Latest version: http://dbooth.org/2007/uri-decl/
Views expressed herein are those of the author and do not necessarily
reflect those of HP.

Abstract

It is important to distinguish between a URI declaration and regular
assertions about the URI's associated resource.  This distinction is not
readily apparent in RDF, because URIs are declared implicitly in RDF.
The problem becomes apparent when the URI of a non-information resource
is dereferenced in an attempt to locate related information.  This paper
motivates and explains this distinction, defines the notions of URI
declaration and URI declaration page, and suggests some related best
practices.

Table of Contents

* Introduction
  o Example: A URI for the Moon
* URI declaration
    # Definition of "URI declaration"
    # Suggested practice P1 (URI declaration should distinguish the
      resource)
    # Definition of "URI declaration page"
  o Names versus resources
  o Components of a URI declaration
* Web architecture and implicit URI declarations
  o The "following your nose" algorithm
    # Suggested practice P2 (Use follow-your-nose algorithm to publish
      URI declarations)
  o Proposed rule for implicit URI declarations
    # Proposed rule R1 (Publication with follow-your-nose algorithm
      represents implicit URI declaration)
    # Proposed rule R2 (converse of R1)
    # Suggested practice P3
    # Suggested practice P4
* Explicit URI declaration in RDF


Introduction

When an HTTP URI is used to name something that is not a web page or web
site (i.e., not an information resource), it is important to distinguish
between the declaration of that URI as a name for a particular resource,
and regular assertions about that resource.  This difference is
important to Web architecture and to other parties that wish to use the
URI in assertions about the resource.  The issue arises when another
party attempts to dereference the URI in order to learn about the URI
and its associated resource.  The other party may wish to make use of
the URI as a means of referring to the resource, without necessarily
believing other assertions that are made about the resource.

This difference is particularly confusing in RDF.  Many programming
languages
distinguish between variable declarations and variable use, but RDF does
not
have a corresponding mechanism for URI declaration.  Thus, when RDF
statements
are served from a URI, it may not be evident which of those RDF
statements are
intended to constitute a URI declaration and which are intended to be
regular
assertions about the resource.  They all look the same.  In fact, given
an RDF
triple, there is no way to determine, by examining the triple, whether
that
triple should be considered a part of the URI declaration or a regular
assertion about the resource.  It is up to the URI owner to indicate
this
distinction.

This paper describes the distinction between URI declaration and use,
and suggests some best practices.  Even though this paper is written in
terms of URIs, the concepts apply equally to IRIs.  (See RFC 3986 and
RFC 3987 for advice on minting URIs and IRIs.)  The following example
will be used to illustrate the ideas.

Example: A URI for the Moon

Suppose I mint a URI for the moon: http://dbooth.org/2007/moon/ .  I own
the domain dbooth.org, so I have the authority to do so.  (See URI
ownership.)  Since the moon is not an information resource, in
conformance with the W3C TAG's httpRange-14 decision I have configured
my server such that an attempt to dereference that URI will result in a
303-redirect to http://dbooth.org/2007/moon/descr.html , which, when
dereferenced, returns a page containing the following statements:

Statement M1: The URI http://dbooth.org/2007/moon/ hereby names a
particular resource, such that:
    a: http://dbooth.org/2007/moon/ is a moon.
    b: http://dbooth.org/2007/moon/ orbits the Earth.

Statement M2: http://dbooth.org/2007/moon/ is made of green cheese.

Statement M3: For more information about http://dbooth.org/2007/moon/ ,
see also http://dbooth.org/2007/moon/about.html .

The role of these statements is discussed below.
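
The server behaviour described in this example can be sketched in a few
lines of Python.  This is only an illustration of the 303-redirect
arrangement, not the actual configuration of dbooth.org: the function
name respond and the returned (status, headers) shape are assumptions
made for the sketch, and only the two paths come from the example.

```python
from typing import Dict, Tuple

# Paths taken from the moon example; everything else is hypothetical.
MOON_URI_PATH = "/2007/moon/"
DECLARATION_PATH = "/2007/moon/descr.html"


def respond(path: str) -> Tuple[int, Dict[str, str]]:
    """Return (status code, headers) for a GET on the given path."""
    if path == MOON_URI_PATH:
        # The moon is not an information resource, so do not answer 200.
        # Redirect with 303 to the page that describes it instead.
        return 303, {"Location": DECLARATION_PATH}
    if path == DECLARATION_PATH:
        # The description page is an ordinary information resource.
        return 200, {"Content-Type": "text/html"}
    return 404, {}
```

A request for /2007/moon/ is answered with a 303 and a Location header,
while a request for /2007/moon/descr.html is answered directly.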

URI declaration

Definition: A URI declaration is a set of statements that
authoritatively
declare the association between a URI and a particular resource.

A URI declaration is a performative speech act.[@@ref?@@]  Its
publication by
someone who has the authority to make the declaration -- i.e., the URI
owner or
delegate -- defines the association between a URI and a resource.
Therefore,
another party wishing to use that URI to denote that resource should
take all
assertions that constitute part of that URI declaration as true by
definition. 
This is a take-it-or-leave-it proposition: If you do not want to believe
the
assertions in the URI declaration, then you should not use that URI,
because,
in essence, you are trying to talk about a different resource -- one
that
shares some, but not all, of the same characteristics. 

Suggested practice P1: A URI declaration should include sufficient
information
to distinguish the named resource from other resources, such that other
parties
can use the URI confidently to make statements about the resource. [@@Is
there
a WebArch ref for this?@@]

For example, statement M1.a above ("http://dbooth.org/2007/moon/ is a
moon") is not sufficient to uniquely identify the intended resource,
because there are many moons.  However, M1.a and M1.b together are
sufficient to uniquely identify the intended resource, at least for many
purposes.  Beware that sufficient information for one purpose may not be
sufficient information for another purpose.  Pat Hayes has several times
pointed out that one application may require finer (or different)
distinctions than another.[@@add ref@@]  Thus, P1 is a guideline -- not
a hard and fast rule.

Definition: A URI declaration page is an information resource whose
primary
purpose is to provide URI declarations.

A single URI declaration page could also contain declarations for
multiple
URIs.  Thus, the relationship between URI declaration pages and
resources is
many-to-many. 

Names versus resources

We are treating a URI as a name for a resource, so that when the name is
used in an assertion about the resource, it will be understood as
referring to the resource.  But the treatment of a name in an explicit
name declaration is very different: it is treated simply as a literal
sequence of characters.  Thus, in the URI declaration phrase "The URI
http://dbooth.org/2007/moon/ hereby names . . .",
http://dbooth.org/2007/moon/ refers only to a sequence of characters
that conforms to URI syntax, whereas in the statement
"http://dbooth.org/2007/moon/ is a moon" it refers to a resource.  In
other words, the subject of a URI declaration as a whole (such as M1) is
a URI string -- not a resource -- whereas the subject of a regular
assertion is a resource, even though some subordinate parts of the URI
declaration (such as M1.a and M1.b) may use resources as subjects.

This distinction is readily apparent in a language like Java that uses
explicit
name declarations, but not in RDF, because RDF does not have explicit
name
declarations.  Nonetheless, the difference is important because other
parties
wishing to use http://dbooth.org/2007/moon/ to make statements about the
moon
need to know whether a statement like M2, "http://dbooth.org/2007/moon/
is made
of green cheese", is a subordinate part of the URI declaration or a
separate
statement about the moon.

Components of a URI declaration

More precisely, a URI declaration consists of:

  1. a URI u;
  2. a predicate p(x), where x is a resource; and
  3. a performative speech act, issued by the URI's owner or delegate,
     that indicates u and p(x).

The URI declaration can be understood as stating:

"If a resource r exists such that p(r) is true, then henceforth u
denotes r.
Otherwise, if no such resource exists, the URI declaration is
malformed."
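
As an informal illustration of this reading, the three components and
the "denotes r, otherwise malformed" rule can be modelled over a toy,
finite universe of resources.  Everything in this sketch is an
assumption made for illustration: the names UriDeclaration and
denoted_resource, the dictionary representation of a resource, and the
boolean that stands in for the performative speech act.

```python
from dataclasses import dataclass
from typing import Callable, Dict, Iterable

Resource = Dict[str, str]  # toy stand-in for a resource (assumption)


@dataclass(frozen=True)
class UriDeclaration:
    uri: str                               # component 1: the URI u, as a string
    predicate: Callable[[Resource], bool]  # component 2: the predicate p(x)
    issued_by_owner: bool                  # component 3: stands in for the speech act


def denoted_resource(decl: UriDeclaration,
                     universe: Iterable[Resource]) -> Resource:
    """Return a resource r with p(r) true, or raise if there is none."""
    if not decl.issued_by_owner:
        # Pairing u with p(x) alone is not a declaration.
        raise ValueError("no performative speech act by the URI owner or delegate")
    for r in universe:
        if decl.predicate(r):
            return r  # henceforth u denotes r
    raise ValueError("malformed declaration: no resource satisfies p(x)")


# Toy universe containing two moons.
MOON = {"kind": "moon", "orbits": "Earth", "made_of": "rock"}
PHOBOS = {"kind": "moon", "orbits": "Mars", "made_of": "rock"}
UNIVERSE = [PHOBOS, MOON]


def m1(x: Resource) -> bool:
    """Conjunction of M1.a and M1.b."""
    return x["kind"] == "moon" and x["orbits"] == "Earth"


def m1_and_m2(x: Resource) -> bool:
    """M1 with M2 (green cheese) wrongly folded into p(x)."""
    return m1(x) and x["made_of"] == "green cheese"


good = UriDeclaration("http://dbooth.org/2007/moon/", m1, True)
bad = UriDeclaration("http://dbooth.org/2007/moon/", m1_and_m2, True)
```

Here denoted_resource(good, UNIVERSE) returns the MOON entry, whereas
denoted_resource(bad, UNIVERSE) raises ValueError because nothing in the
toy universe satisfies the combined predicate.  The sketch returns the
first match and does not check uniqueness, which is left to practice P1.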

It is important to realize that the mere pairing of u and p(x) does not
constitute a URI declaration without a distinguishable speech act.
Thus, a
critical aspect of any mechanism for making URI declarations is the
ability to
distinguish the performative speech act from other, normal speech.
There are
many ways this can be done; usually context is involved.

In the moon example above, URI u is http://dbooth.org/2007/moon/ ,
predicate p(x) is the conjunction of M1.a and M1.b, and x is the moon.
Note that if M2 ("http://dbooth.org/2007/moon/ is made of green cheese")
had also been a part of p(x) then the URI declaration would have been
malformed, since there is no moon that orbits the Earth and is made of
green cheese.  The performative speech act is the act of publishing
statement M1 ("The URI http://dbooth.org/2007/moon/ hereby names
. . .").  In this example, the English phrasing ". . . hereby names
. . ." distinguishes this performative speech act from M2, which is
intended as normal speech.

The word "authoritative" has sometimes caused confusion in discussions
of URI
declarations: if a URI 303-redirects to a URI declaration page, in what
sense
is that page "authoritative"?   A URI declaration page is authoritative
in its
URI declarations -- i.e., in declaring that URI to be a name for a
particular
resource -- but that does not mean that the assertions that the page
contains
are necessarily true.

Web architecture and implicit URI declarations

How should URI declarations be indicated on the Web?

The "following your nose" algorithm

[Editorial note: Somewhere a more precise definition of this algorithm
should
be provided.  I didn't bother to do so here, but it is needed.  --
DBooth]

Given a URI, it is very helpful to others if that URI's declaration page
can be
readily located, using the URI as a starting point:

Suggested practice P2: URI owners should mint and support their URIs
such that an attempt to dereference a URI of a non-information resource
will lead to a URI declaration page for that URI, using one of the
following mechanisms:

* If the URI contains a fragment identifier, then the racine of the URI
  (i.e., the part before the #) should lead to a suitable URI
  declaration page.
* If the URI does not contain a fragment identifier, then an attempt to
  dereference the URI should yield a 303-redirect that leads to a
  suitable URI declaration page.

[@@Is there a WebArch reference for this?@@]

Thus, http://dbooth.org/2007/moon/ 303-redirects to its URI declaration
page at
http://dbooth.org/2007/moon/descr.html .
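
The two mechanisms in P2 can be sketched from the client side as
follows.  This is a rough illustration, not the precise definition of
the follow-your-nose algorithm that the editorial note above asks for.
The function name find_declaration_page and the head callback are
assumptions; the callback stands in for an HTTP request that does not
follow redirects, so the sketch runs without network access.

```python
from typing import Callable, Optional, Tuple
from urllib.parse import urldefrag, urljoin

# A head(url) callback returns (status code, Location header or None).
Head = Callable[[str], Tuple[int, Optional[str]]]


def find_declaration_page(uri: str, head: Head) -> Optional[str]:
    """Return the URL of the URI declaration page for uri, if one is found."""
    racine, fragment = urldefrag(uri)
    if fragment:
        # Hash mechanism: the part before the # leads to the page.
        return racine
    # 303 mechanism: ask the server and look for a See Other redirect.
    status, location = head(uri)
    if status == 303 and location:
        # Location may be relative, so resolve it against the original URI.
        return urljoin(uri, location)
    return None


def fake_head(url: str) -> Tuple[int, Optional[str]]:
    """Canned responses mirroring the moon example (test double)."""
    table = {
        "http://dbooth.org/2007/moon/":
            (303, "http://dbooth.org/2007/moon/descr.html"),
    }
    return table.get(url, (200, None))
```

With the canned responses, find_declaration_page applied to
http://dbooth.org/2007/moon/ yields the descr.html URL, and a
hypothetical hash URI such as http://example.org/terms#moon yields
http://example.org/terms without any request being made.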

Proposed rule for implicit URI declarations

Page http://dbooth.org/2007/moon/descr.html uses English both to make
clear
that a URI declaration is intended, and to distinguish between the URI
declaration and regular assertions about the moon.  But what should be
done in
other cases, such as RDF, that do not have a mechanism for explicit URI
declarations?

I propose that the Web architecture treat the act of serving a page
using
either of the above two follow-your-nose mechanisms -- hash or 303 -- as
a
performative speech act of URI declaration:

Proposed rule R1:  Given a URI u, if either of the follow-your-nose
mechanisms
described above yields a representation r, then, unless otherwise
indicated,
the conjunction of assertions made in r represents an implicit URI
declaration
for u.

And the converse:

Proposed rule R2: Unless otherwise indicated (such as by rule R1 or by
some explicit indication), publication of assertions about a resource
denoted by a URI should not be construed as a performative speech act of
declaring that URI.

This does not mean that rule R1 should be the only way to declare a URI.
There
could be other mechanisms also, particularly explicit mechanisms.

Rule R1 clearly has the first two components of a URI declaration, but
what is
the performative speech act?  First, publication of the page --
regardless of
the URI that leads to it -- represents the utterance of the declaration.

Second, the follow-your-nose algorithm provides prima facie evidence
that the
declaration is authorized by the owner of the originating URI.  This is
important because the domain name in the URI of the declaration page
could be
quite different from the domain name of the original resource URI.  This
act of
publishing the page in response to the follow-your-nose algorithm from
the
original URI is what distinguishes this performative speech act from
other,
normal speech.

Rule R1 also implies that, unless otherwise indicated, every assertion
in the
page obtained should be considered a part of the URI declaration.
Therefore:

Suggested practice P3: A URI declaration page should avoid making
assertions
about the URI's associated resource that are not intended to be a part
of that
URI's declaration.

In the moon example above, this means that statement M2
("http://dbooth.org/2007/moon/ is made of green cheese") should not be
included in an equivalent RDF page, because if it were it would be
considered a part of the URI declaration and the URI
http://dbooth.org/2007/moon/ would thus be unusable to parties who wish
to refer to the moon and do not choose to believe the moon is made of
green cheese.  On the other hand, statement M3 ("For more information
about http://dbooth.org/2007/moon/ , see also
http://dbooth.org/2007/moon/about.html") is safe to include in the URI
declaration page, because it is merely a suggestion: it does not affect
the satisfiability of p(x).  Notice that by rule R2, page
http://dbooth.org/2007/moon/about.html should not be interpreted as a
URI declaration page for http://dbooth.org/2007/moon/ .
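
To make the effect of R1, R2 and P3 concrete, here is a small sketch of
how a consumer might decide which statements belong to the implicit
declaration.  It is a simplification under stated assumptions: the
Statement class, the marked_regular flag (a stand-in for "otherwise
indicated") and the function implicit_declaration are all hypothetical,
and R2 is reduced to "nothing is declared" when no follow-your-nose
mechanism was involved.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class Statement:
    text: str
    # Hypothetical flag for "otherwise indicated"; plain RDF has no such marker.
    marked_regular: bool = False


def implicit_declaration(statements: List[Statement],
                         reached_by_follow_your_nose: bool) -> List[Statement]:
    """Return the statements treated as the implicit URI declaration."""
    if not reached_by_follow_your_nose:
        # R2 (simplified): plain publication is not a declaration.
        return []
    # R1: everything counts unless otherwise indicated.
    return [s for s in statements if not s.marked_regular]


M1A = Statement("http://dbooth.org/2007/moon/ is a moon.")
M1B = Statement("http://dbooth.org/2007/moon/ orbits the Earth.")
M2 = Statement("http://dbooth.org/2007/moon/ is made of green cheese.")
PAGE = [M1A, M1B, M2]
```

Calling implicit_declaration(PAGE, True) returns all three statements,
so M2 is swept into the declaration, which is exactly the situation P3
advises against.  Calling implicit_declaration(PAGE, False), as for a
page reached only through a "see also" link, returns an empty list.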

This also means that if several URIs share the same URI declaration
page, examination of the URI declaration page via one of those URIs will
not necessarily indicate whether the other URIs are also being declared.
To avoid the inefficiency of having to dereference each of those URIs in
order to determine their URI declarations, either specialized URI
prefixes can be defined (as described in "Converting New URI Schemes or
URN Sub-Schemes to HTTP"), or explicit URI declaration mechanisms could
be defined, such as the one proposed below.

If a URI declaration page only contains URI declarations, how can other
parties
find other information about the associated resources?

Suggested practice P4: A URI declaration page should provide links to
other
information about the resources whose URIs are declared by that page. 

This does not mean that a URI owner should be responsible for providing
links to all other information about the associated resource.  But
providing links to other known sources of information would be helpful
to others, and the URI declaration page is a logical starting place to
look for such links.  It should be understood that providing a link does
not imply any particular endorsement.

Explicit URI declaration in RDF

I do not know of any explicit URI declaration predicate that has already
been defined for RDF -- please tell me if there is one -- but it would
be easy to define one using named graphs:

If g is the URI of a named graph, and u is a URI, then the following N3
statements provide an explicit URI declaration for u:

@prefix dbooth: <http://t-d-b.org?http://dbooth.org/2007/uri-decl/#> .
g dbooth:declares "u" .

Note the quotes around URI u, because in the declaration context it must
be
treated as a literal string -- not a reference to a resource.
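
For a concrete graph URI and declared URI, the two N3 lines could be
generated as in the sketch below.  The function name
explicit_declaration_n3 is an assumption, as is writing the graph URI in
angle brackets (ordinary N3 syntax for a URI reference); the prefix and
the dbooth:declares predicate are the ones proposed above, and the graph
URI in the usage note is hypothetical.

```python
DECL_PREFIX = "http://t-d-b.org?http://dbooth.org/2007/uri-decl/#"


def explicit_declaration_n3(graph_uri: str, declared_uri: str) -> str:
    """Build N3 text stating that the named graph declares the given URI."""
    # The declared URI is emitted as a quoted literal string, not as a
    # reference to a resource, so escape backslashes and quotes.
    literal = declared_uri.replace("\\", "\\\\").replace('"', '\\"')
    return (
        f"@prefix dbooth: <{DECL_PREFIX}> .\n"
        f'<{graph_uri}> dbooth:declares "{literal}" .\n'
    )
```

For example, explicit_declaration_n3("http://example.org/moon-graph",
"http://dbooth.org/2007/moon/") produces the prefix line followed by a
statement whose object is the quoted string
"http://dbooth.org/2007/moon/".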

Acknowledgements

Thanks to Jeremy Carroll for review comments.

Comments by all are invited.  If I have missed a reference that I should
have
included, please let me know.

------------------------------------------------------------------------
30-Jul-2007: Added TOC, clarified speech act, misc minor fixes.
25-Jul-2007: Original draft.

=======================================================================

David Booth, Ph.D.
HP Software
+1 617 629 8881 office  |  dbooth@hp.com
http://www.hp.com/go/software

Opinions expressed herein are those of the author and do not represent
the official views of HP unless explicitly stated otherwise.

Received on Tuesday, 31 July 2007 19:00:03 UTC