- From: Booth, David (HP Software - Boston) <dbooth@hp.com>
- Date: Tue, 31 Jul 2007 14:58:31 -0400
- To: "Pat Hayes" <phayes@ihmc.us>, "Hugh Glaser" <hg@ecs.soton.ac.uk>
- Cc: "Tim Berners-Lee" <timbl@w3.org>, "Chris Bizer" <chris@bizer.de>, <www-tag@w3.org>, <semantic-web@w3.org>, "Linking Open Data" <linking-open-data@simile.mit.edu>
Before discussing how to name what you get when you dereference a non-information resource URI, we should be clearer about what it is that we wish to name. In discussions thus far on this list, I think there has been insufficient differentiation between assertions that are part of a URI declaration, and normal assertions about a resource. This prompted me to write up several thoughts that I've had for some time that I hope will add value to this discussion:

Title: URI Declaration Versus Use
URL: http://dbooth.org/2007/uri-decl/
Abstract: [[
It is important to distinguish between a URI declaration and regular assertions about the URI's associated resource. This distinction is not readily apparent in RDF, because URIs are declared implicitly in RDF. The problem becomes apparent when the URI of a non-information resource is dereferenced in an attempt to locate related information. This paper motivates and explains this distinction, defines the notions of URI declaration and URI declaration page, and suggests some related best practices.
]]

Comments on this document are invited. To facilitate comment on specific portions, the content is also included below in plain text. However, the HTML version is easier to read.

=======================================================================

URI Declaration Versus Use

David Booth, Ph.D.
HP Software
Comments are invited: dbooth@hp.com
Latest version: http://dbooth.org/2007/uri-decl/

Views expressed herein are those of the author and do not necessarily reflect those of HP.

Abstract

It is important to distinguish between a URI declaration and regular assertions about the URI's associated resource. This distinction is not readily apparent in RDF, because URIs are declared implicitly in RDF. The problem becomes apparent when the URI of a non-information resource is dereferenced in an attempt to locate related information.
This paper motivates and explains this distinction, defines the notions of URI declaration and URI declaration page, and suggests some related best practices.

Table of Contents

* Introduction
    o Example: A URI for the Moon
* URI declaration
    # Definition of "URI declaration"
    # Suggested practice P1 (URI declaration should distinguish the resource)
    # Definition of "URI declaration page"
    o Names versus resources
    o Components of a URI declaration
* Web architecture and implicit URI declarations
    o The "following your nose" algorithm
        # Suggested practice P2 (Use follow-your-nose algorithm to publish URI declarations)
    o Proposed rule for implicit URI declarations
        # Proposed rule R1 (Publication with follow-your-nose algorithm represents implicit URI declaration)
        # Proposed rule R2 (converse of R1)
        # Suggested practice P3
        # Suggested practice P4
* Explicit URI declaration in RDF

Introduction

When an HTTP URI is used to name something that is not a web page or web site (i.e., not an information resource), it is important to distinguish between the declaration of that URI as a name for a particular resource, and regular assertions about that resource. This difference is important to Web architecture and to other parties that wish to use the URI in assertions about the resource.

The issue arises when another party attempts to dereference the URI in order to learn about the URI and its associated resource. The other party may wish to make use of the URI as a means of referring to the resource, without necessarily believing other assertions that are made about the resource.

This difference is particularly confusing in RDF. Many programming languages distinguish between variable declarations and variable use, but RDF does not have a corresponding mechanism for URI declaration.
Thus, when RDF statements are served from a URI, it may not be evident which of those RDF statements are intended to constitute a URI declaration and which are intended to be regular assertions about the resource. They all look the same. In fact, given an RDF triple, there is no way to determine, by examining the triple, whether that triple should be considered a part of the URI declaration or a regular assertion about the resource. It is up to the URI owner to indicate this distinction.

This paper describes the distinction between URI declaration and use, and suggests some best practices. Even though this paper is written in terms of URIs, the concepts apply equally to IRIs. (See RFC 3986 and RFC 3987 for advice on minting URIs and IRIs.)

The following example will be used to illustrate the ideas.

Example: A URI for the Moon

Suppose I mint a URI for the moon: http://dbooth.org/2007/moon/ . I own the domain dbooth.org, so I have the authority to do so. (See URI ownership.) Since the moon is not an information resource, in conformance with the W3C TAG's httpRange-14 decision I have configured my server such that an attempt to dereference that URI will result in a 303-redirect to http://dbooth.org/2007/moon/descr.html , which, when dereferenced, returns a page containing the following statements:

    Statement M1: The URI http://dbooth.org/2007/moon/ hereby names a particular resource, such that:
        a: http://dbooth.org/2007/moon/ is a moon.
        b: http://dbooth.org/2007/moon/ orbits the Earth.

    Statement M2: http://dbooth.org/2007/moon/ is made of green cheese.

    Statement M3: For more information about http://dbooth.org/2007/moon/ , see also http://dbooth.org/2007/moon/about.html .

The role of these statements is discussed below.

URI declaration

Definition: A URI declaration is a set of statements that authoritatively declare the association between a URI and a particular resource.
A URI declaration is a performative speech act.[@@ref?@@] Its publication by someone who has the authority to make the declaration -- i.e., the URI owner or delegate -- defines the association between a URI and a resource. Therefore, another party wishing to use that URI to denote that resource should take all assertions that constitute part of that URI declaration as true by definition. This is a take-it-or-leave-it proposition: If you do not want to believe the assertions in the URI declaration, then you should not use that URI, because, in essence, you are trying to talk about a different resource -- one that shares some, but not all, of the same characteristics.

Suggested practice P1: A URI declaration should include sufficient information to distinguish the named resource from other resources, such that other parties can use the URI confidently to make statements about the resource. [@@Is there a WebArch ref for this?@@]

For example, statement M1.a above ("http://dbooth.org/2007/moon/ is a moon") is not sufficient to uniquely identify the resource, because there are many moons. However, M1.a and M1.b together are sufficient to uniquely identify the intended resource, at least for many purposes.

Beware that sufficient information for one purpose may not be sufficient information for another purpose. Pat Hayes has several times pointed out that one application may require finer (or different) distinctions than another.[@@add ref@@] Thus, P1 is a guideline -- not a hard and fast rule.

Definition: A URI declaration page is an information resource whose primary purpose is to provide URI declarations.

A single URI declaration page could also contain declarations for multiple URIs. Thus, the relationship between URI declaration pages and resources is many-to-many.

Names versus resources

We are treating a URI as a name for a resource, so that when the name is used in an assertion about the resource, it will be understood as referring to the resource.
But the treatment of a name in an explicit name declaration is very different: it is treated simply as a literal sequence of characters. Thus, in the URI declaration phrase "The URI http://dbooth.org/2007/moon/ hereby names . . .", http://dbooth.org/2007/moon/ refers only to a sequence of characters that conforms to URI syntax, whereas in the statement "http://dbooth.org/2007/moon/ is a moon" it refers to a resource. In other words, the subject of a URI declaration as a whole (such as M1) is a URI string -- not a resource -- whereas the subject of a regular assertion is a resource, even though some subordinate parts of the URI declaration (such as M1.a and M1.b) may use resources as subjects.

This distinction is readily apparent in a language like Java that uses explicit name declarations, but not in RDF, because RDF does not have explicit name declarations. Nonetheless, the difference is important because other parties wishing to use http://dbooth.org/2007/moon/ to make statements about the moon need to know whether a statement like M2, "http://dbooth.org/2007/moon/ is made of green cheese", is a subordinate part of the URI declaration or a separate statement about the moon.

Components of a URI declaration

More precisely, a URI declaration consists of:

   1. a URI u;
   2. a predicate p(x), where x is a resource; and
   3. a performative speech act, issued by the URI's owner or delegate, that indicates u and p(x).

The URI declaration can be understood as stating: "If a resource r exists such that p(r) is true, then henceforth u denotes r. Otherwise, if no such resource exists, the URI declaration is malformed."

It is important to realize that the mere pairing of u and p(x) does not constitute a URI declaration without a distinguishable speech act. Thus, a critical aspect of any mechanism for making URI declarations is the ability to distinguish the performative speech act from other, normal speech. There are many ways this can be done; usually context is involved.
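
As a rough illustration only, the three components and the "if a resource r exists such that p(r) is true" reading can be sketched in Python. Everything named here is an assumption made for the sketch and is not part of the paper: the class UriDeclaration, the function denoted_resource, the boolean issued_by_owner (a crude stand-in for the speech act), and the small finite "universe" of candidate resources with made-up attribute names.

```python
from dataclasses import dataclass
from typing import Callable, Iterable


@dataclass(frozen=True)
class UriDeclaration:
    uri: str                            # component 1: the URI u, held as a plain string
    predicate: Callable[[dict], bool]   # component 2: the predicate p(x)
    issued_by_owner: bool               # component 3: stand-in for the performative speech act


def denoted_resource(decl: UriDeclaration, universe: Iterable[dict]) -> dict:
    """Return the first resource r in `universe` for which p(r) is true.

    Raises ValueError when there is no speech act by the owner, or when no
    resource satisfies p(x) (the declaration is malformed).
    """
    if not decl.issued_by_owner:
        raise ValueError("pairing of u and p(x) without a speech act is not a declaration")
    for resource in universe:
        if decl.predicate(resource):
            return resource
    raise ValueError("malformed declaration: no resource satisfies p(x)")


# Hypothetical toy universe; the attribute names are illustrative only.
UNIVERSE = [
    {"label": "Phobos", "is_moon": True, "orbits": "Mars", "green_cheese": False},
    {"label": "the Moon", "is_moon": True, "orbits": "Earth", "green_cheese": False},
]

MOON_URI = "http://dbooth.org/2007/moon/"

# p(x) as the conjunction of M1.a ("is a moon") and M1.b ("orbits the Earth").
m1 = UriDeclaration(MOON_URI, lambda x: x["is_moon"] and x["orbits"] == "Earth", True)

# The same declaration with M2 ("is made of green cheese") folded into p(x).
m1_with_m2 = UriDeclaration(
    MOON_URI,
    lambda x: x["is_moon"] and x["orbits"] == "Earth" and x["green_cheese"],
    True,
)
```

In this sketch, denoted_resource(m1, UNIVERSE) picks out the entry labelled "the Moon", while denoted_resource(m1_with_m2, UNIVERSE) raises ValueError because nothing in the toy universe satisfies the stricter predicate. The URI is deliberately stored as a string, since in the declaration context it is a sequence of characters and not a reference to a resource.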
In the moon example above, URI u is http://dbooth.org/2007/moon/ , predicate p(x) is the conjunction of M1.a and M1.b, and x is the moon. Note that if M2 ("http://dbooth.org/2007/moon/ is made of green cheese") had also been a part of p(x) then the URI declaration would have been malformed, since there is no moon that orbits the Earth and is made of green cheese. The performative speech act is the act of publishing statement M1 ("The URI http://dbooth.org/2007/moon/ hereby names . . . ."). In this example, the English phrasing " . . . hereby names . . ." distinguishes this performative speech act from M2, which is intended as normal speech.

The word "authoritative" has sometimes caused confusion in discussions of URI declarations: if a URI 303-redirects to a URI declaration page, in what sense is that page "authoritative"? A URI declaration page is authoritative in its URI declarations -- i.e., in declaring that URI to be a name for a particular resource -- but that does not mean that the assertions that the page contains are necessarily true.

Web architecture and implicit URI declarations

How should URI declarations be indicated on the Web?

The "following your nose" algorithm

[Editorial note: Somewhere a more precise definition of this algorithm should be provided. I didn't bother to do so here, but it is needed. -- DBooth]

Given a URI, it is very helpful to others if that URI's declaration page can be readily located, using the URI as a starting point:

Suggested practice P2: URI owners should mint and support their URIs such that an attempt to dereference a URI of a non-information resource will lead to a URI declaration page for that URI, using one of the following mechanisms:

    * If the URI contains a fragment identifier, then the racine of the URI (i.e., the part before the #) should lead to a suitable URI declaration page.
    * If the URI does not contain a fragment identifier, then an attempt to dereference the URI should yield a 303-redirect that leads to a suitable URI declaration page.

[@@Is there a WebArch reference for this?@@]

Thus, http://dbooth.org/2007/moon/ 303-redirects to its URI declaration page at http://dbooth.org/2007/moon/descr.html .

Proposed rule for implicit URI declarations

Page http://dbooth.org/2007/moon/descr.html uses English both to make clear that a URI declaration is intended, and to distinguish between the URI declaration and regular assertions about the moon. But what should be done in other cases, such as RDF, that do not have a mechanism for explicit URI declarations?

I propose that the Web architecture treat the act of serving a page using either of the above two follow-your-nose mechanisms -- hash or 303 -- as a performative speech act of URI declaration:

Proposed rule R1: Given a URI u, if either of the follow-your-nose mechanisms described above yields a representation r, then, unless otherwise indicated, the conjunction of assertions made in r represents an implicit URI declaration for u.

And the converse:

Proposed rule R2: Unless otherwise indicated (such as by rule R1 or by some explicit indication), publication of assertions about a resource denoted by a URI should not be construed as a performative speech act of declaring that URI.

This does not mean that rule R1 should be the only way to declare a URI. There could be other mechanisms also, particularly explicit mechanisms.

Rule R1 clearly has the first two components of a URI declaration, but what is the performative speech act? First, publication of the page -- regardless of the URI that leads to it -- represents the utterance of the declaration. Second, the follow-your-nose algorithm provides prima facie evidence that the declaration is authorized by the owner of the originating URI.
This is important because the domain name in the URI of the declaration page could be quite different from the domain name of the original resource URI. This act of publishing the page in response to the follow-your-nose algorithm from the original URI is what distinguishes this performative speech act from other, normal speech.

Rule R1 also implies that, unless otherwise indicated, every assertion in the page obtained should be considered a part of the URI declaration. Therefore:

Suggested practice P3: A URI declaration page should avoid making assertions about the URI's associated resource that are not intended to be a part of that URI's declaration.

In the moon example above, this means that statement M2 ("http://dbooth.org/2007/moon/ is made of green cheese") should not be included in an equivalent RDF page, because if it were it would be considered a part of the URI declaration and the URI http://dbooth.org/2007/moon/ would thus be unusable to parties who wish to refer to the moon and do not choose to believe the moon is made of green cheese. On the other hand, statement M3 ("For more information about http://dbooth.org/2007/moon/ , see also http://dbooth.org/2007/moon/about.html") is safe to include in the URI declaration page, because it is merely a suggestion: it does not affect the satisfiability of p(x).

Notice that by rule R2, page http://dbooth.org/2007/moon/about.html should not be interpreted as a URI declaration page for http://dbooth.org/2007/moon/ .

This also means that if several URIs share the same URI declaration page, examination of the URI declaration page via one of those URIs will not necessarily indicate whether the other URIs are also being declared.
To avoid the inefficiency of having to dereference each of those URIs in order to determine their URI declarations, either specialized URI prefixes can be defined (as described in "Converting New URI Schemes or URN Sub-Schemes to HTTP"), or explicit URI declaration mechanisms could be defined, such as the one proposed below.

If a URI declaration page only contains URI declarations, how can other parties find other information about the associated resources?

Suggested practice P4: A URI declaration page should provide links to other information about the resources whose URIs are declared by that page.

This does not mean that a URI owner should be responsible for providing links to all other information about the associated resource. But providing links to other known sources of information would be helpful to others, and the URI declaration page is a logical starting place to look for such links. It should be understood that providing a link does not imply any particular endorsement.

Explicit URI declaration in RDF

I do not know of any explicit URI declaration predicate that has already been defined for RDF -- please tell me if there is one -- but it would be easy to define one using named graphs: If g is the URI of a named graph, and u is a URI, then the following N3 statements provide an explicit URI declaration for u:

    @prefix dbooth: <http://t-d-b.org?http://dbooth.org/2007/uri-decl/#> .
    g dbooth:declares "u" .

Note the quotes around URI u, because in the declaration context it must be treated as a literal string -- not a reference to a resource.

Acknowledgements

Thanks to Jeremy Carroll for review comments. Comments by all are invited. If I have missed a reference that I should have included, please let me know.

-------------------------------------------------------------------------------
30-Jul-2007: Added TOC, clarified speech act, misc minor fixes.
25-Jul-2007: Original draft.
=======================================================================

David Booth, Ph.D.
HP Software
+1 617 629 8881 office | dbooth@hp.com
http://www.hp.com/go/software

Opinions expressed herein are those of the author and do not represent the official views of HP unless explicitly stated otherwise.
Received on Tuesday, 31 July 2007 19:00:03 UTC