Re: Documents, Cars, Hills, and Valleys

From: "Dan Brickley" <>

> Anyone else want to try wrapping this up?

The first post of this thread attempted to wrap it up.

[[[
I urge anyone who believes in the range of HTTP being a document to help
formalize a definition of "document", otherwise you do not have a strong
case with which to back yourselves up. [...] To progress, we need to decide
those two things: the range of HTTP, and the nature of URIs+fragments.
]]] - [1]

The "range of HTTP" issue was taken up by the Technical Architecture Group,
and they are still working towards a resolution:-

[[[
httpRange-14 : What is the range of the HTTP dereference function? [...]
Resolution summary @@
]]] -
   $Date: 2002/04/30 13:52:28 $

Mark Nottingham has pointed us to:-


The bullet point titled "Does the URI name a specific or generic target?"
in section 5 suggests a point of view that is somewhat similar to Miles
Sabin's - that HTTP URIs can identify basically anything, but that also
what they identify changes from context to context. This is a point that I
argued many months ago on www-talk:-

[[[
There are various levels of use for HTTP URIs, ranging from the very large
scale through to the small scale:-

   * This HTTP URI identifies some path on some server;
      "widgets" on ""
   * This HTTP URI identifies the concept of widgets, example
      use: "it is commonly known that _widgets_ [link] are becoming
      increasingly rare..."
   * This HTTP URI identifies *a* page about widgets, example:
      "see _Fred's page on widgets_ [link]"
   * This HTTP URI identifies *the* page about widgets, example:
      "Fred: 'I like Widgets', Cite: _widgets_ [link]"

You can see that the level of persistence decreases as you go down the
]]] - [2].

The fact is that hypertext links do not have to have a strict semantics,
and so until the Semantic Web came about (or rather, until XMLNS came about)
it didn't matter what the range of HTTP was. People could say "_Fred's
orange_ is great" one day, and "Fred said 'x' on his _page about his
orange_" the next. Now that we're wanting to define a range for HTTP, the
documentalists have taken it upon themselves to retrofit a solution. I
think that they'd find it difficult to convince everybody on the Web to
change their hypertext link texts to suitable document-oriented
statements - especially given the absence of a suitable definition of

There have been real technical issues raised - and their current solutions
noted - in this very thread. Joshua Allen has been keen on my EARL
scenario, and the solution there is that we apply indirection predicates or
test case range (hence avoiding the issue), or probe for Resource-Type
headers. Miles noted that he is happy with the solution, as long as these
are not the *only* methods for disambiguating (Mark Baker's
Content-Location disambiguation method can also contribute to

OTOH - and IMO this is the only decent argument so far for the range of
HTTP being a document-like thing - you often don't have a URI to refer to
the homepage of an entity when there is already a URI being used for the
organization that would be the obvious choice.

[[[
Well, how do you represent what I would say as

[] a :standardsOrg;  :homepage <>;
    is ipr:opyRightHolder of <>;
    org:subGroup :tands, :wai, :df, :int, :arch.

<>  :isolang "";
     http:representation [ in:contentType "text/html"; http:size "6576" ];
     dc:creator [ con:mailbox <> ].

If the home page and the organization are the same,
how does that work?
]]] - [3]

Aaron has attempted to solve this problem:-

but IMO there is no satisfactory solution. If there is an organization that
uses for its homepage URI, then you can use
[ :homepage <http://[...]/myOrg.html> ] for the organization, but if OTOH
that URI identifies the organization, what URI do you use for the homepage?
[ is :homepage of <http://[...]/myOrg.html> ]? You expect homepage URIs to
be able to return representations under HTTP GET, and I can't GET a bNode.
By making that URI the URI of the organization, you're making it so that
there is no HTTP URI for the homepage of that organization.
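
To make the dilemma concrete, here is a minimal Python sketch of the two
readings. It is only an illustration under stated assumptions: triples are
plain tuples, a bNode is any term beginning with "_:", and
http://example.org/myOrg.html is a hypothetical stand-in for the elided
http://[...]/myOrg.html above. None of this is an RDF library API.

```python
# Hypothetical stand-in for the elided http://[...]/myOrg.html above.
ORG_URI = "http://example.org/myOrg.html"
HOMEPAGE = ":homepage"  # predicate name as used in the examples in this thread


def is_gettable(term):
    """Only an HTTP URI can be dereferenced with GET; a bNode ("_:...") cannot."""
    return term.startswith("http://")


def homepage_of(triples, org):
    """Return the object of the first (org, :homepage, x) triple, or None."""
    for s, p, o in triples:
        if s == org and p == HOMEPAGE:
            return o
    return None


# Reading A: the URI names the homepage; the organization is a bNode.
reading_a = [("_:org", HOMEPAGE, ORG_URI)]

# Reading B: the URI names the organization; the homepage becomes the bNode.
reading_b = [(ORG_URI, HOMEPAGE, "_:page")]

page_a = homepage_of(reading_a, "_:org")
page_b = homepage_of(reading_b, ORG_URI)

print(page_a, is_gettable(page_a))
print(page_b, is_gettable(page_b))
```

Under reading A the homepage has a URI you can GET; under reading B the
only name left for it is a bNode, which is the problem described above.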

Thankfully, in EARL there is not as much need to identify a conceptual
homepage as there is to identify some representation of that homepage,
which must always be done using a set of indirection properties (e.g.
[ :reprOf <x>; :date "2002..."; :type "text/html" ]). With a
Content-Location mirror, one can of course have two possibly distinct
resources that have the same set of representations associated with them.
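
A rough Python sketch of that last point, assuming a representation is
described only indirectly through the resource it is a representation of.
The two URIs, the date, and the body bytes are all made up for
illustration; the field names simply mirror :reprOf, :date and :type.

```python
from collections import namedtuple

# Fields mirror the indirection properties above: :reprOf, :date, :type.
Representation = namedtuple("Representation", ["repr_of", "date", "content_type"])

# Hypothetical URIs, for illustration only.
ORIGINAL = "http://example.org/widgets"
MIRROR = "http://mirror.example.org/widgets"

# Each resource maps to the set of (date, content type, body) it has served.
# The date and body are invented sample values.
served = {
    ORIGINAL: {("2002-04-30", "text/html", b"<p>widgets</p>")},
    MIRROR: {("2002-04-30", "text/html", b"<p>widgets</p>")},
}


def describe(resource):
    """Describe each representation indirectly, via the resource it is of."""
    return [Representation(resource, date, ctype)
            for date, ctype, _body in sorted(served[resource])]


same_representations = served[ORIGINAL] == served[MIRROR]
same_resource = ORIGINAL == MIRROR

print(same_representations, same_resource)
print(describe(ORIGINAL))
print(describe(MIRROR))
```

The sets of representations compare equal while the resources themselves
do not, so the descriptions still differ in what they are representations of.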

But we're back in muddy waters. The problem is that for as long as there
are prominent people on either side of the document vs. anything argument,
it will continue to rage. Since the "HTTP URIs can identify anything"
position is minimally constraining, the onus has always been on the
documentists to explain their position and provide a suitable definition
for a document. Since a document is itself a conceptual resource (and in
that respect is not a series of hashable bytes any more than a car is), I
don't really think that a definition will ever be forthcoming.

Still, the "no HTTP URI for the homepage" issue above is a problem, and
perhaps a set of best practices for HTTP URIs is something that the TAG
could produce. Even extremists like Aaron Swartz have said that it is
inadvisable to identify cars with HTTP URIs. To conclude, I think that a
document capturing this thread in its entirety would be a great help (and
suggest that the TAG may be a suitable group to draft it).


Kindest Regards,
Sean B. Palmer
@prefix : <> .
:Sean :homepage <> .

Received on Tuesday, 30 April 2002 12:36:48 UTC