Candidate message to TAG re httpRange-14 resolution from Jonathan Rees on 2011-01-31 (public-awwsw@w3.org from January 2011)

From: Jonathan Rees <jar@creativecommons.org>
Date: Sun, 30 Jan 2011 20:52:02 -0500
To: AWWSW TF <public-awwsw@w3.org>
Message-ID: <AANLkTikxh9L6k_FJT59OWi1vuy=m=boHCW4T=yXA3sB8@mail.gmail.com>
Candidate message from JAR and/or AWWSW to TAG in advance of 8-10 Feb
2011 TAG F2F.  For AWWSW discussion on Tues Feb 1 or by email.

-----

The httpRange-14 issue won't go away.

Some prominent voices are calling for the retraction of the
httpRange-14 resolution.  They are reporting difficulties in linked
data deployment that threaten uptake, and attribute these difficulties
to the requirement conveyed in the resolution.

Like it or not, the TAG "owns" this issue and has some responsibility
for it.

There has been considerable investment in the httpRange-14 rule in the
form of both deployed metadata and systems that create and consume
metadata.  Adoption of any substitute would be significantly
disruptive.

Web architecture (in particular the idea of a global namespace) would
also lose ground if the rule were dropped, since the rule is largely
an embodiment of webarch.

This is not mere philosophy or silliness.  Misunderstandings about the
meaning of metadata can have serious consequences.  Erosion of the
rule would mean that anyone who currently works with metadata would
need to retool to defend against potential confusion and loss of
interoperability.

Here are a few actions that the TAG, or an entity acting on its behalf
(a task force?), could take that might help.

- Publish a note that attempts a neutral overview of the conflict.
  (This would require some research.)

- Document the httpRange-14 rule to provide motivation, fix its
  numerous errors, and record some of its implicit assumptions.  (JAR
  has already drafted a document like this.)

- Bring such a document forward for review.  The httpRange-14
  resolution has never received the review a rec-track document or
  even a finding would, and that weakens it.

- Investigate ways in which dissenters' objectives could be met
  without an incompatible change.  (This would require research and
  creativity.)

- Prepare a more rigorous account of the "has representation"
  relationship and its connection to metadata, to give the
  architecture some teeth.  (This is what the AWWSW 'task force' is
  trying to do.)

-----------------------------------------------------------------------------

Below is a technical summary of the issue, avoiding the
objective-sounding and confusing form "URI identifies thing" and
replacing it with the more neutral "agent uses URI to refer to thing".

Let's invent for purposes of this missive a relationship 'carries'
between a 'representation' (in the sense of content + media type) and
an 'information resource' (document, image, etc.) meaning that the
representation has all or a significant part of the information that
the information resource does, 'represented' in some particular way
(media type, character encoding, etc.).

This definition is intended to capture what TimBL has meant by 'is a
representation of' all these years.  It's not completely correct but
is a good approximation for present purposes.  'Carries' is not at all
the same relationship as 'describes' or 'is about' - in particular the
information resource is part of the provenance of a representation
that carries it, but not necessarily of something that describes it.

Currently, if an information resource - call it 'R' here at the
meta-level - has 'representations' that carry it that are retrievable
(e.g. GET/200) using a URI 'u:u', then you refer (in RDF or a similar
language) to R as 'u:u' or '<u:u>'.  For example, to say that R is
licensed under CC-BY-3.0, you would write in RDF

  <u:u> xhtml:license
    <http://creativecommons.org/licenses/by/3.0/> .

A client reading this knows which work is licensed - it's R, the
information resource for which carrying representations are
retrievable using 'u:u'.  That is, if I retrieve a representation
using 'u:u', whether I can legally copy that representation is
controlled by the given license (subject to exceptions such as fair
use).
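
The rule this example relies on can be sketched as a small classifier.
This is only an illustration: the three status-code cases follow the
wording of the TAG's httpRange-14 resolution (2xx, 303, 4xx), while the
names 'Referent' and 'classify_referent' are made up here, and treating
every other status as "unknown" is an assumption.  No network access is
performed; the caller supplies the status of a GET on the URI.

```python
from enum import Enum


class Referent(Enum):
    """What a client may conclude about the thing a URI refers to."""
    INFORMATION_RESOURCE = "information resource"
    ANY_RESOURCE = "could be any resource"
    UNKNOWN = "unknown"


def classify_referent(status_code: int) -> Referent:
    """Classify the referent of a URI from the status of a GET on it.

    Hypothetical helper mirroring the three cases of the resolution.
    Statuses the resolution does not mention fall through to UNKNOWN,
    which is an assumption of this sketch.
    """
    if 200 <= status_code < 300:
        # Successful retrieval: the URI refers to an information
        # resource (R in the discussion above).
        return Referent.INFORMATION_RESOURCE
    if status_code == 303:
        # See Other: the URI could refer to any resource.
        return Referent.ANY_RESOURCE
    return Referent.UNKNOWN
```

Under this reading, a client that gets a 200 for 'u:u' takes the
license statement to be about R and nothing else.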

Now suppose that R is "about" another resource S, that it is important
to have a way to refer to S, and that no such way-to-refer presents
itself.  If S is R's primary topic, one might use "blank node"
notation such as

  [foaf:isPrimaryTopicOf <u:u>]

A new URI created for the purpose would also work, such as 'u:u#s' or
'u:s' (where there is no successful retrieval using 'u:s').
'tdb:2011:u:u' and various other non-resolving URI schemes have been
suggested.  For example, if S is itself an information resource (it
may or may not be), one might write

  [foaf:isPrimaryTopicOf <u:u>] xhtml:license
    <http://creativecommons.org/licenses/by/3.0/> .

to communicate that S is subject to the specified license.  (Example:
R might be a bibliographic record - perhaps with its own license - and
S the article it describes, which might or might not be "on the web".)

The dissenting view rejects all of these approaches to referring to S
as being either unsuitable for the application or having an
unacceptable cost.  It is proposed to forgo 'u:u' as a way to refer
to R, and instead use 'u:u' to refer to S.  For example,

  <u:u> xhtml:license
    <http://creativecommons.org/licenses/by/3.0/> .

would say, under some if not all circumstances, that S is so licensed,
not R.  One would have to say something else to indicate R's license,
e.g. the following made-up notation:

  [ xyz:isAccessibleAt "u:u"^^xsd:anyURI ] xhtml:license
    <http://creativecommons.org/licenses/by/3.0/> .

If a client assuming one convention encounters metadata composed by an
agent or tool assuming a different convention, the consequences can be
serious - such as applying a license to the wrong subject.
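
The mismatch can be made concrete with a toy model.  This is a sketch
under stated assumptions, not a description of any real tool: the
convention names "httpRange-14" and "dissenting", and the function
'licensed_thing', are invented for illustration, and 'R' and 'S' are
the meta-level names used in this message.

```python
LICENSE = "http://creativecommons.org/licenses/by/3.0/"

# The same triple, byte for byte, under both conventions.
TRIPLE = ("u:u", "xhtml:license", LICENSE)

# Hypothetical table: what 'u:u' refers to under each convention.
REFERENT_OF_UU = {
    "httpRange-14": "R",  # the information resource retrievable at u:u
    "dissenting": "S",    # the thing that R is about
}


def licensed_thing(triple, convention):
    """Return the meta-level name of the thing the triple licenses."""
    subject, predicate, _license = triple
    if subject != "u:u" or predicate != "xhtml:license":
        raise ValueError("this sketch only models the u:u license triple")
    return REFERENT_OF_UU[convention]


# The writer assumes one convention, the reader the other.
writer_meant = licensed_thing(TRIPLE, "dissenting")
reader_understood = licensed_thing(TRIPLE, "httpRange-14")
mismatch = writer_meant != reader_understood
```

Nothing in the triple itself records which convention its author
assumed, so the reader has no way to detect the mismatch from the data
alone.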

If S is not an "information resource" then the failure mode is more
likely to be nonsense than the wrong answer because most useful
predicates are either defined only on information resources or
undefined on information resources.  A human can sometimes figure out
what's intended by mentally replacing R with S or vice versa
("metonymy") so as to make sense out of nonsense, but RDF reasoners
are not so clever.
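
A minimal sketch of the "nonsense rather than wrong answer" case
follows.  Everything in it is an assumption made for illustration: S is
taken to be a person, 'xhtml:license' is treated as defined only on
information resources and 'foaf:birthday' as undefined on them, and the
function 'makes_sense' is hypothetical.

```python
# Hypothetical meta-level facts: R is an information resource, S is not.
IS_INFORMATION_RESOURCE = {"R": True, "S": False}

# Hypothetical predicate table.  True means "defined only on information
# resources"; False means "undefined on information resources".
NEEDS_INFORMATION_RESOURCE = {
    "xhtml:license": True,
    "foaf:birthday": False,
}


def makes_sense(predicate, referent):
    """Check whether the predicate is defined on the given referent."""
    needs_ir = NEEDS_INFORMATION_RESOURCE[predicate]
    return needs_ir == IS_INFORMATION_RESOURCE[referent]


# A reader who resolves 'u:u' to the wrong referent gets a statement
# that fails this check instead of a plausible but wrong one.
license_on_r = makes_sense("xhtml:license", "R")
license_on_s = makes_sense("xhtml:license", "S")
birthday_on_r = makes_sense("foaf:birthday", "R")
```

A checker like this can flag the confusion only when S is not an
information resource; when both R and S are, the statement passes and
the error goes unnoticed.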

The details from this point - the exact relationship between R and S,
how one refers to R when 'u:u' refers to S, how to decide whether
'u:u' refers to S as opposed to R, and so on - vary from one proposal
to the next, and I'm afraid I'd get something wrong if I tried to
explain how the various proposals work in detail, even if I knew.

(Thanks to Stuart Williams for helpful comments on an earlier version.)

(tbd: add references: genont, HH email, etc.)

Received on Monday, 31 January 2011 01:52:37 UTC