Comment on: Providing and discovering definitions of URIs from Dave Reynolds on 2011-06-26 (www-tag@w3.org from June 2011)

From: Dave Reynolds <dave.e.reynolds@gmail.com>
Date: Sun, 26 Jun 2011 15:09:43 +0100
To: www-tag@w3.org
Message-ID: <1309097383.7601.137.camel@Obsidian3>

This is a personal response to [1].

First, thank you for the document and the hard work that must have gone
into it. Attempting to capture all the details of the issue and all
suggested responses to it is a useful step.

Unfortunately I don't feel the document, in its current state, achieves
the level of clarity required to move this debate forward, and I suggest
some rework.

Specifically:

(1) There should be one set of criteria by which all the proposed
solutions are evaluated. Those criteria should be clearly stated and
justified.

Currently the document lists six success criteria in section 1.1,
evaluates the proposals against a different (rather biased) list of five
criteria and uses still other criteria in the discussion of some of the
solutions.

I expand on this below [3].

(2) Several of the specific proposals are hard to recognize or follow in
the current write-up. Specifically 4.3 confuses the punning proposal
with an attempt to reconcile punning with criterion 6 in section 1.1 and
doesn't really communicate the proposal itself [4]. Similarly I assume
that 5.3 is supposed to capture Ian Davis' "Back to basics" proposal [2]
but it is hard to recognize and again seems to mix up the proposal with
various ways to mutate and reconcile it with some of the criteria.

(3) The document fails to address the question of Information Resources.
IMO part of the confusion that surrounds http-range-14 arises from lack
of clarity over exactly what is and isn't an Information Resource and
whether and how that notion should get reflected in RDF, Linked Data and
the Semantic Web. The glossary definition given here ("Roughly speaking,
something that is appropriate as the subject of metadata") does not
reduce that confusion. The status of the associated IR document is
unclear.

(4) Perhaps minor, but please adopt a consistent style for the description
and attribution of proposals and criticisms. The mix of well-referenced
criticisms with highly personalized arguments at the level of "Harry
says ..." is jarring. Some proposals are attributed, some are not.

I suggest reworking to unify and clarify the solution criteria and to
initially present the set of existing proposals straight and assess them
on their own merits. The attempts to recast or rationalize the proposals
(especially as included in 4.3 and 5.3) would be better in appendices or
in a separate discussion document.

Hope this helps.

Dave

[1] http://www.w3.org/2001/tag/awwsw/issue57/20110625/

[2] http://blog.iandavis.com/2010/12/06/back-to-basics/

[3] More detailed comments on success criteria

The initial enumeration of success criteria is:
  1. Simple
  2. Easy to deploy on Web hosting services.
  3. Easy to deploy using existing Web client stacks.
  4. Efficient. 
  5. Browser-friendly.
  6. Compatible with Web architecture.

The description of (5) misses the issue of errors caused by showing
the wrong URI in browser address bars. Arguments about specs aside, it is
simply the case that all browsers show the result of the temporary
redirection, and I've lost count of the number of errors in linked data
and examples that this has caused. Either make that part of the
"browser-friendly" category or put it in a separate category. It is
mentioned in the discussion in 3.6.4 but not pulled out.
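
To make the address-bar problem concrete, here is a minimal sketch in
Python. The URIs, the redirect table and the function name are all
hypothetical; it simulates a 303 redirect with a lookup table rather than
a real HTTP client and is not taken from the document under review.

```python
# Hypothetical redirect table: the "/id" URI names the thing itself,
# and the server answers 303 See Other pointing at the "/doc" URI.
REDIRECTS = {
    "http://example.org/id/alice": "http://example.org/doc/alice",
}


def address_bar_after_load(requested_uri: str) -> str:
    """Return the URI a browser would display after following redirects."""
    shown = requested_uri
    seen = set()
    # Follow redirects until a URI answers directly; guard against loops.
    while shown in REDIRECTS and shown not in seen:
        seen.add(shown)
        shown = REDIRECTS[shown]
    return shown


requested = "http://example.org/id/alice"
copied = address_bar_after_load(requested)

# The user asked for the "/id" URI but copies the "/doc" URI from the
# address bar, so statements written with it are about the document.
print(requested != copied)
```

Anyone who pastes the copied URI into their data ends up describing the
document rather than the thing, which is the class of error meant here.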

The explanation of (6) "A URI should have a single agreed meaning
globally" at least needs clarification. It is already the case that in
OWL 2 one URI denotes two things (see below) and even in plain RDF a
resource can have independent "class nature" and "property nature".
Technically in those cases the URI still denotes one thing but that
"thing" has independent natures.

Criterion 6 also ties back to the need to confront the "what exactly is
an IR then" issue.

The summary section uses the success criteria:

(a) webarch?  
(b) robust?
(c) easy to deploy?
(d) min round trips
(e) sound?

So we have a=6, c=2, d=4. Criteria 1, 3 and 5 are omitted; b and e are
introduced.

Of these introductions, (b) is bizarre. The definition of
"robustness" is "Is the URI free of fragment identifiers?"! Why not just
have as one of your criteria "uses 303 redirects" :) Seriously, if
robustness is a criterion then define it and assess each of the
proposals against it, ideally with some evidence. Personally I've never
seen a case of fragment IDs getting lost but frequently see "/doc" forms
used instead of "/id" forms. Is that an aspect of robustness? Does my
anecdotal non-evidence trump Harry's anecdotal non-evidence?

Then in the various discussion sections we see other criteria being
used:

  3.1, 3.2  URI denotation identifiable from the URI alone, 
            no context book keeping required.

  3.4    This discussion seems to assume implicit criteria of:
     Solution must already be defined by an IETF standard
     Solution must not use DNS
  which can't possibly be what was intended.

Aside: while I don't think LSIDs are "the answer" this section either
needs rewriting or replacing with words to the effect of "we aren't
going to discuss this option seriously". 

  3.5.1   All representations must define the URI the same way

  3.5.4   REST compatible, "works" with PUT, POST and DELETE

  3.5.5   ??

  3.6.4   Bookmarkability (noted earlier)

  4.3     Must not reinterpret existing vocabularies (an aspect of 
          soundness?)

  5.3     Must provide a way to reference the "information resource" 

Some of these are subject to debate (e.g. the last one), some need
better specification, and all need to be gathered into one consistent
place if these are the criteria by which the options are to be judged.

[4] I would characterize the "punning" proposal(s) as:

"A URI does not denote a single thing. It can denote both a document
and some abstract concept. Each vocabulary term should make it clear
which referent the term applies to. Some applications and publishers may
not even care."

There is some precedent for this. In the OWL 2 direct semantics the
same URI can denote both an individual and an owl:Class. For example,
if :c1 and :c2 are each both classes and individuals and you assert :c1
owl:sameAs :c2, that does not imply that they are the same class (have
the same class extent).
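
As a rough illustration of that example, the following Python sketch
models an interpretation in which each URI has an independent individual
view and class view. It is a toy model with made-up domain elements and
a hypothetical class name, not the OWL 2 model theory itself.

```python
from dataclasses import dataclass, field


@dataclass
class PunnedInterpretation:
    """Toy interpretation in which one URI has two independent views."""

    individual: dict = field(default_factory=dict)    # URI -> domain element
    class_extent: dict = field(default_factory=dict)  # URI -> set of members

    def same_individual(self, a: str, b: str) -> bool:
        # owl:sameAs constrains only the individual view.
        return self.individual[a] == self.individual[b]

    def same_class(self, a: str, b: str) -> bool:
        # Class equivalence compares only the class view (the extents).
        return self.class_extent[a] == self.class_extent[b]


interp = PunnedInterpretation(
    individual={":c1": "x", ":c2": "x"},             # :c1 owl:sameAs :c2 holds
    class_extent={":c1": {"a", "b"}, ":c2": {"a"}},  # extents still differ
)

print(interp.same_individual(":c1", ":c2"))  # True
print(interp.same_class(":c1", ":c2"))       # False
```

The two lookups never consult each other, which is the sense in which the
individual and class natures are independent.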

This works in OWL 2 because the semantics of the OWL terms are carefully
defined to make it work. It is clear in the model theory when an axiom
applies to the "class nature" of the URI and when it applies to the
"individual nature". [Making that particular approach work unambiguously
for the IR/nonIR distinction and for all vocabularies including ones
that are already published and only informally described would be ...
hard.]

The discussion in 4.3 is saying something slightly different: it is
saying that the URI still denotes just one thing, the document, but we
reinterpret vocabularies as sometimes referring to the document and
sometimes being a chained predicate that refers to a "designated
subject" of that document. That may be one way to reconcile the punning
proposal with criterion 6 but it is not obviously the same proposal and
makes for a complex write-up.
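
One possible reading of the 4.3 approach, sketched in Python under stated
assumptions: the URI, the "designated subject" table and the split of
predicates into two groups are all hypothetical and chosen only for
illustration; the document under review does not define them this way.

```python
# Hypothetical data: the URI denotes the document and nothing else.
DESIGNATED_SUBJECT = {
    "http://example.org/alice": "the person Alice",
}

# Hypothetical split of vocabulary terms. Under this reading each term is
# either about the document itself or chained through to its subject.
DOCUMENT_PREDICATES = {"dcterms:modified", "dcterms:format"}
SUBJECT_PREDICATES = {"foaf:name", "foaf:knows"}


def effective_subject(uri: str, predicate: str) -> str:
    """Return what a statement with this URI as subject is really about."""
    if predicate in DOCUMENT_PREDICATES:
        return uri  # the document the URI denotes
    if predicate in SUBJECT_PREDICATES:
        return DESIGNATED_SUBJECT[uri]  # chained via the designated subject
    # Terms nobody has classified have no agreed reading.
    raise ValueError(f"no agreed reading for predicate {predicate!r}")


print(effective_subject("http://example.org/alice", "dcterms:modified"))
print(effective_subject("http://example.org/alice", "foaf:name"))
```

The error branch is where already published, informally described
vocabularies would land, since someone has to classify every term.
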
Received on Sunday, 26 June 2011 14:10:15 UTC