- From: <noah_mendelsohn@us.ibm.com>
- Date: Tue, 15 Aug 2006 12:32:13 -0400
- To: Bjoern Hoehrmann <derhoermi@gmx.net>
- Cc: www-tag@w3.org
Bjoern Hoehrmann wrote on June 30 a note [1] that was nominally about XBL
namespaces, but that in fact conveyed a lot of concerns regarding the
draft finding on Metadata in URIs [2]. So, I'm taking it as I would any
other input on a draft finding, and am attempting to respond here in
detail. The comments in fact arrived just as the TAG was voting to move
the draft to full finding status. I sent a rather tentative and
incomplete response around that time at [3], and Bjoern responded to that
at [4]. Accordingly, I told the TAG that we should hold off on the final
finding until I had a chance to work through Bjoern's response in more
detail. This is my attempt to do that.
I'm trying to strike a balance here. On the one hand, I want to be
responsive where there are important concerns. On the other, we always
have to pick a point where the TAG will say "publish", and further
comments can be considered as input to possible revisions. So, I've tried
to respond to Bjoern's comments with some detail and care, but given their
late arrival I am setting the bar a little higher than I might normally in
being open to significant redraft of the findings. I hope the following
strikes a reasonable balance.
The following quotes are from Bjoern's notes [1,4], each followed by my
comments.
> * noah_mendelsohn@us.ibm.com wrote:
> >Which begs the questions: what sorts of information should be in a URI,
> >and who should or shouldn't depend on it being there? This might be the
> >time to remind everyone that the TAG has been working on just those
> >questions under the banner metadataInURI-31 [1]. We have a draft finding
> >[2], which as of the last TAG F2F is quite close to final. If this
> >discussion is going to turn to what should or shouldn't be encoded in the
> >text of a URI, I suggest giving the draft a look first. Thanks!
>
> >[1] http://www.w3.org/2001/tag/issues.html#metadatainURI-31
> >[2] http://www.w3.org/2001/tag/doc/metaDataInURI-31
>
> Let's see. The resource locator of the resource is poorly chosen,
This finding is following the same naming policy for drafts as the TAG has
used for other findings. While I can see that sensitivities are raised
particularly for a finding on metadata in URIs, I personally think the
name is fine (if not ideal), and propose not to tackle changes to the W3C
naming policy in conjunction with publication of this particular finding.
> the document then incorrectly claims to make correct use of RFC2119
> terms
From Bjoern's 2nd response at [4]:
> TAG findings are not normative documents and do not specify
> conformance; the keywords are to specify requirement levels. The
> concept of having requirements in non-normative documents that
> cannot be complied with does not make sense to me. The specific
> use of "MUST NOT" is highly suspicious, it's entirely unclear
> from the text how interoperability is at stake, or which harm
> is done by ignoring it.
From RFC 2119:
"In many standards track documents several words are used to signify the
requirements in the specification. These words are often capitalized.
This document defines these words as they should be
interpreted in IETF documents. Authors who follow these guidelines
should incorporate this phrase near the beginning of their document:
"The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT",
"SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this
document are to be interpreted as described in RFC 2119.
"Note that the force of these words is modified by the requirement level
of the document in which they are used."
I don't see that as forbidding references from documents which are
non-normative. In fact, if the scope is limited at all, it's specifically
to IETF documents, and I think the precedent is well established for
referencing RFC 2119 from non-IETF specifications. Furthermore, while TAG
findings are not Recommendations, they are in some sense normative.
Continuing from RFC 2119:
"1. MUST This word, or the terms "REQUIRED" or "SHALL", mean that the
definition is an absolute requirement of the specification."
I'd say that's true of the use in the draft finding.
"6. Guidance in the use of these Imperatives
"Imperatives of the type defined in this memo must be used with care and
sparingly. In particular, they MUST only be used where it is actually
required for interoperation or to limit behavior which has
potential for causing harm (e.g., limiting retransmisssions) For
example, they must not be used to try to impose a particular method on
implementors where the method is not required for
interoperability."
Well, I think that's true here. The draft finding says:
"Constraint: Web software MUST NOT depend on the correctness of metadata
inferred from a URI, except as licensed by applicable standards and
specifications."
I think that's pretty essential to preserving interoperability of the Web
and eliminating harm.
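A minimal sketch of what that constraint can look like in software follows.
It is an illustration only, not text from the finding: the function name
media_type_of and the sample headers are hypothetical. The idea is that a
client takes the media type from the Content-Type header, which HTTP
licenses it to rely on, and never from a suffix such as ".html" that it
happens to see in the URI.

```python
from typing import Mapping, Optional


def media_type_of(response_headers: Mapping[str, str]) -> Optional[str]:
    """Return the media type the server declared, or None if it sent none.

    The URI is deliberately not a parameter: a suffix such as ".html" is
    at best a hint, and this sketch never consults it.
    """
    for name, value in response_headers.items():
        # HTTP header field names are case-insensitive.
        if name.lower() == "content-type":
            # Drop parameters such as "; charset=utf-8" and normalize case.
            return value.split(";", 1)[0].strip().lower()
    return None


# Hypothetical response for a URI that ends in ".html" but serves an image.
print(media_type_of({"Content-Type": "image/png"}))
```

A caller that finds no declared type has to decide for itself whether to
guess, and in that case the risk is the caller's.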
> The first good practise is rather odd; it's introduced by claiming that
> the scheme component of a resource identifier is metadata "peeked" from
> the resource identifier, and attempts to suggest making, say, a browser
> that supports only HTTP and therefore only HTTP resource locators is
> somehow bad (which it is not). It then repeats justifaction for the
> constraint discussed above. Frankly, I have no idea what's this trying
> to say.
I suspect this is referring to what was section "2.2 Avoid depending on
metadata". That entire section has been removed, so I'm guessing that
concern is resolved.
> The next good practise is obvious again ("Don't do something unless
> willing to accept the consequences") though I like how it's introduced
> by "Web users act on guesses about URIs all the time"; this happens e.g.
> if I call you and ask you to go to "http://example.org/weather/Boston"
> and tell me what's on that page. If you then infer I might be wondering
> about the weather in Boston, you are acting contrary to "Agents making
> use of URIs SHOULD NOT attempt to infer properties of the referenced
> resource."
I agree that it borders on the stupidly obvious, but I didn't come up with
a better way to make the point that you may guess, but there are
downsides, and the risk is yours. Since the TAG at its face to face
approved this formulation, I'm inclined to leave it, but I take the point.
We'll be reviewing this response with the rest of the TAG, and I'll
highlight this as an area of concern (Stuart Williams raised it too). If
we can come up with a better way to make the point, I'll do it.
> Section 2.4 is a bit funny. I am unaware of where the HTML spec makes
> any mention of what is inferred,
It's only indirectly the HTML spec. The sequence is: the resource
authority sends to a client computer (let's call it my computer) an HTML
form. On the form is code that renders a button that, when pressed,
assembles a URI that is built in a structured way to include data entered
from fields on the form. In that sense, the resource authority has
invited me to submit URIs of that form. Insofar as the form itself
includes documentation that warrants what those URIs are for, and the form
comes from the resource authority, then the authority has effectively documented
the intended use of those URIs. Crucially, the server at the authority
can't tell whether I construct those URIs by actually doing the obvious
thing and presenting the form in a browser, or by some other means. I
don't believe that HTML forms have a timeout mechanism that says "don't
hold this form for a few weeks and then fill it in", so I can equally
write some other software to send those same URIs a few weeks later (cool
URIs don't change). I agree the HTML spec doesn't tell the story in quite
those terms, but the deployment of the form by the resource authority has
exactly that implication, I think.
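To make the "some other means" case concrete, here is a minimal Python
sketch. The form action http://example.org/weather and the field name
"city" are hypothetical; the query encoding is the standard
application/x-www-form-urlencoded one that a GET form submission produces.

```python
from urllib.parse import urlencode


def form_get_uri(action: str, fields: dict) -> str:
    """Assemble the URI a browser would request when a GET form is submitted."""
    return f"{action}?{urlencode(fields)}"


# Hypothetical form: action="http://example.org/weather", one text field "city".
print(form_get_uri("http://example.org/weather", {"city": "Boston"}))
# prints: http://example.org/weather?city=Boston
```

The string is the same whether a browser or a script like this one builds
it, which is why the server cannot distinguish the two.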
> and "The same HTML Form is also a
> computer program, executable by the browser, ..." well, that's sure a
> good one for the .signature file.
I'm afraid the above comment doesn't at all convince me that what I wrote
is even slightly wrong. Is it a computer program? I'd say so. It's not
in a Turing-complete language, but it surely instructs a browser to render
certain UI, prompt for certain things, and if a given button is pushed, to
submit another Web transaction. Sounds like a program to me. I admit
it's not a program in a general purpose language.
> Section 2.5 apparently contradicts "Avoid software dependencies on
> metadata in URIs." in that it suggests "Even if it does not document
> this policy publicly, example.org's own Web servers can safely depend
> on it" implying it's okay to do that.
First of all, that "avoid dependencies" section is indeed gone. Secondly,
I think we acknowledge at the top:
"As these examples show, encoding or not encoding metadata in a URI or
deciding whether to rely on such metadata is often a tradeoff, involving
some benefits and some costs. In such cases, choices should be made that
best meet the needs of particular resource providers and users. "
That's exactly the case here. The authority and its server managers may
benefit from a regular assignment of URIs and from writing software, for
use within their authority, that benefits from their assignment rules.
True. That software will be less general than other software, and in that
respect less valuable. Also true. A tradeoff. The important distinction
is that someone running similar software without reliable knowledge of the
assignment rules is breaking the rules: they are creating an expectation
that the authority will use URIs in a certain way, when in fact it's the
authority that gets to make that decision.
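As a rough sketch of that distinction (Python; the "/cities/<name>"
assignment rule, the constant OWN_AUTHORITY, and the function name are all
hypothetical, not taken from the finding): software run by the authority
can read metadata out of URIs that the authority itself assigned, and it
declines to infer anything from anyone else's.

```python
from typing import Optional
from urllib.parse import unquote, urlsplit

# Hypothetical: the one authority whose assignment rules we reliably know.
OWN_AUTHORITY = "example.org"


def city_from_uri(uri: str) -> Optional[str]:
    """Return the city named by a URI of the form /cities/<name>.

    Only URIs under our own authority are interpreted; for any other
    authority the assignment policy is unknown, so nothing is inferred.
    """
    parts = urlsplit(uri)
    if parts.hostname != OWN_AUTHORITY:
        return None
    segments = parts.path.split("/")  # "/cities/Boston" -> ["", "cities", "Boston"]
    if len(segments) == 3 and segments[1] == "cities" and segments[2]:
        return unquote(segments[2])
    return None


print(city_from_uri("http://example.org/cities/Boston"))  # prints: Boston
print(city_from_uri("http://example.com/cities/Boston"))  # prints: None
```

The cost of the tradeoff is visible here: the code is useless for any
authority other than the one whose rules it hard-codes.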
> I am unsure what good practise
> is established here, the text explains an option, it does not make
> any suggestion.
It goes a little further: it says "Good Practice: URI assignment
authorities and the Web servers deployed for them may benefit from an
orderly mapping from resource metadata into URIs." I think that's true
and I think it's stated quite clearly.
> Section 2.6 is paradox, http://example.org/123Hx67v4gZ5234Bq5rZ is
> obviously not intended for direct use by people and therefore the
> good practise does not apply.
I understand your point, I just don't think the section as drafted suffers
from the problem that you imply it does. To make a point clearly, it
effectively says: "If the authority were dumb enough to think that a user
would easily remember and type 'http://example.org/123Hx67v4gZ5234Bq5rZ',
they would obviously be sadly mistaken." Having established that fact...
> To make it meaningful you would have
> to root the good practise as usability concern and base it upon user
> goals; in other words, URIs that people want to make direct use of
> are to be made usable by people (which again is obvious).
Well, I agree with one important exception: it's often the case that the
authority decides which URIs it intends to be conveniently usable by
users, and which it expects will be used internally to databases, web page
links, etc. The finding says:
"Although Web architecture does not require that URIs be easy to
understand or suggestive of the resource named, it's handy if those
intended for direct use by people are.
"Good Practice: URIs intended for direct use by people should be easy to
understand, and should be suggestive of the resource actually named."
I think you're saying: the end users should decide which URIs they want
convenient, and somehow that turns into the assignment authority having an
obligation to them. Of course, many organizations will have commercial or
other reasons for wanting to make their users happy, but I think it's
ultimately up to the authority to decide how to balance convenience for
end users of its resources with other factors affecting URI allocation. I
think the finding is good as it stands.
> It seems http://www.w3.org/2001/tag/doc/metaDataInURI-31 is intended
> to be used directly by people, yet it is not easy to understand (why
> "2001"? Why "doc"? Why "31"?) and consequently not suggestive of the
> resource actually named (too much noise to determine the signal).
Well, the choice of this assignment pattern was made years before this
finding was named, but I don't buy your premise: I don't think this is to
be regularly typed by people. Almost everyone I know who uses this,
including me (and I use it a lot), either clicks on it, copies and pastes
it via the clipboard, finds it on the TAG's list of findings, or clicks on
links to it in emails etc. I think this is a simple disagreement between
you and those who administer the W3C site as to just how easy they want
these finding IDs to be to type, and perhaps whether you want what appears
to be a date (2001) to be suggestive of some significant event in the
authorship of this finding.
> The good practise in 2.7 is implied by the one in 2.6, it is harmful
> if interpreted incorrectly, and poorly extroduced (resource locators
> locate resources, they do not convey metadata about a resource as
> claimed in the extroduction).
I didn't say they convey it. I said that at some authorities there is an
internally established policy of >synchronizing< metadata with the
assignment of the URI. That's not necessarily about conveying it to
anyone outside. If you establish such a policy, and if that metadata
changes, you're going to face a difficult choice: change the URI for an
existing resource, or not observe your own policy. I think that's a fact.
I think it's only indirectly related to "conveying" metadata in the URI.
> I also note that there is little if any
> consensus about this; the principle builds upon the assumption that
> it's a bad idea to change resource locators as the resource changes;
Yes. Cool URIs don't change.
> you can equally well say that resources should never be changed,
Um, well, then it makes it really difficult to have links to anything like
a clock, as the name of the clock would change every time it ticked.
That's an extreme case, but the same is true for things like news
articles. Sometimes you want a link that refers to exactly the 4PM
revision, and sometimes you want a link that gets you the live version. If
you put the author's name in that link, and suddenly a new author adds a
paragraph at 5PM, you've got a problem. See also the evolving work on TAG
issue Generic Resources [5] and the early drafts of a finding that Raman
is working on [6]. This work is exploring the balance between having
stable URIs and URIs that change when there are differing or time-varying
representations of what otherwise feels like the same resource.
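The author-name case can be sketched in a few lines of Python. Both URI
layouts, the article identifier, and the author names below are
hypothetical; the sketch only shows why a policy of synchronizing changeable
metadata with the URI forces the choice described above.

```python
from typing import List


def uri_with_authors(article_id: str, authors: List[str]) -> str:
    """Hypothetical policy: the URI is kept in sync with the author list."""
    return f"http://example.org/news/{'-'.join(authors)}/{article_id}"


def uri_stable(article_id: str) -> str:
    """Hypothetical policy: the URI carries only an identifier that never changes."""
    return f"http://example.org/news/{article_id}"


at_4pm = uri_with_authors("storm-report", ["smith"])
at_5pm = uri_with_authors("storm-report", ["smith", "jones"])  # new author at 5PM

print(at_4pm == at_5pm)                                          # prints: False
print(uri_stable("storm-report") == uri_stable("storm-report"))  # prints: True
```

Under the first policy, either every existing link to the 4PM URI breaks,
or the authority stops observing its own rule.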
> It seems the References section violates the "Consistent URI usage"
> good practise and the document its own "Resource metadata that will
> change SHOULD NOT be encoded in a URI."
Sorry, I'm not getting this comment.
> "URIs" have been obsoleted many, many years ago, only a few
> confused people want them to stay.
Huh? Sorry, I'm again not getting this.
> So in conclusion I am unsure why we should look at the draft?
Well, at the risk of stating the obvious, there's a lot of confusion about
when to put metadata into URIs and when to depend on it. While I'm sorry
that you're obviously not enthusiastic about a lot that's in this, I would
have thought that "why to look at it" wasn't in question.
> There is extraordinarily broad agreement that W3C's so-called
> "namespace policy" makes no sense whatsoever, I don't think
> there is anything to be discussed here really. Many acceptable schemes
This makes me nervous that you're mixing two things that are only
tangentially related: the merits of this finding, vs. the W3C's
particular assignment policy for URIs such as
http://www.w3.org/2001/tag/doc/amazingFinding.html. While one would
certainly hope that W3C's choices would be broadly consistent with the
advice in the finding, I think this discussion needs to be about the
finding itself.
In conclusion: I hope the above represents at least a careful look at the
issues raised. There are several which I believe are already resolved, or
which I propose (as noted above) to run by the rest of the TAG. I will
also be working through a related set of comments from Stuart Williams.
While I certainly wouldn't look for final concurrence until a revised
draft is complete, I hope this is an acceptable path forward. Bjoern:
thank you again for your care in commenting on the draft.
Noah
[1] http://lists.w3.org/Archives/Public/www-tag/2006Jun/0152.html
[2] http://www.w3.org/2001/tag/doc/metaDataInURI-31-20060609.html
[3] http://lists.w3.org/Archives/Public/www-tag/2006Jun/0156.html
[4] http://lists.w3.org/Archives/Public/www-tag/2006Jun/0157.html
[5] http://www.w3.org/2001/tag/issues.html?type=1#genericResources-53
[6] http://www.w3.org/2001/tag/doc/alternatives-discovery.html
--------------------------------------
Noah Mendelsohn
IBM Corporation
One Rogers Street
Cambridge, MA 02142
1-617-693-4036
--------------------------------------
Received on Tuesday, 15 August 2006 16:32:26 UTC