- From: <noah_mendelsohn@us.ibm.com>
- Date: Tue, 15 Aug 2006 12:32:13 -0400
- To: Bjoern Hoehrmann <derhoermi@gmx.net>
- Cc: www-tag@w3.org
Bjoern Hoehrmann wrote on June 30 a note [1] that was nominally about XBL namespaces, but that in fact conveyed a lot of concerns regarding the draft finding on Metadata in URIs [2]. So, I'm taking it as I would any other input on a draft finding, and am attempting to respond here in detail. The comments in fact arrived just as the TAG was voting to move the draft to full finding status. I sent a rather tentative and incomplete response around that time at [3], and Bjoern responded to that at [4]. Accordingly, I told the TAG that we should hold off on the final finding until I had a chance to work through Bjoern's response in more detail. This is my attempt to do that.

I'm trying to strike a balance here. On the one hand, I want to be responsive where there are important concerns. On the other, we always have to pick a point where the TAG will say "publish", and further comments can be considered as input to possible revisions. So, I've tried to respond to Bjoern's comments with some detail and care, but given their late arrival I am setting the bar a little higher than I might normally in being open to a significant redraft of the finding. I hope the following strikes a reasonable balance.

The following quotes are from Bjoern's notes [1,4], followed by my comments.

> * noah_mendelsohn@us.ibm.com wrote:
> >Which begs the questions: what sorts of information should be in a URI,
> >and who should or shouldn't depend on it being there? This might be the
> >time to remind everyone that the TAG has been working on just those
> >questions under the banner metadataInURI-31 [1]. We have a draft finding
> >[2], which as of the last TAG F2F is quite close to final. If this
> >discussion is going to turn to what should or shouldn't be encoded in the
> >text of a URI, I suggest giving the draft a look first. Thanks!
> >
> >[1] http://www.w3.org/2001/tag/issues.html#metadatainURI-31
> >[2] http://www.w3.org/2001/tag/doc/metaDataInURI-31
>
> Let's see.
> The resource locator of the resource is poorly chosen,

This finding is following the same naming policy for drafts as the TAG has used for other findings. While I can see that sensitivities are raised particularly for a finding on metadata in URIs, I personally think the name is fine (if not ideal), and propose not to tackle changes to the W3C naming policy in conjunction with publication of this particular finding.

> the document then incorrectly claims to make correct use of RFC2119 terms

From Bjoern's 2nd response at [4]:

> TAG findings are not normative documents and do not specify con-
> formance; the keywords are to specify requirement levels. The
> concept of having requirements in non-normative documents that
> cannot be complied with does not make sense to me. The specific
> use of "MUST NOT" is highly suspicious, it's entirely unclear
> from the text how interoperability is at stake, or which harm
> is done by ignoring it.

From RFC 2119:

"In many standards track documents several words are used to signify the requirements in the specification. These words are often capitalized. This document defines these words as they should be interpreted in IETF documents. Authors who follow these guidelines should incorporate this phrase near the beginning of their document:

"The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT", "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this document are to be interpreted as described in RFC 2119.

"Note that the force of these words is modified by the requirement level of the document in which they are used."

I don't see that as forbidding references from documents which are non-normative. In fact, if the scope is limited at all, it's specifically to IETF documents, and I think the precedent is well established for referencing RFC 2119 from non-IETF specifications. Furthermore, while TAG findings are not Recommendations, they are in some sense normative.

Continuing from RFC 2119:

"1.
MUST This word, or the terms "REQUIRED" or "SHALL", mean that the definition is an absolute requirement of the specification."

I'd say that's true of the use in the draft finding.

"6. Guidance in the use of these Imperatives

"Imperatives of the type defined in this memo must be used with care and sparingly. In particular, they MUST only be used where it is actually required for interoperation or to limit behavior which has potential for causing harm (e.g., limiting retransmisssions) For example, they must not be used to try to impose a particular method on implementors where the method is not required for interoperability."

Well, I think that's true here. The draft finding says:

"Constraint: Web software MUST NOT depend on the correctness of metadata inferred from a URI, except as licensed by applicable standards and specifications."

I think that's pretty essential to preserving interoperability of the Web and eliminating harm.

> The first good practise is rather odd; it's introduced by claiming that
> the scheme component of a resource identifier is metadata "peeked" from
> the resource identifier, and attempts to suggest making, say, a browser
> that supports only HTTP and therefore only HTTP resource locators is
> somehow bad (which it is not). It then repeats justifaction for the
> constraint discussed above. Frankly, I have no idea what's this trying
> to say.

I suspect this is referring to what was section "2.2 Avoid depending on metadata". That entire section has been removed, so I'm guessing that concern is resolved.

> The next good practise is obvious again ("Don't do something unless
> willing to accept the consequences") though I like how it's introduced
> by "Web users act on guesses about URIs all the time"; this happens e.g.
> if I call you and ask you to go to "http://example.org/weather/Boston"
> and tell me what's on that page.
> If you then infer I might be wondering
> about the weather in Boston, you are acting contrary to "Agents making
> use of URIs SHOULD NOT attempt to infer properties of the referenced
> resource."

I agree that it borders on the stupidly obvious, but I didn't come up with a better way to make the point that you may guess, but there are downsides, and the risk is yours. Since the TAG at its face to face approved this formulation, I'm inclined to leave it, but I take the point. We'll be reviewing this response with the rest of the TAG, and I'll highlight this as an area of concern (Stuart Williams raised it too). If we can come up with a better way to make the point, I'll do it.

> Section 2.4 is a bit funny. I am unaware of where the HTML spec makes
> any mention of what is inferred,

It's only indirectly the HTML spec. The sequence is: the resource authority sends to a client computer (let's call it my computer) an HTML form. On the form is code that renders a button that, when pressed, assembles a URI that is built in a structured way to include data entered from fields on the form. In that sense, the resource authority has invited me to submit URIs of that form. Insofar as the form itself includes documentation that warrants what those URIs are for, and the form comes from the resource authority, then the authority has effectively documented the intended use of those URIs.

Crucially, the server at the authority can't tell whether I construct those URIs by actually doing the obvious thing and presenting the form in a browser, or by some other means. I don't believe that HTML forms have a timeout mechanism that says "don't hold this form for a few weeks and then fill it in", so I can equally write some other software to send those same URIs a few weeks later (cool URIs don't change). I agree the HTML spec doesn't tell the story in quite those terms, but the deployment of the form by the resource authority has exactly that implication, I think.
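
To make that sequence concrete, here is a minimal sketch in Python. It assumes a hypothetical form whose action is http://example.org/weather and which has a single text field named city; neither the form nor the helper names come from the finding, they are purely illustrative. It shows that the URI a browser assembles from such a form carries nothing to distinguish it from the same URI written out by hand or by other software.

```python
from urllib.parse import parse_qs, urlencode, urlsplit


def form_submission_uri(action: str, fields: dict) -> str:
    """Assemble the URI a browser would request when a GET form is submitted."""
    # urlencode percent-encodes names and values and joins them with "&",
    # the usual application/x-www-form-urlencoded encoding for form data.
    return f"{action}?{urlencode(fields)}"


def submitted_fields(uri: str) -> dict:
    """What the server can recover from the URI: field values, not their origin."""
    return parse_qs(urlsplit(uri).query)


# Hypothetical form: action http://example.org/weather, one text field "city".
from_browser = form_submission_uri("http://example.org/weather", {"city": "Boston"})
# The same URI, typed or generated by other software weeks later.
from_script = "http://example.org/weather?city=Boston"

print(from_browser)
print(submitted_fields(from_browser) == submitted_fields(from_script))
```

The two URIs are identical strings, which is the whole point: whatever the form documents about them applies however they were constructed.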
> and "The same HTML Form is also a
> computer program, executable by the browser, ..." well, that's sure a
> good one for the .signature file.

I'm afraid the above comment doesn't at all convince me that what I wrote is even slightly wrong. Is it a computer program? I'd say so. It's not in a Turing-complete language, but it surely instructs a browser to render certain UI, prompt for certain things, and if a given button is pushed to submit another Web transaction. Sounds like a program to me. I admit it's not a program in a general purpose language.

> Section 2.5 apparently contradicts "Avoid software dependencies on
> metadata in URIs." in that it suggests "Even if it does not document
> this policy publicly, example.org's own Web servers can safely depend
> on it" implying it's okay to do that.

First of all, that "avoid dependencies" section is indeed gone. Secondly, I think we acknowledge at the top:

"As these examples show, encoding or not encoding metadata in a URI or deciding whether to rely on such metadata is often a tradeoff, involving some benefits and some costs. In such cases, choices should be made that best meet the needs of particular resource providers and users."

That's exactly the case here. The authority and its server managers may benefit from a regular assignment of URIs and from writing software, for use within their authority, that benefits from their assignment rules. True. That software will be less general than other software, and in that respect less valuable. Also true. A tradeoff. The important distinction is that someone running similar software without reliable knowledge of the assignment rules is breaking the rules: they are creating an expectation that the authority will use URIs in a certain way, when in fact it's the authority that gets to make that decision.

> I am unsure what good practise
> is established here, the text explains an option, it does not make
> any suggestion.
It goes a little further: it says

"Good Practice: URI assignment authorities and the Web servers deployed for them may benefit from an orderly mapping from resource metadata into URIs."

I think that's true and I think it's stated quite clearly.

> Section 2.6 is paradox, http://example.org/123Hx67v4gZ5234Bq5rZ is
> obviously not intended for direct use by people and therefore the
> good practise does not apply.

I understand your point, I just don't think the section as drafted suffers from the problem that you imply it does. To make a point clearly, it effectively says: "If the authority were dumb enough to think that a user would easily remember and type "http://example.org/123Hx67v4gZ5234Bq5rZ" they would obviously be sadly mistaken." Having established that fact...

> To make it meaningful you would have
> to root the good practise as usability concern and base it upon user
> goals; in other words, URIs that people want to make direct use of
> are to be made usable by people (which again is obvious).

Well, I agree with one important exception: it's often the case that the authority decides which URIs it intends to be conveniently usable by users, and which it expects will be used internally to databases, web page links, etc. The finding says:

"Although Web architecture does not require that URIs be easy to understand or suggestive of the resource named, it's handy if those intended for direct use by people are.

"Good Practice: URIs intended for direct use by people should be easy to understand, and should be suggestive of the resource actually named."

I think you're saying: the end users should decide which URIs they want convenient, and somehow that turns into the assignment authority having an obligation to them.
Of course, many organizations will have commercial or other reasons for wanting to make their users happy, but I think it's ultimately up to the authority to decide how to balance convenience for end users of its resources with other factors affecting URI allocation. I think the finding is good as it stands.

> It seems http://www.w3.org/2001/tag/doc/metaDataInURI-31 is intended
> to be used directly by people, yet it is not easy to understand (why
> "2001"? Why "doc"? Why "31"?) and consequently not suggestive of the
> resource actually named (too much noise to determine the signal).

Well, the choice of this assignment pattern was made years before this finding was named, but I don't buy your premise: I don't think this is to be regularly used by people. Almost everyone I know who uses this, including me (and I use it a lot), either clicks on it, copies and pastes it via the clipboard, finds it on the TAG's list of findings, or clicks on links to it in emails. I think this is a simple disagreement between you and those who administer the W3C site as to just how easy they want these finding IDs to be to type, and perhaps whether you want what appears to be a date (2001) to be suggestive of some significant event in the authorship of this finding.

> The good practise in 2.7 is implied by the one in 2.6, it is harmful
> if interpreted incorrectly, and poorly extroduced (resource locators
> locate resources, they do not convey metadata about a resource as
> claimed in the extroduction).

I didn't say they convey it. I said that at some authorities there is an internally established policy of >synchronizing< metadata with the assignment of the URI. That's not necessarily about conveying it to anyone outside. If you establish such a policy, and if that metadata changes, you're going to face a difficult choice: change the URI for an existing resource, or not observe your own policy. I think that's a fact.
I think it's only indirectly related to "conveying" metadata in the URI.

> I also note that there is little if any
> consensus about this; the principle builds upon the assumption that
> it's a bad idea to change resource locators as the resource changes;

Yes. Cool URIs don't change.

> you can equally well say that resources should never be changed,

Um, well, then it makes it really difficult to have links to anything like a clock, as the name of the clock would change every time it ticked. That's an extreme case, but the same is true for things like news articles. Sometimes you want a link that refers to exactly the 4PM revision, and sometimes you want a link that gets you the live version. If you put the author's name in that link, and suddenly a new author adds a paragraph at 5PM, you've got a problem. See also the evolving work on TAG issue Generic Resources [5] and the early drafts of a finding that Raman is working on [6]. This work is exploring the balance between having stable URIs and URIs that change when there are differing or time-varying representations of what otherwise feels like the same resource.

> It seems the References section violates the "Consistent URI usage"
> good practise and the document its own "Resource metadata that will
> change SHOULD NOT be encoded in a URI."

Sorry, I'm not getting this comment.

> "URIs" have been obsoleted many, many years ago, only a few
> confused people want them to stay.

Huh? Sorry, I'm again not getting this.

> So in conclusion I am unsure why we should look at the draft?

Well, at the risk of stating the obvious, there's a lot of confusion about when to put metadata into URIs and when to depend on it. While I'm sorry that you're obviously not enthusiastic about a lot that's in this, I would have thought that "why to look at it" wasn't in question.
> There is extraordinarily broad agreement that W3C's so-called
> "namespace policy" makes no sense whatsoever, I don't think
> there is anything to be discussed here really. Many acceptable schemes

This makes me nervous that you're mixing two things that are only tangentially related: the merits of this finding, vs. the W3C's particular assignment policy for URIs such as http://www.w3.org/2001/tag/doc/amazingFinding.html. While one would certainly hope that W3C's choices would be broadly consistent with the advice in the finding, I think this discussion needs to be about the finding itself.

In conclusion: I hope the above represents at least a careful look at the issues raised. There are several which I believe are already resolved, or which I propose (as noted above) to run by the rest of the TAG. I will also be working through a related set of comments from Stuart Williams. While I certainly wouldn't look for final concurrence until a revised draft is complete, I hope this is an acceptable path forward.

Bjoern: thank you again for your care in commenting on the draft.

Noah

[1] http://lists.w3.org/Archives/Public/www-tag/2006Jun/0152.html
[2] http://www.w3.org/2001/tag/doc/metaDataInURI-31-20060609.html
[3] http://lists.w3.org/Archives/Public/www-tag/2006Jun/0156.html
[4] http://lists.w3.org/Archives/Public/www-tag/2006Jun/0157.html
[5] http://www.w3.org/2001/tag/issues.html?type=1#genericResources-53
[6] http://www.w3.org/2001/tag/doc/alternatives-discovery.html

--------------------------------------
Noah Mendelsohn
IBM Corporation
One Rogers Street
Cambridge, MA 02142
1-617-693-4036
--------------------------------------
Received on Tuesday, 15 August 2006 16:32:26 UTC