- From: Sandro Hawke <sandro@w3.org>
- Date: Fri, 13 Feb 2004 16:14:55 -0500
- To: public-webarch-comments@w3.org
This document has come a long way since I last read it. Excellent. Here are some comments on the first half or so. Some are trivial, some are more substantive. Commenting on http://www.w3.org/TR/2003/WD-webarch-20031209/ -- sandro =========================================== == Comment 1, 1. Introduction: Identification. Each resource is identified by a URI. In this travel scenario, the resource is about the weather in Oaxaca and the URI... ^^^^^ This was jarring to read. The text up to that point is simple and direct, but suddenly here there's handwaving with "about." What *is* the resource identified by that URI? Fortunately, in the picture you answer this question. Suggested text: Each resource is identified by a URI. In this travel scenario, the resource is a periodically-updated report on the weather in Oaxaca, and the URI ... =========================================== == Comment 2, 1. Introduction: The server responds with a representation that includes XHTML data and the Internet Media Type "application/xml+xhtml". In the graphic, you show the media type as text/html, which is probably the better choice for simplicity's sake. =========================================== == Comment 3, 1.1.3 Principles, Constraints, and Good Practice This categorization is derived from Roy Fielding's work on "Representational State Transfer" [REST]. Authors of protocol specifications in particular should invest time in understanding the REST model and consider the role to which of its principles could guide their design: statelessness, clear assignment of roles to parties, uniform address space, and a limited, uniform set of verbs. The first sentence is fine, the second reads rather like a paid product placement. Is Fielding's thesis that much better than every other work ever written on distributed systems design that it merits strong recommendation in the section introducing labeling terms? If you want to save this text, put it in a Recommended Reading section. =========================================== == Comment 4, 1.2.1. Orthogonal Specifications: ... agents can interact with any identifier ... That's ambiguous. Replace "with" with "using" and I think you're okay. Otherwise it sounds rather like the identifier is one of the parties doing something in an interaction. =========================================== == Comment 5, 1.2.1. Orthogonal Specifications: ... that the an image ... ^^^ (typo) =========================================== == Comment 6, 1.2.2. Extensibility: The following applies to languages, in particular the specifications of data formats, of message formats, and URIs. Note: This document does not distinguish in any formal way the terms "format" and "language." Context has determined which term is used. I can't really parse the first sentence. Maybe you mean something like: The data formats and (more generally) formal languages used in the bodies of messages and even in the text of URIs can be defined to have certain properties to promote evolution and interoperation. =========================================== == Comment 7, 1.2.4. Protocol-based Interoperability It is common for programmers working with the Web to write code that generates and parses these messages directly. It is less common, but not unusual, for end users to have direct exposure to these messages. This leads to the well-known "view source" effect, whereby users gain expertise in the workings of the systems by direct exposure to the underlying protocols. This seems out of place. I get the point, but it's never summed up. And I don't see how it belongs in 1.2 General Architecture Principles. I think you mean: Good practice: design protocols and data formats which people can view and reproduce with a minimum of special tools and effort. [ Ahhh, this is finally covered in Section 4.1; maybe a forward link? ] and maybe: Good practice: user agents should allow user to look "inside" to see (and even manipulate) the protocol interactions the agent is performing on behalf of the user. =========================================== == Comment 8, 2. Identification (see also Comment 10 below) Parties who wish to communicate must agree upon a shared set of identifiers and on their meanings. This is untrue for some reasonable meanings of "meaning", as Pat Hayes has argued from time to time. You could say instead: Parties who wish to communicate must agree on the practical effects of using certain identifiers. or Parties who wish to communicate must agree upon a shared set of identifiers and (to a reasonable degree) on their meanings. That is: some ambiguity of meaning is both reasonable and unavoidable. I don't think an unqualified "agree" normally means "partially agree". Does http://weather.example.com/oaxaca identify the weather report for just Oaxaca or for the Oaxaca region? When it starts to matter, you can start to build a shared understanding of which it is. But you can't banish those ambiguities until you notice them. There's also a school of design where you choose not to banish them, even when you see them, until you know they matter. =========================================== == Comment 9, 2.2. URI Ownership ... the "uuid" scheme ... and ... the "md5" scheme ... but you don't give references. They are not on IANA's list. I pay some attention, and I'm not aware of a stable specification for either one. The spec on DanC's list for UUID has long since expired; the reference for MD5 is simply to a hypothetical use of it. For uuid you could use urn:nid-5, but that's technically not a "URI scheme": http://www.iana.org/assignments/urn-namespaces http://lists.research.netsol.com/pipermail/urn-nid/2002-July/000308.html Maybe you can says "such as a possible 'UUID' scheme", etc, or you could use WebDAV's unique-lock-token scheme. =========================================== == Comment 10, 2.3. URI Ambiguity (see also Comment 8 above) URI ambiguity should not be confused with ambiguity in natural language. Well then use a different word! Please! Call it "URI Overloading" and let "URI Ambiguity" be used for the unavoidable and quite acceptable situation I talked about in my Comment 8. "Overloading" seems to me the appropriate word. Eg: (1) In programming languages, a feature that allows an object to have different meanings depending on its context. The term is used most often in reference to operators that can behave differently depending on the data type, or class, of the operands. For example, x+y can mean different things depending on whether x and y are simple integers or complex data structures. -- http://www.webopedia.com/TERM/o/overloading.html I suggest, then, that "URI overloading" is the bad practice of using one URI to identify multiple things commonly known to be distinct and useful to distinguish among, while "URI ambiguity" refers to the fact that we can never communicate *exactly* what someone else means a URI to identify. =========================================== == Comment 11, 2.3.1. URIs in other Roles: The fact that the URI serves other purposes in non-Web contexts does not lead to URI ambiguity. URI ambiguity arises [when] a URI is used to identify two different _Web_ resources. What makes the example case okay is not that Nadia is not a "_Web_ resource". She may well be. There are two things that make it okay to refer to Nadia as "mailto:nadia@example.com". 1. It's done in secret, in private. People can redefine terms however they want in private. It's probably a bad idea for the same reason defining private URI schemes is bad, of course; they leak out. But until they do, it's basically okay. 2. Their system is maintaining an implicit indirection between the mailbox and the person. They're just abbreviating the screen prompt which should strictly say "Please enter the identifier for the mailbox of the person:" to "Please enter person id:". That's just simplification for the human reader -- the programmer should be remembering the indirection in effect and when necessary (eg writing out the conference registration data in RDF to publish) using or documenting it explicitely. Not a problem. =========================================== == Comment 12, 2.6. Fragment Identifiers: The fragment identifier of a URI allows indirect identification of a secondary resource by reference to a primary resource and additional information. The secondary resource may be some portion or subset ... So if a URI contains a "#", then the "second resource identified by the URI" is the same as the "resource identified by the URI." That's a rather odd use of the word "secondary". I would suggest instead that you: (1) Name the the portion of the URI up the the "#". TimBL has called this the "racine", but I like "stem", "trunk", or maybe even "non-fragment portion". (2) Call the resource identified by a URI's stem the "stem resource", or something like that. I imagine the awkward primary/secondary bit is probably left over from the idea that "URI-References" were not URIs. Actually, that's worth talking about. The newly issue RDF Recommendations still call then "URI References": A URI reference within an RDF graph (an RDF URI reference) is a Unicode string ... -- http://www.w3.org/TR/2004/REC-rdf-concepts-20040210/#section-Graph-URIref Can you add some text saying why you've decided "URI References" are now to be called "URIs", ... or something? =========================================== == Comment 13 References for OWL and RDF can now point to the Recs. =========================================== == Comment 14, 2.7.2. Assertion that Two URIs Identify the Same Resource Emerging Semantic Web technologies, including the "Web Ontology Language (OWL)" [OWL10], define RDF [RDF10] properties such as sameAs to assert that two URIs identify the same resource or functionalProperty to imply it. I would add: Knowing two URIs identify the same resource does not, however, mean they are interchangeable. For example, Oaxaca might have several government-run weather stations, and the measurements take from each of these might be available from both weather.example.org and weather.example.com. The first might call a particular station http://weather.example.org/stations/oaxaca#ws17a while the second calls it http://weather.example.com/rdfdump?region=oaxaca&station=ws17a These two URIs would both identify the same resource, a certain collection of weather measuring equipment. They are owl:sameAs each other. But an attempt to dereference them might well produce different content produced by different organizations (probably based originally on the same government-supplied data), so a user agent which substituted one for the other would be serving its user poorly. =========================================== == Comment 15, 3.2. Messages and Representations (#def-representation) ... includes a representation of the state of the resource. A representation is an octet sequence ... ... electronic data about resource state ... Does "state" really mean anything there? Is there a difference between "data about Ian" and "data about the state of Ian"? Maybe this could be clarified with: Note: the phrases "representation of the state of the resource" and "representation of the resource" mean essentially the same thing; the term "state" is sometimes used to help convey that resources and thus their representations often vary over time. =========================================== == Comment 16, 3.2. Messages and Representations (#def-representation) ... electronic data about resource state ... In theory, not all computers are electronic. How about just "data"? Or "information". =========================================== == Comment 17, 3.5. Safe Interactions and sends a message composed of form data using the POST method. Note that this is not a safe interaction; So you're suggesting it's not safe to engage in e-commerce? :-) I know you've worked hard to find a nice, simple way to explain When To Use GET, but I don't think this is it. Asking a non-commital question can be very dangerous. For instance: GET http://books.example.org/ check-order-status?credit-card=1234567877653453 That's non-committal, but it's certainly not "safe". Similarly, using POST isn't necessarily "unsafe"; it just means you're making a statement instead of asking a question. It's not quite as snappy as "GET IS SAFE!", but I suppose Ian could get people to chant "Never Punish People For Asking Questions!". =========================================== == Comment 18, on 3.6. Representation Management Dirk clicks on the link in the email he receives and is surprised to see his browser display a page about auto insurance. That seems pretty implausible. It's a working weather site to one user and a working car-insurance-ad site to another? Based in IP address, or what? Did his IP address get listed in a database of entries like, "please deliver this ad to that guy, if he ever visits"...? This seems more common and likely to me: Since Nadia finds the Oaxaca weather site useful, she emails a review to her friend Dirk recommending that he check out 'http://weather.example.com/oaxaca'. Dirk clicks on the link in the email he receives and is surprised to see his browser respond "404 Not Found". Dirk confirms the URI with Nadia, who now has the same problem. Eventually, they figure out the site has been reorganized, and the page Nadia recommended is now called "http://weather.example.com/newserv?loc=oaxaca". Embarassed by this interaction, Nadia stops recommending the site people. =========================================== == Comment 19, on URI Persistence There are strong social expectations that once a URI identifies a particular resource, it should continue indefinitely to refer to that resource; this is called URI persistence. I think "URI persistence" is more about observable behavior than the URI->Resource binding. What matters most is what the page is useful for -- that's what guides how people try to categorize and store and annotate and generally talk about it. A daily news page which stops being updated is nearly as bad as one which goes 404. It's the same resource ("Joe's Life"), it's just the representations aren't as high quality any more. Same resource being identified; different experience. Uncool URI. =========================================== == Comment 20, 3.7. Future Directions for Interaction "MLdonkey" appears to be just an application/client. The actually network/protocol which MLdonkey and many other app/clients implement is called "eDonkey2000". (http://www.edonkey2000.com/) =========================================== == Comment 21, 4.2.3. Extensibility 2. "Must understand": The agent treats unrecognized markup as an error condition. "markup"? This isnt just about XML. How about "syntactic constructs".
Received on Friday, 13 February 2004 16:12:34 UTC