- From: Tim Berners-Lee <timbl@w3.org>
- Date: Fri, 14 Nov 2003 09:04:51 +0900
- To: www-tag@w3.org
- Message-Id: <2B14BA5A-1636-11D8-8D4F-000A9580D8C0@w3.org>
Here are my comments on a complete read-through of the architecture document as of 2003-11-11.

First of all, I must confess to a warm glow of satisfaction and pride in the group as I read through the stuff we have got in there which is really well hashed out and I think will be a great benefit.

So now to the comments. Most of these are editorial. In some cases I have written some text, a couple being new stuff we should maybe resist putting in, though they were in response to "the tag should" marks in the text. Text to be inserted I have quoted in mail form, as I am using a mail writer, so that's what I got. It does not indicate text from another message, just quoting my suggested text. Makes it blue for me.

____________________________

The very first paragraph of the Abstract encapsulates the fact that we haven't solved httpRange-14 yet. It uses the word "resource" in two distinct ways.

1. Introduction, after the story

- List element 1: s/involves/is about/ (too vague)
- Diagram misrepresents "representation" as being the set of octets "<html>...</html>". These are only the bits; there is metadata "content-type: text/xhtml+xml" or whatever, which I suggest should be in a box on top of the existing box.
- Editor's note "we may add other diagrams" seems silly.

1.2.3 Syntax and Interop'y

s/syntax, by specifying the content and sequence/the syntax, meaning and sequence/ (the content and sequence are not specified in the syntax)

2. Identification

s/linked-to within the information space/linked-to/

We are on tricky ground here without httpRange-14. Don't imply that the destination of a link should be an information resource if you want it to be able to be a car.

Actually I don't like the use of "link" for all uses of URIs. Just use as a reference is a good thing to talk about unless specifically talking about hypertext.

Later, in box "Principle: Assign URIs", remove the word "identified", which is used in a strange sense.
suggest: "assign a URI to each resource to which it is intended that others may refer" or "assign a URI to each resource which others will refer to."

2.2 just before 2.3: rewrite as "URI ambiguity only arises if different parties use "http://www.example.com/moby" to identify different things". I don't think we need to get into belief here.

2.3 URI schemes

There is a Note that the TAG should provide more justification for expanding by media type instead of new URIs. I agree. I have had a go at it here:-

> HTTP is a powerful technology benefiting from many features and much
> support, technical and social. Technical support includes not only
> client and server code, but also proxy and firewall systems, offline
> caching, robots, search engines, and so on. Features include
> confidentiality (with SSL), authentication, etc., and hierarchical
> delegation of authority. Socially, support is from the DNS system
> management, and internal resource and access management within web
> sites.
>
> The HTTP space is more than just a protocol. It is an information
> space supported by an evolving set of protocols. HTTP has evolved and
> will evolve again.
>
> Good Practice
> Use HTTP where possible for new designs instead of
> designing a new space.
>
> To make a new scheme when HTTP could have been used
> - deprives the new system of the support mentioned above;
> - increases the code burden for systems (for example small portable
>   devices) which will end up having to implement both stacks;
> - could involve the community in an expensive rework of all the HTTP
>   system to date.
>
> If there is a perceived inadequacy in the HTTP features (such as, say,
> security, or domain name governance) then it is generally better
> to fix the feature in HTTP than to design the stack.
>
> To make a new scheme name (such as webcal:) when the protocol in use
> is in fact HTTP,
> - prevents existing software from being able to use the URI when
>   otherwise it could;
> - prevents the new system from taking advantage of certain of the
>   features of http, such as local caching.
>
> Good Practice:
>
> Where a system uses HTTP, it should also use the "http:"

I think this should go in, but probably not into a last call draft.

2.5 Fragment Identifiers

Remove "indirect" from the first non-story para of 2.5 "allows [indirect] identification of a secondary resource". The word "indirect" has specifically been used for another case, where "indirect identification" is by, for example, giving an unambiguous property of a thing, such as a person's SSN or email address.

2.6.2 Determination that two URIs identify

Could we change "determination" to "expression", please? We are talking here not so much about the ability to determine but the ability to express that two URIs are the same.

In the same section, change "equivalentTo" to "sameAs". The OWL vocab is now current, and this changed from DAML.

In the same para, change "state assert" to "directly state or indirectly imply"

2.6.3 I didn't notice this section go in. Can we remove it? I think DDDS is often harmful, as it is used as a [much more complicated] way of reinventing HTTP and justifying the use of new URN schemes. If we mention it, we should put a warning there.

3.2 Messages and Representations

rewrite first para as

A message is a communication event that is part of one of a non-exclusive set of messaging protocols (eg HTTP, FTP, NNTP, SMTP and SOAP).

(To say it is an event is true and important but misses the point that it is a communication. To use "represented by" here was wrong - we use that word with a very specific meaning in this doc. To omit SOAP suggests that SOAP messages are not messages, when they are.
Important to show that messages at this level are messages too, even when conveyed on the back of lower level messages.)

In point 1 just below that: s/Electronic data about resource state/Electronic data expressing resource state/

[end of my p17]

Section 3.4

There has been a nervousness around talking about ownership of URIs. We currently use the term "authority responsible for" and we are embarrassed about it. This partly comes from not writing it down - and partly from our (unfortunate) reluctance to discuss HTTP specifically. Here goes some text.

> Ownership of URLs.
>
> This term is used in the following sense. URIs are minted as an
> operation by an agent, when a given string for the first time is
> associated with the given resource. The requirement for URIs to be
> unambiguous demands that two agents do not mint the same URI for
> different resources. The URI schemes assure this using different
> techniques.
>
> a) The hierarchical delegation of authority allows ever smaller nested
> parts of URI space to be assigned to parties. (Example: http, mailto)
> b) The generation of a fairly large random number reduces ambiguity to
> calculated small risk; (example: uuid)
> c) The generation of a URI as a checksum of a data object itself
> (example: md5: )
>
> or a combination of more than one (eg mid:, cid:). Whatever the
> techniques used, except for the case (c), the agent has a unique
> relationship with the URI, which we can term "ownership". The social
> implications of this are not discussed here.
>
> The HTTP protocol gives the owner the power to serve representations
> of resources, and the HTTP origin server is the URI owner's agent. The
> concept of URI ownership, or "responsible authority" is particularly
> visible in this case. It does not apply at all in case (c).

3.5 Safe interactions.

Story ends ... "Neither data transmitted .. nor .. response ... corresponds to a resource named with a URI"

This is a bug. We should admit this.
I don't know whether it is a footnote or a bit of text or a new issue. Here is the way I would write it:

> Neither the POST request, which expresses Nadine's commitment, nor
> the response, which expresses the web site's acknowledgment and its
> own commitment, can be referenced by URIs. This is a problem. Even
> though in this case only two parties currently know the content of
> these messages, the messages are an important part of the relationship
> between them. It is a breakdown of the web architecture that they are
> not given automatically a URI by which the parties can refer to them.
> (Compare with mail messages which are given a message Id URI when they
> are committed to.)
>
> Hence, while electronic commerce is done using HTTP POSTS,
> accountability and reference relies on a web site generating a web
> page to represent the transaction after it has occurred, and
> suggesting that the user "print this and keep it". The browser does
> not in general keep a record of the POSTS made, even though an email
> client keeps outgoing mails. There is nothing to which a user can
> point the website operators when tracing a problem. The results of POSTS
> are sometimes treated as web pages which cannot be revisited. Browsers
> do not allow the user to manage the relationship between the form, the
> posting (generally only meaningful in the context of the original
> form) and the response (only meaningful in the context of the given
> posting).
>
> Compare this to the superior accountability in email, when a request
> can be copied to many people including public archives.

4 Data Formats

You note that "language", "data format" and "vocabulary" are used interchangeably. I hope that "vocabulary" isn't. I would say that some data formats are languages, but a vocabulary is different. As far as I understand the way we tend to use these words, here is my bash at explaining it in case it is useful maybe for a glossary some day.
Data format

Constrained syntax for a series of bits, and an accompanying specification of how such series should be interpreted.

Examples: PNG, Plain text, OFX, HTML, RDF, HTTP request, HTTP response

Language

Constrained syntax for a series of (normally) characters (normally encoded as a series of bits), and a specification of what such series mean.

Examples: OFX, RDF, HTTP request, HTTP response.

(I don't see any use in belaboring the difference, mind you, except for connecting onto other people's ideas. Note that things in languages have meaning, when data formats often are just presented to a user, who then determines any meaning in other ways. Also, languages are normally defined in terms of characters, so an encoding step exists between the data format and the language. XML is a data format as it specifies the bits as well as the characters.)

Vocabulary

A set of terms which may be used for specific places in the grammar of a given language.

Examples: FOAF RDF ontology; SOAP HTTP headers.

RDF and HTTP headers define places where the grammar has an open set of terms which can be added to. These sets are vocabularies.

4.1 Just before 4.2 remove "very" ... we are making the point (elsewhere) that URI schemes are more expensive.

4.3 we say, "Two strategies are particularly useful". Add after the 1 and 2 points, "A powerful technique is for the language to allow either form of extension but distinguish explicitly between them in the syntax." This idea seemed to have got lost.

Later on in that section, in the Good Practice box, s/logic/semantics/

4.5 Links

Please retitle "Hypertext links". This section deals with global hypertext as a web application.

Further down, "A link is built from two pieces" is wrong - say "The URI referred to is built from two pieces".

In point 1 just there, add text: "..in which the link appears, and defaults to the URI of the referring resource, and ..." In most cases the base is just the document URI, so not mentioning it seems silly.
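As an illustration of the 4.5 point about the two pieces (a base URI plus a reference, with the base defaulting to the URI of the referring resource), here is a minimal sketch in Python using the standard library's urllib.parse.urljoin. The example URIs and the helper name resolve_reference are hypothetical and do not come from the architecture document.

```python
from urllib.parse import urljoin

# Hypothetical URI of the document in which the reference appears.
REFERRING_DOCUMENT = "http://www.example.com/docs/moby.html"


def resolve_reference(reference, base=None):
    """Build the URI referred to from a base URI and a URI reference.

    If no explicit base is given, the URI of the referring document
    is used as the base (the default described in the text above).
    """
    return urljoin(base or REFERRING_DOCUMENT, reference)


# Relative reference, base defaults to the referring document's URI.
default_base = resolve_reference("chapter2.html#whale")

# Same kind of reference, but with an explicitly supplied base URI.
explicit_base = resolve_reference(
    "chapter2.html", base="http://mirror.example.org/books/"
)

# An absolute reference is unaffected by the base.
absolute = resolve_reference("http://www.example.org/other")

for uri in (default_base, explicit_base, absolute):
    print(uri)
```

The sketch only shows the combination step; it says nothing about how a format declares an explicit base.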
Two paras down, para starts "Section 5". "; this is called resolving a URI reference". Is it really? In many cases, resolving means looking up. I suggest we strike this language, as I don't think (check) that we use "resolve" in this way anywhere else in the paper.

Next non-box para, starting "What agents". This talks about active or passive and never defines either or uses the terms again. Suggest delete the sentence "For instance ..passive". No loss to the document. Also, delete the sentence "On the other hand ... control points" as it seems waffly, gets off the topic of hypertext, and doesn't seem to make sense to me.

Moving on to section 4.7.1

This section needs some work. I think it makes a vague distinction between "shallow" and "deep, sophisticated" mechanisms, which might work for people at parties (avoid both ;-) but doesn't work here. Rewrite section as follows:

> """Many modern data format specifications include mechanisms for
> composition. For example,
>
> - It is possible to embed text comments in some image formats, such as
> JPEG/JFIF. Although these comments are embedded in the containing
> data, they have little or no effect on the content of the image.
>
> - There are container formats such as SOAP which fully expect to be
> composed from multiple namespaces but which provide an overall
> semantic relationship of message envelope and payload.
>
> - RDF allows well-defined mixing of vocabularies, and allows text
> and XML to be used as data type values within a statement with
> clearly defined semantics.
>
> These relationships can be mixed and nested arbitrarily. In principle,
> a SOAP message can contain a JPEG image that contains an RDF comment
> that references a vocabulary of terms for describing the image.
>
> Note however, that for general XML there is no semantic model
> which defines the interactions within XML documents with elements
> and/or attributes from a variety of namespaces.
> How these namespaces
> interact and what effect an element's namespace has on its ancestors,
> siblings, and descendants must be defined application by
> application."""

5. Conformance

Oooo.... we are defining conformance for people! Resource owners, server managers and authors of specifications. Hmmm. I am an author of a spec... am I conformant?

If we could define the conformance of a spec, then this might be useful. But even then it is trying to be rigid about a very subjective thing. I don't see us giving AAA star ratings for specs. But maybe we should look at it. So much of the material in our document is not in a MUST box that really all kinds of specs could be produced and claim conformance.

Discuss. Over dinner.

Tim
Received on Friday, 14 November 2003 18:58:07 UTC