- From: Ian Hickson <ian@hixie.ch>
- Date: Sat, 22 Nov 2008 10:11:19 +0000 (UTC)
- To: "Roy T. Fielding" <fielding@gbiv.com>
- Cc: HTML WG <public-html@w3.org>
On Wed, 19 Nov 2008, Roy T. Fielding wrote:
>
> When did you stop beating your wife, Ian? Stop rephrasing questions that have nothing to do with my comments and just deal with my comments.

I don't understand your comments. I'm trying to work out what you mean. I'm sorry if it comes across as ignoring your comments.

You said that I was ignoring your comments. In trying to find out why you believe I am doing so, I have come to the conclusion that you and I have such dramatically different world views that I have been repeatedly misunderstanding you, and that my replies to your feedback have been phrased such that they do not convey to you the meaning that I intended.

To address this, I have to better understand where you're coming from. Since we are so far from each other in terms of a common understanding, this might take some time. I really want to understand your feedback.

> ... I have no time for this endless stream of pointless interrogatives, none of which address the objection raised by me that led to this thread on this mailing list. If you would just address the issues, we might actually be able to resolve something.

The problem is that I don't understand your objection. I assume you are referring to your request in:

   http://lists.w3.org/Archives/Public/public-html/2007Nov/0430.html

...that we rename the specification because the current specification is specific to browsers. However, the HTML5 specification as it stands today isn't supposed to be specific to browsers. It doesn't require support for the DOM as you suggest, or CSS, or scripting.

> The definition of what is a "current browser" changes every six months. So, in order for me to design my content by testing in a current browser, I would have to redesign my content every six months. Obviously, I don't do that. I design my content using tools that have been developed based on specifications and experience over the past 15 years.
> I have the luxury of being able to fix those tools when they do something that violates the specifications. Most authors just use the tools that are given to them.

Ok, that makes sense... but do you think that people _do_ by and large redesign their content every six months in order to design their content by testing in a current browser? Or do you think that there are people who think that authors by and large do this?

I'm trying to understand the point you originally made in:

   http://lists.w3.org/Archives/Public/public-html/2008Nov/0132.html

It seems like you are suggesting that Jonas was suggesting that content authors redesign their content regularly to ensure it keeps working in browsers, but I don't understand why you would interpret what he said that way. (As I understand it, he was saying that authors by and large check their content in contemporary browsers when writing content, and ensure that their content works in contemporary browsers, not that they redesign their content on a regular basis.)

You also seem to suggest in that e-mail that you think that browser vendors hold the opinion that content is redesigned every six months to match whatever browsers are doing at that point in time. Is this correct? (That is, do you think that browser vendors believe this?)

> > > Written programmatically includes everything rendered by blogging software, content management systems, Google, Yahoo, Facebook, YouTube, Word's "save as HTML", and regurgitated by sites that transclude other sites.

Could you describe where you think the markup written programmatically from such systems comes from? Is it, generally speaking, written by people, written by inference engines based on schemas and constraints, or written in some other manner? That is, is there some significant qualitative difference between content programmatically generated by Facebook today, and content written by hand in the late 90s?
If so, could you elaborate on what this difference is?

> > To clarify, is it therefore the case that you subscribe to the following statements?:
> >
> > 2a. What works in contemporary browsers at a time t1 is not necessarily a subset of what works in browsers at a significantly later time t2.
>
> Of course not.

I assume you mean that you _do_ subscribe to the above statement (i.e. that the "of course not" refers to the "not" within the statement, as opposed to a reply to the question).

> There are plenty of things that worked in one browser four years ago that were removed from (most) browsers because of security concerns. I am hoping for more of the same over time.

Hm, interesting. Could you cite some examples? I'm trying to work out what kind of changes you have seen over the years, to get an idea of what experiences may have formed your opinions of Web browsers.

> > 2b. Authoring tools and templates are written according to the standards and are not tested against browsers contemporary to their creation.
>
> They are written according to working standards and tested against the most popular browsers contemporary to their creation.

Could you elaborate on what you mean by "working standards" in this context? Do you mean that people writing templates and authoring tools look at the W3C HTML4 specification? Do you mean that they have some house style guide that they use? Do you mean that they write content according to what they have learnt through experience in working on the Web? I'm not familiar with the term in this context.

> > That is, do you subscribe to the following?:
> >
> > 3'. Authors when writing Web pages attempt to make their pages look like they want in the browser they use.
>
> What authors are you talking about here? The ones that put words in a readable order, the ones that build tools that translate from paragraphs to HTML documents, or the ones that design templates for paragraphs to be inserted within?
Primarily I am concerned with the people who write the actual tags or scripts, whether that be in a static file, or in the output code of a tool, or whatever.

> Normal authors use the tools that we create.

In your opinion, are the pages that are _not_ created in this manner (i.e. that are created without the use of tools) rare enough as to be considered unimportant? That is, should we focus on making the language appropriate for generator tools, hand-authoring, or both?

> Most of those tools do not, in fact, have a DOM or anything like a browser engine engaged in the authoring of HTML.

What is the significance of this? Could you elaborate on what design decisions you think we should base on this?

> > > > 4. A browser that doesn't implement the APIs, vocabulary, and error handling that major browsers implement could effectively compete in the marketspace.
> > > >
> > > > Could you provide an example of a competitive browser that doesn't implement, as you put it, "all that crap"?
> > >
> > > No. The Web would be better off without that crap, but I have no objection to you putting all that crap into a browser spec if you think all browsers need to implement it.
> >
> > So do you believe the following statement?:
> >
> > 4'. A browser, to effectively compete in the marketspace, has to implement the APIs, vocabulary, and error handling that major browsers implement.
> >
> > That is, do you believe in statement 4, or statement 4'?
>
> I don't believe either statement is relevant. I am not interested in creating a browser that competes in the market, nor are the vast majority of implementations of HTML that would need to conform to a new HTML standard. That is why I said the title is important.

Interesting. Do you think a specification should only cover the requirements common to all conformance classes? Do you believe that browsers should be one of the conformance classes of the HTML specification?
> > > That is in contrast to HTML, the language, which is something that my software does generate and needs to remain compliant with, and thus it does cause a great deal of harm for you to add a bunch of procedural nonsense to the declarative language definition.
> >
> > I don't understand. Could you elaborate on how DOM APIs cause harm?
>
> They make it impossible to understand the language syntax and semantics without plugging in a browsing engine and observing its operation over time. They are not declarative definitions.

Could you provide an example of this from the spec? I don't understand why HTML4 + DOM2 HTML doesn't have this problem but HTML5 (which is basically just a combination of what HTML4 defined and what DOM2 HTML defined, at least in terms of scope) does.

> > > > 5. A specification that defines how to implement a Web browser would remove competition in the browser space.
> > > >
> > > > Reports from browser vendors suggest that a considerable amount of time is spent reverse-engineering other browsers in order to be competitive. HTML5 attempts to reduce this by doing all this work for them, thus reducing the amount of work that it would take to make a competitive browser.
> > > >
> > > > Why do you think that defining these features in detail reduces the ability for new competitors to enter the market?
> > >
> > > Because defining error behavior as the standard makes it very difficult for applications that are error-free to be approved for use within the environments that require adherence to standards (including the stupid ones).
> >
> > I don't understand. Could you provide an example?
>
> See FIPS and WCAG.

I really don't understand. Could you elaborate? (I really have no idea what you mean here.)

> > > > 6. Most people don't want a specification that covers the features that HTML5 covers.
> > > >
> > > > I understand that you might not want it, but what evidence do you have that the majority of the Web standards community doesn't want it?
> > >
> > > Because not a single expert in the Web standards community that I have talked to in the past two years has supported the current work in HTML5. The single most common reaction to the features that you have wedged into HTML5 is abject laughter and disdain for this process.
> >
> > Hm, this is in stark contrast to the feedback I have received (from literally hundreds of people).
> >
> > It is obviously of critical importance to me that HTML5 addresses the needs of the wide Web community. Clearly, we have received different feedback from different parts of this community. I would like to receive feedback from the people to which you have been talking. Would it be possible for you to point me in the right direction to obtain this feedback? Are there mailing lists where it would be appropriate to request constructive feedback from these people?
>
> Why? You don't deal with the constructive feedback received from me. Why should I subject this process to our customers and my friends? They've given up on this process.

Well, assuming that your customers and friends care about the Web, and want the Web to satisfy their needs in the future, and assuming that HTML is going to be an important part of the Web in the future, presumably it is in their interests for us to at least try to ensure we've addressed their needs. I'm sorry you've had a bad experience; many people have had good ones, maybe your friends and customers would be more lucky?

> > Do you have any suggestions for how we could obtain a representative sample of people to determine once and for all what fraction of experts in the Web standards community are in favour of the current direction of HTML5 and what fraction are opposed to it?
>
> Yes.
> Stop treating the ideas that sprang out of the WHATWG as proven by their very existence.

By and large the ideas that sprang out of the WHATWG were based on research and argument; they aren't proven by their existence, but are continually reverified when new information comes to light.

> I don't care how long ping has been under consideration by WHATWG mailing lists, nor do I care how many fanboys have thought in the past that it is worth implementing. It represents a change to HTML (a harmful one at that). Place it on the block and let it fight for itself in terms of implementation. It should be a separate proposal until it has been successfully implemented by two independent implementations. Likewise for all of the other new additions.

This is certainly an interesting way of writing specifications, but it's not how the W3C (or the IETF) has worked so far. Why would we start with HTML5? Would it help if you consider the HTML5 spec as it stands today to _be_ the separate proposal?

FWIW, everything in the spec is "on the block", in that only things that get interoperable implementations will be kept. We've already dropped many proposals and features over the years, e.g. the repetition template and data template features, form prefilling, etc.

Regarding ping="", I looked at the list archives and found three e-mails in which you sent feedback. I have quoted (and attempted to address) the substantial points made in those e-mails below:

From http://lists.w3.org/Archives/Public/public-html/2007Oct/0360.html:
>
> It is not sufficient for accurate user tracking (mandatory in the realm of referral payments)

It is not intended for accurate user tracking. It is intended to enable click tracking for sites that wish to perform simple studies to find which links on their site are the most popular, and for systems like AdSense to allow clicks to be reported without hiding the final destination URL.
It's good enough for both of these (where in both cases the user's privacy far outweighs the need for accurate data).

> [It] would never be implemented consistently in practice

Could you elaborate on this? Why would it not? It seems simple to implement, and the spec is pretty detailed.

> [It] is trivial to defeat

That's by design. The whole point is to protect user privacy for users who desire pings to be disabled.

> [It] is trivial to use for a DoS attack or mass fraud on the referral provider

Surely it's easier to use <img> for a DOS attack than ping="". I don't understand how it would be easier to use for fraud than, say, redirects, which in practice are what is used today.

> [It] is completely redundant to the current features provided by HTTP (cookies and referer) and HTML (any embedded request).

I don't understand how you would implement click tracking with any of those. Could you provide code examples showing how one would translate the following to an existing mechanism other than redirects? Found on, say, example.net:

   <a href="http://example.com/" ping="http://example.org/">...</a>

From http://lists.w3.org/Archives/Public/public-html/2008Feb/0145.html:
>
> I see no actual implementations

Mozilla has an implementation, but it was disabled due to last minute changes to the spec.

> [I see] an overwhelming number of comments that indicate it isn't desirable in HTML.

Volume of comments one way or the other is not a technical argument.

From http://lists.w3.org/Archives/Public/public-html/2007Nov/0101.html:
>
> Not really. The actions generated by a user agent should be consistent with the actions selected by the user. That is why TimBL had an axiom about GET being safe -- clicking on a link (or a spider wandering around) must be translated into a safe network action because to do otherwise would require every user to know the purpose of every resource before the GET.
> It follows, therefore, that the UI for a user action that is safe (a link) must be rendered differently from all other actions that might be unsafe.
>
> In short, if the UI is being presented as a normal link, then the HTTP methods resulting from the user's selection must all be safe (GET/HEAD/OPTIONS). You can argue this one for the next few years if you like, but I'd be shocked if the TAG let anything else progress past the WD stage.
>
> I don't care how many user agents already get it wrong today. They are responsible for their own implementations. We are responsible for the standards by which those implementations will be judged broken and liable for that broken behavior.
>
> The discussion on ping assumes that the ping target is expecting to receive empty-body POST requests (i.e., that the target has not been deliberately supplied to fool an unsuspecting user into triggering a non-safe action when they select the link). But that is an invalid assumption -- the target of the ping could be any URI, including those that do fun things like delete wiki pages or print documents or send mail ... we've been through this all before and not all of them require bodies. That's why HTTP and HTML both have requirements on use of safe methods.

Browsers should show that ping="" will cause a side-effect, that's pretty much the whole point of the attribute. This is in line with what RFC 2616 says to do for unsafe methods -- tell the user. A ping is non-idempotent, too, so we can't use GET. Also, note that sites vulnerable to ping="" cross-site would already be vulnerable to numerous CSRF attacks, so that's not an argument against ping="" using POST.
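For readers following the safe-method argument, here is a minimal Python sketch of the requests a user agent might issue when the ping="" example link from this message is followed. It is an illustration only, not text from the spec: the names `PingLinkParser`, `plan_requests`, and `pings_enabled` are hypothetical, the treatment of ping="" as a space-separated list of URLs is an assumption, and the relative order of the ping and the navigation is arbitrary here.

```python
from html.parser import HTMLParser

# Methods treated as safe in the quoted message (GET/HEAD/OPTIONS).
SAFE_METHODS = {"GET", "HEAD", "OPTIONS"}


class PingLinkParser(HTMLParser):
    """Collects (href, [ping URLs]) pairs from <a> elements."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag != "a":
            return
        attr = dict(attrs)
        href = attr.get("href")
        if href is None:
            return
        # Assumption: ping="" holds a space-separated list of URLs.
        pings = (attr.get("ping") or "").split()
        self.links.append((href, pings))


def plan_requests(markup, pings_enabled=True):
    """Return the (method, url) pairs a hypothetical user agent would send
    if every link in `markup` were followed once."""
    parser = PingLinkParser()
    parser.feed(markup)
    parser.close()
    plan = []
    for href, pings in parser.links:
        if pings_enabled:
            # The ping is an unsafe request with a side effect, hence POST.
            plan.extend(("POST", url) for url in pings)
        # The navigation itself stays a safe GET.
        plan.append(("GET", href))
    return plan


example = '<a href="http://example.com/" ping="http://example.org/">...</a>'
for method, url in plan_requests(example):
    kind = "safe" if method in SAFE_METHODS else "unsafe"
    print(method, url, kind)
```

Calling `plan_requests(example, pings_enabled=False)` models a user who has turned pings off: only the GET for the link target remains, which is the "trivial to defeat" property discussed earlier in the message.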
> I am well aware of how link tracking works and the entire history of the user tracking industry in Web protocols (due to a recent patent case), and you haven't even reached the most minimal requirements that a real site would need for tracking referrals, and would never be capable of proving undercounts [the sole apparent reason for this new feature] because there is no guarantee that the two DNS requests will deliver equally reachable servers for the ping and href, nor that the href request will succeed before the ping succeeds, nor that the href URI corresponds to the ping-per-referral URI.

There have been several groups that have said that ping="" is exactly what they need, including (but not limited to) two groups at Google, which is a pretty major player in click tracking. I'm sure it doesn't address everyone's needs, and if there are changes that can be made to improve the feature so that it covers even more groups, we certainly should consider them, but saying that the feature doesn't match any real site's needs is simply not true.

> A solution to that problem, if one exists, needs to be vetted by people at companies that do referral tracking and payments in real life not as a hacking exercise in cool features, and for that you will need to talk directly to the right people at Google

The proposal originated at Google, motivated by concerns over the current techniques being suboptimal in terms of honouring user preferences (in particular regarding privacy) and suboptimal in terms of providing a good user interface.

> Amazon, Linkshare, and at least a few of the retailers that are aware of all the ways in which tracking can be abused.

We have discussed the feature with a number of groups, but more input is always welcome. Do you have any contacts with the relevant people from these groups?
> Even if such a ping was standardized, it would be years before a sufficient number of deployed browsers were out there to make it work, and during that time the content providers would have to do both redirects and pings to get their numbers.

Certainly it will take a long time for ping="" to be widely-enough deployed to be useful, but it is backwards- and forwards-compatible, so that's not a huge problem. The same argument could be used against most new features in most standards.

This concludes all the substantial feedback from you regarding ping="" that I could find. Did I miss something?

> There are no use cases, reasons, technical arguments, supporting data, or any other form of logical thought that supports the addition of ping if it is viewed with even the slightest understanding of how the Web measurement community (and particularly the referral tracking companies) work. I've already explained that numerous times. That doesn't seem to bother you at all.

As far as I can tell, you _haven't_ explained it. You've asserted it, but you have in fact never listed the requirements you believe aren't met in any detail, at least not as far as I can see. Can you point me to the e-mails in which you provide this detailed feedback?

We have received feedback from companies that do click tracking (including, but not limited to, Google), and indeed their feedback directly influenced the design. The click-tracking requirements of all the companies that have spoken up and presented their use cases and requirements have, as far as I am aware, been met.

> Likewise, this is a discussion about whether HTML should be defined using a declarative language specification or not. I think it must be defined as a declarative language specification because that is what my tools need in order to understand and implement HTML. None of my tools have a DOM. Nor do any of the tools developed by our competitors in the same marketplace.
Hopefully the answers to some of the questions near the top of this e-mail will help me understand this feedback better. As far as I can tell, there is nothing in the HTML spec that requires a tool to have a DOM (unless it supports executing script in pages). The language is, as far as I can tell, declarative. It would be helpful if you could quote the parts of the specification that are problematic.

> We outnumber the browser manufacturers 100 to 1.

And they outnumber you in terms of installed base 1000 to 1. It doesn't matter. It's not a numbers game. IMHO all implementations of HTML on the public Web are important, and the needs of all of these need to be met if we are to improve the Web.

> [with regard to why I should listen to you and not the majority of the working group]
>
> [...] Let them demonstrate by deploying implementations, not opinions.

As Maciej asked: Do browsers count as implementations? Previously you implied that the input of browser vendors should not be given significant weight. Is only the particular kind of software you work on relevant to HTML expertise?

> > > > Why do you think that, for example, search engines, validators, authoring tools, data mining tools, and so forth, would benefit from _not_ handling errors in HTML documents in the same way as browsers do?
> > >
> > > They all handle errors in different ways.
> >
> > But why is this a good thing?
>
> Because they are different contexts.

Why does this matter? We've received feedback from, for example, search engine vendors, saying that they want their parsers to be as close to what browsers do as possible, so that they can't be tricked into seeing content that browsers can't. Similarly, we've had feedback from validator implementors saying that they want to parse documents in a manner compatible with what browsers do, because otherwise they return misleading results (as, e.g. the W3C's HTML4 validator does in some cases).
> The fact that I want my authoring tool to spellcheck my content does not imply that I want all browsers to display squiggles under every word not found in their own dictionaries.

Does the spec require the same behavior for this case?

> The fact that I want my browser to check for errors in content-type charset encoding and display the text anyway does not imply that I want my XSLT tool to do the same.

Surely ignoring a Content-Type error for an XML document would be a violation of HTTP and XML rules. But ok, consider XSLT: If XSLT worked on text/html HTML documents, wouldn't you want all XSLT implementations to give the same output, given the same document and transformation sheet?

> The HTML5 spec can't be understood without an implemented DOM.

Hm. Interesting. This certainly isn't the intention. Could you show examples of what you mean? I should fix that.

> > > Error handling is entirely dependent on context.
> >
> > So would it be correct to say then that you believe that:
> >
> > 7'. Search engines and data mining tools should use different error handling rules than browsers.
>
> No, but again that isn't relevant to the vast majority of implementations.

If the vast majority of implementations that do parsing aren't browsers, data mining tools, or search engines, what are they? Is the list of conformance classes in HTML5 incomplete? What tools are we missing? If we are missing entire classes of tools, then certainly that's a big omission and we should fix it! Could you elaborate on this topic?

> > Obviously, there are parts of the spec that are specific to browsers, just like there are parts specific to data mining tools, search engines, validators, authors, etc. Is that all you are referring to?
>
> Yes. The parts of the spec that are only relevant to specific implementations belong in specifications about those implementations, not in the specification of HTML.

That wouldn't leave anything.
I don't think there is anything in the entire spec that applies to every conformance class.

Do you think HTML4 should be renamed also? It, after all, has many parts that are specific to different conformance classes. Similarly, should SVG be renamed? Should HTTP be renamed? As far as I can tell, all Web technology specifications have language specific to different conformance classes.

> Finally, the parts of the spec that have nothing to do with HTML, such as SQL storage for web applications, should be kicked out.

I agree that there are parts that should be removed. I expect to take the storage section out before last call.

-- 
Ian Hickson               U+1047E                )\._.,--....,'``.    fL
http://ln.hixie.ch/       U+263A                /,   _.. \   _\  ;`._ ,.
Things that are impossible just take longer.   `._.-(,_..'--(,_..'`-.;.'
Received on Saturday, 22 November 2008 10:11:59 UTC