- From: Michael(tm) Smith <mike@w3.org>
- Date: Thu, 4 Sep 2008 22:40:37 +0900
- To: Jirka Kosek <jirka@kosek.cz>, Julian Reschke <julian.reschke@gmx.de>
- Cc: public-html@w3.org, Henri Sivonen <hsivonen@iki.fi>
I noticed that so far there's not been any specific response to the following part of one of Henri's messages in this thread -

Henri Sivonen <hsivonen@iki.fi>, 2008-08-28 13:08 +0300:

> On Jul 5, 2008, at 00:44, Jirka Kosek wrote:
>
> > Of course there is second issue on which you really elaborate in your
> > email and this is how to extend some *future version* of XSLT language
> > and its implementation to support all bits of HTML5. I almost agree with
> > your analysis on this issue.
>
> The issues can be fixed without changing the XSLT language. I released
> version 1.1.0 of the Validator.nu HTML Parser the other day. The package
> comes with a sample program that uses an unmodified XSLT engine (whatever
> you have set as the TrAX default) with an HTML5 parser and an HTML5
> serializer. There's running code for addressing the issues *today*.
> http://about.validator.nu/htmlparser/

I suspect that if you object to it, the response is likely going to be that, as great as having something like that is, using that or something similar instead of just using a stock/off-the-shelf XSLT engine is something that creates an additional burden or hurdle for developers.

If so, I guess what I would wonder is how you would weigh those developer concerns/costs against those of casual authors. What I mean is, if we restrict the spec to only allowing <!doctype html> as a conformant HTML5 doctype, then we have something very simple for new authors to learn and for HTML/Web-authoring teachers to teach: You must include a <!DOCTYPE HTML> string at the beginning of your HTML documents, and it must look just like that -- with just the word "doctype" followed by the word "html".
And the teacher -- if he or she wants to try to rationalize it for the students without needing to go on at all about the whole quirks-mode FUBAR mess that forces us to require the doctype at all -- might make a reasonable case that the doctype actually has some small amount of meaning ("it's just a way of asserting that the document is meant to be conformant HTML", or whatever).

On the other hand, if we want to make things easier for those developers who are using XSLT (or some XSLT-related thing like what Julian has described) as part of their document-generating toolchains, BUT -- because of limitations in their development environments or just because of their own choice -- are limited to only using output from stock XSLT engines without any post-processing...

...then to make things easier for them, we allow a doctype in some form that includes the meaningless-in-this-context-but-required word "PUBLIC" after the word "HTML", followed by some other meaningless-but-required quotation marks, or with those quotation marks and some other string inside them that should in this context be even more completely meaningless, by design.

So now all those new authors would have to learn -- and their teachers would have to teach -- that, well, things are a bit more complicated than just <!DOCTYPE html> because, for certain cases that they really are not likely to have any good understanding of at the time they first learn it, they need to know that the doctype can optionally also be in the form <!DOCTYPE HTML PUBLIC "FUBAR"> (or whatever). And they also need to know that they should never actually use a doctype in that form if they are using the normal kinds of authoring tools that they're likely to be learning with...
This seems like a case where we really should be carefully considering our "Priority of Constituencies" design principle ("costs or difficulties to the user should be given more weight than costs to authors; which in turn should be given more weight than costs to implementors..."), and really looking carefully at who we want to put the costs on in this case.

What we have, should we make <!doctype html> the only conformant doctype, is this: if developers use stock XSLT engines to generate their output and if they try to validate that output using an HTML5 conformance checker, they are going to get one message, one time, telling them that the doctype is not conformant -- or to put it into language that might more clearly mean something to them -- that the document is using something that's no longer conformant because it's been "deprecated".

Do we really want to build a special exception into the spec just to prevent that special set of developers from seeing that message (which is effectively just a warning message)?

Thinking in particular about the case of Java developers and speaking anecdotally from my own experience: Every time I upgrade my JDK and try to compile some existing Java code I have, I seem to get gobs of new messages from the compiler that I'd never seen before, warning me about deprecated stuff. I've learned to just ignore those and to wait to deal with them if/when in some future JDK upgrade they actually cause compile errors instead of just warnings. I would suspect that most real Java developers are a lot more accustomed to seeing those than I am, and would think that unless they have lots of extra time to spend, they're not actually going back to rewrite all their works-just-fine-as-is-despite-all-the-warnings code just to cause those warning messages to be suppressed.
To get back to HTML5 and conformance checking, I would hope that we are not working toward a goal where the point of an HTML conformance check is that the author gets a pat-on-the-back sense that their HTML documents are perfect.

What I mean is, if authors/developers who are generating output from stock XSLT engines -- who would seem to me to be fairly savvy about knowing what kinds of error/warning messages they can safely ignore -- run a conformance check on their documents and see a message saying that the doctype is not conformant, I would really wonder what degree of frustration or trouble that's really going to cause them and what if anything we should do to try to avoid it.

An HTML5 conformance checker will not be handing out gold stars to anybody anyway -- no "Valid HTML5" badges for anybody to beat others over the head with. So when it comes to warning those authors/developers about a non-conforming doctype that they are probably already aware is not conforming, I just really wonder how far we should go in adjusting the spec to prevent those developers from seeing that warning -- given that if they are doing XSLT development, many or most of them would know enough to realize they can just ignore it, and maybe what we'd really be left with is trying to please the obsessive/compulsive/perfectionist ones -- the ones that just can't tolerate running a compile/lint/conformance check and seeing any warnings at all.

God help us if we make it a goal to try to please that class of developers.

--Mike

-- 
Michael(tm) Smith
http://people.w3.org/mike/
Received on Thursday, 4 September 2008 13:41:16 UTC