- From: <noah_mendelsohn@us.ibm.com>
- Date: Mon, 20 Oct 2008 13:05:44 -0400
- To: Ian Hickson <ian@hixie.ch>
- Cc: public-html-comments@w3.org
Ian: at breakfast this morning we discussed some ideas regarding the creation of an authoring specification for HTML 5, and you encouraged me to remind you of some of my suggestions with a note to the comments list. Here 'tis.

My overall comment is that I think the specification for the authoring of correct HTML 5 documents is of great importance. I understand that you are hoping that the need can be met in part by, eventually, using scripts to produce a stripped down version of the existing draft, leaving out much of the parsing and error recovery detail. Perhaps this will lead to a first class result, but I have some nervousness that the result might not be as effective as one might like. Accordingly, I'd like to suggest that the following be considered:

* Simple success criteria should be explicitly set down for the authoring specification. What information must it convey, to which audiences should it be comprehensible, etc.? I'm not suggesting any big, elaborate, or time-consuming requirements effort, just a very brief set of criteria that could be agreed by the community as a yardstick for judging any particular draft. Perhaps this already exists.

* I think it would be a good idea to generate representative drafts sooner rather than later. If practical, this could be done by marking up the existing draft and running the full automated process. If that's impractical soon, as I suspect may be the case, I would think that one or two members of the HTML working group could be tasked with manually producing a partial skeleton for evaluation, including at least some of the key sections such as 8.1, and representative slices of some of the others. For example, if one or two microsyntaxes and the definitions of a few representative elements were converted, it would probably give a very good idea as to whether the presentation of all of them would eventually be effective.
  I think the resulting draft should be circulated for comment, and should be used to inform planning for how the final HTML 5 authoring draft will eventually be prepared.

* I think there are good reasons why most of the semantics of HTML 5 are explained in terms of the DOM, but it's worth keeping in mind that for authors (except when scripting), it's the serialized document that's of primary concern. So, it's worth explaining clearly and early the key invariants of what a legal HTML 5 document looks like. For example: "Start tags look like <this>, end tags look like </this>; elements are properly nested and thus encode a tree, which by the way is isomorphic to the corresponding DOM tree; etc." Determining things like this from the existing specification is a bit of a theorem proving exercise: you have to notice that the DOM is always a tree, even though browsers accept input that's poorly nested; you have to notice that there are serialization rules that invariably result in properly nested tags; and you have to realize that those in turn define what is intended as legal HTML 5. There's a risk that, if all one does is to strip the existing spec to produce the authoring spec, these key aspects of correct HTML 5 will be unduly hard to discover.

* I think the authoring specification is important enough that attention should be given to introductory material, organization of the table of contents, etc. Perhaps this comment is obvious, in which case I apologize for mentioning it. Right now, I understand that the most critical section for authors is section 8.1, so it's not immediately obvious that a simple stripping of the existing draft will result in a document that flows in sensible order, with key concepts suitably highlighted. For example, I could imagine introductory material setting out some of the information mentioned in the bullet above.
  You could also add any general syntactic rules, such as whether tags need to be explicitly closed or can be implicitly closed by the end tag for a parent, and, if it's not obvious from the table of contents, provide simple guidance as to which sections are good starting points for learning key concepts.

I hope these suggestions are helpful. I should point out that they represent my personal suggestions, and not necessarily those of other W3C TAG members. Thank you.

Noah

--------------------------------------
Noah Mendelsohn
IBM Corporation
One Rogers Street
Cambridge, MA 02142
1-617-693-4036
--------------------------------------
Received on Monday, 20 October 2008 17:06:27 UTC