- From: Michael(tm) Smith <mike@w3.org>
- Date: Thu, 20 Nov 2008 17:21:26 +0900
- To: Henri Sivonen <hsivonen@iki.fi>
- Cc: Lachlan Hunt <lachlan.hunt@lachy.id.au>, public-html <public-html@w3.org>
Henri Sivonen <hsivonen@iki.fi>, 2008-11-19 21:09 +0200:

> The syntax is RELAX NG Compact Syntax. The syntax for the regular
> expressions appearing in the document is the XSD regular expression syntax.
> (Instead of pulling the regexps from schema comment, I think it would be
> nicer to pull the same descriptions Validator.nu uses as UI strings:
> http://wiki.whatwg.org/wiki/MicrosyntaxDescriptions )

Yeah, I agree, and I'll update the build for the document to scrape
that page and pull those descriptions in instead.

> I guess the methodology behind the document isn't clear to everyone on the
> list. The document is not manually written.

On last week's HTML WG telcon[1], I discussed a bit about how the
document is put together, though the record in the minutes is not
very detailed: http://www.w3.org/html/wg/markup-spec/schema.html

I will be adding an Acknowledgments section to give credits and
copyright statements for the sources (the whattf.org schema, the
existing HTML5 draft, the default user-agent stylesheet from WebKit)
of the parts of the spec that are generated in the output as part of
the build -- possibly along with a short Colophon that describes how
the generated parts of the document are built.

For now, here are a few more details.

Some parts of the spec are manually written, though the per-element
"Content model", "Attribute", and "Assertions" subsections are
generated, as is everything else from section 5 "Common Content
Models" on.

The parts that I'm manually maintaining now are the Syntax section
(based largely on initial text from the existing HTML5 draft, and
reorganized) and the prose descriptions of the elements and
attributes. In most cases, the current prose descriptions for the
elements are still primarily verbatim text initially pulled from the
HTML5 draft, though I think I may have reworded some slightly.
Some of the attribute descriptions I have already rewritten a bit (or
maybe more than a bit) from descriptions initially pulled in from the
HTML5 draft. I think so far I've done that only for some of the "A"
ones -- e.g., <a>, <area>, <audio> -- and <base>. Mostly that
rewriting has amounted to attempting to make those descriptions more
succinct (where it seemed like they could be) and rephrasing them to
fit the context of this document.

The per-element Examples subsections are all currently pulled in by
the build verbatim from the HTML5 draft. But I may change or remove
some later.

> It has been generated from various sources using XSLT.

...and a specially modified/hacked version of Trang, and some Perl
hacks, and maybe some other things I'm forgetting about. For those
that are interested, the Makefile that does the build is here:

  http://www.w3.org/html/wg/markup-spec/Makefile

...and the main XSLT driver stylesheet is here:

  http://www.w3.org/html/wg/markup-spec/tools/generate-spec-source.xsl

> The document has some original text, but a lot
> of content in pulled in and mashed up from the HTML 5 spec proper, the
> whattf.org HTML5+ARIA schema used by html5.validator.nu

...which, for the record, is here:

  http://svn.versiondude.net/whattf/syntax/trunk/relaxng/

The nature of the build is such that whenever that schema changes and
I re-build, the per-element "Content model", "Attribute", and
"Assertions" subsections, etc., will get regenerated and will reflect
any changes made to the schema. The basic intent is for the
specification to be automatically consistent with the same
conformance rules that are checked by validator.nu.

I fully recognize the potential issues of tying the spec to a
particular schema and too closely to a particular conformance-checking
tool.
It could be that the draft might eventually use a different schema
instead of the whattf.org schema, or I may dispense entirely with the
idea of trying to use a schema to auto-generate those parts of the
draft, and use manually maintained prose descriptions instead. But
for now, I think it's kind of useful to experiment at least with
keeping it closely in sync with the one HTML5 conformance checker
that we do have.

> and from the UA style sheet of WebKit.

...the source for which is here:

  http://svn.webkit.org/repository/webkit/trunk/WebCore/css/html4.css

The build actually takes that and converts it to an XML
representation (yeah, go ahead and say ugh) and then chops it up
per-element and adds syntax highlighting to it to produce what's
actually shown in the draft.

> I think the document is very cool as documentation of the whattf.org schema
> and works as a reference for people who are comfortable with reading RELAX
> NG. (I link to it from the Validator.nu documentation.) However, I don't
> support putting it forward as a normative spec.

As far as providing a reference for people who are comfortable with
reading RELAX NG, there's also a hyperlinked HTML representation of
the whattf.org schema here:

  http://www.w3.org/html/wg/markup-spec/schema.html

That's auto-generated by the build from the schema sources, so if the
schema sources change, it will get automatically updated.

> > with some weird anomalies with the way attributes are seemingly included
> > within the element's content model.
>
> That's a pretty cool feature in RELAX NG, actually.

Along with the simplicity of the RELAX NG compact syntax, I think it
makes for relatively readable content models (though I recognize
they're less friendly to casual readers than prose descriptions of
the content models are).

  --Mike

-- 
Michael(tm) Smith
http://people.w3.org/mike/
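
The per-element chopping of the UA stylesheet described above can be illustrated with a rough sketch. This is not the actual build, which per the message converts the stylesheet to XML and processes it from there; Python and regular expressions are stand-ins here, the function name `split_rules_by_element` is hypothetical, and the sample rules are made up in the style of a UA stylesheet rather than copied from html4.css. It assumes flat rules only (no nested at-rule blocks) and files each rule under the leading element name of each selector.

```python
import re
from collections import defaultdict

# Hypothetical helper for illustration; not part of the actual build.
COMMENT_RE = re.compile(r"/\*.*?\*/", re.DOTALL)
RULE_RE = re.compile(r"([^{}]+)\{([^{}]*)\}")       # flat "selectors { decls }" only
TYPE_SELECTOR_RE = re.compile(r"[A-Za-z][A-Za-z0-9]*")


def split_rules_by_element(css_text):
    """Group CSS rules by the leading element name of each selector."""
    rules_by_element = defaultdict(list)
    css_text = COMMENT_RE.sub("", css_text)
    for match in RULE_RE.finditer(css_text):
        selectors = [s.strip() for s in match.group(1).split(",")]
        # Collapse internal whitespace so each rule prints on one line.
        declarations = " ".join(match.group(2).split())
        for selector in selectors:
            name = TYPE_SELECTOR_RE.match(selector)
            if name is None:
                continue  # selector does not start with an element name
            rule = f"{selector} {{ {declarations} }}"
            rules_by_element[name.group(0).lower()].append(rule)
    return dict(rules_by_element)


# Made-up sample input, not the real stylesheet contents.
sample = """
/* sample rules in the style of a UA stylesheet */
html, body { display: block }
p { display: block; margin: 1em 0 }
a:-webkit-any-link { color: -webkit-link }
"""
rules = split_rules_by_element(sample)
for element in sorted(rules):
    print(element, "->", rules[element])
```

A rule with several comma-separated selectors is repeated under each element it names, so each element's section can show its rules in isolation.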
Received on Thursday, 20 November 2008 08:22:10 UTC