- From: Sam Ruby <rubys@us.ibm.com>
- Date: Fri, 03 Aug 2007 12:59:51 -0400
- To: Henri Sivonen <hsivonen@iki.fi>
- CC: public-html@w3.org
Henri Sivonen wrote:
> On Aug 2, 2007, at 18:16, Sam Ruby wrote:
>
>> Since the workgroup demands use cases for any proposed new feature, I
>> will provide one up front: this feature’s use case is to enable
>> features without use cases.
> ...
>> FBML isn’t intended to be directly processed by browsers, but that
>> shouldn’t preclude it from being processed by other HTML5 tools,
>> everything from sanitizers to conformance checkers to pretty printers,
>> to search engines.
>
> Is it the assumption that HTML5 so extended would be served on the
> public network in ways that would routinely expose the extension markup
> to browsers? If the extensions are intended to be processed by
> non-browser tools in the context of a walled garden such as Facebook,
> wouldn't XHTML5 plus namespaced extensions work?

Perhaps, for the six of us or so that seem capable of consistently
producing well formed XML. But what about Francis here:

https://secure.mysociety.org/cvstrac/fileview?f=mysociety/pb/phplib/pbfacebook.php&v=1.1

Note: I don't care for the use of pejorative terms like walled gardens
here. I will readily concede that the term is accurate, but it is a
distraction.

People should be "freely extensible by anybody" (I lifted those words
straight from Atom's Roadmap). While we should encourage extensions, we
should recognize that users will screw things up.

While I can't conceive of anything that would validate the PHP script I
referenced above, I can imagine conformance checkers that validated the
output. Conformance checkers that not only validate HTML structure, but
also validate that a/@href attributes are URIs (despite being defined in
a separate document) and that fb:default elements have fb:switch
elements as their parents (again, despite being defined in a separate
document).

>> XML permits an alternate syntax, namely default namespaces. In
>> certain circles, such a syntax is very popular.
>> Regrettably, allowing such a syntax would pose problems for back
>> level user agents, and therefore must be disallowed in the HTML5
>> “custom format”.
>
> However, such an approach might well work for bringing specific
> well-known XML vocabularies with distinct subtrees to the text/html
> serialization, specifically SVG and MathML with namespace mapping scope
> established by <svg> and <math> as subtree roots. When it comes to
> extending text/html to be an alternative infoset serialization for a
> broader range of possible infosets, I'd prefer to optimize for enabling
> those two well-known namespaces instead of optimizing for private
> extensions. (Not to suggest that the two goals were mutually
> exclusive--just suggesting that well-known vocabularies are preferable
> over private vocabularies in a non-walled garden.)

We should design for everybody, and verify for those that we care about.
As somebody who authors SVG in VIM, I'm comfortable with what I
proposed. I will also note that these requirements (at least for SVG)
are consistent with the following profile:

http://www.w3.org/TR/2002/WD-XHTMLplusMathMLplusSVG-20020809/

As to MathML, we still need to decide whether or not to grandfather in
that vocabulary.

>> Messy details
>> -------------
>>
>> I don’t pretend that these are exhaustive, but they should seed an
>> interesting set of discussions:
>
> * Should the tokenizer do ASCII case folding when scanning a name until
> it hits a colon (effectively making prefixes ASCII-case-insensitive)? Or
> should each name be scanned without case folding and case-folded
> conditionally later?

Good catch. I'm going to add it to my original source page on my weblog.

>> * You might think that this proposal wouldn’t change how text
>> nodes or comments were processed, but there is one case that merits
>> consideration. The default processing by existing user agents is to
>> render text nodes even when they are enclosed in unknown markup.
>> In some cases, this may not be desirable. The XML CDATA[] syntax is
>> treated as a comment by HTML parsers, so this may be used to “cloak”
>> such text regions. For this to work, however, HTML5 compliant parsers
>> would have to treat such constructs as text, but only when enclosed by
>> an extension element. Again, a more complicated parse state machine
>> is necessary in order to preserve backwards compatibility and
>> extensibility.
>
> I think I don't understand what behavior you are suggesting here.

Time for a concrete example:

http://intertwingly.net/svg/410.svg

If I don't use CDATA, the "410" shows up in IE. If I do use CDATA,
nothing shows up in IE. We need a way to say "if you can't handle this
extension, don't show anything (or perhaps, show a fallback defined
separately)".

Now, if we allow this use of CDATA, we need it to actually show up as a
text node in the DOM as opposed to a comment node when used in precisely
this circumstance.

>> Implications
>> ------------
>
> * As an unintended consequence an extension mechanism like this could
> make it possible to declare the HTML namespace with a prefix and hack
> different core element treatment in legacy UAs and extension
> mechanism-aware UAs. (This can be a good thing or a bad thing.)

Let's simply declare that to be an error and be done with it. I'm also
going to update my page with this one.

- Sam Ruby
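
Illustration (not part of the original message): the conformance check
described above, that fb:default elements have fb:switch elements as
their parents, could be sketched roughly as follows. This is only a
minimal sketch under stated assumptions: it uses Python's standard
html.parser as a stand-in for an HTML5 parser, and the names
ParentRuleChecker, check, RULES and VOID are hypothetical, invented for
this example. It is not taken from any real FBML tool.

```python
from html.parser import HTMLParser

# Hypothetical rule table: child element name -> required parent name.
# The single entry mirrors the fb:default / fb:switch example.
RULES = {"fb:default": "fb:switch"}

# Partial list of HTML void elements (assumption: enough for a sketch).
VOID = {"br", "hr", "img", "input", "link", "meta"}


class ParentRuleChecker(HTMLParser):
    """Collect errors for elements whose parent violates a rule."""

    def __init__(self, rules):
        super().__init__()
        self.rules = rules
        self.open_elements = []  # stack of currently open tag names
        self.errors = []

    def handle_starttag(self, tag, attrs):
        expected = self.rules.get(tag)
        actual = self.open_elements[-1] if self.open_elements else None
        if expected is not None and actual != expected:
            where = actual or "document root"
            self.errors.append(
                f"<{tag}> must be a child of <{expected}>, found in {where}"
            )
        if tag not in VOID:
            self.open_elements.append(tag)

    def handle_endtag(self, tag):
        # Pop back to the matching open tag, tolerating unclosed children.
        if tag in self.open_elements:
            while self.open_elements.pop() != tag:
                pass


def check(markup, rules=RULES):
    """Return a list of parent-rule violations found in markup."""
    checker = ParentRuleChecker(rules)
    checker.feed(markup)
    checker.close()
    return checker.errors


if __name__ == "__main__":
    print(check("<fb:switch><fb:default>x</fb:default></fb:switch>"))
    print(check("<div><fb:default>x</fb:default></div>"))
```

The rule table is kept separate from the parser on purpose: the
vocabulary's constraints come from a separate document, so the checker
only needs to be handed a different table to cover another extension.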
Received on Friday, 3 August 2007 17:00:06 UTC