Re: Use cases from James Clark on 2011-01-05 (public-html-xml@w3.org from January 2011)

From: James Clark <jjc@jclark.com>
Date: Wed, 5 Jan 2011 08:04:29 +0700
To: public-html-xml@w3.org
Message-ID: <AANLkTi==B2Msjo1xgsxxY9KxXXccvOviJuA9b6LEtVJh@mail.gmail.com>

1. One use case that is important to me is using a schema-driven XML
authoring tool to create HTML5 in the HTML syntax.  In particular, I would
like to be able to use a tool like nxml-mode, which is schema-aware but
exposes the syntax of the document, rather than working purely on a tree.

I want to do this so that I can at the same time:

- use a customized schema to constrain documents to a subset of HTML5; this
is not just a matter of using only a subset of HTML5 elements and
attributes, but also things like constraining the values of "class"
attributes and constraining structure (for example, enforcing the use of
<section> elements to indicate document structure explicitly)

- create a document that is well-formed XML (so that XML tools can be used
on it)

- create a document that is a valid HTML5 document in the HTML syntax, or a
fragment of such a document (so I can, e.g., view it directly in a browser, or
combine it with other HTML fragments using non-XML-aware tools)

Although I can already do this, the experience is not as good as it could
be, because my document must follow a whole bunch of detailed restrictions
that are not enforced by the authoring tool, because they are not
expressible in a schema and not required by XML well-formedness.

This is one of the scenarios that motivates my interest in an XML subset.  The
idea would be that the authoring tool would add support for this subset.
For example, perhaps there would be an annotation in the schema that the
tool would use to tell it to use this subset for a document.  Ideally use of
the subset together with conformance to the schema would be enough to
guarantee HTML5 validity.  This ideal is unlikely to be 100% achievable, but
it's not a binary thing: the closer one can get to this ideal, the better.
Also, ideally, the XML subset should be useful to people who are looking for
something like XML, but radically simpler.

2. Other use cases don't need to involve tools at all.  I am not sure if
everybody would call these use-cases, but they are scenarios that I think
are relevant to the design.

(a) A non-expert user writes HTML5 in Notepad.  He has some modest
familiarity with HTML4 and with XML.  He has heard that HTML5 supports the
XML empty element syntax.  So he decides to write

  <script src="foo.js"/>

and is rather surprised to find it doesn't work.

(b) A scripting language user is writing a little script to generate some
HTML5 from a tree of elements.  He does the simplest thing that (he thinks)
could possibly work and generates a start-tag and end-tag for every element
(he's not too sure about the void elements). He tests this lightly by
loading it in his browser, and is delighted to find that <img> works just
fine with an end-tag.  He puts it into production and is puzzled to find
strange vertical spacing issues around <br> elements, which he fixes with
some CSS.
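
Scenario (b) can be made concrete with a small sketch of a serializer that
treats void elements specially. The underlying cause of the <br> problem is
that an HTML parser treats a stray </br> as another <br> start-tag, whereas a
stray </img> is simply ignored. The node shape (tag, attrs, children), the
function name serialize, and the use of Python are assumptions made for
illustration; the void-element list follows the current HTML standard, which
differs slightly from the 2011 draft.

```python
from html import escape

# Void elements as listed in the current HTML standard (an assumption here;
# the 2011 draft list was slightly different).
VOID_ELEMENTS = frozenset({
    "area", "base", "br", "col", "embed", "hr", "img",
    "input", "link", "meta", "source", "track", "wbr",
})


def serialize(node):
    """Serialize a text string or a (tag, attrs, children) tuple to HTML syntax.

    Hypothetical helper: it does not special-case the text content of
    raw-text elements such as <script> or <style>.
    """
    if isinstance(node, str):
        # Text node: escape &, < and >.
        return escape(node, quote=False)

    tag, attrs, children = node
    attr_text = "".join(
        f' {name}="{escape(value)}"' for name, value in attrs.items()
    )

    if tag in VOID_ELEMENTS:
        # Void elements get a start-tag only; an end-tag is never emitted.
        if children:
            raise ValueError(f"void element <{tag}> cannot have children")
        return f"<{tag}{attr_text}>"

    # Every other element gets an explicit start-tag and end-tag,
    # even when it is empty (so <script src="foo.js"></script>).
    inner = "".join(serialize(child) for child in children)
    return f"<{tag}{attr_text}>{inner}</{tag}>"
```

For example, serialize(("script", {"src": "foo.js"}, [])) produces a
start-tag and end-tag pair, which also avoids the problem in scenario (a).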

3. A user has a tool that generates SVG or MathML. Both SVG and MathML have
the ability to embed foreign XML from arbitrary XML vocabularies. The tool
takes advantage of this (with or without the user's knowledge).  The user
then copy-and-pastes the generated SVG or MathML into an HTML5 document
(perhaps performing some rudimentary fixup of the start of the document, like
removing the XML declaration and the DOCTYPE).  They want the resulting
composite document to be valid HTML5.
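
The "rudimentary fixup" in scenario 3 might look something like the sketch
below. The function name strip_xml_prolog and the regular expressions are
assumptions made for illustration; they do not handle comments or processing
instructions before the DOCTYPE, or unusual internal subsets. The sketch only
removes the prolog and does nothing about the embedded foreign XML, which is
the hard part of this scenario.

```python
import re

# XML declaration at the very start of the text (optionally after a BOM).
_XML_DECL = re.compile(r"\A\ufeff?<\?xml\s.*?\?>\s*", re.DOTALL)

# DOCTYPE declaration, with an optional internal subset in [...].
_DOCTYPE = re.compile(r"\A<!DOCTYPE\s[^>\[]*(\[.*?\])?\s*>\s*", re.DOTALL)


def strip_xml_prolog(text):
    """Remove a leading XML declaration and DOCTYPE from generated SVG or MathML."""
    text = _XML_DECL.sub("", text, count=1)
    text = _DOCTYPE.sub("", text, count=1)
    return text
```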

James
Received on Wednesday, 5 January 2011 01:05:51 UTC