- From: Noah Mendelsohn <noah@arcanedomain.com>
- Date: Tue, 11 Oct 2011 22:18:44 -0400
- To: Norman Walsh <ndw@nwalsh.com>
- CC: public-html-xml@w3.org
- Message-ID: <4E94F904.3000607@arcanedomain.com>
On 9/27/2011 10:11 AM, Norman Walsh wrote:
> Noah,
>
> A few weeks ago, you set out to draft some new introductory text for
> our report. I realize that I've also redrafted that section slightly.
> Are you satisfied with the result, or are you still working on a
> proposal for additional changes?
>
> Be seeing you,
> norm
OK, here's a cut at it:
-----------------------------------
HTML and XML share a common ancestor in SGML. The precise details of that
ancestry are not strictly important; its significant consequence is that
HTML and XML have a quite similar surface syntax. Both use angle brackets
and ampersands to distinguish "markup" characters from "content"
characters. Both have elements which contain other content and elements
which are empty.
This high level of surface similarity suggests, at least to some and at
least at first, that there should be a high level of interoperability
between HTML and XML systems. This notion is amplified by the fact that
when XML arrived on the scene, well after HTML was widely deployed, efforts
were made to recast HTML as an XML application rather than an SGML
application. HTML was never broadly implemented as an "SGML application",
but it was defined as one in the early HTML specifications.
However, if you look beyond those high-level generalities, the languages
are quite different and serve quite different purposes. Where HTML is a
single language, XML is a framework for defining languages. Where HTML
defines how a tree is constructed from any input, XML only defines tree
construction for a small subset of all possible inputs. Where HTML defines
explicit extension points within a single vocabulary, XML encourages the
use of multiple vocabularies defined in a distributed fashion. Where HTML
is in a small, explicit set of namespaces, XML provides for an unbounded
number of namespaces.
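To make the tree-construction contrast concrete, here is a minimal sketch in Python using only the standard library. It is an illustration under stated assumptions, not part of the report: the helper names (TagCollector, html_start_tags, xml_is_well_formed) are hypothetical, and Python's html.parser is a lenient tokenizer rather than a full HTML5 tree builder, so it only approximates the "any input is accepted" behaviour described above.

```python
from html.parser import HTMLParser
import xml.etree.ElementTree as ET


class TagCollector(HTMLParser):
    """Hypothetical helper: records start tags seen in HTML-like input."""

    def __init__(self):
        super().__init__()
        self.start_tags = []

    def handle_starttag(self, tag, attrs):
        self.start_tags.append(tag)


def html_start_tags(text):
    """Scan text leniently; malformed markup does not raise an error."""
    collector = TagCollector()
    collector.feed(text)
    collector.close()
    return collector.start_tags


def xml_is_well_formed(text):
    """Return True only if text parses as a well-formed XML document."""
    try:
        ET.fromstring(text)
    except ET.ParseError:
        return False
    return True


malformed = "<p>one<b>two</p>"        # <b> is never closed
well_formed = "<p>one<b>two</b></p>"

print(html_start_tags(malformed))      # lenient: tags are still reported
print(xml_is_well_formed(malformed))   # strict: rejected
print(xml_is_well_formed(well_formed)) # strict: accepted
```

The lenient scanner still reports both start tags for the mis-nested input, while the XML parser rejects it outright and accepts only the well-formed variant.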
Nonetheless, there are a number of potential benefits that might result if
XML and HTML were more compatible and interoperable. These include:
* XML tools, including database and content-management systems, as well
as the export/import capabilities provided in many programs such as
spreadsheets, might be directly usable with HTML.
* The same XML markup, e.g. for content management or for vector graphics
(SVG), might be usable in HTML as well as other XML container
documents. Such shared markup might be supported by common code and
tooling, and copy/paste scenarios might be supported.
* HTML fragments might more easily be copied for use in XML container
documents. Syntax rules learned for use in one context would work in
the other.
* Some overlap might be eliminated from specifications, e.g. rules for
embedding SVG into XML and HTML container documents might be specified
just once, rather than in duplicate and slightly differently.
* Etc., etc.
Against the backdrop of this tension, the TAG formed this Task Force in
order to explore how interoperability between HTML and XML could be
improved. The Task Force began by collecting use cases to focus its
efforts. The original expectation was that a set of the use cases would
highlight those areas where changes to XML and/or HTML specifications could
usefully aid interoperability between XML and HTML. However, the Task Force
could not identify any such changes that would provide practical benefit
and that would likely be widely deployed in practice. All of the use cases
do appear to have at least plausible solutions using XML as deployed today
and HTML5 as planned, and those solutions do not appear amenable to
significant improvement. So, it appears that there is little that can
usefully be done now beyond documenting these circumstances.
In the following section, we'll describe a set of use cases that the Task
Force considered, and how the needs of those use cases can be met today.
Readers are particularly encouraged to report additional use cases that
they feel are not represented or specific examples where the solutions
outlined are not appropriate.
A note about terminology: there are a great many ways to represent the
"object model" of an HTML or XML document. There are specifications for
both abstract and concrete representations. As a simplification, we use the
term "DOM" (Document Object Model) throughout as a general term for any of
these possible representations.
-----------------------------------
Noah
Received on Wednesday, 12 October 2011 02:19:14 UTC