- From: Noah Mendelsohn <noah@arcanedomain.com>
- Date: Tue, 11 Oct 2011 22:18:44 -0400
- To: Norman Walsh <ndw@nwalsh.com>
- CC: public-html-xml@w3.org
- Message-ID: <4E94F904.3000607@arcanedomain.com>
On 9/27/2011 10:11 AM, Norman Walsh wrote:
> Noah,
>
> A few weeks ago, you set out to draft some new introductory text for
> our report. I realize that I've also redrafted that section slightly.
> Are you satisfied with the result, or are you still working on a
> proposal for additional changes?
>
> Be seeing you,
> norm

OK, here's a cut at it:

-----------------------------------

HTML and XML share a common ancestor in SGML. The precise details of that ancestry are not strictly important; its significant consequence is that HTML and XML have a quite similar surface syntax. Both use angle brackets and ampersands to distinguish "markup" characters from "content" characters. Both have elements which contain other content and elements which are empty.

This high level of surface similarity suggests, at least to some and at least at first, that there should be a high level of interoperability between HTML and XML systems. This notion is amplified by the fact that when XML arrived on the scene, well after HTML was widely deployed, efforts were made to recast HTML as an XML application rather than an SGML application. HTML was never broadly implemented as an "SGML application", but it was defined as one in the early HTML specifications.

However, if you look beyond those high-level generalities, the languages are quite different and serve quite different purposes. Where HTML is a single language, XML is a framework for defining languages. Where HTML defines how a tree is constructed from any input, XML only defines tree construction for a small subset of all possible inputs. Where HTML defines explicit extension points within a single vocabulary, XML encourages the use of multiple vocabularies defined in a distributed fashion. Where HTML is in a small, explicit set of namespaces, XML provides for an unbounded number of namespaces.

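To make the tree-construction contrast concrete, here is a rough sketch using only the Python standard library. It is an illustration under stated assumptions, not part of the report text: `html.parser` is a lenient tokenizer and does not implement the HTML5 tree-construction algorithm, and the helper names (`TagCollector`, `html_events`, `xml_is_well_formed`) are made up for this example. It only shows that an HTML-style parser accepts input that an XML parser must reject.

```python
from html.parser import HTMLParser
import xml.etree.ElementTree as ET


class TagCollector(HTMLParser):
    """Hypothetical helper: records the events a lenient HTML parser reports."""

    def __init__(self):
        super().__init__()
        self.events = []

    def handle_starttag(self, tag, attrs):
        self.events.append(("start", tag))

    def handle_endtag(self, tag):
        self.events.append(("end", tag))

    def handle_data(self, data):
        # Ignore whitespace-only text to keep the event list short.
        if data.strip():
            self.events.append(("text", data.strip()))


def html_events(markup):
    """Feed markup to the lenient HTML tokenizer; it does not raise on tag soup."""
    parser = TagCollector()
    parser.feed(markup)
    parser.close()
    return parser.events


def xml_is_well_formed(markup):
    """Return True only if the XML parser accepts the input as well-formed."""
    try:
        ET.fromstring(markup)
        return True
    except ET.ParseError:
        return False


# Unclosed <p> and <b> elements: common in HTML, not well-formed XML.
malformed = "<p>one<p>two <b>bold</p>"

if __name__ == "__main__":
    print(html_events(malformed))         # events are reported, no exception
    print(xml_is_well_formed(malformed))  # the XML parser rejects the input
```

A real HTML5 parser would go further and build a fully specified tree from this input; the sketch only shows the accept/reject difference.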
Nonetheless, there are a number of potential benefits that might result if XML and HTML were more compatible and interoperable. These include:

* XML tools, including database and content-management systems, as well as the export/import capabilities provided in many programs such as spreadsheets, might be directly usable with HTML.
* The same XML markup, e.g. for content management or for vector graphics (SVG), might be usable in HTML as well as other XML container documents. Such shared markup might be supported by common code and tooling, and copy/paste scenarios might be supported.
* HTML fragments might more easily be copied for use in XML container documents. Syntax rules learned for use in one context would work in the other.
* Some overlap might be eliminated from specifications, e.g. rules for embedding SVG into XML and HTML container documents might be specified just once, rather than in duplicate and slightly differently.
* Etc., etc.

Against the backdrop of this tension, the TAG formed this Task Force in order to explore how interoperability between HTML and XML could be improved.

The Task Force began by collecting use cases to focus its efforts. The original expectation was that the use cases would highlight those areas where changes to XML and/or HTML specifications could usefully improve interoperability between XML and HTML. However, the Task Force could not identify any such changes that would provide practical benefit and that would likely be widely deployed in practice. All of the use cases do appear to have at least plausible solutions using XML as deployed today and HTML5 as planned, and those solutions do not appear amenable to significant improvement. So, it appears that there is little that can usefully be done now beyond documenting these circumstances.

In the following section, we'll describe a set of use cases that the Task Force considered, and how the needs of those use cases can be met today.
Readers are particularly encouraged to report additional use cases that they feel are not represented or specific examples where the solutions outlined are not appropriate.

A note about terminology: there are a great many ways to represent the "object model" of an HTML or XML document. There are specifications for both abstract and concrete representations. As a simplification, we use the term "DOM" (Document Object Model) throughout as a general term for any of these possible representations.

-----------------------------------

Noah
Received on Wednesday, 12 October 2011 02:19:14 UTC