- From: Jon Bosak <bosak@atlantic-83.Eng.Sun.COM>
- Date: Mon, 9 Sep 1996 13:44:59 -0700
- To: tbray@textuality.com
- CC: w3c-sgml-wg@w3.org
[Responding to Tim Bray:]

| At 07:19 PM 9/9/96 +0100, Martin Bryan wrote:
|
| > For example, before choosing whether or
| >not to retain the distinction between abstract and reference concrete
| >syntaxes
|
| If you want such a distinction, use real SGML.  XML should use a hardwired
| concrete syntax.

I totally agree.

| >- HTML has extended the Quantities defined in the reference concrete syntax:
| >should XML offer less flexibility than HTML?
| >- the SGML community has already agreed on a new set of Quantity defaults
| >for the next version of SGML: should XML offer less flexibility than the
| >next version of SGML
|
| XML should have *no* concept of quantities.  Names, nesting depths, whatever,
| can be as large as required to meet the requirements of the application.
| One straightforward way to do this and preserve compatibility
| with SGML is to require an XML processor to have the capability of writing
| an appropriate SGML declaration to set the quantities high enough to make
| a particular XML DTD valid.

I agree with Tim's intent here, but aren't some quantities related to
the particular document instance?  (Correct me if I'm wrong, but I
seem to recall cases where declarations worked fine for me until a
specific instance blew them up.)

| >- the reference concrete syntax only permits the use of Latin alphanumeric
| >characters in names of elements, attributes and tokens...
| >- only Arabic numerals are recognised in the 1986 version of 8879...
|
| If you want to use anything but 7-bit ASCII in markup, use real SGML.
| XML should have the reference concrete syntax hardwired in.

Having just gone through a big struggle in WG8 and X3V1 over the ERCS
proposal, I would feel pretty strange about limiting markup to
something that not even Western Europeans could use the way they want
to.  I would like to see some serious discussion of this point.

| >- the default character set in 8879 matches that of the reference concrete
| >syntax: should users be able to select which character set is most
| >appropriate for their documents and specify an SGML declaration in which
| >only a subset of ISO 10646 is recognized as valid while still retaining the
| >reference concrete syntax for markup?
|
| *Good* point... with modern parsing and encoding technology, it seems like
| it would be easy, and it would certainly be desirable, for XML
| data not to be limited to small old character sets.  On the other hand, with
| XML, ultimate flexibility is of less importance than ease of implementation;
| would it be thinkable to say that "all XML data is always in UTF8"?  It
| seems this would break almost nothing and allow almost anything you'd want
| to do.

It's certainly thinkable to me.  Is it thinkable to say that "all
markup is in UTF8" as well?

Jon
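
A small illustration of the UTF-8 question above, offered only as a
sketch.  It is written in Python, which nobody in this thread is
using, and the element names are hypothetical samples.  It shows two
things: markup restricted to 7-bit ASCII is byte-for-byte unchanged
when read as UTF-8, and a name containing accented Latin letters
still survives a round trip through UTF-8.

```python
# Hypothetical sample markup; neither element name comes from the thread.
# \u00e9 is the precomposed character "e with acute accent".
ascii_markup = "<title>Report</title>"
accented_markup = "<r\u00e9sum\u00e9>Donn\u00e9es</r\u00e9sum\u00e9>"


def utf8_round_trip(text: str) -> bytes:
    """Encode text as UTF-8 and check that decoding restores it exactly."""
    encoded = text.encode("utf-8")
    assert encoded.decode("utf-8") == text
    return encoded


# Pure 7-bit ASCII markup has identical bytes in ASCII and in UTF-8.
ascii_bytes = utf8_round_trip(ascii_markup)
same_as_ascii = ascii_bytes == ascii_markup.encode("ascii")

# Each accented letter here needs two bytes in UTF-8, so the encoded
# form is longer than the character count.
accented_bytes = utf8_round_trip(accented_markup)
extra_bytes = len(accented_bytes) - len(accented_markup)
```

Nothing here settles whether markup should be allowed outside ASCII;
it only shows that the encoding itself does not force the choice.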
Received on Monday, 9 September 1996 16:48:59 UTC