
Re: Concrete syntax, character sets

From: Bill Smith <bsmith@atlantic-82.Eng.Sun.COM>
Date: Mon, 9 Sep 1996 16:05:00 -0700
Message-Id: <199609092305.QAA05806@providence.eng.sun.com>
To: w3c-sgml-wg@w3.org
Tim Bray wrote:

> 1. Document *data* is (mostly) for people to read, and thus of course 
>    has to support the languages they write in.  Document *markup* is
>    (mostly) for computer programs to read, plus the occasional unfortunate
>    document designer.  Given that these things are already monocased,
>    and by industry habit that I doubt XML will break, short, it's not
>    clear that expressing GI's & attribute names in Cyrillic or Chinese is all 
>    that important to the market.

Put that question to the Cyrillic, Chinese, and other markets that can't live with
7-bit ASCII, and I suspect you will get a very different answer. XML should
*embrace* I18N, not merely make it possible.

> 2. Supporting bigger & more complex encodings in markup brings the benefit
>    of making life easier & friendlier for document designers who want to
>    use them.  Restricting the markup character set down to 7 bits brings
>    the benefit of making it quicker & easier to generate software that
>    processes such markup.  If I didn't already think that the second 
>    of these two incompatible benefits was more important, I wouldn't
>    be working on XML.

I don't see these as incompatible but rather as complementary requirements. XML
should be both easy to use (for document designers and authors) and easy to
implement (for software developers). I don't see a need to trade one off against
the other, at least in this instance.

We are agreed that 7-bit ASCII isn't sufficient for XML data. Supporting more than
that (properly) will require significant effort. Supporting characters beyond
7-bit ASCII in markup is trivial by comparison.
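
To make the "trivial by comparison" point concrete, here is a minimal sketch, assuming the markup were encoded in UTF-8. In UTF-8 every byte of a non-ASCII character is 0x80 or above, so a scanner that looks only for the ASCII delimiter bytes '<' and '>' can never mistake part of a Cyrillic or Chinese name for a delimiter. The function name tag_names and the sample element names are made up for illustration, and this is nowhere near a real SGML or XML parser.

```python
def tag_names(doc_bytes: bytes) -> list:
    """Collect start-tag names by scanning raw bytes for ASCII '<' and '>'."""
    names = []
    pos = 0
    while True:
        start = doc_bytes.find(b"<", pos)
        if start == -1:
            break
        end = doc_bytes.find(b">", start + 1)
        if end == -1:
            break
        inner = doc_bytes[start + 1:end]
        # Split on ASCII whitespace so attributes (if any) are dropped.
        parts = inner.split()
        # Skip end tags such as </name>; keep only start-tag names.
        if parts and not inner.startswith(b"/"):
            names.append(parts[0].decode("utf-8"))
        pos = end + 1
    return names


# Hypothetical document with Cyrillic element names, encoded as UTF-8.
sample = "<книга><заглавие>Война и мир</заглавие></книга>".encode("utf-8")
print(tag_names(sample))
```

The scanning loop is the same one you would write for 7-bit markup; only the final decode of the name knows anything about the encoding. That property is specific to UTF-8 and would not hold for every multi-byte encoding.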

What about the UTF-8 suggestion for both markup and data?

Received on Monday, 9 September 1996 19:06:26 EDT
