Re: Concrete syntax, character sets

At 01:44 PM 9/9/96 -0700, Jon Bosak wrote:

>| If you want to use anything but 7-bit ASCII in markup, use real SGML.
>| XML should have the reference concrete syntax hardwired in.
>Having just gone through a big struggle in WG8 and X3V1 over the ERCS
>proposal, I would feel pretty strange about limiting markup to
>something that not even Western Europeans could use the way they want
>to.  I would like to see some serious discussion of this point.

OK, for the moment I'll stick to my guns.  Here are 2 arguments:

1. Document *data* is (mostly) for people to read, and thus of course 
   has to support the languages they write in.  Document *markup* is
   (mostly) for computer programs to read, plus the occasional unfortunate
   document designer.  Given that these things are already monocased
   and, by an industry habit that I doubt XML will break, short, it's
   not clear that expressing GIs & attribute names in Cyrillic or
   Chinese is all that important to the market.
2. Supporting bigger & more complex encodings in markup brings the benefit
   of making life easier & friendlier for document designers who want to
   use them.  Restricting the markup character set down to 7 bits brings
   the benefit of making it quicker & easier to write software that
   processes such markup.  If I didn't already think that the second 
   of these two incompatible benefits was more important, I wouldn't
   be working on XML.

Jon, maybe you could bring some of the ERCS-folks' arguments alive for us?

Cheers, Tim Bray
tbray@textuality.com http://www.textuality.com/ +1-604-488-1167