Re: Concrete syntax, character sets
At 01:44 PM 9/9/96 -0700, Jon Bosak wrote:
>| If you want to use anything but 7-bit ASCII in markup, use real SGML.
>| XML should have the reference concrete syntax hardwired in.
>Having just gone through a big struggle in WG8 and X3V1 over the ERCS
>proposal, I would feel pretty strange about limiting markup to
>something that not even Western Europeans could use the way they want
>to. I would like to see some serious discussion of this point.
OK, for the moment I'll stick to my guns. Here are two arguments:
1. Document *data* is (mostly) for people to read, and thus of course
has to support the languages they write in. Document *markup* is
(mostly) for computer programs to read, plus the occasional unfortunate
document designer. Given that markup names are already monocased and,
by an industry habit that I doubt XML will break, short, it's not
clear that expressing GIs & attribute names in Cyrillic or Chinese is all
that important to the market.
2. Supporting bigger & more complex encodings in markup brings the benefit
of making life easier & friendlier for document designers who want to
use them. Restricting the markup character set down to 7 bits brings
the benefit of making it quicker & easier to generate software that
processes such markup. If I didn't already think that the second
of these two incompatible benefits was more important, I wouldn't
be working on XML.
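
As a rough illustration of the second benefit in point 2, here is a minimal Python sketch of how little code a 7-bit name check needs. The function names `is_ascii_name` and `fold_name` are hypothetical, and the name rule used (an ASCII letter followed by ASCII letters, digits, '.' or '-') is an assumption modelled on the reference concrete syntax, not a rule taken from this message.

```python
import re

# Assumption: a markup name is an ASCII letter followed by ASCII letters,
# digits, '.' or '-'. This rule is illustrative only.
ASCII_NAME = re.compile(r"[A-Za-z][A-Za-z0-9.\-]*")


def is_ascii_name(name: str) -> bool:
    """Return True if `name` is a valid name under the assumed 7-bit rule."""
    return ASCII_NAME.fullmatch(name) is not None


def fold_name(name: str) -> str:
    """Fold a valid 7-bit name to upper case (names are monocased)."""
    if not is_ascii_name(name):
        raise ValueError(f"not a 7-bit ASCII name: {name!r}")
    # For ASCII-only input, case folding is a simple 26-letter mapping.
    return name.upper()


if __name__ == "__main__":
    for candidate in ["chapter", "sect-1.2", "1st", "глава"]:
        print(candidate, is_ascii_name(candidate))
```

With a larger character repertoire, the same two functions would need per-script tables of name characters and a decision on how (or whether) case folding applies to each script, which is roughly the cost being weighed above.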
Jon, maybe you could bring some of the ERCS-folks' arguments alive for us?
Cheers, Tim Bray
firstname.lastname@example.org http://www.textuality.com/ +1-604-488-1167