Re: Concrete syntax, character sets
> * Notwithstanding the last point, we can (if we want) go farther than
> saying that 10646 is the XML character set (in the SGML sense) and
> also say that UTF-8 is the only (or recommended, or expected, or
> default) encoding that we shall/should/expect to find when we open a
> file containing an XML document or receive a byte stream during an
> HTTP session.
This sounds good as long as UTF-8 is recommended or defaulted. The
other issue is what to present at the parser's API. Here too I do
not want to be restricted to a UTF-8 encoding, because if I were to
write a parser in Java, a UTF-16 encoding would be more appropriate.
Instead I view the parser proper as accepting UCS-4 on input and
output. Encoding should be handled outside the parser in the storage
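
To make the split concrete, here is a rough sketch in Python (chosen
only for brevity). The names decode_to_code_points and
count_tag_opens are hypothetical, made up for this illustration, and
the "parser" is a trivial stand-in. The only point is that the
decoding layer is the one place that knows about UTF-8 or UTF-16,
while the parser side sees nothing but integer code points.

```python
import codecs
from typing import Iterable, Iterator


def decode_to_code_points(chunks: Iterable[bytes],
                          encoding: str = "utf-8") -> Iterator[int]:
    """Storage-side layer (hypothetical name): bytes in, code points out.

    An incremental decoder is used so that a character split across
    two chunks is still decoded correctly.
    """
    decoder = codecs.getincrementaldecoder(encoding)()
    for chunk in chunks:
        for ch in decoder.decode(chunk):
            yield ord(ch)
    # Flush the decoder; raises if the input ended mid-character.
    for ch in decoder.decode(b"", final=True):
        yield ord(ch)


def count_tag_opens(code_points: Iterable[int]) -> int:
    """Stand-in for the parser proper: it never sees bytes or encodings."""
    return sum(1 for cp in code_points if cp == 0x3C)  # 0x3C is '<'


# The same document stored in two different encodings.
text = "<a>\u00e9\U0001D11E</a>"
utf8 = text.encode("utf-8")
utf16 = text.encode("utf-16")  # includes a byte order mark

# The UTF-8 input is deliberately split in the middle of a character.
from_utf8 = list(decode_to_code_points([utf8[:4], utf8[4:]], "utf-8"))
from_utf16 = list(decode_to_code_points([utf16], "utf-16"))
```

Under these assumptions both calls hand the parser the same sequence
of code points, so the choice of UTF-8 as a default affects only the
decoding layer and not the parser's interface.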
B. Todd Bauman
University of Maryland, Baltimore County