Re: Concrete syntax, character sets
Glad to see the discussion starting. Martin raised some interesting points and
Tim's responses prompted me to reply.
Tim Bray wrote:
> XML should have *no* concept of quantities. Names, nesting depths, whatever,
> can be as large as required to meet the requirements of the application.
This I like.
> One straightforward way to do this and preserve compatibility
> with SGML is to require an XML processor to have the capability of writing
> an appropriate SGML declaration to set the quantities high enough to make
> a particular XML DTD valid.
This I don't like. Requiring every XML processor to have this capability
(feature?) seems overly restrictive. Noting that it can be done would be
sufficient for me. A reference application would be nice, perhaps something
available from W3C.
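To make "it can be done" a little more concrete, here is a rough sketch of the
idea, not a proposal for how a real tool should work. It assumes only NAMELEN
matters, uses a deliberately naive regular expression to find declared names,
and emits just the QUANTITY fragment of an SGML declaration. The function names
and the sample DTD are invented for illustration; 8 is the NAMELEN of the
reference quantity set.

```python
import re

# Hypothetical helper: length of the longest name declared via ELEMENT/ATTLIST.
# The regex is a simplification and ignores name groups, attribute names, etc.
def required_namelen(dtd_text: str) -> int:
    names = re.findall(r"<!(?:ELEMENT|ATTLIST)\s+([A-Za-z][\w.-]*)", dtd_text)
    return max((len(name) for name in names), default=0)

# Hypothetical helper: emit only the QUANTITY section, raising NAMELEN when the
# DTD needs more than the reference value. A full SGML declaration has many
# other sections that are not produced here.
def quantity_fragment(dtd_text: str, reference_namelen: int = 8) -> str:
    needed = max(required_namelen(dtd_text), reference_namelen)
    return f"QUANTITY SGMLREF\n         NAMELEN {needed}"

sample_dtd = """<!ELEMENT chapter-heading (#PCDATA)>
<!ELEMENT p (#PCDATA)>"""
print(quantity_fragment(sample_dtd))
```

A real tool would have to track the other quantities (nesting depth, attribute
counts, literal lengths and so on) the same way.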
> If you want to use anything but 7-bit ASCII in markup, use real SGML.
> XML should have the reference concrete syntax hardwired in.
I think we should recognize that 7-bit ASCII isn't sufficient for something that
professes to be "World Wide". I'm not aware of any major technical problems with
using other encodings in markup, but I do know that the 7-bit ASCII restriction
is an issue for many people. I'd like to see XML support other encodings in
markup.
> *Good* point... with modern parsing and encoding technology, it seems like
> it would be easy, and it would certainly be desirable, for XML
> data not to be limited to small old character sets. On the other hand, with
> XML, ultimate flexibility is of less importance than ease of implementation;
> would it be thinkable to say that "all XML data is always in UTF8"? It
> seems this would break almost nothing and allow almost anything you'd want
> to do.
I don't have a problem with UTF8 for data. Why not for markup as well?
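A small sketch of why UTF8 in markup looks harmless to me. It relies on two
properties of UTF8: ASCII text encodes to the same bytes, and every byte of a
multi-byte character is 0x80 or above, so a byte-level scan for "<" and ">"
cannot be confused by non-ASCII names. The element names and helper functions
below are made up for the illustration.

```python
# Made-up element names: one plain ASCII, one with a non-ASCII letter.
ascii_tag = "<title>"
non_ascii_tag = "<überschrift>"

# Hypothetical helper: True if the text fits in 7-bit ASCII.
def is_7bit(text: str) -> bool:
    try:
        text.encode("ascii")
        return True
    except UnicodeEncodeError:
        return False

# Hypothetical helper: byte offsets of the '<' (0x3C) and '>' (0x3E) delimiters.
def delimiter_offsets(data: bytes) -> list[int]:
    return [i for i, byte in enumerate(data) if byte in (0x3C, 0x3E)]

# ASCII-only markup is byte-for-byte identical in UTF8.
same_bytes = ascii_tag.encode("utf-8") == ascii_tag.encode("ascii")

# Non-ASCII markup does not fit in 7 bits, but its delimiters are still found
# at the first and last byte by a plain byte scan.
encoded = non_ascii_tag.encode("utf-8")
print(same_bytes, is_7bit(non_ascii_tag), delimiter_offsets(encoded))
```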