Martin wrote:
>
> >--
> >This means that XML or HTML documents are always processed as a
> >sequence of characters from the Unicode character set.
> >--
>
> This may not always be true. It is perfectly fine to have an
> XML parser that works in US-ASCII for US-ASCII documents, and
> so on. It may not be a good idea in terms of implementation,
> but it wouldn't be against the XML Rec.
>

(personal response)

Yes, but the effect is the same: a US-ASCII document might still contain an NCR that must be treated as a Unicode code point.

It is useful to note that the paragraph directly following this sentence makes the point that the file might use any encoding, including a non-Unicode encoding.

While my suggestion might not be quite the right wording, it does, I think, convey the important point, which is that document authors may (and document processors must) treat files as if they were a sequence of Unicode code points. What encoding the processor uses internally is invisible.

Addison

Received on Tuesday, 10 June 2008 14:33:42 GMT
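
The NCR point in the message above can be sketched in Python. This is only an illustration, assuming the standard library's `xml.etree.ElementTree` parser; the sample document and the variable names are hypothetical and not taken from the thread.

```python
import xml.etree.ElementTree as ET

# A hypothetical document whose bytes are all US-ASCII and which
# declares US-ASCII as its encoding.
doc = b'<?xml version="1.0" encoding="US-ASCII"?><p>caf&#xE9; &#x20AC;5</p>'

# Every byte is below 0x80, so the file itself is pure US-ASCII.
assert all(b < 0x80 for b in doc)

# The parser resolves each numeric character reference (NCR) to the
# Unicode code point it names, regardless of the file's encoding.
text = ET.fromstring(doc).text

# Collect the code points that fall outside the ASCII range.
code_points = [hex(ord(ch)) for ch in text if ord(ch) > 0x7F]
print(text, code_points)
```

Although no byte in the file is outside US-ASCII, the parsed text contains U+00E9 and U+20AC, so the document is effectively a sequence of Unicode code points whatever the processor does internally.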