- From: Elliotte Rusty Harold <elharo@metalab.unc.edu>
- Date: Fri, 26 Jul 2002 07:47:40 -0400
- To: <xml-dev@lists.xml.org>
- Cc: <www-xml-blueberry-comments@w3.org>
At 3:03 PM +1000 7/26/02, Rick Jelliffe wrote:

>In http://www.w3.org/TR/newline the three use cases are:
>

Looking at this document, I note that its author(s) had some serious misconceptions about XML. For example, they state:

Well-formed but invalid - because the [NEL] character appears in element content:

<a>[NEL]<b/>[NEL]</a>

where the corresponding DTD contains

<!ELEMENT b EMPTY>
<!ELEMENT a (b)>

In fact, however, the second example is invalid with or without allowing NEL. Valid elements declared empty may not contain white space.

A similar misconception is seen later when the authors state:

\n printf output: OS/390 C or Java program [NEL]

This may be true in C. It is not true in Java. In Java \n always results in a linefeed. If it's producing a NEL on OS/390, then the OS/390 JVM is not conformant to the Java spec either.

>> Using native system string functions, such as atoi and atof, to
>>convert XML strings, documents, or fragments, to other data types

This really goes to the heart of the problem: atoi and atof are ASCII functions that are simply not suitable for Unicode-based XML regardless of what we do with NEL. The atof() signature is:

double atof(const char *nptr);

It's been a while since I've written C, but my recollection is that the char type is always one-byte wide. Processing XML in C requires using different kinds of wide chars and wide string types. You can't use native system string functions to work with XML data because XML data is Unicode, not ASCII. For instance, in the Apache Xerces-C DOM "String is represented by 'XMLCh*' which is a pointer to unsigned 16 bit type holding utf-16 values, null terminated." Other schemes are possible. However, you simply cannot use C's traditional 1-byte strings and characters and their associated functions.

This is not an OS/390 issue. It is a C issue. The same is true on Windows, Mac OS, Unix, and every other platform that uses C. All of the other functions we're talking about are similar.
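To make the point about 16-bit strings concrete, here is a minimal sketch in C. It assumes a null-terminated buffer of unsigned 16-bit UTF-16 code units, similar in spirit to the XMLCh strings quoted above. The names utf16_unit, naive_atoi, and utf16_to_long are hypothetical and are not part of Xerces-C or any other library. The sketch handles only ASCII digits and is not a full numeric parser.

```c
#include <assert.h>
#include <limits.h>
#include <stdbool.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical stand-in for a 16-bit UTF-16 code unit type. */
typedef uint16_t utf16_unit;

/* What goes wrong: atoi() sees the buffer as bytes. Each ASCII digit in
   UTF-16 carries a zero byte, so atoi() stops after at most one digit
   (little-endian) or before the first digit (big-endian). */
int naive_atoi(const utf16_unit *s)
{
    assert(s != NULL);
    return atoi((const char *)s);
}

/* Parse an optionally signed run of ASCII digits from a null-terminated
   UTF-16 string. Returns true and stores the value in *out on success.
   Returns false on an empty string, a non-digit code unit, or a magnitude
   larger than LONG_MAX (so LONG_MIN itself is rejected). */
bool utf16_to_long(const utf16_unit *s, long *out)
{
    assert(s != NULL && out != NULL);

    bool negative = false;
    if (*s == '+' || *s == '-') {
        negative = (*s == '-');
        s++;
    }
    if (*s == 0) {
        return false;               /* no digits at all */
    }

    long value = 0;
    for (; *s != 0; s++) {
        if (*s < '0' || *s > '9') {
            return false;           /* not an ASCII digit */
        }
        int digit = *s - '0';
        if (value > (LONG_MAX - digit) / 10) {
            return false;           /* would overflow */
        }
        value = value * 10 + digit;
    }

    *out = negative ? -value : value;
    return true;
}
```

Given the code units '4', '2', 0, utf16_to_long stores 42, while naive_atoi cannot return 42 on either byte order because it stops at the first zero byte. A real application would more likely use the conversion routines shipped with its XML library than hand-roll a parser like this one.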
Even with NEL, you still shouldn't be using these to process XML. OS/390 needs to get some modern libraries. XML does not need to change. If mainframe programmers think that NEL is the only problem they have, they are sorely mistaken. IBM is asking us to break XML for many thousands of users for something that won't even fix their own problems. Short of moving XML to ASCII (a solution we all rightly abhor), the only way to solve the OS/390 problem is to fix OS/390. XML *cannot* be fixed enough to make XML usable on OS/390 in the way IBM wants.

--
+-----------------------+------------------------+-------------------+
| Elliotte Rusty Harold | elharo@metalab.unc.edu | Writer/Programmer |
+-----------------------+------------------------+-------------------+
|          XML in a Nutshell, 2nd Edition (O'Reilly, 2002)           |
|              http://www.cafeconleche.org/books/xian2/              |
|  http://www.amazon.com/exec/obidos/ISBN%3D0596002920/cafeaulaitA/  |
+----------------------------------+---------------------------------+
| Read Cafe au Lait for Java News:  http://www.cafeaulait.org/       |
| Read Cafe con Leche for XML News: http://www.cafeconleche.org/     |
+----------------------------------+---------------------------------+
Received on Friday, 26 July 2002 08:15:40 UTC