- From: David G. Durand <dgd@cs.bu.edu>
- Date: Thu, 28 Nov 1996 16:50:40 -0500
- To: w3c-sgml-wg@w3.org
- Cc: w3c-sgml-wg@w3.org
At 2:23 PM 11/28/96, Charles F. Goldfarb wrote: >The SGML technique for accomplishing this result has two steps: > >1. Declare a new SEPCHAR function character in the XML concrete syntax, using a >character number that is not expected to occur naturally. Call this >"newsep". : >2. In a shortref map used by the document element, map RE to NEWSEP. > >That's it. If you're unhappy with SGML's built-in whitespace handling, you can >use this technique today. If we choose to use it for XML, it will be simple to >explain, easy to implement, and totally conforming. Unfortunately this does not work for XML, as it is not possible to tell the difference between mixed content and element content when parsing without a DTD. It also unfortunately requires throwing away a character code point. Especially for 8-bit character sets, that might be a high price to pay. Isn't there some chance that we could get a way to specify the behavior wanted rather than an arcane workaround? I want to be able to turn off RS/RE handling, and turn all whitespace into SEPCHARS without having to make arcane restrictions on what other, non-whitespace, characters can appear in a document. Is that too much to ask? Also, it's not clear to me that RS/RE are visible to shortrefs at the beginning of an element. I thought SHORTREF applied to document data, not ignored characters, but I could well be wrong, as I have avoided delving into the mysteries of shortref (like why a shortref cannot include the letter "B"). -- David I am not a number. I am an undefined character. _________________________________________ David Durand dgd@cs.bu.edu \ david@dynamicDiagrams.com Boston University Computer Science \ Sr. Analyst http://www.cs.bu.edu/students/grads/dgd/ \ Dynamic Diagrams --------------------------------------------\ http://dynamicDiagrams.com/ MAPA: mapping for the WWW \__________________________
Received on Thursday, 28 November 1996 16:44:47 UTC