- From: Kent M Pitman <kmp@harlequin.com>
- Date: Wed, 6 May 98 03:59:54 EDT
- To: xml-editor@w3.org
The XML 1.0 specification seems to go out of its way to make a CDStart [19] appear as a single token '<![CDATA[' even though both common sense and the SGML specification (Section 10.4 Marked Section Declaration, definitions [93], [97], and [100]) would lead one to expect that all marked section declarations are treated uniformly and permit

  '<![' whitespace keyword whitespace '[' ...data... ']]>'

For XML to have omitted the possibility of whitespace here forces parsing of '<![' to introduce gratuitous special cases. Was there a reason for that?

This kind of thing complicates my parser in what seems to me a useless way. I ended up writing something like this:

  (let ((status-keyword
         (cond ((NameStartChar? (peek-code stream))
                ;; '<![CDATA' or '<![IGNORE' or '<![INCLUDE' per [19][62][63]
                (require-token-among '("CDATA" "IGNORE" "INCLUDE")
                                     "CDSect or includeSect or ignoreSect"
                                     stream))
               (t
                (peek-code-after-S stream)
                ;; '<![ IGNORE' or '<![ INCLUDE' but NOT '<![ CDATA' per [19]
                (require-token-among '("IGNORE" "INCLUDE")
                                     "includeSect or ignoreSect"
                                     stream)))))
    (cond ((equal status-keyword "CDATA")
           ;; No whitespace allowed in '<![CDATA[' per [19].
           (require-char #\[ "CDSect" stream)
           ....)
          (t
           ;; Whitespace IS allowed in '<![ INCLUDE/IGNORE [' before the
           ;; second bracket per [62][63].
           (peek-code-after-S stream)
           (require-char #\[ "includeSect or ignoreSect" stream)
           ...)))

where I feel something like this ought to have sufficed:

  (let ((status-keyword
         (prog2 (peek-code-after-S stream) ; skip preceding whitespace
                ;; '<![ CDATA ' or '<![ IGNORE ' or '<![ INCLUDE '
                ;; per [19][62][63]
                (require-token-among '("CDATA" "IGNORE" "INCLUDE")
                                     "CDSect or includeSect or ignoreSect"
                                     stream)
                (peek-code-after-S stream)))) ; skip following whitespace
    (require-char #\[ status-keyword stream)
    ...)
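To make the asymmetry concrete, here is a minimal, self-contained sketch of which openers the XML 1.0 grammar accepts under productions [19], [62], and [63]. It is an illustration only, not the parser discussed above: the names `xml-s-char-p`, `skip-s`, `prefix-at-p`, and `classify-section-open` are hypothetical, it works on a plain string rather than a stream, and it ignores context (conditional sections belong to the external DTD subset, CDATA sections to content).

```lisp
;; Hypothetical helpers for illustration; none of these names come from
;; the parser described above.

(defun xml-s-char-p (ch)
  "True if CH is XML 1.0 white space (production [3] S)."
  (member (char-code ch) '(#x20 #x9 #xA #xD)))

(defun skip-s (text pos)
  "Return the index of the first non-white-space character at or after POS."
  (or (position-if-not #'xml-s-char-p text :start pos)
      (length text)))

(defun prefix-at-p (prefix text pos)
  "True if TEXT contains PREFIX starting at index POS."
  (let ((end (+ pos (length prefix))))
    (and (<= end (length text))
         (string= prefix text :start2 pos :end2 end))))

(defun classify-section-open (text)
  "Return :CDATA, :INCLUDE, :IGNORE, or NIL for the opener at the start of TEXT."
  (cond ((not (prefix-at-p "<![" text 0)) nil)
        ;; [19] CDStart is the literal '<![CDATA[': no white space anywhere.
        ((prefix-at-p "CDATA[" text 3) :cdata)
        (t
         ;; [62]/[63] allow optional S before the keyword and before '['.
         (let ((pos (skip-s text 3)))
           (loop for (keyword . kind) in '(("INCLUDE" . :include)
                                           ("IGNORE" . :ignore))
                 for end = (+ pos (length keyword))
                 when (and (prefix-at-p keyword text pos)
                           (prefix-at-p "[" text (skip-s text end)))
                   return kind)))))

(classify-section-open "<![CDATA[x]]>")    ; => :CDATA
(classify-section-open "<![ CDATA [x]]>")  ; => NIL, white space is rejected
(classify-section-open "<![ INCLUDE [")    ; => :INCLUDE
(classify-section-open "<![IGNORE[")       ; => :IGNORE
```

The CDATA case needs its own branch before any white space is skipped, which is the special case complained about above; under the SGML-style rule a single skip-keyword-skip sequence would cover all three keywords.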
Received on Wednesday, 6 May 1998 03:56:28 UTC