- From: Etan Wexler <ewexler@stickdog.com>
- Date: Wed, 25 Dec 2002 04:33:32 -0500
- To: www-style@w3.org, Ian Hickson <ian@hixie.ch>
Ian Hickson wrote to <www-style@w3.org> on 23 December 2002 in "WD-css3-text-20021024 substantive comments" (<mid:Pine.LNX.4.21.0212162038130.17087-100000@dhalsim.dreamhost.com>):

> Both XML and SGML normalise newlines to single U+000A characters.

Your statement is incorrect regarding SGML, defined by ISO 8879. (My "SGML Handbook" is across the country at the moment, so I will not be able to cite clauses from ISO 8879. The following comes from memory.) In general, an SGML system will, after parsing, use a carriage return character (U+000D) to represent a line break from the input stream that constitutes an SGML document.

I elaborate for those interested. In SGML, each line of input text is called a record, to distinguish it from lines of output (for example, a formatted line on a printed page). Records come with delimiting characters: a record start character noted as RS, and a record end character noted as RE. In SGML, RS and RE are delimiter roles that, in the concrete syntax used by a document, need to be assigned codepoints. The Reference Concrete Syntax, a recommended part of ISO 8879 that is in wide use in SGML documents and systems, reflects a common convention and assigns U+000A to RS and U+000D to RE. SGML parsing rules typically discard RS characters and retain RE characters (there are a few minor exceptions, most of which I cannot recall). This behavior, working with the Reference Concrete Syntax, leaves us with carriage return (U+000D) for line breaks.

Then again, somebody could write and use a concrete syntax that assigns U+000D to RS and U+000A to RE. For that matter, somebody could assign U+231B (hourglass) to RS and U+0C6C (Telugu digit six) to RE. That is the beauty (and terror) of SGML: it is a framework most versatile, and nothing in ISO 8879 prohibits users from breaking convention or dashing expectations.

> So in CSS, [U+000A (line feed) is] the newline character.

The point, though, is that CSS could be used with any ordered hierarchy of content objects, if we did not overly confine CSS processors. XML documents are prime examples of ordered hierarchies of content objects, but not the only ones. Whether one chooses XML, SGML, or some other framework, CSS should be able to do a good job. One requirement of a good job is using the line breaks native to the framework.

-- 
Etan Wexler <mailto:ewexler@stickdog.com>
Every time you touch me I feel like I'm being bored.
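
P.S. For readers who prefer code to prose, the record handling described above can be sketched roughly as follows. This is a toy model in Python, not an SGML parser: the names ConcreteSyntax, to_records, and parse_data are hypothetical, and the model deliberately ignores the exceptions in which ISO 8879 also discards an RE. It only illustrates that the character left standing for a line break depends on the concrete syntax in use.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ConcreteSyntax:
    """Hypothetical stand-in for the RS/RE assignments of a concrete syntax."""
    rs_char: str = "\n"  # record start: U+000A in the Reference Concrete Syntax
    re_char: str = "\r"  # record end: U+000D in the Reference Concrete Syntax


def to_records(lines, syntax):
    """Wrap each input line as a record: RS, then the content, then RE."""
    return "".join(syntax.rs_char + line + syntax.re_char for line in lines)


def parse_data(stream, syntax):
    """Simplified rule: discard every RS, retain every RE.

    The real rules also drop some REs; that is not modeled here.
    """
    return stream.replace(syntax.rs_char, "")


reference = ConcreteSyntax()
swapped = ConcreteSyntax(rs_char="\r", re_char="\n")
whimsical = ConcreteSyntax(rs_char="\u231b", re_char="\u0c6c")

lines = ["first record", "second record"]
for syntax in (reference, swapped, whimsical):
    data = parse_data(to_records(lines, syntax), syntax)
    # Report which code point now marks the line breaks in the parsed data.
    breaks = {f"U+{ord(ch):04X}" for ch in data if ch == syntax.re_char}
    print(breaks)
```

Under these assumptions, the reference syntax leaves U+000D between records, the swapped syntax leaves U+000A, and the whimsical one leaves U+0C6C, which is why a CSS processor that hard-codes a single newline character would not serve every framework equally well.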
Received on Wednesday, 25 December 2002 05:35:11 UTC