- From: Robert Streich <streich@slb.com>
- Date: Sun, 22 Sep 96 15:29:03 CDT
- To: w3c-sgml-wg@w3.org
At 10:53 AM 9/20/96 CDT, Michael Sperberg-McQueen wrote:
>Or even
>
>  <p>Listen to my heart beat.
>  <?DIRECTOR: audio on
>  >And beat and beat and beat.</p>

This is an excellent technique. DynaTag uses it to great success to get
around mixed content problems in the really screwy DTDs that it creates.

At 06:08 AM 9/21/96 +0000, Tim Bray wrote:
>And I agree 100%. I am proposing (perhaps only as a strawman) that in XML we
>make it 100% crystal clear that in in case [2], the "true information" is, in
>C notation,
>  "\nListen to my heart beat\n\nand beat and beat\n"
>or if you typed it in on a windows box
>  "\r\nListen to my heart beat\r\n\r\nand beat and beat\r\n"

I agree, sort of. If possible, I'd like to have the first and last line
breaks discarded. These should be easy to differentiate and discard, and
they are "logically" insignificant.

>No ambiguity whatsoever. The costs are:
>[a] difficulty in figuring out how to make this 8879-compliant, and

This is tricky (if not impossible), and I could accept a divergence here
as its impact is very, very small.

>[b] the fact that you can no longer use whitespace around markup if you are
>    worried about the application's handling of line breaks in the data, so you
>    might in fact have to use
>
>    <p>Listen to my heart beat<? Director Audio on> and beat and beat.</p>
>
>    which is harder to read, and hard to type in vi, for long paragraphs.

I can easily live with this, as it is the current situation. Applications
decide what a newline sequence means based on the stylesheet. In
Author/Editor and DynaText, newlines look like spaces unless I specify
"verbatim" formatting in the stylesheet.

>The advantages are:
>[a] you can explain quickly, clearly, and precisely exactly what the "true
>    information" is
>[b] Implementation is very easy

I think the advantages far outweigh the disadvantages, and the rule is easy
to explain to the author.
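As a rough illustration of the rule discussed above (every line break is
data, except that one leading and one trailing break are dropped), here is
a minimal Python sketch. The function names strip_edge_breaks and render
are hypothetical, and the verbatim flag merely stands in for a stylesheet
setting; it is an assumption for illustration, not how Author/Editor or
DynaText is actually implemented.

```python
def strip_edge_breaks(data: str) -> str:
    """Drop at most one leading and one trailing line break.

    Interior line breaks are left untouched, so they remain part of
    the data. Both "\r\n" and "\n" are recognized as line breaks.
    """
    for nl in ("\r\n", "\n"):
        if data.startswith(nl):
            data = data[len(nl):]
            break
    for nl in ("\r\n", "\n"):
        if data.endswith(nl):
            data = data[:-len(nl)]
            break
    return data


def render(data: str, verbatim: bool = False) -> str:
    """Hypothetical application step: show line breaks as spaces
    unless the (assumed) stylesheet asks for verbatim formatting."""
    if verbatim:
        return data
    return data.replace("\r\n", "\n").replace("\n", " ")


raw = "\nListen to my heart beat\n\nand beat and beat\n"
content = strip_edge_breaks(raw)
print(repr(content))
print(repr(render(content)))
```

With the example string from the quoted message, the two interior line
breaks survive as data; in non-verbatim rendering they show up as two
adjacent spaces.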
The only place it becomes an issue is when you have markup in data content
that is not a proper subelement and it's the only thing on the line. Worst
case: if line breaks are significant (to the presentation), you get an
empty new line; if line breaks are not significant, you get an extra space.
These things are easily picked up during proofing. Good spellcheckers even
pick up the two or more spaces for you.

If I'm in line-break-significant content, I'm also very unlikely to put in
any extraneous markup anyway. If I need to, I can easily remember to "hide"
the line break in the added markup using either Lee's or Michael's
suggestions.

Another advantage is that it makes it very easy for an SGML editor to
"save as XML," at least in this case.

The biggest disadvantage is that in the probably very, very few cases where
someone wanted the extraneous line break after some markup, an SGML parser
would discard it. I can live with this risk.

At 08:21 PM 9/21/96 GMT, Charles F. Goldfarb wrote:
>Yes, but then you are requiring the author to enforce the rules as well as
>remember them. With smart record handling in the parser, the author only has to
>remember the rules; the parser enforces them.

But I think the rules are much simpler to remember and a lot easier to
digest than having to sometimes "quote" data content. This requires that
the author know what mixed content is and which elements are mixed. This is
a lot more to bite off than the alternative.

bob

Robert Streich           streich@slb.com
Schlumberger             voice: 1 512 331 3318
Austin Research          fax: 1 512 331 3760
Received on Sunday, 22 September 1996 16:29:25 UTC