Re: Simple(?) question on obscure comments detail

>On Fri, 20 Sep 1996, Murray Altheim wrote:
>
>> The parser is scanning forward for the next instance of COM, not for the
>> next instance of "-->", which has no singular significance in a comment
>> declaration; it is simply the concatenation of a COM and MDC (">"); that's
>> why parsers that look for "-->" are making an error. It is perfectly
>> SGML-legal to write a comment declaration such as:
>>
>>      <!-- hello --
>>      >
>
>Let me see if I (who knows nothing of SGML) can get this straight:

Well, if you know HTML you understand a fair amount of SGML syntax; HTML is
simply one application of SGML. Go look at [DocBook]'s DTD and you'll
see that it's just a more complicated cousin of HTML.

><! > is an HTML element that stands for an SGML declaration
>
>-- some text -- is an SGML comment
>
>Hence any whitespace is allowed between <! and the comment, as is between
>the comment and >, according to what I know of HTML.
>
>Is this correct?

Pretty close. Just as you can't put a space between a start-tag open
(STAGO) and the generic identifier (GI, e.g., "B") in document content, such
as

    < B>invalid</B>
     ^

the same rule applies to markup declarations. An SGML markup declaration
begins with a "markup declaration open" (MDO), which in the reference
concrete (i.e., default) syntax is defined as:

    <!

Following immediately after this (no whitespace) is the declaration type.
There are 13 types of markup declarations in SGML, but only one is allowed
in the document instance: the comment declaration. All others are specific
to the document prologue (SGML declaration, DTD, etc.). The markup
declaration ends with a markup declaration close (MDC), defined in the
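reference concrete syntax as ">". Remember that SGML distinguishes between
the terms "comment" and "comment declaration": a comment is the text
delimited by a pair of COMs, while a comment declaration is the whole
construct from MDO to MDC and may contain several comments.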

    <!--  SGML Comment Declarations                          --
      --  SGML comments must always
          be contained within a pair of COM delimiters,
          which in the reference concrete syntax are
          defined as a pair of dashes.
      --
      --  The *only* place where whitespace is significant:
                  between MDO and the first COM.             --
     >
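
The scanning rule described above can be sketched in a few lines of Python.
This is only an illustration, assuming the reference concrete syntax
delimiters; the function name parse_comment_declaration is hypothetical, and
the sketch requires at least one comment in the declaration.

```python
MDO = "<!"   # markup declaration open
COM = "--"   # comment delimiter
MDC = ">"    # markup declaration close
WHITESPACE = " \t\r\n"


def parse_comment_declaration(text, start=0):
    """Hypothetical helper: parse the comment declaration at text[start:].

    Returns (comments, end), where comments is the list of comment bodies
    and end is the index just past the MDC. Raises ValueError otherwise.
    """
    if not text.startswith(MDO, start):
        raise ValueError("expected MDO")
    pos = start + len(MDO)
    # The first COM must follow MDO directly, with no whitespace between.
    if not text.startswith(COM, pos):
        raise ValueError("COM must follow MDO immediately")
    comments = []
    while text.startswith(COM, pos):
        # Scan forward for the next COM, not for "-->".
        close = text.find(COM, pos + len(COM))
        if close == -1:
            raise ValueError("unterminated comment")
        comments.append(text[pos + len(COM):close])
        pos = close + len(COM)
        # Whitespace may follow each comment, including before the MDC.
        while pos < len(text) and text[pos] in WHITESPACE:
            pos += 1
    if not text.startswith(MDC, pos):
        raise ValueError("expected COM or MDC")
    return comments, pos + len(MDC)
```

Given the input "<!-- a -- > rest -->", this sketch ends the declaration at
the first ">" after the closing COM, whereas a search for "-->" would run on
to the end of the string.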

Dan's message also reiterates the associated text of RFC 1866. I hope this
covers the subject well enough...

--------------

[DocBook] I maintain some browseable DTDs at
    http://www.cambridge.spyglass.com/doc/

PS. Note that we're undergoing domain name changes here, going from
    "www.stonehand.com" to "www.cambridge.spyglass.com", including new
    firewall and DNS, so if you can't hit this over the weekend, try
    again early next week.

Murray

```````````````````````````````````````````````````````````````````````````````
     Murray Altheim, Program Manager
     Spyglass, Inc., Cambridge, Massachusetts
     email: <mailto:murray@spyglass.com>
     http:  <http://www.cambridge.spyglass.com/murray/murray.html>
            "Give a monkey the tools and he'll eventually build a typewriter."

Received on Saturday, 21 September 1996 14:46:25 UTC