Re: Shorthand for default attributes (was: Re: Whitespace) from Alex Milowski on 1997-05-13 (w3c-sgml-wg@w3.org from May 1997)

From: Alex Milowski <lex@www.copsol.com>
Date: Tue, 13 May 1997 09:55:46 -0500 (CDT)
To: w3c-sgml-wg@w3.org
Message-Id: <199705131455.JAA03141@copsol.com>
Bert Bos writes:
 
<snip>

> 
> That's the problem: the spec doesn't say so. I think it should be
> added, because without it the risk is too great that different parsers
> arrive at different results. Implementers that know about SGML may
> implement it differently from those that don't.
> 
> The problem is that the latest draft still allows documents that
> cannot be read without reading the DTD first. The purpose of a DTD is
> to constrain the syntax, not to change its interpretation. (This is
> also what the draft says, in 2.9). I thought that one of the main
> reasons for starting the XML effort was to get rid of this double
> (triple?) function of DTDs that has plagued SGML so much:
> 
>    1. Limit what is allowed in a document, such as what elements can
>       occur where. DTDs are not very good at this, but defining a
>       better syntax is not currently on the agenda. So let's stick
>       with DTDs for now. This function is needed by generators.
> 
>    2. Change interpretation of characters in the document. We don't
>       want this, and most of it has been removed by XML already
>       (shortrefs, datatags, RS/RE interpretation,...). Only whitespace
>       handling hasn't been resolved yet. The keep-all-whitespace rule
>       that XML seems to be converging on is not my first choice, but
>       at least it makes whitespace handling independent of the
>       DTD.
> 
>    3. Provide macro-like capabilities (entities and default
>       attributes). To replace entities we already have XML-link, but
>       we don't have anything for default attributes yet. Since it is
>       not only generators that need this, it has to be an independent
>       syntax, outside the DTD.
> 
> I would argue for the return of the <?xml default...?>, or some
> similar macro mechanism. The syntax around ATTLISTs is too complex,
> and requiring people to include all attributes exhaustively is too
> cruel and also error prone. Moreover, neither option is compatible
> with the stated design goals for XML (especially 4, 5, 6 and 9). A
> simple macro mechanism (with scope, please) seems an ideal solution.
> 
> I'd like to do this:
> 
>     <DOC>
>     Start of a long document...
> 
>     <DIV>
>     <?XML DEFAULT P  CLASS="NOTE" SECT="1"?>
>     <P>In this section, all Ps have the same attributes...</P>
>     <P>...</P>
>     ...
>     </DIV>
> 
>     <DIV>
>     <?XML DEFAULT P  CLASS="DEF" SECT="2"?>
>     <P>In this section, all Ps also have the same attributes...</P>
>     <P>But different ones from the previous section...</P>
>     ...
>     </DIV>
> 
>     ... etc.
>     </DOC>

Hmmm, looks similar to #CURRENT which is in SGML but not in XML.

I have never found a *really good* reason for #CURRENT.  It would seem that
#CURRENT is also a "shorthand" for a container.  

In the above case, couldn't the 'SECT' attribute be specified on the DIV
element?  In this way, only those paragraphs that needed an "override"
would have to specify the attribute value.

For example, say I have an attribute called 'LEVEL' which represents the
experience level of the reader.  I may have a section that is for an
intermediate user but one note in the section is pertinent to the expert
user.  Hence, I might have:

<SECTION LEVEL=INTERMEDIATE><NAME>How to Shutdown Solaris 2.x</NAME>
<P>...blah blah blah... 
<NOTE LEVEL=EXPERT><P>Rebooting with '-r' can be dangerous in certain
situations, use the following procedure:
</NOTE>
</SECTION>

Now, the semantics of how the LEVEL attribute is used on elements that take
the default for LEVEL (most likely a #IMPLIED attribute) are left up to the
using application.  In this way, direct semantics of how attributes are
applied are not encoded when they do not have to be.  The containment
relationship of SECTION to its children allows the LEVEL attribute of SECTION
to govern the children without forcing "funky" processing with processing
instructions.
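
A minimal sketch of how a using application might apply that containment
rule, assuming the example is rewritten as well-formed XML (quoted attribute
values, explicit end tags) and that an element without LEVEL simply takes the
value of its nearest ancestor.  The function name effective_levels is
hypothetical, and the fallback policy is one possible application choice, not
something the XML draft defines.

```python
import xml.etree.ElementTree as ET

# Well-formed XML version of the SECTION example; text content is abbreviated.
DOC = """
<SECTION LEVEL="INTERMEDIATE"><NAME>How to Shutdown Solaris 2.x</NAME>
<P>...blah blah blah...</P>
<NOTE LEVEL="EXPERT"><P>Rebooting with '-r' can be dangerous...</P></NOTE>
</SECTION>
"""


def effective_levels(element, inherited=None):
    """Yield (tag, level) pairs in document order.

    An element that does not specify LEVEL falls back to the value
    inherited from its nearest ancestor (an assumed application policy).
    """
    level = element.get("LEVEL", inherited)
    yield element.tag, level
    for child in element:
        yield from effective_levels(child, level)


for tag, level in effective_levels(ET.fromstring(DOC)):
    print(tag, level)
```

The parser only reports the attributes as written; the inheritance lives
entirely in the application, so no processing instruction is needed.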

We want parsing to be easy in XML--not harder because of special cases.  I
believe this is one of the reasons why #CURRENT was not included (among other 
reasons that we don't need to go over again).

> Btw. Do we need the "xml" string at the start of the PI? I don't see
> why we need PIs at all. The only thing we need is three distinct
> syntaxes that look neither like data nor like tags and that can be
> used to specify the encoding, the default attributes, and the XML
> version number. How about <?encoding utf8>, <?default...>, and
> <?xml1.0>.

I like the idea of having the XML at the start of the PI.  It gives a
processing instruction a pseudo-namespace.  I could very easily design an
application that "needs" <?default something> for some other reason, and
then I would have a conflict with the XML standard.
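
The collision can be sketched with a toy registry of PI targets.  Everything
here (the claims table, the claim function, the owner names) is hypothetical
and only illustrates the naming argument; it is not part of any XML draft or
library.

```python
# Hypothetical registry: PI target name -> whoever uses that target.
claims = {}


def claim(target, owner):
    """Record that `owner` uses PI `target`; refuse if someone else does."""
    current = claims.setdefault(target, owner)
    if current != owner:
        raise ValueError(f"<?{target} ...?> is already claimed by {current}")


# Without a prefix, the standard and an application both want <?default ...?>.
claim("default", "XML standard")
try:
    claim("default", "my application")
except ValueError as err:
    print("conflict:", err)

# With the prefix, the standard keeps everything under the "xml" target,
# so the application is free to use "default" for its own purposes.
claims.clear()
claim("xml", "XML standard")
claim("default", "my application")
print(sorted(claims))
```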

Generally, I avoid processing instructions as much as possible.  ...but, that
is my world, and thus, my opinion.  ;-)

==============================================================================
R. Alexander Milowski     http://www.copsol.com/   alex@copsol.com
Copernican Solutions Incorporated                  (612) 379 - 3608
Received on Tuesday, 13 May 1997 10:57:18 UTC