SD2 - Structured Attributes:

Jean Paoli wrote:

<q>
     Proposal: I'm going to say something radical here: We should invent
     a way to add structure and attributes to attributes.  A special
     character sequence "<*" in an opening tag (the exact characters
     used are to-be-determined) signals that the contents are an attribute,
     not a subdivision.
</q>

(I would like to see the remainder of the "SD2 - Structured Attributes:"
document, which appears truncated.)

Jean is not the only one who thinks (apparently?) that SGML's notion
of "attribute" is deficient.  I have heard people saying this for a
decade, and I have hit the wall with SGML attributes many times
myself.  I sense that others in the XML working group may not have the
inclination to address this topic head-on in the context of XML's
"Structured Data" design discussion, but I think it deserves to be
considered at some point within the larger framework of the SGML
revision.

To me, it makes sense to say that the markup model for an element
(representing a [real world] Object) is plausibly the same model as
for an attribute (representing a [real world] attribute) *ONLY* if we
claim that there are no significant differences between real world
"objects" and "attributes" which are worthy of being modeled in our
information representation metalanguages.  That is: one might claim
that an XML element "content model" definition can be the same as an
XML "attribute definition" because there are no fundamental (and
significant) differences between objects and attributes.  Is this so?
In this respect, it's not clear to me that the (common) strategy of
privately declaring some GI namespace for (real) "attributes" masquerading
as "elements", or a reserved instance markup syntax like <*color>...
is the correct solution.  If objects and attributes have significantly
different structure, then these conceptual model
differences should be reflected in the metalanguage.  Objects do not
"contain" attributes, for starters.

As I understand the problem, it shows up most clearly in
situations where one wishes to model complex structured information
for which semantic relations are critical to the conceptual model.
That is, the XML "solution" illustrated in Liam Quin's note is less
than ideal because it surrenders SGML/XML validation of the
information itself.  Lee said:


> --------------------------------------------------
  Consider
        <percy>
          <attribute>
            <name>Socks</name>
            <value>
                <colour>red</colour>
                <material>cotton</material>
            </value>
          </attribute>
          <content>
            whatever you want
          </content>
        </percy>
> --------------------------------------------------

and of course this works, so long as one does not care that we
might countenance an instance with:

          <colour>fermented warm arsenic</colour>.
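
To make the gap concrete, here is a minimal sketch in Python (standard
library only).  The element names come from Lee's example, with the
closing </material> tag written out; the ALLOWED_COLOURS set and the
invalid_colours function are hypothetical, introduced only for
illustration.  The parse itself accepts any string as the content of
<colour>; the constraint has to live in application code, outside the
markup model.

```python
import xml.etree.ElementTree as ET

# Hypothetical value set; nothing in the original example defines one.
ALLOWED_COLOURS = {"red", "green", "blue"}

INSTANCE = """
<percy>
  <attribute>
    <name>Socks</name>
    <value>
      <colour>fermented warm arsenic</colour>
      <material>cotton</material>
    </value>
  </attribute>
  <content>whatever you want</content>
</percy>
"""


def invalid_colours(xml_text):
    """Return the text of every <colour> element not in ALLOWED_COLOURS."""
    root = ET.fromstring(xml_text)  # succeeds: content is just character data
    rejected = []
    for colour in root.iter("colour"):
        text = (colour.text or "").strip()
        if text not in ALLOWED_COLOURS:
            rejected.append(text)
    return rejected


if __name__ == "__main__":
    print(invalid_colours(INSTANCE))
```

The point of the sketch is where the check sits: in a function the
application author had to write, not in anything the element
declarations could express.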

A common response ("let's say it all together, now, class: ...") is
the reminder that SGML intends *not* to provide any facility for the
expression of primitive relational semantics or data types, noting
that even the 16 or so "declared values" for SGML attributes are
an anomaly in this regard.  Fine.  But in the real world modeling of
information, semantics is pretty important.  The ID - IDREF mechanism
is so weak semantically as to be almost useless, but not completely
so when one contemplates the alternative:

       <section><id>sec1</id>...
       ... 
       <xref><target>sec1</target>See Section One</xref>
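
As a rough illustration of what the element-only encoding gives up: a
validating parser checks that every IDREF value matches some ID in the
document, whereas with <id> and <target> as ordinary elements that
check must be written by hand.  The sketch below assumes Python's
standard library; the dangling_targets function, the <doc> wrapper,
and the second <xref> are hypothetical additions so that the fragment
parses and has something to report.

```python
import xml.etree.ElementTree as ET

DOC = """
<doc>
  <section><id>sec1</id>Section One text</section>
  <xref><target>sec1</target>See Section One</xref>
  <xref><target>sec9</target>See Section Nine</xref>
</doc>
"""


def dangling_targets(xml_text):
    """Return every <target> value that matches no <id> in the document."""
    root = ET.fromstring(xml_text)
    # Collect the identifiers that were declared as element content.
    ids = {(e.text or "").strip() for e in root.iter("id")}
    dangling = []
    for target in root.iter("target"):
        value = (target.text or "").strip()
        if value not in ids:
            dangling.append(value)
    return dangling


if __name__ == "__main__":
    print(dangling_targets(DOC))
```

Even this weak form of referential checking comes for free with
ID - IDREF attributes and has to be reinvented per application without
them.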

I can't prove that Jean's concern is relevant for XML if we say that
XML is primarily for representing "displayable (surface) text" that is
delivered over the Internet.  I think it is relevant if we think of XML
as a "better SGML" in terms of its ability to represent information
(structure).  I always thought that "information" was what SGML was all
about.  A document conceived as "text characters to be displayed" may
be able to embrace a model of "attribute" that is as weak as SGML's
model is; a document conceived as an abstract (but projected) object
having a rich and multidimensional information structure may require
a more complex notion of "attribute," and ideally, one that is defined
clearly enough to support validation on attribute values.

Tilting at windmills again, probably.  If I could ever forge these
ideas more cogently into a paper, it might be entitled:

                    "Strings are not enough."

I hope others agree, even if they think XML is not the time or place
to do something about it.

**Apologies if I have misunderstood or misrepresented Jean's concern
for attribute structure (complex attributes).

-Robin

Reference:

Alexander Borgida, "Features Of Languages For The Development
   Of Information Systems At The Conceptual Level," IEEE Software
   2/1 (January 1985) 63-72 (with 17 references).  See further the
   <a href="http://www.sil.org/sgml/bib-ab.html#borgidaCML">Borgida</a>
   bibliographic entry for other details, and links to an online version
   of the article.

-------------------------------------------------------------------------
Robin Cover                    Email: robin@acadcomp.sil.org
6634 Sarah Drive           
Dallas, TX  75236  USA            >>> The SGML Web Page <<<
Tel: +1 (972) 296-1783 (h)     http://www.sil.org/sgml/sgml.html
Tel: +1 (972) 708-7346 (w)
FAX: +1 (972) 708-7380
=========================================================================

Received on Friday, 16 May 1997 14:02:28 UTC