Re: Distinguishing Attributes and Content from Matthew Fuchs on 1997-05-20 (w3c-sgml-wg@w3.org from May 1997)

From: Matthew Fuchs <matt@wdi.disney.com>
Date: Tue, 20 May 1997 13:48:32 -0700
To: Paul Prescod <papresco@calum.csclub.uwaterloo.ca>, w3c-sgml-wg@w3.org
Message-Id: <9705201348.ZM21940@scrumpox.rd.wdi.disney.com>

On May 20,  1:43pm, Paul Prescod wrote:
> Subject: Re: Distinguishing Attributes and Content
> > A little intellectual sleight of hand could be useful here.  It is easy
> > enough to look at PCDATA as an empty element with a single _data_
> > attribute, i.e.
> > <element-with-data>Here's some data</element-with-data> ==
> > <element-with-data><pcdata data = "Here's some data"/></element-with-data>
> > This makes the document data attributes of the pcdata elements.  If the
> > parent element doesn't have mixed content we can hoist the content:
> > <element-with-data data = "Here's some data"/>.
> >
> > The implication of this little trick is that data and attributes are
> > leaves of the document tree and therefore shouldn't have any structure
> > (data being just a privileged attribute).
>
> I see some (small? you decide) problems:
>
I called it "intellectual sleight of hand" because I didn't want to be taken
as implying that this should guide implementations - it was only meant to
examine one particular question: should attributes have structure?
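
For anyone who wants to see the trick spelled out, here is a minimal sketch in
Python, using the standard xml.etree.ElementTree module as a stand-in for a
grove.  The element name "pcdata" and the attribute name "data" are the ones
from the example above; the function names text_to_pcdata and hoist are
hypothetical, and the sketch covers only the non-mixed case.

```python
import xml.etree.ElementTree as ET


def text_to_pcdata(elem):
    """Non-mixed case only: move character data into an empty <pcdata data="..."/> child."""
    if len(elem) != 0:
        raise ValueError("this sketch only handles elements without child elements")
    out = ET.Element(elem.tag, dict(elem.attrib))
    if elem.text:
        ET.SubElement(out, "pcdata", {"data": elem.text})
    return out


def hoist(elem):
    """If the sole child is a <pcdata>, lift its data attribute onto the parent."""
    out = ET.Element(elem.tag, dict(elem.attrib))
    kids = list(elem)
    if len(kids) == 1 and kids[0].tag == "pcdata":
        out.set("data", kids[0].get("data", ""))
    else:
        # Mixed content (or no content): leave the element as it is.
        out.text = elem.text
        out.extend(kids)
    return out


src = ET.fromstring("<element-with-data>Here's some data</element-with-data>")
step = text_to_pcdata(src)   # character data now sits in a <pcdata> leaf
flat = hoist(step)           # data attribute now sits on the parent itself
```

In each of the three forms the character data ends up as an unstructured
string at a leaf, which is the only point the trick is meant to make.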

> #1. Aren't empty elements also leaves? If all leaves are attributes, then
>     empty elements (e.g. <BR>) are "really" attributes.
>

Other way around - all attributes are leaves, not all leaves are attributes.

> #2. As we discussed in the fall, if you think of character data as an
>     element, you add an extra level to the grove. element-with-data has
>     <pcdata> as a child which has "data" as an attribute. The mythical
>     <pcdata> element is a peer to elements we would consider "embedded" in
>     the text (e.g. a <BR>). But the "traditional" SGML grove has each
>     character as a peer node with elements buried in it.
>

You are generalizing beyond my intent, but it's fair game.  The set of groves
is certainly bigger than the set of valid SGML groves, which in turn is bigger
than the set of valid XML groves.  All I am suggesting is that the set of valid
XML groves be restricted to those with certain characteristics - actual XML
groves wouldn't need the extra layer, but they would need to be in bijection
with the set of groves that have it.
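
A rough sketch of that correspondence, again in Python with
xml.etree.ElementTree standing in for a grove (it stores character data as
text/tail strings rather than as per-character nodes, so this is only an
approximation).  The names add_layer and remove_layer are hypothetical.  The
round trip is the identity as long as no real element is itself named
"pcdata", which this sketch simply assumes.

```python
import xml.etree.ElementTree as ET


def add_layer(elem):
    """Turn every run of character data into an empty <pcdata data="..."/> sibling."""
    out = ET.Element(elem.tag, dict(elem.attrib))
    if elem.text:
        ET.SubElement(out, "pcdata", {"data": elem.text})
    for child in elem:
        out.append(add_layer(child))
        if child.tail:
            ET.SubElement(out, "pcdata", {"data": child.tail})
    return out


def remove_layer(elem):
    """Inverse mapping: fold each <pcdata> leaf back into ordinary character data."""
    out = ET.Element(elem.tag, dict(elem.attrib))
    last = None  # most recent real (non-pcdata) child appended to out
    for child in elem:
        if child.tag == "pcdata":
            data = child.get("data", "")
            if last is None:
                out.text = (out.text or "") + data
            else:
                last.tail = (last.tail or "") + data
        else:
            last = remove_layer(child)
            out.append(last)
    return out


doc = ET.fromstring("<p>one<br/>two</p>")
layered = add_layer(doc)            # children: pcdata, br, pcdata
restored = remove_layer(layered)    # same tree as doc again
```

Note that in the layered form the <pcdata> leaves sit beside <br> as peers,
which is the extra level described in #2; the plain form never has to
materialize it.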

> #3. "data" isn't a privileged attribute: it's an unprivileged one. Why
>     can't I express an attribute type for data? Why can't I express a
>     default value for it?
>

Well, why not?  Only because we've arranged the language, lexically, to behave
a certain way.  If you actually placed your data in attributes, then you would
be able to do these things.

> If you turn the problem around, the language should STILL be more regular in
> its treatment of attributes and data.
>
Agreed.

> *But* I **do not think that this should be corrected in XML**. We don't have
> time, we don't have all of the right people here to discuss it, we can't
> create something incompatible with SGML, etc.  There will be an XML 2.0.
>

Agreed again.  Note that I was arguing _against_ a language change.

Matthew Fuchs
matt@wdi.disney.com

--

Received on Tuesday, 20 May 1997 16:46:47 UTC