Re: Link-2: Pseudo-elements from Michael Sperberg-McQueen on 1997-05-22 (w3c-sgml-wg@w3.org from May 1997)

From: Michael Sperberg-McQueen <U35395@UICVM.UIC.EDU>
Date: Thu, 22 May 97 16:11:23 CDT
To: W3C SGML Working Group <w3c-sgml-wg@w3.org>
Message-Id: <199705222155.RAA23920@www10.w3.org>
On Sun, 18 May 1997 05:38:50 -0400 (EDT) Tim Bray said:
>What does CHILD(N) mean in mixed content?  Counting pseudo-elements
>is icky to start with, but with our shakiness as to white space in
>element content, it's even shakier.  James has suggested just
>bagging the whole pseudo-element handling thing.  Comments?

If I understand the postings I've seen on this topic so far, the
argument for counting pseudo-elements is that this is currently the only
method we have for pointing at strings within the document that are not
themselves elements or bounded by elements.

The arguments against are that (1) the XML white-space rules mean
different processors may provide different child counts, which is surely
a deal-breaker, and (2) pseudo-elements don't actually buy us anything,
since anything we can do with them we can do with other forms of
addressing within character data, if we have any, and if we don't have
any then pseudo-elements only help when what we want to address happens
to be coextensive with a pseudo-element.

I think James Clark is right that pseudo-elements by themselves don't
get us much; their main advantage in my view is that they allow the
pointer to get as close to a string as possible before starting to count
characters, bytes, or tokens.  If anyone thinks it's obvious what effect
element boundaries should have on any of these counts, I suggest they
try discussing it in a group of three.  The alleged complications and
stickiness of pseudo-elements are child's play compared to deciding how
to count characters across subelements.

Because they help make it possible to minimize the scope for error in
byte/character/token counts, and because they seem to me intuitively
necessary, I think pseudo-elements make a lot of sense in counting
children in the Full-SGML context.  After the adoption and
implementation of the 8879 TC, when Full-SGML parsers have the capacity
to pass *all* white space (including the white space in element content,
unless this got changed in Barcelona), the variability Peter Murray-Rust
points out can go away, and XML can require all processors to pass all
white space, even in element content.

But I don't think it would be the end of the world if XML 1.0 did
not count pseudo-elements as children.

Not mentioned so far is the fact that the base specification we're
working from defines CHILD the other way:  pseudo-elements do count, in
CHILD and other keywords of TEI extended pointers.

Agreement with the TEI specification is not as crucial to most
participants in the work group as agreement with 8879, but it *is*
important to some of us.  At the very least, it would be a friendly
gesture to the TEI if we could find a way to avoid the name conflict
that will arise if TEI pointers use CHILD in one way and XML pointers
use it in a different way.  The editors of the TEI will certainly ask
the TEI's technical review committee to consider changing some parts of
the TEI extended pointer notation to agree with XML's changes -- but I'd
be grateful not to be forced into choosing between harmonization with
XML and compatibility with the original TEI spec, particularly since
the only strong argument for dropping pseudo-elements -- that XML's
white-space rules allow different processors to get different counts --
does not apply in the Full-SGML context.

-C. M. Sperberg-McQueen
Received on Thursday, 22 May 1997 17:55:02 UTC