- From: Steven J. DeRose <sjd@eps.inso.com>
- Date: Wed, 11 Jun 1997 13:11:39 -0400
- To: w3c-sgml-wg@w3.org
(sorry this took a while to get through -- phoneline probs while out of country)

The ERB met last week with Bosak, Bray, Clark, Connolly, DeRose, Hollander, Kimber, Magliery, Maler, Paoli, Sperberg-McQueen and Wood present. We set as agenda resolving several summary questions under discussion.

Link decisions: 1. syntax

None of the 9 syntactic options in question for distinguishing "HERE" as the EPN keyword from "HERE" as an ID in a URL found enthusiastic support, though several seemed acceptable. James introduced a tenth suggestion, namely to require an empty parameter list after those EPN keywords that do not already require them. For example:

   HERE      is an ID
   HERE()    is an extended pointer

This met with immediate enthusiasm for several reasons, including that it increases the consistency of EPN itself and appears to be easy to teach/learn/implement/document. This option was unanimously approved.

Link decisions: 2. Pseudo-elements

Discussion involved several aspects of the problem of whether to count pseudo-elements.

<NOTE TYPE="terminological">
A pseudo-element is a portion of #PCDATA content uninterrupted by markup. A "real" element, in contrast, is one that has a GI. "Subelement addressing" involves addresses like "the third word".
</NOTE>

Whitespace relates to both pseudo-element and sub-element addressing. We tabled the pseudo-element issue to discuss how the SGML TC changes re. whitespace relate. The result was RE deleta est as reported already by Michael.

Returning to the pseudo-element question, we noted that the removal of ambiguity about the *presence* of whitespace removes ambiguity in *how* to count pseudo-elements (though not about *whether* to). The great cost of not counting pseudo-elements is that then you cannot address them.
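
To make the counting question concrete, here is a minimal sketch in Python. It is an illustration only, not the ERB's data model: the function names `children` and `child` and the sample P element are hypothetical, and the text/tail model of Python's xml.etree.ElementTree stands in for whatever structure representation is eventually adopted.

```python
import xml.etree.ElementTree as ET


def children(parent, count_pseudo=False):
    """Return the child nodes of `parent` in document order.

    Real elements come back as Element objects. Pseudo-elements (runs of
    character data uninterrupted by markup) come back as plain strings,
    and are included only when count_pseudo is True.
    """
    nodes = []
    # Character data before the first child element.
    if count_pseudo and parent.text:
        nodes.append(parent.text)
    for elem in parent:
        nodes.append(elem)
        # Character data between this child and the next piece of markup.
        if count_pseudo and elem.tail:
            nodes.append(elem.tail)
    return nodes


def child(parent, n, count_pseudo=False):
    """Return the n-th child (1-based), or None if there is no such child."""
    nodes = children(parent, count_pseudo)
    return nodes[n - 1] if 1 <= n <= len(nodes) else None


# Hypothetical sample: two real subelements and three pseudo-elements.
p = ET.fromstring("<P>See <EM>this</EM> and <EM>that</EM> today.</P>")

second_real = child(p, 2)                      # the second EM element
third_real = child(p, 3)                       # None: only two real subelements
third_any = child(p, 3, count_pseudo=True)     # the text run between the EMs
```

When only real elements are counted, no child index reaches the text runs in P at all, which is the cost described above.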
It was pointed out that *if* you do still allow sub-element addressing (such as character offsets into #PCDATA), you can get at pseudo-elements that way, but that character counting across markup boundaries is itself complex and relatively fragile. It also imposes a subtle incompatibility with HyTime and with TEI pointers (and not just for CHILD, but for several other keywords including complex cases such as PRECEDING and DESCENDENT).

After much discussion the ERB is leaning toward a proposal under which both options are available to the user, distinguished by the GI parameter. This has not been voted on, but seems at this time to be the best compromise. Thus:

   CHILD (3)     locates the 3rd real subelement
   CHILD (3 *)   locates the 3rd real subelement
   CHILD (3 !)   locates the 3rd real or pseudo subelement

(the particular reserved value to flag the last case is to be determined; "!" is merely for illustration).

It was also suggested that pseudo-elements consisting *only* of whitespace not be counted. This may enhance intuitiveness and compatibility with SGML systems that do not yet support the TC.

This proposal will be presented to the WG for discussion.

Link decisions: 3. Sets & singletons

Discussion here centered on our relationship to the DOM work, since both require an explicit definition of what the document structure representation is before we can give a complete formal specification of what information is in fact referenced by a locator, particularly in the more complex cases, where the destination is not a single element. Lauren will be coordinating this liaison effort, and will seek to present a first cut proposal for a DOM/XML data schema (or grove plan) by July 1. Michael and James will be contributing to this effort.

As for locating spans, there are complexities because a span is not generally representable as a set, list, or tree of elements.
The span from the 2nd to the 4th P within SEC ID=SEC3 can be; but the span from the last word of one P through the 4th word of the next P is not. Neither including nor excluding the P's involved, or their common ancestor, fully represents the link: All those elements are *partly* included in the resource.

The end proposal was to include spans in the location syntax, specified as a start/end pair, with the meaning defined in the same manner as in TEI: as a reference to the included range. At the same time, we will acknowledge that the precise details are not yet specified, and that we expect that to be accomplished via the DOM effort, with which we are working. This was approved, with James and Dave dissenting.

Steven J. DeRose, Ph.D., Chief Scientist
Inso Electronic Publishing Solutions (formerly EBT)
Received on Wednesday, 11 June 1997 13:15:10 UTC