
addressing into char content with xml-link

From: Paul Grosso <paul@arbortext.com>
Date: Thu, 10 Apr 1997 20:04:57 -0700
Message-Id: <3.0.32.19970410164725.006b6e50@pophost.arbortext.com>
To: w3c-sgml-wg@w3.org

In discussions with others over the last couple of days, I've come
to the conclusion that we should consider adding to xml-link the capability
to address into data character content (aka dataloc).

The requirement I see is that users will expect an interface that
allows them to highlight some text in one document, highlight some
text in a second document, and make a link from one to the other.
If the target is a three word phrase in the middle of a very long
paragraph element, making the entire paragraph the target is unacceptable.
(Imagine an application in which a reviewer of a document is
pointing out misspelled words--targeting the entire paragraph is unacceptable.)

I understand the difficulties in counting, and I understand the desire
to avoid specifying a grove plan in the XML spec, but I think we need
to try something.

Considering the 970331 lang spec and the 970406 link spec, what follows
is a concrete suggestion to start things off (numbers in brackets are
production numbers in the indicated spec).

In xml-link[13], add to the or group that defines "Element" something
like "*CHAR" or "*ATOM" to indicate that the Instance indication [12] 
is referring to data content atoms such as characters.  (I see no reason
to worry about what it means to have Attr and Val on *CHAR since we didn't
worry about it on *CDATA.)  The meaning of the Instance indication when
applied to *CHAR would be the obvious one, except for the specifics of what
to count as a unit.  On that point, I'd suggest the following (production
numbers below all refer to xml-lang).

Each occurrence of each of the following shall be counted as one unit
for the purposes of the *CHAR addressing:

comment [17]
PI [18]
CDStart [20]
CDEnd [22]
CharRef [59]
EntityRef [61] 
STag [31]
ETag [34]
EmptyElement [37]
Char [2]

Note that Char != byte, but if we can expect the XML processor to know what
Char is when it's parsing an XML file, I figure we can expect it to know
what a Char is when it's addressing into an XML file.
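
As a rough illustration of the counting rule above, here is a minimal
Python sketch that walks a string of element content and yields one unit
per item in the list. The function name `units` and the regular
expressions are assumptions made for this example; the patterns only
approximate the xml-lang productions (for instance, a ">" inside an
attribute value would confuse the tag patterns). Inside a CDATA section
nothing but CDEnd is recognized, so everything else there counts as Char.

```python
import re

# Approximate patterns for the listed productions (NOT the real grammar).
# EmptyElement must be tried before STag, since both start the same way.
_UNIT = re.compile(
    r"(?P<comment><!--.*?-->)"
    r"|(?P<PI><\?.*?\?>)"
    r"|(?P<CDStart><!\[CDATA\[)"
    r"|(?P<CharRef>&#(?:[0-9]+|x[0-9a-fA-F]+);)"
    r"|(?P<EntityRef>&[A-Za-z_:][\w.:-]*;)"
    r"|(?P<ETag></[^<>]+>)"
    r"|(?P<EmptyElement><[^<>/!?][^<>]*/>)"
    r"|(?P<STag><[^<>/!?][^<>]*>)"
    r"|(?P<Char>.)",
    re.DOTALL,
)


def units(content):
    """Yield (kind, text) for each counting unit in `content`."""
    pos = 0
    in_cdata = False
    while pos < len(content):
        if in_cdata:
            # Within a CDATA section only the closing "]]>" is markup.
            if content.startswith("]]>", pos):
                yield ("CDEnd", "]]>")
                pos += 3
                in_cdata = False
            else:
                yield ("Char", content[pos])
                pos += 1
            continue
        # The final Char alternative guarantees a match at every position.
        m = _UNIT.match(content, pos)
        yield (m.lastgroup, m.group())
        pos = m.end()
        in_cdata = m.lastgroup == "CDStart"


if __name__ == "__main__":
    sample = "a<b>c&amp;d</b><!-- x -->\u00e9"
    for n, (kind, text) in enumerate(units(sample), start=1):
        print(n, kind, repr(text))
```

Python strings are indexed by character rather than by byte, so the
accented letter at the end of the sample counts as a single Char unit even
though it occupies two bytes in UTF-8, which is the Char != byte
distinction noted above.
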
Received on Thursday, 10 April 1997 23:08:14 EDT
