Re: Client-side highlighting; tag proposal

Joe English (jenglish@crl.com)
Tue, 14 Mar 1995 20:04:19 -0800


Message-Id: <199503150404.AA09597@mail.crl.com>
To: Multiple recipients of list <www-html@www10.w3.org>
Subject: Re: Client-side highlighting; tag proposal 
In-Reply-To: <199503142038.PAA02876@ebt-inc.ebt.com> 
Date: Tue, 14 Mar 1995 20:04:19 -0800
From: Joe English <jenglish@crl.com>


sjd@ebt.com (Steven J. DeRose) wrote:

> At  1:35 AM 3/14/95 +0500, Joe English wrote:
> >This is not much of an issue for HTML documents on the Web,
> >since they tend to be small and are rendered as a single unit 
> >anyway.  It's not like a browser is going to display the book of 
> >Leviticus and have to worry about a marked region starting in Exodus 
> >and ending in Deuteronomy.
> 
> On the contrary, that is *exactly* the problem. I do have Leviticus on a
> web site, and although my server is kind enough to break it into net-size
> chunks if/when asked, I sure do have to know whether there is some
> long-distance thing in effect, otherwise we can't know to send whatever
> start-tag caused it when sending a smaller piece. 

The *browser* only needs to worry about the net-sized chunks though.  

Would HyTime spanlocs (or equivalent) help the server in a case like this?
That is, would it be any easier for the server to recognize that
Leviticus is in the middle of a marked range if the range
were identified by an external locator instead of by embedded
<MARK> elements?  (Or <SPOT> elements, which I'm starting to 
like better now.)  Maybe it would -- there's an obvious optimization 
for the case where both locators are treelocs, but in the general 
case I really don't see how it would help.
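
To make the treeloc case concrete, here is a rough Python sketch of one
reading of that optimization.  It assumes a treeloc is just the list of
1-based child positions from the root down to a node, so that ordinary
lexicographic comparison gives document order.  The function name and
the Pentateuch numbering are made up for illustration; this is not
anything a HyTime engine is known to do.

```python
def in_marked_range(node, start, end):
    """Return True if `node` begins at or after `start` and lies no
    later than the end of the subtree rooted at `end`.

    All three arguments are treeloc-style paths (sequences of 1-based
    child indexes from the root).  A node that merely *contains* the
    start of the range (an ancestor of `start`) is reported as False.
    """
    node, start, end = tuple(node), tuple(start), tuple(end)
    # Tuple comparison is lexicographic, which matches preorder
    # (document) order: ancestors sort before their descendants.
    begins_in_range = start <= node
    # Inside the range if it precedes `end`, or sits inside `end`'s subtree.
    ends_in_range = node <= end or node[:len(end)] == end
    return begins_in_range and ends_in_range


# Hypothetical layout: root = 1, books are its children in canonical order.
GENESIS, EXODUS, LEVITICUS, DEUTERONOMY = (1, 1), (1, 2), (1, 3), (1, 5)

leviticus_marked = in_marked_range(LEVITICUS, EXODUS, DEUTERONOMY)
genesis_marked = in_marked_range(GENESIS, EXODUS, DEUTERONOMY)
```

The check needs only the three paths, not the document, which is why
the treeloc case is the easy one.  Nothing comparable is available when
the locators are arbitrary queries.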

> >> Likewise, one cannot easily build a stack-based
> >>  formatter, e.g. that keys styles off the list of element types in one's
> >>  ancestry.
> >
> >This is only partly true, and irrelevant besides.
> >If the browser is going to include this functionality -- 
> >highlighting regions that may cross element boundaries -- 
> >it can't use ancestor-driven style resolution in any case,
> >regardless of how the regions are identified.
> 
> Your critique is incorrect. Existence proof: open a dynatext book, since
> dynatext does in fact use "ancestor-driven style resolution" for SGML.
> It quite happily supports "highlighting regions that may cross element
> boundaries" -- just do a drag-select or a phrase search and watch.

That's basically the point I was trying to make.
See if this makes sense:

> >> Likewise, one cannot easily build a stack-based
> >>  formatter, e.g. that keys styles off the list of element types in one's
> >>  ancestry.

[ my comment sneakily removed to make it look like Steven was
  answering the above and not me  --JE ]

> Your critique is incorrect. Existence proof: open a dynatext book, since
> dynatext does in fact use "ancestor-driven style resolution" for SGML.
> It quite happily supports "highlighting regions that may cross element
> boundaries" -- just do a drag-select or a phrase search and watch.


> >And lastly, you *can* use a single-pass parser with a stack-based 
> >formatter to keep track of marked spans.
> 
> Precisely my point:  you must do O(n), not O(lg n). Is that not unfortunate?

How do you format a document with n elements in O(lg n) time?

Once you've *parsed* the document, you can *display any piece of it*
in O(m * d) time (m being the size of the piece, d being the
depth of the element hierarchy -- not exactly O(lg n), but close enough).
This is true of the scheme I had in mind too; tracking the marked
regions during parsing is no more expensive than processing an 
LPD would be (that's cheap, BTW).  It boils down to doing
something like #POSTLINK processing during the parse; you don't
need to scan backwards during rendering. 
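
Roughly what I mean, as a Python sketch.  The event stream, the event
names ("mark_start", "mark_end") and the BIBLE/BOOK element names are
all invented for the example; a real parser would deliver something
different, and this is only an illustration of the bookkeeping, not a
formatter.

```python
def annotate(events):
    """Single pass over parser events.

    Keeps a stack of open element names (for ancestor-driven style
    lookup) and a set of currently open marked ranges.  Each text run
    is recorded together with a snapshot of both, so any piece can be
    rendered later without looking at anything that came before it.
    """
    stack = []      # open element names, outermost first
    active = set()  # ids of marked ranges that are currently open
    runs = []
    for kind, value in events:
        if kind == "start":
            stack.append(value)
        elif kind == "end":
            stack.pop()
        elif kind == "mark_start":
            active.add(value)
        elif kind == "mark_end":
            active.discard(value)
        elif kind == "text":
            runs.append((tuple(stack), frozenset(active), value))
    return runs


# A marked range that opens in Exodus and closes in Deuteronomy.
events = [
    ("start", "BIBLE"),
    ("start", "BOOK"), ("text", "Exodus, first part"),
    ("mark_start", "m1"), ("text", "Exodus, last part"), ("end", "BOOK"),
    ("start", "BOOK"), ("text", "Leviticus"), ("end", "BOOK"),
    ("start", "BOOK"), ("text", "Deuteronomy, first part"),
    ("mark_end", "m1"), ("text", "Deuteronomy, last part"), ("end", "BOOK"),
    ("end", "BIBLE"),
]
runs = annotate(events)
highlighted = [bool(marks) for _, marks, _ in runs]
```

The Leviticus run carries the open range "m1" along with its ancestry,
so a server handing out just that chunk already knows it sits inside a
marked region.  The extra work per event is a set insert or delete.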

I agree with the rest of your points (which I've deleted); HyTime or
HyTime-like locators would be a better approach than <MARK> or
<SPOT> elements.  I *don't* agree that <MARK> should be
dropped from HTML 3 just because HyTime could do it better, though, any
more than <P align=center> should be eliminated once stylesheets come
along.  <MARK> has the distinct advantage of simplicity, for both
browsers and search agents.  The arguments against it are IMO not valid.

I also agree that <MARK>-like elements have no place in 
DTDs intended for large-scale authoring.  HTML is not such
a DTD, though.  It's for lightweight presentation and delivery,
and <MARK> is a good lightweight mechanism.

I also also agree that universal support for HyTime in Web browsers 
would be great.  I have serious doubts that it will happen 
any time soon though; the most popular browsers around still don't
support entity declarations or marked sections, and don't even get 
attribute value literals right.  (If the release of Panorama
causes a mass migration away from Netscape, I'll reconsider 
this one, but until that happens I have little hope that HyTime
is a viable solution for the Web as a whole.)


--Joe English

  jenglish@crl.com