Re: Client-side highlighting; tag proposal

Nick Arnett writes:

> A couple of times lately, I've brought up the notion that clients should
> handle highlights (the terms that match a search query) better.  It's
> rather inefficient to force the search server to proxy documents just so
> that it can add highlights.  Worse, it takes the decision about *how* to
> highlight (bold?  underline?  surround with asterisks?) out of the user's
> hands (barring some sort of ugly protocol for telling the server).

> We'd like to suggest a very simple approach -- a highlight tag.  This way,
> our server could add the highlight tag in the appropriate places, but it
> would be up to the browser (under the user's control, presumably) to decide
> how to identify highlights in the text (turn them red, underline,
> whatever).  An appropriate UI enhancement would be the addition of a "next
> highlight" button or menu item and optionally a "previous highlight"
> button.

If you look at the current proposal for HTML 3.0 at:

        http://www.hpl.hp.co.uk/people/dsr/html3/CoverPage.html

you will see that a MARK element matches your needs. Unfortunately, I have been
strongly advised by SGML Open to avoid using paired empty elements:

"Many optimizations that programs can use because they know they are
 dealing with trees are lost if such structures are permitted. For example,
 a program can no longer tell how to format part of a document without going
 all the way back to the beginning, on the off chance that there was a MARK
 element a long way back. Likewise, one cannot easily build a stack-based
 formatter, e.g. that keys styles off the list of element types in one's
 ancestry. An editor is in even worse shape. There is no way to validate
 that such pairs even match, because "matching" is not a generic notion --
 it has to be custom-built for each kind of pair.

 Many DTDs have inserted such element-pairs in their first drafts; they
 end up removing them later, because they prove to be a pain for both
 implementors and users, and to have surprising side-effects. We strongly
 recommend avoiding them completely."
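
To make the objection concrete, here is a rough sketch in Python of the
difference. It is only an illustration under stated assumptions: the Node
class, the helper names, and the MARK-START / MARK-END element names are all
hypothetical and are not taken from the HTML 3.0 draft or from SGML Open.
With a proper tree, the context of an element comes from its ancestors alone;
with paired empty elements, the document has to be replayed from the start.

```python
class Node:
    """Hypothetical parse-tree node: an element type, a parent, and children."""

    def __init__(self, name, parent=None):
        self.name = name
        self.parent = parent
        self.children = []

    def add(self, name):
        child = Node(name, parent=self)
        self.children.append(child)
        return child


def ancestry(node):
    """Element types enclosing node, outermost first.

    Only the parent chain is visited, so the cost depends on tree depth."""
    names = []
    while node.parent is not None:
        node = node.parent
        names.append(node.name)
    return names[::-1]


def inside_mark(root, target):
    """True if target lies between a MARK-START and a MARK-END empty element.

    The pair is not part of the tree structure, so the whole document up to
    target is walked in document order (pre-order) from the root."""
    marked = False
    stack = [root]
    while stack:
        node = stack.pop()
        if node is target:
            return marked
        if node.name == "MARK-START":    # hypothetical empty element
            marked = True
        elif node.name == "MARK-END":    # hypothetical empty element
            marked = False
        stack.extend(reversed(node.children))
    raise ValueError("target is not in this document")


# The mark opens in the first paragraph and closes in the third.
body = Node("BODY")
body.add("P").add("MARK-START")
em = body.add("P").add("EM")
body.add("P").add("MARK-END")

print(ancestry(em))           # ['BODY', 'P']
print(inside_mark(body, em))  # True
```

Nothing in the ancestry of the EM element reveals that it is highlighted;
that fact is only found by scanning back through earlier paragraphs, which is
the cost the quoted advice describes.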

As a result, I am now looking at a way of specifying both the start and
end of a highlighted region separately from the document body, using
a single element in the document head, something like:

        <highlight from=3096 until=4013>

Where the numbers are byte offsets into the document body.
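
As a minimal sketch of how a client might apply such ranges, the Python below
wraps each range of the body in markers chosen by the client. Several things
here are assumptions, since the proposal does not settle them: the function
name apply_highlights is hypothetical, offsets are taken to be zero-based,
"until" is treated as exclusive, and ranges are assumed not to overlap.

```python
def apply_highlights(body, ranges, start=b"*", end=b"*"):
    """Wrap each (from, until) byte range of body in client-chosen markers.

    Assumptions (not fixed by the proposal): offsets are zero-based,
    'until' is exclusive, and ranges do not overlap."""
    pieces = []
    pos = 0
    for begin, stop in sorted(ranges):
        if not (pos <= begin <= stop <= len(body)):
            raise ValueError("range (%d, %d) is out of bounds or overlaps"
                             % (begin, stop))
        pieces.append(body[pos:begin])                # text before the highlight
        pieces.append(start + body[begin:stop] + end) # the highlighted region
        pos = stop
    pieces.append(body[pos:])                         # text after the last highlight
    return b"".join(pieces)


body = b"find the needle in the haystack"
print(apply_highlights(body, [(9, 15)]))
# b'find the *needle* in the haystack'
```

The markers are parameters, so the choice of how to show a highlight (bold,
underline, asterisks) stays with the client and the user rather than the
server.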

-- Dave Raggett <dsr@w3.org> tel: +44 117 922 8046 fax: +44 117 922 8924
  Hewlett Packard Laboratories, Filton Road, Bristol BS12 6QZ, United Kingdom

Received on Monday, 13 March 1995 12:09:18 UTC