Re: Page v. document (search and agent issue)

Nick Arnett
Fri, 22 Nov 1996 13:04:58 -0800

At 03:39 PM 11/22/96 -0500, Sunil Mishra wrote:
>No, HTML is not geared towards a hierarchical document definition, which
>is essentially what you seem to be looking for. The closest you might be
>able to get is to specify each article within its own
><div>. Unfortunately, the ID attribute has disappeared from HTML 3.2, which
>is exactly what you would be looking for if you wanted to specify a
>specific subpart of the HTML. The agent would of course also have to be
>modified to react to changes within specific <div>s rather than a change
>anywhere within the document. A poor alternative to id would be to
><a name...> the headline at the top of the <div>.
>HTML 3.2 does specify a class attribute. I would generally consider it a
>very bad hack to use class to specify different stories. But then you would
>not be the first to hack up HTML.

We'd much prefer to work within the standard.  Our engine can treat a byte
range as a retrieval unit, and parsing the text between the DIV tags could
produce such ranges.  But I'm wondering what publishers and users would expect as
default behavior -- would they *expect* each clump of HTML in a DIV section
to be a search and retrieval unit?  Is the primary purpose of DIV to define
sub-documents within pages?  Or would we be trying to change the purpose of
DIV if we promoted this as a solution to the agent problem?
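
To make the byte-range idea concrete, here is a minimal sketch in Python of
how the text between DIV tags might be turned into (start, end) byte offsets.
It is an illustration only, not a description of how our engine does it: the
function name div_byte_ranges and the sample page are hypothetical, and the
regular expression assumes reasonably well-formed markup, treating each
top-level DIV as one unit and folding nested DIVs into their parent.

```python
import re

# Matches opening and closing DIV tags, case-insensitively, in raw bytes.
# Assumption: tags are well formed and contain no ">" inside attribute values.
DIV_TAG = re.compile(rb"<(/?)div\b[^>]*>", re.IGNORECASE)


def div_byte_ranges(page):
    """Return (start, end) byte offsets of the content of each top-level DIV."""
    ranges = []
    depth = 0      # current DIV nesting level
    start = None   # offset where the current top-level DIV's content begins
    for match in DIV_TAG.finditer(page):
        is_closing = match.group(1) == b"/"
        if not is_closing:
            if depth == 0:
                start = match.end()  # content begins just after the opening tag
            depth += 1
        elif depth > 0:
            depth -= 1
            if depth == 0:
                # content ends just before the matching closing tag
                ranges.append((start, match.start()))
    return ranges


# Hypothetical page holding two stories, each in its own DIV.
page = b"<DIV><H2>Story one</H2><P>First.</P></DIV><div><h2>Story two</h2></div>"
for start, end in div_byte_ranges(page):
    print(start, end, page[start:end])
```

Each tuple could then be indexed as a separate retrieval unit.  Whether that
is the behavior people would expect from DIV is exactly the open question.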

Nick Arnett

Product Manager, Advanced Technology
Verity Inc.
408-542-2164; home office 408-369-1233