Re: Indexing non-HTML objects from David W. Morris on 1997-05-03 (ietf-http-wg@w3.org from April to June 1997)

From: David W. Morris <dwm@xpasc.com>
Date: Fri, 2 May 1997 18:26:21 -0700 (PDT)
To: Andrew Daviel <advax@triumf.ca>
Cc: http-wg@cuckoo.hpl.hp.com
Message-Id: <Pine.SOL.3.95.970502181822.4195E-100000@shell1.aimnet.com>

Note: I've removed the robot list from the recipients because posting to
it requires that I subscribe, which I'd rather not do at the moment.

On Fri, 2 May 1997, Andrew Daviel wrote:

> On Fri, 2 May 1997, David W. Morris wrote:
> 
> > Yes, but as has been already pointed out, the LINK is a subpart of
> > the HTML and thus doesn't provide for describing arbitrary www content,
> > in the case of this thread, for purposes of representing the arbitrary
> > www content in a suitable fashion for indexing.
> 
> The idea is that the HTML document includes the metadata (as META tags, 
> PICS label headers, or just plain HTML). The LINK references the resource 

I understand that was one idea. I believe another idea was presented which
is much more general and allows for indexing of arbitrary resources (and,
with the TCN notion, different indexing in different contexts if desired).
Since the basis of the indexing is the resource itself rather than the
LINK, there is no way for the indexing of the resource to be misrepresented
by a referencing page which is not associated with the owner of the
resource.

There is no reason why the content owner might not use a page which
embeds a resource as the metainfo page reference for indexing of the
resource. But that is not a requirement.

Furthermore, with the push to JavaScript-negotiated page content by
Microsoft, there is a real risk that an index-bot won't be able to figure
out how to meaningfully index even 'basic' content pages.  Providing
a mechanism whereby an index-bot can request the indexable form of
a resource via TCN is a possible solution.
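
As a rough illustration of that idea (a sketch only, not the algorithm
from the TCN draft): suppose a resource has two variants, a script-heavy
HTML page and a plain-text form meant for indexing, and the choice is
made from the requester's Accept header.  The file names, quality values
and function names below are all hypothetical, and wildcards such as */*
are deliberately not handled.

```python
def parse_accept(header):
    """Parse an Accept header into {media_type: q}. No wildcard handling."""
    prefs = {}
    for item in header.split(","):
        parts = [p.strip() for p in item.split(";")]
        media_type = parts[0].lower()
        if not media_type:
            continue
        q = 1.0  # a media type listed without a q parameter gets q=1
        for param in parts[1:]:
            name, _, value = param.partition("=")
            if name.strip().lower() == "q":
                q = float(value)
        prefs[media_type] = q
    return prefs


def choose_variant(variants, accept_header):
    """Return the URI whose (source quality * requester q) is highest.

    Returns None when no variant has a media type the requester accepts.
    """
    prefs = parse_accept(accept_header)
    best_uri, best_score = None, 0.0
    for uri, media_type, source_quality in variants:
        score = source_quality * prefs.get(media_type, 0.0)
        if score > best_score:
            best_uri, best_score = uri, score
    return best_uri


# Hypothetical variant list: (URI, media type, publisher-assigned quality).
VARIANTS = [
    ("page.js.html", "text/html", 1.0),     # script-driven page for browsers
    ("page.index.txt", "text/plain", 0.5),  # indexable form for robots
]

# A browser prefers HTML; an index-bot states a preference for plain text.
browser_choice = choose_variant(VARIANTS, "text/html, text/plain;q=0.5")
robot_choice = choose_variant(VARIANTS, "text/plain, text/html;q=0.1")
```

The point of the sketch is that the publisher, not a referencing page,
decides which variants exist, while the index-bot only states what it
prefers to receive.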

> > Transparent Content Negotiation would provide the ideal infrastructure via
> > which the URL/URN/URI identified resource listed in the proposed metainfo
> > header could have the appropriate variant delivered based on the 
> > specific needs of a particular indexing service. Then some content
> > could have multiple descriptive documents for indexing purposes if the
> > publisher so chose.
> 
> Mm, sounds exciting if people will use it. I suspect not ..
> Any idea how many people are using content-negotiation at this point ?
> I've been waiting for HTTP/1.1 to address the caching issues (which it
> has), but I don't really have much negotiable content anyway and haven't 
> updated yet.

TCN is still in draft status within the HTTP-WG ... some groups debate
whether it is worth implementing. Indexable content may be a very
important reason for TCN support.

Dave Morris

Received on Friday, 2 May 1997 18:27:56 UTC