- From: Martin Bryan <mtbryan@sgml.u-net.com>
- Date: Tue, 07 Jan 1997 15:58:55 +0000
- To: digitome@iol.ie (Digitome Ltd.), w3c-sgml-wg@www10.w3.org
Sean Mc Grath wrote >[Martin Bryan] >On the web what constitutes a document is no longer a >fixable object. Its only when you activate the links that the entities >should be considered as included in the top level node. >[] >I understand how this follows from documents having multiple possible >layers of links semantics that are "late bound" i.e. at parse time. > >What is the default set or is there one? I'm not sure there can be one. Hopefully there will be a way of identifying an author generated set of documents that must be accessible to make sense of "a document" associated with each XML file. This is then augmented as you go through a particular link within that file by the "required reading list" for the file accessed if that is an XML file, otherwise it will remain in memory as the current "essential reading history". > Specifically, I am wondering what >Web crawlers will need to do in order to harvest descriptive markup for >intelligent searching... At present they are supposed to follow chains ad-infinitem. (In practice they don't). With XML they can just follow the chains the author has specified as being required, looking in those documents for any further "required reading" documents. ---- Martin Bryan, The SGML Centre, Churchdown, Glos. GL3 2PU, UK Phone/Fax: +44 1452 714029 WWW home page: http://www.u-net.com/~sgml/
Received on Tuesday, 7 January 1997 11:00:49 UTC