Re: Structure vs. appearance in HTML from Philippe-Andre Prindeville on 1995-09-22 (www-html@w3.org from September 1995)

From: Philippe-Andre Prindeville <philipp@res.enst.fr>
Date: Fri, 22 Sep 95 05:35:56 +0200
To: Joe English <jenglish@crl.com>, www-html@w3.org
Message-Id: <9509220535.ZM9406@jones.res.enst.fr>

> > For that matter, HTML doesn't really _have_ that much usable
> > structure.
> 
> True, but it's got enough to do a *few* useful things:
> 
>     * Build a table of contents from <Hn>eadings
>     * Spell-check the document, but skip stuff inside
>       <CODE>, <SAMP>, <KBD>, <VAR>, and <PRE>
>     * Build a "graph" of all the hypertext links in a collection
>       of documents, listing the anchor text of each link
>     * Automatically convert it to Braille
>     * Build a full-text search index for a collection of documents
>     * Build a keywords-based search index, giving higher weight
>       to keywords, emphasized phrases, and stuff in headings

This last one is dubious.  I have no way of saying, find me all
occurrences of "Sprint" (as a proper noun, i.e. a name) in a document
or set of documents, skipping "sprint" the verb or common noun.
Obviously, "... winning the men's 100m sprint." does not pertain to
telecommunications or American corporate culture.
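
To make the objection concrete, here is a minimal sketch of the kind of
keyword index described in the quoted list, written in Python with the
standard html.parser module.  The weights (3 for headings, 2 for
emphasis, 1 for body text), the names WeightedIndexer and build_index,
and the sample document are all assumptions made for illustration;
nothing in this thread specifies them.

```python
from collections import Counter
from html.parser import HTMLParser
import re

# Hypothetical weights: headings count most, emphasized text next.
WEIGHTS = {"h1": 3, "h2": 3, "h3": 3, "h4": 3, "h5": 3, "h6": 3,
           "em": 2, "strong": 2, "b": 2, "i": 2}


class WeightedIndexer(HTMLParser):
    """Accumulate a per-word score based on the enclosing markup."""

    def __init__(self):
        super().__init__()
        self.stack = []         # weighted tags that are currently open
        self.index = Counter()  # lowercased word -> accumulated weight

    def handle_starttag(self, tag, attrs):
        # HTMLParser passes tag names already lowercased.
        if tag in WEIGHTS:
            self.stack.append(tag)

    def handle_endtag(self, tag):
        if tag in self.stack:
            # Remove the most recently opened matching tag.
            pos = len(self.stack) - 1 - self.stack[::-1].index(tag)
            del self.stack[pos]

    def handle_data(self, data):
        # Use the highest weight among the open tags, or 1 for plain text.
        weight = max((WEIGHTS[t] for t in self.stack), default=1)
        for word in re.findall(r"[A-Za-z]+", data):
            self.index[word.lower()] += weight


def build_index(html):
    parser = WeightedIndexer()
    parser.feed(html)
    parser.close()
    return parser.index


# Invented sample: one company name, one footrace.
doc = ("<H1>Sprint announces new rates</H1>"
       "<P>... winning the men's 100m <EM>sprint</EM>.</P>")
index = build_index(doc)
```

In this sketch both the company name in the heading and the race in the
emphasized phrase end up under the single key "sprint".  The markup
tells the indexer how prominent a word is, not what it refers to, and
keeping the original case would not help with a sentence that happens
to begin with the common noun.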

A lot of the things you mention are superficial.  They still don't
scratch the surface of *semantic* tagging of information.

We are creating and stocking quantities of information that will
be used well into the next century.  Machines will be used to
search these enormous quantities of data.  If it isn't tagged
meaningfully now (at its inception), it never will be.  And that
will be a real shame.

> And let's not forget:
> 
>     * Render it on just about any output device, with reasonably
>       good results.

Whoopie.

> This last is something that few other text markup languages
> have been able to accomplish.

To be honest, there aren't that many.

-Philip

Received on Thursday, 21 September 1995 23:36:18 UTC