Re: Meta Tag - proposal (suggestions ???) from Jon Wallis on 1995-11-16 (www-html@w3.org from November 1995)

From: Jon Wallis <j.wallis@wlv.ac.uk>
Date: Thu, 16 Nov 1995 07:50:27 +0000
To: www-html@w3.org
Message-Id: <m0tFz5Y-0007iLC@scitsc.wlv.ac.uk>
At 17:23 15/11/95 -0500, "Joe Budge" <budge@clark.net> wrote:
>> Request for comments, suggestions, etc...
>
>The META tag could benefit greatly from HTTP-EQUIV's for "revision" 
>(as in 'revision number') and 'timestamp' (as in 'date/time the 
>document was authored').
>
[snip]
>An interesting "nice" feature would be an HTTP-EQUIV for 'period' (as 
>in 'the document covers the stated period').  This would be used to
>organize information so that one can organize/retrieve by historical 
>time period (e.g.: "give me all documents where 'title' contains 
>'United Nations' and 'period' contains '1945'").
[snip]

>>    It is possible to use any text string, but if you want to define these
>>    properties you have to use the following words:
>> 
>>      keywords:        to indicate the keywords of the document
>>      author:          to indicate the author of the document
>>      expire:          to indicate the expiry date of the document
>>      language:        to indicate the language of the document
>>      abstract:        to indicate the abstract of the document
>>      organization:    to indicate the organization of the author
>>      public (yes,no): to indicate whether the document is available to
>>                       everybody or not


What about a META element for "subject classification" - using the Dewey
Decimal or Universal Decimal system?

        classification:  to indicate the subject classification of the document

This would be of great use in broad, high-level searching, obviating the
need to do low-level content-based searching from the outset, which, in any
case, tends to return lots of "false-positive" results.

Class-based searching, using the content field of an element like

               <META NAME="Class" CONTENT="123.4">

would significantly reduce the problems of homonyms, synonyms, variant
spellings and different languages.

e.g., 

homonyms 
you search for "bass" - looking for the fish of that name - and get
documents about "bass" the musical instrument.  

synonyms 
you search for "theology", but my document only contains the words
"religious dogma", or you look for "car" but my document says "automobile".

variant spelling
you search for "colour", my word is spelt "color"

different languages
I look for "car", but your document is in French and says "voiture".
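
To make the idea concrete, here is a minimal sketch (in Python) of how an
indexing robot might read the Class entry from a document. The names
MetaClassParser and extract_classes are invented for illustration only,
and the "Class" element itself is just the suggestion made above, not an
agreed convention.

```python
from html.parser import HTMLParser


class MetaClassParser(HTMLParser):
    """Collects CONTENT values of <META NAME="Class" ...> elements."""

    def __init__(self):
        super().__init__()
        self.classes = []

    def handle_starttag(self, tag, attrs):
        # HTMLParser passes tag and attribute names in lower case.
        if tag != "meta":
            return
        attr_map = dict(attrs)
        name = (attr_map.get("name") or "").lower()
        content = attr_map.get("content")
        if name == "class" and content:
            self.classes.append(content.strip())


def extract_classes(html_text):
    """Return every class number declared in the document (hypothetical helper)."""
    parser = MetaClassParser()
    parser.feed(html_text)
    parser.close()
    return parser.classes


page = '<HTML><HEAD><META NAME="Class" CONTENT="123.4"></HEAD></HTML>'
print(extract_classes(page))
```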

In a classification-based approach, supported by a META "class" entry added
by the author (or by indexing in a "Web Library" that uses Dewey/Universal
Decimal), all the above problems could potentially be eliminated:

- "bass" fish would be under "597", bass the instrument would be "787"

- theology and religious dogma would both be under "2", car and automobile
would both be under "629.222"

- colour and color would both be under "535.6" (NB this is an extension of
Dewey)

- car and voiture would both be under "629.222"
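
The grouping above can be sketched as a small index keyed on class number.
This is illustrative only: the file names are made up, and I am assuming
that a shorter number denotes a broader class (so "629" covers "629.222").

```python
from collections import defaultdict

# Hypothetical documents: (file name, class number from the META element).
DOCUMENTS = [
    ("fishing.html", "597"),
    ("jazz-bass.html", "787"),
    ("my-car.html", "629.222"),
    ("ma-voiture.html", "629.222"),
]


def build_class_index(documents):
    """Map each class number to the list of documents that declare it."""
    index = defaultdict(list)
    for name, class_number in documents:
        index[class_number].append(name)
    return index


def find_by_class(index, wanted):
    """Return documents whose class equals `wanted` or is a subdivision of it."""
    # Compare digit strings with the decimal point removed, so that
    # "629" is treated as a prefix of "629.222" (an assumption, see above).
    wanted_digits = wanted.replace(".", "")
    hits = []
    for class_number, names in index.items():
        if class_number.replace(".", "").startswith(wanted_digits):
            hits.extend(names)
    return sorted(hits)


index = build_class_index(DOCUMENTS)
print(find_by_class(index, "597"))      # only the fish, not the instrument
print(find_by_class(index, "629.222"))  # English and French car pages together
```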

So, when searching, class-based searching would be used to identify a set of
"candidate" documents relevant to the subject in question; low-level
content-based text searching would then be used, if necessary, to focus the
search on highly specific topics.
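
A rough sketch of this two-stage search, in Python with a made-up
three-document collection. The function name two_stage_search and the
record layout are assumptions for illustration, not part of the proposal.

```python
# Hypothetical mini-collection: each entry has a class number and body text.
COLLECTION = [
    {"url": "bass-fishing.html", "class": "597", "text": "Catching bass in rivers"},
    {"url": "bass-guitar.html", "class": "787", "text": "Tuning a bass guitar"},
    {"url": "trout.html", "class": "597", "text": "Trout and salmon habitats"},
]


def two_stage_search(collection, class_number, term=None):
    """Select candidates by class, then optionally narrow by a text term."""
    # Stage 1: class-based selection of candidate documents.
    candidates = [d for d in collection if d["class"].startswith(class_number)]
    # Stage 2 (optional): content-based text search within the candidates only.
    if term is not None:
        needle = term.lower()
        candidates = [d for d in candidates if needle in d["text"].lower()]
    return [d["url"] for d in candidates]


print(two_stage_search(COLLECTION, "597"))
print(two_stage_search(COLLECTION, "597", "bass"))
```

Note that the text search for "bass" never sees the guitar page, because
stage 1 has already excluded class "787".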

I would very much welcome comments on this idea.

Regards to all,

--
Jon Wallis         Senior Lecturer in Information Systems Engineering
School of Computing & I.T., University of Wolverhampton, UK - WV1 1SB
   Personal WWW Home Page   <URL:http://www.scit.wlv.ac.uk/~cm1906>
     University WWW Home Page <URL:http://www.scit.wlv.ac.uk/> 
-----------------"That's some catch, that catch-22"------------------
Received on Thursday, 16 November 1995 02:50:31 UTC