RE: A Simple Analogy

<- Refining the query may well be the easy bit. Controlling metadata
<- spoofing in the target data sounds really hard to me: that's one
<- thing the number crunching approach to search has in its favour,
<- the potential for hybrid approaches (such as graphing citations)
<- notwithstanding.

Why should metadata spoofing be an issue? Its practice on the web appears to
be on the decline, maybe as the pron vendors have started realising that
targeting is a more efficient sales strategy than pissing people off.

In any case, the originator of the data doesn't have to be the only source
of metadata - think DMOZ. Also there is a lot of unrealised potential in
that there number crunching - I've been looking at applying self-organising
maps [1] to searching/automatic cataloguing, and I reckon it's perfectly
feasible to classify according to semantic content through e.g. SOM-like
conceptual mapping. This is only one of many available techniques - but in
general, generating metadata from the content itself pretty much precludes
spoofing, since the labels no longer come from the publisher.
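
To make the idea concrete, here is a rough sketch of the sort of thing I
mean: a tiny self-organising map trained on bag-of-words vectors, with the
grid position of the best-matching unit used as the generated metadata.
Everything in it is made up for illustration (the vocabulary, the sample
documents, the 3x3 grid, the learning rate and radius schedule, and the
function names vectorise, best_unit and train_som). It is nowhere near
WEBSOM, just the bare mechanism.

```python
import math
import random


def vectorise(text, vocab):
    """Turn text into a unit-length term-frequency vector over vocab."""
    words = text.lower().split()
    counts = [float(words.count(term)) for term in vocab]
    norm = math.sqrt(sum(c * c for c in counts)) or 1.0
    return [c / norm for c in counts]


def best_unit(weights, vec):
    """Return the (row, col) of the map unit whose weights are closest to vec."""
    best, best_dist = None, float("inf")
    for pos, w in weights.items():
        dist = sum((a - b) ** 2 for a, b in zip(w, vec))
        if dist < best_dist:
            best, best_dist = pos, dist
    return best


def train_som(vectors, rows=3, cols=3, epochs=200, seed=0):
    """Train a small rectangular SOM; returns {(row, col): weight vector}."""
    rng = random.Random(seed)
    dim = len(vectors[0])
    weights = {(r, c): [rng.random() for _ in range(dim)]
               for r in range(rows) for c in range(cols)}
    for epoch in range(epochs):
        frac = 1.0 - epoch / epochs          # decays from 1 towards 0
        rate = 0.5 * frac                    # assumed learning-rate schedule
        radius = max(rows, cols) / 2.0 * frac + 0.5
        for vec in vectors:
            br, bc = best_unit(weights, vec)
            # Pull the winner and its grid neighbours towards the input.
            for (r, c), w in weights.items():
                grid_dist2 = (r - br) ** 2 + (c - bc) ** 2
                influence = math.exp(-grid_dist2 / (2 * radius ** 2))
                for i in range(dim):
                    w[i] += rate * influence * (vec[i] - w[i])
    return weights


# Hypothetical vocabulary and documents, purely for illustration.
vocab = ["search", "index", "query", "neural", "map", "kohonen"]
docs = [
    "search engine builds an index for each query",
    "query the index to search faster",
    "kohonen neural map training",
    "a neural map in the kohonen style",
]

vectors = [vectorise(d, vocab) for d in docs]
som = train_som(vectors)

# The map position is the metadata: derived from content, not supplied.
for doc, vec in zip(docs, vectors):
    print(best_unit(som, vec), doc)
```

The point is only that the catalogue entry (here a grid cell) is computed
from what the document says, so whatever keywords the publisher chooses to
declare never enter into it. A real system would need a proper vocabulary,
stop-word handling and a much larger map.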

<- Sam Johnson said it best...
<-
<- "I saw that one enquiry only gave
<- occasion to another, that book
<- referred to book, that to search
<- was not always to find, and to
<- find was not always to be informed."
<-
<- ...in 1755.

Apposite ;-)

[1] Self-Organising Maps, T. Kohonen, ISBN 3-540-62017-6 (a search on
'Kohonen' and/or 'WEBSOM' should be productive)

Received on Sunday, 15 April 2001 00:52:03 UTC