- From: Orion Adrian <orion.adrian@gmail.com>
- Date: Sat, 8 Jan 2005 10:24:16 -0500
- To: www-html@w3.org
What I'm saying, though, is that if you have to process the entire page
and not just the first X%, then the process has become an order of
magnitude more resource intensive. The question is what benefit we get
by making it that way. Secondly, what do we lose by forcing metadata to
live anywhere in the file rather than in a predictable location? (Think
of metadata-based file systems.)

Orion Adrian

On Sat, 8 Jan 2005 17:22:50 +1100, Trejkaz Xaoza <trejkaz@trypticon.org> wrote:
> On Sat, 8 Jan 2005 07:53, you wrote:
> > > distinct.) It seems to me that it's the search engine's problem if it
> > > somehow fails to find important information.
> >
> > Often such heuristics are defences against abuse by authors trying to
> > increase their rating. Metadata, because it doesn't get displayed in
> > HTML 4/XHTML 1, is a good place for keyword stuffing by people who don't
> > really care about its true purpose.
>
> Nothing stops the search engine from stopping indexing of keywords after a
> certain point in the page either.
>
> Although in all honesty, you would get better results if you _did_ index the
> entire page. Then you can trivially detect keyword abuse by counting the
> number of keywords in the page and penalising for large numbers. I thought
> this was already how Google worked anyway.
>
> But like I said, if they skip _important_ metadata, then it's their own
> problem. They would quickly get supplanted by superior search engines, just
> like Altavista did when their results started getting crap.
>
> TX
>
> --
> Email: Trejkaz Xaoza <trejkaz@trypticon.org>
> Web site: http://xaoza.net/
> Jabber ID: trejkaz@jabber.zim.net.au
> GPG Fingerprint: 9EEB 97D7 8F7B 7977 F39F A62C B8C7 BC8B 037E EA73
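[For what it's worth, the penalty Trejkaz describes above is easy to sketch. The following is a minimal, hypothetical illustration in plain Python; the threshold and the penalty curve are invented, and real engines' scoring functions are not public, so this shows only the general idea of counting a keyword's share of the page and penalising unusually high densities.]

    import re

    def keyword_density_penalty(text, keyword, max_density=0.05):
        """Return a multiplier in (0, 1]: 1.0 for ordinary usage,
        shrinking toward 0 as the keyword's share of all words
        exceeds max_density (an arbitrary threshold)."""
        words = re.findall(r"[a-z0-9']+", text.lower())
        if not words:
            return 1.0
        density = words.count(keyword.lower()) / len(words)
        if density <= max_density:
            return 1.0
        # Penalise in proportion to how far past the threshold we are.
        return max_density / density

    # A stuffed page scores far worse than a normal one:
    normal = ("This page explains how HTML metadata can help user agents "
              "and search engines discover what a document is about "
              "without scanning all of it.")
    stuffed = "cheap cheap cheap cheap cheap flights " * 50
    print(keyword_density_penalty(normal, "metadata"))  # 1.0
    print(keyword_density_penalty(stuffed, "cheap"))    # roughly 0.06

[Note that the count is a single pass over the page's words, which is the kind of whole-page check being discussed.]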
Received on Saturday, 8 January 2005 15:24:48 UTC