- From: Charles McCathieNevile <chaals@opera.com>
- Date: Thu, 08 Jan 2009 11:37:50 +1100
On Mon, 05 Jan 2009 00:17:39 +1100, Henri Sivonen <hsivonen at iki.fi> wrote:

> On Jan 3, 2009, at 17:05, Dan Brickley wrote:
>
>> But perhaps a more practical concern is that it unfairly biases things
>> towards popular languages - lucky English, lucky Spanish, etc., and
>> those that lend themselves more to NLP analysis. The Web is for
>> everyone, and people shouldn't be forced to read and write English to
>> enjoy the latest advances in Web automation.
>
> Some languages are higher in the pecking order than others when software
> development is prioritized, and RDFa cannot level the playing field here.
>
> Suppose there's a use case that can be satisfactorily addressed by
> applying NLP heuristics to content for the top-tier languages. Even if
> there were an RDF mechanism for addressing the same use case without
> relying on natural language, software aimed for serving the top-tier
> languages would still do the NLP thing for the use case.

No. There is no reason for most developers to prefer one over the other under the circumstances described. Clearly Google has an investment in text-harvesting in a bunch of languages. Equally clearly, its competitors who are more successful in various languages (Yandex, Baidu, etc.) have an investment in the technology they use. But when developing a new indexing process, there is no a priori reason to favour NLP over some other technique that is also satisfactory, and if you happen to be interested in a global market, it makes sense to develop a system that can be more easily adapted, other things being equal.

...

> Instead of bearing the cost of developing a totally alternative
> technology stack for the other languages without benefiting from any
> spillover from the effort done for the top-tier languages, it makes more
> sense to invest the effort into building upon the reusable parts already
> developed for the top-tier languages.

Except that it turns out that the re-usable parts of most search engines, for the general developer, are pretty limited. Whereas the re-usable parts of the RDF stack are numerous, available for many different platforms, from GPL open source to bespoke commercial closed-source and everything between.

All this does not necessarily establish the case for using RDF in HTML, it is just meant to demonstrate that this particular case *against* doesn't seem to be established, to me.

cheers

Chaals

--
Charles McCathieNevile  Opera Software, Standards Group
je parle français -- hablo español -- jeg lærer norsk
http://my.opera.com/chaals  Try Opera: http://www.opera.com

Received on Wednesday, 7 January 2009 16:37:50 UTC