[whatwg] Trying to work out the problems solved by RDFa from Charles McCathieNevile on 2009-01-08 (public-whatwg-archive@w3.org from January 2009)

From: Charles McCathieNevile <chaals@opera.com>
Date: Thu, 08 Jan 2009 11:37:50 +1100
Message-ID: <op.unexdcrtwxe0ny@widsith.local>

On Mon, 05 Jan 2009 00:17:39 +1100, Henri Sivonen <hsivonen at iki.fi> wrote:

> On Jan 3, 2009, at 17:05, Dan Brickley wrote:
>
>> But perhaps a more practical concern is that it unfairly biases things  
>> towards popular languages - lucky English, lucky Spanish, etc., and  
>> those that lend themselves more to NLP analysis. The Web is for  
>> everyone, and people shouldn't be forced to read and write English to  
>> enjoy the latest advances in Web automation.
>
> Some languages are higher in the pecking order than others when software  
> development is prioritized, and RDFa cannot level the playing field here.
>
> Suppose there's a use case that can be satisfactorily addressed by  
> applying NLP heuristics to content for the top-tier languages. Even if  
> there were an RDF mechanism for addressing the same use case without  
> relying on natural language, software aimed for serving the top-tier  
> languages would still do the NLP thing for the use case.

No. There is no reason for most developers to prefer one over the other  
under the circumstances described.

Clearly Google has an investment in text-harvesting in a bunch of  
languages. Equally clearly its competitors who are more successful in  
various languages (Yandex, Baidu, etc.) have an investment in the  
technology they use.

But when developing a new indexing process, there is no a priori reason to  
favour NLP over some other technique that is also satisfactory, and if you  
happen to be interested in a global market, it makes sense to develop a  
system that can be more easily adapted, other things being equal.
...
> Instead of bearing the cost of developing a totally alternative  
> technology stack for the other languages without benefiting from any  
> spillover from the effort done for the top-tier languages, it makes more  
> sense to invest the effort into building upon the reusable parts already  
> developed for the top-tier languages.

Except that it turns out that the re-usable parts of most search engines,  
for the general developer, are pretty limited. Whereas the re-usable parts  
of the RDF stack are numerous, available for many different platforms,  
from GPL open source to bespoke commercial closed-source and everything  
between.

All this does not necessarily establish the case for using RDF in HTML; it  
is just meant to demonstrate that this particular case *against* doesn't  
seem to be established, to me.

cheers

Chaals

-- 
Charles McCathieNevile  Opera Software, Standards Group
     je parle français -- hablo español -- jeg lærer norsk
http://my.opera.com/chaals       Try Opera: http://www.opera.com

Received on Wednesday, 7 January 2009 16:37:50 UTC