I would imagine that if search engines started crawling this data, that would be incentive enough for site owners to begin incorporating this paradigm into their sites.
“Most of the html generated by websites is not even valid”: by this do you mean IDEs like Visual Studio? I would probably assume this to be true, as Microsoft has a history of not conforming to the W3C standards.
________________________________
From: semantic-web-request@w3.org [mailto:semantic-web-request@w3.org] On Behalf Of ravinder thakur
Sent: 2008-10-20 11:55 AM
To: John Flynn; semantic-web@w3.org; semantic_web@googlegroups.com
Subject: Re: web to semantic web : an automated approach
But what's the incentive for web site owners to mark up their websites with semantic data? A few days back I was reading a study conducted by the Opera browser team that said most of the HTML generated by websites is not even valid. How can we expect them to create correct semantic data? Also, what happens to all the other user-submitted content (blogs, wikis, etc.)?
Instead, why not create a mechanism to automatically convert web data to semantic data? Opencalais.com is already doing it on a small domain; why can't/shouldn't we do it at the web's scale?
John: I realized that you are from BBN. In case you are aware, can you please tell us from your experience about the state of NLP? To what extent are the current best NLP systems capable of extracting information from unformatted text? And what are the hopes for overcoming the current shortcomings of NLP systems in the future?