- From: रविंदर ठाकुर (ravinder thakur) <ravinderthakur@gmail.com>
- Date: Mon, 20 Oct 2008 15:42:23 +0530
- To: semantic-web@w3.org, semantic_web@googlegroups.com, "Andreas Langegger" <al@jku.at>
- Message-ID: <617073f10810200312x107a8004y372c57ef5bd1879f@mail.gmail.com>
>>>> This is indeed an essential point in the development of the Semantic Web. I'm mostly in the "it'll happen" camp with regards to people creating semantic content. There are two main sources, one is that they say that 70% of the data on the web is already in some structured form, thus what's needed is to clarify what that structure means.

I have been in the "it will happen" camp too, but nothing far-reaching seems to be happening, so I am out. I would say that most of the data out there (90%) is unstructured, and most of the structured data is specific to companies that won't share it. People are writing blogs and Wikipedia articles, news websites are producing content continuously, people are reviewing products and putting their opinions online; the list of unstructured data is endless and will keep growing with increasing Internet penetration in third-world countries. To assume that all of these users will manually convert their data to structured form seems too far-fetched, and to assume that the information put up by these end users is of less use than, say, Wikipedia/DBpedia would be a horrible mistake. Even if we do end up with a lot of RDF/OWL data, someone still needs to club this vast amount of data together and create a global graph interlinking all of it (BTW, I see some serious ontology issues anyone is likely to hit in this approach); there is a rough sketch of what I mean in the PS at the end of this mail.

>>>> Also, I think IBM's SUKI http://www.research.ibm.com/UIMA/SUKI/ might be of interest.

I have used UIMA, but it is not a one-man-army's job. It is just a framework, and there is still a hell of a lot to be done on top of it, e.g. writing domain-specific components.

>>>> A3 is cumbersome and may produce wrong links and information - a nightmare without implicit support for provenance. In corporate environments A3 is already very popular, but in the broader Web-scale I'm a bit sceptical this will work well. What do you think?

I am pinning a lot of hope on the progress we have made in NLP, and no doubt NLP will keep improving in the near future. To alleviate the wrong-linking/wrong-information problem, I think redundancy of information will play an important role: if we have 10 sources for the same piece of information and 6 NLP parsers give one view while the other 4 give another, I am pretty sure the view the 6 agree on will be the right one (a toy sketch of this voting idea is in the PPS). Also, we don't have to be 100% right, at least not in the beginning, since (other than your boss :) ) nobody is 100% right :)

>>>> Some CMS like Drupal have already understood this and are rapidly moving towards exposing their content as RDF data

Here's the problem: Drupal is exposing _the data stored in Drupal_. Do we expect everyone on the web to use Drupal? No. What happens to the information on times.com, blogspot.com, googlegroups.com or kashmirtimes.com? The Semantic Web is not about converting someone's data and exposing it with a semantic view; it is about the _whole_ of the data out there on the web, building a web of semantic links on top of that, and then doing reasoning on top of that.

Thanks for initiating the discussion anyway. Keep it coming :)
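PS: A minimal, hand-wavy sketch (in Python, using rdflib as just one possible library) of what I mean by clubbing RDF data from different sources into one global graph and interlinking it. The site names, prefixes and resource URIs are all made up for illustration; deciding which resources actually denote the same thing is exactly where the serious ontology issues I mentioned show up.

from rdflib import Graph, URIRef
from rdflib.namespace import OWL

# Tiny made-up RDF exports, standing in for data exposed by two different sites.
drupal_site = """
@prefix ex: <http://example.org/drupal-site/> .
ex:article42 ex:mentions ex:NewDelhi .
"""
blog_scraper = """
@prefix ex2: <http://example.org/blog-scraper/> .
ex2:post7 ex2:about ex2:new_delhi .
"""

# Club everything into one global graph.
global_graph = Graph()
global_graph.parse(data=drupal_site, format="turtle")
global_graph.parse(data=blog_scraper, format="turtle")

# Hand-made interlinking: assert that two resources describe the same thing.
global_graph.add((
    URIRef("http://example.org/drupal-site/NewDelhi"),
    OWL.sameAs,
    URIRef("http://example.org/blog-scraper/new_delhi"),
))

print(len(global_graph), "triples in the merged graph")  # 3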
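PPS: And a toy sketch of the redundancy/voting idea. The extractor output format and the fact are invented; this is plain majority voting, not a real NLP pipeline, and a real system would probably weight parsers by reliability rather than counting raw votes.

from collections import Counter

def vote(extractions):
    # extractions: one (subject, predicate, value) triple per parser/source.
    values = Counter(value for _subject, _predicate, value in extractions)
    best_value, votes = values.most_common(1)[0]
    return best_value, votes / len(extractions)

# Ten made-up sources for the same fact: six parsers extract 1869, four extract 1868.
extractions = (
    [("ex:Gandhi", "ex:birthYear", "1869")] * 6
    + [("ex:Gandhi", "ex:birthYear", "1868")] * 4
)

value, confidence = vote(extractions)
print(value, confidence)  # 1869 0.6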
Received on Monday, 20 October 2008 10:12:59 UTC