- From: Paul Tyson <phtyson@sbcglobal.net>
- Date: Sat, 04 Oct 2014 18:47:19 -0500
- To: Michael Brunnbauer <brunni@netestate.de>
- Cc: "semantic-web@w3.org" <semantic-web@w3.org>, Linking Open Data <public-lod@w3.org>
Hi Michael,

On Sat, 2014-10-04 at 11:19 +0200, Michael Brunnbauer wrote:
> Hello Paul,
>
> On Fri, Oct 03, 2014 at 04:05:07PM -0500, Paul Tyson wrote:
> > Yes. We are setting the bar too low. The field of knowledge computing
> > will only reach maturity when authors can publish their theses in such a
> > manner that one can programmatically extract the concepts, propositions,
> > and arguments;
>
> I thought Kingsley is the only one seriously suggesting that we communicate in
> triples. Let's take one step back to the proposal of making research datasets
> machine readable with RDF.

I certainly was not suggesting this. It would indeed be silly to publish
large collections of empirical quantitative propositions in RDF. Nor do
I think Kingsley would endorse such efforts (but he can speak for
himself on that). I mostly admire and agree with Kingsley's
indefatigable efforts to show how easy it is to harvest the low-hanging
fruit of semantic web/linked data technologies. I just don't want that
to be mistaken for the desired end state.

> Please go to http://crcns.org/NWB
>
> Have a look at an example dataset:
>
> http://crcns.org/data-sets/hc/hc-3/about-hc-3
>
> "The total size of the data is about 433 GB compressed"
>
> Even if you do not use triples for all of that (which would be insane),
> specifying a "structured data container" is a very difficult task.
>
> So instead of talking about setting the bar higher, why not just help the
> people over there with their problem?

Creating, tracking, and publishing empirical quantitative propositions
is not their biggest impediment to contributing to human knowledge.
Connecting those propositions to significant conclusions through sound
arguments is the more important problem.

They will attempt to do so, presumably, by creating monographs in an
electronic source format that has more or less structure to it.
The structure will support many useful operations, including formatting
the content for different media, hyperlinking to other resources,
indexing, and metadata gleaning. The structure will most likely *not*
support any programmatic operations to expose the logical form of the
arguments in such a way that another person could extract them and put
them into his own logic machine to confirm, deny, strengthen, or weaken
the arguments.

Take for example a research paper whose argument proceeded along the
lines of "All men are mortal; Socrates is a man; therefore Socrates is
mortal." Along comes a skeptic who purports to have evidence that
Socrates is not a man. He publishes the evidence in such a way that
other users can, if they wish, insert the conclusion from such evidence
in place of the minor premise in the original researcher's argument.
Then the conclusion cannot be affirmed. The original researcher must
either find a different form of argument to prove his conclusion,
overturn the skeptic's evidence (by further argument, also
machine-processable), or withdraw his conclusion.

This simple model illustrates how human knowledge has progressed for
millennia, mediated solely by oral, written, and visual and diagrammatic
communication. I am suggesting we enlist computers to do something more
for us in this realm than just speeding up the millennia-old mechanisms.

Of course we don't need a program to help us determine whether or not
Socrates is mortal. But what about the task of affirming or denying the
proposition, "Unchecked anthropogenic climate change will destroy human
civilization"? Gigabytes of data do not constitute logical argument. A
sound chain of reasoning from empirical evidence and agreed universals
is wanted. Yes, this can be done in academic prose supplemented by
charts and diagrams, and backed by digital files containing lots of
numbers. But, as Kingsley would say, that is not the best way ca. 2014.

Regards,
--Paul
Received on Saturday, 4 October 2014 23:50:07 UTC