- From: Bob DuCharme <bob@snee.com>
- Date: Wed, 10 Mar 2010 12:22:53 -0500
- To: Alasdair Logan <alasdair.logan@yahoo.co.uk>
- CC: Linked Data community <public-lod@w3.org>
As Irene said, http://esw.w3.org/topic/ConverterToRdf is the best place to start, but I thought I'd ramble a bit about some of the broader issues.

If the data to convert is in a file, as opposed to being delivered from a server with an interface that you can write to (as D2RQ and OpenLink do for relational data), then the first step is to parse the input, so tools will be built around parsers for each input format.

Any modern programming language can parse CSV easily, and most tools that advertise the ability to convert spreadsheets to RDF actually expect CSV input. (TopQuadrant's tools can read binary Excel files. Full disclosure: I work for them.)

When your input is XML (which can include HTML if you use TagSoup or Tidy to clean it up), XSLT is a popular way to create triples. This is the principle behind GRDDL (http://www.w3.org/2004/01/rdxh/spec). TopQuadrant also has a more general-purpose XML-to-RDF converter that takes the structure of the input document into account so that it can round-trip the RDF back to XML.

With plain text, something needs to identify structure within the text so that it can work out what the subjects, predicates, and objects are, and that structure depends on the needs of the application. (That actually applies to CSV and XML as well, but commas and tags give you more to go on if you understand the purpose of the input data.) Semweb meetups are seeing more interest from the Natural Language Processing community--I think the NYC semweb meetup actually has a subgroup of people dedicated to NLP issues--so there could be more interesting work coming from them in the future. Thomson Reuters Calais is the best-known example that comes to mind of a tool that takes plain text as input and returns it with embedded RDF.

Bob

Alasdair Logan wrote:
> Hey all,
>
> I was wondering if anyone is familiar with tools to convert data into RDF triples and Linked Data. They can be for any data format i.e. XML, CSV, plain text etc.
>
> I'm doing this as part of a pilot study for my Master's project so I'm just trying to get a general view of any tools used.
>
> Thanks in advance
>
> Ally
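
To make the CSV case discussed in the reply concrete, here is a minimal sketch of the parse-then-emit-triples step, written with only the Python standard library. It is an illustration, not any particular tool's method: the `http://example.org/` namespace, the `resource/` and `property/` URI patterns, the function names, and the choice of one column as the subject identifier are all assumptions made for the example. Each non-empty cell becomes one N-Triples line with the row's identifier as subject, the column name as predicate, and the cell value as a plain literal.

```python
import csv
import io
from urllib.parse import quote

# Hypothetical namespace chosen for this example only.
BASE = "http://example.org/"


def escape_literal(value):
    """Escape backslashes, quotes, and line breaks for an N-Triples literal."""
    return (value.replace("\\", "\\\\")
                 .replace('"', '\\"')
                 .replace("\n", "\\n")
                 .replace("\r", "\\r"))


def csv_to_ntriples(text, id_column):
    """Yield one N-Triples line per non-empty cell.

    The value in id_column names the subject; every other column
    header becomes a predicate (an assumption of this sketch).
    """
    reader = csv.DictReader(io.StringIO(text))
    for row in reader:
        subject = f"<{BASE}resource/{quote(row[id_column], safe='')}>"
        for column, value in row.items():
            # Skip the id column, empty cells, and overflow fields.
            if column is None or column == id_column or not value:
                continue
            predicate = f"<{BASE}property/{quote(column, safe='')}>"
            yield f'{subject} {predicate} "{escape_literal(value)}" .'


sample = "id,name,city\n1,Alice,New York\n2,Bob,\n"
lines = list(csv_to_ntriples(sample, "id"))
print("\n".join(lines))
```

As the reply notes, the hard part is not the parsing but deciding what the subjects, predicates, and objects should be; here that decision is reduced to picking an id column, which real data rarely allows so cleanly.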
Received on Wednesday, 10 March 2010 17:23:32 UTC