- From: Juan Sequeda <juanfederico@gmail.com>
- Date: Fri, 21 Mar 2014 09:27:43 -0500
- To: Andy Seaborne <andy@apache.org>
- Cc: Ivan Herman <ivan@w3.org>, Gregg Kellogg <gregg@greggkellogg.net>, W3C CSV on the Web Working Group <public-csv-wg@w3.org>, "Tandy, Jeremy" <jeremy.tandy@metoffice.gov.uk>
- Message-ID: <CAMVTWDxubEPAfct_cp3Er1mYtAoWObqF5br=iguaNfOfEtEwdw@mail.gmail.com>
Andy,

On Fri, Mar 21, 2014 at 5:51 AM, Andy Seaborne <andy@apache.org> wrote:

>
> PS Were there particular parts of R2RML that took particularly long /
> caused most debate?
>

Several things quickly come to mind: tables without PKs; the use of Blank Nodes; what happens if there are NULL values; and the integration of the Direct Mapping with R2RML. See this: http://www.w3.org/TR/r2rml/#default-mappings

I've started to study R2RML from a formal point of view. R2RML is rather expressive (I'm using this term loosely). If we map the expressivity of R2RML to (datalog) rules, there are a total of 57 distinct rules, which means that there are 57 different ways of generating RDF triples. The Direct Mapping can be represented in only 3 (datalog) rules. More info: http://ceur-ws.org/Vol-1035/iswc2013_poster_4.pdf

>
> Thanks
>>
>> Ivan
>>
>> On 21 Mar 2014, at 24:25 , Juan Sequeda <juanfederico@gmail.com> wrote:
>>
>> Ivan, all,
>>>
>>> This is our use-case:
>>>
>>> Constitute Project [1] is a search engine for the world's constitutions.
>>> This is a project funded by Google Ideas [2]. We, Capsenta, did the mapping
>>> of the constitution data to RDF and OWL. All of the data was originally in
>>> Excel spreadsheets (i.e. CSV files). What we did was to import the
>>> spreadsheets into SQL Server, and then use Direct Mapping, R2RML and
>>> Ultrawrap to map the data to RDF. Why did we want to use RDF/OWL? Several
>>> reasons:
>>>
>>> 1) RDF (graph data model) is flexible. We don't know what is going to
>>> happen to constitutional data later, so we need to be ready for change.
>>> 2) We currently have 189 constitutions, each in its own spreadsheet. We
>>> need to integrate this data.
>>> 3) We created an ontology about constitutional topics. Naturally, we
>>> want to represent this in OWL.
>>> 4) We want to link to other datasets, such as DBpedia.
>>> 5) RDF is becoming the standard to publish open data.
>>>
>>> These reasons are not specific to Constitute. They can apply to any CSV
>>> dataset that needs search or needs to be integrated with other datasets.
>>> More info can be found in our 2013 Semantic Web Challenge submission [3].
>>> We won 2nd prize :)
>>>
>>> Constitute is having a lot of impact. We know for a fact that
>>> constitutional drafters of Tunisia, Egypt and now Mongolia have been using
>>> Constitute.
>>>
>>> Btw, interesting fact: On average, 5 constitutions are written from
>>> scratch every year. A constitution lasts on average for 20 years. People who
>>> write constitutions have never done that before and will never do that
>>> again; that is why they want to search through existing constitutions.
>>>
>>> [1] https://www.constituteproject.org/#/
>>> [2] https://www.google.com/ideas/projects/constitute/
>>> [3] http://challenge.semanticweb.org/2013/submissions/swc2013_submission_12.pdf
>>>
>>>
>>> Juan Sequeda
>>> +1-575-SEQ-UEDA
>>> www.juansequeda.com
>>>
>>>
>>> On Thu, Mar 20, 2014 at 12:53 PM, Ivan Herman <ivan@w3.org> wrote:
>>> Sorry if I sound like a broken record, but I would really like to see
>>> and understand the CSV->RDF use cases, also in terms of the people who are
>>> likely to use that. Learning CSV-LD or R2RML-CSV involves a learning curve.
>>> The question is which of the two is steeper for the envisaged user base.
>>>
>>> (I do not have anything against either of the two, but we may have to make
>>> a choice at some point if we go down that route...)
>>>
>>> Ivan
>>>
>>> On 20 Mar 2014, at 18:47 , Gregg Kellogg <gregg@greggkellogg.net> wrote:
>>>
>>> On Mar 20, 2014, at 10:39 AM, Juan Sequeda <juanfederico@gmail.com>
>>>> wrote:
>>>>
>>>> If there is going to be a CSV to RDF mapping, shouldn't it be
>>>>> relatively close to (if not almost equal to) R2RML? I foresee users doing
>>>>> RDB2RDF mappings with R2RML and having a few (or many) CSV files that they
>>>>> would like to map to RDF too. They would want to continue using the same
>>>>> tool.
>>>>>
>>>>> What we do is import the CSVs into an RDB, and then use R2RML. So as a
>>>>> user who needs to transform to RDF, I would want to have something almost
>>>>> equivalent to R2RML.
>>>>>
>>>>
>>>> This certainly is a valid use case. I was considering what the impact
>>>> on developers using these tools might be. If there is a single tool (and
>>>> spec) which handles the relevant use cases, then it might simplify the life
>>>> of developers. Nothing against R2RML, and if that's the chain a developer's
>>>> working with, the same logic would indicate that having to use something
>>>> like CSV-LD would be a burden.
>>>>
>>>> Gregg
>>>>
>>>> Juan Sequeda
>>>>> +1-575-SEQ-UEDA
>>>>> www.juansequeda.com
>>>>>
>>>>
>
Received on Friday, 21 March 2014 14:35:17 UTC