- From: Gregg Kellogg <gregg@greggkellogg.net>
- Date: Wed, 21 May 2014 16:48:17 -0700
- To: Ivan Herman <ivan@w3.org>
- Cc: Andy Seaborne <andy@apache.org>, W3C CSV on the Web Working Group <public-csv-wg@w3.org>
On May 19, 2014, at 10:09 AM, Ivan Herman <ivan@w3.org> wrote:

> Ok, now I understand the difference, thanks. Indeed, I use templates for one term; again, just as R2RML does.
>
> I am a little bit afraid of the potential complexity of that approach. The one-term-template is pretty straightforward both for the implementation and the user, is syntax independent and can be easily re-used for XML or JSON, too. The per-row-template seems to be syntax dependent and more complex though, clearly, much more powerful. I have to think about it...

I think it's really pretty simple; I implemented something similar for another project I'm doing. In Ruby, it takes advantage of the ability to use "gsub" and pass it a block:

    csv.each do |line|
      result = csvm.gsub(/"[^"]*\{[^"]*"/) { |match|
        match.gsub(/\{[^\}]*\}/) { |field_ref| ... }
      }
    end

In this case, because JSON uses braces in its basic syntax, I look for braces contained within double quotes; the examples Andy and I use for Turtle are consistent with this approach.

For the non-Ruby literate, it basically says: match anything including an opening curly brace ("{") surrounded by double quotes, and replace it with the result of the block/callback. Within each such match, the inner gsub looks for field references such as {...}.

Note that the field reference may contain some RFC6570 processing elements in addition to the variable/column name, but these should only be performed if we've determined that the column type is IRI.

Gregg

> Ivan
>
> On 19 May 2014, at 18:16 , Andy Seaborne <andy@apache.org> wrote:
>
>> On 19/05/14 15:23, Ivan Herman wrote:
>>> Let me try to see if I understand what you mean...
>>>
>>> If there is no metadata assigned to the data then (at least conceptually) we say that we generate a metadata of, roughly, the form:
>>>
>>> {
>>>   "@id" : "URI OF THE DATA",
>>>   "columns" : [{
>>>     "name" : "col1",
>>>     "template" : "{col1}"
>>>   },{
>>>     "name" : "col2",
>>>     "template" : "{col2}"
>>>   }]
>>> }
>>
>> Where we seem to differ is "template" - that's a template for one term (the object of a triple).
>>
>> The template I have in mind is a complete row:
>>
>> Taking from:
>>
>> https://github.com/w3c/csvw/blob/gh-pages/examples/simple-weather-observation.md
>>
>> Date-time, Air temperature (Cel), Dew-point temperature (Cel)
>> 2013-12-13T08:00:00Z, 11.2, 10.2
>>
>> <site/22580943/date-time/20131213T0800Z>
>>   a ssn:Observation ;
>>   ssn:observationSamplingTime
>>     [ time:inXSDDateTime "2013-12-13T08:00:00Z"^^xsd:dateTime ] ;
>>   ssn:observationResult [
>>     a ssn:SensorOutput ;
>>     def-op:airTemperature_C
>>       [ qudt:numericValue "11.2"^^xsd:double ] ;
>>     def-op:dewPointTemperature_C
>>       [ qudt:numericValue "10.2"^^xsd:double ] ] .
>>
>> That could be created with a template like:
>>
>> ----------------------------------------------
>> Columns:
>>
>> "columns" : [{
>>   "name" : "date-time"
>> },{
>>   "name" : "air-temperature"
>> },{
>>   "name" : "dew-point"
>> }]
>>
>> ----------------------------------------------
>> <site/22580943/date-time/{date-time}>
>>   a ssn:Observation ;
>>   ssn:observationSamplingTime
>>     [ time:inXSDDateTime "{date-time}"^^xsd:dateTime ] ;
>>   ssn:observationResult [
>>     a ssn:SensorOutput ;
>>     def-op:airTemperature_C
>>       [ qudt:numericValue "{air-temperature}"^^xsd:double ] ;
>>     def-op:dewPointTemperature_C
>>       [ qudt:numericValue "{dew-point}"^^xsd:double ] ] .
>> ----------------------------------------------
>>
>> skipping over the conversion of 2013-12-13T08:00:00Z to 20131213T0800Z
>>
>> Andy
>>
>>> And, by doing that, we have only one generation algorithm instead of two branches like in my document now.
>>>
>>> Yes, this works, I guess. It certainly makes the specification simpler and avoids getting out of sync. I am slightly worried that the end-user would be a bit screwed up, but that may have to go into a separate, tutorial-like text. So it may be worth doing it indeed...
>>>
>>> (Would need a rewrite of the text I produced, but that is probably relatively easy; just that I would not do it today or tomorrow...)
>>>
>>> Ivan
>>>
>>> On 19 May 2014, at 16:14 , Andy Seaborne <andy@apache.org> wrote:
>>>
>>>> On 19/05/14 15:00, Ivan Herman wrote:
>>>>>>> Generating a template, if none provided, would keep the user-template driven mechanism and metadata-generated template mechanism in-step. It would be clear that they aren't alternatives with (potentially) capabilities in the direct route not in the template route. You could get the generated template and tweak it, for example.
>>>>>>>
>>>>> I would need an example to understand what you mean...
>>>>>
>>>>
>>>> If the columns are "foo" and "bar" and no template is in the metadata then we define the process to be to create and use:
>>>>
>>>> -------------------------
>>>> [
>>>>   :foo "{foo}" .
>>>>   :bar "{bar}" .
>>>> ]
>>>> -------------------------
>>>>
>>>> Andy
>>>>
>>>
>>> ----
>>> Ivan Herman, W3C
>>> Digital Publishing Activity Lead
>>> Home: http://www.w3.org/People/Ivan/
>>> mobile: +31-641044153
>>> GPG: 0x343F1A3D
>>> WebID: http://www.ivan-herman.net/foaf#me
>
> ----
> Ivan Herman, W3C
> Digital Publishing Activity Lead
> Home: http://www.w3.org/People/Ivan/
> mobile: +31-641044153
> GPG: 0x343F1A3D
> WebID: http://www.ivan-herman.net/foaf#me
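
As a hedged illustration of the per-row template idea discussed in this thread, the sketch below fills in the nested "gsub" pattern from the reply at the top of the message. It is only a sketch under stated assumptions: `expand_row`, `default_template`, `TEMPLATE`, and the sample row are hypothetical names and data, not anything taken from the working group's drafts. Field references that name no known column are left untouched, RFC6570 operators are not handled, and the generated default template uses ";" separators, which is an adjustment to the bracketed foo/bar form quoted in the thread.

```ruby
# Hypothetical per-row template. Only references inside double quotes are
# used here, because the pattern below only looks inside quoted strings.
TEMPLATE = <<~'TURTLE'
  [ a ssn:Observation ;
    ssn:observationSamplingTime [ time:inXSDDateTime "{date-time}"^^xsd:dateTime ] ;
    def-op:airTemperature_C [ qudt:numericValue "{air-temperature}"^^xsd:double ] ;
    def-op:dewPointTemperature_C [ qudt:numericValue "{dew-point}"^^xsd:double ] ] .
TURTLE

# Hypothetical helper: expand one template against one row (a Hash of
# column name => cell value).
def expand_row(template, row)
  # Outer gsub: find double-quoted strings that contain an opening brace.
  template.gsub(/"[^"]*\{[^"]*"/) do |quoted|
    # Inner gsub: replace each {name} reference inside that quoted string.
    quoted.gsub(/\{[^\}]*\}/) do |field_ref|
      name = field_ref[1..-2] # strip the surrounding braces
      # Assumption: references to unknown columns are passed through as-is.
      row.key?(name) ? row[name].to_s : field_ref
    end
  end
end

# Hypothetical helper: build a default template when the metadata has none,
# one property per column, in the spirit of the foo/bar example.
def default_template(columns)
  body = columns.map { |c| "  :#{c} \"{#{c}}\"" }.join(" ;\n")
  "[\n#{body}\n] .\n"
end

# Rows are plain hashes here; in practice they would come from a CSV parser.
rows = [
  { "date-time" => "2013-12-13T08:00:00Z",
    "air-temperature" => "11.2",
    "dew-point" => "10.2" }
]

rows.each { |row| puts expand_row(TEMPLATE, row) }
puts expand_row(default_template(%w[foo bar]), { "foo" => "1", "bar" => "2" })
```

Because the default template is produced as ordinary template text, it goes through the same `expand_row` path as a user-supplied one, which is the "one generation algorithm" point made in the quoted discussion.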
Received on Wednesday, 21 May 2014 23:48:53 UTC