Re: Reflection on the special telco of CSVW

From: Ivan Herman <ivan@w3.org>
Date: Sun, 14 Sep 2014 09:39:19 +0200
Cc: Dan Brickley <danbri@google.com>, W3C CSV on the Web Working Group <public-csv-wg@w3.org>, Axel Polleres <axel.polleres@wu.ac.at>
Message-Id: <E5C8F9CB-376D-45DE-AAE4-7D1B65D1DF6C@w3.org>
To: Andy Seaborne <andy@apache.org>

On 13 Sep 2014, at 21:46 , Andy Seaborne <andy@apache.org> wrote:

> On 13/09/14 17:09, Ivan Herman wrote:
>> 
>> On 12 Sep 2014, at 18:52 , Andy Seaborne <andy@apache.org> wrote:
>> 
>>> On 10/09/14 13:31, Ivan Herman wrote:
>>>> The template should/must be language independent. It should be
>>>> simple, i.e., features such as lowercasing are, in my view, a "no-no".
>>> 
>>> That makes me very uncertain.  A bit of string manipulation to
>>> generate URIs seems quite common.
>>> 
>>> (i.e. not everything can be pushed to manipulation after the
>>> template/generation step in practice).
>>> 
>>> e.g. the pre-processing step --
>>> 
>>> https://github.com/w3c/csvw/blob/gh-pages/examples/simple-weather-observation.md
>>> 
>>> 
>>> 
>>> 
> Is that in or out by alternative 2.5?

I do not see the problem. Looking at the JSON-LD 'mapping frame', it would be used almost verbatim, except for the syntax I used; it would be

"def-op:airTemperature_C": { "value": "{{Air temperature (Cel)}}" },

Am I missing something?


>>> 
>>> W3C does have a prog-lang-neutral solution for common string
>>> manipulation in a subset of the XQuery/XPath Functions and
>>> Operators (as used in SPARQL and RIF).  It is reasonably close to
>>> what programming languages provide.
>>> 
>> 
>> Hm. You indeed have a point, I must admit, but I am scared of
>> feature-creep. I.e., to bring in extra complexity.
>> 
>> What I see as a simple(?) extension to the current idea is a series
>> of filters, something of the sort
>> 
>> {{Name.filter1.filter2...}}
>> 
>> that may execute a number of filters on the value of {{Name}}. Each
>> filter is a function that, informally, gets the value of {{Name}} as
>> an input argument, and also has access to the CSV's metadata
>> (and, probably, the format string of the target, ie, whether it is
>> JSON, Turtle, XML, or whatever that the templates are used for).
> 
> We could decouple the template from function calling by having a process
> 
> CSV -> Data table
> data table -> extra columns data table2
>    -- that is add .filter1.filter1 as a named slot.
> data table2 -> format X.
> Clearup X
> 
> I don't think this is the best way to proceed because multiple stages are
> a rather idealistic viewpoint.  Practically, one file is better.  But
> I thought I'd mention it.

From a user's point of view this may be confusing. Or is this just a conceptual/implementation strategy? 
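
For concreteness, the staged process quoted above might look roughly like the following. This is only a minimal sketch under assumptions of my own: the function names (read_table, add_slots, render), the slot name "Name.lcase" and the sample data are all hypothetical, and nothing here is a proposal for the actual metadata syntax. It only illustrates that the template step can stay a plain substitution if the function calls happen in an earlier stage.

```python
import csv
import io


def read_table(text):
    """Stage 1: CSV -> data table (a list of dict rows)."""
    return list(csv.DictReader(io.StringIO(text)))


def add_slots(table, slots):
    """Stage 2: data table -> data table 2, with extra named columns.

    `slots` maps a new column name to (source column, list of functions).
    """
    out = []
    for row in table:
        extended = dict(row)
        for slot, (column, funcs) in slots.items():
            value = row[column]
            for func in funcs:
                value = func(value)
            extended[slot] = value
        out.append(extended)
    return out


def render(table, template):
    """Stage 3: data table 2 -> format X, by plain substitution only."""
    lines = []
    for row in table:
        line = template
        for name, value in row.items():
            line = line.replace("{{" + name + "}}", value)
        lines.append(line)
    return lines


# Hypothetical sample data, not taken from the weather example.
table = read_table("Name,Temp\nOslo,12.5\n")
table2 = add_slots(table, {"Name.lcase": ("Name", [str.lower])})
result = render(table2, '<{{Name.lcase}}> temp "{{Temp}}" .')
```

Seen this way, "{{Name.lcase}}" is just another column name by the time the template is applied, which is why the question above is whether the user would ever see the stages at all.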

> 
>> Looking, e.g., on the SPARQL definition I see uri, uri_encode,
>> strlen, ucase, lcase, int, float, ceil, floor as obvious candidates.
>> It may make sense to add functions that ignore the value altogether
>> (uuid, rand). I am not sure whether the hash functions (md5, sha1,
>> sha256, etc) make sense, maybe yes.
>> 
>> I am unsure about the time functions (year, month, now, etc). The
>> problem is that these functions may have to rely on some format
>> string, and it seems to be a bit complicated to provide that as part
>> of the metadata.
>> 
>> I presume we can easily go one step further and say that if a
>> "filter" is not part of the predefined list, that is implementation
>> dependent.
> 
> Yes, that is a possibility.
> 
> A few string related functions, especially replace(,), could be called
> out as MUST or SHOULD be provided.

See my comment below.


> 
>> 
>> If we go beyond that, then we do get into a complicated situation,
>> both specification- and implementation-wise. We are then back to
>> Alternative 1.
> 
> I don't quite follow "beyond that" - doesn't the complexity of alt-1 come from template flow control ({{#if ...}} etc.), with a function call on a cell being orthogonal to that?
> 

The complexity of alt-1 comes from, e.g., flow control. But not only. If we allow for, say, a replace, we would need a syntax (and processing) for extra parameters. One extra step may be something like

{{name.filter1("regexp").filter2}}

which may still work (and, of course, it may be used for other purposes, too, like date formats), but would we go again one step further and say something like

{{name.filter1({{name2}}).filter2}}

ie, having some sort of a recursive macro? We are heading, in my view, back to Alternative 1, ie, a complex language to specify, implement, test, etc. That is what I called 'floodgates'...
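
To illustrate where the line might be drawn, here is a minimal sketch of the middle option: chained filters with at most one literal string argument, and no nested {{...}} inside an argument. Everything in it is an assumption made for illustration only: the filter table, the rule that a column name contains no ".", the fixed "_" replacement in replace, and the function name expand. The filter names loosely follow the SPARQL functions mentioned earlier in the thread, but this is not a proposed syntax.

```python
import re
from urllib.parse import quote

# Hypothetical filter table; each filter takes the cell value (a string)
# and, optionally, one literal string argument.
FILTERS = {
    "ucase": lambda v: v.upper(),
    "lcase": lambda v: v.lower(),
    "strlen": lambda v: str(len(v)),
    "uri_encode": lambda v: quote(v, safe=""),
    # Assumption: every match of the regexp is replaced by "_".
    "replace": lambda v, pattern: re.sub(pattern, "_", v),
}

PLACEHOLDER = re.compile(r"\{\{(.+?)\}\}")
# One filter step: .name or .name("literal")
STEP = re.compile(r'\.([A-Za-z_]\w*)(?:\("([^"]*)"\))?')


def expand(template, row):
    """Expand {{Name.filter1("arg").filter2}} placeholders for one row."""

    def substitute(match):
        expr = match.group(1)
        # Assumption: the column name is everything before the first ".".
        name, dot, rest = expr.partition(".")
        value = row[name]
        rest = dot + rest
        pos = 0
        while pos < len(rest):
            step = STEP.match(rest, pos)
            if step is None:
                raise ValueError(f"cannot parse filter chain: {expr!r}")
            func = FILTERS[step.group(1)]
            arg = step.group(2)
            value = func(value) if arg is None else func(value, arg)
            pos = step.end()
        return value

    return PLACEHOLDER.sub(substitute, template)


row = {"Name": "Hello World", "Air temperature (Cel)": "12.5"}
slug = expand(r'{{Name.replace("\s").lcase}}', row)
plain = expand('{ "value": "{{Air temperature (Cel)}}" }', row)
```

Even this small step already needs a tokenizer for the chain and a rule for quoting the argument; allowing {{name2}} as an argument would turn the loop into a recursive evaluator, which is the point made above.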

Ivan



> 	Andy
> 
>> 
>> Ivan
>> 
>> 
>>> 
>>> Andy
>>> 
>> 
>> 
>> ---- Ivan Herman, W3C Digital Publishing Activity Lead Home:
>> http://www.w3.org/People/Ivan/ mobile: +31-641044153 GPG: 0x343F1A3D
>> WebID: http://www.ivan-herman.net/foaf#me
>> 
>> 
>> 
>> 
>> 
> 
> 


----
Ivan Herman, W3C 
Digital Publishing Activity Lead
Home: http://www.w3.org/People/Ivan/
mobile: +31-641044153
GPG: 0x343F1A3D
WebID: http://www.ivan-herman.net/foaf#me






Received on Sunday, 14 September 2014 07:40:03 UTC
