Re: Reflection on the special telco of CSVW

On Sep 13, 2014, at 5:09 PM, Ivan Herman <ivan@w3.org> wrote:

> 
> On 12 Sep 2014, at 18:52 , Andy Seaborne <andy@apache.org> wrote:
> 
>> On 10/09/14 13:31, Ivan Herman wrote:
>>> The template should/must be language independent. It should be simple, i.e., features such as lowercasing are, in my view, a "no-no".
>> 
>> That makes me very uncertain.  A bit of string manipulation to generate URIs seems quite common.
>> 
>> (i.e. not everything can be pushed to manipulation after the template/generation step in practice).
>> 
>> e.g. the pre-processing step --
>> 
>> https://github.com/w3c/csvw/blob/gh-pages/examples/simple-weather-observation.md
>> 
>> Is that in or out by alternative 2.5?
>> 
>> W3C does have a prog-lang-neutral solution for common string manipulation in a subset of the XQuery/XPath Functions and Operators (as used in SPARQL and RIF).  It is reasonably close to what programming languages provide.
>> 
> 
> Hm. You indeed have a point, I must admit, but I am scared of feature-creep. I.e., to bring in extra complexity.
> 
> What I see as a simple(?) extension to the current idea is a series of filters, something of the sort
> 
> {{Name.filter1.filter2…}}

Such chaining works in many languages where String is a complete class on which methods such as filter1 and filter2 may be defined (Ruby, for example), but not all languages allow this natively. Still, chaining is a recognized pattern that is reasonable to consider; certainly, if the values are parsed, they could turn into something like (filter2 (filter1 Name)).
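
To make that reading concrete, here is a minimal sketch in Python of how a parsed {{Name.filter1.filter2}} could be evaluated as nested function calls without relying on method chaining. The registry, the filter names (borrowed from the SPARQL-style candidates in this thread) and the expand function are all hypothetical, not anything the group has agreed on. It also assumes column names contain no dots.

```python
import re
from functools import reduce

# Hypothetical filter registry; nothing here is defined by the CSVW drafts.
FILTERS = {
    "lcase": str.lower,
    "ucase": str.upper,
    "strlen": lambda s: str(len(s)),
}

# Matches {{Name}} or {{Name.filter1.filter2}}.
TEMPLATE_RE = re.compile(r"\{\{([^{}]+)\}\}")

def expand(template, row):
    """Replace each {{Name.f1.f2}} with f2(f1(row["Name"]))."""
    def substitute(match):
        # Assumes column names contain no dots.
        name, *filter_names = match.group(1).split(".")
        # Apply the filters left to right, starting from the cell value.
        return reduce(lambda value, f: FILTERS[f](value), filter_names, row[name])
    return TEMPLATE_RE.sub(substitute, template)
```

For example, expand("http://example.org/{{Name.lcase}}", {"Name": "Oslo"}) applies lcase to the cell value before substitution. An unknown filter name raises KeyError in this sketch; a real processor would need a defined behaviour for that case.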

> that may execute a number of filters on the value of {{Name}}. Each filter is a function that, informally, gets the value of {{Name}} as an input argument, and also has access to the CSV's metadata (and, probably, the format string of the target, i.e., whether it is JSON, Turtle, XML, or whatever the templates are used for). Looking, e.g., at the SPARQL definition, I see uri, uri_encode, strlen, ucase, lcase, int, float, ceil, floor as obvious candidates. It may make sense to add functions that ignore the value altogether (uuid, rand). I am not sure whether the hash functions (md5, sha1, sha256, etc.) make sense; maybe yes.

+1

> I am unsure about the time functions (year, month, now, etc). The problem is that these functions may have to rely on some format string, and it seems to be a bit complicated to provide that as part of the metadata.

Hard to say how to get around the need for a format string though, given the different use cases we’ve seen.
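
A small sketch of why the format string matters, in Python. The year and month filters and their fmt argument are hypothetical, and the strptime pattern syntax is a Python-specific assumption; the metadata might well use a different pattern language.

```python
from datetime import datetime

def year(value, fmt):
    """Hypothetical 'year' filter; fmt would have to come from the metadata."""
    return str(datetime.strptime(value, fmt).year)

def month(value, fmt):
    """Hypothetical 'month' filter; the result depends entirely on fmt."""
    return str(datetime.strptime(value, fmt).month)
```

The same cell value "01/02/2014" gives month 2 when read as day/month/year ("%d/%m/%Y") and month 1 when read as month/day/year ("%m/%d/%Y"), so the filter cannot guess without being told.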

> I presume we can easily go one step further and say that if a "filter" is not part of the predefined list, that is implementation dependent.

+1 like a SPARQL extension function? Maybe namespaced?
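
Roughly what I have in mind, as a sketch only: predefined filters keep bare names, and anything else is looked up by a full IRI, much as SPARQL identifies extension functions by IRI. The names PREDEFINED, EXTENSIONS, register_extension and resolve are hypothetical, and the example IRI is made up.

```python
# Hypothetical: the predefined list would be fixed by the spec.
PREDEFINED = {"lcase": str.lower, "ucase": str.upper}

# Implementation-dependent extensions, keyed by IRI.
EXTENSIONS = {}

def register_extension(iri, func):
    """Register an extension filter under an IRI-like name."""
    if ":" not in iri:
        raise ValueError("extension filters must be named by an IRI")
    EXTENSIONS[iri] = func

def resolve(name):
    """Look up a filter: predefined names first, then registered extensions."""
    if name in PREDEFINED:
        return PREDEFINED[name]
    if name in EXTENSIONS:
        return EXTENSIONS[name]
    raise LookupError(f"unknown filter: {name}")
```

An implementation could then call register_extension("http://example.org/fn#slug", lambda s: s.replace(" ", "-")) and resolve it by that IRI, while a processor that does not know the IRI fails in a predictable way.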

Gregg

> If we go beyond that, then we do get into a complicated situation, both specification- and implementation-wise. We are then back to Alternative 1.
> 
> Ivan
> 
> 
>> 
>> 	Andy
>> 
> 
> 
> ----
> Ivan Herman, W3C 
> Digital Publishing Activity Lead
> Home: http://www.w3.org/People/Ivan/
> mobile: +31-641044153
> GPG: 0x343F1A3D
> WebID: http://www.ivan-herman.net/foaf#me

Received on Saturday, 13 September 2014 20:14:47 UTC