Re: Reflection on the special telco of CSVW

On 12 Sep 2014, at 18:52 , Andy Seaborne <andy@apache.org> wrote:

> On 10/09/14 13:31, Ivan Herman wrote:
>> The template should/must be language independent. It should be simple, i.e., features such as lowercasing are, in my view, a "no-no".
> 
> That makes me very uncertain.  A bit of string manipulation to generate URIs seems quite common.
> 
> (i.e. not everything can be pushed to manipulation after the template/generation step in practice).
> 
> e.g. the pre-processing step --
> 
> https://github.com/w3c/csvw/blob/gh-pages/examples/simple-weather-observation.md
> 
> Is that in or out by alternative 2.5?
> 
> W3C does have a prog-lang-neutral solution for common string manipulation in a subset of the XQuery/XPath Functions and Operators (as used in SPARQL and RIF).  It is reasonably close to what programming languages provide.
> 

Hm. You do have a point, I must admit, but I am wary of feature creep, i.e., of bringing in extra complexity.

What I see as a simple(?) extension to the current idea is a series of filters, something like

{{Name.filter1.filter2...}}

that would apply a number of filters to the value of {{Name}}. Each filter is a function that, informally, gets the value of {{Name}} as an input argument and also has access to the CSV's metadata (and, probably, the format of the target, i.e., whether the templates are being used for JSON, Turtle, XML, or something else). Looking, e.g., at the SPARQL definition, I see uri, uri_encode, strlen, ucase, lcase, int, float, ceil, floor as obvious candidates. It may make sense to add functions that ignore the value altogether (uuid, rand). I am not sure whether the hash functions (md5, sha1, sha256, etc.) make sense; maybe they do.
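
As a rough illustration of the filter idea, here is a minimal Python sketch. It is not a proposed implementation: the function name expand, the FILTERS table, the left-to-right application order, and the assumption that column names contain no "." are all hypothetical choices made for the example. Only a few of the candidate filters are shown.

```python
import re
from urllib.parse import quote

# Hypothetical filter table. The names follow some of the candidates listed
# above; each filter gets the current value and the CSV metadata.
FILTERS = {
    "lcase": lambda value, metadata: value.lower(),
    "ucase": lambda value, metadata: value.upper(),
    "strlen": lambda value, metadata: str(len(value)),
    "uri_encode": lambda value, metadata: quote(value, safe=""),
}

# Matches {{Name}} or {{Name.filter1.filter2...}}.
PLACEHOLDER = re.compile(r"\{\{([^{}]+)\}\}")


def expand(template, row, metadata=None):
    """Replace each placeholder with the (filtered) cell value from row."""

    def substitute(match):
        # Assumption: column names contain no ".", so the first part is the
        # column name and the remaining parts are filter names.
        name, *filter_names = match.group(1).split(".")
        value = row[name]
        for filter_name in filter_names:  # applied left to right (assumed)
            value = FILTERS[filter_name](value, metadata)
        return value

    return PLACEHOLDER.sub(substitute, template)


uri = expand("http://example.org/{{Name.lcase.uri_encode}}", {"Name": "De Bilt"})
```

With the row above, uri comes out as http://example.org/de%20bilt. A filter name missing from the table simply raises KeyError in this sketch.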

I am unsure about the time functions (year, month, now, etc). The problem is that these functions may have to rely on some format string, and it seems to be a bit complicated to provide that as part of the metadata.
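
To make the format-string dependency concrete, here is a small Python sketch of a hypothetical year filter. The metadata key "datetime_format" is invented for the example and is not part of any CSVW draft. The point is only that the filter cannot interpret the cell without it.

```python
from datetime import datetime


def year(value, metadata):
    """Hypothetical 'year' filter: needs a format string to parse the cell."""
    # "datetime_format" is an assumed metadata key, used here for illustration.
    fmt = metadata["datetime_format"]
    return str(datetime.strptime(value, fmt).year)


# The same cell text is only interpretable once the format is known.
result = year("13/09/2014", {"datetime_format": "%d/%m/%Y"})
```

Here result is "2014", but without the format string there is no way to tell whether "13/09/2014" is day/month/year or something else.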

I presume we can easily go one step further and say that if a "filter" is not part of the predefined list, its behavior is implementation dependent.
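
One way this could look in an implementation, sketched in Python under stated assumptions: the PREDEFINED table, the extensions argument, and the choice to pass the value through unchanged for a completely unknown filter are all hypothetical. Another implementation might just as well raise an error.

```python
# Hypothetical predefined list (only two entries shown).
PREDEFINED = {"lcase": str.lower, "ucase": str.upper}


def apply_filters(value, filter_names, extensions=None):
    """Apply filters in order; filters outside the predefined list are
    handled in an implementation-dependent way."""
    extensions = extensions or {}
    for name in filter_names:
        if name in PREDEFINED:
            value = PREDEFINED[name](value)
        elif name in extensions:
            # Implementation-specific filter supplied by the caller.
            value = extensions[name](value)
        # Otherwise: unknown filter, value left unchanged. This is one
        # possible implementation-dependent choice, assumed for the sketch.
    return value


slug = apply_filters("De Bilt", ["lcase", "slug"],
                     {"slug": lambda v: v.replace(" ", "-")})
```

In this sketch slug is "de-bilt", and apply_filters("De Bilt", ["nosuchfilter"]) returns the value untouched.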

If we go beyond that, then we do get into a complicated situation, both specification- and implementation-wise. We would then be back to Alternative 1.

Ivan


> 
> 	Andy
> 


----
Ivan Herman, W3C 
Digital Publishing Activity Lead
Home: http://www.w3.org/People/Ivan/
mobile: +31-641044153
GPG: 0x343F1A3D
WebID: http://www.ivan-herman.net/foaf#me

Received on Saturday, 13 September 2014 16:10:27 UTC