Re: Reflection on the special telco of CSVW

On 13/09/14 17:09, Ivan Herman wrote:
>
> On 12 Sep 2014, at 18:52 , Andy Seaborne <andy@apache.org> wrote:
>
>> On 10/09/14 13:31, Ivan Herman wrote:
>>> The template should/must be language independent. Should be
>>> simple, ie, such features as lowercase is, in my view a "no-no"
>>
>> That makes me very uncertain.  A bit of string manipulation to
>> generate URIs seems quite common.
>>
>> (i.e. not everything can be pushed to manipulation after the
>> template/generation step in practice).
>>
>> e.g. the pre-processing step --
>>
>> https://github.com/w3c/csvw/blob/gh-pages/examples/simple-weather-observation.md
>>
>> Is that in or out by alternative 2.5?
>>
>> W3C does have a prog-lang-neutral solution for common string
>> manipulation in a subset of the XQuery/XPath Functions and
>> Operators (as used in SPARQL and RIF).  It is reasonably close to
>> what programming languages provide.
>>
>
> Hm. You indeed have a point, I must admit, but I am scared of
> feature-creep. I.e., to bring in extra complexity.
>
> What I see as a simple(?) extension to the current idea is a series
> of filters, something of the sort
>
> {{Name.filter1.filter2...}}
>
> that may execute a number of filters on the value of {{Name}}. Each
> filter is a function that, informally, gets the value of {{Name}} as
> an input argument, and also has access to the CSV's metadata
> (and, probably, the format string of the target, ie, whether it is
> JSON, Turtle, XML, or whatever that the templates are used for).

We could decouple the template from function calling by having a process

CSV -> Data table
data table -> extra columns data table2
     -- that is, add .filter1.filter2 as a named slot.
data table2 -> format X.
Clean up X

I don't think this is the best way to proceed, because splitting the
work into multiple stages is a rather idealistic viewpoint.  In
practice, keeping everything in one file is better.  But I thought I'd
mention it.
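
For illustration only, the staged process above could look roughly like
this in Python.  Everything here is an assumption made for the sketch:
the function names (read_table, add_columns, render), the sample CSV,
the example.org URI and the idea of naming the extra slot "Name.lcase"
are all hypothetical, not anything proposed for the spec.

```python
import csv
import io


def read_table(text):
    """Stage 1: CSV text -> data table (a list of row dicts)."""
    return list(csv.DictReader(io.StringIO(text)))


def add_columns(table, derived):
    """Stage 2: data table -> data table2 with extra named slots.

    `derived` maps a new slot name to (source column, function).
    """
    out = []
    for row in table:
        new_row = dict(row)
        for slot, (source, func) in derived.items():
            new_row[slot] = func(row[source])
        out.append(new_row)
    return out


def render(table, template):
    """Stage 3: data table2 -> format X by plain slot substitution."""
    lines = []
    for row in table:
        line = template
        for name, value in row.items():
            line = line.replace("{{" + name + "}}", value)
        lines.append(line)
    return lines


# Hypothetical sample data; the template itself calls no functions.
CSV_TEXT = "Name,Temp\nOxford,12\nLeiden,14\n"
table2 = add_columns(read_table(CSV_TEXT), {"Name.lcase": ("Name", str.lower)})
output = render(table2, "<http://example.org/{{Name.lcase}}> :temp {{Temp}} .")
```

The point of the sketch is only that the template stage stays a pure
substitution; all string manipulation happens in the stage before it.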

> Looking, e.g., on the SPARQL definition I see uri, uri_encode,
> strlen, ucase, lcase, int, float, ceil, floor as obvious candidates.
> It may make sense to add functions that ignore the value altogether
> (uuid, rand). I am not sure whether the hash functions (md5, sha1,
> sha256, etc) make sense, maybe yes.
>
> I am unsure about the time functions (year, month, now, etc). The
> problem is that these functions may have to rely on some format
> string, and it seems to be a bit complicated to provide that as part
> of the metadata.
>
> I presume we can easily go one step further and say that if a
> "filter" is not part of the predefined list, that is implementation
> dependent.

Yes, that is a possibility.

A few string related functions, especially replace(,), could be called
out as MUST or SHOULD be provided.
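
To make the filter idea concrete, here is a minimal sketch in Python of
how {{Name.filter1.filter2}} might be expanded.  It is an assumption,
not a proposal: the filter names follow Ivan's list (lcase, ucase,
strlen, uri_encode), the expand() function and the extra_filters
argument are hypothetical, and it assumes column names contain no dots.

```python
import re
from urllib.parse import quote

# Predefined filters (names taken from the list quoted in this thread).
FILTERS = {
    "lcase": str.lower,
    "ucase": str.upper,
    "strlen": lambda v: str(len(v)),
    "uri_encode": lambda v: quote(v, safe=""),
}

# Matches {{Name}} or {{Name.filter1.filter2}}.
SLOT = re.compile(r"\{\{([^{}]+)\}\}")


def expand(template, row, extra_filters=None):
    """Replace each slot with the cell value after applying its filters.

    `extra_filters` stands in for implementation-dependent filters that
    are not on the predefined list.  An unknown filter raises KeyError.
    """
    filters = dict(FILTERS)
    filters.update(extra_filters or {})

    def substitute(match):
        name, *chain = match.group(1).split(".")
        value = row[name]
        for filter_name in chain:
            value = filters[filter_name](value)
        return value

    return SLOT.sub(substitute, template)


row = {"Name": "New York"}
result = expand("http://example.org/{{Name.lcase.uri_encode}}", row)
```

Filters apply left to right, so the value is lower-cased before it is
percent-encoded.  A function such as replace(,) takes arguments and
would need a syntax this sketch does not cover.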

>
> If we go beyond that, then we do get into a complicated situation,
> both specification and implementation wise. We are then back to
> Alternative 1.

I don't quite follow "beyond that".  Doesn't the complexity of alt-1
come from template flow control ({{#if ...}} etc.)?  A function call on
a cell is orthogonal to that.

	Andy

>
> Ivan
>
>
>>
>> Andy
>>
>
>
> ---- Ivan Herman, W3C Digital Publishing Activity Lead Home:
> http://www.w3.org/People/Ivan/ mobile: +31-641044153 GPG: 0x343F1A3D
>  WebID: http://www.ivan-herman.net/foaf#me
>
>
>
>
>

Received on Saturday, 13 September 2014 19:46:40 UTC