Re: R-CellValueMicroSyntax

On May 8, 2014, at 1:26 AM, Ivan Herman <ivan@w3.org> wrote:
> 
> Hi Jeni,
> 
>> On 07 May 2014, at 17:59 , Jeni Tennison <jeni@jenitennison.com> wrote:
>> Hi Ivan,
>> 
>> From: Ivan Herman ivan@w3.org Date: 1 May 2014 at 08:00:05
>>>> On 30 Apr 2014, at 19:57 , Jeni Tennison wrote:
>>>> See http://w3c.github.io/csvw/use-cases-and-requirements/#R-CellValueMicroSyntax  
>>>> 
>>>> I’d like to have a quick discussion about this requirement because I think it’s covering  
>>>> a wide range of things which we might take different positions on when considering whether  
>>>> they’re in scope.
>>>> 
>>>> The use cases show four types of microsyntax:
>>>> 
>>>> 1. various date/time syntaxes (not just ISO-8601 ones)
>>>> 2. comma-separated lists of editors within fields in UC-JournalArticleSearch
>>>> 3. embedded structured data (eg XML (VML) in UC-PaloAltoTreeData)
>>>> 4. semi-structured text in UC-PaloAltoTreeData
>>>> 
>>>> And I can see four things you might want to do with them:
>>>> 
>>>> A. document the microsyntax so that humans can understand what it’s conveying
>>>> B. validate the values to make sure they conform to the microsyntax you expect
>>>> C. label the value as being in a particular microsyntax when converting into JSON/XML/RDF  
>>>> (eg marking an XML value as an XMLLiteral)
>>>> D. process the microsyntax into an appropriate data structure when converting into  
>>>> JSON/XML/RDF (eg mapping the XML value into an appropriate JSON object)
>>>> 
>>>> I want to suggest that:
>>>> 
>>>> * We should mark as Deferred the intersection of 3 & D — we shouldn’t expect CSV processors  
>>>> to be able to take values that are XML and convert them into RDF or into JSON.
>>>> 
>>>> * We should mark as Deferred the intersection of 4 & D — similarly, we shouldn’t expect  
>>>> CSV processors to be able to take arbitrary semi-structured text and convert it into  
>>>> XML/JSON/RDF.
>>> 
>>> I agree with both.
> 
> [snip]
>> 
>>> I am also not sure what 2+'B' means. Do you mean we should have some sort of a 'schema' like  
>>> description on the structure of a particular microsyntax when converting the CSV file  
>>> into the Data Model? Ie, that the microsyntax should be a number, followed by a date, followed  
>>> by something else? I am tempted to push this into a Deferred category as well, ie, the conversion  
>>> into the Data Model should be opaque with a possible human readable description.
>> 
>> I think that the vast majority of microsyntaxes are sufficiently describable using regexps for validation purposes, so I don’t think validation is ever a real issue.
> 
> I was not thinking in terms of regular expressions; I think the metadata document was not yet up when I read this mail (or I simply missed it). I agree with your expectation, by the way.
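
To make the regexp point concrete, here is a minimal sketch of validating one column against a microsyntax pattern. The names `ColumnPattern` and `invalidRows`, the column name "planted", and the day/month/year pattern are all assumptions made up for illustration; none of them come from the metadata document.

```typescript
// Hypothetical column description: a name plus the regular expression
// that every cell in that column is expected to match.
interface ColumnPattern {
  name: string;
  pattern: RegExp;
}

// Returns the indices of the cells that do not match the column's pattern.
function invalidRows(cells: string[], column: ColumnPattern): number[] {
  const failures: number[] = [];
  cells.forEach((cell, index) => {
    if (!column.pattern.test(cell)) {
      failures.push(index);
    }
  });
  return failures;
}

// Assumed microsyntax: a date written as two digits, two digits, four digits.
const dateColumn: ColumnPattern = {
  name: "planted",
  pattern: /^\d{2}\/\d{2}\/\d{4}$/,
};

// The second value uses a different date syntax, so it is reported.
const bad = invalidRows(["08/05/2014", "2014-05-08", "30/04/2014"], dateColumn);
console.log(bad);
```

This only checks the shape of the value (option B in the list above); it says nothing about turning the value into a data structure (option D).
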
> 
> 
>>> If I put a possible implementer's hat on, I would probably implement the conversion into  
>>> a Data Model (I actually did something like that in node.js as a JavaScript-learning  
>>> exercise recently) by letting the user add a callback function on  
>>> cells to make whatever conversion is possible. I wonder whether this should remain an  
>>> implementation-specific trick or something we would describe in the conversion process.  
>> 
>> Yes, definitely the processing of microsyntaxes that aren’t described through the declarative metadata should be handled through extension mechanisms in implementations. I’m hoping that the conversion processes will all be able to ‘bug out’ to something sufficiently powerful that that parsing is possible in a standard way rather than being implementation dependent.
> 
> Would we want to formalize this 'callback' aspect in our document or simply leave it to the imagination of our implementers? Probably the latter, but I am not 100% sure.

JSON-LD started out using callbacks but, on strong recommendations from the API community, changed to using Promises; I think this is the norm for W3C WebIDL definitions now.
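
As a rough illustration of what a Promise-based hook on cells might look like, here is a minimal sketch. It is not taken from JSON-LD or from any CSVW draft: `CellConverter`, `Converters`, `splitEditors`, `convertRow`, and the sample row are hypothetical names and data invented for the example.

```typescript
// A converter receives the raw cell text and resolves to the converted value.
type CellConverter = (raw: string) => Promise<unknown>;

// Converters are registered per column name; most columns will have none.
type Converters = { [column: string]: CellConverter | undefined };

// Hypothetical converter for a comma-separated list of editors.
const splitEditors: CellConverter = (raw) =>
  Promise.resolve(
    raw
      .split(",")
      .map((name) => name.trim())
      .filter((name) => name.length > 0),
  );

// Converts one row. Cells without a registered converter stay opaque strings.
async function convertRow(
  row: { [column: string]: string },
  converters: Converters,
): Promise<{ [column: string]: unknown }> {
  const result: { [column: string]: unknown } = {};
  for (const column of Object.keys(row)) {
    const convert = converters[column];
    const raw = row[column] ?? "";
    result[column] = convert ? await convert(raw) : raw;
  }
  return result;
}

// Made-up sample row; only the "editors" column has a converter.
const converted = convertRow(
  { title: "An Example", editors: "A. Author, B. Writer" },
  { editors: splitEditors },
);
converted.then((row) => console.log(row));
```

Returning a Promise rather than taking a completion callback lets a converter do asynchronous work (a lookup, say) without the caller needing a different code path.
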

Gregg

> Ivan
> 
> 
>> Cheers,
>> 
>> Jeni
>> --  
>> Jeni Tennison
>> http://www.jenitennison.com/
> 
> 
> ----
> Ivan Herman, W3C 
> Digital Publishing Activity Lead
> Home: http://www.w3.org/People/Ivan/
> mobile: +31-641044153
> GPG: 0x343F1A3D
> WebID: http://www.ivan-herman.net/foaf#me
> 

Received on Thursday, 8 May 2014 20:43:54 UTC