R-CellValueMicroSyntax

See http://w3c.github.io/csvw/use-cases-and-requirements/#R-CellValueMicroSyntax

I’d like to have a quick discussion about this requirement because I think it covers a wide range of things, and we might take different positions on whether each of them is in scope.

The use cases show four types of microsyntax:

  1. various date/time syntaxes (not just ISO-8601 ones)
  2. comma-separated lists of editors within fields in UC-JournalArticleSearch
  3. embedded structured data (eg XML (VML) in UC-PaloAltoTreeData)
  4. semi-structured text in UC-PaloAltoTreeData

And I can see four things you might want to do with them:

  A. document the microsyntax so that humans can understand what it’s conveying
  B. validate the values to make sure they conform to the microsyntax you expect
  C. label the value as being in a particular microsyntax when converting into JSON/XML/RDF (eg marking an XML value as an XMLLiteral)
  D. process the microsyntax into an appropriate data structure when converting into JSON/XML/RDF (eg mapping the XML value into an appropriate JSON object)
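
To make the difference between B and D concrete, here is a rough Python sketch for the first kind of microsyntax (a date that isn't in ISO-8601 form). The day/month/year format string and the function names are assumptions I've made up for illustration; nothing in the use cases prescribes them.

```python
from datetime import date, datetime

# Hypothetical date microsyntax for a column: day/month/year.
DATE_FORMAT = "%d/%m/%Y"


def is_valid_date(value: str, fmt: str = DATE_FORMAT) -> bool:
    """B: check that the string conforms to the expected microsyntax."""
    try:
        datetime.strptime(value, fmt)
        return True
    except ValueError:
        return False


def parse_date(value: str, fmt: str = DATE_FORMAT) -> date:
    """D: turn the string into a proper date value for the output format."""
    return datetime.strptime(value, fmt).date()
```

B only answers yes or no and leaves the string alone; D produces a new value that a JSON/XML/RDF conversion could then serialise however it likes.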

I want to suggest that:

* We should mark as Deferred the intersection of 3 & D — we shouldn’t expect CSV processors to be able to take values that are XML and convert them into RDF or into JSON.

* We should mark as Deferred the intersection of 4 & D — similarly, we shouldn’t expect CSV processors to be able to take arbitrary semi-structured text and convert it into XML/JSON/RDF.

Otherwise I’m happy to include the remaining combinations as requirements. WRT the data model, I don’t think that means we need the data model to say that values in a CSV file *are* lists or object structures; I think we can continue to say that they’re annotated strings, and the annotation (which might include a definition of the format of the string) can be used to validate the string and (in some cases) convert it into a suitable value or data structure.
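
As a very rough sketch of what I mean by "annotated strings", assuming a hypothetical annotation that carries a description and an optional list separator (the class and field names are mine, not from any draft):

```python
from dataclasses import dataclass
from typing import List, Optional, Union


@dataclass(frozen=True)
class Annotation:
    description: str                 # human-readable documentation (A)
    separator: Optional[str] = None  # hypothetical: set when the value is a delimited list


@dataclass(frozen=True)
class Cell:
    value: str              # the data model still holds a plain string
    annotation: Annotation  # everything else hangs off the annotation


def convert(cell: Cell) -> Union[str, List[str]]:
    """Derive a richer value only when the annotation says how to."""
    sep = cell.annotation.separator
    if sep is None:
        return cell.value
    return [part.strip() for part in cell.value.split(sep)]


# Placeholder sample value, in the spirit of the editors field in UC-JournalArticleSearch.
editors = Cell("Editor One, Editor Two", Annotation("list of editors", separator=","))
```

The cell's value stays a string throughout; the list only exists as the result of a conversion driven by the annotation.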

Cheers,

Jeni
--  
Jeni Tennison
http://www.jenitennison.com/

Received on Wednesday, 30 April 2014 17:58:21 UTC