- From: Data on the Web Best Practices Working Group Issue Tracker <sysbot+tracker@w3.org>
- Date: Wed, 17 Feb 2016 18:22:11 +0000
- To: public-dwbp-wg@w3.org
dwbp-ISSUE-239 (Laufer): machine-readable standardized data formats - serialization data formats - dataset formats [Best practices document(s)]

http://www.w3.org/2013/dwbp/track/issues/239

Raised by: Carlos Laufer
On product: Best practices document(s)

In Best Practice 14, "Use machine-readable standardized data formats", the term "data format" is used to mean the serialization format of a dataset distribution. The example uses GTFS (https://developers.google.com/transit/gtfs/reference), a standard way of distributing timetables.

We have two standards here: GTFS (structure and serialization) and CSV (serialization). GTFS is distributed as a set of CSV files packaged in a single .zip file.

The examples in the previous BPs use timetables, but they do not say explicitly whether the data is a GTFS feed. It could be in any format, and it appears to be a single file containing all the information, distributed in different formats such as csv, json, ttl, etc.

But GTFS is a standard that defines more than the serialization format (a set of csv files). It also defines the structure and the meaning of the data (a set of specifically named files and a vocabulary).

Standardized serialization formats carry semantics about how a machine understands the meta-model of the different ways of distributing data; the data itself is inside this package. That data could in turn follow a standard: a vocabulary, or a more complex distribution structure such as GTFS, and so on.

I think this difference should be made clear in the document. It might be worthwhile to have a BP that covers things like GTFS. I cannot see a BP that addresses this: using standards for publishing datasets for specific domains or applications.
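To make the two layers concrete, here is a minimal Python sketch, not taken from the Best Practices document. It reads a feed at the serialization level (a zip archive of CSV files) and then checks it at the structural level (which named files are present). The names `ASSUMED_CORE_FILES`, `describe_feed`, and `build_example_feed` are hypothetical, and the file list is an assumed subset of what the GTFS reference requires, not the full specification.

```python
import csv
import io
import zipfile

# Assumption: a subset of the file names the GTFS reference requires.
# This is illustrative only, not the complete specification.
ASSUMED_CORE_FILES = {
    "agency.txt", "stops.txt", "routes.txt", "trips.txt", "stop_times.txt",
}


def describe_feed(zip_bytes):
    """Return (column headers per file, missing assumed core files)."""
    headers = {}
    with zipfile.ZipFile(io.BytesIO(zip_bytes)) as archive:
        names = set(archive.namelist())
        for name in sorted(names):
            # Serialization layer: any CSV parser can read each member.
            text = archive.read(name).decode("utf-8-sig")
            reader = csv.reader(io.StringIO(text))
            headers[name] = next(reader, [])
    # Structural layer: only a GTFS-aware check knows which files to expect.
    missing = sorted(ASSUMED_CORE_FILES - names)
    return headers, missing


def build_example_feed():
    """Build a tiny, deliberately incomplete feed in memory (made-up data)."""
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w") as archive:
        archive.writestr("stops.txt", "stop_id,stop_name\nS1,Central Station\n")
        archive.writestr("routes.txt", "route_id,route_short_name\nR1,10\n")
    return buffer.getvalue()


headers, missing = describe_feed(build_example_feed())
print(headers)
print(missing)
```

The example archive is valid as zip plus CSV, yet incomplete under the assumed structural check, which is the distinction the issue asks the document to make explicit.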
Received on Wednesday, 17 February 2016 18:22:17 UTC