
Request for comments about requirements

From: Ceolin, D. <d.ceolin@vu.nl>
Date: Tue, 20 May 2014 21:22:51 +0000
To: W3C CSV on the Web Working Group <public-csv-wg@w3.org>
Message-ID: <BD5F71D4-5FBA-4751-A19C-D0D60C4DBB43@vu.nl>
Dear all,
first of all, it's been a hectic period, so my apologies for coming up with this only now.
Anyway, following the last telco, here is the current list of requirements, which I've tried to reorganize based on recent discussion and email exchanges.
To keep the email manageable, I've reported only the short description and the categorization. I'll go more into detail about each of them in a second stage.
I moved R-MultipleHeadingRows and R-RandomAccess to "Deferred Requirements".
The rest of the requirements are still "Candidate" and are organized in categories which have changed over time (I've kept the old ones as well, as a reminder and to make sure that we agree on the new categorization).
So, I'd have a few queries for you:
- The first issue regards R-AnnotationAndSupplementaryInfo. I think that we agree that this is a "super-requirement" wrt a few other reqs (e.g., R-LinksToExternallyManagedDefinitions). I would be inclined to keep both super- and sub-requirements because the sub-requirements let us address specific issues, but we should also provide a generic mechanism to annotate CSVs with classes of information that we have not explicitly considered but that might be relevant (e.g., license of use).
- I think that a categorization is useful, given the large number of requirements we have. Do you agree? If so, do you suggest any changes to it?
- Is there any candidate requirement that you suggest deferring or deleting, or do you agree to accept all of them?

4.2 Candidate Requirements
4.2.1 Requirements relating to parsing of CSV

Ability to determine that a CSV is syntactically well formed
Ability to normalize data that is not in normal form and possibly vice-versa.
Ability to determine that a CSV is using RTL
4.2.2 Requirements relating to annotation of CSV

4.2.3 Requirements relating to metadata discovery

4.2.4 Requirements relating to applications

Ability to validate a CSV for conformance with a specified DDR<http://w3c.github.io/csvw/use-cases-and-requirements/index.html#dfn-ddr>
Ability to automatically transform a CSV into RDF
Ability to automatically transform a CSV into JSON
Ability to transform CSV conforming to the core tabular data model yet lacking further annotation into an object / object graph serialisation
Ability to publish metadata independently from the tabular data resource it describes
4.2.5 Non-functional requirements

Compatibility of data analysis tools in common usage with CSV+
Ability to add supplementary metadata to an existing CSV file without requiring modification of that file
4.2.6 Data Model Requirements

Ability to handle columns as row headers.
Ability to parse internal data structure within a cell value
Ability to parse tabular data with field delimiters other than comma (,)
Ability to determine the primary key for entities described within a CSV file
Ability to cross reference between CSV files
Ability to reference a Data Definition Resource defining supplementary metadata external to the CSV file
Ability to add annotation and supplementary information to a CSV file
Ability to associate a code value with an externally managed definition
Ability to assert how a single CSV file is a facet or subset of a larger dataset
Ability to provide (hyper)links to externally managed definitions from within a CSV file
Ability to declare syntactic type for data values
Ability to declare semantic type for data values
Ability to declare a "missing value" token and, optionally, a reason for the value to be missing
Ability to map the values of a CSV row/column into a corresponding URI (e.g. by concatenating those values with a prefix).
Ability to identify/express the unit of measure for the values reported in a given column.
Ability to group multiple data tables into a single package for publication
Ability for a metadata description to explicitly cite the tabular dataset it describes
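
To make the URI-mapping requirement above more concrete, here is a minimal sketch of the "concatenate the value with a prefix" idea, combined with a non-comma field delimiter. It is only an illustration under my own assumptions, not a proposed mechanism: the function name row_uris, the sample data, and the http://example.org/country/ prefix are all hypothetical.

```python
import csv
import io
from urllib.parse import quote


def row_uris(csv_text, key_column, prefix, delimiter=","):
    """Return one URI per data row: prefix + percent-encoded value of key_column."""
    reader = csv.DictReader(io.StringIO(csv_text), delimiter=delimiter)
    # safe="" also encodes "/" so a cell value cannot add extra path segments
    return [prefix + quote(row[key_column], safe="") for row in reader]


# Hypothetical sample: semicolon-delimited, with "code" acting as the primary key
sample = "code;name\nNL;Netherlands\nUK;United Kingdom\n"
uris = row_uris(sample, "code", "http://example.org/country/", delimiter=";")
```

In this sketch the key column and the prefix are passed in by the caller; under the requirements listed here they would more plausibly come from a separate metadata description, which is exactly the part that still needs to be specified.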
4.3 Deferred requirements

Ability to handle headings spread across multiple initial rows, as well as to distinguish between single column headings and file headings.
Ability to access and/or extract part of a CSV file in a non-sequential manner.
Received on Tuesday, 20 May 2014 21:23:25 UTC
