Request for comments about requirements from Ceolin, D. on 2014-05-20 (public-csv-wg@w3.org from May 2014)

From: Ceolin, D. <d.ceolin@vu.nl>
Date: Tue, 20 May 2014 21:22:51 +0000
To: W3C CSV on the Web Working Group <public-csv-wg@w3.org>
Message-ID: <BD5F71D4-5FBA-4751-A19C-D0D60C4DBB43@vu.nl>
Dear all,
first of all, it's been a hectic period, so my apologies for coming up with this only now.
Anyway, following the last telecon, here is the current list of requirements, which I've tried to reorganize based on recent discussion and email exchange.
To keep the email manageable, I've reported only the short description and the categorization. I'll go more into detail about each of them in a second stage.
I moved R-MultipleHeadingRows and R-RandomAccess to "Deferred Requirements".
The rest of the requirements are still "Candidate" and are organized in categories which have changed over time (I've kept the old ones as well, as a reminder and to make sure that we agree on the new categorization).
So, I have a few questions for you:
- the first issue regards R-AnnotationAndSupplementaryInfo. I think that we agree that this is a "super-requirement" with respect to a few other requirements (e.g., R-LinksToExternallyManagedDefinitions). I would be inclined to keep both super- and sub-requirements, because the sub-requirements allow us to address specific issues, but we should also provide a generic mechanism to annotate CSVs with classes of information that we have not explicitly considered but that might be relevant (e.g., license of use).
- I think that a categorization is useful, given the large number of requirements we have. Do you agree? If so, do you suggest any change to it?
- Is there any candidate requirement you suggest deferring or deleting, or do you agree to accept all of them?
Thanks,

Davide
4.2 Candidate Requirements
4.2.1 Requirements relating to parsing of CSV

R-WellFormedCsvCheck
Ability to determine that a CSV is syntactically well formed
R-TableNormalization
Ability to normalize data that is not in normal form and possibly vice-versa.
R-RightToLeftCsvCheck
Ability to determine that a CSV is using RTL
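
As a rough illustration of R-WellFormedCsvCheck, here is a minimal Python sketch. It is not part of the requirement text. It assumes that "well formed" means only that the input parses without error in the csv module's strict mode and that every row has the same number of fields as the first row; the working group may settle on a different definition. The function name is_well_formed is hypothetical.

```python
import csv
import io


def is_well_formed(text, delimiter=","):
    """Return True if the text parses and every row has the same field count.

    Assumption: this is only one possible reading of "syntactically well
    formed"; it checks parseability and a consistent row width, nothing more.
    """
    reader = csv.reader(io.StringIO(text), delimiter=delimiter, strict=True)
    try:
        rows = list(reader)
    except csv.Error:
        # Strict mode raises csv.Error on malformed quoting.
        return False
    if not rows:
        # An empty input has no header row to compare against.
        return False
    width = len(rows[0])
    return all(len(row) == width for row in rows)
```

A call such as is_well_formed("a,b\n1,2\n") accepts the input, while a row with an extra field is rejected.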
4.2.2 Requirements relating to annotation of CSV

4.2.3 Requirements relating to metadata discovery

4.2.4 Requirements relating to applications

R-CsvValidation
Ability to validate a CSV for conformance with a specified DDR<http://w3c.github.io/csvw/use-cases-and-requirements/index.html#dfn-ddr>
R-CsvToRdfTransformation
Ability to automatically transform a CSV into RDF
R-CsvToJsonTransformation
Ability to automatically transform a CSV into JSON
R-CanonicalMappingInLieuOfAnnotation
Ability to transform CSV conforming to the core tabular data model, yet lacking further annotation, into an object / object graph serialisation
R-IndependentMetadataPublication
Ability to publish metadata independently from the tabular data resource it describes
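
To make R-CsvToJsonTransformation and R-CanonicalMappingInLieuOfAnnotation a bit more concrete, here is a minimal Python sketch of one possible default mapping. It assumes that the first row is a header row and that, without annotation, every cell value stays a string. Neither assumption comes from the requirement text, and the function name csv_to_json is hypothetical.

```python
import csv
import io
import json


def csv_to_json(text):
    """Map each data row to a JSON object keyed by the header row.

    Assumption: with no metadata available, values are left as strings
    rather than guessed to be numbers, dates, or booleans.
    """
    reader = csv.DictReader(io.StringIO(text))
    return json.dumps(list(reader), indent=2)
```

For the input "name,age\nAda,36\n" this yields a JSON array with one object, in which "age" carries the string "36" because no syntactic type has been declared.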
4.2.5 Non-functional requirements

R-ZeroEditCompatibility
Compatibility of data analysis tools in common usage with CSV+
R-ZeroEditAdditionOfSupplementaryMetadata
Ability to add supplementary metadata to an existing CSV file without requiring modification of that file
4.2.6 Data Model Requirements

R-HeadingColumns
Ability to handle columns as row headers.
R-CellValueMicroSyntax
Ability to parse internal data structure within a cell value
R-NonStandardFieldDelimiter
Ability to parse tabular data with field delimiters other than comma (,)
R-PrimaryKey
Ability to determine the primary key for entities described within a CSV file
R-ForeignKeyReferences
Ability to cross reference between CSV files
R-ExternalDataDefinitionResource
Ability to reference a Data Definition Resource defining supplementary metadata external to the CSV file
R-AnnotationAndSupplementaryInfo
Ability to add annotation and supplementary information to CSV file
R-AssociationOfCodeValuesWithExternalDefinitions
Ability to associate a code value with externally managed definition
R-CsvAsSubsetOfLargerDataset
Ability to assert how a single CSV file is a facet or subset of a larger dataset
R-LinksToExternallyManagedDefinitions
Ability to provide (hyper)links to externally managed definitions from within a CSV file
R-SyntacticTypeDefinition
Ability to declare syntactic type for data values
R-SemanticTypeDefinition
Ability to declare semantic type for data values
R-MissingValueDefinition
Ability to declare a "missing value" token and, optionally, a reason for the value to be missing
R-URIMapping
Ability to map the values of a CSV row/column into corresponding URI (e.g. by concatenating those values with a prefix).
R-UnitMeasureDefinition
Ability to identify/express the unit of measure for the values reported in a given column.
R-GroupingOfMultipleTables
Ability to group multiple data tables into a single package for publication
R-LinkFromMetadataToData
Ability for a metadata description to explicitly cite the tabular dataset it describes
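
Three of the data model requirements above (R-NonStandardFieldDelimiter, R-MissingValueDefinition and R-URIMapping) can be sketched together in a few lines of Python. This is only an illustration under stated assumptions: the delimiter, the "NA" token, the key column name and the example.org prefix are all hypothetical parameters that would normally come from a metadata description, and the "@uri" key and the function name read_rows are invented for this sketch.

```python
import csv
import io
from urllib.parse import quote


def read_rows(text, delimiter=";", missing_token="NA", key_column="id",
              uri_prefix="http://example.org/item/"):
    """Parse delimited text, map the missing-value token to None,
    and build a URI per row by appending the key value to a prefix."""
    reader = csv.DictReader(io.StringIO(text), delimiter=delimiter)
    rows = []
    for record in reader:
        # Replace the declared missing-value token with None.
        cleaned = {k: (None if v == missing_token else v)
                   for k, v in record.items()}
        # Percent-encode the raw key value so the result is a valid URI.
        cleaned["@uri"] = uri_prefix + quote(record[key_column], safe="")
        rows.append(cleaned)
    return rows
```

Called on "id;temp\na 1;20\nb2;NA\n", the sketch gives the first row the URI http://example.org/item/a%201 and sets the second row's temp value to None.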
4.3 Deferred requirements

R-MultipleHeadingRows
Ability to handle headings spread across multiple initial rows, as well as to distinguish between single column headings and file headings.
R-RandomAccess
Ability to access and/or extract part of a CSV file in a non-sequential manner.
Received on Tuesday, 20 May 2014 21:23:25 UTC