
Re: Request for comments about requirements

From: Andy Seaborne <andy@apache.org>
Date: Wed, 21 May 2014 10:19:52 +0100
Message-ID: <537C6FB8.9060901@apache.org>
To: public-csv-wg@w3.org
Hi Davide,

Some comments on the requirements list:

>
>         R-RightToLeftCsvCheck
>             /*Ability to determine that a CSV is using RTL*/
>
I don't see how this is a "check" - from the description, it's a need for 
a declaration that the columns are RTL.

Suggest:


        R-SupportRightToLeftCsvColumns
            /*Ability to declare that a CSV is using RTL columns
            */



>         R-CsvValidation
>             /*Ability to validate a CSV for conformance with a
>             specified DDR
>             <http://w3c.github.io/csvw/use-cases-and-requirements/index.html#dfn-ddr>*/
>

Is DDR supposed to refer to a specific standard? Or why is the UCR 
introducing a piece of terminology? It feels as if there is something 
more to it. We could just say "metadata" and not define the terminology 
in UCR.  See also *R-ExternalDataDefinitionResource*.


>         R-CsvToRdfTransformation
>             /*Ability to automatically transform a CSV into RDF*/
>
s/automatically//

"Automatically" reads to me as a requirement for server-side conversion, 
either as part of conneg or when the file is first published, resulting 
in an RDF file alongside the CSV file. Conversion is, to me, a 
client-side operation when the data consumer decides they want RDF.  Of 
course, there may be conneg but it's not a requirement.

>
>         R-CsvToJsonTransformation
>             /*Ability to automatically transform a CSV into JSON*/
>
Ditto
>
>
>         R-HeadingColumns
>
>             /*Ability to handle columns as row headers.*/
>
I find "row headers" confusing ("column headers" makes more sense to me).

RFC4180 calls it a "header line"


>         R-CellValueMicroSyntax
>             /*Ability to parse internal data structure within a cell
>             value*/
>
-1 : this is open-ended, from identifying numbers as numbers, through 
extracting part of a field, to turning ","-separated fields into a list 
or multivalue, to working out author name lists.

(even though I think some cases are natural and easy to do in the RDF 
conversion doc)

I don't understand why it's needed for UC#11 where it seems to be mixed 
up with giving a datatype to a compound structure (the geo literal) as 
per previous email discussions.
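
To illustrate how wide that range is, here is a minimal Python sketch of 
the first three kinds of cell "microsyntax" listed above. The function 
names and the sample patterns are hypothetical, chosen only for this 
illustration; none of them comes from the UCR document.

```python
import re


def as_number(cell):
    """Identify a number as a number; return the cell unchanged otherwise."""
    for convert in (int, float):
        try:
            return convert(cell)
        except ValueError:
            continue
    return cell


def extract_part(cell, pattern):
    """Extract part of a field: the first capture group of `pattern`, or None."""
    match = re.search(pattern, cell)
    return match.group(1) if match else None


def as_list(cell, separator=","):
    """Turn a separator-delimited field into a list of trimmed items."""
    return [item.strip() for item in cell.split(separator)]
```

Each helper needs its own per-column configuration (which converter, 
which pattern, which separator), which is one way to see why the 
requirement as worded has no obvious boundary.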
>
>
>         R-NonStandardFieldDelimiter
>             /*Ability to parse tabular data with field delimiters
>             other than comma (|,|)*/
>
Isn't this a CSV format issue - i.e. IETF RFC?
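
For illustration, at the parser level the delimiter is typically just a 
parsing parameter. A minimal sketch using Python's standard csv module 
follows; the helper name read_rows and the sample data are made up for 
this example.

```python
import csv
import io


def read_rows(text, delimiter=","):
    """Parse delimited text into a list of rows using the given field delimiter."""
    return list(csv.reader(io.StringIO(text), delimiter=delimiter))


# Pipe-delimited sample data (invented for this sketch).
rows = read_rows("name|qty\nwidget|3\n", delimiter="|")
```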

>
>         R-PrimaryKey
>             /*Ability to determine the primary key for entities
>             described within a CSV file*/
>         R-ForeignKeyReferences
>             /*Ability to cross reference between CSV files*/
>
This usage of "foreign key" is the right one IMO.

The cross reference between files should be limited to files from one 
publisher - else they are just web links with no guarantee of whether 
the target of the link exists which "foreign key" might imply.
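
A minimal sketch of the guarantee that "foreign key" might imply: every 
referencing value must exist as a key in the referenced file. This is 
only checkable when both files are available together, as when they come 
from one publisher. The function name, column names and sample data are 
hypothetical.

```python
import csv
import io


def dangling_references(child_text, fk_column, parent_text, pk_column):
    """Return foreign key values in the child CSV with no matching parent key."""
    parent_keys = {
        row[pk_column] for row in csv.DictReader(io.StringIO(parent_text))
    }
    return [
        row[fk_column]
        for row in csv.DictReader(io.StringIO(child_text))
        if row[fk_column] not in parent_keys
    ]


# Sample data invented for this sketch: "DE" has no row in the parent file.
countries = "code,name\nGB,United Kingdom\nFR,France\n"
cities = "city,country\nLondon,GB\nParis,FR\nBerlin,DE\n"
missing = dangling_references(cities, "country", countries, "code")
```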

>
>         R-ExternalDataDefinitionResource
>             /*Ability to reference a Data Definition Resource defining
>             supplementary metadata external to the CSV file*/
>
"reference one or more metadata descriptions providing supplementary 
information"

>
>         R-AssociationOfCodeValuesWithExternalDefinitions
>             /*Ability to associate a code value with externally
>             managed definition*/
>
In UCR, it says "description to be added here" :-)

>
>         R-CsvAsSubsetOfLargerDataset
>             /*Ability to assert how a single CSV file is a facet or
>             subset of a larger dataset*/
>         R-LinksToExternallyManagedDefinitions
>             /*Ability to provide (hyper)links to externally managed
>             definitions from with a CSV file*/
>         R-SyntacticTypeDefinition
>             /*Ability to declare syntactic type for data values*/
>

Shouldn't that be "declare syntactic type for table fields", or "table 
columns", not "data values" (i.e. referring to the characters in the 
cell)? Or does this requirement imply parsing to detect numbers, not 
strings?
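
A minimal sketch of the "declare per column" reading, as opposed to 
detecting types from the characters in each cell. The COLUMN_TYPES 
mapping, the column names and the sample row are assumptions made up for 
this example; undeclared columns are left as strings.

```python
import csv
import io
from datetime import date

# Declared syntactic type per column (hypothetical column names).
COLUMN_TYPES = {"id": int, "price": float, "sold": date.fromisoformat}


def typed_rows(text, column_types):
    """Yield each CSV row as a dict, converting cells by declared column type."""
    for row in csv.DictReader(io.StringIO(text)):
        yield {
            name: column_types.get(name, str)(value)
            for name, value in row.items()
        }


sample = "id,price,sold,note\n1,2.50,2014-05-21,ok\n"
converted = list(typed_rows(sample, COLUMN_TYPES))
```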

>
>         R-URIMapping
>             /*Ability to map the values of a CSV row/column into
>             corresponding URI (e.g. by concatenating those values with
>             a prefix).*/
>

>         R-UnitMeasureDefinition
>             /*Ability identify/express the unit of measure for the
>             values reported in a given column.*/
>

>
>               4.3 Deferred requirements
>

"Deferred" may be taken as an implication that it will be addressed 
later, and seen as saying there will be another WG. Or that they are 
valid requirements, just not being done here.

I'd prefer to say that these are "not accepted" requirements or use some 
other neutral terminology.

     Andy
Received on Wednesday, 21 May 2014 09:20:24 UTC
