Re: What to do when "primary key" cell values are blank from Andy Seaborne on 2014-06-06 (public-csv-wg@w3.org from June 2014)

From: Andy Seaborne <andy@apache.org>
Date: Fri, 06 Jun 2014 10:23:01 +0100
To: public-csv-wg@w3.org
Message-ID: <53918875.4070704@apache.org>

On 06/06/14 09:53, Tandy, Jeremy wrote:
> Hi - when putting together Use Case #24 - Expressing a hierarchy within occupational listings [1] I was considering how primary key behaviour might work. In this use case, there are four different types of entity described in a single CSV file. I inferred that we might apply four different templates to pull out the relevant contents and transform into RDF. A given row describes _one of_ the types of entity, meaning that the primary key column asserted, say, for extracting "SOC Major Group" concepts will often be blank.
>
> I have stated in the use case that:
>> Where the value in the designated primary key column is blank, the row is ignored.
>
> I have also added this constraint to the primary key requirement [2].
>
> Please advise if this is inappropriate!

We use template conversion: we often run multiple templates on the same
CSV, essentially extracting a different kind of entity on each pass.
"The" primary key is different in each pass.  The note in R-PrimaryKey
does not match our experience.
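
To make the multi-pass idea concrete, here is a minimal sketch in Python.
It is only an illustration, not the template tooling mentioned above: the
column names (major_group, minor_group, title), the sample rows and the
extract() helper are all assumptions made up for this example. Each pass
names its own key column and skips rows where that column is blank.

```python
import csv
import io

# Hypothetical sample (not the real use-case file): each row describes
# exactly one kind of entity, so only one key column is filled per row.
SAMPLE = """major_group,minor_group,title
11-0000,,Management Occupations
,11-1000,Top Executives
13-0000,,Business and Financial Operations Occupations
,13-1000,Business Operations Specialists
"""


def extract(csv_text, key_column):
    """One pass over the CSV: yield (key, title) where key_column is non-blank."""
    for row in csv.DictReader(io.StringIO(csv_text)):
        key = (row.get(key_column) or "").strip()
        if not key:
            # Blank key for *this* pass: the row belongs to another template.
            continue
        yield key, row["title"]


# Two passes over the same CSV, each with a different "primary key".
major_groups = list(extract(SAMPLE, "major_group"))
minor_groups = list(extract(SAMPLE, "minor_group"))
```

Note that no single column could serve as the key for the whole file here;
the key only makes sense relative to the pass that uses it.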

JeniT's condition extract is an example: the skos:broader triples might
be generated in one pass, separately from the "code rdfs:label ....".

"Primary" is being overloaded between uniquely identifying a row 
(structural to CSV files), and uniquely identifying an entity 
(modelling).  In denormalised data, entities might get repeated on 
different rows.
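
The two senses of "primary" can be seen side by side in a small Python
sketch. The data and column names (order_id, customer_id, customer_name)
are hypothetical and chosen only to show a denormalised table: order_id
identifies the row, while customer_id identifies an entity that appears
on more than one row.

```python
import csv
import io

# Hypothetical denormalised data: one row per order, customer repeated.
SAMPLE = """order_id,customer_id,customer_name
1,c1,Alice
2,c2,Bob
3,c1,Alice
"""

rows = list(csv.DictReader(io.StringIO(SAMPLE)))

# Structural key: unique per row.
row_keys = [r["order_id"] for r in rows]

# Modelling key: unique per entity, so repeated rows collapse to one entry.
customers = {}
for r in rows:
    customers.setdefault(r["customer_id"], r["customer_name"])
```

Here there are three distinct row keys but only two customer entities, so
a uniqueness check that is right for one sense would be wrong for the
other.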

	Andy

>
> Regards, Jeremy
>
>
> [1] http://w3c.github.io/csvw/use-cases-and-requirements/#UC-ExpressingHierarchyWithinOccupationalListings
> [2] http://w3c.github.io/csvw/use-cases-and-requirements/#R-PrimaryKey
>

Received on Friday, 6 June 2014 09:23:31 UTC