- From: Tandy, Jeremy <jeremy.tandy@metoffice.gov.uk>
- Date: Mon, 26 May 2014 14:05:42 +0000
- To: "Tim Robertson [GBIF]" <trobertson@gbif.org>
- CC: "public-csv-wg@w3.org" <public-csv-wg@w3.org>
- Message-ID: <2624871D9A05174691BD59F8EFD68AE20884211A@EXXCMPD1DAG3.cmpd1.metoffice.gov.uk>
Hi Tim - I've amended the use case to include the idea of adding default property value pairs to sparse data. Rather than add a new requirement, I merged it into the http://w3c.github.io/csvw/use-cases-and-requirements/#R-SpecificationOfPropertyValuePairForEachRow requirement.

Regarding the ability to declare NULL fields, we already have requirement http://w3c.github.io/csvw/use-cases-and-requirements/#R-MissingValueDefinition to cover this. I've not included it in the biodiversity use case because there's nothing to actually hang it on there :)

The same applies regarding the ability to document multiple files, and their relationships.

Hope that's OK.

Jeremy

From: Tim Robertson [GBIF] [mailto:trobertson@gbif.org]
Sent: 26 May 2014 08:33
To: Tandy, Jeremy
Cc: public-csv-wg@w3.org
Subject: Re: Updates to use case #21: biodiversity

Thank you very much Jeremy - great improvements which are accurate.

A few brief comments which might be worth adding:

Although not present in this example, the DwC-A supports:

a) The ability to define a default value for declared fields should none be found in sparsely populated tables
> no requirement exists for this?

b) The ability to document multiple files, and their relationships
> requirement exists already with foreign key

If I were to rework the DwC-A standard today, I would include the explicit declaration of the NULL value. In our code [1] we have to handle this with guesswork which is pretty fragile. We make use of the Hadoop import tool Sqoop which allows this feature and it is very useful: http://sqoop.apache.org/docs/1.4.4/SqoopUserGuide.html#_null_string_handling

I believe there is a use case for explicitly being able to declare in a file produced by (e.g.) MySQL that \N represents NULL globally without resorting to guess work. I'm not sure I've seen a requirement for this though. Perhaps this can be added under this use case for consideration?

Thanks again,
Tim

[1] https://github.com/gbif/dwca-reader/blob/master/src/main/java/org/gbif/dwc/record/RecordImpl.java#L17

On 26 May 2014, at 02:07, Tandy, Jeremy <jeremy.tandy@metoffice.gov.uk> wrote:

All - I've updated the biodiversity use case (originally contributed by Tim Robertson of GBIF) so that it is now action-oriented and user-centred ... our protagonist is a citizen scientist who wants to build a web app to show biodiversity information about the Sierra Nevada national park, Spain (because that's what the dataset I picked up as an example from GBIF refers to!).

The use case is renamed "PublicationOfBiodiversityInformation" and is available at <http://w3c.github.io/csvw/use-cases-and-requirements/#UC-PublicationOfBiodiversityInformation>. Also note the new Requirement <http://w3c.github.io/csvw/use-cases-and-requirements/#R-SpecificationOfPropertyValuePairForEachRow>.

Comments welcome - although I am particularly interested in Tim's perspective as to whether this heavily edited use case still makes the key points he wanted. I think it does - but I'd like confirmation.

Jeremy
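
For illustration only, the two ideas discussed in this thread (an explicitly declared NULL marker such as \N, and default values for sparsely populated fields) might be sketched roughly as follows in Python. This is not the dwca-reader or Sqoop behaviour, and it is not anything defined by the CSV on the Web drafts; the function name `read_rows`, its parameters, and the sample data are all hypothetical.

```python
import csv
import io

# Made-up sample data: "\N" (backslash + N) is the declared NULL marker,
# and the empty "country" cell is a sparsely populated field.
SAMPLE = (
    "scientificName,country,elevation\n"
    "Pinus sylvestris,,\\N\n"
    "Quercus ilex,PT,1200\n"
)


def read_rows(text, null_marker="\\N", defaults=None):
    """Parse CSV text into dicts (hypothetical helper, not a real API).

    - A cell equal to null_marker becomes None, so no guesswork is needed.
    - An empty cell takes the declared default for its field, if any.
    - A defaulted field with no column in the file is added to every row.
    """
    defaults = defaults or {}
    rows = []
    for record in csv.DictReader(io.StringIO(text)):
        row = {}
        for field, value in record.items():
            if value == null_marker:
                row[field] = None
            elif value == "" and field in defaults:
                row[field] = defaults[field]
            else:
                row[field] = value
        # Property-value pairs declared once and applied to each row.
        for field, default in defaults.items():
            row.setdefault(field, default)
        rows.append(row)
    return rows


rows = read_rows(
    SAMPLE,
    defaults={"country": "ES", "basisOfRecord": "HumanObservation"},
)
```

Here the NULL marker and the defaults are declared once, alongside the data, instead of being inferred per cell. An explicit NULL is kept distinct from an empty cell, so a default never overwrites a value that was deliberately marked as missing.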
Received on Monday, 26 May 2014 14:06:15 UTC