- From: Tandy, Jeremy <jeremy.tandy@metoffice.gov.uk>
- Date: Mon, 26 May 2014 14:18:42 +0000
- To: "Tim Robertson [GBIF]" <trobertson@gbif.org>
- CC: "public-csv-wg@w3.org" <public-csv-wg@w3.org>
- Message-ID: <2624871D9A05174691BD59F8EFD68AE208842177@EXXCMPD1DAG3.cmpd1.metoffice.gov.uk>
Thanks for feedback. Change included.

Jeremy

From: Tim Robertson [GBIF] [mailto:trobertson@gbif.org]
Sent: 26 May 2014 15:12
To: Tandy, Jeremy
Cc: public-csv-wg@w3.org
Subject: Re: Updates to use case #21: biodiversity

Thanks Jeremy,

Looks good, although I'd suggest tightening the wording from:

In the case of sparsely populated data, this property-value pair can be applied as a default where that property is absent from the data.

to

In the case of sparsely populated data, this property-value pair [must] be applied as a default [only] where that property is absent from the data.

Without enforcing this the intent of the annotator might get lost along the way. Seem sensible?

Cheers,
Tim

On 26 May 2014, at 16:05, Tandy, Jeremy <jeremy.tandy@metoffice.gov.uk> wrote:

Hi Tim - I've amended the use case to include the idea of adding default property value pairs to sparse data. Rather than add a new requirement, I merged it into the http://w3c.github.io/csvw/use-cases-and-requirements/#R-SpecificationOfPropertyValuePairForEachRow requirement.

Regarding the ability to declare NULL fields, we already have requirement http://w3c.github.io/csvw/use-cases-and-requirements/#R-MissingValueDefinition to cover this. I've not included it in the biodiversity use case because there's nothing to actually hang it on there :)

The same applies regarding the ability to document multiple files, and their relationships.

Hope that's OK.

Jeremy

From: Tim Robertson [GBIF] [mailto:trobertson@gbif.org]
Sent: 26 May 2014 08:33
To: Tandy, Jeremy
Cc: public-csv-wg@w3.org
Subject: Re: Updates to use case #21: biodiversity

Thank you very much Jeremy - great improvements which are accurate.

A few brief comments which might be worth adding:

Although not present in this example, the DwC-A supports:

a) The ability to define a default value for declared fields should none be found in sparsely populated tables > no requirement exists for this?
b) The ability to document multiple files, and their relationships > requirement exists already with foreign key

If I were to rework the DwC-A standard today, I would include the explicit declaration of the NULL value. In our code [1] we have to handle this with guesswork which is pretty fragile. We make use of the Hadoop import tool Sqoop which allows this feature and it is very useful: http://sqoop.apache.org/docs/1.4.4/SqoopUserGuide.html#_null_string_handling

I believe there is a use case for explicitly being able to declare in a file produced by (e.g.) MySQL that \N represents NULL globally without resorting to guess work. I'm not sure I've seen a requirement for this though. Perhaps this can be added under this use case for consideration?

Thanks again,
Tim

[1] https://github.com/gbif/dwca-reader/blob/master/src/main/java/org/gbif/dwc/record/RecordImpl.java#L17

On 26 May 2014, at 02:07, Tandy, Jeremy <jeremy.tandy@metoffice.gov.uk> wrote:

All - I've updated the biodiversity use case (originally contributed by Tim Robertson of GBIF) so that it is now action-oriented and user-centred ... our protagonist is a citizen scientist who wants to build a web app to show biodiversity information about the Sierra Nevada national park, Spain (because that's what the dataset I picked up as an example from GBIF refers to!).

The use case is renamed "PublicationOfBiodiversityInformation" and is available at <http://w3c.github.io/csvw/use-cases-and-requirements/#UC-PublicationOfBiodiversityInformation>.

Also note the new Requirement <http://w3c.github.io/csvw/use-cases-and-requirements/#R-SpecificationOfPropertyValuePairForEachRow>.

Comments welcome - although I am particularly interested in Tim's perspective as to whether this heavily edited use case still makes the key points he wanted. I think it does - but I'd like confirmation.

Jeremy

----------------------------------------------------------------------------------------
Tim Robertson - GBIF Head of Informatics - trobertson@gbif.org
Global Biodiversity Information Facility http://www.gbif.org/
GBIF Secretariat, Universitetsparken 15, DK-2100 Copenhagen Ø, Denmark
Tel: +45 3532 1487 Mob: +45 2826 1487 Fax: +45 2875 1480
----------------------------------------------------------------------------------------
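
The two behaviours discussed in this thread, a default property-value pair applied only where the property is absent, and a file-wide declared NULL token such as \N, can be illustrated with a minimal sketch. It is not taken from the use case document or from the GBIF dwca-reader code. The function name `apply_defaults`, the constant `NULL_TOKEN`, and the field names are hypothetical, and it assumes that "absent" means the property is missing from the row, so an explicitly declared NULL is not overwritten by the default.

```python
from typing import Dict, Optional

# Assumption: the publisher declares one NULL marker for the whole file,
# here the MySQL-style "\N" mentioned in the thread.
NULL_TOKEN = r"\N"


def apply_defaults(
    row: Dict[str, str],
    defaults: Dict[str, str],
    null_token: str = NULL_TOKEN,
) -> Dict[str, Optional[str]]:
    """Return a copy of `row` with declared NULLs mapped to None and
    defaults filled in only for properties missing from the row."""
    # Map the declared NULL token to None; all other values pass through.
    result: Dict[str, Optional[str]] = {
        key: (None if value == null_token else value)
        for key, value in row.items()
    }
    # Apply a default only where the property is absent from the data.
    for key, value in defaults.items():
        if key not in result:
            result[key] = value
    return result


if __name__ == "__main__":
    defaults = {"country": "ES"}  # illustrative default property-value pair
    print(apply_defaults({"scientificName": "Quercus ilex"}, defaults))
    print(apply_defaults({"country": "PT"}, defaults))
    print(apply_defaults({"country": r"\N"}, defaults))
```

Whether an explicit NULL should block the default is a design choice this sketch makes for illustration; a metadata specification could equally define the opposite.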
Received on Monday, 26 May 2014 14:19:13 UTC