Re: Organising Requirements

As far as I am concerned: yes. The question is whether these cases (which are obviously real) are also present in published CSV files. If they are (eg, if the Statistics guys indeed publish their data in CSV 'verbatim' from those Excel sheets) then of course it is an issue for us. But if they are, hm, smart enough to publish CSV that, for example, does not contain the presentation artifacts, then we can safely ignore those cases...

Of course... if, as a remark somewhere on the use case wiki page says, data publishers may _not_ want to publish CSV because those very artifacts are not available, then that is of course an issue for us, too. That may be the case for your first example below...

Thanks!

Ivan

On 24 Feb 2014, at 10:35 , Tandy, Jeremy <jeremy.tandy@metoffice.gov.uk> wrote:

> Point accepted.
> 
> I think that there are two areas of the Publication of National Statistics use case <http://w3c.github.io/csvw/use-cases-and-requirements/#UC-PublicationOfNationalStatistics> to modify:
> 
> 1) "Several tables may appear within a single sheet; for example, refer to labour market statistics sheet 18(1). Here we see statistics relating to headline estimates, change on quarter and change on year. A closer inspection indicates that the column layout is identical for each sub-table; only the meaning of the data values changes."
> 
> 2) "The layout of tabular data within each sheet is optimised for human consumption. Presentation artifacts are typically intermingled with the data itself making it difficult for data reusers to extract data from these tables for further processing. Presentation artifacts, some of which are illustrated in figure Fig. 1 Presentation artifacts in statistics worksheet, include:" [etc.]
> 
> Can you confirm that I've interpreted your concerns correctly?
> 
> Thanks, Jeremy
> 
> -----Original Message-----
> From: Ivan Herman [mailto:ivan@w3.org] 
> Sent: 24 February 2014 08:37
> To: Jeni Tennison
> Cc: Tandy, Jeremy; W3C CSV on the Web Working Group
> Subject: Re: Organising Requirements
> 
> 
> On 23 Feb 2014, at 22:03 , Jeni Tennison <jeni@jenitennison.com> wrote:
> 
>> Hi Jeremy,
>> 
>> Thanks for all your work pulling together the use cases and requirements.
>> 
>> Do you think it would be useful to cluster the requirements? Looking at them, I can see:
>> 
>>  * Parsing, eg requirements around recognising other delimiters
>>  * Annotation Types, eg R-PrimaryKey
>>  * Metadata Discovery, eg R-PackagingOfMultipleTables
>>  * Applications, eg R-CsvValidation
>>  * Non-Functional, eg R-ZeroEditCompatibility
>> 
>> Regarding the requirement R-PackagingOfMultipleTables, I think the requirement is to annotate a group of tables, not necessarily to package them. In other words, a design in which there was a metadata file that pointed to a group of tables hosted elsewhere on the web would seem to satisfy the requirement from PublicationOfNationalStatistics: they wouldn't necessarily need to be packaged together (eg in a zip).
>> 
>> Also, FWIW, I would only take syntactic requirements from published "CSVs", not from non-text-based formats like Excel. So, for example, I wouldn't use the ONS Excel files as demonstrating a requirement to have multiple tables within a single CSV file.
> 
> +1
> 
> We would open the flood gates if we took the original spreadsheet (and other) programs into account...
> 
> Ivan
> 
>> 
>> Cheers,
>> 
>> Jeni
>> --  
>> Jeni Tennison
>> http://www.jenitennison.com/
>> 
> 
> 
> ----
> Ivan Herman, W3C 
> Digital Publishing Activity Lead
> Home: http://www.w3.org/People/Ivan/
> mobile: +31-641044153
> GPG: 0x343F1A3D
> FOAF: http://www.ivan-herman.net/foaf



Received on Monday, 24 February 2014 11:30:25 UTC