RE: Request for assistance: editing use case & requirements document from Tandy, Jeremy on 2014-02-25 (public-csv-wg@w3.org from February 2014)

From: Tandy, Jeremy <jeremy.tandy@metoffice.gov.uk>
Date: Tue, 25 Feb 2014 23:14:19 +0000
To: Eric Stephan <ericphb@gmail.com>
CC: "Ceolin, D." <d.ceolin@vu.nl>, W3C CSV on the Web Working Group <public-csv-wg@w3.org>
Message-ID: <2624871D9A05174691BD59F8EFD68AE2B3620B@EXXCMPD1DAG3.cmpd1.metoffice.gov.uk>
OK - I've been through the use cases on the wiki <https://www.w3.org/2013/csvw/wiki/Use_Cases> and have come up with a plan of how we can progress these from current state into the UCR document.

I've tried to cluster the use cases into sets that I think match the kinds of things you've either been talking about in the teleconfs/mailing list or have submitted as use cases already. Please forgive me if I've got that wrong.

There are a couple of use-cases that I think need further discussion before we elaborate them. I've created separate threads on the mailing list for these:
a) Peel Sessions use case <https://www.w3.org/2013/csvw/wiki/Use_Cases#Peel_Sessions>, thread: <http://lists.w3.org/Archives/Public/public-csv-wg/2014Feb/0191.html> 
b) NetCDF data use case <https://www.w3.org/2013/csvw/wiki/Use_Cases#NetCDF_data>, thread: <http://lists.w3.org/Archives/Public/public-csv-wg/2014Feb/0192.html> 

I also note that we still have no use cases talking about a CSV dump from a relational database. Hey ho.

Let's partition the work ...

=======
JEREMY:

#3 Publication of statistics <https://www.w3.org/2013/csvw/wiki/Use_Cases#Publication_of_Statistics> ... we've been given some more information from folks at the ONS about some more pertinent data access services, so I'll need to re-edit this one.

#4 Organogram data <https://www.w3.org/2013/csvw/wiki/Use_Cases#Organogram_Data> 

#6 Land Registry Data <https://www.w3.org/2013/csvw/wiki/Use_Cases#Publication_of_Data_by_the_UK_Land_Registry> 

#11 Visualisation of time series data with annotations <https://www.w3.org/2013/csvw/wiki/Use_Cases#Visualisation_of_time_series_data_with_annotations> ... I'll merge this into the climate data use case already published <http://w3c.github.io/csvw/use-cases-and-requirements/#UC-SurfaceTemperatureDatabank> 

=======
DAVIDE:

#8 Reliability analysis of police open data <https://www.w3.org/2013/csvw/wiki/Use_Cases#Reliability_Analysis_of_Police_Open_Data> ... excellent work so far, please edit into the UCR document and associate with requirements

#9 Analysing Scientific Spreadsheets <https://www.w3.org/2013/csvw/wiki/Use_Cases#Analyzing_Scientific_Spreadsheets> ... this is a good use case, but it's not clear why someone would want to write a parser to extract information; can we edit this to include a user perspective indicating how they want to use this data and for what purpose?

#14 OpenSpending data <https://www.w3.org/2013/csvw/wiki/Use_Cases#OpenSpending_data> ... this is a good use case as it illustrates the ambiguities arising from the lack of semantics. However, I wonder if we could develop this further: (i) to illustrate the goal of collecting heterogeneous data from multiple sources - how do they support data publishers in uploading data that includes semantics etc.? Given that datahub.io is from OKF <http://okfn.org/>, does it support the Simple Data Format <http://dataprotocols.org/simple-data-format/>? ... (ii) the point of publishing this data is to allow broad and unanticipated re-use - but we see that re-use is inhibited because semantics (e.g. currency type) are implied rather than made explicit; I wonder if we could illustrate this by trying to combine / compare with data from the WorldBank <http://data.worldbank.org>?

#20 Representing entities and facts extracted from text <https://www.w3.org/2013/csvw/wiki/Use_Cases#Representing_entitles_and_facts_extracted_from_text> ... this use case illustrates the difficulties in working with non-homogeneous tables; some discussion on the mailing list (see thread: <http://lists.w3.org/Archives/Public/public-csv-wg/2014Feb/0159.html>) provides some further insights.

#22 Intelligent preview of CSV files <https://www.w3.org/2013/csvw/wiki/Use_Cases#Intelligently_Previewing_CSV_files> ... a nice simple use case with some clear goals and challenges.    

=======
ERIC:

#7 Journal article search <https://www.w3.org/2013/csvw/wiki/Use_Cases#A_local_archive_of_metadata_for_a_collection_of_journal_articles> ... I think that this use case illustrates the utility of CSV as a convenient exchange format for pushing tabular data between software components; I think the key points relate to (i) making it easier to interpret the data on subsequent ingest, and (ii) being able to work with manageable chunks of a tabular data set (e.g. only subsets of the tabular dataset are ever materialised in a single CSV file, and we often want to know how that subset fits within the larger whole). I also note that the use case talks about extracting subsets of data ... this gets somewhat towards PROTOCOL DESIGN, which I'm not sure is in scope, but might give an anchor to look at the OData protocol <http://www.odata.org/>?

#12 Mass Spec <https://www.w3.org/2013/csvw/wiki/Use_Cases#Mass_Spec> and #13 FTIR <https://www.w3.org/2013/csvw/wiki/Use_Cases#FTIR_Fourier_Transform_Infrared_Spectroscopy> are both quite similar as they talk about data exported from scientific instruments. I wonder if we might merge these together? Also, the use case needs to be presented in narrative style ... what might a user want to do with this data? What challenges are inherent in working with it (from which we can derive requirements)?

#16 City of Palo Alto tree data <https://www.w3.org/2013/csvw/wiki/Use_Cases#City_of_Palo_Alto_Tree_Data> ... great use case - the first to incorporate geospatial mapping and web applications! For me, this is all about taking CSV data and publishing it in an interactive map form to allow people to derive insight. It's also great that it's using Google Fusion Tables because we don't yet have an example of that. Some things missing from this are (i) the original source CSV data (if you can find it), and (ii) details of the data-conditioning required to get Fusion Tables to interpret the data as geospatial.

#17 Protein data bank <https://www.w3.org/2013/csvw/wiki/Use_Cases#Protein_Data_Bank_File_Format> and #18 XYZ chemical file format <https://www.w3.org/2013/csvw/wiki/Use_Cases#XYZ_Chemical_File_Format> ... both of these are very similar, with #17 being a bit richer. Does #18 add anything unique? That said, I think the use case needs more information about the tools people use to work with this data, what they're trying to do, what challenges they have with this data etc.

#21 Displaying locations of care homes on a map <https://www.w3.org/2013/csvw/wiki/Use_Cases#Displaying_Locations_of_Care_Homes_on_a_Map> ... this is quite similar to #16, but I think it illustrates some different tools, such as use of a web component for embedding a map of these locations into web pages as opposed to Google Fusion Tables.

=======

I know I've got fewer bits to add than Eric and Davide, but I'm out of office a lot over the next few weeks with limited time. Sorry!

Do you think that this is workable? Please advise.

Goodnight! Jeremy

PS: as you inevitably come across issues when editing the use cases, simply add them into the UCR document (e.g. <div class="issue">) so that we can discuss them in the teleconfs.


-----Original Message-----
From: Eric Stephan [mailto:ericphb@gmail.com] 
Sent: 25 February 2014 21:40
To: Tandy, Jeremy
Cc: Ceolin, D.; W3C CSV on the Web Working Group
Subject: Re: Request for assistance: editing use case & requirements document

Jeremy,

Sounds like a plan! I can start tomorrow. Davide and Jeremy, I'll keep you both posted once I'm underway.

Eric

On Tue, Feb 25, 2014 at 9:21 AM, Tandy, Jeremy <jeremy.tandy@metoffice.gov.uk> wrote:
> Hi Davide & Eric ... many thanks for your offer of help.
>
> Later this evening (after I've got back from a run :-) ) I plan to review the use cases in the wiki to identify which are in a good state for transfer to the use case and requirements (UCR) doc. I'll provide my thoughts on that ASAP. Then we can each grab some to edit into the UCR doc.
>
> OK ... working practices.
>
> The source document is at github:w3C/csvw/use-cases-and-requirements/index.html
>
> Because we're using GitHub Pages, the main branch we're using is gh-pages.
>
> My preference, at least initially, would be for you to branch gh-pages and send me pull requests. Are you OK with that?
>
> Points to note:
> * You'll see that we're using W3C ReSpec for the document; there's a user guide <http://www.w3.org/respec/guide.html> but I think it should be fairly self-evident to follow the pattern of the use cases I've added already.
> * I'm dumping the supporting files straight into the use-cases-and-requirements/ directory, so please make the file names unique :-)
> * If you can help it, please try to avoid "prettifying" the HTML as we may adversely impact each other (difficult to merge).
> * I am maintaining bi-directional links between use cases and requirements (Requires: and Motivation:) - please can you keep these up to date.
> * Put new requirements in the "candidate requirements" section until we've approved them.
> * Please follow the pattern I've started for the use-case and requirement fragment identifiers; human readable please!
>
> Finally, don't forget to add yourselves as EDITORS!
>
> Many thanks, Jeremy
>
> -----Original Message-----
> From: Ceolin, D. [mailto:d.ceolin@vu.nl]
> Sent: 24 February 2014 18:09
> To: Eric Stephan
> Cc: Tandy, Jeremy; W3C CSV on the Web Working Group
> Subject: Re: Request for assistance: editing use case & requirements document
>
> Me too.
>
> Davide
>
> On 24 Feb 2014, at 18:05, Eric Stephan wrote:
>
>> Hi Jeremy,
>>
>> I can help, let me know what I can do.
>>
>> Eric
>>
>> On Mon, Feb 24, 2014 at 2:10 AM, Tandy, Jeremy <jeremy.tandy@metoffice.gov.uk> wrote:
>>> Hi all -
>>>
>>> The use case document is now well underway; hopefully we can see where that's heading.
>>>
>>> Looking at my calendar for the next 4 weeks, I have less time to contribute to the editorial ... and I note the timescales identified by Jeni and danbri at our first telcon, which indicated the milestone for first public working draft for March - with an initial draft for end Feb.
>>>
>>> So ... there are two options:
>>>
>>> i) More offers of assistance to edit the document - I think that we can probably partition the use cases from the wiki.
>>>
>>> ii) We push the timelines
>>>
>>> Here's hoping for option i)
>>>
>>> Jeremy
>>
>
Received on Tuesday, 25 February 2014 23:14:48 UTC