[Minutes] 2014-04-09 (NB No meeting next week 16th) from Phil Archer on 2014-04-09 (public-csv-wg@w3.org from April 2014)

From: Phil Archer <phila@w3.org>
Date: Wed, 09 Apr 2014 14:03:52 +0100
To: Public CSVW WG <public-csv-wg@w3.org>
Message-ID: <53454538.2010900@w3.org>
Dear all,

The minutes of today's meeting are at 
http://www.w3.org/2014/04/09-csvw-minutes.html.

Note that the WG will skip next week's meeting so that the next one is 
on Wednesday 23rd April.

A snapshot of today's minutes is below.


               CSV on the Web Working Group Teleconference

09 Apr 2014

    See also: [2]IRC log

       [2] http://www.w3.org/2014/04/09-csvw-irc

Attendees

    Present
           AndyS, fresco, +1.937.207.aaaa, phila, JeniT, MathewT,
           DavideCeolin, danbri, +44.777.586.aabb, jtandy

    Regrets
           Axel, Stasinos, Alfonso

    Chair
           Jeni

    Scribe
           Andy Seaborne

Contents

      * [3]Topics
          1. [4]UCR
          2. [5]Conversion
          3. [6]Model for tabular data
      * [7]Summary of Action Items
      __________________________________________________________

    <trackbot> Date: 09 April 2014

    <scribe> scribe: Andy Seaborne

    <scribe> scribenick: AndyS

    <danbri> thanks AndyS!

    AndyS: Regrets for next week.

    <JeniT> JeniT: Regrets for next week

    <JeniT> [8]http://www.w3.org/2014/04/02-csvw-minutes.html

       [8] http://www.w3.org/2014/04/02-csvw-minutes.html

    <danbri> looks good

    AndyS: Not all actions recorded in the tracker

    <danbri> 3 of them are for me; i'll make todos directly.

    APPROVED: Minutes
    [9]http://www.w3.org/2014/04/02-csvw-minutes.html

       [9] http://www.w3.org/2014/04/02-csvw-minutes.html

UCR

    Davide: will sync with jeremy

    phila: making progress on my action for a UC

    <phila> ACTION: phila to add use case linking from metadata to
    the data [recorded in
    [10]http://www.w3.org/2014/04/09-csvw-minutes.html#action01]

    <trackbot> Created ACTION-12 - to add use case linking from
    metadata to the data [on Phil Archer - due 2014-04-16].

    <danbri> (phil's action was on me last week as "chase phila for
    his usecase in which a party provides metadata for another's
    csv"; I declare my work done)

Conversion

    [11]http://w3c.github.io/csvw/csv2rdf/

      [11] http://w3c.github.io/csvw/csv2rdf/

    <JeniT> AndyS: we had a telcon yesterday

    <JeniT> ... including jtandy, Gregg, Juan

    <JeniT> ... we're looking at processing from CSV to CSV to
    clean up the general data

    <JeniT> ... eg fixing up new lines, delimiters, date formats

    <JeniT> ... thought better to do that as rewriting CSV

    <JeniT> ... then convert clean CSV to RDF/JSON/XML

    <JeniT> ... R2RML is the nuclear option for complicated
    transforms

    <JeniT> ... we didn't push on the boundaries around that

    <JeniT> ... similarly might want to do RDF-to-RDF or
    JSON-to-JSON transforms after conversion

    <JeniT> ... we don't want to repeat work done elsewhere, or add
    more tools to end users' toolchain
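
    A minimal sketch of the two-stage approach described above:
    first rewrite messy CSV as clean CSV, then convert the clean
    CSV to JSON. The function names, the ";" delimiter, the "date"
    column and the day/month/year input format are assumptions
    made for illustration, not anything agreed by the WG.

```python
import csv
import io
import json
from datetime import datetime


def clean_csv(raw, delimiter=";", date_columns=("date",), date_format="%d/%m/%Y"):
    """Stage 1 (CSV to CSV): comma delimiter, "\n" line ends, ISO 8601 dates."""
    # Normalise line endings before parsing.
    text = raw.replace("\r\n", "\n").replace("\r", "\n")
    reader = csv.DictReader(io.StringIO(text), delimiter=delimiter)
    out = io.StringIO()
    writer = csv.DictWriter(out, fieldnames=reader.fieldnames, lineterminator="\n")
    writer.writeheader()
    for row in reader:
        for column in date_columns:
            if row.get(column):
                parsed = datetime.strptime(row[column], date_format)
                row[column] = parsed.date().isoformat()
        writer.writerow(row)
    return out.getvalue()


def csv_to_json(clean):
    """Stage 2 (clean CSV to JSON): one object per row, keyed by column header."""
    return json.dumps(list(csv.DictReader(io.StringIO(clean))), indent=2)


raw = "station;date;temp\r\nA1;09/04/2014;11.5\r\nB2;10/04/2014;9.0\r\n"
clean = clean_csv(raw)
records = json.loads(csv_to_json(clean))
```

    Keeping the clean-up separate means the second stage only ever
    sees regular CSV, whichever of RDF, JSON or XML it targets.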

    <JeniT> ... we discussed what's published

    <JeniT> ... there's CSVs published as the outcome of a longer
    process

    <JeniT> ... shared schemas, shared transformations, custom
    mappings

    <JeniT> ... at scale & in volume; sharing parts of the files is
    beneficial

    <JeniT> ... vs someone taking CSV from data.gov.uk

    <JeniT> ... and adding their own transform

    <JeniT> ... they need something more self-contained

    <JeniT> ... a single file to control the transformation

    <JeniT> ... also whether the CSV was created without the web in
    mind, or with the web in mind

    <JeniT> ... particularly with spotting links & data formats

    <JeniT> ... Gregg is going to look at pulling out his transform
    description to apply it independently of JSON-LD

    <JeniT> ... we're hopeful that there will be commonality in
    conversion to JSON

    <JeniT> ... which kind of depends on whether the conversion is
    to JSON-LD

    <JeniT> ... had a good chat with Ivan when we met up

    <JeniT> ... comments on what's been written would be great

    <JeniT> ... it's a bit scruffy, but the general approach is
    there

    <JeniT> ... I'm using the term 'basic mapping' rather than
    'direct mapping'

    <danbri> 'simple mapping'?

    <JeniT> ... there's a progression of complexity

    <danbri> 'wishfulthinking mapping'

    <Zakim> danbri, you wanted to ask status of test case csvs for
    this exploration

    <JeniT> danbri: are there test files?

    <JeniT> AndyS: there's tests in the repo

    <JeniT> danbri: are they mainstream examples or test cases?

    <JeniT> AndyS: the test ones from gkellogg are focused

    <JeniT> danbri: we'd like mainstream examples

    <JeniT> AndyS: I've put some of those in the document

    <JeniT> ... if you could work through one of the examples you
    want to put in, that would be great, like jtandy did

    <danbri>
    [12]https://github.com/w3c/csvw/blob/gh-pages/examples/simple-weather-observation.md

      [12] 
https://github.com/w3c/csvw/blob/gh-pages/examples/simple-weather-observation.md

    JTandy: we also talked about the charter and metadata in RDF
    ... may be distinct from the mapping framing (not in RDF)
    ... want to test this with WG.

    <JeniT> AndyS: yes, metadata about the CSV file may or may not
    be in RDF

    <JeniT> ... it might be simpler to have one language that
    drives all the mappings

    <JeniT> ... which might include provenance etc

    <phila> from the charter "The vocabulary should be defined, or
    should have an encoding, in standard RDF and, wherever possible
    and appropriate, should refer to, and reuse, existing
    vocabularies developed elsewhere." - i.e. it doesn't have to
    *only* be in RDF

    <JeniT> ... even in JSON-LD, the context part isn't RDF

    <JeniT> jtandy: we talked about gkellogg pulling out the
    transformation stuff from JSON-LD to see if it could be
    expressed in Turtle

    jeniT: easy to write might mean TTL
    ... want to see the things it will say to guide the syntax
    choice.
    ... separating CSV-specific xform from JSON-LD will be good.
    ... nudged Rufus and Ross Jones re JSON.

    <JeniT> [13]https://www.w3.org/2013/csvw/wiki/Conversions

      [13] https://www.w3.org/2013/csvw/wiki/Conversions

    <danbri> aside - another JSON-LD launch at google this week:
    [14]https://devsite.googleplex.com/webmasters/business-location-pages/schema.org-examples
    (i.e. we like JSON-LD)

      [14] 
https://devsite.googleplex.com/webmasters/business-location-pages/schema.org-examples

Model for tabular data

    jenit: e.g. import into relational DB

    davide: may have some interesting data as example

    <jtandy> danbri - that looks like an internal link (googleplex)
    ... just tried it :-)

    subtopic: null fields

    <JeniT>
    [15]http://w3c.github.io/csvw/syntax/#core-tabular-data-model

      [15] http://w3c.github.io/csvw/syntax/#core-tabular-data-model

    jenit: "What is a null field" comment from D Booth
    ... absent and empty : same? different?

    jtandy: in the discussion, default values need to be handled.

    <danbri> lost audio

    jtandy: empty field returned. Have an explicit "null" marker
    (999, whatever)

    subtopic: packaging

    <JeniT> [16]http://w3ctag.github.io/packaging-on-the-web/

      [16] http://w3ctag.github.io/packaging-on-the-web/

    jenit: TAG work

    <jtandy> the "999" marker would be declared in the metadata
    annotation as a token indicating a "null field" / missing field
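
    A minimal sketch of the idea, assuming a hypothetical metadata
    structure ("null_tokens", keyed by column name) that declares
    which tokens mean "null field". Whether an empty field and a
    null field are the same thing was left open in the discussion;
    this sketch keeps them distinct.

```python
import csv
import io

# Hypothetical metadata annotation: per-column tokens meaning "null field".
METADATA = {"null_tokens": {"temp": ["999"], "wind": ["-"]}}


def read_with_nulls(text, metadata):
    """Parse CSV text, mapping declared null tokens to None."""
    null_tokens = metadata.get("null_tokens", {})
    rows = []
    for row in csv.DictReader(io.StringIO(text)):
        parsed = {}
        for column, value in row.items():
            if value in null_tokens.get(column, []):
                parsed[column] = None   # declared null marker
            else:
                parsed[column] = value  # an empty string stays an empty string
        rows.append(parsed)
    return rows


text = "station,temp,wind\nA1,11.5,3\nB2,999,-\nC3,,5\n"
rows = read_with_nulls(text, METADATA)
```

    Here the "999" in row B2 becomes None, while the empty "temp"
    field in row C3 is preserved as an empty string.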

    jenit: need arises in various places
    ... general need for web development
    ... we need to do similar - CSV(s) and metadata

    <JeniT>
    [17]http://w3ctag.github.io/packaging-on-the-web/#downloading-data-for-local-processing

      [17] 
http://w3ctag.github.io/packaging-on-the-web/#downloading-data-for-local-processing

    jenit: link to draft of the TAG direction with a specific
    example for this WG
    ... individual files are still on the web
    ... but that a "package fetch" pulls them all at once.
    ... individual files link back to their metadata
    ... streamable proposed based on multi-part
    ... comments invited

    <jtandy> ok - packaging stuff looks interesting

    <phila> no questions but it's interesting, thank you

    danbri: Other groups' feedback?

    jenit: no HTTP changes

    danbri: what about HTTP layer optimizations? e.g. caching

    jenit: overlap with HTTP/2
    ... would need packaging aware caching to cache sub parts but
    format allows cache header per part
    ... will write to the list
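
    A minimal sketch of a multi-part package holding a CSV file and
    its metadata, with a cache header on each part. It uses generic
    MIME multipart from Python's standard library and does not
    reproduce the syntax of the TAG draft. The header choices
    (Content-Location, Cache-Control, and a Link header with
    rel="describedby" pointing back to the metadata), the max-age
    values and the example.org URLs are assumptions.

```python
from email import message_from_bytes
from email.mime.application import MIMEApplication
from email.mime.multipart import MIMEMultipart
from email.mime.text import MIMEText


def build_package(csv_url, csv_text, metadata_url, metadata_text):
    """Bundle one CSV file and its metadata into a single multipart body."""
    package = MIMEMultipart("mixed")

    csv_part = MIMEText(csv_text, _subtype="csv", _charset="utf-8")
    csv_part["Content-Location"] = csv_url    # the file still has its own URL
    csv_part["Cache-Control"] = "max-age=3600"  # cache header for this part only
    csv_part["Link"] = '<%s>; rel="describedby"' % metadata_url  # assumed link back
    package.attach(csv_part)

    meta_part = MIMEApplication(metadata_text.encode("utf-8"), _subtype="json")
    meta_part["Content-Location"] = metadata_url
    meta_part["Cache-Control"] = "max-age=86400"
    package.attach(meta_part)

    return package.as_bytes()


def read_package(raw):
    """Split a package back into {url: {"cache": ..., "body": ...}}."""
    files = {}
    for part in message_from_bytes(raw).get_payload():
        files[part["Content-Location"]] = {
            "cache": part["Cache-Control"],
            "body": part.get_payload(decode=True).decode("utf-8"),
        }
    return files


raw = build_package(
    "http://example.org/obs.csv", "station,temp\nA1,11.5\n",
    "http://example.org/metadata.json", '{"title": "Observations"}',
)
files = read_package(raw)
```

    A single fetch of the package yields both files, and a
    packaging-aware cache could in principle store each part under
    its own URL using that part's cache header.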

    subtopic: metadata packaging
    ... metadata format

    jenit: hold back until we know what's in it

    jtandy: been looking at "Simple Data Packaging" (now renamed)
    looks very close
    ... start from that?

    jenit: Would be good to start from there - except it assumes
    JSON.

    jtandy: start with the JSON assumption and see how it is
    received on WD

    <Zakim> danbri, you wanted to say start from SDP as a
    *vocabulary* is fine, but something that fits with RDF is also
    important

    danbri: schema.org ==> vocabulary start good, but syntax of
    JSON only might be a barrier.

    <jtandy> +1 to taking SDP metadata and expressing in RDF over
    JSON-LD

    phila: Uncomfortable if it excludes the dataprotocols work when it
    need not.
    ... significant community
    ... at least add conversions to/from.

    <JeniT> AndyS: I think there was something that said the data
    package might become JSON-LD

    <danbri> i can't find a good link for SDF, was it renamed?

    <JeniT> ... I'd like to get a sense of how successful that
    format has been

    <JeniT> ... and if there are any others

    <danbri> [18]http://dataprotocols.org/tabular-data-package/

      [18] http://dataprotocols.org/tabular-data-package/

    <JeniT> ... I thought it was a good starting point, but I
    realised I didn't know what the reception had been

    jenit: DSPL alternative

    <danbri> DSPL is [19]https://developers.google.com/public-data/
    ; Omar I mentioned earlier was working to migrate this to
    schema.org / RDF / JSON world

      [19] https://developers.google.com/public-data/

    <danbri> [20]https://www.w3.org/wiki/WebSchemas/LookInside

      [20] https://www.w3.org/wiki/WebSchemas/LookInside

    jenit: used the format in our (ODI) tools
    ... and providing feedback (ldodds)
    ... would they contrib a draft?

    phila: Rufus is IE in this WG because it helps align the work.
    ... this WG will likely go beyond that work as extensions.
    Maybe WG NOTE for existing work.

    <danbri> I'd suggest we take it as expressivity requirements
    and we 'should' at least have a clear mapping

    jenit: will contact Rufus
    ... can we take into account data package work?

    <JeniT> ... in the conversions

    <JeniT> [21]http://w3c.github.io/csvw/syntax/#package

      [21] http://w3c.github.io/csvw/syntax/#package

    jenit: AOB?

    jtandy: timescales?
    ... next publication esp UCR doc?

    phila: no lower limit on repub cycle

    jtandy: Happy to move forward in May

    jenit: UCR will remain "open" to capture new discoveries.

    jtandy: requirements are placeholders, more categorization and
    "accept" requirements

    jenit: aim of mid May with more UCs.
    ... ??
    ... after Easter, process to accept requirements.

    danbri: propose skip next week

    <jtandy> +1 to skip

    danbri to chair next time, 2 weeks time. Wed after Easter.

    ADJOURNED

    <phila> DNM 23 April
Received on Wednesday, 9 April 2014 13:04:23 UTC