Re: Spec review request: CSV on the Web from Ashok Malhotra on 2015-04-28 (public-csv-wg@w3.org from April 2015)

From: Ashok Malhotra <ashok.malhotra@oracle.com>
Date: Tue, 28 Apr 2015 11:55:43 -0400
To: Dan Brickley <danbri@google.com>
CC: Jeni Tennison <jeni@jenitennison.com>, www-tag@w3.org, "public-csv-wg@w3.org" <public-csv-wg@w3.org>
Message-ID: <553FAD7F.8020801@oracle.com>

Hi Dan:
Obviously I would have been happier if the WG had decided to create a new branch
of the spec that dealt with mapping CSV tables to Relational tables, but the WG decided
to do the next best thing, which is to explain why it could not/did not do that.
That is acceptable.

What are the plans of the WG for the future?  If the WG is continuing, might we see
a mapping from CSV to Relational in the future?
All the best, Ashok

On 4/28/2015 10:57 AM, Dan Brickley wrote:
> On 19 April 2015 at 00:07, ashok malhotra <ashok.malhotra@oracle.com> wrote:
>> Shouldn't it be possible to create Relational tables from tabular data?
>> That is, after all, a popular use of tabular data.
>> There are probably existing tools and standards to do this but I would
>> think it was worth at least a mention.
> Hi Ashok,
>
> I took an action during last week's CSVW call to respond to you on
> behalf of the Working Group.
>
> Firstly we would like to thank you for your feedback on the specs. We
> hope the subsequent discussion here helped clarify our approach. As a
> result of that discussion, we have resolved to incorporate the bulk of
> text from https://github.com/w3c/csvw/wiki/Deviations-from-the-charter
> (at least regarding R2RML and RDB Direct Mapping) into our
> specifications. We hope that you consider that this addresses your
> concerns appropriately. If you would prefer us to register an ongoing
> open issue here for other feedback aspects, please let us know what
> the ongoing concern is.
>
> Many thanks,
>
> Dan, for the CSVW WG
>
> p.s. a note for www-tag mailing list participants - we would be
> extremely grateful if you could take care to update the Subject: line
> when your mails are on topics other than specific feedback to these
> versions of our specs, to help us keep track of things properly and
> not be overwhelmed by related discussion inspired by our
> (inspirational!) docs.
>
>
>> All the best, Ashok
>>
>>
>> On 4/18/2015 6:24 AM, Jeni Tennison wrote:
>>> Hello TAG,
>>>
>>> The CSV on the Web Working Group would like to request that the TAG review
>>> the following Working Drafts:
>>>
>>>     Model for Tabular Data and Metadata on the Web -
>>>       http://www.w3.org/TR/2015/WD-tabular-data-model-20150416/
>>>     Metadata Vocabulary for Tabular Data -
>>>       http://www.w3.org/TR/2015/WD-tabular-metadata-20150416/
>>>     Generating JSON from Tabular Data on the Web -
>>>       http://www.w3.org/TR/2015/WD-csv2json-20150416/
>>>     Generating RDF from Tabular Data on the Web -
>>>       http://www.w3.org/TR/2015/WD-csv2rdf-20150416/
>>>
>>> There are three things in particular that I’d like to draw the TAG’s
>>> attention to, where we have adopted a “pragmatic” rather than “correct”
>>> design:
>>>
>>> 1. We have a facility to enable transformations over tabular data using
>>> templates or scripts [1], to provide for transformations beyond those we’ve
>>> defined for JSON and RDF. In doing this we need to be able to indicate the
>>> format of both the result of the transformation and the format of the
>>> template or script that is being used.
>>>
>>> We think that the “correct” way of doing this would be to use media types.
>>> However, it’s quite rare for templating syntaxes (such as Mustache) to have
>>> a registered media type, so instead we have opted to use URLs to name those
>>> formats and encourage users to use URLs in the form
>>> http://www.iana.org/assignments/media-types/{mediatype} when there is a
>>> registered media type. Is this the right approach to take or should we be
>>> more insistent on the use of a media type?
>>>
>>> 2. In the conversion to RDF, we want to use the ‘describes’ link relation
>>> defined in [2] to say that a particular row in the tabular data describes a
>>> particular thing (such as a person or event). Because this is RDF, the
>>> relationship has to have a URL.
>>>
>>> However, as has been discussed elsewhere [3], IANA registered link
>>> relations do not have individual URLs and
>>> http://www.iana.org/assignments/link-relations/describes doesn’t resolve.
>>> Similarly, the link relation wiki doesn’t have individual URLs for link
>>> relations. We decided to create a URL for this relationship in our own
>>> namespace, with a reference to the proper definition (see discussion at
>>> [4]), but hope that this case might prompt the TAG to try to get some
>>> movement on this issue.
>>>
>>> 3. The model of access that we’re assuming for CSV and other tabular data
>>> files is that someone will link directly to the CSV file (as currently) and
>>> that processors will need to retrieve a metadata file about that CSV based
>>> on the location of the CSV file. Note that metadata files are file-specific;
>>> we wouldn’t expect a single metadata file that includes information about
>>> every CSV file on a particular site.
>>>
>>> We think that the “correct” way of getting this pointer to a metadata file
>>> (given that there is no scope for embedding information within the CSV file
>>> itself) is to use a Link header that points to the metadata file, and we
>>> have specified that here [5].
>>>
>>> However, we recognise that there are many publishing environments in which
>>> it is impossible for users to set HTTP headers, particularly on an
>>> individual file basis. We have therefore specified two other mechanisms to
>>> retrieve metadata files, used only if the URL of the original CSV file
>>> doesn’t include a query string:
>>>
>>>     * appending ‘-metadata.json’ to the end of the URL to get file-specific
>>> metadata [6]
>>>     * resolving the URL ‘../metadata.json’ against the URL to get
>>> directory-level metadata [7]
>>>
>>> Neither of these feels great: they require users who can’t use Link
>>> headers to structure their URL space in particular ways, and they use string
>>> concatenation on URLs which is horrible. However, we can’t see any better
>>> alternative to meet our requirement for what is in effect a file-specific
>>> well known URI.
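
The two fallback lookups described in point 3 can be sketched roughly as
follows. This is only an illustration of the rules as worded in the message
above, not the Working Group's algorithm: the function name and the
example.org URLs are hypothetical, and the Link header lookup, which the
message describes as the preferred mechanism, is left out.

```python
from urllib.parse import urljoin, urlsplit


def candidate_metadata_urls(csv_url):
    """Return the fallback metadata URLs for a CSV URL (hypothetical helper).

    An empty list means the CSV URL carries a query string, in which case
    neither fallback is used.
    """
    if urlsplit(csv_url).query:
        return []
    return [
        # File-specific metadata: plain string concatenation on the URL.
        csv_url + "-metadata.json",
        # Directory-level metadata: relative reference resolved against the URL.
        urljoin(csv_url, "../metadata.json"),
    ]


print(candidate_metadata_urls("http://example.org/data/2015/cities.csv"))
# ['http://example.org/data/2015/cities.csv-metadata.json',
#  'http://example.org/data/metadata.json']
```

Note that resolving '../metadata.json' against the file URL yields a location
one level above the directory that holds the CSV file, which shows how much
these rules depend on the publisher's URL layout.
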
>>>
>>> We’d obviously welcome wider review of the documents if you have time, but
>>> these are the three issues on which we’d particularly like your opinion.
>>>
>>> Thanks,
>>>
>>> Jeni
>>>
>>> [1]
>>> http://www.w3.org/TR/2015/WD-tabular-metadata-20150416/#transformation-definitions
>>> [2] http://tools.ietf.org/html/rfc6892
>>> [3] https://github.com/mnot/I-D/issues/39
>>> [4] https://github.com/w3c/csvw/issues/297
>>> [5] http://www.w3.org/TR/2015/WD-tabular-data-model-20150416/#link-header
>>> [6]
>>> http://www.w3.org/TR/2015/WD-tabular-data-model-20150416/#standard-file-metadata
>>> [7]
>>> http://www.w3.org/TR/2015/WD-tabular-data-model-20150416/#standard-directory-metadata
>>> --
>>> Jeni Tennison
>>> http://www.jenitennison.com/
>>>
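
As an illustration of point 1 in the quoted request, the convention of naming
formats by URL might look like the sketch below. The function name and the
Mustache URL are assumptions made for the example; only the
http://www.iana.org/assignments/media-types/{mediatype} pattern comes from
the message.

```python
IANA_MEDIA_TYPES = "http://www.iana.org/assignments/media-types/"


def format_url(media_type=None, fallback_url=None):
    """Name a format by URL (hypothetical helper).

    Prefer the IANA pattern when a registered media type is known;
    otherwise use a URL chosen by the publisher.
    """
    if media_type:
        return IANA_MEDIA_TYPES + media_type
    if fallback_url:
        return fallback_url
    raise ValueError("need a media type or a fallback URL")


# Registered media type: fill the {mediatype} slot of the pattern.
print(format_url(media_type="application/json"))
# Templating syntax with no registered media type: the URL here is assumed.
print(format_url(fallback_url="https://mustache.github.io/"))
```
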
Received on Tuesday, 28 April 2015 15:56:27 UTC