Re: Spec review request: CSV on the Web from Paul Libbrecht on 2015-04-20 (public-csv-wg@w3.org from April 2015)

From: Paul Libbrecht <paul@hoplahup.net>
Date: Mon, 20 Apr 2015 18:42:58 +0200
To: Yakov Shafranovich <yakov-ietf@shaftek.org>
CC: Jeni Tennison <jeni@jenitennison.com>, www-tag@w3.org, "public-csv-wg@w3.org" <public-csv-wg@w3.org>
Message-ID: <55352C92.80300@hoplahup.net>
Yakov,

the push to include clipboard-specific data types stems from a discussion
in the Math WG, I believe.
It appeared natural to bring it there, and the group agreed to put it there.

Since then (that was a few years ago), I have kept suggesting that
registrations which seem applicable to clipboard types consider including it.
It did not work in all cases. The best case was SVG, where it worked
right away.

Unfortunately, that push came too late for the IETF's revision of the
registration process, so it is not currently among the questions, even
though the process is only a year old or so. Right after the revised
process appeared, a few discussions on the media-types mailing list
seemed to indicate it would be a good idea, but only for the next
process revision... which could take a long time.

So, no discrepancy thus far, just some extra information.

paul

On 20/04/15 04:54, Yakov Shafranovich wrote:
> Just as a side point, there seems to be a discrepancy in the way media
> types are being registered between the W3C and the IETF. Some W3C
> registrations are coming in with both Windows Clipboard Names and OSX
> UTIs while these parameters are not part of the IETF's media type
> registration template. The relevant IETF and W3C pages do not mention
> anything about this:
>
> https://www.iana.org/form/media-types
> http://www.w3.org/2002/06/registering-mediatype2014.html
>
> But MathML does include it:
>
> http://www.w3.org/TR/MathML3/appendixb.html
>
> Yakov
>
> On Sun, Apr 19, 2015 at 7:51 AM, Paul Libbrecht <paul@hoplahup.net> wrote:
>> Dear Jeni and all,
>>
>> am I mistaken, or is there nothing about clipboard formats in this set of
>> specs?
>> Ideally, this would be in a media-type declaration, but it seems like the
>> one here would also be suitable.
>> Basically, it would be clipboard flavour names for Windows and UTIs for OSX.
>>
>> For lack of such a convention, HTML tables are sniffed and can be copied
>> and pasted, with partial success, from some browsers to some
>> spreadsheets (thus far: Firefox + Excel only)... This seems to be the
>> only way thus far and, indeed, xls or csv exports are pretty common as
>> an extra service of web applications, whereas select, copy, and
>> paste would be far more intuitive.
>>
>> thanks in advance.
>>
>> Paul
>>
>>
>>
>> On 18/04/15 12:24, Jeni Tennison wrote:
>>> Hello TAG,
>>>
>>> The CSV on the Web Working Group would like to request that the TAG review the following Working Drafts:
>>>
>>>   Model for Tabular Data and Metadata on the Web -
>>>     http://www.w3.org/TR/2015/WD-tabular-data-model-20150416/
>>>   Metadata Vocabulary for Tabular Data -
>>>     http://www.w3.org/TR/2015/WD-tabular-metadata-20150416/
>>>   Generating JSON from Tabular Data on the Web -
>>>     http://www.w3.org/TR/2015/WD-csv2json-20150416/
>>>   Generating RDF from Tabular Data on the Web -
>>>     http://www.w3.org/TR/2015/WD-csv2rdf-20150416/
>>>
>>> There are three things in particular that I'd like to draw the TAG's attention to, where we have adopted a "pragmatic" rather than "correct" design:
>>>
>>> 1. We have a facility to enable transformations over tabular data using templates or scripts [1], to provide for transformations beyond those we've defined for JSON and RDF. In doing this we need to be able to indicate the format of both the result of the transformation and the format of the template or script that is being used.
>>>
>>> We think that the "correct" way of doing this would be to use media types. However, it's quite rare for templating syntaxes (such as Mustache) to have a registered media type, so instead we have opted to use URLs to name those formats and encourage users to use URLs in the form http://www.iana.org/assignments/media-types/{mediatype} when there is a registered media type. Is this the right approach to take or should we be more insistent on the use of a media type?
>>>
>>> 2. In the conversion to RDF, we want to use the 'describes' link relation defined in [2] to say that a particular row in the tabular data describes a particular thing (such as a person or event). Because this is RDF, the relationship has to have a URL.
>>>
>>> However, as has been discussed elsewhere [3], IANA registered link relations do not have individual URLs and http://www.iana.org/assignments/link-relations/describes doesn't resolve. Similarly, the link relation wiki doesn't have individual URLs for link relations. We decided to create a URL for this relationship in our own namespace, with a reference to the proper definition (see discussion at [4]), but hope that this case might prompt the TAG to try to get some movement on this issue.
>>>
>>> 3. The model of access that we're assuming for CSV and other tabular data files is that someone will link directly to the CSV file (as currently) and that processors will need to retrieve a metadata file about that CSV based on the location of the CSV file. Note that metadata files are file-specific; we wouldn't expect a single metadata file that includes information about every CSV file on a particular site.
>>>
>>> We think that the "correct" way of getting this pointer to a metadata file (given that there is no scope for embedding information within the CSV file itself) is to use a Link header that points to the metadata file, and we have specified that here [5].
>>>
>>> However, we recognise that there are many publishing environments in which it is impossible for users to set HTTP headers, particularly on an individual file basis. We have therefore specified two other mechanisms to retrieve metadata files, used only if the URL of the original CSV file doesn't include a query string:
>>>
>>>   * appending '-metadata.json' to the end of the URL to get file-specific metadata [6]
>>>   * resolving the URL '../metadata.json' against the URL to get directory-level metadata [7]
>>>
>>> Neither of these feels great: they require users who can't use Link headers to structure their URL space in particular ways, and they use string concatenation on URLs which is horrible. However, we can't see any better alternative to meet our requirement for what is in effect a file-specific well known URI.
>>>
>>> We'd obviously welcome wider review of the documents if you have time, but these are the three issues on which we'd particularly like your opinion.
>>>
>>> Thanks,
>>>
>>> Jeni
>>>
>>> [1] http://www.w3.org/TR/2015/WD-tabular-metadata-20150416/#transformation-definitions
>>> [2] http://tools.ietf.org/html/rfc6892
>>> [3] https://github.com/mnot/I-D/issues/39
>>> [4] https://github.com/w3c/csvw/issues/297
>>> [5] http://www.w3.org/TR/2015/WD-tabular-data-model-20150416/#link-header
>>> [6] http://www.w3.org/TR/2015/WD-tabular-data-model-20150416/#standard-file-metadata
>>> [7] http://www.w3.org/TR/2015/WD-tabular-data-model-20150416/#standard-directory-metadata
>>> --
>>> Jeni Tennison
>>> http://www.jenitennison.com/
>>>
>>
>
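
The two fallback metadata locations described in point 3 of the quoted
review request can be sketched roughly as follows. This is only an
illustration built from the wording of that message, assuming Python's
standard urllib.parse module; the function name fallback_metadata_urls and
the example.org URLs are hypothetical, and the Link header mechanism is not
covered.

```python
from urllib.parse import urljoin, urlsplit


def fallback_metadata_urls(csv_url: str) -> list[str]:
    """Return candidate metadata URLs for a CSV URL (hypothetical helper).

    Follows the two fallbacks described in the quoted message; both apply
    only when the CSV URL carries no query string.
    """
    if urlsplit(csv_url).query:
        # A query string is present, so neither fallback is used.
        return []
    return [
        # File-specific metadata: plain string concatenation on the URL.
        csv_url + "-metadata.json",
        # Directory-level metadata: relative reference resolved against the URL.
        urljoin(csv_url, "../metadata.json"),
    ]


if __name__ == "__main__":
    for url in fallback_metadata_urls("http://example.org/data/tree.csv"):
        print(url)
```

The sketch assumes the CSV URL has no fragment identifier; a real processor
would also have to decide in which order to try the candidates and what to
do when a request fails.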
Received on Monday, 20 April 2015 16:43:33 UTC