Re: .well-known from David Booth on 2015-06-19 (public-csv-wg@w3.org from June 2015)

From: David Booth <david@dbooth.org>
Date: Fri, 19 Jun 2015 03:11:15 -0400
To: Ivan Herman <ivan@w3.org>
CC: Gregg Kellogg <gregg@greggkellogg.net>, W3C CSV on the Web Working Group <public-csv-wg@w3.org>
Message-ID: <5583C093.6070606@dbooth.org>

Hi Ivan,

On 06/19/2015 01:19 AM, Ivan Herman wrote:
> David,
>
> I see one issue with your argumentation. You say:
>
> [[[ - A *required* extra web access, nearly *every* time a conforming
> CSVW processor is given a tabular data URL and wishes to find the
> associated metadata -- because surely http://example/.well-known/csvm
> will be 404 (and not cachable) in the vast majority of cases. ]]]
>
> which is of course correct in the sense that .well-known is not
> widely deployed.

And it will continue to be not widely deployed, in my view, because the 
only real need someone would have for installing a .well-known/csvm file 
would be if they thought they might actually publish a JSON file next to 
a CSV file, with the exact content -- completely coincidentally -- 
required to be *accidentally* interpreted as a CSVW metadata file for 
that CSV file.  The chances of that happening seem so remote that I 
doubt it will *ever* happen in our lifetimes.

> But, I think, everybody in the discussion agreed
> that the ideal solution that we should propose is to use the Link
> header of the CSV file's HTTP return to refer to the metadata file
> and, in the scheme we have, if the HTTP returns that header, then the
> CSVW processor will *not* look into .well-known. Ie, in some sense,
> we push people to use the best solution…

Yes, that's a fair point, though I don't see many CSV publishers being 
able to set Link headers.
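
For concreteness, the lookup order under discussion (Link header first, then the site-wide .well-known file, then the default fall-back) might be sketched roughly as below. This is only an illustration and not text from the draft: the function name, its parameters, the "{+url}" placeholder and the single default pattern are assumptions, and the pattern expansion is a naive string replacement rather than real URI template processing.

```python
from urllib.parse import urljoin

# Assumed default pattern, used when /.well-known/csvm cannot be fetched
# (for example because the request returns 404).
DEFAULT_PATTERNS = ["{+url}-metadata.json"]


def candidate_metadata_urls(csv_url, link_target=None, well_known_body=None):
    """Return the metadata URLs to try, in order, for a tabular data URL.

    link_target: target of a Link header on the CSV response, if any.
    well_known_body: text of /.well-known/csvm, or None if the fetch failed.
    """
    if link_target is not None:
        # A Link header short-circuits the lookup: no .well-known access.
        return [urljoin(csv_url, link_target)]
    if well_known_body is None:
        # Fall back to the default pattern(s).
        patterns = DEFAULT_PATTERNS
    else:
        # One URI pattern per line; blank lines are skipped.
        patterns = [ln.strip() for ln in well_known_body.splitlines() if ln.strip()]
    # Naive expansion: only the literal "{+url}" placeholder is supported.
    return [urljoin(csv_url, p.replace("{+url}", csv_url)) for p in patterns]
```

In this sketch the extra GET for .well-known/csvm is only needed on the second and third paths, which is the cost being debated in this thread.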

>
> B.t.w: why do you say 'not cachable'? Is that a restriction on
> .well-known? I have not seen any…

I said "not cachable" because the response will most likely be 404, and 
it is very unlikely that the host would make that cachable, especially 
since it didn't bother to put a .well-known/csvm file there.

David Booth

>
> Ivan
>
>
>
>
>
>> On 18 Jun 2015, at 21:29 , David Booth <david@dbooth.org> wrote:
>>
>> On 06/18/2015 12:56 PM, Gregg Kellogg wrote:
>>>> On Jun 17, 2015, at 7:43 PM, David Booth <david@dbooth.org>
>>>> wrote:
>>>>
>>>> On 06/17/2015 02:29 AM, Ivan Herman wrote:
>>>>> David,
>>>>>
>>>>> the .well-known mechanism is the result of a long discussion
>>>>> with the TAG that had difficulties with the principle of
>>>>> baking in URI-schemes like "-metadata.json".
>>>>
>>>> Is there a pointer to that discussion?   It sounds like the
>>>> TAG concern is URI squatting.  URI squatting is an important
>>>> concern, but I don't think it applies in this case, because --
>>>> if I've understood correctly -- a metadata file *explicitly*
>>>> references the relevant data file, which in effect means that
>>>> the URI owner has clearly indicated an intent to use that URI
>>>> for that purpose.
>>>
>>> Hi David, I found a link to the minutes here:
>>> https://github.com/w3ctag/meetings/blob/gh-pages/2015/telcons/06-03-csv-minutes.md
>>>
>>>
>>> (already added to the issue).
>>>
>>> The minutes aren’t particularly illuminating, but the issue
>>> raised by mnot was definitely concern over squatting. At this
>>> point, it seems to be settled. I’ve implemented it in my
>>> implementation, and it was quite straight-forward, although it
>>> requires an extra GET, the result of this can be cached for some
>>> time (subject to policies, of course).
>>
>> Thanks very much for the pointer.  I've read through the discussion
>> and the TAG meeting minutes, and re-read RFC 7320, and I'm
>> convinced that concerns about URI squatting are unfounded in this
>> case.  I have written to the TAG to push back, explaining how this
>> case is different from URI squatting, and the use of .well-known
>> would actually cause more harm than good in this case:
>> https://lists.w3.org/Archives/Public/www-tag/2015Jun/0011.html
>>
>> BTW, you are extremely unlikely to be able to cache the result of
>> accessing .well-known/csvm, because in the vast majority of cases
>> it will be 404.
>>
>> Thanks, David Booth
>>
>>>
>>>> HOWEVER, I no longer see any mention of .well-known in the
>>>> current editor's draft, so maybe my concern is moot:
>>>> http://w3c.github.io/csvw/syntax/#locating-metadata
>>>
>>> It’s still in a PR that hasn’t yet been pulled:
>>> https://github.com/w3c/csvw/pull/605. You likely saw a page based
>>> on that branch, rather than the gh-pages branch where the ED is
>>> available.
>>>
>>> It’s awaiting resolution of some minor wording on what “no such
>>> file is located” means, precisely.
>>>
>>> Gregg
>>>
>>>> Has the .well-known mechanism now been removed from the
>>>> algorithm for finding metadata?
>>>>
>>>> Thanks, David Booth
>>>>
>>>>> Note that the agreement is to have a default fall-back, ie,
>>>>> if the .well-known file does not exist then the client can
>>>>> fall back to a default value which, actually, reproduces the
>>>>> previous patterns. I think we should go ahead with this
>>>>> approach to cover all points of view.
>>>>>
>>>>> Ivan
>>>>>
>>>>>
>>>>>
>>>>>> On 17 Jun 2015, at 05:20 , David Booth <david@dbooth.org>
>>>>>> wrote:
>>>>>>
>>>>>> I'm sorry to ask this question at this point, but is
>>>>>> .well-known *really* needed for this?
>>>>>>
>>>>>> I am concerned that it is just adding complexity and
>>>>>> network accesses for dubious benefit.  AFAICT -- but please
>>>>>> correct me if I've overlooked something -- the only
>>>>>> "benefit" that .well-known adds here is to allow users to
>>>>>> use non-standard names for their metadata files.  And what
>>>>>> *real* benefit is that?  It seems to me to be adding
>>>>>> pointless variability.  Are there really cases where users
>>>>>> *cannot* name their metadata files to end with
>>>>>> "-metadata.json"?  If so what are they?
>>>>>>
>>>>>> David Booth
>>>>>>
>>>>>> On 06/16/2015 09:20 PM, Yakov Shafranovich wrote:
>>>>>>> Hmm. I am wondering if we can use the host-meta file
>>>>>>> instead, skipping the registration, as per this:
>>>>>>>
>>>>>>> https://tools.ietf.org/html/rfc6415#section-4.2
>>>>>>>
>>>>>>> On Tue, Jun 16, 2015 at 4:01 PM, Gregg Kellogg
>>>>>>> <gregg@greggkellogg.net> wrote:
>>>>>>>> On Jun 16, 2015, at 12:55 PM, Yakov Shafranovich
>>>>>>>> <yakov-ietf@shaftek.org> wrote:
>>>>>>>>
>>>>>>>> What's the proposed format?
>>>>>>>>
>>>>>>>> It's simply a file with one URI pattern per line. You
>>>>>>>> can see the proposed text here:
>>>>>>>> https://rawgit.com/w3c/csvw/98e728bcfef8d30e68c10f9cd798da0d39c7d172/syntax/index.html#site-wide-location-configuration
>>>>>>>>
>>>>>>>> Gregg
>>>>>>>>
>>>>>>>> On Jun 16, 2015 3:38 PM, "Ivan Herman" <ivan@w3.org>
>>>>>>>> wrote:
>>>>>>>>>
>>>>>>>>> Jeni, Gregg,
>>>>>>>>>
>>>>>>>>> I have just received the green light from our system
>>>>>>>>> people to set up the .well-known csvm file. Can you
>>>>>>>>> ping me when the changes are added to the documents
>>>>>>>>> and the issue is closed? I would also need to know if
>>>>>>>>> it should contain anything else than the default.
>>>>>>>>>
>>>>>>>>> I will also take care of the registration when the
>>>>>>>>> document is available.
>>>>>>>>>
>>>>>>>>> Thanks
>>>>>>>>>
>>>>>>>>> Ivan
>>>>>>>>>
>>>>>>>>> ---- Ivan Herman +31 641044153
>>>>>>>>>
>>>>>>>>> (Written on my mobile. Excuses for brevity and
>>>>>>>>> frequent misspellings...)
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>
>>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>
>>>>>
>>>>>
>>>>> ---- Ivan Herman, W3C Digital Publishing Activity Lead Home:
>>>>> http://www.w3.org/People/Ivan/ mobile: +31-641044153 ORCID
>>>>> ID: http://orcid.org/0000-0003-0782-2704
>
>
> ---- Ivan Herman, W3C Digital Publishing Activity Lead Home:
> http://www.w3.org/People/Ivan/ mobile: +31-641044153 ORCID ID:
> http://orcid.org/0000-0003-0782-2704
>
>
>
>
Received on Friday, 19 June 2015 07:11:47 UTC