Re: [dxwg] Profile negotiation [RPFN] from Annette Greiner on 2018-06-06 (public-dxwg-wg@w3.org from June 2018)

From: Annette Greiner <amgreiner@lbl.gov>
Date: Tue, 5 Jun 2018 17:09:05 -0700
To: Ruben Verborgh <Ruben.Verborgh@UGent.be>
Cc: "public-dxwg-wg@w3.org" <public-dxwg-wg@w3.org>
Message-ID: <9cf6e4ab-98eb-3c96-cd93-0862c1bb4b89@lbl.gov>
The requirement I see is for a standardized way to indicate the 
availability of alternative forms of a dataset with different profiles 
and to enable the end user (human or script) to receive the most 
appropriate one for their use.

Consider the case where the client is a human, browsing to find a 
dataset that matches a certain profile that they like. If they are using 
a typical commercial browser, they don't have a ready facility to use 
content negotiation.

Consider the case where the client is a script harvesting datasets for a 
catalog. If the catalog publishers want to be able to indicate which 
profiles are available for a dataset, they need to capture a list of 
available profile options. Using content negotiation, they need to make 
a request and then capture the list of available formats that the server 
returns in the header. For that to work, the script needs to be written 
to expect negotiation as one way it can get such data. If everyone 
publishes their data this way, that's fine. But what if content 
negotiation by profile follows the adoption trend of content negotiation 
by other dimensions? Then the script would need to expect other means of 
offering the list of possible profiles. Certainly at least initially, 
adoption will be low. So adding negotiation to the mix adds complexity 
rather than removing it.
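
To make the harvesting case concrete, here is a minimal Python sketch of
what such a script might have to do. It assumes, purely for
illustration, that the server lists alternatives in a Link header whose
entries carry a "profile" parameter; that header layout and the names
profiles_from_link_header and harvest_profiles are my assumptions, not
anything a spec defines today.

```python
import re

# Assumed layout (hypothetical): <target>; rel="alternate"; profile="<IRI>"
LINK_ENTRY = re.compile(r'<([^>]*)>((?:\s*;\s*[\w-]+="[^"]*")*)')
PARAM = re.compile(r'([\w-]+)="([^"]*)"')


def profiles_from_link_header(header_value):
    """Return {profile IRI: target URL} for rel="alternate" entries."""
    found = {}
    for target, raw_params in LINK_ENTRY.findall(header_value):
        params = dict(PARAM.findall(raw_params))
        if params.get("rel") == "alternate" and "profile" in params:
            found[params["profile"]] = target
    return found


def harvest_profiles(headers, linked_profiles):
    """Merge profiles found in the header with those found via links.

    The harvester still needs `linked_profiles` (gathered some other way)
    for every publisher that does not support negotiation.
    """
    merged = dict(linked_profiles)
    merged.update(profiles_from_link_header(headers.get("Link", "")))
    return merged


example_header = (
    '<http://example.org/x.a>; rel="alternate"; profile="http://example.org/p/a", '
    '<http://example.org/x.b>; rel="alternate"; profile="http://example.org/p/b"'
)
```

The point of harvest_profiles is that the header path is an extra code
path next to whatever the script already does for links, not a
replacement for it.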

Consider the case where the client is a script for a web application. 
The script needs data with a specific profile to work at all.  This case 
works with negotiation, but it's not clear to me that it wouldn't work 
as well with a link-based approach, e.g. a link with an attribute that 
indicates its profile. The barrier to adoption on the publisher's side is 
extremely low for that approach. On the client side, it's easier and 
faster to check an attribute in a link than to try to follow it and then 
parse the header to see if you received what you wanted.
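
As a rough sketch of the link-based check, assuming a hypothetical
"profile" attribute on ordinary links (no such attribute is
standardized for this purpose; the attribute name, the example URLs, and
the helper names below are all made up for illustration):

```python
from html.parser import HTMLParser


class ProfileLinkFinder(HTMLParser):
    """Collect href values of <a>/<link> elements, keyed by a
    hypothetical 'profile' attribute."""

    def __init__(self):
        super().__init__()
        self.links = {}

    def handle_starttag(self, tag, attrs):
        if tag not in ("a", "link"):
            return
        attr_map = dict(attrs)
        profile = attr_map.get("profile")
        href = attr_map.get("href")
        if profile and href:
            self.links.setdefault(profile, []).append(href)


def find_links_for_profile(html, wanted_profile):
    """Return the hrefs whose 'profile' attribute equals wanted_profile."""
    finder = ProfileLinkFinder()
    finder.feed(html)
    return finder.links.get(wanted_profile, [])


page = """
<ul>
  <li><a href="/datasets/x.a.json" profile="http://example.org/p/a">X (profile A)</a></li>
  <li><a href="/datasets/x.b.json" profile="http://example.org/p/b">X (profile B)</a></li>
</ul>
"""

matches = find_links_for_profile(page, "http://example.org/p/b")
```

The client picks its link from markup it has already fetched, without
making a second request and inspecting the response headers.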

Re registration, if you want user agents to be able to do anything with 
your MIME type other than download it, it needs to be registered. I 
suppose that, if the profile creator wants user agents to be able to do 
anything profile-specific with a dataset, they would supply a 
dereferenceable IRI.

Re representations vs resources, I think we agree that they are 
something of a continuum. That's what I mean when I say it's a choice 
whether to treat an entity as one or the other. I'm thinking of content 
negotiation, where a resource is a thing with a URL and a representation 
is a version of it that a user agent may receive depending on the Accept 
headers in the request.

-Annette



On 6/5/18 2:13 PM, Ruben Verborgh wrote:
> Hi Annette,
>
>> What do you mean? Links are already available in HTTP.
> Yeah, but you'd need a standardized way to say
> "this link points to representation of X with profiles Y, Z"
>
>>> Content negotiation is simply an existing mechanism
>>> for connecting a resource to representations,
>>> so reusing it seems better than inventing a new link-based negotiation mechanism.
>> You are assuming the need for negotiation. That's what I'm asking you to justify.
> No, I'm assuming a need for clients
> to automatically find the representation they want,
> and I'm proposing content negotiation for that
> as opposed to a link-only mechanism.
>
>>> Furthermore, linking assumes that there is a finite number of representations,
>>> and not a combinatorial explosion of all combinations that can be made.
>> There *is* a finite number of representations that would be available.
> Finite, yes. Necessarily small, no.
>
>> You would have to configure the server to return the right representations, and you would have to have created each of those representations.
> In any case, but that's independent of the mechanism to find them.
>
>>> Finally, it integrates with negotiation in other dimensions, such as
>>> "give me the French document in XML conforming to profiles X, Y, Z".
>> Yes, that is nice. But there are other possible dimensions to data. Why negotiate for this one?
> Quite the contrary: let's negotiate all dimensions.
> We already do this for content type and language.
>
>> One can think of different versions of datasets as different resources if one wants.
> Yes, the usage of content negotiation does not alter that.
>
>> In fact, one could argue that it is always a different resource because it contains different values.
> Sure, but that is independent of the mechanism to arrive at the right one.
>
>> It's a choice to decide that it should be treated as a representation. What motivates that choice?
> You seem to use "representation" as an opposite of "resource", but that's not correct.
> As I've explained on GitHub, "representation" is a relative notion, not an absolute one:
>
>>> To understand this, it's important to see that the "representation" concept is a relative notion. E.g., in the sentence "A is a representation of B", B is the resource that A is the representation of. However, A is a resource in its own right.
>>>
>>> An example to clarify:
>>>
>>> 	1. http://example.org/weather/amsterdam/2018-06-01 is the weather report for Amsterdam for 1 June
>>> 	2. http://example.org/weather/amsterdam/2018-06-01.html is the weather report for Amsterdam for 1 June in HTML
>>>
>>> Regardless of whether 2 has its own URL, all of the following hold:
>>>
>>> 	• 1 is a resource
>>> 	• 2 is a resource
>>> 	• 2 is a representation of 1
>>>> Why is automated discovery needed?
>>> Because it's a manual thing otherwise.
>> That is a tautology.
> I'll try to explain better.
>
> If you have a client that fetches resources represented in a certain profile,
> do you want it to ask you every time what link it should follow,
> or do you want it to be able to select the right link itself?
>
>>> You don't want your client to ask you what links to follow.
>> Why not? That is how hypermedia APIs work.
> Nothing in hypermedia APIs requires clients to ask you such things.
> It is a possibility, but not a requirement.
>
>> Adding negotiation as a new alternative means that crawling the web of data has to involve checking for profile options by content negotiation in addition to checking what is available through links.
> You're still free to link to them.
>
>> But I get the feeling you have a specific use case in mind where this all makes immediate sense. *What is that use case?*
> I have a client that can read certain JSON profiles.
> I want that client to operate on dataset X.
> The client should be able to get X in a profile it understands.
>
>> Registration of new MIME types is needed.
> I'm afraid that's not correct.
> I can just start using application/vnd.my-thing whenever I want to,
> and I do not need to register that with IETF.
>
>> How do you get around new profiles needing to be registered?
> You mint an IRI for them.
>
> Best,
>
> Ruben

-- 
Annette Greiner
NERSC Data and Analytics Services
Lawrence Berkeley National Laboratory
Received on Wednesday, 6 June 2018 00:09:25 UTC