Re: [dxwg] Profile negotiation [RPFN] from Annette Greiner on 2018-06-06 (public-dxwg-wg@w3.org from June 2018)

From: Annette Greiner <amgreiner@lbl.gov>
Date: Tue, 5 Jun 2018 17:57:05 -0700
To: Rob Atkinson <rob@metalinkage.com.au>
Cc: Ruben Verborgh <Ruben.Verborgh@ugent.be>, "public-dxwg-wg@w3.org" <public-dxwg-wg@w3.org>
Message-ID: <be2dd8ec-0b8e-537c-7f71-c3e19cf0fe06@lbl.gov>
One thing to consider is how important it might be for data catalogs to 
have information about what profile options are available. As an end 
user seeking a dataset with which I might build an app or a 
visualization, I think it would be nice to be able to select a profile 
in a search and get a list of datasets that use that profile and also 
meet my other criteria. I wouldn't want to have to test each result 
separately to see if I can use it.
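
To make the catalog scenario concrete, here is a minimal sketch of the
kind of search I have in mind. It assumes each catalog record carries a
list of the profile IRIs its dataset conforms to; the record fields, the
`search` function, and the example IRIs are all hypothetical and not
taken from any existing catalog API or vocabulary.

```python
from dataclasses import dataclass, field


@dataclass
class DatasetRecord:
    """Hypothetical catalog entry; the field names are illustrative only."""
    title: str
    keywords: set
    profiles: set = field(default_factory=set)  # profile IRIs (assumed field)


def search(records, profile, keyword):
    """Return titles of records that offer `profile` and match `keyword`."""
    return [
        r.title
        for r in records
        if profile in r.profiles and keyword in r.keywords
    ]


# Made-up records and profile IRIs, for illustration only.
PROFILE_A = "http://example.org/profiles/a"
PROFILE_B = "http://example.org/profiles/b"

CATALOG = [
    DatasetRecord("Amsterdam weather", {"weather"}, {PROFILE_A, PROFILE_B}),
    DatasetRecord("Berkeley weather", {"weather"}, {PROFILE_B}),
    DatasetRecord("Amsterdam traffic", {"traffic"}, {PROFILE_A}),
]

matches = search(CATALOG, PROFILE_A, "weather")
```

With the profile list stored in the catalog, one query answers both
questions at once, and I never have to probe each dataset individually.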

But I want to understand who you mean by "the user" here. A human? A 
developer writing a web app? A script? Who are they the client of? The 
original publisher? The data catalog publisher?


On 6/5/18 5:22 PM, Rob Atkinson wrote:
>
> This is what this UC is trying to cover .. 
> https://github.com/w3c/dxwg/issues/239
>
> All you say is correct - and at one level profile negotiation adds a
> mechanism which is extra complexity. From the user's perspective,
> however, it means that an object identifier becomes a potential source
> of the meta-information - you don't have all the extra complexity of
> dealing with a catalog to find this info - or even finding the right
> catalog. The server is its own catalog, if you like (and in fact it may
> even be implemented that way).
>
> This provides an optional mechanism that significantly simplifies the
> user experience - at the cost of more server smarts. The good news in
> that scenario is that server smarts are paid for once. Currently all the
> burden is on the user, with no standardised mechanisms, and the user
> pays (or, in practice, is more likely unable to access the data).
>
> So - a great conversation to keep in mind all these factors and see if 
> we can find the right set of tools and recommendations for the best 
> solution for a Web of Data outcome, recognising that point solutions 
> for smaller communities already exist and will remain attractive. 
> Cataloguing these things is still probably the only option. Just 
> better if we have one information model for both cases.
>
>
>
>
>
> On Wed, 6 Jun 2018 at 10:09 Annette Greiner <amgreiner@lbl.gov 
> <mailto:amgreiner@lbl.gov>> wrote:
>
>     What I'm seeing a requirement for is a standardized way to indicate
>     the availability of alternative forms of a dataset with different
>     profiles and to enable the end user (human or script) to receive the
>     most appropriate one for their use.
>
>     Consider the case where the client is a human, browsing to find a
>     dataset that matches a certain profile that they like. If they are
>     using a typical commercial browser, they don't have a ready facility
>     to use content negotiation.
>
>     Consider the case where the client is a script harvesting datasets
>     for a catalog. If the catalog publishers want to be able to indicate
>     which profiles are available for a dataset, they need to capture a
>     list of available profile options. Using content negotiation, they
>     need to make a request and then capture the list of available
>     formats that the server returns in the header. For that to work, the
>     script needs to be written to expect negotiation as one way it can
>     get such data. If everyone publishes their data this way, that's
>     fine. But what if content negotiation by profile follows the
>     adoption trend of content negotiation by other dimensions? Then the
>     script would need to expect other means of offering the list of
>     possible profiles. Certainly at least initially, adoption will be
>     low. So adding negotiation to the mix adds complexity rather than
>     removing it.
>
>     Consider the case where the client is a script for a web
>     application. The script needs data with a specific profile to work
>     at all. This case works with negotiation, but it's not clear to me
>     that it wouldn't work as well with a link-based approach, e.g. a
>     link with an attribute that indicates its profile. The threshold to
>     use on the publisher's side is extremely low for that approach. On
>     the client side, it's easier and faster to check an attribute in a
>     link than to try to follow it and then parse the header to see if
>     you received what you wanted.
>
>     Re registration, if you want user agents to be able to do anything
>     with your MIME type other than download it, it needs to be
>     registered. I suppose that, if the profile creator wants user agents
>     to be able to do anything profile-specific with a dataset, they
>     would supply a dereferenceable IRI.
>
>     Re representations vs resources, I think we agree that they are
>     something of a continuum. That's what I mean when I say it's a choice
>     whether to treat an entity as one or the other. I'm thinking of
>     content negotiation, where a resource is a thing with a URL and a
>     representation is a version of it that a user agent may receive
>     depending on the accept headers in the request.
>
>     -Annette
>
>
>
>     On 6/5/18 2:13 PM, Ruben Verborgh wrote:
>     > Hi Annette,
>     >
>     >> What do you mean? Links are already available in http.
>     > Yeah, but you'd need a standardized way to say
>     > "this link points to representation of X with profiles Y, Z"
>     >
>     >>> Content negotiation is simply an existing mechanism
>     >>> for connecting a resource to representations,
>     >>> so reusing it seems better than inventing a new link-based
>     >>> negotiation mechanism.
>     >> You are assuming the need for negotiation. That's what I'm
>     >> asking you to justify.
>     > No, I'm assuming a need for clients
>     > to automatically find the representation they want,
>     > and I'm proposing content negotiation for that
>     > as opposed to a link-only mechanism.
>     >
>     >>> Furthermore, linking assumes that there is a finite number of
>     >>> representations, and not a combinatorial explosion of all
>     >>> combinations that can be made.
>     >> There *is* a finite number of representations that would be
>     >> available.
>     > Finite, yes. Necessarily small, no.
>     >
>     >> You would have to configure the server to return the right
>     >> representations, and you would have to have created each of those
>     >> representations.
>     > In any case, but that's independent of the mechanism to find them.
>     >
>     >>> Finally, it integrates with negotiation in other dimensions,
>     >>> such as "give me the French document in XML conforming to
>     >>> profiles X, Y, Z".
>     >> Yes, that is nice. But there are other possible dimensions to
>     >> data. Why negotiate for this one?
>     > Quite the contrary: let's negotiate all dimensions.
>     > We already do this for content type and language.
>     >
>     >> One can think of different versions of datasets as different
>     >> resources if one wants.
>     > Yes, the usage of content negotiation does not alter that.
>     >
>     >> In fact, one could argue that it is always a different resource
>     >> because it contains different values.
>     > Sure, but that is independent of the mechanism to arrive at the
>     > right one.
>     >
>     >> It's a choice to decide that it should be treated as a
>     >> representation. What motivates that choice?
>     > You seem to use "representation" as an opposite of "resource",
>     > but that's not correct.
>     > As I've explained on GitHub, "representation" is a relative
>     > notion, not an absolute one:
>     >
>     >>> To understand this, it's important to see that the
>     >>> "representation" concept is a relative notion. E.g., in the
>     >>> sentence "A is a representation of B", B is the resource that A
>     >>> is the representation of. However, A is a resource in its own
>     >>> right.
>     >>>
>     >>> An example to clarify:
>     >>>
>     >>>     1. http://example.org/weather/amsterdam/2018-06-01 is the
>     >>>        weather report for Amsterdam for 1 June
>     >>>     2. http://example.org/weather/amsterdam/2018-06-01.html is
>     >>>        the weather report for Amsterdam for 1 June in HTML
>     >>>
>     >>> Regardless of whether 2 has its own URL, all of the following
>     >>> hold:
>     >>>
>     >>>     • 1 is a resource
>     >>>     • 2 is a resource
>     >>>     • 2 is a representation of 1
>     >>>> Why is automated discovery needed?
>     >>> Because it's a manual thing otherwise.
>     >> That is a tautology.
>     > I'll try to explain better.
>     >
>     > If you have a client that fetches resources represented in a
>     > certain profile,
>     > do you want it to ask you every time what link it should follow,
>     > or do you want it to be able to select the right link itself?
>     >
>     >>> You don't want your client to ask you what links to follow.
>     >> Why not? That is how hypermedia APIs work.
>     > Nothing in hypermedia APIs requires clients to ask you such things.
>     > It is a possibility, but not a requirement.
>     >
>     >> Adding negotiation as a new alternative means that crawling the
>     >> web of data has to involve checking for profile options by
>     >> content negotiation in addition to checking what is available
>     >> through links.
>     > You're still free to link to them.
>     >
>     >> But I get the feeling you have a specific use case in mind
>     >> where this all makes immediate sense. *What is that use case?*
>     > I have a client that can read certain JSON profiles.
>     > I want that client to operate on dataset X.
>     > The client should be able to get X in a profile it understands.
>     >
>     >> Registration of new MIME types is needed.
>     > I'm afraid that's not correct.
>     > I can just start using application/vnd.my-thing whenever I want to,
>     > and I do not need to register that with IETF.
>     >
>     >> How do you get around new profiles needing to be registered?
>     > You mint an IRI for them.
>     >
>     > Best,
>     >
>     > Ruben
>
>     -- 
>     Annette Greiner
>     NERSC Data and Analytics Services
>     Lawrence Berkeley National Laboratory
>
>

-- 
Annette Greiner
NERSC Data and Analytics Services
Lawrence Berkeley National Laboratory
Received on Wednesday, 6 June 2018 00:57:27 UTC