Re: [dxwg] Profile negotiation [RPFN] from Rob Atkinson on 2018-06-06 (public-dxwg-wg@w3.org from June 2018)

From: Rob Atkinson <rob@metalinkage.com.au>
Date: Wed, 6 Jun 2018 11:12:25 +1000
To: Annette Greiner <amgreiner@lbl.gov>
Cc: Rob Atkinson <rob@metalinkage.com.au>, Ruben Verborgh <Ruben.Verborgh@ugent.be>, "public-dxwg-wg@w3.org" <public-dxwg-wg@w3.org>
Message-ID: <CACfF9LyAPLS7o4OOfco2pVjOFkn3ABtPE=rUjDC0=HQMcnfxFA@mail.gmail.com>
The combination of dcterms:conformsTo on a distribution, and profileDesc,
gives that option for a catalog using DCAT...

And all of the above are clients a user might use...
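
As a rough illustration of the catalog-side option, here is a minimal sketch of a search that filters on a conformsTo value per distribution. Everything in it is an assumption made for illustration: the in-memory catalog, the dictionary keys (which only loosely mirror DCAT terms), the profile IRIs, and the function name find_distributions. A real catalog would query its own metadata store.

```python
# Hypothetical in-memory catalog. The keys loosely mirror DCAT terms,
# but the structure, titles, and IRIs are assumptions for illustration.
PROFILE_X = "http://example.org/profiles/X"
PROFILE_Y = "http://example.org/profiles/Y"

CATALOG = [
    {
        "title": "Amsterdam weather reports",
        "keywords": ["weather"],
        "distributions": [
            {"accessURL": "http://example.org/weather/amsterdam.json",
             "conformsTo": [PROFILE_X]},
            {"accessURL": "http://example.org/weather/amsterdam.xml",
             "conformsTo": [PROFILE_Y]},
        ],
    },
    {
        "title": "Rotterdam traffic counts",
        "keywords": ["traffic"],
        "distributions": [
            {"accessURL": "http://example.org/traffic/rotterdam.json",
             "conformsTo": [PROFILE_X]},
        ],
    },
]


def find_distributions(catalog, profile, keyword=None):
    """Return (dataset title, accessURL) pairs for every distribution
    that declares conformance to `profile`. If `keyword` is given,
    only datasets tagged with that keyword are considered."""
    hits = []
    for dataset in catalog:
        if keyword is not None and keyword not in dataset["keywords"]:
            continue
        for dist in dataset["distributions"]:
            if profile in dist["conformsTo"]:
                hits.append((dataset["title"], dist["accessURL"]))
    return hits


if __name__ == "__main__":
    # Select a profile in the search and combine it with another criterion.
    for title, url in find_distributions(CATALOG, PROFILE_X, keyword="weather"):
        print(title, url)
```

With this shape, a user can pick a profile as one search criterion among others and does not have to test each result separately.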


On Wed, 6 Jun 2018, 10:57 Annette Greiner <amgreiner@lbl.gov> wrote:

> One thing to consider is how important it might be for data catalogs to
> have information about what profile options are available. As an end user
> seeking a dataset with which I might build an app or a visualization, I
> think it would be nice to be able to select a profile in a search and get a
> list of datasets that use that profile and also meet my other criteria. I
> wouldn't want to have to test each result separately to see if I can use it.
>
> But I want to understand who you mean by "the user" here. A human? A
> developer writing a web app? A script? Who are they the client of? The
> original publisher? The data catalog publisher?
>
> On 6/5/18 5:22 PM, Rob Atkinson wrote:
>
>
> This is what this UC is trying to cover ..
> https://github.com/w3c/dxwg/issues/239
>
> All you say is correct - and at one level profile negotiation adds a
> mechanism, which is extra complexity. From the user's perspective, however,
> it means that an object identifier becomes a potential source of the
> meta-information - you don't have all the extra complexity of dealing with
> a catalog to find this info - or even of finding the right catalog. The
> server is its own catalog, if you like (and in fact it may even be
> implemented that way).
>
> This provides an optional mechanism that significantly simplifies the user
> experience - at the cost of more server smarts. The good news in that
> scenario is that server smarts are paid for once. Currently all the burden
> is on the user, with no standardised mechanisms, and the user pays (or, in
> practice, more likely is unable to access the data).
>
> So - a great conversation. Let's keep all these factors in mind and see if
> we can find the right set of tools and recommendations for the best
> solution for a Web of Data outcome, recognising that point solutions for
> smaller communities already exist and will remain attractive. Cataloguing
> these things is still probably the only option. It is just better if we
> have one information model for both cases.
>
>
>
>
>
> On Wed, 6 Jun 2018 at 10:09 Annette Greiner <amgreiner@lbl.gov> wrote:
>
>> What I'm seeing a requirement for is a standardized way to indicate the
>> availability of alternative forms of a dataset with different profiles
>> and to enable the end user (human or script) to receive the most
>> appropriate one for their use.
>>
>> Consider the case where the client is a human, browsing to find a
>> dataset that matches a certain profile that they like. If they are using
>> a typical commercial browser, they don't have a ready facility to use
>> content negotiation.
>>
>> Consider the case where the client is a script harvesting datasets for a
>> catalog. If the catalog publishers want to be able to indicate which
>> profiles are available for a dataset, they need to capture a list of
>> available profile options. Using content negotiation, they need to make
>> a request and then capture the list of available formats that the server
>> returns in the header. For that to work, the script needs to be written
>> to expect negotiation as one way it can get such data. If everyone
>> publishes their data this way, that's fine. But what if content
>> negotiation by profile follows the adoption trend of content negotiation
>> by other dimensions? Then the script would need to expect other means of
>> offering the list of possible profiles. Certainly at least initially,
>> adoption will be low. So adding negotiation to the mix adds complexity
>> rather than removing it.
>>
>> Consider the case where the client is a script for a web application.
>> The script needs data with a specific profile to work at all.  This case
>> works with negotiation, but it's not clear to me that it wouldn't work
>> as well with a link-based approach, e.g. a link with an attribute that
>> indicates its profile. The threshold to use on the publisher's side is
>> extremely low for that approach. On the client side, it's easier and
>> faster to check an attribute in a link than to try to follow it and then
>> parse the header to see if you received what you wanted.
>>
>> Re registration, if you want user agents to be able to do anything with
>> your MIME type other than download it, it needs to be registered. I
>> suppose that, if the profile creator wants user agents to be able to do
>> anything profile-specific with a dataset, they would supply a
>> dereferenceable IRI.
>>
>> Re representations vs resources, I think we agree that they are
>> something of a continuum. That's what I mean when I say it's a choice
>> whether to treat an entity as one or the other. I'm thinking of content
>> negotiation, where a resource is a thing with a URL and a representation
>> is a version of it that a user agent may receive depending on the Accept
>> headers in the request.
>>
>> -Annette
>>
>>
>>
>> On 6/5/18 2:13 PM, Ruben Verborgh wrote:
>> > Hi Annette,
>> >
>> >> What do you mean? Links are already available in http.
>> > Yeah, but you'd need a standardized way to say
>> > "this link points to representation of X with profiles Y, Z"
>> >
>> >>> Content negotiation is simply an existing mechanism
>> >>> for connecting a resource to representations,
>> >>> so reusing it seems better than inventing a new link-based
>> negotiation mechanism.
>> >> You are assuming the need for negotiation. That's what I'm asking you
>> to justify.
>> > No, I'm assuming a need for clients
>> > to automatically find the representation they want,
>> > and I'm proposing content negotiation for that
>> > as opposed to a link-only mechanism.
>> >
>> >>> Furthermore, linking assumes that there is a finite number of
>> representations,
>> >>> and not a combinatorial explosion of all combinations that can be
>> made.
>> >> There *is* a finite number of representations that would be available.
>> > Finite, yes. Necessarily small, no.
>> >
>> >> You would have to configure the server to return the right
>> representations, and you would have to have created each of those
>> representations.
>> > In any case, but that's independent of the mechanism to find them.
>> >
>> >>> Finally, it integrates with negotiation in other dimensions, such as
>> >>> "give me the French document in XML conforming to profiles X, Y, Z".
>> >> Yes, that is nice. But there are other possible dimensions to data.
>> Why negotiate for this one?
>> > Quite the contrary: let's negotiate all dimensions.
>> > We already do this for content type and language.
>> >
>> >> One can think of different versions of datasets as different resources
>> if one wants.
>> > Yes, the usage of content negotiation does not alter that.
>> >
>> >> In fact, one could argue that it is always a different resource
>> because it contains different values.
>> > Sure, but that is independent of the mechanism to arrive at the right
>> one.
>> >
>> >> It's a choice to decide that it should be treated as a representation.
>> What motivates that choice?
>> > You seem to use "representation" as an opposite of "resource", but
>> that's not correct.
>> > As I've explained on GitHub, "representation" is a relative notion, not
>> an absolute one:
>> >
>> >>> To understand this, it's important to see that the "representation"
>> concept is a relative notion. E.g., in the sentence "A is a representation
>> of B", B is the resource that A is the representation of. However, A is a
>> resource in its own right.
>> >>>
>> >>> An example to clarify:
>> >>>
>> >>>     1. http://example.org/weather/amsterdam/2018-06-01 is the weather
>> report for Amsterdam for 1 June
>> >>>     2. http://example.org/weather/amsterdam/2018-06-01.html is the
>> weather report for Amsterdam for 1 June in HTML
>> >>> Regardless of whether 2 has its own URL, all of the following hold:
>> >>>
>> >>>     • 1 is a resource
>> >>>     • 2 is a resource
>> >>>     • 2 is a representation of 1
>> >>>> Why is automated discovery needed?
>> >>> Because it's a manual thing otherwise.
>> >> That is a tautology.
>> > I'll try to explain better.
>> >
>> > If you have a client that fetches resources represented in a certain
>> profile,
>> > do you want it to ask you every time what link it should follow,
>> > or do you want it to be able to select the right link itself?
>> >
>> >>> You don't want your client to ask you what links to follow.
>> >> Why not? That is how hypermedia APIs work.
>> > Nothing in hypermedia APIs requires clients to ask you such things.
>> > It is a possibility, but not a requirement.
>> >
>> >> Adding negotiation as a new alternative means that crawling the web of
>> data has to involve checking for profile options by content negotiation in
>> addition to checking what is available through links.
>> > You're still free to link to them.
>> >
>> >> But I get the feeling you have a specific use case in mind where this
>> all makes immediate sense. *What is that use case?*
>> > I have a client that can read certain JSON profiles.
>> > I want that client to operate on dataset X.
>> > The client should be able to get X in a profile it understands.
>> >
>> >> Registration of new MIME types is needed.
>> > I'm afraid that's not correct.
>> > I can just start using application/vnd.my-thing whenever I want to,
>> > and I do not need to register that with IETF.
>> >
>> >> How do you get around new profiles needing to be registered?
>> > You mint an IRI for them.
>> >
>> > Best,
>> >
>> > Ruben
>>
>> --
>> Annette Greiner
>> NERSC Data and Analytics Services
>> Lawrence Berkeley National Laboratory
>>
>>
>>
> --
> Annette Greiner
> NERSC Data and Analytics Services
> Lawrence Berkeley National Laboratory
>
>
>
Received on Wednesday, 6 June 2018 01:13:22 UTC