Re: [dxwg] Profile negotiation [RPFN] from Karen Coyle on 2018-06-06 (public-dxwg-wg@w3.org from June 2018)

From: Karen Coyle <kcoyle@kcoyle.net>
Date: Wed, 6 Jun 2018 05:14:00 -0700
To: public-dxwg-wg@w3.org
Message-ID: <c9d80794-683d-c2fe-870d-6a3ba3704994@kcoyle.net>
Rob and Ruben,

The use cases that we address must express real situations that
illustrate a problem that we wish to solve. A good example of that is
the Europeana use case[1] which describes a real life, current situation
regarding uses of the EDM that result in some problems for Europeana.
Abstraction from a use case like that to a solution space is where we
need to go after use cases are well-defined.

Perhaps you could use the Europeana use case to illustrate your ideas
for content negotiation. (It is presented as a content negotiation
problem in the use case.)

I know that Lars provided some library-related illustrations in the IETF
document.[2] Those aren't given in great detail there, though, so it
might be necessary to flesh them out. I do think that the Europeana case
is the best one we have to ground this discussion.

kc
[1]
https://docs.google.com/document/d/13hV2tJ6Kg2Hfe7e1BowY5QfCIweH9GxSCFQV1aWtOPg/edit?ts=5b15cc3a#heading=h.poqzq68p2cgj
[2]
https://profilenegotiation.github.io/I-D-Accept--Schema/I-D-accept-schema

On 6/5/18 6:12 PM, Rob Atkinson wrote:
> The combination of dcterms:conformsTo on a distribution and
> profileDesc gives that option for a catalog using DCAT...
> 
> And all of the above are clients a user might use...
> 
> 
> On Wed, 6 Jun 2018, 10:57 Annette Greiner <amgreiner@lbl.gov> wrote:
> 
>     One thing to consider is how important it might be for data catalogs
>     to have information about what profile options are available. As an
>     end user seeking a dataset with which I might build an app or a
>     visualization, I think it would be nice to be able to select a
>     profile in a search and get a list of datasets that use that profile
>     and also meet my other criteria. I wouldn't want to have to test
>     each result separately to see if I can use it.
> 
>     But I want to understand who you mean by "the user" here. A human? A
>     developer writing a web app? A script? Who are they the client of?
>     The original publisher? The data catalog publisher?
> 
> 
>     On 6/5/18 5:22 PM, Rob Atkinson wrote:
>>
>>     This is what this UC is trying to cover:
>>     https://github.com/w3c/dxwg/issues/239
>>
>>     All you say is correct - and at one level profile negotiation adds
>>     a mechanism which is extra complexity. From the user's perspective,
>>     however, it means that an object identifier becomes a potential
>>     source of the meta-information - you don't have all the extra
>>     complexity of dealing with a catalog to find this info - or even
>>     finding the right catalog. The server is its own catalog, if you like
>>     (and in fact it may even be implemented that way).
>>
>>     This provides an optional mechanism that significantly simplifies
>>     the user experience - at the cost of more server smarts. The good
>>     news in that scenario is that server smarts are paid for once.
>>     Currently all the burden is on the user, with no standardised
>>     mechanisms, and the user pays (or, in practice, more likely is
>>     unable to access data).
>>
>>     So - a great conversation to keep in mind all these factors and
>>     see if we can find the right set of tools and recommendations for
>>     the best solution for a Web of Data outcome, recognising that
>>     point solutions for smaller communities already exist and will
>>     remain attractive. Cataloguing these things is still probably the
>>     only option. Just better if we have one information model for both
>>     cases.
>>
>>
>>
>>
>>
>>     On Wed, 6 Jun 2018 at 10:09 Annette Greiner <amgreiner@lbl.gov> wrote:
>>
>>         What I'm seeing a requirement for is a standardized way to
>>         indicate the availability of alternative forms of a dataset
>>         with different profiles and to enable the end user (human or
>>         script) to receive the most appropriate one for their use.
>>
>>         Consider the case where the client is a human, browsing to find a
>>         dataset that matches a certain profile that they like. If they
>>         are using a typical commercial browser, they don't have a ready
>>         facility to use content negotiation.
>>
>>         Consider the case where the client is a script harvesting
>>         datasets for a catalog. If the catalog publishers want to be
>>         able to indicate which profiles are available for a dataset,
>>         they need to capture a list of available profile options. Using
>>         content negotiation, they need to make a request and then
>>         capture the list of available formats that the server returns
>>         in the header. For that to work, the script needs to be written
>>         to expect negotiation as one way it can get such data. If
>>         everyone publishes their data this way, that's fine. But what
>>         if content negotiation by profile follows the adoption trend of
>>         content negotiation by other dimensions? Then the script would
>>         need to expect other means of offering the list of possible
>>         profiles. Certainly at least initially, adoption will be low.
>>         So adding negotiation to the mix adds complexity rather than
>>         removing it.
>>
>>         Consider the case where the client is a script for a web
>>         application. The script needs data with a specific profile to
>>         work at all. This case works with negotiation, but it's not
>>         clear to me that it wouldn't work as well with a link-based
>>         approach, e.g. a link with an attribute that indicates its
>>         profile. The threshold to use on the publisher's side is
>>         extremely low for that approach. On the client side, it's
>>         easier and faster to check an attribute in a link than to try
>>         to follow it and then parse the header to see if you received
>>         what you wanted.
>>
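
The link-based approach described in the paragraph above can be sketched
in Python. This is a minimal illustration only: the "profile" link
parameter, the profile IRIs, and the dataset URLs are assumptions made for
the example, not syntax that anyone in this thread has proposed or
standardized. The client inspects the attribute on each link and never has
to fetch a representation to find out what it is.

```python
import re

# Hypothetical Link header value. The "profile" parameter name, the
# profile IRIs, and the dataset URLs are assumptions for illustration.
LINK_HEADER = (
    '<http://example.org/dataset/x.edm.xml>; rel="alternate"; '
    'profile="http://example.org/profiles/edm", '
    '<http://example.org/dataset/x.dcat.ttl>; rel="alternate"; '
    'profile="http://example.org/profiles/dcat-ap"'
)


def parse_links(header):
    """Return (target, params) tuples from a simple Link header value.

    Only handles quoted parameter values, which is enough for this sketch.
    """
    links = []
    pattern = r'<([^>]*)>((?:\s*;\s*[\w-]+="[^"]*")*)'
    for match in re.finditer(pattern, header):
        target, raw_params = match.groups()
        params = dict(re.findall(r'([\w-]+)="([^"]*)"', raw_params))
        links.append((target, params))
    return links


def pick_link(header, wanted_profile):
    """Return the first link target whose profile matches, else None."""
    for target, params in parse_links(header):
        if params.get("profile") == wanted_profile:
            return target
    return None
```

A client that only understands one profile would call pick_link() with
that profile's IRI and follow the returned URL, or give up if the result
is None.
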
>>         Re registration, if you want user agents to be able to do
>>         anything with your MIME type other than download it, it needs
>>         to be registered. I suppose that, if the profile creator wants
>>         user agents to be able to do anything profile-specific with a
>>         dataset, they would supply a dereferenceable IRI.
>>
>>         Re representations vs resources, I think we agree that they are
>>         something of a continuum. That's what I mean when I say it's a
>>         choice whether to treat an entity as one or the other. I'm
>>         thinking of content negotiation, where a resource is a thing
>>         with a URL and a representation is a version of it that a user
>>         agent may receive depending on the accept headers in the
>>         request.
>>
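
As a rough illustration of the resource/representation relationship in
ordinary content negotiation by media type, here is a Python sketch of the
choice a server makes from the Accept header. The representation table and
the .json URL are hypothetical, and wildcards such as */* are deliberately
ignored to keep the sketch short; it is not a complete implementation of
HTTP negotiation.

```python
# Hypothetical representations of one resource, keyed by media type.
REPRESENTATIONS = {
    "text/html": "/weather/amsterdam/2018-06-01.html",
    "application/json": "/weather/amsterdam/2018-06-01.json",
}


def parse_accept(header):
    """Parse an Accept header into (media_type, q) pairs, highest q first."""
    prefs = []
    for part in header.split(","):
        fields = [f.strip() for f in part.split(";")]
        media_type = fields[0]
        if not media_type:
            continue
        q = 1.0  # default quality when no q parameter is given
        for field in fields[1:]:
            name, _, value = field.partition("=")
            if name.strip() == "q":
                try:
                    q = float(value)
                except ValueError:
                    q = 0.0  # treat an unparsable q as "not acceptable"
        prefs.append((media_type, q))
    # sorted() is stable, so equal q-values keep their header order.
    return sorted(prefs, key=lambda p: p[1], reverse=True)


def negotiate(accept_header, representations):
    """Return the representation for the best acceptable type, or None."""
    for media_type, q in parse_accept(accept_header):
        if q > 0 and media_type in representations:
            return representations[media_type]
    return None
```

The same resource URL yields a different representation depending on the
header the user agent sends, for example
negotiate("application/json;q=0.9, text/html;q=0.5", REPRESENTATIONS).
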
>>         -Annette
>>
>>
>>
>>         On 6/5/18 2:13 PM, Ruben Verborgh wrote:
>>         > Hi Annette,
>>         >
>>         >> What do you mean? Links are already available in http.
>>         > Yeah, but you'd need a standardized way to say
>>         > "this link points to representation of X with profiles Y, Z"
>>         >
>>         >>> Content negotiation is simply an existing mechanism
>>         >>> for connecting a resource to representations,
>>         >>> so reusing it seems better than inventing a new link-based
>>         >>> negotiation mechanism.
>>         >> You are assuming the need for negotiation. That's what I'm
>>         >> asking you to justify.
>>         > No, I'm assuming a need for clients
>>         > to automatically find the representation they want,
>>         > and I'm proposing content negotiation for that
>>         > as opposed to a link-only mechanism.
>>         >
>>         >>> Furthermore, linking assumes that there is a finite number
>>         >>> of representations,
>>         >>> and not a combinatorial explosion of all combinations that
>>         >>> can be made.
>>         >> There *is* a finite number of representations that would be
>>         >> available.
>>         > Finite, yes. Necessarily small, no.
>>         >
>>         >> You would have to configure the server to return the right
>>         >> representations, and you would have to have created each of
>>         >> those representations.
>>         > In any case, but that's independent of the mechanism to find
>>         > them.
>>         >
>>         >>> Finally, it integrates with negotiation in other
>>         >>> dimensions, such as
>>         >>> "give me the French document in XML conforming to profiles
>>         >>> X, Y, Z".
>>         >> Yes, that is nice. But there are other possible dimensions
>>         >> to data. Why negotiate for this one?
>>         > Quite the contrary: let's negotiate all dimensions.
>>         > We already do this for content type and language.
>>         >
>>         >> One can think of different versions of datasets as
>>         >> different resources if one wants.
>>         > Yes, the usage of content negotiation does not alter that.
>>         >
>>         >> In fact, one could argue that it is always a different
>>         >> resource because it contains different values.
>>         > Sure, but that is independent of the mechanism to arrive at
>>         > the right one.
>>         >
>>         >> It's a choice to decide that it should be treated as a
>>         >> representation. What motivates that choice?
>>         > You seem to use "representation" as an opposite of
>>         > "resource", but that's not correct.
>>         > As I've explained on GitHub, "representation" is a relative
>>         > notion, not an absolute one:
>>         >
>>         >>> To understand this, it's important to see that the
>>         >>> "representation" concept is a relative notion. E.g., in the
>>         >>> sentence "A is a representation of B", B is the resource
>>         >>> that A is the representation of. However, A is a resource
>>         >>> in its own right.
>>         >>>
>>         >>> An example to clarify:
>>         >>>
>>         >>>     1. http://example.org/weather/amsterdam/2018-06-01 is
>>         >>>        the weather report for Amsterdam for 1 June
>>         >>>     2. http://example.org/weather/amsterdam/2018-06-01.html
>>         >>>        is the weather report for Amsterdam for 1 June in HTML
>>         >>>
>>         >>> Regardless of whether 2 has its own URL, all of the
>>         >>> following hold:
>>         >>>
>>         >>>     • 1 is a resource
>>         >>>     • 2 is a resource
>>         >>>     • 2 is a representation of 1
>>         >>>> Why is automated discovery needed?
>>         >>> Because it's a manual thing otherwise.
>>         >> That is a tautology.
>>         > I'll try to explain better.
>>         >
>>         > If you have a client that fetches resources represented in a
>>         > certain profile,
>>         > do you want it to ask you every time what link it should follow,
>>         > or do you want it to be able to select the right link itself?
>>         >
>>         >>> You don't want your client to ask you what links to follow.
>>         >> Why not? That is how hypermedia APIs work.
>>         > Nothing in hypermedia APIs requires clients to ask you such
>>         > things.
>>         > It is a possibility, but not a requirement.
>>         >
>>         >> Adding negotiation as a new alternative means that crawling
>>         >> the web of data has to involve checking for profile options by
>>         >> content negotiation in addition to checking what is available
>>         >> through links.
>>         > You're still free to link to them.
>>         >
>>         >> But I get the feeling you have a specific use case in mind
>>         >> where this all makes immediate sense. *What is that use case?*
>>         > I have a client that can read certain JSON profiles.
>>         > I want that client to operate on dataset X.
>>         > The client should be able to get X in a profile it understands.
>>         >
>>         >> Registration of new MIME types is needed.
>>         > I'm afraid that's not correct.
>>         > I can just start using application/vnd.my-thing whenever I
>>         > want to,
>>         > and I do not need to register that with IETF.
>>         >
>>         >> How do you get around new profiles needing to be registered?
>>         > You mint an IRI for them.
>>         >
>>         > Best,
>>         >
>>         > Ruben
>>
>>         -- 
>>         Annette Greiner
>>         NERSC Data and Analytics Services
>>         Lawrence Berkeley National Laboratory
>>
>>
> 
>     -- 
>     Annette Greiner
>     NERSC Data and Analytics Services
>     Lawrence Berkeley National Laboratory
> 

-- 
Karen Coyle
kcoyle@kcoyle.net http://kcoyle.net
m: 1-510-435-8234 (Signal)
skype: kcoylenet/+1-510-984-3600
Received on Wednesday, 6 June 2018 12:14:30 UTC