Re: ISSUE-32: Should hydra:returns and hydra:statusCodes be removed to avoid tight coupling? (was: More Thoughts on Links and Operation Subclasses)

Again, thanks for taking the time to break apart these two issues. 


On Feb 4, 2014, at 10:39 AM, Markus Lanthaler <markus.lanthaler@gmx.net> wrote:

> OK, this is the second thread. This one is trying to find an answer to the
> following question:
> 
>  "Should hydra:returns and hydra:statusCodes be removed to avoid tight
> coupling?" - ISSUE-32 
> 

Yes.

> 
> On Tuesday, February 04, 2014 2:11 AM, Ryan J. McDonough wrote:
>> On Feb 3, 2014, at 3:19 PM, Markus Lanthaler wrote:
>>> On Friday, January 31, 2014 4:50 PM, Ryan J. McDonough wrote:
> [...]
>> Look at some of the comments on the old Facebook API. You have people
>> whining because they expected a 200 (Ok) and get an image but instead
>> they got a 303 or 302 instead and they're all perplexed. A good number
>> of devs will take the documentation as fact rather than look at what's
>> coming back in the response. Intermediaries on the other hand will
>> never look at your documentation and will always look at the headers
>> and message body. Some would argue that the client is also an
>> intermediary.
> 
> Hmm.. that's a very good point. So in your opinion no additional information
> about the returned status codes is necessary? Not even at the API level? For
> example, to make it clear that if the quota limit is hit a "402 Payment
> Required" is returned instead of a "429 Too Many Requests"?

Ah, so it seems we have constraints and consequences but not so much return types? This is really interesting, but I still don’t think returns is the right solution for something like this. A perfect example of this is the Yahoo YQL service and its service limits:

http://developer.yahoo.com/yql/

Being able to express these limits in Hydra would be incredibly powerful. But this is a great use case and we should gather more of these as they’ll help drive requirements. What do you think about capturing these in the GitHub Issues?

> 
> [...]
> 
>>>> The fact that HTML doesn't concern the browser with things like returns
>>>> types and potential response codes is one of the things that makes the
>>>> web work today. Beating the horse a little more, consider a checkout
>>>> process whereby one of my payment options use PayPal and I'm sending
>>>> data via POST to PayPal's payment API and awaiting the response from
>>>> PayPal. I'm going to send the client from my API in my domain, over to
>>>> PayPal, where I don't have control over PayPal's API, more importantly
>>>> their namespace or response codes. Using my ApiDocumentation to
>>>> describe what I think is PayPal's expected response types is recipe for
>>>> failure.
>>> 
>>> Then just leave it out :-) It's not required to specify these things.
>> 
>> You could, but there will be those who are expecting the response types
>> because it's in some Hydra descriptors but not others.
> 
> Fair enough. So, in other words, you are saying that clients would fail
> because they can't find that information?

Yes. Either that, or they might somehow draw the conclusion that an API that doesn’t define a hydra:returns property is incomplete or broken.

> 
> 
>>>> Now, without a doubt, HTML forms don't do much to describe what the
>>>> form does. HTML relies on the fact that a human can parse the text on
>>>> the page in order to determine the control's function. In Hydra, we're
>>>> trying to get at the machine parseable analog to descriptive text. I
>>> 
>>> Exactly. The solution I chose was to type operations as I felt that's the
>>> simplest solution that a lot of people will understand instinctively. How
>>> would you describe it instead?
>> 
>> I agree that people will instinctively get this, but HTTP doesn't work
>> this way. HTTP requests can have variable response types and Hydra is
>> doing what WADL and Swagger do and suggesting that there's a single,
>> fixed response type.
>> 
>> Instead, I would recommend that developers work in an asynchronous,
>> event-driven fashion and specify handlers to sense and react to the data
>> in the response by looking at the headers to determine if the client
>> can in fact respond to it. Without a doubt, the service needs to work
>> within the constraints of the client. That is, the response should likely
>> be in JSON-LD and ideally conform to some type hierarchy defined in the
>> service descriptor.
> 
> OK. As you know, there are many vocabularies out there. For ecommerce, e.g.,
> there are Schema.org and GoodRelations. Now I think it is necessary to
> somehow document what kind of types the client should be prepared to handle.
> Similar to how you would need to ensure that your client understands the set
> of media types used in a specific API. This doesn't have to be described at
> the operation level though. We have "supportedClasses" on "ApiDocumentation"
> and could leverage that instead.
> 
> What do you think about that? Do you find that equally problematic? If so,
> where would you start if you were to program a client? Would you crawl the
> API? By trial and error?

I actually quite like that. It is in line with my suggestion of using Link headers with the profile relation to determine what is in the JSON-LD response. But in order to react to that, I’d need to know something about the API’s capabilities first. Personally, I’d look for the ApiDocumentation first and see what data models it references and what classes it supports. What you describe works for me.
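
To make the profile idea a bit more concrete, here is a rough sketch in Python of a client picking a handler from the Link header. Everything named here is an assumption on my part: the example.org profile URIs, the handler signature, and the very naive Link parsing (it does not cope with commas inside quoted parameters, so a real client should use a proper parser).

```python
import re
from typing import Callable, Dict, List

# One link-value: <target> followed by zero or more ";param" pieces.
# Naive on purpose: commas inside quoted parameters are not handled.
_LINK_VALUE = re.compile(r'<([^>]*)>\s*((?:;\s*[^;,]+)*)')
_REL_PARAM = re.compile(r';\s*rel\s*=\s*"?([^";]+)"?', re.IGNORECASE)

Handler = Callable[[dict], str]


def profile_uris(link_header: str) -> List[str]:
    """Return the targets of all links whose rel includes "profile"."""
    uris = []
    for target, params in _LINK_VALUE.findall(link_header):
        rel = _REL_PARAM.search(params)
        if rel and "profile" in rel.group(1).lower().split():
            uris.append(target)
    return uris


def pick_handler(link_header: str,
                 handlers: Dict[str, Handler],
                 default: Handler) -> Handler:
    """Choose a handler by profile URI; fall back to a generic one."""
    for uri in profile_uris(link_header):
        if uri in handlers:
            return handlers[uri]
    return default
```

The point is only that the handler is chosen from what the server actually sent back, not from what a descriptor promised.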

> 
> 
>> I guess it would help to create an example, huh? :)
> 
> I think I understand what you mean but examples are always very helpful.
> 
> 
> [...]
> 
>> What I was trying to get at with the profile link header is that if the
>> data model is expressed up front and there's sufficient documentation
>> about what the types mean, a client can create a number of handlers
>> that could react to different responses. If the model is good enough,
>> the client could react better to responses that they didn't expect at
>> build time.
> 
> That sounds like ApiDocumentation/supportedClasses comes close to what you
> had in mind. Doesn't it?

Yes. Rather than specifying return types, the client should have an idea of what data models the application uses. This is one thing that I think WADL got right with its grammars element[1]. The XForms model[2] also offers similar functionality. In both cases, the client knows up front all the types that the application might return in responses. So yes, I like this approach.
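
As a rough sketch of how a client might use that (Python, with a lot of assumptions): I'm pretending the ApiDocumentation and the response are already parsed into dicts, that the property shows up under the literal key "supportedClasses", and that type IRIs are written the same way in both documents. A real client would run both through a JSON-LD processor first. The function names are made up for illustration.

```python
from typing import Any, Dict, Set


def supported_class_ids(api_doc: Dict[str, Any]) -> Set[str]:
    """Collect the IRIs listed under the (assumed) "supportedClasses" key."""
    classes = api_doc.get("supportedClasses", [])
    if isinstance(classes, (str, dict)):
        classes = [classes]  # a single value instead of a list
    ids = set()
    for c in classes:
        if isinstance(c, str):
            ids.add(c)
        elif isinstance(c, dict) and "@id" in c:
            ids.add(c["@id"])
    return ids


def known_types(api_doc: Dict[str, Any], body: Dict[str, Any]) -> Set[str]:
    """Types in the response that the ApiDocumentation told us about."""
    types = body.get("@type", [])
    if isinstance(types, str):
        types = [types]
    return set(types) & supported_class_ids(api_doc)
```

If known_types() comes back empty, the client has received something the API never advertised and can fall back to generic handling instead of failing.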

> 
> 
>> Some developers will read documentation and create code generators that
>> the illiterate ones will use. It is here that I feel Hydra will get
>> into trouble.
> 
> Yeah.. as you know, that's one of my main concerns as well. In the end,
> however, I think the only way to avoid that is to implement a generic,
> dynamic client that's better than statically generated clients. I know, a
> lofty goal :-)
> 
> 
>>> I have troubles extracting something actionable from your mails.
>>> Would removing returns/statusCodes address your concerns?
>> 
>> Absolutely!
> 
> OK, that's a start. What about statusCodes at the ApiDocumentation level?
> Would you remove them as well?

I would, yes.

> 
> I don't know how much you are into Semantic Web stuff in general,

A fair amount. I did quite a bit in my time at Nokia, but it’s been a while so I’m a little rusty now.

> but how do
> you feel about rdfs:range in this context? In a sense it is very close to
> hydra:returns
> 
>  :discussesWith rdf:type hydra:Link ;
>                 rdfs:range schema:Person .
> 
>  </people/markus> rdf:type schema:Person ;
>                   :discussesWith </people/ryan> .

Oooh, interesting! 

My mental model starts to go into the whole identity vs. location debate and I now see where Kingsley was going by requesting Turtle :) I guess the thing that changes for me is that I’ve never considered rdfs:range in the context of HTTP responses. In the past, I’ve had to deal with URIs that were URNs, tel URIs, or other non-HTTP URIs and never considered how the protocol reacts to dereferencing these URIs. In the tel: case, I didn’t want to inadvertently call people :)

With JSON-LD, ids should be dereferenceable HTTP URIs. Now we’re not only dealing with what the information means, but also how it reacts to being dereferenced via the protocol that the URI specifies. I guess the biggest difference for me is that rdfs:range doesn’t assert that the relationship is conditional on a successful HTTP request. It qualifies the relationship, but it doesn’t make any assumptions about the HTTP request. The Hydra spec currently states that hydra:returns represents “The information returned by the Web API on success.” I guess that’s the fundamental difference.

> 
> 
> 
>>> If so, what does it really change?
>> 
>> It'll force developers to look at what's coming back in the response
>> headers rather than what's defined in the Hydra description. One is a
>> hint and the other is fact (i.e. what the server is sending back). By
>> removing returns and status, you are now forcing developers to look in
>> the right place: the HTTP response headers.
>> 
>> I have development teams messing things up on a fairly frequent basis
>> with WADL and Swagger (seeing a pattern here? :) ) due to the fact that
>> they are expecting the server to return exactly what is specified in
>> the descriptor and not taking into account they may have to deal with
>> both a forward and reverse proxy in the mix.
> 
> It's not that trivial to handle all possible responses properly.

No, but you can trap and handle the different ranges, and deal with the specific codes you care about differently from the ones you hadn’t considered yet. For example, handle 400, 401, and 404 individually and use a catch-all for every other 4xx response code. You can then specialize the handling as you see fit.
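
Something along these lines, as a minimal Python sketch. The handler names and the strings they return are purely illustrative; the only idea is "exact code first, then the status class, then a last-resort fallback".

```python
from typing import Callable, Dict

Handler = Callable[[int], str]


def make_dispatcher(specific: Dict[int, Handler],
                    by_class: Dict[int, Handler],
                    fallback: Handler) -> Handler:
    """Build a dispatcher: exact status code, then its class, then fallback."""
    def dispatch(status: int) -> str:
        if status in specific:
            return specific[status](status)
        status_class = status // 100  # 404 -> 4, 202 -> 2
        if status_class in by_class:
            return by_class[status_class](status)
        return fallback(status)
    return dispatch


dispatch = make_dispatcher(
    specific={
        400: lambda s: "fix-request",
        401: lambda s: "authenticate",
        404: lambda s: "not-found",
    },
    by_class={
        2: lambda s: "success",
        3: lambda s: "follow-redirect",
        4: lambda s: "client-error",  # catch-all for the rest of 4xx
        5: lambda s: "server-error",
    },
    fallback=lambda s: "unknown",
)
```

With this, a 402 or a 429 that nobody planned for still lands in the 4xx catch-all, and a 202 is treated as a success without anyone having to list it up front.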

> In a lot of cases developers simply need/want to get their job done and choose the simplest
> route. Surprisingly that works quite well in most cases (I would say more
> than the famous 80%). But yeah, I see what you are getting at. I also have
> to admit that apart from the natural-language documentation generation use
> case I don't see that much value in this information either given that we
> have supportedClasses on ApiDocumentation (which again, is just a hint)..
> there's also statusCodes there which I still find has some value but I would
> need to think more about that.
> 

I do like the idea of supportedClasses in the ApiDocumentation, just sans statusCodes and returns. 

> 
>> It could also be the wording. If this is just a hint, then perhaps
>> instead of hydra:returns, which sounds a bit more committed than say
>> perhaps something more like hydra:intimation or perhaps even
>> hydra:anticipatedResponseType?
> 
> Well, naming is one of those two difficult things in computer science :-) I
> don't think it would change much if we would change its name. Perhaps the
> stronger signal would be sent by moving this to a separate vocabulary which
> adds a couple of other things to facilitate the generation of
> natural-language documentations.

IANACS? Can that be a thing?

Anyway, I actually do think it could change things. With JSON, you’re typically not dealing with computer sciencey types from the get-go. JSON-LD has been a great tool to bridge the gap between those who are more comfortable working with structs and tree structures and those who grok the RDF stuff. A term like “returns” maps one-to-one to “returns” in most programming languages. Developers who skim the docs will look for familiar things and may make assumptions as to its function. A term like intimation, or even rdfs:range, likely wouldn’t suggest that this is a hard return type. Just a thought.

> 
> 
>> I'm still not sold on status codes. For the most part, everyone is
>> going to expect something in the 200 range. There's too many
>> exceptional codes to deal with in a format like Hydra to be practical.
> 
> Right, there are many. Maybe it's again about finding a compromise. I don't
> think we would lose much by expressing these things just at the API level
> instead of doing so at the operation level. Of course also this could be
> moved out of the core vocabulary into a "documentation" vocabulary.

That sounds reasonable. I mean, I get that in some places you want to be able to indicate that when you do “X”, you’re going to get a 202 rather than the usual 200, and that’s to be expected because this operation takes longer than you’re willing to wait. The way statusCodes is defined, though, it feels like it suggests that these are the only things you’d have to contend with.

> 
> 
>>> IMO it will be just a matter of time till someone else mints
>>> a URL for these things in order to, again, be able to transform a
>>> Hydra description into a nicely formatted HTML documentation.
>> 
>> Sure, and that's fine. AAA principle working at its finest! At least
>> no one can point to Hydra Core and blame it for suggesting fixed
>> response types :)
> 
> :-)
> 
> 
>> But seriously, I have to find some time to demo these ideas to better
>> illustrate what I'm talking about. That might go a ways in clarifying
>> these points.
> 
> I feel we already made quite some progress on these recent discussions and
> have some concrete options to evaluate:
> 
>  - remove "returns" from "Operation"?
>  - remove "statusCodes" from "Operation"?
>  - remove "statusCodes" from "ApiDocumentation"?
>  - move all of them to a separate "documentation generation" vocabulary?
> 

That’d work. 

> We should also discuss whether "supportedClasses" on "ApiDocumentation" is
> enough or perhaps too much :-)

I do like that quite a bit actually and I don’t think that’s too much. 

Also, we should capture the use case you mentioned earlier on somewhere on the issues list.

> 
> 
> --
> Markus Lanthaler
> @markuslanthaler
> 
> 

[1]: http://www.w3.org/Submission/wadl/#x3-90002.4
[2]: http://www.w3.org/TR/2009/REC-xforms-20091020/

+-----------------------------------------------+
    Ryan J. McDonough (a.k.a John Yaya)
    http://damnhandy.com
    http://twitter.com/damnhandy

Received on Wednesday, 5 February 2014 03:13:42 UTC