RE: More Thoughts on Links and Operation Subclasses from Markus Lanthaler on 2014-01-28 (public-hydra@w3.org from January 2014)

From: Markus Lanthaler <markus.lanthaler@gmx.net>
Date: Tue, 28 Jan 2014 13:45:22 +0100
To: <public-hydra@w3.org>
Cc: "'Ryan McDonough'" <ryan@damnhandy.com>, "'Thomas Hoppe'" <thomas.hoppe@n-fuse.de>
Message-ID: <018c01cf1c26$d1396bd0$73ac4370$@lanthaler@gmx.net>
On Monday, January 27, 2014 4:19 PM, Ryan McDonough wrote:
> On Mon, Jan 27, 2014 at 2:51 AM, Thomas Hoppe wrote:
> > On 01/25/2014 04:58 PM, Ryan McDonough wrote:
> >>
> >> I’ve been trying to really grok what value Operation subclasses really
> >> have. Stock Hydra has some base subclasses such as ReplaceResourceOperation,
> >> DeleteResourceOperation, and CreateResourceOperation. Take
> >> DeleteResourceOperation for example; there’s really no difference between
> >> this:
> >>
> >> {
> >>   "@context": "http://www.w3.org/ns/hydra/context.jsonld",
> >>   "@id": "/an-issue",
> >>   "title": "An exemplary issue representation",
> >>   "operations": [
> >>     {
> >>       "@type": "DeleteResourceOperation",
> >>       "method": "DELETE"
> >>     }
> >>   ]
> >> }
> >>
> >> And this:
> >>
> >> {
> >>   "@context": "http://www.w3.org/ns/hydra/context.jsonld",
> >>   "@id": "/an-issue",
> >>   "title": "An exemplary issue representation",
> >>   "operations": [
> >>     {
> >>       "@type": "Operation",
> >>       "method": "DELETE"
> >>     }
> >>   ]
> >> }

Yeah, DELETE is very specific, so it might look as if a dedicated operation type adds no value. There are, however, APIs that archive resources when they are deleted. In such a case, I think it does make sense to qualify a DELETE with an ArchiveOperation or something similar. Basically, the operation's type tells the client what consequences to expect if it invokes the operation. The HTTP methods are too generic for that IMO. Yeah, they tell you whether a request is safe, idempotent, etc., but that's it.

Another example. Let's say we have something like this:

{
  "@context": "...",
  "@id": "/lkdjf934897cd9873kl",
  "operations": [
    {
      "@type": "Operation",
      "method": "POST",
      "expects": "#Product"
    }
  ]
}

What does this operation do? Is it creating a Product? Does it add the product to a shopping cart? Does it share the product on Facebook? In Hydra these semantics are conveyed via the operation's type. It's not Hydra's job to define them though, and I think it is becoming ever clearer that adding CreateResourceOperation etc. wasn't such a good idea.
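
A minimal sketch in Python of how a client might use the operation's type to tell otherwise identical POSTs apart. The find_operation helper is hypothetical, and typing the operations with schema.org's AddAction and ShareAction is an assumption made for illustration, not something Hydra prescribes:

```python
def find_operation(resource, wanted_type):
    """Return the first operation typed as wanted_type, or None."""
    for operation in resource.get("operations", []):
        types = operation.get("@type", [])
        if isinstance(types, str):  # JSON-LD allows a single string or a list
            types = [types]
        if wanted_type in types:
            return operation
    return None


# Two POST operations that differ only in their (assumed) schema.org types.
product = {
    "@id": "/lkdjf934897cd9873kl",
    "operations": [
        {"@type": "http://schema.org/AddAction", "method": "POST", "expects": "#Product"},
        {"@type": "http://schema.org/ShareAction", "method": "POST", "expects": "#Product"},
    ],
}

share = find_operation(product, "http://schema.org/ShareAction")
print(share["method"] if share else "no such operation")
```

With a plain "Operation" type on both entries, the client would have no basis for choosing between them.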

Have a look at http://schema.org/docs/full.html (there's a whole Action tree). That's how I intended operations to be used. It was great to see that Schema.org adopted the same idea, and we should probably mention that in the spec. I designed Hydra before Google/Schema.org announced their work; that's the reason why it isn't mentioned anywhere yet.


> > I have an application for this that I already posted to the list:
> > Think about an API where some operations are subject to authorization
> > restrictions.
> > Such restrictions need to be managed, for example with a role based
> > access control system.
> > To express policies such as "Owners of Role A can carry out the
> > operation CreateResourceOperation", you would need exactly such
> > operations.
> 
> Quite honestly, I wouldn't do that here. I would much rather make that
> determination before I send back the JSON-LD document to the client.

I agree with Ryan on this, but that doesn't mean that you can't define it internally the way Thomas described it. In fact, I think in most cases it will be done more or less like that anyway.


> If I have a create operation that a client doesn't have rights to
> perform, they simply don't see it. Much like the way I won't present
> an HTML form to a consumer that is not entitled to perform a
> particular action. I'd evaluate rights higher up in the stack.

Exactly, the nice thing though is that you can still describe it using Hydra concepts which allows you to generalize your "stack".


> Thinking about this more, Hydra isn't really taking a "follow your
> nose" approach. It seems that Hydra is defining all possible options
> ahead of time in the context, thus giving little opportunity to make
> security decisions at runtime.

That depends entirely on how you use it. You can specify all operations inline using the "operations" property. Alternatively, you can bind them to properties or classes in an ApiDocumentation. It's up to you. The Hydra demo on my homepage [1] uses the latter style whereas Sam's demo [2] uses the former. Obviously you can also combine the two.
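
A minimal sketch in Python of how a client could combine the two styles. The operations_for helper is hypothetical, and the "supportedClasses" and "supportedOperations" keys are assumed here as the way an ApiDocumentation binds operations to classes:

```python
def as_list(value):
    """Normalize a JSON-LD value that may be absent, a single item, or a list."""
    if value is None:
        return []
    return value if isinstance(value, list) else [value]


def operations_for(resource, api_doc):
    """Collect inline operations plus those bound to the resource's classes."""
    found = list(as_list(resource.get("operations")))
    resource_types = as_list(resource.get("@type"))
    for cls in as_list(api_doc.get("supportedClasses")):
        if cls.get("@id") in resource_types:
            found.extend(as_list(cls.get("supportedOperations")))
    return found


# Assumed ApiDocumentation: PUT is bound to the (hypothetical) #Issue class.
api_doc = {
    "@type": "ApiDocumentation",
    "supportedClasses": [
        {"@id": "#Issue", "supportedOperations": [{"@type": "Operation", "method": "PUT"}]}
    ],
}
# The resource itself advertises DELETE inline.
issue = {
    "@id": "/an-issue",
    "@type": "#Issue",
    "operations": [{"@type": "Operation", "method": "DELETE"}],
}

methods = [op["method"] for op in operations_for(issue, api_doc)]
print(methods)
```

Since the inline list is produced per response, a server can still decide at runtime which operations a given client gets to see.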


> > I hear you say that you could also specify a HTTP method but this would
> > result in a two-dimensional system (Operation + HTTP Method) if you still
> > want to use operations.
> > So I think in some cases operations are required but in your simple example
> > you could model everything with "@type": "Operation".
> > Due to the fact that APIs have diverse semantics of what HTTP operations
> > really do (as you also exemplify below), I think there are three ways
> > to go forward with operations:
> >
> > 1.) Create an IANA Registry with some "standard" operations and
> > clearly defined semantics
> 
> Perhaps not IANA, but having a somewhat robust set of initial operations
> in Hydra core would fit the bill.

I would rather not include them in Hydra core but leverage schema.org's work:
   http://schema.org/docs/full.html


> > 2.) Let each API provider create its own operations, also for very
> > simple ones or

You can't prevent that anyway..


> > hope for other vocabs to come up with a set of interoperable
> > operations

Luckily that happened already :-)


> > In any case I think the existing operations must be removed from
> > hydra.

I think I agree. They cause way more confusion than they help. This is being discussed as part of ISSUE-5 [3] by the way.


> >> In either case, the client is still going to issue an HTTP DELETE
> >> request. The function of the Hydra operation is implied by the
> >> HTTP method that is specified.

Is it really? I don't think the HTTP methods' definitions are explicit enough.


> >> CreateResourceOperation is also confusing. A new resource could
> >> be created via PUT or POST. The intent of POST is not always create.

Right, but we say nowhere that CreateResourceOperation can't be used with a PUT. Do we?


> >> In some of my applications, I’m using POST in a fire and forget
> >> model and using a 202 (Accepted) response to indicate success. The
> >> resource that is receiving events may affect another resource rather
> >> than modifying the requested resource via CRUD actions. Yes,
> >> I subclass Operation and create my own “FireAndForgetOperation”
> >> class. However, the client doesn’t need to know anything about
> >> how my server is going to process the message.

How does it then decide which operation to invoke?


> >> The POST request may result in creating a new resource, updating
> >> several others, or it may even take a few days to process. The 
> >> client only needs to know how to format the request (i.e. hydra:expects
> >> and hydra:supportedProperties) and how to send the message (hydra:method).
> >> With that said, I’m trying to determine if there’s value in having
> >> types at that point? There's not much to be gained by everyone inventing
> >> their own Operation types.

That's true. If everyone invents her own operation types they are useless. If, however, people stick to, e.g., schema.org's actions you can gain quite a lot IMO.


> >> I've said this before, but the Operation's type feels a lot like
> >> Link relations rel attribute. In Hydra, we seem to be using @type
> >> to indicate the function of the Operation. Arguably, Link relations
> >> do the same thing with the rel attribute with the addition of a set
> >> of standard, relatively well-understood relation types.

Operations describe the consequences of sending a specific HTTP request to a resource. They describe the behavioral model (interaction model) of a resource. Link relations, on the other hand, describe relationships between two resources. Don't be misled by AtomPub's "edit" link relation. IMO that's an anti-pattern. It should rather have been called something like "about" or "source".


> >> Assuming that the intent of Operation subclasses is similar to Link
> >> relation types, I can see Hydra both creating some standard Operation
> >> types for common types and potentially incorporating
> >> some IANA Link relation types. But this is where things get funky:
> >> practically all Link relation types are intended for GET requests.

Well.. they are not intended for GET requests but since they just describe the relationship between two resources you just follow them, which is, yeah, a GET.


> >> JSON Schema is using Link relations on non-idempotent links, but there’s

Do you mean unsafe operations?


> >> really not any link relation that I’m aware of that functions with
> >> anything other than an HTTP GET.

The AtomPub protocol describes how it reacts to PUTs/DELETEs to the target of an "edit" link.


> >> And this where things get a little more awkward for me. If we’re
> >> talking about a search use case, we could potentially express it
> >> with an IriTemplate:
> >>
> >> {
> >>   "@context": "http://www.w3.org/ns/hydra/context.jsonld",
> >>   "@type": "IriTemplate",
> >>   "template": "http://api.example.com/issues{?query}",
> >>   "mappings": [
> >>     {
> >>       "@type": "IriTemplateMapping",
> >>       "variable": "query",
> >>       "property": "#SearchCriteria",
> >>       "required": true
> >>     }
> >>   ]
> >> }
> >>
> >> But we might also be able to express it as:
> >>
> >> {
> >>   "@context": "http://www.w3.org/ns/hydra/context.jsonld",
> >>   "@id": "/issues",
> >>   "title": "Reads a Resource",
> >>   "operations": [
> >>     {
> >>       "@type": "ReadResourceOperation",
> >>       "method": "GET",
> >>       "expects" : "#SearchCriteria"
> >>     }
> >>   ]
> >> }
> >>
> >> Ignore the fact for a moment that unlike JSON Schema’s Hyper Schema
> >> or Collection+JSON, Hydra does not specify that we should use the
> >> mappings as URL query parameters rather than as a new JSON-LD
> >> structure.

You can associate an operation to a templated link. That way you can specify which parameters are to be used as query parameters and which go into the body (if there's one).


> >> Something like a search is typically done using GET and could be
> >> expressed via an IriTemplate or Operation. Right now, a Link or
> >> TemplatedLink feels like an Operation with a method property of
> >> GET, yet Hydra is treating read controls very differently than
> >> other types of links or operations.

Yeah, GETs are safe requests. Thus they don't have any side effects (or at least shouldn't have any). In most cases you thus just navigate the API, i.e., you follow links. There's no need to define an operation for that. But you can of course do so. Nothing stops you from defining an Operation whose method is set to GET.


> >> Lastly, I think we also need to add two new properties: one to
> >> indicate the media type that should be sent to the server and
> >> another that hints at what might be returned from the request.
> >> For example:
> >>
> >> {
> >>   "@context": "http://www.w3.org/ns/hydra/context.jsonld",
> >>   "@id": "/an-issue",
> >>   "title": "An exemplary issue representation",
> >>   "operations": [
> >>     {
> >>       "@type": "Operation",
> >>       "method": "POST",
> >>       "enctype" : "application/x-www-form-urlencoded",
> >>       "mediaType" : "application/pdf"
> >>       ...
> >>     }
> >>   ]
> >> }
> >>
> >> The enctype informs the client as to how to format the request
> >> message while mediaType indicates the format that the server may
> >> return. If not specified, the value should be application/ld+json.

How would the client construct that urlencoded string? If you set enctype to an RDF-compatible type then everything works fine; if you don't, the client won't know how to construct the payload. That's the reason why I left it out. It's tracked as ISSUE-22 [4] though.

If you want to define an operation which takes a binary input (e.g. an image) my current thinking is that defining a special class for that would allow you to do so. Something like "expects": "JpegFile". Or perhaps something like "Blob" which you describe further by a media type:

  "expects": {
    "subClassOf": "Blob",
    "mediaType": "image/*"
  }
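
A minimal sketch in Python of how a client might check a file against such a description before sending it. "Blob" is the hypothetical class from the snippet above, and the accepts and media_type_matches helpers are assumptions made for illustration:

```python
def media_type_matches(pattern, media_type):
    """Match a media type against a pattern such as "image/*" or "*/*"."""
    # Drop parameters such as "; charset=utf-8" before comparing.
    essence = media_type.split(";", 1)[0].strip().lower()
    p_type, _, p_sub = pattern.strip().lower().partition("/")
    m_type, _, m_sub = essence.partition("/")
    return p_type in ("*", m_type) and p_sub in ("*", m_sub)


def accepts(expects, media_type):
    """True if the description is a Blob subclass covering media_type."""
    return (
        expects.get("subClassOf") == "Blob"
        and media_type_matches(expects.get("mediaType", "*/*"), media_type)
    )


expects = {"subClassOf": "Blob", "mediaType": "image/*"}

print(accepts(expects, "image/jpeg"))
print(accepts(expects, "application/pdf"))
```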


> >> The mediaType property is more coarse grained than the current
> >> hydra:returns property, in that we’re asserting that the response
> >> mediaType might be something other than JSON-LD.
> >
> > Theoretically these bits of information could be introspected via
> > HTTP requests (HEAD...) or on a higher level simply from the types
> > (classes) but I also think that this might be valuable in some scenarios,
> > e. g. if latency must be reduced.
> 
> For the media type values, yeah they could be gleaned from the
> headers. For properties like "enctype", this would need to be done in
> the descriptor. The enctype value would trigger how the request body
> needs to be formatted before it is sent to the server. Thus, headers
> aren't much help at that point. Think of a photo upload service: I
> need to tell the client I can have it upload multiple photos as long as
> they send them using the multipart/mixed media type. That has to be said
> before the client produces the message.
> 
> > JSON Hyper Schema has essentially the same facility to express
> > expected and returned media types but I think they define no
> > default. In any case, this should be optional.
> 
> Totally cribbing this idea from JSON Schema as it's very handy. And
> yes, it's optional and used only in cases where your input format is
> not going to be JSON-LD.

Yeah, we definitely need to do something about this. I'm not sure though how to address it best. What do you think about the idea I talked about above?


> >> Without these properties, it’s very difficult to work with existing
> >> content on the web, whether it be images, PDFs, or other formats
> >> such as Atom, HAL+JSON, etc. and even plain old JSON or XML. JSON-LD
> >> is awesome, but it’s not the only format on the web. While yes, some
> >> of these formats may not fit the Hydra/JSON-LD model, but these formats
> >> are in use today.

Fully agreed. The question is what level of support we need. Specifying an operation which expects an image is one thing; describing Atom, HAL, or XML payloads is another.


> >> For example, if my App is looking to see what the
> >> last 5 commits were on a GitHub project, I need to be able to format
> >> a GET request with the required URL parameters.

AFAICT, the only thing you would need to do is to have some vocabulary defining the properties you use as URL parameters. Hydra then allows you to use them to describe an IRI template.
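
A minimal sketch in Python of what a client could do with such a description. The expand_template helper is hypothetical and only handles the form-style query expression {?a,b} of RFC 6570; the template and the "#SearchCriteria" property are taken from the IriTemplate example quoted earlier:

```python
import re
from urllib.parse import quote


def expand_template(template_doc, values):
    """Expand {?a,b} expressions using values keyed by property IRI."""
    by_variable = {}
    for mapping in template_doc["mappings"]:
        prop = mapping["property"]
        if prop in values:
            by_variable[mapping["variable"]] = values[prop]
        elif mapping.get("required"):
            raise ValueError("missing required property: " + prop)

    def replace(match):
        # Keep only the variables we have a value for, percent-encoding each.
        names = match.group(1).split(",")
        pairs = [
            name + "=" + quote(str(by_variable[name]), safe="")
            for name in names
            if name in by_variable
        ]
        return "?" + "&".join(pairs) if pairs else ""

    return re.sub(r"\{\?([^}]+)\}", replace, template_doc["template"])


search = {
    "@type": "IriTemplate",
    "template": "http://api.example.com/issues{?query}",
    "mappings": [
        {"variable": "query", "property": "#SearchCriteria", "required": True}
    ],
}

url = expand_template(search, {"#SearchCriteria": "open bugs"})
print(url)
```

The client only needs to know the property it wants to supply; the mapping tells it which URL variable that property ends up in.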


> >> I’m also going to be getting back JSON, not JSON-LD.

The simplest thing would be to just omit "returns".


> >> Now I’m not suggesting that Hydra go
> >> all out and support translating JSON data into JSON-LD. No, that’s up
> >> to the client application. However, Hydra should be
> >> able to properly instruct a client to properly format a
> >> application/x-www-form-urlencoded request and

That may not be as trivial as it sounds. Of course, you can get pretty far by defining, let's say, a UrlEncodedPayload class and creating ad-hoc subclasses for each form by using supportedProperties and extending it with "key" or something like that:

  "expects": {
    "subClassOf": "UrlEncodedPayload",
    "supportedProperties": [{
      "property": "http://schema.org/name",
      "key": "full-name",
      "required": true
    }]
  }

Which would result in something like full-name=Markus+Lanthaler
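
A minimal sketch in Python of how a client could build that string from such a description. UrlEncodedPayload and "key" are the hypothetical constructs from the snippet above, and the encode_payload helper is likewise just an assumption for illustration:

```python
from urllib.parse import urlencode


def encode_payload(expects, data):
    """Serialize data (keyed by property IRI) using the form keys in expects."""
    pairs = []
    for supported in expects["supportedProperties"]:
        prop = supported["property"]
        if prop in data:
            pairs.append((supported["key"], data[prop]))
        elif supported.get("required"):
            raise ValueError("missing required property: " + prop)
    return urlencode(pairs)


expects = {
    "subClassOf": "UrlEncodedPayload",
    "supportedProperties": [
        {"property": "http://schema.org/name", "key": "full-name", "required": True}
    ],
}

body = encode_payload(expects, {"http://schema.org/name": "Markus Lanthaler"})
print(body)
```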


> >> indicate that the response may not be a JSON-LD format.

Just don't describe the response. The client will be able to figure out by itself that the server didn't return a JSON-LD document.
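
A minimal sketch in Python of that check, assuming the client has the response's Content-Type header value at hand. The is_json_ld helper is hypothetical:

```python
def is_json_ld(content_type):
    """True if a Content-Type header value denotes application/ld+json."""
    # Ignore parameters such as "; charset=utf-8" and letter case.
    essence = (content_type or "").split(";", 1)[0].strip().lower()
    return essence == "application/ld+json"


print(is_json_ld("application/ld+json; charset=utf-8"))
print(is_json_ld("application/json"))
```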


> >> This is also where I really start to get hanky about hydra:returns and
> >> hydra:statusCodes. Clients really need to get into the habit of
> >> reacting to responses and sensing content via Content-Type and
> >> Link headers rather than having a preconceived expectation at build
> >> time about what they might get back.

Fully agreed


> >> Perhaps abandoning the properties isn’t the right solution and
> >> perhaps it could be resolved through a Hydra client API. As it stands,
> >> these Hydra properties tee up client generators that would be a
> >> lot like what tools like WADL, Swagger, etc. do now.

Yeah, of course you can misuse them. Yet I think they have value, especially when you need to decide which operation to use when there are a number of candidates.


> >> We need to make sure that Hydra can accommodate existing media types
> >> and even link relations to a certain degree. These formats and
> >> conventions exist now and applications and APIs are using them today.
> >> I’d really like to see Hydra accommodate them in some capacity.

Right. What do you think of my proposal above? The reason I prefer it is that it keeps the model simple in most cases and avoids inconsistencies/ambiguities that may arise if you specify both enctype and returns. I don't have a strong opinion about it yet, so I may easily be convinced otherwise :-)


Cheers,
Markus


[1] http://m.lanthi.com/hydra-api-demo
[2] http://code.sgo.to/crawler/yaap.html#url=http://code.sgo.to
[3] https://github.com/HydraCG/Specifications/issues/5
[4] https://github.com/HydraCG/Specifications/issues/22


--
Markus Lanthaler
@markuslanthaler
Received on Tuesday, 28 January 2014 12:45:56 UTC