RE: Schema.org Actions - an update and call for review from Markus Lanthaler on 2013-11-13 (public-vocabs@w3.org from November 2013)

From: Markus Lanthaler <markus.lanthaler@gmx.net>
Date: Wed, 13 Nov 2013 19:27:00 +0100
To: <public-hydra@w3.org>, "'W3C Web Schemas Task Force'" <public-vocabs@w3.org>
Cc: "'Thad Guidry'" <thadguidry@gmail.com>, "'Sam Goto'" <goto@google.com>
Message-ID: <005601cee09d$f24f1780$d6ed4680$@lanthaler@gmx.net>
Sam, Thad,

I'm going to respond to both of you in a single mail by folding your mails
together. See below.


On Tue, Nov 12, 2013 at 4:12 PM, Sam Goto wrote:
> On Mon, Oct 21, 2013 at 2:35 AM, Markus Lanthaler <markus.lanthaler@gmx.net> wrote:
> > On Saturday, October 19, 2013 3:19 AM, Sam Goto wrote:
> > > Although this is in the right direction, we found it to be slightly
> > > trickier than that. One example is that Netflix and Hulu may have
> > > StreamableMovie on their inventories, *but* Netflix/Hulu may only play
> > > the StreamableMovies on *their* inventories.
> > >
> > > So, you need to get deeper into the constraint and add that Netflix can
> > > play movies that (a) need to be a StreamableMovie and (b) need to have
> > > url = "netflix.com", for instance.
> > 
> > So basically you are saying you get the information about the movie
> > somewhere else, right? In my description I was assuming you are browsing
> > through Netflix' API and get the StreamableMovies there. Netflix obviously
> > know whether their movie is streamable or not. I think we are talking
> > about slightly different use cases or at least solutions.
> 
> Possibly. Most service providers expose their inventory via schema.org
> markup on their webpages, here is one example:
> 
> http://movies.netflix.com/Movie/The-Pursuit-of-Happyness/70044605

Yes, but that's still the same API, i.e., Netflix'. What's not clear to me
at the moment is whether you are interested in cross-service relationships
like finding a movie on Hulu and playing it on Netflix. The latest draft you
published a couple of days ago suggests you aren't. So why would the approach
I proposed not work in this case?


> One way or another, I'd like to create a bidirectional bound between these
> Movie instances and the Netflix service of streaming movies.

So you want to have a "PlayAction" which somehow links to all movies it can
play? Why do you need that? Wouldn't crawling all movies and creating a
reverse index achieve the same with much less complexity?


> The instance to service direction is trivial. Like you said, we can add an
> "operation" property to instances that point to the services. This
> direction solves a wide variety of problems and that's great.

Right


> The service to instances direction is a bit more complicated, but still
> useful. That is: how do you make a service instance point to a set of
> movie instances (i.e. movie instances that the service can stream)?

Why do you need that in the first place? In Hydra, we describe the "entry
point" of an API (hydra:entrypoint) from which you should be able to reach
every resource - similarly to a homepage from which you should be able to
reach any page. So you simply start there and follow your nose to the things
that are of interest to you.
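
The "start at the entry point and follow your nose" approach can be sketched
as a plain breadth-first walk. This is only an illustration: the in-memory
API, its paths and the "links" key are hypothetical stand-ins for real
responses and their hypermedia controls.

```python
from collections import deque


def reachable_resources(entry_point, fetch):
    """Walk the links breadth-first, starting at the API's entry point."""
    seen = {entry_point}
    queue = deque([entry_point])
    order = []
    while queue:
        url = queue.popleft()
        order.append(url)
        for link in fetch(url).get("links", []):
            # Skip resources we already queued so that cycles terminate.
            if link not in seen:
                seen.add(link)
                queue.append(link)
    return order


# Hypothetical API: each resource only lists the resources it links to.
API = {
    "/": {"links": ["/movies", "/account"]},
    "/movies": {"links": ["/movies/1", "/movies/2"]},
    "/account": {"links": ["/"]},
    "/movies/1": {"links": []},
    "/movies/2": {"links": ["/movies"]},
}

VISITED = reachable_resources("/", API.__getitem__)
```

Here `fetch` is passed in as a function, so the same walk would work against
an HTTP client that dereferences each URL instead of a dictionary lookup.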


> > > One direction we are exploring (which adds a *lot* of complexity), is
> > > having some sort of constraint/restriction Type that could express these
> > > constraints. Here is one example of such a thing.
> >
> > Hmmm... this may work very well within a *single* API but I can't see how
> > such an approach should work across the Web.
> 
> In my experience, the most common constraint that appears is to be able to
> delimit an inventory. That is: a "Movie" is too broad, sometimes you need
> to define "which specific Movies" you can act on (substitute "Movies" with
> "Songs you can listen to on my service", "Products you can buy on my
> grocery store", etc).

Why not use specific subclasses for these things? That's effectively what I
meant when I proposed to use StreamableMovie. I think the problem here stems
partly from the fact that schema.org tries very hard to cover everything
itself instead of relying on Linked Data and multiple vocabularies. It would
be trivial enough for Netflix to create a vocabulary which defines a concept
for *their* streamable movies.
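
As a rough sketch of how a client could use such a subclass, assume a
provider-specific vocabulary declares its streamable-movie class as a
subclass of schema.org's Movie. The netflix.example IRI is hypothetical, and
the single-parent lookup table is a simplification of rdfs:subClassOf.

```python
# Hypothetical subclass table: each class IRI maps to one parent class IRI.
SUBCLASS_OF = {
    "http://netflix.example/vocab#StreamableMovie": "http://schema.org/Movie",
    "http://schema.org/Movie": "http://schema.org/CreativeWork",
}


def is_a(type_iri, ancestor_iri):
    """Return True if type_iri equals ancestor_iri or descends from it."""
    current = type_iri
    while current is not None:
        if current == ancestor_iri:
            return True
        # Climb one level; unknown classes end the walk with None.
        current = SUBCLASS_OF.get(current)
    return False


STREAMABLE = "http://netflix.example/vocab#StreamableMovie"
```

A generic client that only understands http://schema.org/Movie can still
treat instances of the specialized class as movies, while a client that knows
the provider's vocabulary can tell which ones that provider streams.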


> > > > If the operations/actions that the various resources of a Web API
> > > > offer vary widely, it often makes more sense to attach the operations
> > > > directly to the instance instead of binding it to a class. Hydra
> > > > supports that via its "operations" property:
[...]
> > > Yep, that's another direction we are exploring. We are exploring adding
> > > one property to http://schema.org/Thing called potentialAction (or
> > > operation, exact name TBD) that does the mapping to the action that can
> > > be taken on a specific Thing instance.
> > >
> > > Here is where we explored this idea.
> > >
> > > Basically, instead of expressing the map of action -> entities, we
> > > instead ask publishers/developers to expose the reverse map of entity ->
> > > actions and we then crawl and build the reverse index.
> > 
> > This sounds as if you would like to build something similar to actions in
> > GMail, is that correct? Or why else do you suggest to "prefer" actions
> > over entities?
> 
> Apologies if I'm not articulating myself clearly, some of this isn't
> entirely well formed yet :) Let me try to be more concrete using an
> example:
> 
> Take for example the following entities:
> 
> (1) Hertz is an AutoRental, a business that has a service which is to rent
> RentalCars
> 
> (2) Honda Civic is a RentalCar that is rented by an AutoRental, Hertz.
> 
> This specific Honda Civic (a) is on Hertz's inventory (b).
> 
> (a) is what I'm referring to as the entity -> action problem, which is
> simple.
> (b) is what I'm referring to as the service -> entities problem, which
> depends on modelling the service's inventory.

With clarifications from Thad...

On Tuesday, November 12, 2013 11:45 PM, Thad Guidry wrote:
> And where Sam means :
>
> (a) the specific Honda Civic within Hertz's rental inventory that would be
> directly associated to a VIN number ( a unique identifier for automobiles
> in the last 50 years )
> http://en.wikipedia.org/wiki/Vehicle_Identification_Number
>
> (a) can only be rented out to a single person during a given
> timeframe...blah blah, and a host of other rules and unique actions at a
> specific instance level such as this.
>
> Honda Civic being a product model with many (a) instances around the
> world, including all the rental car companies.

and back to Sam...

On Tue, Nov 12, 2013 at 4:12 PM, Sam Goto wrote:
> If you extend (b) to a real life example, you'll get to the fact that
> Hertz can actually rent *a lot* of RentalCars but not *all* RentalCars that
> can ever exist (e.g. it cannot rent a RentalCar that is in Budget's
> inventory).

Right, but you wouldn't find such cars in Hertz' inventory.


> The problem is: how do you scale (b) to represent a large (and possibly
> dynamic) set of RentalCars?

I'm not sure I understand this question. Why not do the same thing that
websites do all the time? Just expose a collection (hydra:Collection) that
references all cars in the inventory. That collection doesn't have to be
static, it can of course be dynamic. It's very similar to my issue tracker
demo [1] where you can browse a collection of issues and then comment on
each of them. Of course that collection isn't static but changes as soon
as you add or delete an issue.
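
A minimal sketch of such a dynamic collection, assuming the inventory is a
plain list of car identifiers and the document is rebuilt on every request.
The /cars paths and the function name are hypothetical, and "hydra:member" is
the member property of the current Hydra Core vocabulary, which may differ
from the draft under discussion here.

```python
import json

HYDRA = "http://www.w3.org/ns/hydra/core#"


def inventory_collection(collection_id, car_ids):
    """Build a JSON-LD hydra:Collection that references every car."""
    return {
        "@context": {"hydra": HYDRA},
        "@id": collection_id,
        "@type": "hydra:Collection",
        # Members are references only; clients dereference the ones they need.
        "hydra:member": [{"@id": car_id} for car_id in car_ids],
    }


# Hypothetical inventory that changes between two requests.
inventory = ["/cars/1234", "/cars/5678"]
before = inventory_collection("/cars", inventory)

inventory.append("/cars/9012")
inventory.remove("/cars/1234")
after = inventory_collection("/cars", inventory)

serialized = json.dumps(after, indent=2)
```

Because the collection is computed from the current inventory, it never has
to enumerate cars the service cannot rent, and it stays correct as cars are
added or removed.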

 
> (a)
> 
> {
>   "@context": "http://schema.org",
>   "@type": "RentalCar",
>   "@id": "1234",
>   "model": "Honda Civic",
>   "operation": {
>     "@type": "RentAction",
>     "actionHandler": {
>       "@type": "ActionHandler",
>       "url": "http://www.hertz.com/rent"
>     }
>   }
> }
> 
> (b)
> 
> {
>   "@context": "http://schema.org",
>   "@type": "AutoRental",
>   "name": "Hertz",
>   "operation": {
>     "@type": "RentAction",
>     "actionHandler": {
>       "@type": "ActionHandler",
>       "url": "http://www.hertz.com/rent",
>       "object": {
>         "@type": "RentalCar",
>         "@id": "1234"
>       }
>     }
>   }
> }

So your problem is that you would have thousands of actions in this case,
right?


> > > Here is an example:
> > >
> > > <script type="application/ld+json">
> > > {
> > >   "@context": "http://schema.org",
> > >   "@type": "Movie",
> > >   "url": "http://movies.netflix.com/WiMovie/Like_Crazy/70167118",
> > >   "operation": {
> > >     "@type": "WatchAction",
> > >     "status": "proposed",
> > >     "handler": {
> > >       "@type": "WebPageHandler",
> > >       "url": "http://movies.netflix.com/WiPlayer?movieid=70167118&trkid=1464504&t=Like+Crazy",
> > >       "method": "GET"
> > >     }
> > >   }
> > > }
> > > </script>
> > 
> > Wouldn't this specific example be much simpler if you would introduce a
> > specialized property as I suggested in my last mail [1]? Something like:
> > 
> > {
> >   "@context": "http://schema.org",
> > 
> >   "@id": "http://movies.example.com/Like_Crazy",
> >   "@type": "Movie",
> >   "stream": [
> >     "http://movies.netflix.com/WiPlayer?movieid=70167118&trkid=1464504",
> >     "http://hulu.com/89089409840650440"
> >   ]
> > }
> > 
> > It would of course also be possible to use objects instead of just URLs
> > in "stream" to convey more information (e.g. price, provider etc.):
> > 
> >   {
> >     "@id": "http://hulu.com/89089409840650440",
> >     "provider": "http://hulu.com",
> >     "subscriptionRequired": true,
> >     ...
> >   }
> 
> Right. "provider" would certainly satisfy my criteria.
> 
> The modelling problem that we are facing is: how do you express "all
> movies whose 'provider'=hulu"? That is, how do you programmatically bind
> "hulu as a service" to "all movies where hulu is set as a provider"?

The provider in the example above is set on Hulu's streaming URL and not on
the movie on example.com (which might be imdb.com). So you would have a
movie at imdb.com and list URLs where it can be streamed, in this case at
Hulu and Netflix.
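
Under that model, "all movies whose provider is Hulu" becomes a query over
crawled movie descriptions rather than something the service has to publish.
A small sketch, assuming "stream" holds either bare URLs or objects with a
"provider" as in the example quoted above; the second movie and the function
name are hypothetical.

```python
def movies_streamable_by(movies, provider):
    """Return ids of movies with at least one stream from the provider."""
    result = []
    for movie in movies:
        for stream in movie.get("stream", []):
            # Bare URL strings carry no provider, so only objects can match.
            if isinstance(stream, dict) and stream.get("provider") == provider:
                result.append(movie["@id"])
                break
    return result


# Hypothetical crawl result using the "stream" shape sketched above.
MOVIES = [
    {
        "@id": "http://movies.example.com/Like_Crazy",
        "@type": "Movie",
        "stream": [
            "http://movies.netflix.com/WiPlayer?movieid=70167118&trkid=1464504",
            {
                "@id": "http://hulu.com/89089409840650440",
                "provider": "http://hulu.com",
            },
        ],
    },
    {
        "@id": "http://movies.example.com/Another_Movie",
        "@type": "Movie",
        "stream": [
            {
                "@id": "http://stream.example.org/42",
                "provider": "http://stream.example.org",
            }
        ],
    },
]

HULU_MOVIES = movies_streamable_by(MOVIES, "http://hulu.com")
```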

In my world view there is no distinction between resources and "services".
You realize your use case by manipulating resources and that builds the
"service". I believe you are looking at the problem from an RPC (remote
procedure call) point of view which isn't how the Web works at all.


> > > I think you got the general idea of the options we are exploring.
> > >
> > > I think the entity -> actions mapping is fairly clear and solves a huge
> > > number of problems.
> > 
> > Agreed
> > 
> > > We still think we need the action -> entities mapping too, we'd love any
> > > input you may have in that area. We explored things like resource
> > > shapes, sparql ask queries and prototype languages.
> > 
> > Hmm... since you were speaking about crawling and creating inverse
> > indexes before, why do you think you need it?
[...]
> 
> Yep, I agree that the service -> entity mapping is a lot more complex, and
> I agree that the entity -> operations mapping solves a really huge amount
> of problems. I'm still falling short of a killer use case that requires the
> service -> entity mapping, but as soon as I get one we can move this
> discussion to a more concrete level.

Again, I think this stems from an RPC point of view. The Web however is a
resource-oriented architecture which works by manipulating the state of
resources and by navigating hypermedia graphs.


Cheers,
Markus


[1] https://bitly.com/15i8rpp


--
Markus Lanthaler
@markuslanthaler
Received on Wednesday, 13 November 2013 18:27:32 UTC