
Re: OLDM using the Hydra vocabulary for expressing property constraints

From: Benjamin Cogrel <benjamin.cogrel@bcgl.fr>
Date: Mon, 16 Jun 2014 12:36:47 +0200
Message-ID: <539EC8BF.2030908@bcgl.fr>
To: public-hydra@w3.org
On 14/06/2014 22:46, Markus Lanthaler wrote:
> On 13 Jun 2014 at 22:52, Benjamin Cogrel wrote:
>>> On 13/06/2014 00:57, Markus Lanthaler wrote:
>>> This looks great. It's very intuitive. Great work! One thing that isn't clear to me (by just
>>> looking at the documentation) is how you do data validation. How does OLDM know
>>> that the email address isn't valid? Is that hard-coded for FOAF?
>> Happy to hear that! Yes, you guessed right: for the email address, we
>> have a specific Python validator that is registered for the
>> foaf:mbox and schema:email properties. But the email address is just a
>> special case, and this practice should be avoided for regular properties.
>> Currently, it checks:
>>   - the XSD datatypes declared in the JSON-LD context,
>>   - the container (set, list and language maps) when appropriate,
>>   - the hydra:required, hydra:writeonly and hydra:readonly properties.
>> This is clearly not enough to provide an alternative to what common ORMs
>> do, so I have opened an issue for supporting new vocabularies [3].
>> I will first give a try to a SPIN-based data quality constraint ontology [4].
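As an aside, such a per-property validator registration can be sketched roughly as follows. This is only an illustration: the registry, function names and email check below are made up and do not reflect OldMan's actual API.

```python
import re

# Hypothetical registry mapping property IRIs to validator functions.
VALIDATORS = {}

def register_validator(property_iri, fn):
    VALIDATORS.setdefault(property_iri, []).append(fn)

def validate_email(value):
    # Accept both bare addresses and mailto: IRIs (as used with foaf:mbox).
    addr = value[len("mailto:"):] if value.startswith("mailto:") else value
    return re.match(r"^[^@\s]+@[^@\s]+\.[^@\s]+$", addr) is not None

for iri in ("http://xmlns.com/foaf/0.1/mbox", "http://schema.org/email"):
    register_validator(iri, validate_email)

def is_valid(property_iri, value):
    # Properties with no registered validator are accepted as-is.
    return all(fn(value) for fn in VALIDATORS.get(property_iri, ()))
```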
> Do you intend to share this information with the client or will this just be used on the server side? In the latter case, I'm wondering whether it wouldn't be simpler to just create a mapping property-validator directly in code when you instantiate the ResourceManager or Model!?

Yes, I intend to share this information with the client (unless the
server wants to keep it secret). The client is then free to interpret it
or not.
What seems nice about SPIN templated constraints is that they allow us
to define, at the same time, new RDF properties and their automatic
transformation into SPARQL queries (please note that this does not imply
the presence of a SPARQL endpoint; a query can simply be applied to the
RDF representation of a resource). I expect to generate new validators
(and their mappings) automatically from the definitions of these
properties. Such a declarative approach would kill two birds with one
stone :-)

>>> On the client side, in a lot of cases you have entities that look different from those that the
>>> server gives you. So you probably want something to map the representation returned by the
>>> server to your local entities. You shouldn't require the client to use exactly the same entities
>>> as the server as that would introduce tight coupling. From a Hydra point of view (and Web
>>> APIs in general) you probably also want some features that allow you to invoke operations.
>>> Also, in most cases you probably can't assume that a SPARQL endpoint will be available.
>>> So you have to navigate the resources exposed by the server. Hydra supports (and this will be
>>> improved) some basic querying/filtering of collections. Ruben is working on more
>>> sophisticated querying in a project he called Linked Data Fragments [1]. He's also on this
>>> list and I'm sure he's more than happy to answer any question you might have. Since this is
>>> closely related to Hydra, those discussions would be very welcome on this list btw. :-)
>> Yes, I agree: on the client side we cannot always assume that (i) a
>> SPARQL endpoint is available, (ii) the server will perform no
>> validation and (iii) the server will accept our local representation.
>> I will propose an interface to abstract the use of a SPARQL endpoint.
>> For me, implementations of this interface (e.g. LD Fragments or Hydra
>> clients) should be in charge of mapping the client and server
>> representations. What do you think?
> The absence of SPARQL is one thing. The other thing I was talking about are entity representations themselves. A client might already have a Python class representing a person. When it retrieves the representation of a foaf:Person from the server, it somehow has to map the data it got to that class. Obviously that mapping has to be bidirectional.

First, let me clarify a technical detail. In contrast to the other
ORMs I know, there is no Python *class* representing a person in OldMan.
There is one Model *object* that represents an RDFS class such as
foaf:Person or my-example:LocalPerson [6]. A JSON-LD context, the
schema describing an RDFS class and an IRI generator should provide
enough information to generate a new Model object in most cases.
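To make this concrete, here is a rough, hypothetical sketch of such a Model object built from those three ingredients. None of these names reflect OldMan's actual API; resources are reduced to plain dicts of JSON-LD terms for brevity.

```python
class IncrementalIriGenerator:
    """Illustrative IRI generator: appends a counter to a prefix."""
    def __init__(self, prefix):
        self.prefix = prefix
        self.counter = 0

    def generate(self):
        self.counter += 1
        return f"{self.prefix}{self.counter}"

class Model:
    """One Model object per RDFS class (not one Python class per RDFS class)."""
    def __init__(self, class_iri, context, iri_generator):
        self.class_iri = class_iri
        self.context = context          # JSON-LD context, kept for serialization
        self.iri_generator = iri_generator

    def new(self, **properties):
        # A "resource" here is just a dict of JSON-LD terms.
        return {"@id": self.iri_generator.generate(),
                "@type": self.class_iri,
                **properties}

context = {"name": "http://xmlns.com/foaf/0.1/name"}
person_model = Model("http://xmlns.com/foaf/0.1/Person", context,
                     IncrementalIriGenerator("http://example.org/persons/"))
alice = person_model.new(name="Alice")
```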

Here is a recap of what the scope of OldMan is and what it could become.

The initial use case of this project is to provide an alternative to the
ORMs used in Web frameworks like Django, Ruby on Rails or Symfony for
building Web APIs. The relational database is replaced by a SPARQL
endpoint that Web developers still control; the main point is to make
them express most of the application logic declaratively in RDF and
JSON-LD, not imperatively in Python. I like to think of this project as
a transition tool for Web developers towards read-write Linked Data.
Because adopting a declarative style is already a big move, I think it
is important to preserve some reference points for this first step.

As a second step, we can relax the assumption that the (main) Web API
controls the SPARQL endpoint and see this Web API as the client of other
independent (sub) Web APIs. OldMan, as an OLDM, is the module of the
main Web API that is in charge of CRUD operations*. This OLDM can now be
seen as the client of a datastore, where the latter may use a different
representation than that of the main Web API** and may enforce its
own data validation. I will propose an interface between the core of
OldMan and client modules for interacting with SPARQL endpoints, LDF and
LDP servers, Hydra Web APIs, etc. These client modules will be in charge
of the mapping between local and remote representations you discussed.
As agents, they will execute the CRUD "goals" assigned by the core part.
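The kind of interface I have in mind could look roughly like the following sketch. The method names and the trivial in-memory implementation are illustrative only; real implementations would talk to SPARQL, LDF or LDP servers, or Hydra Web APIs.

```python
from abc import ABC, abstractmethod

class DataStoreClient(ABC):
    """Hypothetical interface between the OLDM core and client modules."""

    @abstractmethod
    def get(self, iri):
        """Fetch the local representation of the resource, or None."""

    @abstractmethod
    def save(self, resource):
        """Create or update the remote resource from the local representation."""

    @abstractmethod
    def delete(self, iri):
        """Remove the remote resource."""

class InMemoryClient(DataStoreClient):
    """Trivial implementation for testing: keeps representations in a dict."""
    def __init__(self):
        self._store = {}

    def get(self, iri):
        return self._store.get(iri)

    def save(self, resource):
        self._store[resource["@id"]] = resource

    def delete(self, iri):
        self._store.pop(iri, None)
```

A real client module would also translate between the local and remote representations in `get` and `save`, which is where the bidirectional mapping you mentioned would live.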

You mentioned the integration of non-CRUD Hydra operations; this could
be a third step. Currently, the OLDM uses the Hydra description of the
main Web API as the schema of its local representation. If non-CRUD
operations are to appear on Resource or Model objects, I think they
should be operations provided (i) by the Web APIs the OLDM is a client
of or (ii) by a common abstraction of them. The latter abstraction would
reduce the coupling. If I guess correctly, this would turn OldMan into a
generic Hydra client library, wouldn't it? However, one thing still
confuses me: how can we obtain nice Python methods (like [7]) from these
Hydra operations?
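One option I am considering is to expose operations dynamically via `__getattr__`, so that a generic `perform` also gains method-like sugar. This is only a sketch with made-up names, not a worked-out design:

```python
class Resource:
    """Sketch: expose Hydra operations as dynamically-resolved Python methods."""

    def __init__(self, operations):
        # operations: operation name -> callable performing the HTTP request.
        self._operations = operations

    def perform(self, name, **kwargs):
        return self._operations[name](**kwargs)

    def __getattr__(self, name):
        # Called only when normal attribute lookup fails, so regular
        # attributes and methods are unaffected.
        if name in self._operations:
            # article.like() becomes sugar for article.perform("like").
            return lambda **kwargs: self.perform(name, **kwargs)
        raise AttributeError(name)

# Illustrative usage with a stubbed operation:
article = Resource({"like": lambda **kwargs: "liked"})
print(article.like())            # dynamic method
print(article.perform("like"))   # generic form
```

The open question would then be how to derive nice method names (and docstrings) from the Hydra operation descriptions themselves.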

*: Please note that the CRUDController I mentioned in my previous mail is
just an extension that is useful only if you want to build a CRUD-like
Web API. Even if your Web API is not a simple CRUD service, you may
still need to use some CRUD operations internally.
**: We assume that OldMan and the main Web API use the same representation.

>>>> Also, in the future, I would like to support the "@reverse" JSON-LD
>>>> keyword so I would be interested about having some "reversed supported
>>>> properties".
>>> We talked about this already. We might introduce a reversed flag (similar to required) for
>>> supported properties to support it. This is already being tracked as ISSUE-40 [2].
>> Ok, I will give the hydra:reverseOf property a try and see if it
>> breaks my current design ;-)
> Please share your insights on this as it will help us with the design of Hydra.
>>> I find your project extremely interesting. What are your future plans? What's still missing
>>> that you plan to add?
>> Great, I hope it will be useful :)
>> In addition to the points I mentioned, I have started a CRUDController
>> [5] that manipulates resources having the same base IRI. This controller
>> may be extended to implement the collection pattern.
> Why did you decide to not expose those methods directly on your models? You could also generalize this by evaluating the available Hydra operations. This would then allow you to do things like
>    article.perform("LikeAction")

The CRUDController is a module I quickly wrote some time ago to get a
better understanding of the scope of this project.
As it will be a main component of some CRUD Web API implementations, its
relation to Hydra operations is something that should be clarified.

> You should perhaps also use a term like "fragment-less IRI" or "hash-less IRI" instead of "base IRI" as the "base IRI" can have a fragment as well.

You're totally right, thank you. Indeed, the term "base IRI" is
confusing because it has a different meaning in JSON-LD and, for
instance, in [8]. I have updated my code to use the term "hash-less IRI".
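For reference, Python's standard library already provides urldefrag for computing such hash-less IRIs, which is what my code now relies on conceptually:

```python
from urllib.parse import urldefrag

def hashless(iri):
    # urldefrag splits an IRI into (hash-less part, fragment).
    return urldefrag(iri)[0]

print(hashless("http://example.org/doc#me"))  # http://example.org/doc
print(hashless("http://example.org/doc"))     # http://example.org/doc
```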

> Keep up the good work,
> Markus
>>> [1] http://linkeddatafragments.org/
>>> [2] https://github.com/HydraCG/Specifications/issues/40
>> [3] https://github.com/oldm/OldMan/issues/9
>> [4] http://semwebquality.org/ontologies/dq-constraints
>> [5] http://oldman.readthedocs.org/en/latest/oldman.rest.html
> --
> Markus Lanthaler
> @markuslanthaler

Thank you Markus for this very interesting discussion,


[8] http://www.w3.org/blog/2011/05/hash-uris/
Received on Monday, 16 June 2014 10:37:21 UTC
