- From: Gregg Kellogg <gregg@greggkellogg.net>
- Date: Fri, 14 Mar 2014 12:02:58 -0700
- To: Markus Lanthaler <markus.lanthaler@gmx.net>
- Cc: Ruben Verborgh <ruben.verborgh@ugent.be>, public-hydra@w3.org
On Mar 14, 2014, at 7:23 AM, Markus Lanthaler <markus.lanthaler@gmx.net> wrote:

> On Friday, March 14, 2014 2:26 PM, Ruben Verborgh wrote:
>> Basic Linked Data Fragments share the URI template of Restpark.
>> I actually had a rather similar experience as you;
>> read about them and forgot until Luca pinged me.
>
> :-)
>
>
>> However, whereas Restpark still has "query" in its terminology
>> (for instance, there is a limit parameter);
>> basic LDFs are really just specific fragment of a dataset
>> that can (and should) be interpreted separately from their application.
>> And that's where Hydra comes in:
>> my client might use fragments to solve SPARQL queries,
>> but other clients might do something completely different.
>
>
> +1
>
>
>>>> :dbpedia void:subset <http://data-cdn.linkeddatafragments.org/dbpedia?subject=&predicate=dbpedia-owl%3AbirthPlace&object=dbpedia%3ANew_York>;
>>>> hydra:search _:triplePattern.
>>>
>>> This all makes perfect sense to me.. the only thing that you might wanna
>>> change (not sure) is to what hydra:search is attached to. In this case
>>> here, I (as a client) would assume that you further query that Linked Data
>>> Fragment (instead of querying the whole DBpedia dataset).
>>
>> The above are two distinct triples, right? So I'm saying that:
>>
>>>> :dbpedia void:subset <....>.
>>>> :dbpedia hydra:search _:triplePattern.
>
> Sorry, you are of course right. I've misread those two triples. I read it as
>
> <....> hydra:search _:triplePattern
>
>
>> So this does capture the semantics that the whole dataset is searched?
>> I.e., would the client know that the query searches DBpedia, not the
>> fragment?
>
> Yes, my bad.
>
>
>>>> 1) How should a parameter be serialized in the URI template?
> [...]
>>> We can either define (and fix) how
>>> IRIs/literals are to be serialized or we add a mechanism to describe
>>> how they should be serialized.
>>
>> That's it.
>> But
>> the full flexibility that this use case needs
>> will probably be overkill for many use cases.
>> So I'm afraid there will have to be a mechanism,
>> because few would want to go all-the-way.
>>
>> As I've shown, for this use case it's crucial to distinguish
>> beween literals and URIs. It's a no-go to do anything else.
>> But it would probably be unreasonable to expect
>> that people will want to indicate this difference all the time.
>> (For instance, always have < > around URIs or "" around strings.)

Not a fan of using <> around URIs. I'd say that the general practice
would be that values are either (URI-encoded) URIs, or quoted strings,
with optional language using a Turtle-like syntax. I don't see any real
advantage of including datatype information, though. Alternatively,
language could be another parameter
(subject=&predicate=&object=&language=).

>>> Allowing to describe the expected serialization
>>> format is much more flexible but makes the implementation of
>>> (primarily) clients more difficult.
>>
>> What could work is "convention over configuration".
>> (But still allowing configuration.)
>
> Yeah.. even though that's always a bit odd with the open world assumption.
> But it is just a hint anyway.. so it should probably be good enough.
>
>
>>>> And of course, there would be many more ways to parse parameters.
>>>> I could live with only giving one that works for clients,
>>>> but it should be consistent and allow to differentiate between
>>>> strings and URIs.
>>>
>>> Would be your preference or can you "just live with it"?
>>
>> What I mean is that:
>> the server currently supports different ways to pass a URI.
>> You could abbreviate it with prefixes, or have to full URI in < >.
>> It would be totally fine with me if Hydra were only able
>> to explain just one of them, and not both.
>
> OK, yeah.. I think allowing prefixes makes the solution much complex without
> bringing much advantages if we talk about machine clients.
> If a human user
> is interacting such a service, the UI can still support CURIEs but expand
> them before sending the URL to the server. Have you considered doing that in
> your prototype? A positive side effect of doing so would be increased
> cache-hit rates.

+1, I don't see a real need to use compact IRIs; however, if you were
to, the set of IRI prefixes could come either from @prefix declarations
(along with the default prefixes for RDFa) or from prefixes defined in a
JSON-LD context.

>> But it would need to explain one of them.
>>
>>> Do you think there
>>> are many cases where a variable can take both an IRI and a literal
>>> and the distinction is important?
>>
>> No, in the majority of cases it won't be;
>> because there are few properties that could either take a URI or
>> string.
>> rdf:object is actually one of the only ones.

For schema.org data, it's always legal to use Text where an entity
reference is expected. For practical purposes, I solve this on
ingestion, so that, e.g., :gregg schema:knows "Ruben Verborgh" gets
expanded to :gregg schema:knows [schema:name "Ruben Verborgh"], but that
won't universally be the case, so an object may take either an IRI or a
literal IMO.

>> But in this case, it is rdf:object I need.
>>
>>> I kind of have troubles to find an example where that would matter.
>>
>> In the LDF use case it does, hence my mail ;-)
>
> :-)
>
>
>> I understand that a spec cannot be tailored to individual needs,
>> but LDF could be a big and compelling use case for Hydra.
>
> Definitely
>
>
>> What I would propose is something like:
>>
>> _:object hydra:variable "object";
>>          hydra:property rdf:object;
>>          hydra:serialization hydra:NodeSerialization.
>>
>> Where hydra:NodeSerialization is a way that distinguishes
>> between IRIs, literals, blank nodes, and variables.
>>
>> The default ("convention over configuration") could be
>> hydra:TextualSerialization,
>> where the IRIs or literals as-is value is passed; losing the ability to
>> distinguish.
>
> I had something similar in mind. I was thinking of something like
> "ValueOnly" which would correspond to your "TextualSerialization" (IRI
> as-is, only lexical form of literals) and "FullRepresentation" (with a
> better name) which would correspond to NodeSerialization.

As I mentioned, in some cases, you might need to be more flexible. In
any case, coming up with a simple scheme should make this unambiguous:
IRIs and BNodes can easily be determined, literals always being quoted.
If an IRI starts with a known prefix, then it can be expanded, etc.

Gregg

>> Summarized: simple cases stay simple,
>> complex cases are supported and still simple.
>
> Yeah, I quite like this.
>
>
>>>> 2) What do the subject, predicate, and object properties really
>>>> mean?
>>>
>>> My take on this would be to either specialize the IriTemplate class
>>> to something like a LdfIriTemplate or to specialize hydra:search..
>>> something like ldf:queryInterface.
>>
>> That's an interesting option and I like it.
>> However, I wonder whether Hydra itself could also have
>> "collection search semantics" built in;
>> so a specialization of hydra:search that says
>> "and I will now return those element of the collection
>> that directly have the specified property values".
>
> Yeah, this is directly related to what we discussed some time ago (the
> actor/blockbuster thing). I wouldn't be opposed to add something like
> hydra:filter for this. However, before doing so I'd still like to evaluate
> if property paths wouldn't be the more powerful alternative at the cost of
> increasing the complexity a bit.
>
>
>> A discussion in this direction is here:
>> http://lists.w3.org/Archives/Public/public-hydra/2014Feb/0153.html
>
> Great.. so we are on the same page :-) As you see, I typically reply to
> mails as I read them.. E-Mail Stream Processing™ :-P
>
>
>> I think such a use case would be common enough
>> to justify its inclusion in Hydra.
>
> Yep..
> even though I would say in most cases you can query/filter only by
> object on some properties. So, for example just by last name or whether an
> issue is closed/open and not on all "fields" as LDF currently does. Is there
> already a way to describe that in LDF?
>
>
>>> You could then even go as far as saying
>>>
>>> ldf:queryInterface a hydra:TemplatedLink ;
>>>   supportedOperation [
>>>     a ldf:RetrieveBasicLdfOperation ;
>>>     hydra:method "GET"
>>>     hydra:returns ldf:BasicLdf
>>>   ] .
>>>
>>> (sorry, haven't looked up LDF vocabulary yet)
>>
>> Neither have I :-)
>>
>> I would also make it a subclass then of hydra:search;
>> or the more specific property in Hydra is we decide to create that one.
>
> Yeah
>
>
>>> I'm pretty excited about this as I really see a lot of potential. It
>>> would be interesting to see if a Hydra ApiDocumentation would provide
>>> enough information to dynamically "crawl" the data instead of querying
>>> it by SPO. Have you spent any thoughts on that already?
>>
>> Oh! No I hadn't. Documentation is a very nice application area indeed.
>> Thanks!
>
> Not sure you understood what I meant. I meant a Hydra ApiDocumentation along
> with the used vocabularies basically provides a client a(n incomplete) map
> of the graph a service is exposing. Could that map be used to dynamically
> solve queries?
>
> Taking the demo issue tracker as example. Let's say I want to query for open
> issues. With LDF I would query for *, vocab:isOpen, true. The service,
> however, may not implement a LDF query interface. So, if you give the client
> the entry point, it would have to look up the Hydra ApiDocumentation and the
> used vocabularies in order to (try to) fulfill the query.
> In our case here
> it would
>
> 1) retrieve entrypoint (/api-demo), -> ok, is of type vocab:EntryPoint
> 2) ApiDocumentation -> vocab:EntryPoint -> supportedProperty -> vocab:issue
>    (ignoring for the moment that the range is just hydra:Collection and not
>    doesn't specify that members will be of type vocab:Issue)
> 3) dereference vocab:issue link (/api-demo/issues) -> hydra:Collection with
>    vocab:Issue members
> 4) dereference each vocab:Issue (e.g. /api-demo/issues/5) as the
>    vocab:isOpen is not included in collection representation
> 5) filter retrieved issues and only return the ones marked as open
>
> Obviously, it would be much more efficient if the service would offer a
> direct interface, but service providers can't always anticipate what
> consumers are interested in.
>
>
>
> --
> Markus Lanthaler
> @markuslanthaler
>
>
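
For illustration only, the five crawl steps quoted above could be sketched
roughly as follows. This is not code from the thread: the in-memory
RESOURCES and API_DOC tables, the issue /api-demo/issues/4, and the helper
names dereference and find_by_property are all hypothetical stand-ins, and
collection members are simplified to plain URL strings. A real client would
dereference the URLs over HTTP and read the actual Hydra ApiDocumentation.

```python
# Hypothetical in-memory stand-in for the demo issue tracker. A real client
# would fetch these representations over HTTP and parse them as JSON-LD.
RESOURCES = {
    "/api-demo": {"@type": "vocab:EntryPoint", "vocab:issue": "/api-demo/issues"},
    "/api-demo/issues": {
        "@type": "hydra:Collection",
        "hydra:member": ["/api-demo/issues/4", "/api-demo/issues/5"],
    },
    "/api-demo/issues/4": {"@type": "vocab:Issue", "vocab:isOpen": False},
    "/api-demo/issues/5": {"@type": "vocab:Issue", "vocab:isOpen": True},
}

# Hypothetical excerpt of the ApiDocumentation: the link properties that
# each class is documented to support.
API_DOC = {"vocab:EntryPoint": ["vocab:issue"]}


def dereference(url):
    """Return the representation of a resource (here: a dictionary lookup)."""
    return RESOURCES[url]


def find_by_property(entry_url, prop, value):
    """Crawl from the entry point and return URLs of members where prop == value."""
    entry = dereference(entry_url)                      # step 1: entry point
    matches = []
    for link_prop in API_DOC.get(entry["@type"], []):   # step 2: documented links
        collection = dereference(entry[link_prop])      # step 3: follow the link
        if collection.get("@type") != "hydra:Collection":
            continue
        for member_url in collection.get("hydra:member", []):
            member = dereference(member_url)            # step 4: fetch each member
            if member.get(prop) == value:               # step 5: filter client-side
                matches.append(member_url)
    return matches


open_issues = find_by_property("/api-demo", "vocab:isOpen", True)
```

The sketch makes the cost visible: one request per member in step 4, which
is exactly why a direct query interface would be more efficient when the
service offers one.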
Received on Friday, 14 March 2014 19:03:33 UTC