Re: giving up on datasets/trig as more than a web cache

+1 to this as well. "Dataset metadata" can perfectly well be put in its own named graph.

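To make the "web cache" reading concrete, here is a minimal sketch in Python. It is only an illustration under stated assumptions, not anything proposed in the thread: triples are plain tuples rather than objects from a real RDF library, and the names META, FETCHED_ON, build_cache and get_graph, as well as the example.org URLs, are all hypothetical. The dataset is just a mapping from URL to graph contents, and the metadata sits in one more named graph like any other data.

```python
from typing import Callable, Dict, FrozenSet, Iterable, Set, Tuple

# All names below are hypothetical; triples are plain (s, p, o) tuples.
Triple = Tuple[str, str, str]
Graph = FrozenSet[Triple]

META = "http://example.org/feed/meta"          # assumed name of the metadata graph
FETCHED_ON = "http://example.org/vocab#fetchedOn"  # assumed property


def build_cache(sources: Dict[str, Set[Triple]], fetched_on: str) -> Dict[str, Graph]:
    """Pair each source URL with its contents, plus one named graph of metadata."""
    cache: Dict[str, Graph] = {url: frozenset(ts) for url, ts in sources.items()}
    # The metadata is ordinary data, stored in its own named graph.
    cache[META] = frozenset((url, FETCHED_ON, fetched_on) for url in sources)
    return cache


def get_graph(url: str, cache: Dict[str, Graph],
              fetch: Callable[[str], Iterable[Triple]]) -> Graph:
    """Use the cached copy if the dataset has one, else dereference the URL."""
    if url in cache:
        return cache[url]
    return frozenset(fetch(url))
```

A client that wants everything takes the whole mapping; a client that wants one graph calls get_graph and only falls back to the network (the fetch argument) when the cache has no entry for that URL.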

On Thu, Sep 27, 2012 at 02:47:38PM +0200, Antoine Zimmermann wrote:
> +1 to this.
> Note that, even if we do not recommend putting metadata in the
> default graph, it's still possible to do it. Metadata in RDF is also
> data in RDF, so you can put it in a named graph or in the default
> graph.
> On 27/09/2012 14:09, Sandro Hawke wrote:
> >Recently, I've tried to argue that trig (or whatever it's called) needs
> >to be able to carry distinguished metadata. This morning I've decided it
> >doesn't, really, at least for the use cases I think about. My
> >replacement idea is to think about trig as *just* being a Web Cache, as
> >just a convenient shorthand for pairing a bunch of URLs and their RDF
> >contents, so you can publish or fetch them all at once. I had been
> >thinking about it as something else, as more of a first-class KR, but
> >that doesn't seem to be flying. (I guess this is yet another hold-over
> >from my years of working with N3.)
> >
> >Let's see if I can explain, for anyone else who might think a dataset
> >could/should mean something more, and maybe myself, tomorrow.
> >
> >The use cases I think about are nearly all about data federation, the
> >stuff I wrote about and implemented as a federated phonebook [1].
> >They're all about data being gathered from original sources and
> >processing systems and passed on toward data consumers, as a package, as
> >a new combined-source. This seems to me like an incredibly important use
> >case that requires standardization, and something that could really benefit
> >from the idea of datasets and a dataset syntax.
> >
> >I envisioned it as a converging pipeline, starting with turtle files
> >(rdf graphs) as the leaves, but then having trig files (rdf datasets) as
> >the major trunks. The clients would always be getting a trig file (or
> >using a sparql endpoint with the same dataset). For example, in 2.4 we
> >get the situation where a division is gathering the data from its
> >departments, and then passing them up to headquarters in one combined feed.
> >
> >But if the feed is trig, and one is going to be able to figure out what
> >really came from where/when so that bugs and incorrect data can be
> >addressed, then trig has to have distinguished metadata. And I hear a
> >lot of people opposed to that, or at least opposed to any convenient way
> >of supporting it, because SPARQL doesn't really have it. So, instead,
> >how about we just make the main feed be turtle, and it only contains the
> >metadata. All the data I was putting in named graphs stays out on the
> >web, to be dereferenced by clients if they want.
> >
> >And then, for performance, if desired, the feed can also link to a trig
> >file, saying "here, I've done all the fetching for you; if you're going
> >to be dereferencing all this stuff anyway, you might as well take this
> >instead". It can do the same with providing a SPARQL end-point,
> >providing it for convenience/performance.
> >
> >*shrug* It should work fine. Maybe it's even better architecture. It
> >certainly means the name should not be "SuperTurtle", since now trig
> >remains a fairly obscure/internal/dump format, and (unlike Turtle)
> >cannot actually be used to express data, other than simple pairings of URLs
> >and graphs.
> >
> >-- Sandro
> >
> >[1]
> >
> >
> >
> -- 
> Antoine Zimmermann
> ISCOD / LSTI - Institut Henri Fayol
> École Nationale Supérieure des Mines de Saint-Étienne
> 158 cours Fauriel
> 42023 Saint-Étienne Cedex 2
> France
> Tél:+33(0)4 77 42 66 03
> Fax:+33(0)4 77 42 66 66

Received on Friday, 28 September 2012 12:09:03 UTC