- From: Gregory Williams <greg@evilfunhouse.com>
- Date: Fri, 25 Sep 2009 11:41:45 -0400
- To: Alexandre Passant <Alexandre.Passant@deri.org>
- Cc: "public-rdf-dawg@w3.org Group" <public-rdf-dawg@w3.org>
On Sep 25, 2009, at 3:38 AM, Alexandre Passant wrote:

> On 25 Sep 2009, at 03:41, Gregory Williams wrote:
>
>> Beyond what's currently listed in the vocab section of the service
>> description page[1], I think we need a way to describe the dataset
>> provided by the endpoint. This goes beyond what things like voiD
>> provide which is a way to describe a single graph. Therefore, I'd
>> like to suggest something like this:
>>
>> <endpoint> sd:datasetDescription [
>>     sd:defaultGraph <void-dataset-for-default-graph> ;
>>     sd:namedGraph [
>>         sd:graphName <graph-name> ;
>>         sd:graphDescription <void-dataset-for-named-graph> ;
>>     ] .
>> ] .
>
> So, in a quad store, you will describe each graph separately using
> voiD ?
> Won't it be too much information in the SD, e.g. if I have 1 million
> RDF files in my store, will have 1 million of voiD descriptions in
> the SD.
> It may be more useful to directly querying each graph to get that
> void-like information, if needed.
>
> What about having a simple description listing the list of graphs +
> default one + the voiD description of the complete endpoint.

Say you have 1 million named graphs (based on your description, I
assume "1 million RDF files" means 1 million named graphs?) each with
1 million triples, and a single default graph with 100 triples. At
least for my use cases (optimization and federation), getting a voiD
description of the graph-merge might very well be worse than having no
description at all. If I'm trying to estimate how many results I can
expect for any query against the default graph, having the description
of the graph merge wouldn't do me any good.

On the other side of things, if I look at the dataset description and
find that there's information about a single foaf:Person in the
dataset and I want to retrieve that information, how am I meant to get
to it if I don't know which of the million named graphs it's in?

Having a huge number of named graphs is clearly a challenge w.r.t.
size of the service description, but I'm worried that a voiD
description of the merged dataset isn't all that useful if it doesn't
give you enough information to turn around and query the dataset for
things you're interested in.

>> The lack of naming symmetry between sd:defaultGraph (for default
>> graphs) and sd:graphDescription (for named graphs) could probably
>> be made better (maybe sd:defaultGraphDescription?), but this
>> modeling allows each graph in the dataset to be described as well
>> as things to be said about the entire dataset.
>
> Strictly speaking, isn't the default graph also a named graph (since
> it generally also have its own URI).

Possibly, but it doesn't have to have its own name, does it?

> <endpoint> sd:datasetDescription [
>     sd:defaultGraph [
>         sd:graphName <graph-name> ;
>         sd:graphDescription <void-dataset-for-default-graph> ;
>     ] .
>     sd:namedGraph [
>         sd:graphName <graph-name> ;
>         sd:graphDescription <void-dataset-for-named-graph> ;
>     ] .
> ] .

If we can count on all graphs (even a default graph) having a name,
then this would be a good way to generalize the modeling. For
consistency, even without a graphName on the default graph, maybe we
should do this?

> while I'd think a simple way would be
>
> <endpoint> sd:datasetDescription <void-dataset-for-dataset> ;
>     sd:defaultGraph <graph-name> ;
>     sd:namedGraph <graph-name> .

Again, I'm not sure how useful the <void-dataset-for-dataset> would
be, since by this point you've lost the ability to discriminate
between the named (and default) graphs.

Also, it's come up briefly in the past, but we haven't done much
talking about the difference between including the dataset description
in the service description document, or providing a URL for retrieving
the dataset description if/when needed. I think both are important,
but it does make querying harder because you need to potentially
handle both cases.

Thoughts?

thanks,
.greg
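
For readers following the sizing argument in the message above, the arithmetic can be sketched as below. This is only an illustration using the numbers from the message (1 million named graphs of 1 million triples each, plus a 100-triple default graph). The function name `merged_upper_bound` and the constants are hypothetical, and the sketch assumes no triple is repeated across graphs, so the merged count is an upper bound rather than an exact figure.

```python
# Numbers taken from the example in the message above.
NAMED_GRAPHS = 1_000_000
TRIPLES_PER_NAMED_GRAPH = 1_000_000
DEFAULT_GRAPH_TRIPLES = 100


def merged_upper_bound(default_triples: int, named_graphs: int,
                       triples_per_named: int) -> int:
    """Upper bound on the triple count of the graph merge.

    Assumption: no triple occurs in more than one graph. Any overlap
    between graphs would make the real merged count smaller.
    """
    return default_triples + named_graphs * triples_per_named


merged = merged_upper_bound(
    DEFAULT_GRAPH_TRIPLES, NAMED_GRAPHS, TRIPLES_PER_NAMED_GRAPH
)

# How far off a planner would be if it used the merged count as the
# size of the default graph.
overestimate_factor = merged // DEFAULT_GRAPH_TRIPLES

print(f"merged dataset (upper bound): {merged:,} triples")
print(f"default graph:                {DEFAULT_GRAPH_TRIPLES:,} triples")
print(f"overestimate factor:          {overestimate_factor:,}x")
```

Under these assumptions, a cardinality estimate for the default graph that is derived from merged-dataset statistics is off by roughly ten orders of magnitude, which is the concern raised for the optimization and federation use cases.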
Received on Friday, 25 September 2009 15:42:24 UTC