Re: dataset syntax metadata

On 09/26/2012 10:09 AM, Andy Seaborne wrote:
> On 26/09/12 13:53, Sandro Hawke wrote:
>> I'm surprised at some of the responses about the metadata questions in
>> my "Dataset Syntax - checking for consensus" email [1].
>> When people publish RDF for real, don't they usually put some triples in
>> it which indicate who created it, when it was created, and maybe why?
>> Maybe some folks don't do this, but many people consider this an
>> essential practice.   My sense is that every computer format either has
>> a metadata mechanism built into it, or one somehow gets hacked in later
>> (like the javadoc conventions).  In a few cases (like the Adobe formats)
>> that metadata is expressed in RDF.
> We have RDF -  it can already express metadata!
>> When people publish an RDF dataset, aren't they going to want to do the
>> same thing?
> Dunno - maybe they are just putting a collection of graphs on the web 
> and linking to it (e.g. N-Quads dumps).
> The "what it is" and "where it came from" is out-of-band e.g. on the 
> web page linking to the file.

My understanding is that in many situations, embedded metadata (in 
contrast to metadata that has to be maintained elsewhere) has proven its 
value enough to be considered an absolute requirement.

>> Yes, sometimes you can just throw that metadata into a named graph, but
>> what if (a) you don't get a chance to tell the consumer which named
>> graph you put it in, and (b) some named graphs are opaque/untrusted,
>> perhaps because they contain old information or information from other
>> sources (e.g. a Web Crawl).  (While these might not be the cases you work
>> with, it seems to me they'll be quite common if this syntax ever catches
>> on.)
>> Folks who are not convinced we need a metadata mechanism -- how do you
>> imagine solving this problem?  How can someone reading a serialized
>> dataset figure out which triples are the metadata?
> Can't they look for it with a query?
> SELECT * { GRAPH ?g { :s rdf:type :metadataRecord } }

No, because (in case (b) above) there might be some obsolete or 
incorrect metadataRecords in some of the data being managed.
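
To make the problem concrete, here is a rough sketch in Python. It is
only an illustration: the dict-of-sets dataset model, the graph names
:current and :crawl-2010, and the helper function are all made up for
this example, and it merely emulates what the GRAPH pattern above
would match.

```python
# Hypothetical model: a dataset maps graph names to sets of
# (subject, predicate, object) triples. None stands for the unnamed graph.
RDF_TYPE = "rdf:type"
METADATA_RECORD = ":metadataRecord"  # class name taken from the query above

dataset = {
    None: {(":s", "dc:creator", ":publisher")},
    ":current": {(":s", RDF_TYPE, METADATA_RECORD)},
    # Assumed stale copy that arrived with crawled data.
    ":crawl-2010": {(":s", RDF_TYPE, METADATA_RECORD)},
}

def graphs_with_metadata_record(ds, subject=":s"):
    """Emulate SELECT * { GRAPH ?g { :s rdf:type :metadataRecord } }.

    GRAPH only ranges over named graphs, so the unnamed graph is skipped.
    """
    return sorted(
        name for name, triples in ds.items()
        if name is not None and (subject, RDF_TYPE, METADATA_RECORD) in triples
    )

matches = graphs_with_metadata_record(dataset)
print(matches)
```

Both named graphs match, and nothing in the result tells the consumer
which record the publisher actually stands behind.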

> although the unnamed graph is a good place to put it IMO.
> Just don't invent a fixed name for the metagraph.

The Giant Global Graph?  :-)

I think you're saying not to use something like:

      <> { ... metadata here ... }

That hadn't even occurred to me, and I don't really like it.

I think it would be better than nothing, though -- it would at least
address the use case of a client that is handed a dataset and has to
figure out how it was intended to be used.  If the group does NOT
provide a standard metadata mechanism, this might sadly end up being
the best option available to the community, since at least it
minimizes conflict and misunderstanding.
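
For what it's worth, a client following that convention might do
something like the sketch below. This assumes "<>" resolves to the IRI
the dataset was retrieved from; the example.org IRIs, the dict-based
dataset model, and both helper functions are hypothetical.

```python
# Hypothetical model: a dataset maps graph names (IRIs) to sets of triples.
# BASE_IRI is an assumed retrieval location; "<>" would resolve to it.
BASE_IRI = "http://example.org/data.trig"

dataset = {
    BASE_IRI: {(BASE_IRI, "dc:creator", "http://example.org/people/alice")},
    "http://example.org/crawl": {
        ("http://example.org/crawl", "dc:creator", "http://example.org/unknown"),
    },
}

def dataset_metadata(ds, base_iri):
    """Return the triples in the graph named by the dataset's own IRI."""
    return ds.get(base_iri, set())

def creators(ds, base_iri):
    """Read dc:creator values from the self-named graph only."""
    return sorted(
        o for s, p, o in dataset_metadata(ds, base_iri)
        if s == base_iri and p == "dc:creator"
    )

print(creators(dataset, BASE_IRI))
```

The point is just that the client never has to inspect the other named
graphs to find the publisher's statements about the dataset itself.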

       -- Sandro

>     Andy

Received on Wednesday, 26 September 2012 14:37:40 UTC