Re: NQuads

> On 21 Jan 2017, at 16:45, Gregg Kellogg <gregg@greggkellogg.net> wrote:
> 
>> On Jan 20, 2017, at 11:36 AM, Hugh Glaser <hugh@glasers.org> wrote:
>> 
>> This is obviously something I should know, but there you go :-)
>> 
>> https://www.w3.org/TR/n-quads/#BNodes says:
>> "A fresh RDF blank node is allocated for each unique blank node label in a document. Repeated use of the same blank node label identifies the same RDF blank node."
>> I note it says "document", not "graph".
>> 
>> However, when I look at the brilliant new WebDataCommons release, and download the first jsonld file from http://webdatacommons.org/structureddata/2016-10/stats/how_to_get_the_data.html, I get loads of blank nodes with the same name.
>> It is most easily seen by extracting the schema.org sameAs triples (why else would I be looking there? :-) )
>> 
>> Here is the start:
>> _:b0 <http://schema.org/sameAs> <https://www.facebook.com/9fivers> <http://9five.com/blogs/9five-blog/12207401-9fivers-as-seen-on-tv-episode-2>   .
>> _:b0 <http://schema.org/sameAs> <https://twitter.com/9fivers> <http://9five.com/blogs/9five-blog/12207401-9fivers-as-seen-on-tv-episode-2>   .
>> _:b0 <http://schema.org/sameAs> <https://www.instagram.com/9fivers/> <http://9five.com/blogs/9five-blog/12207401-9fivers-as-seen-on-tv-episode-2>   .
>> _:b0 <http://schema.org/sameAs> <https://www.youtube.com/user/9fiveEyewear> <http://9five.com/blogs/9five-blog/12207401-9fivers-as-seen-on-tv-episode-2>   .
>> _:b0 <http://schema.org/sameAs> <https://www.facebook.com/androidayuda> <http://androidayuda.com/aplicaciones-android/>   .
>> _:b0 <http://schema.org/sameAs> <https://www.twitter.com/androidayuda> <http://androidayuda.com/aplicaciones-android/>   .
>> _:b0 <http://schema.org/sameAs> <https://plus.google.com/+androidayuda> <http://androidayuda.com/aplicaciones-android/>   .
>> _:b0 <http://schema.org/sameAs> <http://www.facebook.com/aplusapp> <http://aplus.com/a/beyonce-cfda-awards-speech>   .
>> _:b0 <http://schema.org/sameAs> <http://www.twitter.com/aplusapp> <http://aplus.com/a/beyonce-cfda-awards-speech>   .
>> _:b0 <http://schema.org/sameAs> <http://www.instagram.com/aplusapp> <http://aplus.com/a/beyonce-cfda-awards-speech>   .
>> _:b0 <http://schema.org/sameAs> <http://aplusapp.tumblr.com> <http://aplus.com/a/beyonce-cfda-awards-speech>   .
>> _:b0 <http://schema.org/sameAs> <http://youtube.com/aplusapp> <http://aplus.com/a/beyonce-cfda-awards-speech>   .
>> _:b0 <http://schema.org/sameAs> <http://pinterest.com/aplusapp> <http://aplus.com/a/beyonce-cfda-awards-speech>   .
>> _:b0 <http://schema.org/sameAs> <http://plus.google.com/+Aplusapp> <http://aplus.com/a/beyonce-cfda-awards-speech>   .
>> 
>> Am I misunderstanding the N-Quads document, or should there be (presumably) 3 different blank nodes here?
> 
> I suspect that they are simply re-using the blank-node labels from the JSON-LD documents. Indeed, the scope of a blank-node label is the document it came from, and merging the graphs (datasets) from multiple documents should result in freshly minted blank-node labels so that the scope remains consistent for the quads from a given document.
> 
> This could be the result of the way blank nodes are labeled in the JSON-LD to RDF algorithm, which is fine in itself, unless the data store they are placing it in does not maintain separate scopes. Data stores may retain the original blank-node labels but still represent different nodes, as it is often convenient to have a re-serialized document use the same labels. When merging, though, these labels should go away, or the act of re-serializing should regenerate them.
> 
> Gregg
Yes, my suspicion too.
Merging what are meant to be these simple formats isn't always straightforward.
I think it will take a little careful thought to work out what is best here.
For example, is it sensible to try to do something useful, or should I just use random bnode IDs?
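
To make the "something useful" option concrete, here is a minimal sketch of relabelling on merge. It assumes (my assumption, not something the WebDataCommons documentation states) that each graph name in the dump corresponds to exactly one source page, so that (graph name, original label) identifies one blank node. The function name relabel_bnodes, the _:gN label scheme and the simplified ASCII-only term regex are all made up for illustration; a real N-Quads parser would be more robust.

```python
import re

# One N-Quads term: an IRI, a literal (optionally with datatype or language
# tag), or a blank node label. ASCII-only labels: a simplification of the
# real grammar, which also allows some non-ASCII characters.
TERM = re.compile(
    r'<[^>]*>'
    r'|"(?:[^"\\]|\\.)*"(?:\^\^<[^>]*>|@[A-Za-z0-9-]+)?'
    r'|_:[A-Za-z0-9_](?:[A-Za-z0-9_.-]*[A-Za-z0-9_-])?'
)


def relabel_bnodes(lines):
    """Give each (graph name, blank node label) pair its own fresh label.

    Assumption: the fourth term (graph name) stands for the source
    document, so labels are scoped per graph name rather than per file.
    """
    fresh = {}  # (graph name, original label) -> new label
    out = []
    for line in lines:
        terms = TERM.findall(line)
        if len(terms) < 3:
            # Blank line, comment, or something this sketch cannot read.
            out.append(line)
            continue
        # Triples without a graph name share one default scope.
        scope = terms[3] if len(terms) > 3 else ""

        def swap(match, scope=scope):
            term = match.group(0)
            if not term.startswith("_:"):
                return term  # IRIs and literals pass through unchanged
            key = (scope, term)
            if key not in fresh:
                fresh[key] = "_:g%d" % len(fresh)
            return fresh[key]

        out.append(TERM.sub(swap, line))
    return out
```

Run over the lines I quoted earlier, every _:b0 from the 9five.com page ends up with one label, and the androidayuda.com and aplus.com pages each get their own, which is the three distinct nodes I was expecting. Using a counter rather than random IDs keeps the output reproducible between runs, at the cost of holding the mapping in memory.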

And then there is the question of what I should do with them in relation to my sameAs stores.

Cheers

PS
A pox on blank nodes!
I hate them with a passion.

> 
>> Best
>> Hugh
> 
> 

Received on Saturday, 21 January 2017 18:23:26 UTC