Re: Blank Node Identifiers and RDF Dataset Normalization from Kingsley Idehen on 2013-03-01 (public-linked-json@w3.org from March 2013)

From: Kingsley Idehen <kidehen@openlinksw.com>
Date: Fri, 01 Mar 2013 11:10:28 -0500
To: Steve Harris <steve.harris@garlik.com>
CC: Manu Sporny <msporny@digitalbazaar.com>, RDF WG <public-rdf-wg@w3.org>, Linked JSON <public-linked-json@w3.org>
Message-ID: <5130D2F4.1050902@openlinksw.com>

On 3/1/13 6:51 AM, Steve Harris wrote:
> On 2013-02-27, at 16:36, Kingsley Idehen <kidehen@openlinksw.com> wrote:
>
>> On 2/27/13 10:37 AM, Steve Harris wrote:
>>> I don't want to throw numbers about, but for us the cost of anything that significantly decreases the efficiency of our RDF storage carries a huge monetary cost - we couldn't justify it without a significant upside.
>> This is a very important point, and from the DBMS engineering perspective it's true. There are costs to existing RDF stores and DBMS engines.
>>
>> A suggestion:
>>
>> Manu: JSON-LD should make a note about the use of bnodes to denote graphs. That note could then home in on its special use-case scenarios, e.g., where there's high-velocity data with little mass.
>>
>> Steve:
>> As already acknowledged above, you are correct about the optimization cost to existing RDF stores and DBMS engines (it will hit Virtuoso too). Thus, when our engines encounter such data, we could simply remap the IRIs as part of our data ingestion (insert | import) routines. That's what we'll end up doing.
>>
>> Naturally, this means tweaking existing code for data import, ingestion, creation, etc. Personally, I believe we have the ability to close out this matter without holding up the various workgroups, i.e., RDF 1.1 stays as is, and JSON-LD gets a fleshed-out version of the note I suggested to Manu.
>>
>> Manu/Steve:
>>
>> What do you think?
> I believe that would be equivalent to defining the syntactic construct to generate Skolem URIs at parse time - but I've not thought about it too deeply.
>
> - Steve
>

Yes, so a little work, but worthwhile, since it keeps the data being
loaded distinct from the store and its specific data management
functionality. This also means that loaders can be crafted with switches
to control the load modality, etc.
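
To make that concrete, here is a rough sketch of such a loader, not
anything taken from Virtuoso or another store. It assumes terms arrive
as N-Quads-style strings (blank nodes as "_:label", IRIs in angle
brackets), and the names load_quads, skolemize, and base are made up for
illustration. The only piece borrowed from a spec is the
"/.well-known/genid/" path that RDF 1.1 Concepts suggests for Skolem
IRIs.

```python
import uuid
from typing import Dict, Iterable, Iterator, Optional, Tuple

# (subject, predicate, object, graph name); graph name is None for the default graph.
Quad = Tuple[str, str, str, Optional[str]]


def load_quads(
    quads: Iterable[Quad],
    skolemize: bool = True,               # hypothetical switch for the load modality
    base: str = "http://example.org",     # hypothetical authority for minted IRIs
) -> Iterator[Quad]:
    """Yield quads, optionally replacing blank node labels with Skolem IRIs."""
    mapping: Dict[str, str] = {}
    load_id = uuid.uuid4().hex  # keeps identical labels from separate loads distinct

    def remap(term: Optional[str]) -> Optional[str]:
        # Only blank node labels are rewritten; IRIs and literals pass through.
        if skolemize and term is not None and term.startswith("_:"):
            if term not in mapping:
                mapping[term] = f"<{base}/.well-known/genid/{load_id}-{term[2:]}>"
            return mapping[term]
        return term

    for s, p, o, g in quads:
        # Predicates are always IRIs, so they are left untouched.
        yield (remap(s), p, remap(o), remap(g))
```

With skolemize=True, a bnode used as a graph name gets the same minted
IRI everywhere it appears within one load, so the store only ever sees
IRIs. With skolemize=False, the quads pass through unchanged and the
store deals with the bnodes itself.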

Ultimately, this also means that the RDF model can evolve separately
from notations (e.g., Turtle, TriG, JSON-LD, etc.), consumer and client
apps, and data stores. Basically, everything remains (or becomes) loosely
coupled.

-- 

Regards,

Kingsley Idehen
Founder & CEO
OpenLink Software
Company Web: http://www.openlinksw.com
Personal Weblog: http://www.openlinksw.com/blog/~kidehen
Twitter/Identi.ca handle: @kidehen
Google+ Profile: https://plus.google.com/112399767740508618350/about
LinkedIn Profile: http://www.linkedin.com/in/kidehen

Received on Friday, 1 March 2013 16:10:55 UTC