Re: "Microsoft Access" for RDF? from Kingsley Idehen on 2015-02-21 (public-lod@w3.org from February 2015)

From: Kingsley Idehen <kidehen@openlinksw.com>
Date: Sat, 21 Feb 2015 14:29:23 -0500
To: Pat Hayes <phayes@ihmc.us>
CC: public-lod@w3.org
Message-ID: <54E8DC93.6010908@openlinksw.com>
On 2/21/15 1:58 PM, Pat Hayes wrote:
> On Feb 21, 2015, at 12:01 PM, Kingsley Idehen <kidehen@openlinksw.com> wrote:
>
>> On 2/21/15 9:48 AM, Martynas Jusevičius wrote:
>>> On Fri, Feb 20, 2015 at 6:41 PM, Kingsley Idehen <kidehen@openlinksw.com> wrote:
>>>
>>>> On 2/20/15 12:04 PM, Martynas Jusevičius wrote:
>>>>
>>>>
>>>> Not to criticize, but to seek clarity:
>>>>
>>>> What does the term "resources" refer to, in your usage context?
>>>>
>>>> In a world of Relations (this is what RDF is about, fundamentally) it's hard
>>>> for me to understand what you mean by "grouped by resources". What is the
>>>> "resource" etc?
>>>>
>>> Well, RDF stands for "Resource Description Framework" after all, so
>>> I'll cite its spec:
>>> "RDF graphs are sets of subject-predicate-object triples, where the
>>> elements may be IRIs, blank nodes, or datatyped literals. They are
>>> used to express descriptions of resources."
>>>
>>> More to the point, RDF serializations often group triples by subject
>>> URI.
>>>
>> The claim "often group triples by subject"  isn't consistent with the nature of an RDF Relation [1].
> Sure it is.
>> "A predicate is a sentence-forming relation. Each tuple in the relation is a finite, ordered sequence of objects. The fact that a particular tuple is an element of a predicate is denoted by '(*predicate* arg_1 arg_2 .. arg_n)', where the arg_i are the objects so related. In the case of binary predicates, the fact can be read as `arg_1 is *predicate* arg_2' or `a *predicate* of arg_1 is arg_2'." [1]
>>
>> RDF's specs are consistent with what's described above,
> Indeed. All the RDF relations are binary, so this is overkill, but...
>
>> and inconsistent with the subject ordering claims you are making.
> Not in the least.

Pat,

There is no implicit ordering in RDF.

The claim "RDF serializations often group triples by subject URI" comes
close to implying a common-practice ordering, when no such ordering is
specified.
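
To make the "no implicit ordering" point concrete, here is a minimal
sketch in Python. The prefixed names (ex:alice, foaf:name, and so on)
are made-up illustrations, and the plain-tuple representation is an
assumption for the example, not any particular RDF library's API. Two
documents list the same statements in different orders, yet denote the
same set of 3-tuples.

```python
from typing import FrozenSet, List, Tuple

# A statement is a (subject, predicate, object) 3-tuple.
# All identifiers below are hypothetical examples.
Triple = Tuple[str, str, str]

# One document happens to list statements grouped by subject.
doc_a: List[Triple] = [
    ("ex:alice", "foaf:name", "Alice"),
    ("ex:alice", "foaf:knows", "ex:bob"),
    ("ex:bob", "foaf:name", "Bob"),
]

# Another document lists the same statements in a different order.
doc_b: List[Triple] = [
    ("ex:bob", "foaf:name", "Bob"),
    ("ex:alice", "foaf:knows", "ex:bob"),
    ("ex:alice", "foaf:name", "Alice"),
]


def as_statement_set(triples: List[Triple]) -> FrozenSet[Triple]:
    """Discard document order; only set membership is kept."""
    return frozenset(triples)


same_content = as_statement_set(doc_a) == as_statement_set(doc_b)
same_order = doc_a == doc_b
print(same_content, same_order)  # prints: True False
```

The listing order is a property of each document, not of the set of
statements it carries.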

An application, e.g., the kind that prompted this thread (i.e., an RDF
Editor), could decide to order RDF statements by Subject, but that choice
has implications. The very editor we will soon be releasing offers that
kind of ordering, but not as the sole option.

At the start of this RDF Editor thread, I tried to bring a specific 
metaphor (Book [RDF Source], Pages [Named Graphs], Paragraphs [RDF 
Statements grouped by Predicate], and Sentences [RDF Statements]) into 
scope so that we had a simple basis for understanding issues that arise 
in a multi-user editor -- where operation atomicity is important, 
without totally killing concurrency.
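
The metaphor can be sketched as a nested data structure. This is only
an illustration in Python: the quads, the graph names (g:people,
g:places) and the function name build_book are hypothetical, and a real
editor would sit on top of a store rather than an in-memory dict.

```python
from collections import defaultdict

# Hypothetical quads: (named_graph, subject, predicate, object).
quads = [
    ("g:people", "ex:alice", "foaf:name", "Alice"),
    ("g:people", "ex:bob", "foaf:name", "Bob"),
    ("g:people", "ex:alice", "foaf:knows", "ex:bob"),
    ("g:places", "ex:paris", "rdfs:label", "Paris"),
]


def build_book(quads):
    """Book (RDF source) -> pages (named graphs) ->
    paragraphs (statements grouped by predicate) -> sentences (statements)."""
    book = defaultdict(lambda: defaultdict(set))
    for graph, s, p, o in quads:
        book[graph][p].add((s, p, o))
    return book


book = build_book(quads)
pages = sorted(book)                              # named graphs
paragraphs_on_people = sorted(book["g:people"])   # predicates in one graph
name_sentences = book["g:people"]["foaf:name"]    # statements in one paragraph
print(pages, paragraphs_on_people, len(name_sentences))
```

Each level is a candidate unit for an edit operation: the smaller the
unit, the fewer concurrent edits collide.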

As we already know from RDBMS experience in general, an RDF Editor (a
client to a store) needs to be able to leverage optimistic concurrency,
which may not match the user's UI/UX interaction experience. That is,
the user sees one thing, but at data persistence time something else is
happening with regard to the atomic units that are compared against the
original RDF source prior to persistence.
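
A minimal sketch of that optimistic check, in Python. Everything here
is assumed for illustration: the in-memory Store class, the
ConflictError name, and the choice of "all statements sharing one
predicate" as the atomic unit (borrowed from the Paragraph level of the
metaphor). It is not a description of how any released editor works.

```python
class ConflictError(Exception):
    """Raised when the unit being edited changed since it was read."""


class Store:
    """Hypothetical in-memory stand-in for an RDF store."""

    def __init__(self, triples):
        self.triples = set(triples)

    def read_unit(self, predicate):
        # The atomic unit in this sketch: all statements with one predicate.
        return frozenset(t for t in self.triples if t[1] == predicate)

    def commit(self, predicate, original, updated):
        # Compare the unit as it is now with what the client originally read.
        current = self.read_unit(predicate)
        if current != original:
            raise ConflictError(predicate)
        self.triples = (self.triples - current) | set(updated)


store = Store([
    ("ex:alice", "foaf:name", "Alice"),
    ("ex:alice", "foaf:knows", "ex:bob"),
])

# Two editors read the same unit; nobody takes a lock.
seen_by_a = store.read_unit("foaf:name")
seen_by_b = store.read_unit("foaf:name")

# Editor A persists first and succeeds.
store.commit("foaf:name", seen_by_a, {("ex:alice", "foaf:name", "Alice A.")})

# Editor B's snapshot is now stale, so its write is rejected.
try:
    store.commit("foaf:name", seen_by_b, {("ex:alice", "foaf:name", "Alice B.")})
    b_committed = True
except ConflictError:
    b_committed = False

# An edit to a different unit is unaffected by the foaf:name change.
seen_knows = store.read_unit("foaf:knows")
store.commit("foaf:knows", seen_knows, set())
print(b_committed, sorted(store.triples))
```

The user may have been looking at a by-subject view the whole time; the
comparison at persistence time still happens per unit, which is why the
UI grouping and the concurrency unit need not coincide.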

RDF Editors are not a trivial matter, which is why after 14+ years we
are only beginning to attend to this issue as a matter of course with
regard to the broader Linked Open Data cloud.

> RDF triples can be organized in any way that suits the user, including by common subject if that is thought to be useful or intuitive.

Sets of RDF 3-tuples can be *ordered* in an app UI/UX as the developer
sees fit. Again, there's no golden rule (and that includes approaches
that produce unusable apps that struggle with integrity and concurrency
as data size and concurrent users increase).

RDF 3-tuples are organized in line with RDF syntax rules, assuming
"organize" refers to how an individual RDF 3-tuple (triple) is arranged,
hence the subject->predicate->object structure.

>   The RDF spec says nothing about how triples are to be ordered or organized.

I never said or implied it did. I simply referred to the nature of an 
RDF relation. Basically, what rdf:Property is about.

>   RDF/XML syntax assumes a by-subject organization and so provides abbreviations which only work for that.

You know this already, but I just have to reply to your comment (we have 
an audience): RDF/XML != RDF :)

I wasn't replying to a question about RDF/XML, and I don't believe
(circa 2015) that RDF datasets are typically in RDF/XML form. That's
not what I see these days, and I work with a lot of RDF data (not a 
secret to you or anyone else).


>
>> RDF statements (which represent relations) have sources such as documents which are accessible over a network and/or documents managed by some RDBMS, e.g., Named Graphs in the case of a SPARQL-compliant RDBMS.
>>
>> In RDF you are always working with a set of tuples (s,p,o 3-tuples specifically)
> yes, but
>
>> grouped by predicate .
> Not necessarily. They can be grouped any way you like.

Yes, they can be grouped in an application however the developer
chooses, but that isn't what I was refuting or even concerned about.

My real concern boils down to an RDF Editor that can work against big or
large RDF data sources without ignoring the fundamental issues that arise
in multi-user situations, basically integrity and concurrency, controlled
on the client side, when dealing with the addition, update, and removal
of RDF statements for an RDF source.

As you will see, when we release our RDF Editor to the public, it 
addresses the issues above. It also presents a UI/UX where the user can 
interact with RDF data ordered by:

1. Subject
2. Predicate
3. Statement.
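
The three views can be sketched as different presentations of one
unordered set. This is a hypothetical Python illustration (the data and
the helper name grouped are invented for the example); it shows only
the presentation side, not persistence.

```python
from itertools import groupby
from operator import itemgetter

# One unordered set of (subject, predicate, object) statements.
# All identifiers are made-up examples.
triples = {
    ("ex:bob", "foaf:name", "Bob"),
    ("ex:alice", "foaf:name", "Alice"),
    ("ex:alice", "foaf:knows", "ex:bob"),
}


def grouped(triples, position):
    """Group statements by the term at `position` (0 = subject, 1 = predicate)."""
    ordered = sorted(triples, key=lambda t: (t[position], t))
    return {k: list(g) for k, g in groupby(ordered, key=itemgetter(position))}


by_subject = grouped(triples, 0)
by_predicate = grouped(triples, 1)
by_statement = sorted(triples)  # flat view, one statement per row

for label, view in (("subject", by_subject), ("predicate", by_predicate)):
    for key, rows in view.items():
        print(label, key, rows)
```

All three views are derived from the same set, so switching between
them changes what the user sees without changing the data.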

Regards,

Kingsley
>
> Pat
>
>
>> Also note, I never used the phrase "RDF Graph" in any of the sentences above, and deliberately so, because that overloaded phrase is yet another source of unnecessary confusion.
>>
>> Links:
>>
>> [1] http://54.183.42.206:8080/sigma/Browse.jsp?lang=EnglishLanguage&flang=SUO-KIF&kb=SUMO&term=Predicate
>>
>> Kingsley
>>>>>    Within a resource block, properties are sorted
>>>>> alphabetically by their rdfs:labels retrieved from respective
>>>>> vocabularies.
>>>>>
>>>> How do you handle the integrity of multi-user updates, without killing
>>>> concurrency, using this method of grouping (which in and of itself is unclear
>>>> due to the use of the "resources" term)?
>>>>
>>>> How do you minimize the user interaction space i.e., reduce clutter --
>>>> especially if you have a lot of relations in scope or the possibility that
>>>> such becomes the reality over time?
>>>>
>>>>
>>> I don't think concurrent updates are related to "resources" or specific
>>> to our editor. The Linked Data platform (whatever it is) and its HTTP
>>> logic has to deal with ETags and 409 Conflict etc.
>>>
>>> I was wondering if this logic should be part of specifications such as
>>> the Graph Store Protocol:
>>>
>>> https://twitter.com/pumba_lt/status/545206095783145472
>>>
>>> But I haven't an answer. Maybe it's an oversight on the W3C side?
>>>
>>> We scope the description edited either by a) SPARQL query or b) named
>>> graph content.
>>>
>>>
>>>> Kingsley
>>>>
>>>>
>>>>> On Fri, Feb 20, 2015 at 4:59 PM, Michael Brunnbauer <brunni@netestate.de> wrote:
>>>>>
>>>>>> Hello Martynas,
>>>>>>
>>>>>> sorry! You mean this one?
>>>>>>
>>>>>>
>>>>>>
>>>>>> http://linkeddatahub.com/ldh?mode=http%3A%2F%2Fgraphity.org%2Fgc%23EditMode
>>>>>>
>>>>>>
>>>>>> Nice! Looks like a template but you still may have the triple object
>>>>>> ordering problem. Do you? If yes, how did you address it?
>>>>>>
>>>>>> Regards,
>>>>>>
>>>>>> Michael Brunnbauer
>>>>>>
>>>>>> On Fri, Feb 20, 2015 at 04:23:14PM +0100, Martynas Jusevičius wrote:
>>>>>>
>>>>>>> I find it funny that people on this list and semweb lists in general
>>>>>>> like discussing abstractions, ideas, desires, prejudices etc.
>>>>>>>
>>>>>>> However when a concrete example is shown, which solves the issue
>>>>>>> discussed or at least comes close to that, it receives no response.
>>>>>>>
>>>>>>> So please continue discussing the ideal RDF environment and its
>>>>>>> potential problems while we continue improving our editor for users
>>>>>>> who manage RDF already now.
>>>>>>>
>>>>>>> Have a nice weekend everyone!
>>>>>>>
>>>>>>> On Fri, Feb 20, 2015 at 4:09 PM, Paul Houle <ontology2@gmail.com> wrote:
>>>>>>>
>>>>>>>> So some thoughts here.
>>>>>>>>
>>>>>>>> OWL, so far as inference is concerned, is a failure and it is time to
>>>>>>>> move on. It is like RDF/XML.
>>>>>>>>
>>>>>>>> As a way of documenting types and properties it is tolerable. If I
>>>>>>>> write down something in production rules I can generally explain to an
>>>>>>>> "average joe" what they mean. If I try to use OWL it is easy for a few
>>>>>>>> things, hard for a few things, then there are a few things Kendall Clark
>>>>>>>> can do, and then there is a lot you just can't do.
>>>>>>>>
>>>>>>>> On paper OWL has good scaling properties but in practice production
>>>>>>>> rules win because you can infer the things you care about and not have to
>>>>>>>> generate the large number of trivial or otherwise uninteresting
>>>>>>>> conclusions you get from OWL.
>>>>>>>>
>>>>>>>> As a data integration language OWL points in an interesting direction
>>>>>>>> but it is insufficient in a number of ways. For instance, it can't convert
>>>>>>>> data types (canonicalize <mailto:joe@example.com> and "joe@example.com"),
>>>>>>>> deal with trash dates (have you ever seen an enterprise system that didn't
>>>>>>>> have trash dates?) or convert units. It also can't reject facts that don't
>>>>>>>> matter, and as far as both time & space and accuracy go, things are much
>>>>>>>> easier if you can cook things down to the smallest correct database.
>>>>>>>>
>>>>>>>> ----
>>>>>>>>
>>>>>>>> The other one is that as Kingsley points out, the ordered collections
>>>>>>>> do need some real work to square the circle between the abstract graph
>>>>>>>> representation and things that are actually practical.
>>>>>>>>
>>>>>>>> I am building an app right now where I call an API and get back chunks
>>>>>>>> of JSON which I cache, and the primary scenario is that I look them up by
>>>>>>>> primary key and get back something with a 1:1 correspondence to what I
>>>>>>>> got. Being able to do other kinds of queries and such is sugar on top, but
>>>>>>>> being able to reconstruct an original record, ordered collections and all,
>>>>>>>> is an absolute requirement.
>>>>>>>>
>>>>>>>> So far my infovore framework based on Hadoop has avoided collections,
>>>>>>>> containers and all that because these are not used in DBpedia and
>>>>>>>> Freebase, at least not in the A-Box. The simple representation that each
>>>>>>>> triple is a record does not work so well in this case because if I just
>>>>>>>> turn blank nodes into UUIDs and spray them across the cluster, the act of
>>>>>>>> reconstituting a container would require an unbounded number of passes,
>>>>>>>> which is no fun at all with Hadoop. (At first I thought the # of passes
>>>>>>>> was the same as the length of the largest collection but now that I think
>>>>>>>> about it I think I can do better than that.) I don't feel so bad about
>>>>>>>> most recursive structures because I don't think they will get that deep
>>>>>>>> but I think LISP-Lists are evil, at least when it comes to external memory
>>>>>>>> and modern memory hierarchies.
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>> --
>>>>>> ++  Michael Brunnbauer
>>>>>> ++  netEstate GmbH
>>>>>> ++  Geisenhausener Straße 11a
>>>>>> ++  81379 München
>>>>>> ++  Tel +49 89 32 19 77 80
>>>>>> ++  Fax +49 89 32 19 77 89
>>>>>> ++  E-Mail brunni@netestate.de
>>>>>> ++  http://www.netestate.de/
>>>>>> ++
>>>>>> ++  Sitz: München, HRB Nr.142452 (Handelsregister B München)
>>>>>> ++  USt-IdNr. DE221033342
>>>>>> ++  Geschäftsführer: Michael Brunnbauer, Franz Brunnbauer
>>>>>> ++  Prokurist: Dipl. Kfm. (Univ.) Markus Hendel
>>>>>>
>>>>>
>>>>>
>>>> --
>>>> Regards,
>>>>
>>>> Kingsley Idehen
>>>> Founder & CEO
>>>> OpenLink Software
>>>> Company Web:
>>>> http://www.openlinksw.com
>>>>
>>>> Personal Weblog 1:
>>>> http://kidehen.blogspot.com
>>>>
>>>> Personal Weblog 2:
>>>> http://www.openlinksw.com/blog/~kidehen
>>>>
>>>> Twitter Profile:
>>>> https://twitter.com/kidehen
>>>>
>>>> Google+ Profile:
>>>> https://plus.google.com/+KingsleyIdehen/about
>>>>
>>>> LinkedIn Profile:
>>>> http://www.linkedin.com/in/kidehen
>>>>
>>>> Personal WebID:
>>>> http://kingsley.idehen.net/dataspace/person/kidehen#this
>>>>
>>>>
>>>>
>>>>
>>
> ------------------------------------------------------------
> IHMC                                     (850)434 8903 home
> 40 South Alcaniz St.            (850)202 4416   office
> Pensacola                            (850)202 4440   fax
> FL 32502                              (850)291 0667   mobile (preferred)
> phayes@ihmc.us       http://www.ihmc.us/users/phayes


Received on Saturday, 21 February 2015 19:29:47 UTC