Re: More on: Should information be merged from several RDF files? from Pat Hayes on 2014-07-10 (semantic-web@w3.org from July 2014)

From: Pat Hayes <phayes@ihmc.us>
Date: Thu, 10 Jul 2014 00:14:20 -0500
To: Victor Porton <porton@narod.ru>, Simon Spero <sesuncedu@gmail.com>
Cc: "semantic-web@w3.org" <semantic-web@w3.org>
Message-Id: <7CBC8045-8E01-4AD3-A223-928366C879A0@ihmc.us>
Just to clarify: there are no 'dark triples' in the current RDF specifications. This term was used in the internal RDF working group discussions back in 2002, and it referred to an idea to divide RDF triples into two categories: those used to make actual assertions (real data) and others, called informally 'dark triples', which could be used for other purposes, such as encoding the syntax of more elaborate notations. This idea was however rejected by the RDF WG (the decision was effectively made by Dan Connolly's objection http://lists.w3.org/Archives/Public/w3c-rdfcore-wg/2002Apr/0363.html to my proposal to include it informally in the RDF Semantics document) and was NOT part of the published 2004 RDF specification, and it has not been discussed since. Neither the RDF 1.0 nor the RDF 1.1 normative specifications support any such distinction between dark triples and other kinds of RDF triples.

Pat

On Jul 8, 2014, at 7:53 AM, Victor Porton <porton@narod.ru> wrote:

> As far as I understood Pat Hayes's post "Unasserted triples, Contexts and things that go bump in the night.", which triples are dark is specified in the RDF files themselves.
>  
> On the other hand, in my system, which triples are ignored is decided by my program code. It is different than dark triples.
>  
> Also, I haven't understood what dark triples are for.
>  
> 08.07.2014, 07:35, "Simon Spero" <sesuncedu@gmail.com>:
>> Dark triples?
>> http://lists.w3.org/Archives/Public/w3c-rdfcore-wg/2002Mar/0253.html
>> 
>> Simon
>> 
>> On Jul 7, 2014 11:05 PM, "Pat Hayes" <phayes@ihmc.us> wrote:
>> 
>> On Jul 7, 2014, at 4:17 PM, Victor Porton <porton@narod.ru> wrote:
>> 
>> > I am writing a program.
>> >
>> > I read RDF files while executing my program.
>> >
>> > After each RDF file is loaded, my program performs some actions (and possibly terminates).
>> >
>> > It is not predictable which RDF file will be loaded next, because between loading RDF files my program does some computations, and the next RDF file to load depends on these computations.
>> >
>> > As such, I cannot first load all RDF files and merge information in them. Instead of this I need to load RDF files one-by-one and update my program data structure after reading each RDF file.
>> >
>> > If I would read all RDF files at once I would be able just to merge data from all RDF files. But I cannot do that.
>> >
>> > Upon reading each RDF file, I update internal data structures of my program based on RDF triples loaded.
>> 
>> So far, nothing you have said tells us why you are using RDF for this application. RDF is intended for use in transmitting assertional information across the Web, analogously with how HTML is designed to transmit hypertext. Does your application have any relationship to this kind of use?
>> 
>> > I cannot base building these internal data structures of my program on the result of set-theoretic union of all RDF triples loaded till the moment. The reason for this is that loading an additional RDF may render my data inconsistent
>> 
>> Two points in response.
>> 
>> First, this notion of 'inconsistent' which you are using is not the RDF notion of consistency. You are therefore, apparently, using some kind of semantic extension of RDF. (See http://www.w3.org/TR/2014/REC-rdf11-mt-20140225/#semantic-extensions-and-entailment-regimes ) You might do well to try to describe this extension more precisely before proceeding. (The restriction you describe below is defined in the OWL semantic extension: it is the requirement that the predicate be a functional property.)
>> 
>> Second, it is of the essence of RDF and RDF extensions that they can express inconsistencies. RDF users should be prepared to deal with clashes or inconsistencies between data items and have strategies for dealing with them. These might range from simply throwing an error, to a sophisticated truth-maintenance system which finds maximally consistent subsets of RDF triples.
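
The simplest strategy named above, throwing an error, might be sketched as follows. This is only an illustration under stated assumptions: triples are modelled as plain Python tuples rather than with any RDF library, and `SINGLE_VALUED`, `ClashError`, `find_clashes` and `check` are hypothetical names that do not come from the thread or from any RDF specification. The "at most one value" rule is an application-level check, not something RDF itself enforces.

```python
from collections import defaultdict

# Hypothetical application rule: predicates allowed at most one value per subject.
SINGLE_VALUED = {"#property-which-can-have-only-one-value"}


class ClashError(Exception):
    """Raised when a single-valued predicate has several distinct objects."""


def find_clashes(triples, single_valued=SINGLE_VALUED):
    """Map (subject, predicate) -> set of objects, for clashing pairs only."""
    seen = defaultdict(set)
    for subject, predicate, obj in triples:
        if predicate in single_valued:
            seen[(subject, predicate)].add(obj)
    # Keep only the pairs that ended up with more than one distinct object.
    return {key: objs for key, objs in seen.items() if len(objs) > 1}


def check(triples, single_valued=SINGLE_VALUED):
    """Simplest strategy: raise an error if any clash exists."""
    clashes = find_clashes(triples, single_valued)
    if clashes:
        raise ClashError(f"conflicting values: {clashes}")
```

Calling `check` on the union of all triples loaded so far reports a clash without deciding how to resolve it; that decision stays with the application.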
>> 
>> > (if it has two or more different objects for a predicate which should have no more than one value, as in the example below). So this would require removing some data from my program's data structures, which would needlessly complicate the code. I want only to add new data structures, not remove them, to keep my program simple.
>> 
>> With respect, this is rather like saying that I want to avoid doing arithmetic, so I want all my sums to be correct without having to add them up. RDF simply carries data to your code: if that data is faulty or more complicated than you would prefer it to be, don't blame RDF or seek to find an RDF magic bullet.
>> 
>> > So the only remaining option is to load RDF one-by-one and construct new internal data structures of my program based only on the last loaded RDF file (not all loaded RDF files together).
>> 
>> You have decided to resolve contradictions by preferring the most-recently read data over 'older' data. This sounds like a possibly workable simplification, but I would not want to rely on it for anything important.
>> 
>> > A question remains:
>> >
>> > # file-1.rdf
>> > <http://example.com> <#property-which-can-have-only-one-value> 1 .
>> >
>> > # file-2.rdf
>> > <http://example.com> <#property-which-can-have-only-one-value> 2 .
>> >
>> > Suppose we first load file-1.rdf and then later file-2.rdf. Should the triple from file-2.rdf be ignored? Or should I construct a new data structure from the data of both files, as if the subject URLs in these files were different?
>> 
>> All of these are possible strategies for resolving conflicts. Nothing in RDF prefers one over the other. The choice is yours. Only someone who knows what your data means, and how it is created, would be able to make an intelligent decision here. There is no magic bullet.
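
The alternative strategies discussed in this thread can be written as interchangeable policies over an incrementally built store. The sketch below assumes triples are plain tuples and that the application, not RDF, declares which predicates are single-valued; `merge_file` and the policy names `keep_first`, `prefer_latest` and `keep_both` are hypothetical and chosen only for illustration.

```python
from typing import Dict, Iterable, List, Set, Tuple

Triple = Tuple[str, str, object]
Store = Dict[Tuple[str, str], List[object]]


def merge_file(store: Store, triples: Iterable[Triple],
               single_valued: Set[str], policy: str) -> Store:
    """Merge one file's triples into `store` under a conflict policy.

    keep_first    -- ignore a later conflicting value
    prefer_latest -- replace the older value with the newer one
    keep_both     -- record every value and let the caller sort it out
    """
    if policy not in ("keep_first", "prefer_latest", "keep_both"):
        raise ValueError(f"unknown policy: {policy}")
    for subject, predicate, obj in triples:
        values = store.setdefault((subject, predicate), [])
        if obj in values:
            continue  # exact duplicate, nothing to do
        if predicate not in single_valued or not values:
            values.append(obj)  # no conflict possible here
        elif policy == "prefer_latest":
            values[:] = [obj]
        elif policy == "keep_both":
            values.append(obj)
        # keep_first: leave the existing value untouched
    return store
```

With the two example files loaded in order, `keep_first` leaves the value 1, `prefer_latest` leaves 2, and `keep_both` records both. Only `keep_first` and `keep_both` avoid removing previously stored data; which one is appropriate depends on what the data means.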
>> 
>> Pat Hayes
>> 
>> >
>> > Here is my project, by the way:
>> > http://freesoft.portonvictor.org/namespaces.xml
>> >
>> > --
>> > Victor Porton - http://portonvictor.org
>> >
>> >
>> 
>> ------------------------------------------------------------
>> IHMC                                     (850)434 8903 home
>> 40 South Alcaniz St.            (850)202 4416   office
>> Pensacola                            (850)202 4440   fax
>> FL 32502                              (850)291 0667   mobile (preferred)
>> phayes@ihmc.us       http://www.ihmc.us/users/phayes
>> 
>  
>  
> --
> Victor Porton - http://portonvictor.org
>  

------------------------------------------------------------
IHMC                                     (850)434 8903 home
40 South Alcaniz St.            (850)202 4416   office
Pensacola                            (850)202 4440   fax
FL 32502                              (850)291 0667   mobile (preferred)
phayes@ihmc.us       http://www.ihmc.us/users/phayes
Received on Thursday, 10 July 2014 05:14:54 UTC