Re: Signal for semantic extensions from Richard Cyganiak on 2013-05-18 (public-rdf-comments@w3.org from May 2013)

From: Richard Cyganiak <richard@cyganiak.de>
Date: Sat, 18 May 2013 17:56:02 +0100
To: David Booth <david@dbooth.org>
Cc: public-rdf-comments <public-rdf-comments@w3.org>
Message-Id: <99CC2FD2-A647-4AE3-AC08-D23D33C23235@cyganiak.de>
David,

On 18 May 2013, at 15:56, David Booth <david@dbooth.org> wrote:

> Hi Richard,
> 
> Yes, if RDF did not define a standard rdf:requires (or similar) signal to explicitly indicate the semantic extension, then to be safe a client would have to assume that *any* unrecognized class or property *might* signal the need for additional entailments.  

But almost any such term *does* introduce additional entailments. If the class is defined as having a superclass, or the property has a defined domain and range, then you need their definitions to get all entailments.
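
To make that concrete, here is a minimal sketch, not a real reasoner: triples are plain Python tuples, only the RDFS domain, range and subclass rules are applied, and every ex: name is hypothetical. It illustrates that a client without the vocabulary's definitions derives nothing extra, while a client that has them derives three more triples.

```python
# Minimal sketch: triples are plain (subject, predicate, object) tuples.
# The ex: names below are hypothetical, invented for this illustration.
TYPE = "rdf:type"
SUBCLASS = "rdfs:subClassOf"
DOMAIN = "rdfs:domain"
RANGE = "rdfs:range"


def rdfs_closure(triples):
    """Apply the domain, range and subclass rules until nothing new appears."""
    closure = set(triples)
    while True:
        new = set()
        for s, p, o in closure:
            for s2, p2, o2 in closure:
                # rdfs2: (p domain C), (s p o) => (s type C)
                if p2 == DOMAIN and s2 == p:
                    new.add((s, TYPE, o2))
                # rdfs3: (p range C), (s p o) => (o type C)
                if p2 == RANGE and s2 == p:
                    new.add((o, TYPE, o2))
                # rdfs9: (C subClassOf D), (s type C) => (s type D)
                if p == TYPE and p2 == SUBCLASS and s2 == o:
                    new.add((s, TYPE, o2))
        if new <= closure:
            return closure
        closure |= new


# The data a client receives.
data = {("ex:alice", "ex:teaches", "ex:logic101")}

# The definitions of the terms used, which the client may or may not have.
vocabulary = {
    ("ex:teaches", DOMAIN, "ex:Teacher"),
    ("ex:teaches", RANGE, "ex:Course"),
    ("ex:Teacher", SUBCLASS, "ex:Person"),
}

inferred_without = rdfs_closure(data) - data
inferred_with = rdfs_closure(data | vocabulary) - data - vocabulary
```

Here `inferred_without` is empty, while `inferred_with` holds the three type triples for ex:alice and ex:logic101.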

> But that could result in an awful lot of false positives, which the client would have no automated way of distinguishing from true positives.

Can you give an example where this would result in a false positive, and explain why you think such cases would be frequent?

> I should clarify that when I talk about entailments, I'm primarily thinking of entailments that can be expressed in RDF as added triples.

Well, but anything can be expressed in RDF as triples, given appropriate vocabularies.

> If we assume that a Linked Data approach is used, then there is another way that this problem can be addressed.  Suppose every unrecognized class or property URI is followed to obtain its definition, expressed also in RDF.  If the definition itself is required to define all of the entailment rules that the client would need for that semantic extension -- i.e., the semantic extension implied by that class or predicate -- then the client could be assured that if it had obtained definitions for all unrecognized classes and predicates then it would be able to determine all entailments.  

Hm.

You assume that it is desirable that all parties in the communication have a shared sense of “all” or “complete entailments”. I don't think that's the case, in general. Enabling communication despite partial understanding is a key feature of RDF. It's common for a server to publish data in a single document using a number of different vocabularies, because different clients understand different parts, and will generally ignore what they don't understand. As a publisher, I also assume that some clients will only understand parts of my data (e.g., generic SKOS) and I include bits in XKOS anyway for those clients that understand it. I also assume that some clients may draw conclusions from my data that I am completely unaware of, based on domain knowledge or based on combining it with additional data.

There are several aspects in the design of RDF that enable this. Most importantly, dropping some triples doesn't make any other triples false (although it may of course make the message unintelligible). Neither does adding additional triples. And semantic extensions are supposed to be designed in a way that they might produce additional triples, but not invalidate other triples.
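
This monotonicity can be illustrated with a small sketch. It assumes triples as plain tuples and a toy closure that applies only the subclass rule; all ex: names are hypothetical. A client that sees fewer triples draws fewer conclusions, but never a conclusion that the client with the full graph would reject.

```python
def subclass_closure(triples):
    """Toy entailment: (s type C) and (C subClassOf D) give (s type D)."""
    closure = set(triples)
    changed = True
    while changed:
        changed = False
        for s, p, o in list(closure):
            if p != "rdf:type":
                continue
            for c, p2, d in list(closure):
                derived = (s, "rdf:type", d)
                if p2 == "rdfs:subClassOf" and c == o and derived not in closure:
                    closure.add(derived)
                    changed = True
    return closure


# Hypothetical example graph.
full = {
    ("ex:rex", "rdf:type", "ex:Dog"),
    ("ex:Dog", "rdfs:subClassOf", "ex:Mammal"),
    ("ex:Mammal", "rdfs:subClassOf", "ex:Animal"),
}

# A client that drops (or does not understand) one of the triples.
partial = full - {("ex:Mammal", "rdfs:subClassOf", "ex:Animal")}

partial_view = subclass_closure(partial)
full_view = subclass_closure(full)

# Everything the partial client concludes also holds for the full client.
consistent = partial_view <= full_view
# The full client simply knows more.
extra = full_view - partial_view
```

In this sketch `extra` contains the dropped subclass triple and the one conclusion that depends on it, namely that ex:rex is an ex:Animal.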

In short, I don't think that we should be designing for the notion that there is a unique and complete full interpretation of a given graph. No matter what we say in any spec, and no matter what you say is required to fully understand your data, different consumers with different capabilities will draw different conclusions from your data.

Best,
Richard 


> But if the client was unable to obtain a definition for some class or predicate, then this would indicate that it may be missing some entailment rules -- with no need for rdf:requires -- and the client could notify the user.  However, for this approach to work, we would have to adopt a convention that says "if a class or predicate definition is supplied (in RDF), then it must either: (a) supply (in RDF) all of its associated entailment rules (directly or indirectly); or (b) use a class or predicate that has no RDF definition".  In other words, the lack of an available RDF definition would signal potentially missing entailments.  This means that if a URI owner wanted to provide only a partial definition in RDF of a class or predicate, then the URI owner would have to be sure that the definition also references a class or predicate that has no RDF definition, as a way to signal the existence of additional entailment rules.  This would work and might be the cleanest architectural design, but it is slightly more implicit than using an rdf:requires predicate.
> 
> David
> 
> 
> On 05/18/2013 05:01 AM, Richard Cyganiak wrote:
>> David,
>> 
>> (Unofficial response to ask for clarification)
>> 
>> Given that any RDF vocabulary is a semantic extension, isn't the
>> answer here simply that if a client sees a class or property IRI that
>> it doesn't know, then it must assume that additional inferences are
>> possible?
>> 
>> Richard
>> 
>> 
>> On 18 May 2013, at 04:05, David Booth <david@dbooth.org> wrote:
>> 
>>> This comment raises an issue that is somewhat theoretical at
>>> present.  I mentioned it over a year ago (message below) but have
>>> not seen any discussion about it.  I have not seen it be a problem
>>> in practice yet, so I do not think it is urgent for the working
>>> group to address.  But if RDF gains popularity over the coming
>>> years, and more semantic extensions are introduced, it could become
>>> a practical consideration, given the long time span between RDF
>>> versions.
>>> 
>>> At present there is no standard way in RDF to unambiguously signal
>>> the expectation of a particular semantic extension.  I'll explain
>>> further what I mean, and make a specific proposal.  Perhaps others
>>> will think of a better way to solve the problem, but hopefully this
>>> will at least explain what it is.
>>> 
>>> Suppose an RDF consumer receives a graph written by an RDF author
>>> and (roughly speaking) the RDF consumer wants to be able to fully
>>> "understand the author's intended meaning" of that graph.  More
>>> precisely, the RDF author has used certain semantic extensions that
>>> imply certain entailments, and wishes to allow consumers of that
>>> graph to be able to automatically (by machine) determine these
>>> entailments. In turn, the RDF consumer wishes to be able to compute
>>> all of those entailments.  Note that this is *not* suggesting that
>>> the RDF consumer be *required* to compute the RDF author's intended
>>> entailments.  It is only about *enabling* the RDF consumer to do so
>>> if desired.
>>> 
>>> For semantic extensions that are well known, such as OWL, the RDF
>>> consumer can detect the presence of well known URIs (such as OWL
>>> predicates) to know that those well known semantic extensions are
>>> intended.  But for semantic extensions that are *not* well know --
>>> non-standard semantic extensions -- the RDF consumer has no
>>> standard automatable way to know that certain URIs are intended to
>>> signal the use of particular semantic extensions.  Thus, the RDF
>>> consumer has no standard way of determining whether or not
>>> he/she/it has computed all of the entailments that the RDF author
>>> intended to convey.
>>> 
>>> When the RDF consumer processes an RDF graph, the processor should
>>> be able to clearly indicate to the user either: "I have computed
>>> all of the author's intended entailments" or "I cannot compute all
>>> of the author's intended entailments because I do not have the
>>> module for semantic extension
>>> 'http://example/BobsFavoriteExtension'.  Please load it and try
>>> again."  But this is only possible if the RDF author has an
>>> unambiguous standard way to signal the intended semantic
>>> extensions.
>>> 
>>> The motivation for this use case is to enable the vision of the
>>> semantic web to work, even in the presence of new semantic
>>> extensions.  This means that: (a) the RDF consumer cannot be
>>> expected to have any other communication with the RDF author (other
>>> than obtaining the graph that the author had provided); and (b) the
>>> RDF consumer must be able to perform these steps automatically (by
>>> machine).
>>> 
>>> I suggest the RDF working group define a standard predicate
>>> rdf:requires (or whatever name the group chooses) that an RDF
>>> author can use to indicate that a particular semantic extension is
>>> intended.  It could be used like this:
>>> 
>>> <> rdf:requires <http://example/BobsFavoriteExtension> .
>>> 
>>> which would indicate that the current document uses semantic
>>> extension <http://example/BobsFavoriteExtension> .  Hence, to be
>>> assured of determining all of the document author's intended
>>> entailments, the RDF processor must understand that semantic
>>> extension.
>>> 
>>> Furthermore, for backward compatibility with OWL, it would be good
>>> to define:
>>> 
>>> owl:imports rdfs:subPropertyOf rdf:semanticExtension .
>>> 
>>> and recommend that RDF processors also recognize owl:imports as
>>> signaling a semantic extension.
>>> 
>>> Again, since I have not yet seen this issue arise in practice, I
>>> would consider it a low priority to fix, and would not mind if the
>>> working group decides to defer it to a future RDF version.  On the
>>> other hand, it is a very easy gap to fix.
>>> 
>>> Thanks, David
>>> 
>>> 
>>> On 03/30/2012 06:18 PM, David Booth wrote:
>>>> -------- Forwarded Message --------
>>>> From: David Booth <david@dbooth.org>
>>>> To: Pat Hayes <phayes@ihmc.us>
>>>> Cc: Jonathan A Rees <rees@mumble.net>, Jeni Tennison <jeni@jenitennison.com>, www-tag@w3.org List <www-tag@w3.org>
>>>> Subject: Re: The TAG Member's Guide to ISSUE-57 Discussion - F2F reading
>>>> Date: Fri, 30 Mar 2012 18:17:06 -0400
>>>> 
>>>> Hi Pat,
>>>> 
>>>> On Wed, 2012-03-28 at 14:24 -0500, Pat Hayes wrote:
>>>>> FWIW, I am willing to work actively (on- or off-list) with
>>>>> anyone who wants to try reconciling any proposal with the RDF
>>>>> semantics, or just to explore any semantic issues. This is
>>>>> particularly timely as the RDF2 WG is right now debating issues
>>>>> which impinge on the RDF semantics framework, so it would be
>>>>> good to get any pending issues or problems out into the open.
>>>> 
>>>> I would suggest that the RDF WG look at Part 3 "Determining
>>>> Resource Identity" of "Resource Identity and Semantic Extensions:
>>>> Making Sense of Ambiguity":
>>>> http://dbooth.org/2010/ambiguity/paper.html#part3
>>>> That section proposes a standard process for determining resource identity.
>>>> As far as I know, I did not invent this process.  I simply
>>>> documented what seemed to be the general ideas floating around.
>>>> 
>>>> However, I did identify one specific gap in the RDF specs: [[ At
>>>> present there is a minor gap in the RDF standards, in that there
>>>> is no standard way for an RDF processor to recognize that a
>>>> particular URI is intended to signal an opaque semantic
>>>> extension: the knowledge of which URIs are intended to signal
>>>> opaque semantic extensions must be externally supplied to the RDF
>>>> processor.  The RDF processor must magically know about them in
>>>> advance.  It cannot alert the user to the need for a new opaque
>>>> semantic extension that was previously unknown. This gap could be
>>>> addressed by defining a standard predicate, such as
>>>> rdf2:requires, to explicitly indicate when a particular semantic
>>>> extension is required.  However, since it currently seems
>>>> unlikely that many semantic extensions will be needed that cannot
>>>> be defined using standard inference rules, this does not seem
>>>> like a major gap. ]]
>>>> 
>>>> I will forward this message separately to the RDF comments list,
>>>> since I cannot post to the regular RDF list.
>
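
The rdf:requires proposal quoted in this thread can be sketched as follows. This is only an illustration under stated assumptions: rdf:requires is the predicate proposed in the thread and is not a defined RDF term, triples are plain tuples, and the function name `missing_extensions` is hypothetical.

```python
# "rdf:requires" is the predicate proposed in the thread, not a defined RDF term.
REQUIRES = "rdf:requires"


def missing_extensions(triples, supported):
    """Return the required semantic extensions the processor does not support."""
    required = {o for s, p, o in triples if p == REQUIRES}
    return sorted(required - set(supported))


# The example statement from the proposal, with "<>" standing for the document.
graph = [
    ("<>", REQUIRES, "http://example/BobsFavoriteExtension"),
]

# A processor with no extension modules loaded.
missing = missing_extensions(graph, supported=[])

if missing:
    message = "Cannot compute all intended entailments; missing: " + ", ".join(missing)
else:
    message = "All signalled semantic extensions are supported."
```

A processor that has the module for the listed extension would pass its URI in `supported` and get an empty list back.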
Received on Saturday, 18 May 2013 16:56:26 UTC