Closure? (was Re: Request for two new media types submitted)

Eric, Ivan, all,

I've been looking at the semantics issue again and have discussed it
further with Stasinos. This is my best shot at finding a resolution - by
arguing that what we have now is correct.

My understanding is that the debate is between:

1. Applying the extension at the RDF layer
2. Applying the extension at the application layer.

In POWDER's case the application is a reasoner and/or query engine:
POWDER documents assign metadata to sets of resources circumscribed by
IRI patterns. In semantic terms, POWDER documents assert that classes
defined by IRI patterns are rdfs:subClassOf classes defined by
owl:Restrictions.

As currently documented and implemented, the WG took the advice of
Jeremy Carroll and, led by Stasinos in this area, followed the first
option. This does not mean that we have extended RDF core outside the
context of POWDER. RDF Semantics remain unchanged (so we're within our
charter, which states: "This working group is not chartered to make
extensions to RDF core, neither is it chartered to research the broader
development of the Semantic Web." [1]). Furthermore, we state at the very
end of section 4.3 of the formal semantics doc:

"Software can distinguish those RDF graphs to which the extended
semantics apply by testing for the appearance of either the
wdrs:matchesregex or the wdrs:notmatchesregex resource as the object of
a triple. For instance, in Example 4-4 the following class description
suffices to recognize a document that uses the semantic extension:" [2]

Therefore we provide a clear and simple means for syntactically
recognizing RDF graphs that need the POWDER extension to be fully
understood.
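
To make that test concrete, here is a minimal sketch in Python. It
assumes triples are plain (subject, predicate, object) tuples of
strings; the function name and the sample triples are illustrative
only, and the namespace is the powder-s one that appears in the RDF/XML
quoted further down this thread.

```python
# Assumed representation: a graph is an iterable of (s, p, o) string tuples.
WDRS = "http://www.w3.org/2007/05/powder-s#"
OWL_ON_PROPERTY = "http://www.w3.org/2002/07/owl#onProperty"

# The two resources whose appearance as a triple's object signals that
# the POWDER semantic extension applies.
EXTENSION_MARKERS = {WDRS + "matchesregex", WDRS + "notmatchesregex"}


def uses_powder_extension(triples):
    """Return True if any triple has one of the marker resources as its object."""
    return any(o in EXTENSION_MARKERS for (_s, _p, o) in triples)


# Hypothetical sample data, not taken from any POWDER document.
plain = [("http://example.org/a", "http://purl.org/dc/terms/title", "A page")]
extended = plain + [("_:r1", OWL_ON_PROPERTY, WDRS + "matchesregex")]

print(uses_powder_extension(plain))
print(uses_powder_extension(extended))
```

The check is purely syntactic: it never evaluates a regular expression,
it only looks for the two property IRIs in object position.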

Furthermore, the POWDER extension monotonically adds meaning to RDF
semantics, as no RDF vocabulary is affected.

As a test of this, Stasinos created the SemPP engine using the
TransOnto library [3]. This uses Jena and the Pellet Reasoner to process
POWDER-S documents. The key implementation detail is that the Jena API
call through which a resource is added to the graph was overridden so
that the matchesregex triples appear in the graph. As a result, the DL
reasoner is unaffected, SPARQL queries are unaffected and, of course,
other RDF data is unaffected.
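
The idea can be sketched roughly as follows. This is not SemPP's code
(which is Java on top of Jena); it is a toy Python graph, with a
hypothetical class name and sample pattern, that adds a matchesregex
triple whenever a newly seen subject IRI matches a known pattern. As a
simplification it only inspects subjects, and it uses an unanchored
search, which may not be exactly what the formal semantics prescribe.

```python
import re

WDRS_MATCHESREGEX = "http://www.w3.org/2007/05/powder-s#matchesregex"


class RegexAwareGraph:
    """Toy triple store whose add() also materializes matchesregex triples."""

    def __init__(self, patterns):
        self.patterns = list(patterns)  # regex source strings to test against
        self.triples = set()
        self._seen = set()              # resources already tested

    def add(self, s, p, o):
        self.triples.add((s, p, o))
        if s not in self._seen:
            self._seen.add(s)
            for pattern in self.patterns:
                if re.search(pattern, s):
                    # The layer above (an OWL reasoner, say) just sees an
                    # ordinary triple and needs no regex support itself.
                    self.triples.add((s, WDRS_MATCHESREGEX, pattern))


# Hypothetical pattern and resources.
g = RegexAwareGraph([r"example\.org"])
g.add("http://example.org/page", "http://purl.org/dc/terms/title", "Home")
g.add("http://other.test/page", "http://purl.org/dc/terms/title", "Other")
```

Everything downstream then works with the graph as ordinary RDF, which
is the point being made about Pellet and SPARQL above.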

Now, AIUI Eric's contention is that this is the wrong approach and that
a better way is to work at the application (OWL inference) layer. In this
scenario, existing DL reasoners would not be useful; we'd need more
specialized software. We are not averse to specialized software (we have
defined and built two interoperable POWDER Processors that return RDF
descriptions of input URIs) but we are also informed by experience.

In 2004 - that long ago - many of the folk involved with POWDER now came
up with a thing called RDF Content Labels [4]. It looks very much like
POWDER, in that it has attribution, a means of putting labels in order
and so on - all done with what looks superficially like RDF. Dan Bri
often told me that RDF-CL is OK as long as what consumes it knows what
to do with it. A general RDF tool kit certainly wouldn't make any sense
of it. RDF-CL is a vocabulary defined to do a particular job, but is not
a good citizen of the semantic web.

Therefore, I would argue that we have in effect tried something very
much like what Eric is suggesting. Indeed, that was what we were working
towards right up until TPAC 2007 when I was going round asking anyone I
could grab hold of how we solved the semantics issue. Take a look at the
editor's note just above [5] where the question is laid out. This was
the version of our main Description Resources doc we took to that TPAC
meeting, fully expecting to be at CR by Christmas that year. Oh if only.
Two people I asked went for the first option, two others for the second
(from memory those 4 were you, Eric, Dan Bri, Fabien Gandon and Max 
Froumentin).

It was Tim who, given a choice of A or B, said "C, none of these" and
told us we should be using OWL classes, JJC who showed us how (with the
semantic extension) and Stas who's proved that it works with minimal
code. It feels to me as if one reading of Eric's proposal would be to
revisit a version of that original discussion. You'll understand my
reluctance to do so.

The reason it has taken us so long to get from there to where we are now
is precisely because we've been trying to fit the square peg of matching
URIs against patterns into the round hole of RDF with a minimum of 
geometric distortion. I genuinely believe we have achieved that in the 
current documentation and implementation.

A new OWL datatype property is easier to document but it pushes POWDER
into a silo where all software is specialist. The aim has always been to
devise a means whereby a lot of triples that describe Web resources can
be generated easily and processed as far as possible by existing
software - hence the use of a barely-adapted Jena and wholly unchanged
Pellet in SemPP.

I hope I've understood both sides of the argument correctly?

Taking all this into account, I am strongly inclined to leave the
document as is when seeking the transition to PR.

Cheers

Phil.


[1] http://www.w3.org/2007/02/powder_charter
[2] Just above
http://www.w3.org/2007/powder/Group/powder-formal/20090205.html#emptyIRIsets
[3] http://transonto.sourceforge.net/
[4] http://www.w3.org/2004/12/q/doc/content-labels-schema.htm
[5] http://www.w3.org/2007/powder/Group/powder-dr/20071102.html#basicQueries

-- 

Phil Archer
http://philarcher.org/

i-sieve technologies                |      W3C Mobile Web Initiative
Making Sense of the Buzz            |      www.w3.org/Mobile



Eric Prud'hommeaux wrote:
> * Stasinos Konstantopoulos <konstant@iit.demokritos.gr> [2008-12-23 10:00+0200]
>> On Mon Dec 22 21:26:34 2008 Eric Prud'hommeaux said:
>>
>>> * Stasinos Konstantopoulos <konstant@iit.demokritos.gr> [2008-12-21 07:14+0200]
>>>> On Dec 20, 2008, at 6:38 PM, Eric Prud'hommeaux wrote:
>>>>
>>>>> I'm not sure which of the following you are arguing:
>>>>>  1 "Extends RDF" could not be interpreted as "extends the RDF data  
>>>>> model"
>>>>>  2 my proposed clarification is incorrect
>>>>>  3 my proposed clarification is not an improvement
>>>> #2
>>> OK, here is the proposed wording:
>>>
>>> "POWDER-S uses an <a href=
>>> "http://www.w3.org/TR/2004/REC-owl-semantics-20040210/syntax.html#owl_DatatypeProperty_syntax"
>>>> OWL DatatypeProperty</a> to relate a resource to a regular expression
>>> which that resource matches. While POWDER-S uses OWL classes to group
>>> resources, any engine determining if a resource belonged in one of
>>> these OWL classes would need to be able to test a resource against a
>>> regular expression."
>>>
>>> What are you arguing is incorrect?
>> this bit here:
>>
>>> "any engine determining if a resource belonged in one of these OWL
>>> classes would need to be able to test a resource against a regular
>>> expression."
>> Although that is one possible way to go about implementing a POWDER-S
>> processor, it is not the only one, so it is not the case that "any
>> engine ... would need".
>>
>> One counter-example to the universal quantification in your wording
>> is the SemPP processor (http://transonto.sourceforge.net/).
>> In this approach the "engine determining if a resource belonged in one
>> of these OWL classes" (which in SemPP's case is vanilla Pellet) knows
>> nothing about regexps; the regexp matching is done at the RDF
>> layer where nothing is known about OWL classes or any other OWL
>> vocabulary.
> 
> I understand your point that the implementation seems like it is doing
> matching resources against regex patterns at the core level. I argue
> that the programmatic boundaries don't coincide with the logical
> boundaries, and that the behavior is best described as a semantic
> extension. To wit, the POWDER Formal Semantics [PFS] asserts that
> [[
> <x, reg> is in IEXT(I(wdrs:matchesregex)) if and only if:
> 
>     * reg conforms with regular expression syntax, AND 
> ...
> ]]
> all of which is in the language set aside in RDF Semantics [RS] for
> semantic extensions (in fact, everything in the semext class in the
> document appears to be just that, a semantic extension). It's not that
> you *couldn't* define it as an extension to RDF core, it's just that
> it would be painful, and the behavior of two such extensions would not
> be defined.
> 
>>>> Machinery further up the application stack (RDFS and OWL reasoners)
>>>> can remain happily ignorant about what's happening underneath.
>>> Ahh, I believe it is customary to treat DatatypeProperties like
>>> wdrs:matchesregex or my:isEvenInteger as extensions to the inference
>>> layer. 
>> Design choices are best made per application depending on each
>> application's specific needs. POWDER extends RDF and not RDFS or OWL.
> 
> To some degree, though RDF has some text which favors one path over
> another.
> 
>>>>> Equivalent, sure, but it's distracting for the reader because the
>>>>> they start looking for an intersection where there is none, and.
>>>> As already noted, there are good technical reasons for this  
>>>> inconvenience.
>>> The main reason I see for this is that the xml representation
>>> expresses intersections, but not other logical constructs such as
>>> unions or complements. I expect this represents the far majority of
>>> use cases, as regular expressions can already express both unions
>>> and if you feel like compiling them into a regex, complements.
>>>
>>> Thus, all patterns can be reduced to a pure conjunction, so there's
>>> less pressure for working group to include a step for simplification.
>> That's yet another interesting alternative for implementing POWDER-S.
>> But I don't think you would prefer to have to wade through
>> the single-regexp representation of a POWDER/XML <ol> element--even
>> with just two branches--in the document.
> 
> Were you to be earlier in your process, I'd argue for a
> post-processing XSLT with a single rule for
> owl:class/owl:intersectionOf/owl:Restriction[count(/*) == 1]
> to change
> [[
> <owl:Class>
>   <owl:intersectionOf rdf:parseType="Collection">
>     <owl:Restriction>
>       <owl:onProperty rdf:resource="http://www.w3.org/2007/05/powder-s#matchesregex" />
>       <owl:hasValue  rdf:datatype="http://www.w3.org/2001/XMLSchema-datatypes#string">(porn\.example)\/?</owl:hasValue>
>     </owl:Restriction>
>   </owl:intersectionOf>
> </owl:Class>
> ]]
> into
> [[
> <owl:Restriction>
>   <owl:onProperty rdf:resource="http://www.w3.org/2007/05/powder-s#matchesregex" />
>   <owl:hasValue  rdf:datatype="http://www.w3.org/2001/XMLSchema-datatypes#string">(porn\.example)\/?</owl:hasValue>
> </owl:Restriction>
> ]]
> but I'm content that the cost of a change like this would exceed the
> benefits.
> 
>>> I was just thinking that these details could go into an issues list.
>>> As editor, I found the value of the issues list was not just to
>>> document outstanding issues, but to serve as a bit of a FAQ. It's
>>> possible that others besides me will be struck by the complexity
>>> and search for the design decision.
>> Please see the relevant paragraph in Section 3.1 [1], right after the
>> first occurrence of a singleton intersection, and make an editorial
>> comment if you feel the explanation is not sufficient.
> 
> I wasn't arguing for the spec's sake, just for ease of tracking
> important controversial points.
> 
>>>> [1] http://www.w3.org/TR/2008/WD-powder-formal-20081114/#multiDRsemantics
> [PFS] http://www.w3.org/TR/2008/WD-powder-formal-20081114/#SE
> [RS] http://www.w3.org/TR/rdf-mt/#ExtensionalDomRang

Received on Wednesday, 4 March 2009 16:54:54 UTC