RE: (Ref.: ISSUE-12: Conjunction and disjunction) Semantics of resource set definitions

Hi David,

The precedent I had heard of was in XHTML2, namely the value of the
universal @role attribute:

http://www.w3.org/TR/xhtml2/mod-roleAttribute.html#s_roleAttributemodule

The intention being that a given element could be assigned roles from
multiple, unrelated knowledge systems, e.g.

<div role="dc:title wairole:grid" ...>

It's not clear (to me, at least :) whether this constitutes a logical
AND in the XHTML 2 context. 
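
To make the ambiguity concrete, here is a minimal Python sketch. It is not
taken from the XHTML 2 draft; the function names and the sample data are made
up for illustration. Splitting the attribute value on white space only yields
a list of tokens. Whether they combine as AND or OR is decided by whoever
consumes the list.

```python
def parse_tokens(value: str) -> list[str]:
    """Split a white-space separated attribute value into its tokens."""
    return value.split()


def matches_all(tokens: list[str], known_roles: set[str]) -> bool:
    """AND reading: every listed token must apply."""
    return all(t in known_roles for t in tokens)


def matches_any(tokens: list[str], known_roles: set[str]) -> bool:
    """OR reading: at least one listed token must apply."""
    return any(t in known_roles for t in tokens)


tokens = parse_tokens("dc:title wairole:grid")
# Hypothetical consumer that only recognises the dc:title role.
known = {"dc:title"}
and_result = matches_all(tokens, known)  # AND reading rejects the element
or_result = matches_any(tokens, known)   # OR reading accepts it
```

The same token list gives different answers under the two readings, which is
why the list syntax on its own does not settle the question.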

Cheers
Kevin

Kevin Smith
Technology Strategist
Vodafone Research & Development
Mobile: +44 (0)7990 798 916
Text: +44 (0)7825 106 554
Email: kevin.smith@vodafone.com

Vodafone Group Services Limited
Registered Office: Vodafone House, The Connection,
Newbury, Berkshire RG14 2FN
Registered in England No 3802001/

-----Original Message-----
From: public-powderwg-request@w3.org
[mailto:public-powderwg-request@w3.org] On Behalf Of David
Sent: 24 May 2007 12:03
To: Public POWDER
Subject: Re: (Ref.: ISSUE-12: Conjunction and disjunction) Semantics of
resource set definitions


Hi Phil,

My two cents...

I'm having problems with white space separated lists as well. I think
that by using them we're relying on people to associate the white space
with logical AND. I am not sure what the precedent is that Kevin
mentions but personally I see no reason why a white space (in this
implementation) could not be seen as any other type of logical operator.
FWIW I think the argument against option 3 should be the same as that
against option 1.

I also don't like the REGEX option. I think that expecting the
implementers to be proficient in REGEX as well as RDF may be asking a
little too much. Yes, it does provide a solution to our dilemma but I
think we can find a better option.

I can't choose my favourite option as the arguments for and against are
all very compelling. I do think it is important to have a closed DR
scope. It's probably then going to come down to a question of which is
more important - ease of implementation (use/understanding) or ease of
processing.

Cheers,
--------------------------------------------------------------------------------
David Rooks
Segala, Senior Standards Compliance Manager and Test Manager

HQ: 19 The Mall / Beacon Court / Sandyford /Dublin 18 / Ireland
UK:  2 Coltsfoot Drive / Burpham / Guildford / GU1 1YH / Surrey / UK
Office:   +44 (0)1483 572 800
Mobile:  +44 (0)7783 718 905
--------------------------------------------------------------------------------

----- Original Message ----- 
From: "Phil Archer" <parcher@icra.org>
To: "Public POWDER" <public-powderwg@w3.org>
Cc: "Jo Rabin" <jo@linguafranca.org>
Sent: Wednesday, May 23, 2007 3:32 PM
Subject: Re: (Ref.: ISSUE-12: Conjunction and disjunction) Semantics of 
resource set definitions


>
> Thanks very much Kevin, I really appreciate you taking time to look at
> this.
>
> Keeping each property value to a single item, obviating the need for
> list parsing, is a good benefit. The only drawback is that it means we
> can't use OWL cardinality to restrict the number of, say,
> hasPathStartsWith properties. That means that you can publish your DR
> and then on my server I can publish an RDF triple that says
>
> <your Resource Set's URI> wdr:hasPathStartsWith 'red'
>
> And a semantic system could pick that up and add it to your DR
> definition. True, the provenance of that triple can be checked, but
> this is what I mean by being open, as opposed to closed world.
>
> The other problem is that OWL set operators are predicates
> (properties) that therefore must have Classes as their value. So in
> fact your example would have to be written thus:
>
> <wdr:ResourceSet>
>   <wdr:hasAnyHostFrom>example.org</wdr:hasAnyHostFrom>
>
>   <owl:unionOf rdf:parseType="Collection">
>
>     <wdr:ResourceSet>
>       <wdr:pathStartsWith>foo</wdr:pathStartsWith>
>     </wdr:ResourceSet>
>
>     <wdr:ResourceSet>
>       <wdr:pathStartsWith>bar</wdr:pathStartsWith>
>     </wdr:ResourceSet>
>
>   </owl:unionOf>
> </wdr:ResourceSet>
>
> as opposed to
>
> <wdr:ResourceSet>
>   <wdr:hasAnyHostFrom>example.org</wdr:hasAnyHostFrom>
>   <wdr:pathStartsWithAnyOf>foo bar</wdr:pathStartsWithAnyOf>
> </wdr:ResourceSet>
>
> Yes, there's more processing of the values, but that's something that
> an application can do in a single line normally (in Perl certainly)
> whereas to extract multiple values from multiple properties of multiple
> sets in an OWL collection - that sounds like several SPARQL queries
> just to get the data. That said, it wouldn't surprise me if this is the
> solution an RDF head would prefer. Hmmm...
>
> But... your example does perhaps point towards the XML-based solution 
> proposed by Jo in the XG. And talking of Jo...
>
> I know he and others feel that REs are a road to confusion and error
> and, no doubt, in some cases that's true. As I've worked with them a
> bit I reckon that's the easiest way forward but, well, that's what I
> expect to use most of the time and I guess you would too. But we need
> an alternative as well. Also, as Andrea is usually quick to point out,
> they don't work on RS defined by resource property. For all that though
> I'm awfully tempted to put this in IRC next time
>
> PROPOSED RESOLUTION: Conjunctions are unnecessary since Regular 
> Expressions provide all the flexibility we need.
>
> ... but I'll keep that urge under control.
>
> We always knew this would be the hard part to resolve!
>
> Phil.
>
> Smith, Kevin, VF-Group wrote:
>> Hi Phil,
>>
>> Good work! Some thoughts:
>>
>> There is precedent for whitespace-delimited lists in element/attribute
>> values, but would another option be to use owl:unionOf within the RS:
>>
>> <wdr:ResourceSet>
>>   <wdr:hasAnyHostFrom>example.org</wdr:hasAnyHostFrom>
>>   <owl:unionOf rdf:parseType="Collection">
>>     <wdr:pathStartsWith>foo</wdr:pathStartsWith>
>>     <wdr:pathStartsWith>bar</wdr:pathStartsWith>
>>   </owl:unionOf>
>> </wdr:ResourceSet>
>>
>> That may be more friendly to RDF parsers (i.e. no extra string
>> operations needed to extract values). Not sure if that risks nested set
>> operators and OWL Full, as you say.
>>
>> NB I was looking at Apache rewrite rules, since they also work on
>> matching URIs and have a widespread following. It appears there has
>> not been developed a higher-level language of matching, but a use of
>> (often complex) REs. IMO this gives credence to the use of REs for our
>> kind of matching use cases.
>>
>> Overall, happy to see this written up further.
>>
>> Cheers
>> Kevin
>>
>> -----Original Message-----
>> From: public-powderwg-request@w3.org
>> [mailto:public-powderwg-request@w3.org] On Behalf Of Phil Archer
>> Sent: 22 May 2007 16:27
>> To: Public POWDER
>> Subject: Re: (Ref.: ISSUE-12: Conjunction and disjunction) Semantics
>> of resource set definitions
>>
>>
>> Right, after a while away from this issue, here we are again, looking
>> at the conjunction document [1].
>>
>> It feels as if we could spend an entire face to face meeting
>> discussing this so let's see if we can avoid that!
>>
>> In recent posts, Andrea has been arguing for the implicit semantics of
>> option 1 so that our example of encoding "everything on example.com OR
>> example.org with a path containing foo OR bar" would be written as at
>> [2].
>>
>> I agree with Andrea in so far as if we want to express relatively
>> complex things then that's probably going to take some relatively
>> complex code. I just want to keep it as simple as possible (of
>> course!).
>>
>> I also believe it is very much in our interests to reduce the
>> opportunity for the data we create in POWDER to be misused. In
>> particular, I think it generally a good thing to close off Resource Set
>> definitions so that you can't publish further triples whose provenance
>> needs to be taken into account before deciding whether to use them or
>> not.
>>
>> Where I disagree with Andrea is that the implicit semantics of [2] are
>> the least worst option. I really don't like the idea that if you have
>> two of a given property then you combine them with OR but different
>> properties are combined with AND. It just sounds too woolly and error
>> prone to me.
>>
>> And how would we encode those rules?
>>
>> Limiting the cardinality of the various RDF properties is easy with
>> OWL Lite. Thus I generally favour option 3 [3] in which we give a list
>> of values as the value of the various RDF properties. Maybe a change in
>> name of those properties might help clarify thinking. How about this:
>>
>> <wdr:ResourceSet>
>>    <wdr:hasAnyHostFrom>example.com example.org</wdr:hasAnyHostFrom>
>>    <wdr:pathContainsAnyOf>foo bar</wdr:pathContainsAnyOf>
>> </wdr:ResourceSet>
>>
>> This is, again, a white space separated list but the altered RDF
>> property name makes it easier to read. We might consider defining
>> 'list' versions of the RDF properties we have so that the ones we have
>> now (hasHost, hasScheme etc.) remain as they are taking a single value,
>> but additional properties would take lists - but this seems overly
>> redundant since a list of length 1, such as
>> <wdr:hasAnyHostFrom>example.com</wdr:hasAnyHostFrom> is valid.
>>
>> So to recap, this gives us the advantage of being able to limit
>> cardinality of each of our set definition properties to 0 or 1 (adding
>> to security). Each of these properties would be combined with logical
>> AND.
>>
>> Andrea makes good points about negation. Since this:
>>
>> (($host !~ /example.org/) || ($host !~ /example.net/))
>>
>> is always true - a classic De Morgan trap I think. So again, maybe a
>> change of RDF property name can help. How about this
>>
>> <wdr:ResourceSet>
>>    <wdr:hasAnyHostFrom>example.org example.com</wdr:hasAnyHostFrom>
>>    <wdr:hasNotAnyHostFrom>search.example.org shopping.example.com
>>                                          </wdr:hasNotAnyHostFrom>
>> </wdr:ResourceSet>
>>
>> This translates as "if the host IS ANY of these but NOT ANY of these,
>> then it's in the Resource Set."
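
A minimal Python sketch of that reading, for illustration only. The names
`host_matches`, `in_resource_set` and `negation_trap` are hypothetical, and
treating a host as matching a listed domain when it equals it or is a
subdomain of it is an assumption (the thread leaves the matching rule open).
The last function reproduces the always-true negation quoted above, using
exact comparison.

```python
def host_matches(host: str, domain: str) -> bool:
    """Assumed rule: equal to the domain, or a subdomain of it."""
    return host == domain or host.endswith("." + domain)


def in_resource_set(host: str, any_of: str, not_any_of: str = "") -> bool:
    """IS ANY of `any_of` AND is NOT ANY of `not_any_of` (white-space lists)."""
    included = any(host_matches(host, d) for d in any_of.split())
    excluded = any(host_matches(host, d) for d in not_any_of.split())
    return included and not excluded


ANY_OF = "example.org example.com"
NOT_ANY_OF = "search.example.org shopping.example.com"


def negation_trap(host: str) -> bool:
    """NOT example.org OR NOT example.net: true for every host,
    because no host can equal both names at once."""
    return host != "example.org" or host != "example.net"
```

Naming the property after the combined meaning ("not any of") keeps the
negation outside the OR, which is what avoids the trap.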
>>
>> Lists only take us so far. Again, referring to Andrea's comments, what
>> about anything on example.org with a path beginning with foo OR bar and
>> resources on example.com with a path beginning with bar (only). White
>> space separated lists won't get us out of this - we need to use
>> something like owl:unionOf.
>>
>> OK, let's actually use owl:unionOf.
>>
>> Notice that owl:unionOf is a property, not a Class, therefore, Andrea's
>> code needs a little tweaking to give this:
>>
>> <wdr:ResourceSet>
>>   <owl:unionOf rdf:parseType="Collection">
>>
>>     <wdr:ResourceSet>
>>       <wdr:hasAnyHostFrom>example.org</wdr:hasAnyHostFrom>
>>       <wdr:pathStartsWithAnyOf>foo bar</wdr:pathStartsWithAnyOf>
>>     </wdr:ResourceSet>
>>
>>     <wdr:ResourceSet>
>>       <wdr:hasAnyHostFrom>example.net</wdr:hasAnyHostFrom>
>>       <wdr:pathStartsWithAnyOf>bar</wdr:pathStartsWithAnyOf>
>>     </wdr:ResourceSet>
>>
>>   </owl:unionOf>
>> </wdr:ResourceSet>
>>
>> We have two Resource Sets here (which are Classes) and we use the
>> owl:unionOf predicate to create the union. More complex examples are
>> possible but given that we're supporting regular expressions, and, if
>> my line of argument holds, white space separated lists, the likelihood
>> of a more complex Resource Set definition than that shown here seems
>> remote - at least for the use cases under our consideration.
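
As a rough illustration of how a processor might evaluate such a union once
the RDF has been read, here is a Python sketch. The `ResourceSet` class and
its field names are hypothetical stand-ins for the wdr properties above. Host
matching by exact string and path matching that ignores the leading slash are
both assumptions, not anything the group has agreed.

```python
from dataclasses import dataclass, field
from urllib.parse import urlsplit


@dataclass
class ResourceSet:
    hosts: list[str] = field(default_factory=list)          # hasAnyHostFrom
    path_prefixes: list[str] = field(default_factory=list)  # pathStartsWithAnyOf
    union_of: list["ResourceSet"] = field(default_factory=list)  # owl:unionOf

    def contains(self, uri: str) -> bool:
        if self.union_of:
            # A union matches if any member set matches.
            return any(member.contains(uri) for member in self.union_of)
        parts = urlsplit(uri)
        path = parts.path.lstrip("/")
        # Values within one property are ORed; the properties are ANDed.
        host_ok = not self.hosts or parts.hostname in self.hosts
        path_ok = not self.path_prefixes or any(
            path.startswith(prefix) for prefix in self.path_prefixes
        )
        return host_ok and path_ok


rs = ResourceSet(union_of=[
    ResourceSet(hosts=["example.org"], path_prefixes=["foo", "bar"]),
    ResourceSet(hosts=["example.net"], path_prefixes=["bar"]),
])
```

With this structure `rs.contains("http://example.net/foo")` is rejected while
`rs.contains("http://example.org/foo/page")` is accepted, which is the
asymmetry that a single pair of lists cannot express.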
>>
>> This retains the closed world objective. RDF Collections are closed
>> world - but I admit it's not clear to me how we would express the
>> constraint that a Resource Set can have a sub set if it's the subject
>> of an owl:unionOf, owl:intersectionOf or owl:complementOf predicate.
>> Incidentally, using these set operators puts us firmly in OWL DL, not
>> OWL Lite (and, if I understand it correctly, nested set operators might
>> take us into OWL Full so they should be strongly discouraged).
>>
>> So I think we're building up a picture here.
>>
>> If you want to define a set as simple as 'everything on example.com'
>> (which remains the most likely scenario for our use cases) then you can
>> do it really easily
>>
>> <wdr:ResourceSet>
>>    <wdr:hasAnyHostFrom>example.com</wdr:hasAnyHostFrom>
>> </wdr:ResourceSet>
>>
>> If you want something a little more complicated - like multiple hosts -
>> put them in a white space separated list.
>>
>> If you need to create slightly more complex but still relatively simple
>> RS definitions that include multiple elements then that's possible too,
>> as we've seen with the original example.com/org plus foo/bar example.
>>
>> We can define even more complex sets where we have (multiple
>> definitions) OR (other multiple definitions) using OWL set operators.
>>
>> And if that isn't enough, you can always use a Regular Expression.
>> Actually, there's a thought, can you (meaningfully) have a white space
>> separated list of regular expressions?? probably not - so that's one of
>> our RDF properties that can only have a single value.
>>
>> What about conjunctions of resources grouped by property? The group
>> hasn't discussed this yet, but if we go with my current proposal,
>> below, then how will that affect things?
>>
>> Here's an RS definition for 'all resources on example.org that are in
>> French'.
>>
>> <wdr:Set>
>>    <wdr:hasAnyHostFrom>example.org</wdr:hasAnyHostFrom>
>>
>>    <wdr:resourcesWith rdf:parseType="Resource">
>>      <ex:lang>fr</ex:lang>
>>    </wdr:resourcesWith >
>>
>>    <wdr:hasPropLookUp>
>>      <wdr:PropLookUp>
>>        <wdr:lookUpURI>$cURI</wdr:lookUpURI>
>>        <wdr:method
>> rdf:resource="http://www.w3.org/2006/http#HeadRequest" />
>>        <wdr:responseContains>Content-Language: fr</wdr:responseContains>
>>      </wdr:PropLookUp>
>>    </wdr:hasPropLookUp>
>>
>> </wdr:Set>
>>
>> So this says that the language must be French and the way to find out
>> whether it is or not is to do a Head request to $cURI (the candidate
>> resource's URI) and see if you get a header back that says 
>> "Content-Language: fr".
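
For what it's worth, the check described here could look roughly like this in
Python. This is only a sketch: `has_language` and `head_language_check` are
made-up names, and comparing the Content-Language header value
case-insensitively against a language tag is an assumption (the example above
matches the raw response text instead).

```python
import urllib.request
from typing import Mapping


def has_language(headers: Mapping[str, str], wanted: str) -> bool:
    """True if the Content-Language header lists the wanted language tag."""
    for name, value in headers.items():
        if name.lower() == "content-language":
            # The header may carry several comma-separated tags.
            tags = [tag.strip().lower() for tag in value.split(",")]
            return wanted.lower() in tags
    return False


def head_language_check(candidate_uri: str, wanted: str) -> bool:
    """Issue a HEAD request to the candidate URI and test its headers."""
    request = urllib.request.Request(candidate_uri, method="HEAD")
    with urllib.request.urlopen(request, timeout=10) as response:
        return has_language(dict(response.headers.items()), wanted)
```

Keeping the header test separate from the network call means the rule itself
can be exercised without fetching anything.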
>>
>> Can we use a white space separated list here? Sometimes, would be the
>> answer, I guess. Imagine we wanted to define a set as all resources on
>> example.org in French OR German. Try this:
>>
>> <wdr:Set>
>>    <wdr:hasAnyHostFrom>example.org</wdr:hasAnyHostFrom>
>>
>>    <wdr:resourcesWith rdf:parseType="Resource">
>>      <ex:lang>fr de</ex:lang>
>>    </wdr:resourcesWith >
>>
>>    <wdr:hasPropLookUp>
>>      <wdr:PropLookUp>
>>        <wdr:lookUpURI>$cURI</wdr:lookUpURI>
>>        <wdr:method
>> rdf:resource="http://www.w3.org/2006/http#HeadRequest" />
>>        <wdr:responseContains>"Content-Language: fr"
>>                            "Content-Language: de"</wdr:responseContains>
>>      </wdr:PropLookUp>
>>    </wdr:hasPropLookUp>
>>
>> </wdr:Set>
>>
>> I've had to quote the list elements in the responseContains property
>> but I don't think it's unusual to require quoting of strings if they
>> are to include white space!
>>
>> By way of an apology for the length of this post, let me summarise.
>>
>> 1. I don't like implied semantics and think we can do better.
>> 2. We must surely accept complexity where complexity is being
>>    expressed
>> 3. Complexity should be as scarce as the use cases that demand it
>> 4. Changing the property names can make it clear (to humans) that the
>>    value is a list
>> 5. REs are supported anyway so they're always available for people who
>>    prefer them (like me)
>> 6. We can use OWL set operators where we need a union of otherwise
>>    separate sets.
>> 7. The multi-layered approach to conjunction can work just as well for
>>    RS definitions by property, notwithstanding the need to support
>>    quoted strings so that they can include white space.
>>
>> Depending on your feedback, I'd like to write this up in the doc so it
>> can be presented properly. I would, however, like to include the
>> XML-based approach in the doc [4] as an alternative to all this.
>>
>> Its principal attraction, for me, flows from the following argument:
>> It is likely that a generic RDF processor will be able to handle all
>> aspects of a DR, without modification, except the Resource Set. Since
>> the data in an RS definition needs to be handled slightly differently,
>> it does seem to be logical to make that explicit by quoting an XML
>> Literal within the RDF graph (which is what the pre-defined RDF
>> datatype of XML Literal is designed to allow you to do).
>>
>> Its principal problem, IMHO, is that the definition of something as 
>> simple as 'everything on example.org' should not require running a 
>> separate XML parser/XPath query. I reckon we really need to see some 
>> SPARQL queries against the RS data examples to settle this one??
>>
>> Cheers
>>
>> Phil.
>>
>>
>> [1] http://www.w3.org/2007/powder/powder-grouping/conjunction
>>
>> [2] http://www.w3.org/2007/powder/powder-grouping/option1.rdf and 
>> http://www.w3.org/2007/powder/powder-grouping/option1.png
>>
>> [3] http://www.w3.org/2007/powder/powder-grouping/option3.rdf and 
>> http://www.w3.org/2007/powder/powder-grouping/option3.png
>>
>> [4] http://www.w3.org/2007/powder/powder-grouping/conjunction#option6
>>
>>
>> Phil Archer wrote:
>>> A few small comments inline below
>>>
>>> Andrea Perego wrote:
>>>> Hi, Phil.
>>>>
>>>>> [snip]
>>>>>
>>>>> In your discussion, you suggest 4 possible solutions to the
>>>>> pathContains issue. The complexities get more severe when we get
>>>>> into negatives and, from my perspective, we're getting a long way
>>>>> away from a design fundamental of simplicity with the real
>>>>> possibility that a semi-technically minded person could write a set
>>>>> definition by hand if necessary.
>>>> I think here we should consider if and why we should support
>>>> negation. It is not just to support as much flexibility as possible.
>>>> As was reported in a previous version of the grouping document,
>>>> negation is useful in order to simplify the specification of a scope
>>>> by also supporting exceptions.
>>>>
>>>> Suppose, for instance, that a given DR applies to a set of hosts
>>>> my.example.org, your.example.org, his.example.org, her.example.org,
>>>> our.example.org, but not to their.example.org.
>>>>
>>>> If negation is not supported, the scope of the DR must be specified
>>>> as follows:
>>>>
>>>> <wdr:Set>
>>>>   <wdr:hasHost>my.example.org</wdr:hasHost>
>>>>   <wdr:hasHost>your.example.org</wdr:hasHost>
>>>>   <wdr:hasHost>her.example.org</wdr:hasHost>
>>>>   <wdr:hasHost>his.example.org</wdr:hasHost>
>>>>   <wdr:hasHost>our.example.org</wdr:hasHost>
>>>> </wdr:Set>
>>>>
>>>> otherwise, if a wdr:hasNotHost property is available, we can reduce
>>>> the specification to
>>>>
>>>> <wdr:Set>
>>>>   <wdr:hasHost>example.org</wdr:hasHost>
>>>>   <wdr:hasNotHost>their.example.org</wdr:hasNotHost>
>>>> </wdr:Set>
>>>>
>>>> So the issue here, is to find a way of supporting negation in a safe
>>>> and possibly `intuitive' way.
>>> I am certain that negation should be included and your example seems
>>> entirely intuitive to me. If, starting from the most significant
>>> portion, the resource is on the example.org domain AND is NOT on 
>>> their.example.org, then it's in the Set. Easy.
>>>
>>> [snip]
>>>>> [snip] NB. use of intersectionOf and unionOf requires OWL
>>>>> DL, not OWL Lite - which gets us into more specialised inference 
>>>>> engines.
>>>> And, consequently, we may have undecidable resource set definitions
>>>> (which is not a nice thing). The solution based on implicit semantics
>>>> (if resolved properly) is safe also with respect to this issue.
>>> Actually, no, it's OWL Full that does that. OWL DL is closed world
>>> (just more complicated than OWL Lite).
>>>
>>>>> [snip: implicit conjunction inside a resource set definition -
>>>>> wdr:hasHostList property]
>>>> I don't completely agree.
>>>>
>>>> If we assume that all properties in a wdr:Set are always in AND,
>>>> saying "all the resources hosted by example.org and a path starting
>>>> with foo or bar," will require two redundant resource set
>>>> definitions:
>>>>
>>>> <wdr:Set>
>>>>   <wdr:hasHost>example.org</wdr:hasHost>
>>>>   <wdr:pathStartsWith>foo</wdr:pathStartsWith>
>>>> </wdr:Set>
>>>>
>>>> <wdr:Set>
>>>>   <wdr:hasHost>example.org</wdr:hasHost>
>>>>   <wdr:pathStartsWith>bar</wdr:pathStartsWith>
>>>> </wdr:Set>
>>>>
>>>> As you notice, this redundancy increases when we are talking of
>>>> hosts, and not of path patterns, but I think that the need itself of
>>>> repeating the same statement is far from being intuitive.
>>>>
>>>> I agree that it is preferable to combine *by default* all the
>>>> properties in a resource set definition with the same Boolean
>>>> operator, but the solution you propose has several drawbacks in terms
>>>> of expressiveness. In other words, if we support AND (implicitly), we
>>>> must support also OR (explicitly) inside a resource set definition.
>>> Which brings us back to owl:unionOf and example 2A?
>>>
>>>> About the solutions to be used for this, I'm not comfortable with
>>>> space separated lists as object of RDF properties (in such a case why
>>>> not using a RE? we have just to substitute a blank space with a `|').
>>>> Also, we are forgetting here grouping by property. I'm not sure that
>>>> the considerations above apply also to them.
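
The substitution mentioned here is easy to sketch in Python. `list_to_regex`
is a hypothetical helper. Anchoring the pattern at the start of the path (to
mimic a pathStartsWith-style property) and escaping each item so that it is
matched literally are my assumptions.

```python
import re


def list_to_regex(value: str) -> re.Pattern:
    """Turn a white-space separated list into an anchored alternation."""
    items = [re.escape(item) for item in value.split()]
    return re.compile("^(?:" + "|".join(items) + ")")


pattern = list_to_regex("foo bar")
```

So the list "foo bar" and the RE `^(?:foo|bar)` carry the same information;
the list form simply hides the anchoring and escaping decisions from the
author.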
>>> I think these do apply to grouping by resource property. If the
>>> resource property in question is colour then you can have a white
>>> space separated list of colours. And I agree on the white space or |
>>> issue. But we're trying to find an alternative to using REs for those
>>> who don't like them and that is less error prone (noting that REs are
>>> always going to be supported).
>>>
>>>> In other words, I'm for using RDF to express this. Of course, it may
>>>> be verbose, not necessarily human-friendly, and require a lot of
>>>> processing. This is why I consider the `original' implicit semantics
>>>> of resource set definitions (i.e., same properties in OR, different
>>>> properties in AND) preferable, even though it is not formally sound.
>>> OK, I misunderstood your thinking. I thought you were opposed to
>>> option 1. Ah well.
>>>
>>> Phil
>>>
>>>
>>>
>
> 

Received on Thursday, 24 May 2007 13:17:54 UTC