- From: Smith, Kevin, VF-Group <Kevin.Smith@vodafone.com>
- Date: Thu, 24 May 2007 15:17:32 +0200
- To: "David" <drooks@segala.com>, "Public POWDER" <public-powderwg@w3.org>
Hi David,

The precedent I had heard of was in XHTML2, namely the value of the
universal @role attribute:

http://www.w3.org/TR/xhtml2/mod-roleAttribute.html#s_roleAttributemodule

The intention is that a given element can be assigned roles from
multiple, unrelated knowledge systems, e.g.

<div role="dc:title wairole:grid" ...>

It's not clear (to me, at least :) whether this constitutes a logical AND
in the XHTML 2 context.

Cheers
Kevin

Kevin Smith
Technology Strategist
Vodafone Research & Development
Mobile: +44 (0)7990 798 916
Text: +44 (0)7825 106 554
Email: kevin.smith@vodafone.com

Vodafone Group Services Limited
Registered Office: Vodafone House, The Connection,
Newbury, Berkshire RG14 2FN
Registered in England No 3802001/

-----Original Message-----
From: public-powderwg-request@w3.org
[mailto:public-powderwg-request@w3.org] On Behalf Of David
Sent: 24 May 2007 12:03
To: Public POWDER
Subject: Re: (Ref.: ISSUE-12: Conjunction and disjunction) Semantics of
resource set definitions

Hi Phil,

My two cents...

I'm having problems with white space separated lists as well. I think
that by using them we're relying on people to associate the white space
with logical AND. I am not sure what the precedent is that Kevin mentions
but personally I see no reason why a white space (in this implementation)
could not be seen as any other type of logical operator. FWIW I think the
argument against option 3 should be the same as that against option 1.

I also don't like the REGEX option. I think that expecting the
implementers to be proficient in REGEX as well as RDF may be asking a
little too much. Yes, it does provide a solution to our dilemma but I
think we can find a better option.

I can't choose my favourite option as the arguments for and against are
all very compelling. I do think it is important to have a closed DR
scope. It's probably then going to come down to a question of which is
more important - ease of implementation (use/understanding) or ease of
processing.
Cheers,

--------------------------------------------------------------------------------
David Rooks
Segala, Senior Standards Compliance Manager and Test Manager
HQ: 19 The Mall / Beacon Court / Sandyford / Dublin 18 / Ireland
UK: 2 Coltsfoot Drive / Burpham / Guildford / GU1 1YH / Surrey / UK
Office: +44 (0)1483 572 800
Mobile: +44 (0)7783 718 905
--------------------------------------------------------------------------------

----- Original Message -----
From: "Phil Archer" <parcher@icra.org>
To: "Public POWDER" <public-powderwg@w3.org>
Cc: "Jo Rabin" <jo@linguafranca.org>
Sent: Wednesday, May 23, 2007 3:32 PM
Subject: Re: (Ref.: ISSUE-12: Conjunction and disjunction) Semantics of
resource set definitions

> Thanks very much Kevin, I really appreciate you taking time to look at
> this.
>
> Keeping each property value to a single item, obviating the need for
> list parsing, is a good benefit. The only drawback is that it means we
> can't use OWL cardinality to restrict the number of, say,
> hasPathStartsWith properties. That means that you can publish your DR
> and then on my server I can publish an RDF triple that says
>
> <your Resource Set's URI> wdr:hasPathStartsWith 'red'
>
> And a semantic system could pick that up and add it to your DR
> definition. True, the provenance of that triple can be checked, but this
> is what I mean by being open, as opposed to closed world.
>
> The other problem is that OWL set operators are predicates (properties)
> that therefore must have Classes as their value.
> So in fact your example would have to be written thus:
>
> <wdr:ResourceSet>
>   <wdr:hasAnyHostFrom>example.org</wdr:hasAnyHostFrom>
>
>   <owl:unionOf rdf:parseType="Collection">
>
>     <wdr:ResourceSet>
>       <wdr:pathStartsWith>foo</wdr:pathStartsWith>
>     </wdr:ResourceSet>
>
>     <wdr:ResourceSet>
>       <wdr:pathStartsWith>bar</wdr:pathStartsWith>
>     </wdr:ResourceSet>
>
>   </owl:unionOf>
> </wdr:ResourceSet>
>
> as opposed to
>
> <wdr:ResourceSet>
>   <wdr:hasAnyHostFrom>example.org</wdr:hasAnyHostFrom>
>   <wdr:pathStartsWithAnyOf>foo bar</wdr:pathStartsWithAnyOf>
> </wdr:ResourceSet>
>
> Yes, there's more processing of the values, but that's something that an
> application can do in a single line normally (in Perl certainly) whereas
> to extract multiple values from multiple properties of multiple sets in
> an OWL collection - that sounds like several SPARQL queries just to get
> the data. That said, it wouldn't surprise me if this is the solution an
> RDF head would prefer. Hmmm...
>
> But... your example does perhaps point towards the XML-based solution
> proposed by Jo in the XG. And talking of Jo...
>
> I know he and others feel that REs are a road to confusion and error
> and, no doubt, in some cases that's true. As I've worked with them a bit
> I reckon that's the easiest way forward but, well, that's what I expect
> to use most of the time and I guess you would too. But we need an
> alternative as well. Also, as Andrea is usually quick to point out, they
> don't work on RS defined by resource property. For all that though I'm
> awfully tempted to put this in IRC next time
>
> PROPOSED RESOLUTION: Conjunctions are unnecessary since Regular
> Expressions provide all the flexibility we need.
>
> ... but I'll keep that urge under control.
>
> We always knew this would be the hard part to resolve!
>
> Phil.
>
> Smith, Kevin, VF-Group wrote:
>> Hi Phil,
>>
>> Good work!
>> Some thoughts:
>>
>> There is precedent for whitespace-delimited lists in element/attribute
>> values, but would another option be to use owl:unionOf within the RS:
>>
>> <wdr:ResourceSet>
>>   <wdr:hasAnyHostFrom>example.org</wdr:hasAnyHostFrom>
>>   <owl:unionOf rdf:parseType="Collection">
>>     <wdr:pathStartsWith>foo</wdr:pathStartsWith>
>>     <wdr:pathStartsWith>bar</wdr:pathStartsWith>
>>   </owl:unionOf>
>> </wdr:ResourceSet>
>>
>> That may be more friendly to RDF parsers (i.e. no extra string
>> operations needed to extract values). Not sure if that risks nested set
>> operators and OWL Full, as you say.
>>
>> NB I was looking at Apache rewrite rules, since they also work on
>> matching URIs and have a widespread following. It appears that no
>> higher-level matching language has been developed there; instead
>> people use (often complex) REs. IMO this gives credence to the use of
>> REs for our kind of matching use cases.
>>
>> Overall, happy to see this written up further.
>>
>> Cheers
>> Kevin
>>
>> Kevin Smith
>> Technology Strategist
>> Vodafone Research & Development
>> Mobile: +44 (0)7990 798 916
>> Text: +44 (0)7825 106 554
>> Email: kevin.smith@vodafone.com
>>
>> Vodafone Group Services Limited
>> Registered Office: Vodafone House, The Connection,
>> Newbury, Berkshire RG14 2FN
>> Registered in England No 3802001/
>>
>> -----Original Message-----
>> From: public-powderwg-request@w3.org
>> [mailto:public-powderwg-request@w3.org] On Behalf Of Phil Archer
>> Sent: 22 May 2007 16:27
>> To: Public POWDER
>> Subject: Re: (Ref.: ISSUE-12: Conjunction and disjunction) Semantics of
>> resource set definitions
>>
>>
>> Right, after a while away from this issue, here we are again, looking
>> at the conjunction document [1].
>>
>> It feels as if we could spend an entire face to face meeting discussing
>> this so let's see if we can avoid that!
>>
>> In recent posts, Andrea has been arguing for the implicit semantics of
>> option 1 so that our example of encoding "everything on example.com OR
>> example.org with a path containing foo OR bar" would be written as at
>> [2].
>>
>> I agree with Andrea in so far as if we want to express relatively
>> complex things then that's probably going to take some relatively
>> complex code. I just want to keep it as simple as possible (of course!).
>>
>> I also believe it is very much in our interests to reduce the
>> opportunity for the data we create in POWDER to be misused. In
>> particular, I think it is generally a good thing to close off Resource
>> Set definitions so that you can't publish further triples whose
>> provenance needs to be taken into account before deciding whether to
>> use them or not.
>>
>> Where I disagree with Andrea is that the implicit semantics of [2] are
>> the least worst option. I really don't like the idea that if you have
>> two of a given property then you combine them with OR but different
>> properties are combined with AND. It just sounds too woolly and error
>> prone to me.
>>
>> And how would we encode those rules?
>>
>> Limiting the cardinality of the various RDF properties is easy with OWL
>> Lite. Thus I generally favour option 3 [3] in which we give a list of
>> values as the value of the various RDF properties. Maybe a change in
>> name of those properties might help clarify thinking. How about this:
>>
>> <wdr:ResourceSet>
>>   <wdr:hasAnyHostFrom>example.com example.org</wdr:hasAnyHostFrom>
>>   <wdr:pathContainsAnyOf>foo bar</wdr:pathContainsAnyOf>
>> </wdr:ResourceSet>
>>
>> This is, again, a white space separated list but the altered RDF
>> property name makes it easier to read. We might consider defining 'list'
>> versions of the RDF properties we have so that the ones we have now
>> (hasHost, hasScheme etc.)
>> remain as they are, taking a single value, but additional properties
>> would take lists - but this seems overly redundant since a list of
>> length 1, such as
>> <wdr:hasAnyHostFrom>example.com</wdr:hasAnyHostFrom> is valid.
>>
>> So to recap, this gives us the advantage of being able to limit
>> cardinality of each of our set definition properties to 0 or 1 (adding
>> to security). Each of these properties would be combined with logical
>> AND.
>>
>> Andrea makes good points about negation, since this:
>>
>> (($host !~ /example.org/) || ($host !~ /example.net/))
>>
>> is always true - a classic De Morgan trap I think. So again, maybe a
>> change of RDF property name can help. How about this
>>
>> <wdr:ResourceSet>
>>   <wdr:hasAnyHostFrom>example.org example.com</wdr:hasAnyHostFrom>
>>   <wdr:hasNotAnyHostFrom>search.example.org shopping.example.com
>>   </wdr:hasNotAnyHostFrom>
>> </wdr:ResourceSet>
>>
>> This translates as "if the host IS ANY of these but NOT ANY of these,
>> then it's in the Resource Set."
>>
>> Lists only take us so far. Again, referring to Andrea's comments, what
>> about anything on example.org with a path beginning with foo OR bar and
>> resources on example.com with a path beginning with bar (only)? White
>> space separated lists won't get us out of this - we need to use
>> something like owl:unionOf.
>>
>> OK, let's actually use owl:unionOf.
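
The negation trap and the hasAnyHostFrom / hasNotAnyHostFrom pairing described above can be sketched in a few lines of Python. This is only an illustration, not anything the group has specified: the function names are made up here, and the rule that a host matches a listed value when it equals it or is a subdomain of it is an assumption.

```python
import re


def naive_not_either(host: str) -> bool:
    # Mirrors (($host !~ /example.org/) || ($host !~ /example.net/)):
    # false only if the host matches BOTH patterns at the same time.
    return (re.search(r"example\.org", host) is None
            or re.search(r"example\.net", host) is None)


def in_neither(host: str) -> bool:
    # De Morgan: NOT (a OR b) is (NOT a) AND (NOT b).
    return (re.search(r"example\.org", host) is None
            and re.search(r"example\.net", host) is None)


def host_in_set(host: str, any_of: str, not_any_of: str = "") -> bool:
    """Hypothetical reading of hasAnyHostFrom / hasNotAnyHostFrom.

    Both arguments are white space separated lists. Assumption: a host
    matches a listed value if it is that value or a subdomain of it.
    """
    def on(domain: str) -> bool:
        return host == domain or host.endswith("." + domain)

    included = any(on(d) for d in any_of.split())
    excluded = any(on(d) for d in not_any_of.split())
    return included and not excluded
```

With the lists from the example, host_in_set("www.example.org", "example.org example.com", "search.example.org shopping.example.com") accepts the host, while the same call for search.example.org rejects it.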
>>
>> Notice that owl:unionOf is a property, not a Class, therefore, Andrea's
>> code needs a little tweaking to give this:
>>
>> <wdr:ResourceSet>
>>   <owl:unionOf rdf:parseType="Collection">
>>
>>     <wdr:ResourceSet>
>>       <wdr:hasAnyHostFrom>example.org</wdr:hasAnyHostFrom>
>>       <wdr:pathStartsWithAnyOf>foo bar</wdr:pathStartsWithAnyOf>
>>     </wdr:ResourceSet>
>>
>>     <wdr:ResourceSet>
>>       <wdr:hasAnyHostFrom>example.net</wdr:hasAnyHostFrom>
>>       <wdr:pathStartsWithAnyOf>bar</wdr:pathStartsWithAnyOf>
>>     </wdr:ResourceSet>
>>
>>   </owl:unionOf>
>> </wdr:ResourceSet>
>>
>> We have two Resource Sets here (which are Classes) and we use the
>> owl:unionOf predicate to create the union. More complex examples are
>> possible but given that we're supporting regular expressions, and, if
>> my line of argument holds, white space separated lists, the likelihood
>> of a more complex Resource Set definition than that shown here seems
>> remote - at least for the use cases under our consideration.
>>
>> This retains the closed world objective. RDF Collections are closed
>> world - but I admit it's not clear to me how to encode the constraint
>> that a Resource Set can have a sub set if it's the subject of an
>> owl:unionOf, owl:intersectionOf or owl:complementOf predicate.
>> Incidentally, using these set operators puts us firmly in OWL DL, not
>> OWL Lite (and, if I understand it correctly, nested set operators might
>> take us into OWL Full so they should be strongly discouraged).
>>
>> So I think we're building up a picture here.
>>
>> If you want to define a set as simple as 'everything on example.com'
>> (which remains the most likely scenario for our use cases) then you can
>> do it really easily
>>
>> <wdr:ResourceSet>
>>   <wdr:hasAnyHostFrom>example.com</wdr:hasAnyHostFrom>
>> </wdr:ResourceSet>
>>
>> If you want something a little more complicated - like multiple hosts -
>> put them in a white space separated list.
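
To make the layering above concrete, here is a minimal Python sketch of how a processor might evaluate the owl:unionOf example once the RDF has been parsed. The dictionary layout and function names are hypothetical, and two matching rules are assumptions rather than anything agreed in the thread: a host matches a listed value if it equals it or is a subdomain of it, and path prefixes are compared after dropping the leading "/".

```python
from urllib.parse import urlsplit


def matches_simple(uri: str, host_any_of: str,
                   path_starts_with_any_of: str = "") -> bool:
    """One Resource Set: properties are ANDed, list items are ORed."""
    parts = urlsplit(uri)
    host = parts.hostname or ""
    path = parts.path.lstrip("/")  # assumption: 'foo' means '/foo...'

    host_ok = any(host == h or host.endswith("." + h)
                  for h in host_any_of.split())
    prefixes = path_starts_with_any_of.split()
    # An absent path property places no restriction on the path.
    path_ok = not prefixes or any(path.startswith(p) for p in prefixes)
    return host_ok and path_ok


def matches_union(uri: str, members: list) -> bool:
    """owl:unionOf: the URI is in the set if any member set contains it."""
    return any(matches_simple(uri, **member) for member in members)


# The two member sets from the owl:unionOf example.
RESOURCE_SET = [
    {"host_any_of": "example.org", "path_starts_with_any_of": "foo bar"},
    {"host_any_of": "example.net", "path_starts_with_any_of": "bar"},
]
```

Under these assumptions matches_union("http://example.net/foo/page", RESOURCE_SET) is false, because foo is only allowed on example.org, which is exactly the case a single pair of white space separated lists cannot express.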
>>
>> If you need to create slightly more complex but still relatively simple
>> RS definitions that include multiple elements then that's possible too,
>> as we've seen with the original example.com/org plus foo/bar example.
>>
>> We can define even more complex sets where we have (multiple
>> definitions) OR (other multiple definitions) using OWL set operators.
>>
>> And if that isn't enough, you can always use a Regular Expression.
>> Actually, there's a thought, can you (meaningfully) have a white space
>> separated list of regular expressions?? Probably not - so that's one of
>> our RDF properties that can only have a single value.
>>
>> What about conjunctions of resources grouped by property? The group
>> hasn't discussed this yet, but if we go with my current proposal,
>> below, then how will that affect things?
>>
>> Here's an RS definition for 'all resources on example.org that are in
>> French'.
>>
>> <wdr:Set>
>>   <wdr:hasAnyHostFrom>example.org</wdr:hasAnyHostFrom>
>>
>>   <wdr:resourcesWith rdf:parseType="Resource">
>>     <ex:lang>fr</ex:lang>
>>   </wdr:resourcesWith>
>>
>>   <wdr:hasPropLookUp>
>>     <wdr:PropLookUp>
>>       <wdr:lookUpURI>$cURI</wdr:lookUpURI>
>>       <wdr:method
>>         rdf:resource="http://www.w3.org/2006/http#HeadRequest" />
>>       <wdr:responseContains>Content-Language: fr</wdr:responseContains>
>>     </wdr:PropLookUp>
>>   </wdr:hasPropLookUp>
>>
>> </wdr:Set>
>>
>> So this says that the language must be French and the way to find out
>> whether it is or not is to do a Head request to $cURI (the candidate
>> resource's URI) and see if you get a header back that says
>> "Content-Language: fr".
>>
>> Can we use a white space separated list here? Sometimes, would be the
>> answer, I guess. Imagine we wanted to define a set as all resources on
>> example.org in French OR German.
>> Try this:
>>
>> <wdr:Set>
>>   <wdr:hasAnyHostFrom>example.org</wdr:hasAnyHostFrom>
>>
>>   <wdr:resourcesWith rdf:parseType="Resource">
>>     <ex:lang>fr de</ex:lang>
>>   </wdr:resourcesWith>
>>
>>   <wdr:hasPropLookUp>
>>     <wdr:PropLookUp>
>>       <wdr:lookUpURI>$cURI</wdr:lookUpURI>
>>       <wdr:method
>>         rdf:resource="http://www.w3.org/2006/http#HeadRequest" />
>>       <wdr:responseContains>"Content-Language: fr"
>>         "Content-Language: de"</wdr:responseContains>
>>     </wdr:PropLookUp>
>>   </wdr:hasPropLookUp>
>>
>> </wdr:Set>
>>
>> I've had to quote the list elements in the responseContains property
>> but I don't think it's unusual to require quoting of strings if they
>> are to include white space!
>>
>> By way of an apology for the length of this post, let me summarise.
>>
>> 1. I don't like implied semantics and think we can do better.
>> 2. We must surely accept complexity where complexity is being expressed.
>> 3. Complexity should be as scarce as the use cases that demand it.
>> 4. Changing the property names can make it clear (to humans) that the
>>    value is a list.
>> 5. REs are supported anyway so they're always available for people who
>>    prefer them (like me).
>> 6. We can use OWL set operators where we need a union of otherwise
>>    separate sets.
>> 7. The multi-layered approach to conjunction can work just as well for
>>    RS definitions by property, notwithstanding the need to support
>>    quoted strings so that they can include white space.
>>
>> Depending on your feedback, I'd like to write this up in the doc so it
>> can be presented properly. I would, however, like to include the
>> XML-based approach in the doc [4] as an alternative to all this.
>>
>> Its principal attraction, for me, flows from the following argument: It
>> is likely that a generic RDF processor will be able to handle all
>> aspects of a DR, without modification, except the Resource Set.
>> Since the data in an RS definition needs to be handled slightly
>> differently, it does seem to be logical to make that explicit by
>> quoting an XML Literal within the RDF graph (which is what the
>> pre-defined RDF datatype of XML Literal is designed to allow you to do).
>>
>> Its principal problem, IMHO, is that the definition of something as
>> simple as 'everything on example.org' should not require running a
>> separate XML parser/XPath query. I reckon we really need to see some
>> SPARQL queries against the RS data examples to settle this one??
>>
>> Cheers
>>
>> Phil.
>>
>>
>> [1] http://www.w3.org/2007/powder/powder-grouping/conjunction
>>
>> [2] http://www.w3.org/2007/powder/powder-grouping/option1.rdf and
>> http://www.w3.org/2007/powder/powder-grouping/option1.png
>>
>> [3] http://www.w3.org/2007/powder/powder-grouping/option3.rdf and
>> http://www.w3.org/2007/powder/powder-grouping/option3.png
>>
>> [4] http://www.w3.org/2007/powder/powder-grouping/conjunction#option6
>>
>>
>> Phil Archer wrote:
>>> A few small comments inline below
>>>
>>> Andrea Perego wrote:
>>>> Hi, Phil.
>>>>
>>>>> [snip]
>>>>>
>>>>> In your discussion, you suggest 4 possible solutions to the
>>>>> pathContains issue. The complexities get more severe when we get
>>>>> into negatives and, from my perspective, we're getting a long way
>>>>> away from a design fundamental of simplicity with the real
>>>>> possibility that a semi-technically minded person could write a set
>>>>> definition by hand if necessary.
>>>> I think here we should consider if and why we should support
>>>> negation. It is not just to support as much flexibility as possible.
>>>> As was reported in a previous version of the grouping document,
>>>> negation is useful in order to simplify the specification of a scope
>>>> by also supporting exceptions.
>>>>
>>>> Suppose, for instance, that a given DR applies to a set of hosts
>>>> my.example.org, your.example.org, his.example.org, her.example.org,
>>>> our.example.org, but not to their.example.org.
>>>>
>>>> If negation is not supported, the scope of the DR must be specified
>>>> as follows:
>>>>
>>>> <wdr:Set>
>>>>   <wdr:hasHost>my.example.org</wdr:hasHost>
>>>>   <wdr:hasHost>your.example.org</wdr:hasHost>
>>>>   <wdr:hasHost>her.example.org</wdr:hasHost>
>>>>   <wdr:hasHost>his.example.org</wdr:hasHost>
>>>>   <wdr:hasHost>our.example.org</wdr:hasHost>
>>>> </wdr:Set>
>>>>
>>>> otherwise, if a wdr:hasNotHost property is available, we can reduce
>>>> the specification to
>>>>
>>>> <wdr:Set>
>>>>   <wdr:hasHost>example.org</wdr:hasHost>
>>>>   <wdr:hasNotHost>their.example.org</wdr:hasNotHost>
>>>> </wdr:Set>
>>>>
>>>> So the issue here, is to find a way of supporting negation in a safe
>>>> and possibly `intuitive' way.
>>> I am certain that negation should be included and your example seems
>>> entirely intuitive to me. If, starting from the most significant
>>> portion, the resource is on the example.org domain AND is NOT on
>>> their.example.org, then it's in the Set. Easy.
>>>
>>> [snip]
>>>>> [snip] NB. use of intersectionOf and unionOf requires OWL
>>>>> DL, not OWL Lite - which gets us into more specialised inference
>>>>> engines.
>>>> And, consequently, we may have undecidable resource set definitions
>>>> (which is not a nice thing). The solution based on implicit semantics
>>>> (if resolved properly) is safe also with respect to this issue.
>>> Actually, no, it's OWL Full that does that. OWL DL is closed world
>>> (just more complicated than OWL Lite).
>>>
>>>>> [snip: implicit conjunction inside a resource set definition -
>>>>> wdr:hosHostList property]
>>>> I don't completely agree.
>>>>
>>>> If we assume that all properties in a wdr:Set are always in AND,
>>>> saying "all the resources hosted by example.org and a path starting
>>>> with foo or bar," will require two redundant resource set
>>>> definitions:
>>>>
>>>> <wdr:Set>
>>>>   <wdr:hasHost>example.org</wdr:hasHost>
>>>>   <wdr:pathStartsWith>foo</wdr:pathStartsWith>
>>>> </wdr:Set>
>>>>
>>>> <wdr:Set>
>>>>   <wdr:hasHost>example.org</wdr:hasHost>
>>>>   <wdr:pathStartsWith>bar</wdr:pathStartsWith>
>>>> </wdr:Set>
>>>>
>>>> As you notice, this redundancy increases when we are talking of
>>>> hosts, and not of path patterns, but I think that the need itself of
>>>> repeating the same statement is far from being intuitive.
>>>>
>>>> I agree that it is preferable to combine *by default* all the
>>>> properties in a resource set definition with the same Boolean
>>>> operator, but the solution you propose has several drawbacks in
>>>> terms of expressiveness. In other words, if we support AND
>>>> (implicitly), we must support also OR (explicitly) inside a resource
>>>> set definition.
>>> Which brings us back to owl:unionOf and example 2A?
>>>
>>>> About the solutions to be used for this, I'm not comfortable with
>>>> space separated lists as object of RDF properties (in such a case
>>>> why not using a RE? we have just to substitute a blank space with a
>>>> `|'). Also, we are forgetting here grouping by property. I'm not
>>>> sure that the considerations above apply also to them.
>>> I think these do apply to grouping by resource property. If the
>>> resource property in question is colour then you can have a white
>>> space separated list of colours. And I agree on the white space or |
>>> issue. But we're trying to find an alternative to using REs for those
>>> who don't like them and that is less error prone (noting that REs are
>>> always going to be supported).
>>>
>>>> In other words, I'm for using RDF to express this.
>>>> Of course, it may be verbose, not necessarily human-friendly, and
>>>> require a lot of processing. This is why I consider the `original'
>>>> implicit semantics of resource set definitions (i.e., same
>>>> properties in OR, different properties in AND) preferable, even
>>>> though it is not formally sound.
>>> OK, I misunderstood your thinking. I thought you were opposed to
>>> option 1. Ah well.
>>>
>>> Phil
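
For comparison, the implicit semantics debated at the end of the thread (repeated properties combined with OR, different properties combined with AND) can be sketched as below. This is a rough illustration only: the (property, value) pair representation, the candidate dictionary and the per-property tests are all hypothetical, and the subdomain rule for hasHost is an assumption.

```python
from collections import defaultdict

# Hypothetical per-property tests: each takes a candidate resource
# (a dict with 'host' and 'path') and one property value.
PROPERTY_TESTS = {
    # Assumption: a host matches if it equals the value or is a subdomain.
    "hasHost": lambda c, v: c["host"] == v or c["host"].endswith("." + v),
    "pathStartsWith": lambda c, v: c["path"].startswith(v),
}


def in_set_implicit(candidate: dict, statements: list) -> bool:
    """Same property repeated -> OR; different properties -> AND."""
    grouped = defaultdict(list)
    for prop, value in statements:
        grouped[prop].append(value)
    return all(
        any(PROPERTY_TESTS[prop](candidate, value) for value in values)
        for prop, values in grouped.items()
    )


# "All the resources hosted by example.org and a path starting with
# foo or bar", written as a single definition.
STATEMENTS = [
    ("hasHost", "example.org"),
    ("pathStartsWith", "foo"),
    ("pathStartsWith", "bar"),
]
```

The grouping step is where the rule that Phil calls woolly lives: nothing in the data itself says that repeated properties mean OR, so every processor has to carry that convention in code.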
Received on Thursday, 24 May 2007 13:17:54 UTC