- From: Phil Archer <parcher@icra.org>
- Date: Tue, 25 Mar 2008 11:55:42 +0000
- To: Public POWDER <public-powderwg@w3.org>
N.B. This discussion refers to the Grouping Doc dated 20 March, available at [1], currently with member access only. It is expected to be published at the same URI within the next 24 hours or so.

Over on the member list it has been suggested that POWDER-S should _only_ support IRI constraint by regular expression [2], although POWDER would retain things like includehosts for ease of use.

The argument is initially attractive since we expect to see IRI sets like this most commonly:

  <iriset>
    <includehosts>example.org</includehosts>
  </iriset>

i.e. a single domain name given as the IRI set, so we're describing 'everything on example.org'. This can be transformed into POWDER-S thus:

  <wdr:iriset>
    <owl:intersectionOf rdf:parseType="Collection">
      <owl:Restriction>
        <owl:onProperty rdf:resource="&wdr;includeregex" />
        <owl:hasValue>example.org</owl:hasValue>
      </owl:Restriction>
    </owl:intersectionOf>
  </wdr:iriset>

i.e. the reg ex is the same in both cases. Easy. Since we expect POWDER to be the main transport mechanism and for POWDER-S to (almost) always be derived programmatically, it doesn't matter how complex a POWDER-S doc is.

But let's make this progressively more complex and see whether we can convert _all_ possible POWDER IRI sets into POWDER-S versions with a single reg ex.

Let's try multiple hosts.

  <includehosts>example.org example.com</includehosts>

becomes

  example.org|example.com

OK, let's cut to the chase. POWDER allows very sophisticated IRI set definitions like this:

  <iriset>
    <includeschemes>http https</includeschemes>
    <includehosts>example.org example.com</includehosts>
    <includepathcontains>foo bar</includepathcontains>
    <includepathcontains>red blue</includepathcontains>
  </iriset>

Here we have either http or https. OK, in reg ex that's

  https?

Add in the host and we get

  ^https?://(.*\.)?(example.com|example.org)

But those multiple path constraints are going to kill us. They say that the path must contain either foo or bar AND either red or blue _in any order_.
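For comparison, the IRI set just described can be tested by plain string matching without any reg ex at all. The following is only a minimal sketch in Python: the names IRISET and in_iriset are hypothetical, the dictionary layout is an assumption made for illustration, and the semantics (hosts match the named domain or any subdomain; every includepathcontains element must be satisfied by at least one of its strings) are as read from this message rather than taken from the spec.

```python
from urllib.parse import urlsplit

# Hypothetical in-memory form of the IRI set above (not a POWDER data model).
IRISET = {
    "includeschemes": ["http", "https"],
    "includehosts": ["example.org", "example.com"],
    # One inner list per includepathcontains element: any string in a list
    # may appear, but every list must be satisfied.
    "includepathcontains": [["foo", "bar"], ["red", "blue"]],
}


def in_iriset(iri, iriset=IRISET):
    """Return True if iri satisfies every constraint in iriset."""
    parts = urlsplit(iri)
    host = (parts.hostname or "").lower()

    # Scheme must be one of the listed schemes.
    if parts.scheme not in iriset["includeschemes"]:
        return False

    # Assumption: a listed host matches itself and any subdomain.
    if not any(host == h or host.endswith("." + h)
               for h in iriset["includehosts"]):
        return False

    # AND across elements, OR within each element, in any order.
    return all(any(s in parts.path for s in group)
               for group in iriset["includepathcontains"])


print(in_iriset("http://example.com/red/bar"))   # True
print(in_iriset("http://example.org/foo/bar/"))  # False
```

Each constraint is checked independently, so the order in which the strings appear in the path never has to be encoded.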
So the following all match:

  http://example.com/red/bar
  http://example.com/foo/blue
  https://example.org/bluefoo/bar.html

And this doesn't:

  http://example.org/foo/bar/

Now, I _could_ work out a Reg Ex that did all this, but I'm not sure I could write some code that turned _any valid_ POWDER IRI set definition into a Reg Ex.

And would anyone like to hazard a bit of code that rendered this as a reg ex:

  <iriset>
    <includeschemes>http https</includeschemes>
    <includehosts>example.org example.com</includehosts>
    <includepathcontains>foo bar</includepathcontains>
    <includepathcontains>red blue</includepathcontains>
    <excludeexactqueries>name1=value1&name2=value2</excludeexactqueries>
  </iriset>

Bearing in mind that this means that if the query string contains the name1=value1 and name2=value2 pairs in any order then they're to be excluded?

Yikes!

So, I think it would be a lot easier to retain string-based matching in POWDER-S.

Phil.

[1] http://www.w3.org/2007/powder/Group/powder-grouping/20080320.html
[2] http://lists.w3.org/Archives/Member/member-powderwg/2008Mar/0119.html

--
Phil Archer
Chief Technical Officer, Family Online Safety Institute
w. http://www.fosi.org/people/philarcher/
Received on Tuesday, 25 March 2008 11:56:25 UTC