- From: <parcher@icra.org>
- Date: Thu, 10 Apr 2008 14:00:27 +0100 (BST)
- To: "Martin Duerst" <duerst@it.aoyama.ac.jp>
- Cc: "Phil Archer " <parcher@icra.org>,"Felix Sasaki" <fsasaki@w3.org>, "Eric Prud'hommeaux" <eric@w3.org>,public-powderwg@w3.org, public-i18n-core@w3.org
Thanks Martin, answers inline.

> At 17:14 08/04/10, Phil Archer wrote:
>
> >>Our basic need is that we must be able to be certain whether a given IRI
> >> does or does not match a small data set. Typically, something like
>
> A very, very basic question: Just for a moment assuming that a complete
> solution isn't possible, what's more of a problem for your application:
>
> False positives (IRIs/URIs match when they shouldn't) or
> false negatives (IRIs/URIs don't match when they should)?
>
> If you can't decide on one or the other, can you at least describe
> potential consequences in each case?

That's a very interesting question that, as far as I know, we've not thought about, believing (naively) that a complete solution was possible. If that is not the case then it actually makes life easier. Unless someone can think of a counter-argument, I think we'd always want to err on the side of caution, i.e. false negatives are always preferable to false positives.

POWDER is about describing lots of resources at once: "everything on example.org is red and square" is our generic example. If I make such an assertion, it's better that there are some cases where my claim that things are red and square is not recognised than that my claim be applied to resources I may know nothing about.

Also, in my conversation with Eric P the other day I was a little concerned by his saying that the kind of canonicalisation you carry out really depends on where you are in the chain (UI level, network level, etc.). Now... if we can _legitimately_ say that there are circumstances where canonicalisation is not always possible, that allows us to change the tenor of the text to say that applications should make a _best effort_ to canonicalise, and then give a series of possible steps to take. The concrete steps should simply be done; the less prescriptive ones may lead to a false negative or a false positive, and POWDER publishers should be aware of this and create their data accordingly.
This, for example:

  <iriset>
    <includehosts>xn--exmpless-jua.org exåmpless.org</includehosts>
  </iriset>

means anything on exåmpless.org OR xn--exmpless-jua.org. As I understand it, this might lead to a false positive, since we can't be sure whether the double s is just that or an Eszett. If I'm right, then we'd caution _against_ doing this and say: just quote exåmpless.org (and make sure that the XML file really is in UTF-8, served with the correct HTTP headers, and so on).

> Also, do you describe/talk about/work with actual retrievable
> resources, or also others?

Hmmm... this has been a tricky one. In the end we actually talk about IRI sets and say that the descriptions may be applied to all resources that are dereferenced from any IRI that is a member of the set. We say that we don't limit what kind of IRI is used, but we make it easy to use http-style ones, which is what our use cases are about.

Phil.
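
To make the Eszett concern above concrete, here is a minimal sketch. It assumes Python's built-in "idna" codec, which follows the IDNA 2003 rules (RFC 3490), under which nameprep folds ß to ss. The host exåmpleß.org and the helper name to_ace are illustrative assumptions only, not part of POWDER or of the example data set.

```python
def to_ace(host: str) -> str:
    """Return the ASCII-compatible (xn--) form of a host name.

    Relies on Python's built-in "idna" codec (IDNA 2003 / RFC 3490).
    This helper is hypothetical and only serves the illustration.
    """
    return host.encode("idna").decode("ascii")


# Two distinct Unicode spellings: one with an Eszett, one with a double s.
with_eszett = to_ace("exåmpleß.org")
with_double_s = to_ace("exåmpless.org")

print(with_eszett)
print(with_double_s)

# Under IDNA 2003 mapping both spellings collapse to one ASCII form,
# so the ASCII form alone cannot say which spelling was intended.
print(with_eszett == with_double_s)
```

Because the mapping is lossy in this direction, a data set that lists the xn-- form may match hosts that the publisher originally wrote with an Eszett, which is the kind of false positive discussed above.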
Received on Thursday, 10 April 2008 13:01:16 UTC