- From: Pat Hayes <phayes@ihmc.us>
- Date: Thu, 26 Jan 2006 15:27:15 -0600
- To: "Miles, AJ (Alistair)" <A.J.Miles@rl.ac.uk>
- Cc: "Booth, David (HP Software - Boston)" <dbooth@hp.com>, "Ben Adida" <ben@mit.edu>, "SWBPD list" <public-swbp-wg@w3.org>, "public-rdf-in-xhtml task force" <public-rdf-in-xhtml-tf@w3.org>

>Pat Hayes said:
>
><quote>
>[4] has a clear and explicit description (at
>http://www.w3.org/TR/webarch/#indirect-identification
>) of a condition which seems to apply almost
>perfectly to the situation which arises in RDF/A
>and which Alistair deplores, and which is
>correctly described as not constituting a URI
>collision. Using the same name to refer both to a
>thing, and to a piece of a document which itself
>refers to the same thing, seems clearly to be an
>example of indirect reference. As [4] says,
>somewhat pithily, "Identifiers are commonly used
>in this way."
></quote>
>
>I understood [4] to be referring to 'indirect identification' as
>expressed in RDF via properties of type
>owl:InverseFunctionalProperty. I.e. the following triple:
>
>_:aaa foaf:homepage <http://jo-lamda.blogspot.com/>.
>
>... uses the URI <http://jo-lamda.blogspot.com/> to 'indirectly
>identify' the blank node _:aaa because the property foaf:homepage is
>declared by the FOAF ontology [1] to be an inverse functional
>property.

That isn't how I read [4]. The kind of usage you describe is not
'indirect', since all the identifiers are being used with a single
referent in mind: "http://jo-lamda.blogspot.com/" denotes a web page
and "_:aaa" denotes its owner.

>If this is indeed the intended meaning of 'indirect identification'
>at [4] then I strongly suggest the RDF/A primer does NOT use the
>term 'indirect identification' to refer to the practice of using
>URIs to denote both a piece of XML (effectively a part of a
>document) and an entity in the 'real world' (e.g. a person).
>
>See also related email [2].
>
>Pat Hayes said:
>
><quote>
>It is impossible, both practically and
>theoretically, to completely avoid all ambiguity
>in using referential names. Reference is not
>access.
>While URLs must be unambiguous locators,
>in the sense of resolving unambiguously to a
>particular Web resource, referential names -
>which is how URI references are used in RDF -
>cannot possibly be specified so exactly as to
>refer uniquely and unambiguously in all
>circumstances. Even globally recognizable proper
>names like "Mount Everest" do not have unique
>referents in all possible circumstances, since
>the exact referent depends on the ontological
>framework being mutually assumed (Where is the
>exact edge of a mountain? Are we talking about
>people as agents or as medical cases? At a
>particular time or as endurants? etc.) Under
>these circumstances, to view every referential
>ambiguity as a Bad Thing is about as useful as
>trying to stamp out breathing.
>
>Like words in human language, URIs can be safely
>overloaded under conditions which allow possible
>misunderstandings to be securely resolved by
>their local context, without requiring
>negotiation: and this need not even require that
>the resolution be actually done, provided that
>the necessary context - which in the case under
>discussion is likely to be the ontology
>identified by the root URI of the RDF property -
>can be accessed when required. In English we
>safely use "bank" to refer to a side of a river,
>a turning motion or a building, in part because
>these meanings are so divergent that the
>ambiguity can almost always be immediately
>resolved by the immediate context. Similarly, an
>email address can be safely used to refer to its
>owner in part because almost anything that can be
>coherently said about a person could not possibly
>apply to an email account, and vice versa. Even
>the use of a literal string in a context which
>requires a reference to a named agent can be
>interpreted as making sense, since it clearly
>requires a coercion, and it would be natural to
>use the string as a referring name.
>Whether or
>not this is in some fundamental sense 'correct'
>or 'proper' is not worth discussing: what matters
>is only that a community of agents all agree to
>use the same kind of coercion strategy when it is
>required, which allows strings to be used to
>refer to agents; and to the extent they do, then
>they thereby become genuinely referring names.
>This is how the world comes to use language, both
>in the large and in the small
>(http://www.economist.com/science/displayStory.cfm?story_id=5135495).
></quote>
>
>OK. Tell me what 'local context' is exactly.

I apologize for using the c-word, which I normally try to avoid. I
didn't mean to imply that there are actual things called 'contexts'.
The context for a URI occurring in some RDF content is that RDF itself,
plus any other relevant RDF that can reasonably be presumed to be
accessible, e.g. the ontologies accessible from the base URIs of other
identifiers in the transmitted RDF, or in imported ontologies. I meant
only that if the identifier is transmitted from A to B, then there is
enough information available at B to do the necessary disambiguation,
without having to go back to A and ask for clarification. In ordinary
conversation, this corresponds to not having to say something like
"what sense of 'bank' did you mean, exactly?".

>How do I as a publisher ensure that sufficient 'context' is
>available for the applications I intend to support?

You don't. But how, as a publisher, do you ensure that there is enough
of anything to support the processing you hope will happen at the other
end? You cannot establish this absolutely, in all cases. The best you
can do is to provide pointers to anything that you feel is relevant,
and in many cases rely on a presumption that both you and your readers
share some common ground. It seems to me that there is absolutely no
way to avoid making assumptions like this.

>What about unforeseen applications?
>As a consuming application, how
>do I get at the 'context'

See above.

>, and how do I use it to resolve ambiguities?

Well, my point is that most of these apparent ambiguities will either
not in fact need to be resolved, or their resolution will be done by
applying conventions that have evolved within a community of use, which
in a Web context means a community which uses a particular vocabulary
consistently in a certain way. The use of webpage address URIs to
denote people in FOAF is an excellent example. But the basic point is
that inferential processing (drawing conclusions, querying, checking
consistency, etc.) can all be done without needing to 'resolve'
ambiguities. The ambiguity of a URI's reference, if present, can
usually simply be left ambiguous. The logical semantics underlying
inference presumes that identifiers are ambiguous in this way:
ambiguity is the norm. In fact, the reduction (not total elimination)
of ambiguity is often one of the main reasons for doing inference.

> Where are these issues addressed in current specifications?
>
>Surely it is good practice for publishers to clearly understand how
>and when ambiguities can arise, to be aware of each and every action
>that could lead to ambiguity, and to undertake such actions in full
>knowledge of the consequences.

Well, yes, it is hard to argue with that. But if 10^6 websites, say,
already use webpage URIs to refer to their owners, and if the normative
semantic theories in the specifications do not prohibit this (as they
do not) and all the machinery that processes this information works (as
it does), why is it considered 'good practice' to set out to re-educate
everyone and to oblige them to change? Seems to me it might be more
productive to take a more empirical and less judgmental stance, and ask
why and how this situation, which theory predicts should lead to
confusion, apparently does not lead to confusion.
The TAG recommendations seem to be based on an implicit theory of
ambiguity and communication. Projects like FOAF seem to me to be
empirical refutations of this theory.

>Surely it is also good practice for publishers in the majority of
>cases to design systems that do not lead to ambiguity

No. Ambiguity is inherent in the very idea of using names to refer in a
descriptive formalism. This is the point I tried to get across to the
TAG. There is a common presumption that ambiguity is a Bad Thing, and
so we should make every reasonable effort to Stamp It Out. But this is
nonsense: ambiguity *of reference* is not only not a bad thing, it is a
*necessary* thing. There are theorems which show that only an
uncomputable amount of assertional effort could ever completely remove
it. Even trying to remove it in realistic cases is unfeasible. Take an
ordinary unambiguous name: what *exactly* does "Mount Everest" refer
to? (What is its volume? Where are its edges? etc.) Or take my name,
and ignore the fact that there are many Pat Hayes' in the world: does
the "Pat Hayes" that identifies me refer to me now, me throughout my
lifetime; me considered as a social agent, me considered as an
organism, etc.? These are all distinctions that formal ontologies
regularly make. So these are ambiguities too: the fact that they are
not acknowledged by linguists doesn't make them any less real, it is
just a testament to our human ability to communicate successfully using
ambiguous notations. Almost all names are referentially ambiguous; and
ironically, every attempt to remove this ambiguity by imposing more
exactly defined lexica (mountain-as-physical-object,
mountain-as-geographical-entity, mountain-as-climbing-peak, etc.)
actually makes the ambiguity worse for all other names, since it
provides for making finer and finer ontological distinctions elsewhere,
thereby creating (or perhaps revealing) ambiguity where none was
previously noticed.
If there are ten distinct referents for "Pat Hayes" and also for
"Jackie Hayes" then there are a hundred different types of binary
relation between us that could all be described as "marriedTo". There
is no final end state where every name is unambiguous: this vision is a
chimera.

One reaction I meet when I try to point this out is along the lines:
even if what you say is true, it is like saying that the world is full
of sin: but still, we should all strive to be good. But this misses my
point. I'm not saying that the problem is unsolvable. I'm saying that
there is no problem. Ambiguity does not get in the way of communication
or inference. Setting out to remove all ambiguity is like setting out
to walk to the moon: it's a futile goal since it can't be done, and
also because there is absolutely no need to even try to do it. It is
certainly not good practice in general.

Of course there are cases (medicine, biology, science generally, law,
international standards) where 'ordinary' identifiers are not precisely
defined enough for some technical usage, and specialized lexica are
necessary, often requiring careful management, because certain kinds of
ambiguity must be caught and corrected. I don't mean to imply that this
kind of effort is pointless: only that to assert as a general property
of the Web architecture that all identifiers should be unambiguous is
nonsensical.

>, or that minimise the potential for ambiguity, because in doing so
>they simplify the management of change, and increase the ease with
>which their data can be repurposed in unforeseen contexts? I.e. by
>acting to minimise the potential for ambiguity, a publisher
>increases the value of its published data, because the data is more
>portable.

Well, that might be a good case, but the conclusion isn't obvious. I'd
like to see a really good (mathematical?) account of why and how less
ambiguity makes for improved portability. I can see good informal
arguments both ways.

>A practical question: If I operate under the assumption that the
>same URI will commonly be used to denote both a person and their
>home page, doesn't this make the notion of logical consistency
>effectively useless?

No. Absolutely not.

>Don't domains and ranges become effectively useless also?

No. Although, to be fair, one common way to understand domains and
ranges, as 'constraints' on what can be said, which can be 'checked' to
detect 'errors', would indeed be in opposition to what I'm saying here.
But none of those words I've highlighted have any natural place in an
inferential framework: they all come from thinking about programming
language design.

>E.g. if I have:
>
><http://jo-lamda.blogspot.com/> foaf:mbox <mailto:jo.lambda@example.org>.
>
>... and I also have:
>
>_:aaa foaf:homepage <http://jo-lamda.blogspot.com/>.
>
>... then via the domain of foaf:mbox and the range of foaf:homepage
>I may conclude:
>
><http://jo-lamda.blogspot.com/> a foaf:Agent, foaf:Document.
>
>What is the usefulness of this new information?

I don't vouch for its usefulness, but I would argue that it is a
reasonable statement of exactly the overloading or punning condition
that I have no problems with, which is that a single URI can usefully
play several referential roles at the same time. So, it might not be
particularly useful, but it can be harmlessly true.

Pat

>
>Cheers,
>
>Al.
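
The domain/range inference discussed in the exchange above can be sketched in a few lines of Python. This is only a minimal illustration under stated assumptions: triples are plain tuples rather than objects from an RDF library, the names `DOMAINS`, `RANGES`, `infer_types` and `types_of` are hypothetical helpers invented for this sketch, and the only schema facts encoded are the two the message relies on (the domain of foaf:mbox and the range of foaf:homepage).

```python
# Triples are plain (subject, predicate, object) tuples; no RDF library is used.
RDF_TYPE = "rdf:type"

# Schema facts as relied on in the message (hypothetical dictionaries):
# the domain of foaf:mbox and the range of foaf:homepage.
DOMAINS = {"foaf:mbox": "foaf:Agent"}
RANGES = {"foaf:homepage": "foaf:Document"}

HOMEPAGE = "<http://jo-lamda.blogspot.com/>"

# The two triples from the example in the message.
DATA = {
    (HOMEPAGE, "foaf:mbox", "<mailto:jo.lambda@example.org>"),
    ("_:aaa", "foaf:homepage", HOMEPAGE),
}


def infer_types(triples, domains, ranges):
    """Return the rdf:type triples entailed by the domain and range facts."""
    inferred = set()
    for s, p, o in triples:
        if p in domains:
            # A property's domain types the subject of the triple.
            inferred.add((s, RDF_TYPE, domains[p]))
        if p in ranges:
            # A property's range types the object of the triple.
            inferred.add((o, RDF_TYPE, ranges[p]))
    return inferred


def types_of(node, triples):
    """List the classes asserted for a node, sorted for stable output."""
    return sorted(o for s, p, o in triples if s == node and p == RDF_TYPE)


INFERRED = infer_types(DATA, DOMAINS, RANGES)
print(types_of(HOMEPAGE, INFERRED))  # ['foaf:Agent', 'foaf:Document']
```

Note that nothing in the sketch rejects the data or reports an error: the two type statements are simply both derived, which is the 'harmlessly true' reading described above. Treating the result as a violation would require extra information, such as a declaration that the two classes are disjoint, which this sketch deliberately does not assume.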
>
>[1] http://xmlns.com/foaf/0.1/
>[2] http://lists.w3.org/Archives/Public/public-swbp-wg/2006Jan/0145.html
>[4] http://www.w3.org/TR/webarch/#indirect-identification
>
>
>-----Original Message-----
>From: public-rdf-in-xhtml-tf-request@w3.org on behalf of Pat Hayes
>Sent: Wed 25/01/2006 05:30
>To: Booth, David (HP Software - Boston)
>Cc: Ben Adida; SWBPD list; public-rdf-in-xhtml task force
>Subject: RE: [ALL] RDF/A Primer Version
>
>
>>I hate to say this, but I think the URI identity issues that Alistair
>>raised in email[3] after yesterday's teleconference are important enough
>>to delay publication until they are either fixed or visibly marked as
>>problems. The WebArch document is clear that URI collisions[4] are A
>>Bad Thing. It would seem wrong to endorse such collisions, even
>>implicitly.
>
>I beg to differ.
>
>[4] has a clear and explicit description (at
>http://www.w3.org/TR/webarch/#indirect-identification
>) of a condition which seems to apply almost
>perfectly to the situation which arises in RDF/A
>and which Alistair deplores, and which is
>correctly described as not constituting a URI
>collision. Using the same name to refer both to a
>thing, and to a piece of a document which itself
>refers to the same thing, seems clearly to be an
>example of indirect reference. As [4] says,
>somewhat pithily, "Identifiers are commonly used
>in this way."
>
>It is impossible, both practically and
>theoretically, to completely avoid all ambiguity
>in using referential names. Reference is not
>access. While URLs must be unambiguous locators,
>in the sense of resolving unambiguously to a
>particular Web resource, referential names -
>which is how URI references are used in RDF -
>cannot possibly be specified so exactly as to
>refer uniquely and unambiguously in all
>circumstances.
>Even globally recognizable proper
>names like "Mount Everest" do not have unique
>referents in all possible circumstances, since
>the exact referent depends on the ontological
>framework being mutually assumed (Where is the
>exact edge of a mountain? Are we talking about
>people as agents or as medical cases? At a
>particular time or as endurants? etc.) Under
>these circumstances, to view every referential
>ambiguity as a Bad Thing is about as useful as
>trying to stamp out breathing.
>
>Like words in human language, URIs can be safely
>overloaded under conditions which allow possible
>misunderstandings to be securely resolved by
>their local context, without requiring
>negotiation: and this need not even require that
>the resolution be actually done, provided that
>the necessary context - which in the case under
>discussion is likely to be the ontology
>identified by the root URI of the RDF property -
>can be accessed when required. In English we
>safely use "bank" to refer to a side of a river,
>a turning motion or a building, in part because
>these meanings are so divergent that the
>ambiguity can almost always be immediately
>resolved by the immediate context. Similarly, an
>email address can be safely used to refer to its
>owner in part because almost anything that can be
>coherently said about a person could not possibly
>apply to an email account, and vice versa. Even
>the use of a literal string in a context which
>requires a reference to a named agent can be
>interpreted as making sense, since it clearly
>requires a coercion, and it would be natural to
>use the string as a referring name. Whether or
>not this is in some fundamental sense 'correct'
>or 'proper' is not worth discussing: what matters
>is only that a community of agents all agree to
>use the same kind of coercion strategy when it is
>required, which allows strings to be used to
>refer to agents; and to the extent they do, then
>they thereby become genuinely referring names.
>This is how the world comes to use language, both
>in the large and in the small
>(http://www.economist.com/science/displayStory.cfm?story_id=5135495).
>
>I suggest that if current real-world usage of a
>metadata vocabulary seems to be causing no actual
>operational problems, it might be better to study
>this real-world usage carefully with a view to
>learning something about how symbols actually are
>being used on the Web, than to set out to take
>great pains to improve it.
>
>In the meantime, I also suggest that RDF/A might
>usefully use the term "indirect identification"
>to point out that subjects of RDF triples can
>both be pieces of XML markup and also refer to
>entities in the real world, and that this need
>not be deplored as harmful ambiguity.
>
>Pat Hayes
>
>>David Booth
>>
>>[3] Identity issues raised by Alistair:
>>http://lists.w3.org/Archives/Public/public-swbp-wg/2006Jan/0113.html
>>[4] TAG's Web Architecture:
>>http://www.w3.org/TR/webarch/#URI-collision
>>
>>
>>> -----Original Message-----
>>> From: public-swbp-wg-request@w3.org
>>> [mailto:public-swbp-wg-request@w3.org] On Behalf Of Ben Adida
>>> Sent: Tuesday, January 24, 2006 12:03 PM
>>> To: SWBPD list
>>> Cc: public-rdf-in-xhtml task force
>>> Subject: [ALL] RDF/A Primer Version
>>>
>>>
>>> Hi all,
>>>
>>> I made a mistake in the version of the RDF/A Primer that I presented
>>> at the telecon yesterday. I have just finished uploading the right
>>> version, which you can find here:
>>>
>>> http://www.w3.org/2001/sw/BestPractices/HTML/2006-01-24-rdfa-primer
>>>
>>> With the WG and specifically the reviewers' approval (DBooth, GaryNg,
>>> and also "unofficial" reviewers), I am hoping that we can rapidly
>>> agree that this latest version should be the one that becomes our
>>> first published WD.
>>>
>>> The only difference in content is that the new version has an extra
>>> section (section #2), and the old sections 2 and 3 are merged into
>>> the new section 3 for purely organizational purposes (no text is lost
>>> or added in those sections, just reorganized.) The point of the new
>>> section 2 is to add an even simpler introductory example. We believe
>>> this additional section is in line with the comments we received from
>>> reviewers, both official and earlier, unofficial reviews. In fact, we
>>> began writing it in part to respond to some of these early comments 2
>>> weeks ago.
>>>
>>> The already-approved version is still at the old URL for comparison:
>>> http://www.w3.org/2001/sw/BestPractices/HTML/2006-01-15-rdfa-primer
>>>
>>> I want to stress that this is entirely *my* mistake: the TF had
>>> agreed [1,2] that this second version would be presented to the WG
>>> yesterday, and I simply forgot. Publishing these additional examples
>>> now is quite important for getting the word out about RDF/A and
>>> making it competitive against other metadata inclusion proposals,
>>> outside of W3C, that are gaining traction.
>>>
>>> Apologies for my mistake. I hope you'll see that these edits do not
>>> constitute a substantive change to the document, rather they help
>>> make the same points more appealing to and understandable by a larger
>>> audience.
>>>
>>> -Ben Adida
>>> ben@mit.edu
>>>
>>> [1] Discussion during last segment of January 10th TF
>>> telecon: http://www.w3.org/2006/01/10-swbp-minutes
>>>
>>> [2] Discussion, at beginning, of Mark's new examples during January
>>> 17th TF telecon:
>>> http://www.w3.org/2006/01/17-swbp-minutes
>>>
>>>
>
>
>--
>---------------------------------------------------------------------
>IHMC                   (850)434 8903 or (650)494 3973   home
>40 South Alcaniz St.   (850)202 4416   office
>Pensacola              (850)202 4440   fax
>FL 32502               (850)291 0667   cell
>phayesAT-SIGNihmc.us   http://www.ihmc.us/users/phayes

--
---------------------------------------------------------------------
IHMC                   (850)434 8903 or (650)494 3973   home
40 South Alcaniz St.   (850)202 4416   office
Pensacola              (850)202 4440   fax
FL 32502               (850)291 0667   cell
phayesAT-SIGNihmc.us   http://www.ihmc.us/users/phayes

Received on Thursday, 26 January 2006 21:27:54 UTC