RE: [ALL] RDF/A Primer Version from Miles, AJ (Alistair) on 2006-01-25 (public-rdf-in-xhtml-tf@w3.org from January 2006)

From: Miles, AJ (Alistair) <A.J.Miles@rl.ac.uk>
Date: Wed, 25 Jan 2006 19:30:54 -0000
To: "Pat Hayes" <phayes@ihmc.us>, "Booth, David (HP Software - Boston)" <dbooth@hp.com>
Cc: "Ben Adida" <ben@mit.edu>, "SWBPD list" <public-swbp-wg@w3.org>, "public-rdf-in-xhtml task force" <public-rdf-in-xhtml-tf@w3.org>
Message-ID: <677CE4DD24B12C4B9FA138534E29FB1D0CC4D0@exchange11.fed.cclrc.ac.uk>
Pat Hayes said:

<quote>
[4] has a clear and explicit description (at
http://www.w3.org/TR/webarch/#indirect-identification
) of a condition which seems to apply almost
perfectly to the situation which arises in RDF/A
and which Alistair deplores, and which is
correctly described as not constituting a URI
collision. Using the same name to refer both to a
thing, and to a piece of a document which itself
refers to the same thing, seems clearly to be an
example of indirect reference. As [4] says,
somewhat pithily, "Identifiers are commonly used
in this way."
</quote>

I understood [4] to be referring to 'indirect identification' as expressed in RDF via properties of type owl:InverseFunctionalProperty. I.e. the following triple:

_:aaa foaf:homepage <http://jo-lamda.blogspot.com/>.

... uses the URI <http://jo-lamda.blogspot.com/> to 'indirectly identify' the blank node _:aaa because the property foaf:homepage is declared by the FOAF ontology [1] to be an inverse functional property.
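
To make that reading concrete, here is a minimal Python sketch of how a consumer might treat an inverse functional property: any two nodes that share a foaf:homepage value are taken to denote the same thing. The node _:bbb, the node _:ccc with its example.org URI, and the function name ifp_aliases are my own illustrative assumptions, not anything defined by FOAF [1] or by [4].

```python
# Minimal sketch: grouping nodes that share a value for an
# inverse functional property (IFP). All node names are illustrative.
FOAF_HOMEPAGE = "foaf:homepage"

# Assumption: the consumer has already read from the ontology that
# foaf:homepage is declared an owl:InverseFunctionalProperty.
INVERSE_FUNCTIONAL = {FOAF_HOMEPAGE}


def ifp_aliases(triples):
    """Return sets of subjects that share the same (IFP, object) pair."""
    groups = {}
    for s, p, o in triples:
        if p in INVERSE_FUNCTIONAL:
            groups.setdefault((p, o), set()).add(s)
    # Only groups with more than one subject tell us anything new.
    return [subjects for subjects in groups.values() if len(subjects) > 1]


triples = [
    ("_:aaa", FOAF_HOMEPAGE, "http://jo-lamda.blogspot.com/"),
    ("_:bbb", FOAF_HOMEPAGE, "http://jo-lamda.blogspot.com/"),  # hypothetical second node
    ("_:ccc", FOAF_HOMEPAGE, "http://example.org/other/"),      # hypothetical unrelated node
]

aliases = ifp_aliases(triples)
print(aliases)
```

Note that in this sketch the homepage URI itself still denotes only the document; it is the blank nodes that get identified through it.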

If this is indeed the intended meaning of 'indirect identification' at [4] then I strongly suggest the RDF/A primer does NOT use the term 'indirect identification' to refer to the practice of using URIs to denote both a piece of XML (effectively a part of a document) and an entity in the 'real world' (e.g. a person).

See also related email [2].

Pat Hayes said:

<quote>
It is impossible, both practically and
theoretically, to completely avoid all ambiguity
in using referential names. Reference is not
access. While URLs must be unambiguous locators,
in the sense of resolving unambiguously to a
particular Web resource, referential names -
which is how URI references are used in RDF -
cannot possibly be specified so exactly as to
refer uniquely and unambiguously in all
circumstances. Even globally recognizable proper
names like "Mount Everest" do not have unique
referents in all possible circumstances, since
the exact referent depends on the ontological
framework being mutually assumed (Where is the
exact edge of a mountain? Are we talking about
people as agents or as medical cases? At a
particular time or as endurants? etc..) Under
these circumstances, to view every referential
ambiguity as a Bad Thing is about as useful as
trying to stamp out breathing.

Like words in human language, URIs can be safely
overloaded under conditions which allow possible
misunderstandings to be securely resolved by
their local context, without requiring
negotiation: and this need not even require that
the resolution be actually done, provided that
the necessary context - which, in the case under
discussion, is likely to be the ontology
identified by the root URI of the RDF property -
can be accessed when required. In English we
safely use "bank" to refer to a side of a river,
a turning motion or a building, in part because
these meanings are so divergent that the
ambiguity can almost always be immediately
resolved by the immediate context. Similarly, an
email address can be safely used to refer to its
owner in part because almost anything that can be
coherently said about a person could not possibly
apply to an email account, and vice versa. Even
the use of a literal string in a context which
requires a reference to a named agent can be
interpreted as making sense, since it clearly
requires a coercion, and it would be natural to
use the string as a referring name. Whether or
not this is in some fundamental sense 'correct'
or 'proper' is not worth discussing: what matters
is only that a community of agents all agree to
use the same kind of coercion strategy when it is
required, which allows strings to be used to
refer to agents; and to the extent they do, then
they thereby become genuinely referring names.
This is how the world comes to use language, both
in the large and in the small
(http://www.economist.com/science/displayStory.cfm?story_id=5135495).
</quote>

OK. Tell me what 'local context' is exactly. How do I as a publisher ensure that sufficient 'context' is available for the applications I intend to support? What about unforeseen applications? As a consuming application, how do I get at the 'context', and how do I use it to resolve ambiguities? Where are these issues addressed in current specifications?

Surely it is good practice for publishers to clearly understand how and when ambiguities can arise, to be aware of each and every action that could lead to ambiguity, and to undertake such actions in full knowledge of the consequences. Surely it is also good practice for publishers, in the majority of cases, to design systems that do not lead to ambiguity, or that minimise the potential for ambiguity, because in doing so they simplify the management of change and increase the ease with which their data can be repurposed in unforeseen contexts? I.e. by acting to minimise the potential for ambiguity, a publisher increases the value of its published data, because the data is more portable.

A practical question: If I operate under the assumption that the same URI will commonly be used to denote both a person and their home page, doesn't this make the notion of logical consistency effectively useless? Don't domains and ranges become effectively useless also?

E.g. if I have:

<http://jo-lamda.blogspot.com/> foaf:mbox <mailto:jo.lambda@example.org>.

... and I also have:

_:aaa foaf:homepage <http://jo-lamda.blogspot.com/>.

... then via the domain of foaf:mbox and the range of foaf:homepage I may conclude:

<http://jo-lamda.blogspot.com/> a foaf:Agent, foaf:Document.

What is the usefulness of this new information?
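
The inference above can be sketched in a few lines of Python. This is only an illustration under the assumptions stated in this message (the domain of foaf:mbox is foaf:Agent, the range of foaf:homepage is foaf:Document); the two dictionaries and the function name infer_types are hypothetical and not part of any RDF library.

```python
# Minimal sketch of rdfs:domain / rdfs:range entailment over plain tuples.
# Assumed schema facts, taken from the example in this message:
DOMAIN = {"foaf:mbox": "foaf:Agent"}        # subject of foaf:mbox is an Agent
RANGE = {"foaf:homepage": "foaf:Document"}  # object of foaf:homepage is a Document


def infer_types(triples):
    """Map each node to the set of classes implied by domain and range."""
    types = {}
    for s, p, o in triples:
        if p in DOMAIN:
            types.setdefault(s, set()).add(DOMAIN[p])
        if p in RANGE:
            types.setdefault(o, set()).add(RANGE[p])
    return types


HOMEPAGE = "<http://jo-lamda.blogspot.com/>"
triples = [
    (HOMEPAGE, "foaf:mbox", "<mailto:jo.lambda@example.org>"),
    ("_:aaa", "foaf:homepage", HOMEPAGE),
]

types = infer_types(triples)
print(sorted(types[HOMEPAGE]))
```

Nothing in this sketch flags the result as inconsistent. A reasoner would only do so if it were also told that the two classes are disjoint, which is a further assumption on top of the domain and range declarations.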

Cheers,

Al.

[1] http://xmlns.com/foaf/0.1/ 
[2] http://lists.w3.org/Archives/Public/public-swbp-wg/2006Jan/0145.html 
[4] http://www.w3.org/TR/webarch/#indirect-identification 


-----Original Message-----
From: public-rdf-in-xhtml-tf-request@w3.org on behalf of Pat Hayes
Sent: Wed 25/01/2006 05:30
To: Booth, David (HP Software - Boston)
Cc: Ben Adida; SWBPD list; public-rdf-in-xhtml task force
Subject: RE: [ALL] RDF/A Primer Version
 

>I hate to say this, but I think the URI identity issues that Alistair
>raised in email[3] after yesterday's teleconference are important enough
>to delay publication until they are either fixed or visibly marked as
>problems.  The WebArch document is clear that URI collisions[4] are A
>Bad Thing.  It would seem wrong to endorse such collisions, even
>implicitly.

I beg to differ.

[4] has a clear and explicit description (at 
http://www.w3.org/TR/webarch/#indirect-identification 
) of a condition which seems to apply almost 
perfectly to the situation which arises in RDF/A 
and which Alistair deplores, and which is 
correctly described as not constituting a URI 
collision. Using the same name to refer both to a 
thing, and to a piece of a document which itself 
refers to the same thing, seems clearly to be an 
example of indirect reference. As [4] says, 
somewhat pithily, "Identifiers are commonly used
in this way."

It is impossible, both practically and 
theoretically, to completely avoid all ambiguity 
in using referential names. Reference is not 
access. While URLs must be unambiguous locators, 
in the sense of resolving unambiguously to a 
particular Web resource, referential names - 
which is how URI references are used in RDF - 
cannot possibly be specified so exactly as to 
refer uniquely and unambiguously in all 
circumstances. Even globally recognizable proper 
names like "Mount Everest" do not have unique 
referents in all possible circumstances, since 
the exact referent depends on the ontological 
framework being mutually assumed (Where is the 
exact edge of a mountain? Are we talking about 
people as agents or as medical cases? At a 
particular time or as endurants? etc..) Under 
these circumstances, to view every referential 
ambiguity as a Bad Thing is about as useful as 
trying to stamp out breathing.

Like words in human language, URIs can be safely 
overloaded under conditions which allow possible 
misunderstandings to be securely resolved by 
their local context, without requiring 
negotiation: and this need not even require that 
the resolution be actually done, provided that 
the necessary context - which, in the case under
discussion, is likely to be the ontology
identified by the root URI of the RDF property -
can be accessed when required. In English we 
safely use "bank" to refer to a side of a river, 
a turning motion or a building, in part because 
these meanings are so divergent that the 
ambiguity can almost always be immediately 
resolved by the immediate context. Similarly, an 
email address can be safely used to refer to its 
owner in part because almost anything that can be 
coherently said about a person could not possibly 
apply to an email account, and vice versa. Even 
the use of a literal string in a context which 
requires a reference to a named agent can be 
interpreted as making sense, since it clearly 
requires a coercion, and it would be natural to 
use the string as a referring name. Whether or 
not this is in some fundamental sense 'correct' 
or 'proper' is not worth discussing: what matters 
is only that a community of agents all agree to 
use the same kind of coercion strategy when it is 
required, which allows strings to be used to 
refer to agents; and to the extent they do, then 
they thereby become genuinely referring names. 
This is how the world comes to use language, both 
in the large and in the small 
(http://www.economist.com/science/displayStory.cfm?story_id=5135495).

I suggest that if current real-world usage of a 
metadata vocabulary seems to be causing no actual 
operational problems, it might be better to study 
this real-world usage carefully with a view to 
learning something about how symbols actually are 
being used on the Web, than to set out to take 
great pains to improve it.

In the meantime, I also suggest that RDF/A might 
usefully use the term "indirect identification" 
to point out that subjects of RDF triples can 
both be pieces of XML markup and also refer to 
entities in the real world, and that this need 
not be deplored as harmful ambiguity.

Pat Hayes

>David Booth
>
>[3] Identity issues raised by Alistair:
>http://lists.w3.org/Archives/Public/public-swbp-wg/2006Jan/0113.html
>[4] TAG's Web Architecture:
>http://www.w3.org/TR/webarch/#URI-collision
>
>
>>  -----Original Message-----
>>  From: public-swbp-wg-request@w3.org
>>  [mailto:public-swbp-wg-request@w3.org] On Behalf Of Ben Adida
>>  Sent: Tuesday, January 24, 2006 12:03 PM
>>  To: SWBPD list
>>  Cc: public-rdf-in-xhtml task force
>>  Subject: [ALL] RDF/A Primer Version
>>
>>
>>
>>
>>  Hi all,
>>
>>  I made a mistake in the version of the RDF/A Primer that I presented 
>>  at the telecon yesterday. I have just finished uploading the right 
>>  version, which you can find here:
>>
>>  http://www.w3.org/2001/sw/BestPractices/HTML/2006-01-24-rdfa-primer
>>
>>  With the WG and specifically the reviewers' approval (DBooth, GaryNg,
>>  and also "unofficial" reviewers), I am hoping that we can rapidly
>>  agree that this latest version should be the one that becomes our
>>  first published WD.
>>
>>  The only difference in content is that the new version has an extra
>>  section (section #2), and the old sections 2 and 3 are merged into
>>  the new section 3 for purely organizational purposes (no text is lost
>>  or added in those sections, just reorganized.) The point of the new
>>  section 2 is to add an even simpler introductory example. We believe
>>  this additional section is in line with the comments we received from
>>  reviewers, both official and earlier, unofficial reviews. In fact, we
>>  began writing it in part to respond to some of these early comments 2
>>  weeks ago.
>>
>>  The already-approved version is still at the old URL for comparison:
>>  http://www.w3.org/2001/sw/BestPractices/HTML/2006-01-15-rdfa-primer
>>
>>  I want to stress that this is entirely *my* mistake: the TF had 
>>  agreed [1,2] that this second version would be presented to the WG
>>  yesterday, and I simply forgot. Publishing these additional examples
>>  now is quite important for getting the word out about RDF/A and
>>  making it competitive against other metadata inclusion proposals,
>>  outside of W3C, that are gaining traction.
>>
>>  Apologies for my mistake. I hope you'll see that these edits do not
>>  constitute a substantive change to the document, rather they help
>>  make the same points more appealing to and understandable by a larger
>>  audience.
>>
>>  -Ben Adida
>>  ben@mit.edu
>>
>>  [1] Discussion during last segment of January 10th TF
>>  telecon: http://www.w3.org/2006/01/10-swbp-minutes
>>
>>  [2] Discussion, at beginning, of Mark's new examples during January 
>>  17th TF telecon:
>>  http://www.w3.org/2006/01/17-swbp-minutes
>>
>>


-- 
---------------------------------------------------------------------
IHMC		(850)434 8903 or (650)494 3973   home
40 South Alcaniz St.	(850)202 4416   office
Pensacola			(850)202 4440   fax
FL 32502			(850)291 0667    cell
phayesAT-SIGNihmc.us       http://www.ihmc.us/users/phayes
Received on Wednesday, 25 January 2006 19:31:07 UTC