Re: please review issue-57 document draft before Tuesday telcon from David Booth on 2011-03-21 (public-awwsw@w3.org from March 2011)

From: David Booth <david@dbooth.org>
Date: Mon, 21 Mar 2011 17:17:45 -0400
To: Jonathan Rees <jar@creativecommons.org>
Cc: nathan@webr3.org, AWWSW TF <public-awwsw@w3.org>
Message-ID: <1300742265.1954.109482.camel@dbooth-laptop>

On Fri, 2011-03-18 at 12:59 -0400, Jonathan Rees wrote:
> On Fri, Mar 18, 2011 at 12:06 PM, David Booth <david@dbooth.org> wrote:
> > On Wed, 2011-03-16 at 07:52 -0400, Jonathan Rees wrote:
> >> I guess my question is: Under "take it at face value," many URIs are
> >> supposed to be taken to refer to WS(u). So, under what set of
> >> circumstances, if any, will an automaton be able to detect that URI is
> >> supposed to refer to IR(u)?
> >
> > I like the way you're putting this -- that the criterion is to enable an
> > automaton to make the detection -- but I think the way you've stated it
> > reflects too strong an RDF bias.  The vast majority of URIs on the web
> > -- easily 99%+ -- arguably refer to IR(u), not WS(u), so I think it
> > would make more sense to state the goal with the opposite default:
> > [[
> > Under "take it at face value," many URIs are
> > supposed to be taken to refer to IR(u).
> 
> You're talking not about "take it at face value" but about some hybrid
> approach in which some u refer to IR(u) and some don't.  Pure "face
> value" would mean read the document and see what it says about its own
> URI. If it says nothing about its own URI, you know nothing. That's
> not the same as knowing that u is meant to refer to IR(u) in some
> (most) cases.

Ah, I see.  Yes, I misunderstood the intent of the "take it at face
value" approach.

> 
> Maybe I wasn't clear - or maybe I ought to redefine what "face value"
> means in the report - but then the sometimes-IR(u) sometimes-WS(u)
> case is really quite different from always-WS(u), no matter how the
> partition is calculated.
> 
> > So, under what set of
> > circumstances, if any, will an automaton be able to detect that URI u is
> > supposed to refer to WS(u)?
> > ]]
> 
> My point is that the introduction of "face value" for *any*
> dereferenceable URIs creates FUD around *all* dereferenceable URIs.
> That's why I'd prefer to say that IR(u) is the exception to a WS(u)
> default rule. 

I think I understand what you intended now.  And I agree that the "face
value" approach is problematic.

> Another reason to put it this way is because proponents
> of "face value" have pretty much put it this way by saying "we don't
> need no metadata". That is, to them, the IR(u) case never happens in
> situations of importance to LD. (This must mean that @href is not
> referential; and I would be willing to give that to them, not because
> I like it but because it's a completely defensible position.)
> 
> In the end I don't think it matters much, but I'll consider whether
> flipping it works better, 

I don't think there is a need to flip it, now that I understand what
your intent behind "face value" is.  I just don't think it is a viable
option, but that doesn't mean that it shouldn't be listed.

> or doing something more symmetric. I guess
> there are three cases: all robots can detect IR(u), all robots can
> detect WS(u), and gray area where some can tell them apart, and some
> can't.
> 
> > But I think the bigger problem occurs if you implicitly assume that
> > IR(u) is not the same as WS(u).
> 
> If I ever assume that, even implicitly, then there is a mistake in the
> report. Please give me specific instances. My guess is that you are
> drawing invalid conclusions; but so would others, and they have to be
> averted early.
> 
> You can always establish that IR(u) = WS(u) just by making the content
> be the statement <u> :accessibleVia "u".
> 
> > In other words, the definition of :IR
> > MUST NOT say that it is disjoint with anything.
> 
> I never say this. All I say is that most IR(u)'s will make good
> metadata subjects and so metadata statements will be true of them. Let
> people make of this what they will.
> 
> Will try to clarify.

Okay.

> 
> >  (And lest anyone
> > complain that if :IR is not disjoint with anything, then we have said
> > nothing useful about :IR, I will remind them that IsProvable(P) is not
> > the same as !IsProvable(!P).  Knowing that something has been "tagged"
> > as an :IR may indeed be useful to some applications.)
> >
> > I think a way to state the chimera case
> > http://www.w3.org/2001/tag/awwsw/issue57/latest/#id35999
> > is to say that an HTTP GET of u returning a 200 status implies:
> >
> >  <u> a :IR ; log:uri "u" .       # Assertion set gh
> 
> With the :accessibleVia predicate there is no need to provide the type
> assertion. *Any* unnecessary use of IR as a type (which I think is all
> uses) has to be avoided, since otherwise the case for IRs is weakened
> by an appearance of decree.

But that just sounds like you're saying the domain of :accessibleVia is
already known to be :IR.

Even if you don't explicitly mention :IR, the use of :accessibleVia
implicitly defines a class of all resources ?r such that there exists
a ?u such that:

 ?r :accessibleVia ?u .

So you might as well give it a name :IR.  
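
To illustrate, here is a minimal Python sketch of how that class falls out
of the mere use of the predicate.  It is only an illustration under stated
assumptions: the triples, the example.org URIs and the helper name
implied_class are all hypothetical, and nothing here is proposed vocabulary.

```python
# Triples are (subject, predicate, object) tuples.
# Every URI and name below is a hypothetical example.
TRIPLES = {
    ("http://example.org/doc1", ":accessibleVia", "http://example.org/doc1"),
    ("http://example.org/doc2", ":accessibleVia", "http://example.org/doc2"),
    ("http://example.org/toucan", "rdf:type", ":Toucan"),
}


def implied_class(triples, predicate):
    """Return every ?r for which some (?r, predicate, ?u) triple exists."""
    return {s for (s, p, _o) in triples if p == predicate}


# The set of subjects of :accessibleVia is a class whether or not it is named.
ir = implied_class(TRIPLES, ":accessibleVia")
```

Whether that set is called :IR or left anonymous, it is the same set, which
is the point being made above.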

> 
> >
> > Let's call this set of assertions "gh" (graph implied by the HTTP
> > response).
> >
> > Then, if the document retrieved from u happens to be interpretable as
> > RDF, that RDF might further say something like:
> >
> >  <u> a :Toucan .                 # Assertion set gd
> >
> > Let's call this set of assertions "gd" (graph expressed in the URI
> > declaration, or "account").
> >
> > But is :IR owl:disjointWith :Toucan?  Not necessarily.
> 
> That is not the question on the table. The question is whether any
> member of :Toucan makes a good metadata subject - can they have topics
> and authors and so on. That is really up to whoever is trying to talk
> about :Toucans.
> 
> I don't know if you noticed but my IR theory really is strong enough
> to let you conclude quite a bit about deployed IRs - it lets you write
> metadata. Different metadata in each case, to be sure. The metadata in
> itself may be enough to get someone to decide to mint a new URI rather
> than use the one for the IR. In fact we don't really have anything
> else to offer. Knowing something is an IR is completely useless, since
> nothing follows from that, 

That is not true at all!  To reiterate:
[[
IsProvable(P) is not
the same as !IsProvable(!P).  Knowing that something has been "tagged"
as an :IR may indeed be useful to some applications.
]]

In particular, it may be useful when they start adding other properties
or constraints on :IR.
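
As a rough illustration of the distinction, here is a small Python sketch.
It assumes a deliberately naive notion of "provable" (a statement counts as
provable only if it was explicitly asserted), and the tuple encoding and
function names are hypothetical, chosen only for this example.

```python
def is_provable(facts, statement):
    """Naive stand-in for provability: true only if explicitly asserted."""
    return statement in facts


def negate(statement):
    """Flip an ("is" | "not", subject, class) statement to its negation."""
    kind, subject, cls = statement
    return ("not" if kind == "is" else "is", subject, cls)


# All that is known: <u> has been tagged as an :IR.  No disjointness is known.
FACTS = {("is", "<u>", ":IR")}

p = ("is", "<u>", ":Toucan")

p_provable = is_provable(FACTS, p)                  # is P provable?
not_p_provable = is_provable(FACTS, negate(p))      # is !P provable?

# An application can still make use of the tag by itself:
tagged_as_ir = {s for (kind, s, cls) in FACTS if kind == "is" and cls == ":IR"}
```

Here neither P nor !P is provable, so !IsProvable(!P) holds while
IsProvable(P) does not, and the :IR tag is still usable for selecting <u>.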

> but metadata is useful, and it follows from
> what you GET (in aggregate).
> 
> The chimera case is simply that IR(u) = WS(u). 
> The consequences of
> that can be worked out in many different ways. I thought I covered
> them pretty well with my three points. 

The three "Ways that this can fail" that you list nicely highlight some
of the challenges that users can face.  (And BTW, #1 and #2 are just
different expressions of the exact same issue: ambiguity of reference.)
These challenges are not unique to the chimera approach, BTW.  They
would show up in any approach, though perhaps in a different form.  

But in any case, the biggest issue I see is that the proposed solution
needs to have much more detail, such as about exactly what graph is
being analyzed.  I made several suggestions (using terms gh, gd and ga
for three kinds of RDF graphs), but you seem to have ignored them.
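
To show the kind of detail I mean, here is a minimal Python sketch.  It
takes gh as the assertions implied by the HTTP 200 response, gd as those in
the URI declaration (or "account"), and ga as ancillary statements from
someone else.  The tuple encoding and the function names merge and
disjointness_clashes are assumptions made for illustration, and the check
covers only explicit type and owl:disjointWith assertions, not real OWL
inference.

```python
GH = {("<u>", "rdf:type", ":IR"), ("<u>", "log:uri", "u")}  # from the HTTP 200
GD = {("<u>", "rdf:type", ":Toucan")}                       # URI declaration
GA = {(":IR", "owl:disjointWith", ":Toucan")}               # ancillary


def merge(*graphs):
    """Merge graphs by plain set union of their triples."""
    return set().union(*graphs)


def disjointness_clashes(graph):
    """Return (resource, class_a, class_b) for each resource that is
    explicitly typed with two classes explicitly declared disjoint."""
    types = {}
    for s, p, o in graph:
        if p == "rdf:type":
            types.setdefault(s, set()).add(o)
    clashes = set()
    for a, p, b in graph:
        if p == "owl:disjointWith":
            for resource, classes in types.items():
                if a in classes and b in classes:
                    clashes.add((resource, a, b))
    return clashes


all_three = disjointness_clashes(merge(GH, GD, GA))
gh_gd_only = disjointness_clashes(merge(GH, GD))
gh_ga_only = disjointness_clashes(merge(GH, GA))
```

Under these assumptions only the consumer that merges all three graphs sees
a clash on <u>; consumers merging gh with gd, or gh with ga, see none.  So
which graph is being analyzed determines the outcome.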

> Maybe I missed something, or
> wasn't clear - I can always use help being more complete and clear.
> 
> I have thought about trying to talk about interoperability in terms of
> graph merging and consistency checking (= construction of satisfying
> interpretations).  This is sort of weak since it's so hard to get an
> inconsistency in RDF and most people won't have a clue what's being
> discussed. It would be a lot easier with OWL but that would not make
> us popular. I think it's better just to appeal to the reader's common
> sense, especially since the main audience is people who [say they]
> don't care about inference.

I'd be wary of only appealing to common sense.  The more specific and
formal you can be, the better.  At present, many people's "common sense"
is leading them to believe the four myths I list here:
http://dbooth.org/2010/ambiguity/paper.html#part0

> 
> Examples will help. But I think introducing the "declaration" and
> "owner" ideas or anything like them will very much weaken the report -
> and I say this not just because I don't believe them, but because of
> Occam's razor. People already have ideas of how all this works, and I
> want to enlist and refine their existing understanding, not throw
> something new at them. 

The concepts of "URI owner" and "URI declaration" are hardly new.  URI
ownership was described in AWWW drafts at least as early as 2003
http://www.w3.org/TR/2003/WD-webarch-20031209/#uri-ownership 
and the term "URI declaration" was coined in 2007, though the concept
dates from "follow your nose" discussions going back to at least 2002
http://lists.w3.org/Archives/Public/www-webont-wg/2002Oct/0162.html .

> Remember the only purpose here is to start the
> issue 57 conversation going.
> 
> Hey, here's an idea: rename "information resource" to "metadata
> subject" throughout?

I think that would hinder understanding more than help.  As I stated
before, I think it is helpful to reuse existing terms.  If you need to
slightly redefine a term for a particular purpose within the document,
then that's fine, but the document is easier to follow if the term is at
least evocative of its ancestry.

David

> 
> Jonathan
> 
> > But suppose
> > someone else supplies another set of assertions "ga" (graph of ancillary
> > statements):
> >
> >  :IR owl:disjointWith :Toucan .  # Assertion set ga
> >
> > Now an RDF consumer that chooses to merge graphs gh, gd and ga will
> > obviously have a problem, because the result will be self-contradictory.
> > It will either need to forgo some of these assertions, or it will need
> > to split the identity of <u>.
> > http://dbooth.org/2007/splitting/
> >
> > On the other hand a different RDF consumer may happily choose to merge
> > only graphs gh and gd, thus allowing <u> to (ambiguously) denote
> > something that is both an :IR and a :Toucan in the merged graph.
> >
> > Furthermore, a third RDF consumer may happily choose to merge only
> > graphs gh and ga, thus viewing <u> as only an :IR.
> >
> > This brings us into the whole question of which assertions *should* be
> > used in what circumstances, i.e., what should these graphs contain,
> > which of them should be merged by whom under what circumstances, and
> > what should be done if there are contradictions?
> >
> > There are several questions:
> >
> > 1. Precisely what assertions should be included in gh, the graph implied
> > by the HTTP 200 response?  The n3 rules at
> > http://www.w3.org/wiki/AwwswDboothsRules
> > were intended as a first crack toward nailing this down.
> >
> > 2. What responsibilities does a URI owner have in minting a new URI,
> > configuring his/her server, and hosting a URI declaration (or
> > "account")?  Many people have talked about this, including the "Cool
> > URIs for the semantic web" document:
> > http://www.w3.org/TR/cooluris/
> > and my papers on URI declarations:
> > http://dbooth.org/2007/uri-decl/
> > and "The URI Lifecycle in Semantic Web Architecture":
> > http://dbooth.org/2009/lifecycle/
> >
> > 3. If a URI owner does host a URI declaration (or "account"), what
> > assertions should it include or avoid?  For example, if gh asserts that
> > <u> is an :IR, should the URI declaration (or "account") include a
> > disjointness assertion like "<u> owl:disjointWith :Toucan" if <u> is
> > *only* intended to denote a toucan?  This has mostly been only vaguely
> > addressed in the past, however my paper on "The URI Lifecycle in
> > Semantic Web Architecture"
> > http://dbooth.org/2009/lifecycle/
> > does propose that the URI declaration (or "account") should not contain
> > any assertions that would cause the transitive closure of URI
> > declarations (i.e., the ontological closure) to be contradictory.
> >
> > 4. What should be the RDF statement author's responsibilities, in
> > writing a graph of statements involving <u>, to help ensure that the
> > intended "meaning" of <u> will be understood by an RDF consumer?  The
> > URI Lifecycle paper also proposed an answer to this question.
> > http://dbooth.org/2009/lifecycle/
> >
> > 5. What process and graphs should the RDF consumer use, in attempting to
> > determine the referent of a URI?  The URI Lifecycle paper proposes an
> > (initial) answer to this
> > http://dbooth.org/2009/lifecycle/
> > and the paper on "Resource Identity and Semantic Extensions: Making
> > Sense of Ambiguity"
> > http://dbooth.org/2010/ambiguity/paper.html
> > goes into much more detail on the proposed process or algorithm.
> >
> > All of these questions need more work before we can hope to reach
> > community consensus, but we do at least have some starting points.
> >
> >
> >
> > --
> > David Booth, Ph.D.
> > http://dbooth.org/
> >
> > Opinions expressed herein are those of the author and do not necessarily
> > reflect those of his employer.
> >
> >
> 
> 

-- 
David Booth, Ph.D.
http://dbooth.org/

Opinions expressed herein are those of the author and do not necessarily
reflect those of his employer.
Received on Monday, 21 March 2011 21:18:19 UTC