Re: New draft of section 5.5

On Mon, 2011-04-11 at 15:45 -0400, Jonathan Rees wrote:
> On Mon, Apr 11, 2011 at 12:37 PM, David Booth <david@dbooth.org> wrote:
> > General comments on
> > http://www.w3.org/2001/tag/awwsw/issue57/latest/
> >
> > 1. The scope of the document looks considerably more focused now, such
> > that I think I have a much clearer understanding now of your intent for
> > this document.  Kudos!  If I am correctly understanding your intent,
> > there is no need for the document to get into the issue of a URI's
> > meaning, because the document is only about the conventions for
> > publishing and locating the URI owner's intended *definition* of the URI
> > -- not how that definition is interpreted or what is done with it.  Does
> > that sound correct?
> >
> > 2. The abstract looks clearer now.  But I think it is important that the
> > abstract set the context as being primarily RDF, since that is the
> > context that is primarily motivating the issue.  The question does not
> > arise in the same way or to the same degree on the conventional web.
> >
> > 3.
> > [[
> > Two contexts of special interest to this report are in natural language
> > (e.g. "The W3C home page is 'http://www.w3.org/'"), and in declarative
> > languages such as RDF and OWL.
> > ]]
> > I don't think we should try to address the natural language case.  It is
> > the RDF case that is motivating this, and it will be difficult enough
> > addressing that without adding a new can of worms by trying to address
> > the natural language case.
> 
> I explained this to you before. I have explicit direction from the
> TAG, with which I concur, to make this a webarch issue, not an RDF
> issue.
> 
> Maybe you can suggest wording that says that RDF is the motivating use
> case, while making it very clear that it is not the only one - it's a
> matter of web architecture, not something special to RDF.

The latest wording seems to address this already: ". . . notably the
Semantic Web and Linked Data."  This looks fine to me.

> 
> > 4. Malformed sentence:
> > [[
> > Definition discovery is not the same as Web dereference, however, since
> > dereferencing a URI gives you information - i.e. the document, image,
> > etc. specified by the URI - not necessarily related to defining
> > anything.
> > ]]
> 
> It's perfectly well formed, but perhaps overly complicated and hard to
> parse.  I will add "that is" before "not", and flag it for further
> work (I'm not sure how to say this).

Oh, I see now how you intended it.  Yes, I think adding "that is" makes
it easier to parse.  Thanks.

> 
> > 5. I think this needs to be reworded:
> > [[
> > In theory dereference could play a role in explaining the meaning of a
> > dereferenceable URI (see 5.6 'Hashless' URI dereferences to its
> > definition (incompatibly)), but this is not generally done at present,
> > ]]
> > because dereference *does* play a role at present.  In
> > particular, people use dereference to conclude that something is an IR.
> 
> Who does this? Who would care about that?
> 
> If I wanted to see whether something was an information resource, I'd
> consider its properties and ask if it had the sort of properties that
> an information resource has. But usually I would care about the
> properties themselves, not about any kind of classification.
> 
> I can improve this, though: instead of "play a role in explaining",
> "provide information that helps explain"
> 
> and I'll insert 'hashless'.

The part of the sentence that seems misleading to me is the part that
says "but this is not generally done at present".  The httpRange-14 rule
*is* used at present to find out if a URI refers to an IR when that URI
yields a 200 status code.  That certainly seems to me to qualify as
providing "information that helps explain the meaning of a
dereferenceable URI".  For example, I believe TimBL said quite a while
back that Tabulator does this already.

I suggest rewording this:
[[
In theory dereference could play a role in explaining the meaning of a
dereferenceable URI (see 5.6 'Hashless' URI dereferences to its
definition (incompatibly)), but this is not generally done at present,
since a dereferenceable URI refers to the information resource
accessible via that URI, not to what that information resource defines
or describes (see 7.3 Using a URI to refer to the information resource
accessible via that URI).
]]

as:
[[
Dereference currently plays a role in explaining the meaning of a
dereferenceable (hashless) URI, but only to the extent of indicating (by
the httpRange-14 rule) that the URI refers to an information resource
accessible via that URI.  (See 7.3 Using a URI to refer to the
information resource accessible via that URI.)  Section 5.6 'Hashless'
URI dereferences to its definition (incompatibly) describes how
dereference might also be used to obtain the definition of a URI that is
not intended to refer to an information resource.
]]
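
To make the rule concrete, here is a minimal sketch of the inference as
I am assuming it in the rewording above.  The function name and the
returned labels are my own, purely for illustration; they do not come
from any spec or library, and the sketch says nothing about *which*
information resource is meant.

```python
def httprange14_conclusion(uri: str, status: int) -> str:
    """Return what a client may conclude about the referent of `uri`
    from the status code of an HTTP GET on it (hypothetical helper)."""
    if "#" in uri:
        # The rule is stated for hashless URIs only; a hash URI is
        # outside its scope, so draw no conclusion.
        return "not applicable"
    if 200 <= status <= 299:
        # 2xx: the URI refers to an information resource.
        return "information resource"
    if status == 303:
        # 303 See Other: the URI could refer to any resource at all.
        return "any resource"
    # Anything else (e.g. 4xx): the nature of the resource is unknown.
    return "unknown"
```

So a 200 on http://example/p16 yields "information resource", while a
303 leaves the question open until a definition is found some other
way.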


> 
> > 6. Very good:
> > [[
> > The reason we define definition discovery methods is interoperability -
> > so that everyone gets the same definition of each URI.
> > ]]
> > HOWEVER, the idea of everyone using the same definition is *not*
> > universally accepted in the SW community.  I particularly remember
> > Michel Dumontier using URIs that specifically lacked universal
> > definitions, such that the user could choose his/her desired definition.
> > Perhaps we should acknowledge this as a dissenting view.
> 
> It's not really a disagreement, it's about different goals. I think
> the statement that common definition helps interoperability is fine,
> it's just that not everyone is interested in interoperability. In fact
> this is explicitly stated at the end of the paragraph, so I'm not sure
> what more needs to be said.  Maybe it's not just about communities,
> but about individuals? - although an individual doesn't communicate
> with itself so I'm not sure how the issue would even arise.

Okay, sounds reasonable.

> 
> Michel's fooling himself, since there *is* common meaning to the URIs
> he's working with; if there weren't then he wouldn't be bothering with
> them at all as it would be a waste of time. I think he's just saying
> that he doesn't want to write down the particular way in which his
> URIs are being used.

I agree.

> 
> > 7. If we're preparing this for a wider audience, maybe we should revert
> > to the term "representation" instead of "version" throughout, since
> > that's what AWWW uses.
> 
> I explained to you earlier why we can't do this: it is the lesser of
> two evils. The greater evil would be saying that metadata properties
> such as DC, RDFS, CCREL operate on a union class of information
> resource + representation. I prefer to say that information resources
> are metadata subjects, and that we have fixed information resources
> that are like representations but work as metadata subjects.

But this document no longer needs to get into those issues if it merely
focuses on the *mechanics* of providing and obtaining URI definitions,
rather than getting into the issue of trying to determine the "meaning"
of the URI.

> 
> If you can get TimBL to agree that representations are information
> resources, which IMO is the correct approach, then I might be
> persuaded that "representation" is OK, although all my objections to
> the term would remain.
> http://odontomachus.wordpress.com/2011/03/07/are-you-confused-yet-about-the-word-representation/
> 
> Nathan made an explicit plea *not* to use "representation".

I would urge both you and Nathan to scan the current draft of 
http://www.w3.org/2001/tag/awwsw/issue57/latest/
and note how the word "version" is used.  In nearly every case, the word
"representation" (IMO) would be a direct replacement that would be more
understandable to the TAG audience.  Here are several examples:

"the versions obtained by dereferencing that URI"

"one generally needs to dereference it (and even then one only knows a
single version of it"

"is governed (according to [rfc3986]) by the media type of some version
of the information resource"

"the media type registration defers to the content of the version"

"the version itself gets to arbitrarily define what the 'hash' URI
means.[4]"

"a definition of the URI that is carried by (a version of) the
information resource itself"

"If IR(u) has a version with media type 'application/rdf+xml'"

In every one of these examples, the word "representation" would be a
direct replacement for "version" and would then correspond to the
terminology that is nearly universally used in webarch discussions and
related technical documents.

> 
> > 8. In the definition of "dereferenceable", I don't understand why this
> > clause is included:
> > [[
> > or to perform some other action on an associated resource ([rfc3986]
> > section 1.2.2)
> > ]]
> > I find it confusing, because it would seem to make a hash URI
> > dereferenceable, as the HTTP GET would be performed on the associated
> > resource -- the hashless part.
> 
> The clause is there for compatibility with RFC 3986.  There is
> probably a better way to say it.
> 
> It doesn't matter what the consequences are for hash URIs, since the
> report nowhere depends on anything about what happens if you
> dereference a hash URI. I believe we discussed this before.

I suggest cutting that clause then.
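
For reference, the behavior that makes the clause confusing can be
sketched as follows: the fragment is split off by the client, and the
GET goes to the hashless part.  This uses Python's standard
urllib.parse.urldefrag; the function name and the example URI are
hypothetical.

```python
from urllib.parse import urldefrag


def request_target(uri: str) -> tuple[str, str]:
    """Split a URI into (the part sent in the HTTP request, the fragment).

    The fragment never goes over the wire; the client keeps it and
    applies it to whatever comes back.
    """
    hashless, fragment = urldefrag(uri)
    return hashless, fragment
```

For example, request_target("http://example/doc#thing") gives
("http://example/doc", "thing"), so the "associated resource" on which
the GET is performed is the hashless one.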

> 
> > 9. Is the notion of "fixed information resource" needed in this
> > document?  Similarly, are definitions of "metadata", "refer" and "term"
> > needed?
> 
> It is a handy way to describe those things that consist of content +
> metadata. They figure quite prominently in the narrative as the IRs
> that are versions of other IRs.
> 
> "Metadata" is both used and defined, so I'm not sure what the issue
> is. The definition is necessary because computer scientists often use
> the word incorrectly (with respect to dominant usage).
> 
> "Refer" is both used and defined, so again I don't know what you're
> saying. Someone asked me to define "refer" - actually I thought it was
> you, but it could have been Alan. It seems useful to relate "refer"
> and "mean"
> 
> I think someone also asked me to define "term". It needs to be said
> that it is not the same as "word" and that it subsumes "URI". So I am
> not sure why you question this.

There is no big harm in leaving these definitions in place, but the
document has now been simplified and focused to make them unnecessary.

> 
> > 10. s/in which the occurs/in which the URI occurs/
> 
> ok
> 
> > 11. Re "This is the approach taken by OWL": I think it would be more
> > accurate to say that OWL *supports* this approach.
> 
> Really?  I think it would be more accurate to say what I said. Have
> you read the OWL specs?

I'm sure you know OWL better than me, but I am assuming that the OWL
specs do not *preclude* definitions from being provided by the merging
of additional graphs, in addition to providing definitions via
owl:imports.  In other words, I am assuming that an OWL graph can
consist of the merge of other graphs.  Am I correct in this assumption?

> 
> > 12. I notice that sec 3.4 discusses some pros/cons of its approach, but
> > sections 3.1 and 3.2 do not.  In general, critiques of the approaches
> > are in section 4 -- separate from the descriptions of the approaches.
> 
> I don't see any cons of LSID in 3.4.  Are you referring to the
> footnote? I guess I could make a 4.x section saying that LSIDs don't
> have a FYN story, but to me it seemed like the sort of detail that is
> better left to a footnote.

I may have said 3.4 when I meant 3.3, or perhaps the numbering changed.
No matter.

> 
> > Section 5 purports to describe proposed new approaches,
> 
> ok. changed "alternative methods" to
>         modifications
>         to current definition methods, as well as new
>         methods,
> 
> > but most of them
> > are actually just refinements of existing approaches.  Furthermore, each
> > refinement needs to have pros/cons discussed, just as the basic
> > approaches do.  Hence, I would suggest restructuring the document as
> > follows, to keep the pros/cons with the approach descriptions, and to
> > keep the refinements with their respective base descriptions.  I have
> > kept the original section numbers so that you can more easily see how
> > they are being moved:
> > [[
> > 3 Current definition methods
> >    3.5 'Hash URI'
> >        Con: 4.1 Fragment identifiers are fragile
> >        Refinement: 5.2 'Hash URI' with fixed suffix
> >            Pros/cons
> >        Con: 4.2 The common 'hash URI' pattern fails with large
> > namespaces
> >        Refinement: Use multiple base URIs
> >            Pros/cons
> >        Con: 4.3 Hash URIs don't support REST architecture
> >    3.6 'Hashless URI' with HTTP 303 See Other redirect
> >        Con: 4.4 303 is difficult, sometimes impossible, to deploy
> >        Con: 4.5 303 leads to too many round trips
> >        Refinement: Use 303-redirect service
> >            Pros/cons
> >        Refinement: Define optimization pattern in .well-known
> >            Pros/cons
> >        Con: 4.6 303 makes the URI difficult to bookmark
> >    5.5 'Hashless' URI dereferences to its definition
> >        Pros/Cons
> >    3.3 Register a URI scheme or URN namespace
> >        Pros/Cons
> >    3.4 Use the LSID getMetadata() method
> >        Pros/Cons
> >    3.2 Link to documents containing definitions
> >        Pros/Cons
> >    3.1 Colocate definition and use
> >        Pros/Cons
> > 4 Possible new definition methods
> >    5.3 'Hashless URI' with site-specific discovery rules
> >        Pros/cons
> >    5.4 'Hashless URI' with new HTTP request or response
> >        Pros/cons
> > ]]
> 
> I understand this proposal. I had my reasons for not organizing the
> document in this way. I want the discussion to first get consensus
> that we are currently doing things in a certain way. That is section
> 2. If we get this far, we'll have common ground.
> 
> Then, section 3 acknowledges the pain points, and states them clearly,
> without prejudice as to solution. If we get consensus on section 3,
> that will be good.
> 
> Then, section 4 collects remediations. If we get consensus on the
> statements made about the remediations, that will be good.
> 
> If we put current methods alongside proposed ones, that makes it look
> like they are being compared on equal terms. It's important to avoid
> that appearance. Any change will be painful and disruptive. So I think
> the discussion is better ordered to help the consensus process than by
> an analysis that puts all methods, pros, and cons on equal footing -
> even if this ends up being a bit redundant or unnatural.
> 
> Nathan, any thoughts on this?

I just thought it may be clearer with the structure I suggested, but I'm
okay with keeping the current structure if you prefer.  

> 
> > 13. Sec 3.3: I think the scheme registration example needs to be
> > modified to describe *delegated* URI definitions, since it is not
> > realistic to think that the IANA registration itself would directly
> > define the meanings of all of the URIs in the scheme.   For example, the
> > scheme "mountain:" may be for mountains, but it would have to say how an
> > individual would define a new mountain URI under that scheme.
> 
> I see no essential difference between a URI scheme registration and a
> URI definition and I would like to convey their uniformity somehow.
> mailto: and data: define the meanings of all URIs in their schemes,
> and http://tools.ietf.org/html/draft-holsten-about-uri-scheme-06 gives
> the meanings of particular URIs.

Okay, but the case where the URI scheme directly defines the meaning of
*all* of the URIs in that scheme seems extremely limited.  Maybe that is
the case that you intended, but I would have thought that you would want
to cover the case -- like for the LSID subscheme -- where the scheme
provides a way to delegate the provision of a URI definition.  For
example, the definition of the URI

  tdb:2009:http://en.wikipedia.org/wiki/IETF

is not directly defined by the tdb scheme, but is the document that
existed at http://en.wikipedia.org/wiki/IETF at the end of 2009.

I think this approach should be covered also.
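
As a rough illustration of what I mean by delegation, a client could
take such a URI apart and be pointed at the place where the definition
actually lives.  This is only a sketch: it assumes the simple
"tdb:<date>:<URI>" shape of the example above and ignores whatever
other forms the scheme may allow.  The function name is hypothetical.

```python
def split_tdb(uri: str) -> tuple[str, str]:
    """Split 'tdb:<date>:<target URI>' into (date, target URI).

    Assumes the simple shape used in the example; raises ValueError
    for anything else.
    """
    scheme, sep, rest = uri.partition(":")
    if not sep or scheme.lower() != "tdb":
        raise ValueError("not a tdb URI: " + uri)
    # The date runs up to the next colon; the remainder is the
    # embedded URI, which may itself contain colons.
    date, sep, target = rest.partition(":")
    if not sep or not target:
        raise ValueError("missing date or target URI: " + uri)
    return date, target
```

Applied to tdb:2009:http://en.wikipedia.org/wiki/IETF this returns
("2009", "http://en.wikipedia.org/wiki/IETF"): the scheme itself
supplies only the rule, and the embedded URI supplies the rest.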

> 
> > 14. s/refers to a what/refers to what/
> 
> ok
> 
> > 15. I think it would be helpful if the use case in sec 2.2 also included
> > the specific questions that need to be answered by each proposed
> > solution.  On the other hand, I am uncertain that this use case is
> > really important enough to include.   [Added later:]  On further
> > reflection, I think use case 2.2 should be dropped, as it does not add
> > enough to be worthwhile.
> 
> I'm inclined to agree, but the scenario has to be acknowledged somehow
> as it does come up in the wild and is logically very different from
> the previous use case, even creating nasty ambiguities in some cases.
> 
> > 16. Similarly, I think it would be helpful if each proposed solution
> > explicitly stated what Alice, Bob and Carol should do, according to that
> > solution: "According to this approach, in scenario 2.1, Alice
> > should . . . Bob should  . . . Carol should . . . ".  I first noticed
> > the need for this in sec 3.4 (LSID), perhaps because I don't know the
> > details of how LSID works.
> 
> will review
> 
> > 17. s/since otherwise it would refer to/since otherwise under this
> > approach it would refer to/
> 
> since otherwise, by the <loc href="#ir-ref"
>    >IR reference rule</loc>,
> 
> > 18. s/the URI does not refer IR('http://example/p16')/the URI does not
> > refer to IR('http://example/p16')/
> 
> ok
> 
> > 19. The diagrams are nice!  One suggestion: s/specifies/defines/, since
> > "definition" is the term that is used elsewhere in the document.
> 
> ok
> 
> > 20. Sec 3.4: I'm a little surprised to see LSIDs singled out, since it
> > feels like there have been a zillion identifier techniques proposed,
> > including DOI, ARK, XRI, "info:", "tag:", "tdb:", etc.  I'm not sure how
> > we should address them all, as the proponents of each thought theirs to
> > be uniquely important in some particular way.
> 
> They deserve to be singled out, because unlike all of those others,
> they give a way to define individual URIs.  And they are actually
> deployed in this way, with active user communities, at least one of
> which is an audience I have in mind for this note.

I don't understand.  AFAIK they *all* provide ways to define individual
URIs by delegation.  The example URI I gave above

    tdb:2009:http://en.wikipedia.org/wiki/IETF

is an individual URI whose meaning is defined indirectly (via
delegation) by the tdb scheme.  Isn't it?  If not, why not?

> 
> > 21. I think it would be helpful to list the approaches in (approximate)
> > descending order of popularity in the document, which I guess would be:
> > hash URIs, 303-redirect, 'Hashless' URI dereferences to its definition,
> > link to documents containing definitions, LSID/other schemes, new URI
> > scheme, colocate definition and use.
> 
> My preferred ordering is logical. E.g. URI scheme is logically prior
> to all others. Then things that are not FYN can be disposed of early,
> leaving the presentation to focus on what's most important.

Okay, no big deal.

> 
> > 22. It occurs to me: Doesn't sec 3.3 "Register a URI scheme or URN
> > namespace" belong in the proposed new approaches section, rather than in
> > the existing approaches section?
> 
> No, because it's a method that is currently available and that's used

Okay, no big deal.

> 
> > 23. Sec 4: Each criticism should also include rebuttals or mitigating
> > techniques.  For example, in "sec 4.2 The common 'hash URI' pattern
> > fails with large namespaces", it would be good to point out that large
> > namespaces can be subdivided into multiple hashless base URIs, although
> > this may make them harder to use (because multiple @prefixes may need to
> > be declared).
> 
> then there would be multiple namespaces, right?  I don't really get
> this.  You're just saying, to avoid large namespace, avoid large
> namespaces.

Exactly.  Perhaps it is pointing out the obvious, but if a man is
complaining that it hurts every time he chooses to poke his spoon into
his eye, it seems reasonable to point out that he doesn't *need* to be
poking his spoon into his eye.  There are other ways to drink tea.

The point is to make the trade-offs clear: namespace simplicity versus
large file size.

> 
> > 24. Sec 4.3 "Hash URIs don't work with HTTP PUT, POST, or DELETE
> > methods": I am not familiar with this criticism.  Pointer please?
> 
> added.  (you saw this email- Manu's
> http://lists.w3.org/Archives/Public/public-awwsw/2011Jan/0012.html )

Ah yes.  Thanks.

> 
> > 25. Sec 4.4 "303 is difficult, sometimes impossible, to deploy": Again,
> > this can be mitigated by use of a 303-redirect server, such as
> > http://thing-described-by.org/ , or an equivalent distributed technique
> > based on .well-known RFC5785:
> > http://tools.ietf.org/html/rfc5785
> 
> I'll think about putting this in - do you know of anyone who actually
> does these?

I remember someone -- I've forgotten who -- mentioning that he was going
to use http://thing-described-by.org/ .  But I don't know if he still
does.  Note also that purl.org now has the ability to do 303 redirects.
Certainly a lot of people use purl.org, but I don't know how many use
its new 303 redirect capability.

> And IIUC these two have very different character - the first requires
> no change to clients or to the FYN story, while the second would
> require every client to be aware, so would constitute a new method.
> Although you haven't really said enough for me to know what you have
> in mind, so please explain the .well-known idea (such that it's
> different from what I describe in 5.3).

I think 5.3 covers it well enough.  There are a lot of ways it could be
done, but I don't think we need to get into them.

> 
> > 26. Ditto for sec 4.5 "303 leads to too many round trips"
> 
> Don't know what you mean

The problem of extra round trips when the 303 redirect approach is used
can be mitigated by using syntactic URI patterns to recognize which URIs
are going to cause a 303 redirect, skip the redirect, and go directly to
the new target resource.  For the central 303-redirect server approach,
this optimization is described at
http://thing-described-by.org/#optimizing

But similar optimizations can *also* be done using a
decentralized .well-known approach: the server can provide a URI rewrite
pattern that transforms the original URI into the redirect URI.  The
client therefore only needs to access the server *one* extra time -- the
first time -- to obtain the rewrite pattern, and thereafter it can use
the rewrite pattern to optimize away the extra redirects.
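
Here is a minimal sketch of the client side of that optimization.
Everything in it is an assumption for illustration: the class name, the
idea that a rewrite rule is a (regex, replacement) pair, and the
fetch_rule callable that stands in for the one extra request to the
server.  No particular .well-known path or document format is implied.

```python
import re
from urllib.parse import urlsplit


class RedirectOptimizer:
    """Cache one rewrite rule per host and apply it locally."""

    def __init__(self, fetch_rule):
        # fetch_rule(host) returns (regex, replacement) or None.
        # It stands in for the single extra round trip per server.
        self._fetch_rule = fetch_rule
        self._rules = {}
        self.fetches = 0

    def resolve(self, uri: str) -> str:
        """Return the URI to GET, skipping a predictable 303 redirect."""
        host = urlsplit(uri).netloc
        if host not in self._rules:
            # First contact with this host: obtain its rule once.
            self.fetches += 1
            self._rules[host] = self._fetch_rule(host)
        rule = self._rules[host]
        if rule is None:
            return uri  # no rule published; dereference as usual
        pattern, replacement = rule
        # URIs that do not match the pattern come back unchanged.
        return re.sub(pattern, replacement, uri, count=1)
```

If example.org published the rule (r"^http://example\.org/id/(.*)$",
r"http://example.org/doc/\1"), then http://example.org/id/toucan would
be rewritten to http://example.org/doc/toucan, and every later URI on
that host would be rewritten without contacting the server again.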

> 
> > 27. Sec 4.7 "The normative specifications are incomplete": Which
> > approach is this criticizing?
> 
> Hash URI and 303.  The criticisms are somewhat different in the two
> cases; I'll fill this out. I think Harry's complaint was about both,
> but mainly 303.
> 
> > 28. I think sec 5.1 "Use something other than a URI" can be deleted,
> > since the value of using a URI is well established and quite fundamental
> > to web architecture.
> 
> I think it is an important potential mitigation and some people have
> proposed it.

I disagree, but I'd rather not waste energy arguing it.

> 
> > 29. The "'Hashless' URI dereferences to its definition" approach is an
> > *existing* approach that some use, so I think it belongs in section 3.
> 
> Examples?

I believe the FOAF namespace -- or was it SKOS?  Now I've forgotten
which -- for a long time returned a 200 status instead of a 303, though
they both seem to return 303 now.  And there have been a number of
complaints in the LOD community about how some data providers are
serving a 200 status instead of using 303 redirects.  I don't know
exactly which ones are doing it.

> 
> > 30. Sections 5.5 "'Hashless' URI dereferences to its definition
> > (compatibly)" and 5.6 start talking about how a definition is
> > *interpreted*, which (by my new reading of the rest of the document)
> > seems out of scope with the document's current intent of focusing only
> > on the *mechanics* of providing and obtaining a URI definition.
> 
> If a proposed method leads to problems those problems have to be
> raised and explained. It doesn't matter what kind of problems they
> are. Or, if the correctness of an approach relies on no one "getting
> confused", there has to be some definition, even if informal, of
> what "not getting confused" means.

If a proposed method failed to differentiate between different possible
interpretations of "Mount Everest", and this caused problems for *some*
applications, would we blame the method?  Of course not.  But that is
*exactly* what we're doing if we're expecting section 5.5 or 5.6 to
define ways to differentiate between the IR and the toucan.  The
difference between the IR and the toucan just happens to be one axis;
the difference between different interpretations of "Mount Everest"
happens to be a different axis.  But the principle is *exactly* the
same: it boils down to the issue of ambiguity of reference -- an issue
that will *never* go away, regardless of the method employed.

So the problem with differentiating sec 5.5 from 5.6 is that it
*misleads* the reader about the fundamental nature of the problem, while
addressing only one specific manifestation of the problem.

> 
> > Shouldn't these sections be merged into one, that merely states the
> > mechanism and points out the pros/cons?  The potential conflict of
> > meaning between what the URI definition says and the information derived
> > from the httpRange-14 rule does represent a "con" for this approach.
> 
> They seem very different to me. In 5.5 anyone who uses these URIs to
> name information resources is happy; it's the linked-data folks who
> are at risk, and they're the ones who don't care about soundness
> anyhow. 

They'll only be happy until they run into the next case of resource
ambiguity, and then they'll be unhappy again.  But they may not
recognize it as being the same kind of problem, because there isn't
likely to be a URI definition retrieval method available for
disambiguating their next ambiguity.  They may instead see it as a case
of people being sloppy -- what I've been calling myth #3:
http://dbooth.org/2010/ambiguity/paper.html#myth3 

> That seems OK to me. Whereas in 5.6 everyone is at risk
> because suddenly there is no (interoperable) way to refer to
> information resources (or there has to be a new way).

No, it is not true that everyone is at risk.  It depends on the
*application*.  In sec 5.6, there will be problems *only* for those
applications that *need* to distinguish between the canoe and the
canoe's description.  This is what I was pointing out in sections 5.5.3,
5.5.4 and 5.5.5 of this draft:
http://lists.w3.org/Archives/Public/public-awwsw/2011Apr/att-0040/meaning-of-a-URI.html

But the exact same problem exists for *every* resource definition: for
some applications, a resource definition will be unambiguous, while for
others that require finer distinctions the exact same definition will be
ambiguous.  This problem is not unique to the IR/non-IR distinction.

> 
> Remember my goal is to make sure that httpRange-14 is preserved, or if
> not, to standardize on an alternative. This is because I care about
> metadata (and, on behalf of my organization, CC REL and the license
> chooser). From this point of view 5.5 and 5.6 are totally different.
> 
> > This "con" could be framed as a problem of two (competing) URI
> > definitions having been provided: one implicitly by the httpRange-14
> > rule (indicating that the resource is an IR)
> 
> I would not call it implicit; when HTTPbis gets done, and/or when we
> publish our finding/rec for issue 57, it will be quite explicit.

Ok.

> 
> > and the other explicitly by
> > the retrieved document content.  In framing the problem this way, the
> > question would be which definition or combination of definitions to
> > believe.  Note that we don't currently ask this question of other
> > approaches, even though different definitions could be provided by
> > multiple approaches.
> 
> for example?   (I know the host-specific rule description ought to say
> something about this; will add a TBD)

Suppose one definition is provided per section 3.1, another per section
3.2 and a third per section 3.5 or 3.6.  Or suppose multiple definitions
are provided per sec 3.2, as may happen if two graphs are merged.
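
A small sketch of the situation, just to show that the question arises
mechanically: gather whatever definitions each method yields for a URI
and see where they differ.  The method labels and definition strings
are made up, and comparing definitions as plain strings is a gross
simplification.

```python
from collections import defaultdict


def competing_definitions(found):
    """Given (method, uri, definition) triples, return the URIs for
    which different methods yielded different definitions."""
    by_uri = defaultdict(dict)
    for method, uri, definition in found:
        by_uri[uri][method] = definition
    # Keep only URIs with more than one distinct definition.
    return {
        uri: defs
        for uri, defs in by_uri.items()
        if len(set(defs.values())) > 1
    }
```

The sketch only detects the conflict; which definition or combination
of definitions to believe is exactly the open question.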

> 
> > It seems to me that we should either entirely steer clear of getting
> > into the "meaning" of the URI, or we will have to get in much deeper,
> > which is what I previously thought your intent was, and which was why I
> > was trying to elicit what you meant by "meaning" and insisting on
> > discussing only *observable* characteristics.
> >
> > Other responses inline below . . .
> 
> Thanks for the close reading.
> 
> > On Sat, 2011-04-09 at 19:12 -0400, Jonathan Rees wrote:
> >> On Mon, Apr 4, 2011 at 9:34 PM, David Booth <david@dbooth.org> wrote:
> >> > Attached is an updated version.  Inline responses . . .
> >> >
> >> > On Mon, 2011-04-04 at 16:37 -0400, Jonathan Rees wrote:
> >> >> "its meaning should be obtained from that definition instead of from
> >> >> the httpRange-14 rule regarding information resources."
> >> >> - I invoke the "IR reference rule" in the document, and it can be hyperlinked.
> >> >>   (Actually the httpRange-14 rule as we know is wrong in all sorts of
> >> >> ways so referring to it directly is very risky.  E.g. we know the
> >> >> purpose isn't to say that the URI refers to *any* information
> >> >> resource, i.e. it has nothing to do with typing; it really means to
> >> >> say - and I think most people have understood it to say - that it
> >> >> refers to a *particular* information resource.)
> >> >
> >> > Yes, I think I agree, but I'm not sure what you are suggesting.  My note
> >> > "[TODO: Say somewhere what the httpRange-14 rule is]" was meant as an
> >> > editorial reminder that we should say more explicitly what inference
> >> > rule we mean, when we refer to the "httpRange-14 rule".  In that draft
> >> > document I have assumed that the consequence of the rule consists of the
> >> > two assertions in graph gh, but we really should say explicitly (e.g.,
> >> > in n3) what rule we're assuming.
> >> >
> >> >>
> >> >> "Because of the 200 status code, Bob applies the httpRange-14 rule and
> >> >> concludes the following:"
> >> >>
> >> >> It doesn't matter how Bob concludes that metadata, but it would be
> >> >> harmful to say that a single HTTP response is adequate to justify it;
> >> >> for the metadata to be useful it has to be true of what someone who
> >> >> reads Bob's metadata will get. I think it is better to be vague since
> >> >> this has nothing to do with this section.
> >> >
> >> > But if we don't provide any justification, then we could just as well
> >> > say that Bob concludes that <http://example/p16> refers to an elephant.
> >>
> >> No we couldn't - the scenario would be different then.
> >>
> >> > The point is that the 200 status code *justifies* Bob's statements about
> >> > <http://example/p16> as a web-accessible thing.
> >>
> >> It's not a question of justification, but of convention. Most people
> >> have adopted the IR reference rule. That is why Bob uses it - because
> >> he wants to be understood.
> >
> > Right, that's exactly the rationale I meant.  I was suggesting that we
> > be clear about *why* Bob was treating <http://example/p16> as a
> > web-accessible thing.
> >
> > However, I think this issue may be moot if the document is only focusing
> > on the mechanics of providing and obtaining an authoritative URI
> > definition.
> >
> >>
> >> I believe I've corrected the scenario in a few ways since you made
> >> these comments, so perhaps your objections here are moot. In this
> >> case, Alice and Bob and Carol will all know that some new protocol is
> >> in effect, and the question is just what that protocol needs to be.
> >>
> >> >>
> >> >> "web:hasUri"  -- the document already defines the predicate (if it's
> >> >> the one I think you mean) and it's called :accessibleVia.  There is no
> >> >> reason to say that the subject has an information resource type and
> >> >> doing so weakens the document.
> >> >
> >> > Okay, I've changed web:hasUri to :accessibleVia throughout.  I also
> >> > removed class web:IR from the RDF, as it is not needed for the
> >> > inferencing, and in the prose changed it to "web-accessible thing".
> >> >
> >> >>
> >> >> Bob actually concludes that the URI refers to the IR at that URI. It
> >> >> is better to say this in English since in the example he really does
> >> >> conclude this. If written in RDF it will have to be translated for the
> >> >> benefit of readers, and that's redundant.
> >> >
> >> > I think it is important to focus on *observable* facts.  What Bob
> >> > privately believes in his own mind is irrelevant.  The point is that
> >> > graph gh is what gives Bob license to make assertions about
> >> > <http://example/p16> as a web-accessible thing.
> >>
> >> I disagree. Sometimes arguments on general principle are easier to
> >> understand than those containing distracting and extraneous detail.
> >> And logical arguments are often not in terms of observables, but
> >> rather in terms of ... logic.
> >
> > If we stick to the level of discussing only the mechanics of how URI
> > definitions are provided and obtained, then I think we'll be fine.  But
> > if we start trying to talk about the *meaning* of a URI, then I firmly
> > believe we must clearly state what we mean by "meaning" and it must be
> > stated in terms of *observable* outcomes.  Otherwise, I think we will be
> > wasting everyone's time.  If it looks like we need to get into this
> > area, and we cannot agree whether we should focus on observable outcomes
> > then we should ask others for input.
> 
> I disagree that there is anything dangerous about meaning. But it
> would be nice to hear what others think.
> 
> As I said Ed thinks 5.5 is just fine because the meaning is always
> clear. It's not just a question of where the definition comes from -
> there are two in his world, and they are both in effect (different
> ones for different occurrences). To describe Ed's solution at all, you
> *have* to acknowledge that there is a potential confusion, and to make
> it actually work you have to address the question of meaning, at least
> to the extent of showing that the approach doesn't lead to mistakes
> (and it's not at all obvious that it doesn't lead to mistakes).

Yes, there is a potential confusion, but it is not universal.  It
depends on the application.  *Some* applications will be confused, others
will not be confused.  That is no different than if they were talking
about a definition of Mount Everest: some applications will find it
unambiguous, and others will find it ambiguous.
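
To make that concrete, here is a rough sketch in Python.  It is only an
illustration under stated assumptions: the predicate names, the values,
and the two "applications" are hypothetical, and I am assuming Alice
uses <http://example/p16> for a canoe while Bob uses it for the
web-accessible thing.

```python
# Hypothetical triples; predicate names and values are illustrative only.
URI = "http://example/p16"
alice = {(URI, "mass_kg", 30)}          # assumption: Alice means a canoe
bob = {(URI, "title", "Canoe page")}    # assumption: Bob means the web-accessible thing

def titles(graph):
    """Application 1: reads only titles, so Alice's usage never affects it."""
    return sorted(o for (s, p, o) in graph if p == "title")

def describe(graph, subject):
    """Application 2: treats every triple about a subject as describing one entity."""
    return {p: o for (s, p, o) in graph if s == subject}

merged = alice | bob
print(titles(merged) == titles(bob))   # application 1 sees no difference
print(describe(merged, URI))           # application 2 reports one thing with a title and a mass
```

The first application's output is the same with or without Alice's
graph; the second produces a description that neither Alice nor Bob
licensed.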

> 
> >> >> I don't see any reason to go into such detail on what Carol wants to
> >> >> do. Most of the detail you've provided is unnecessary and distracting.
> >> >> She really just needs to figure out what was meant by each use of the
> >> >> URI.
> >> >
> >> > The point is to nail down more explicitly what we mean by "what was
> >> > meant by each use of the URI".
> >>
> >> This seems perfectly clear to me. What confusion do you think someone
> >> reading this might have?
> >
> > I think much of this is moot now, as the document focus now seems to be
> > on the *mechanics* of providing and obtaining an authoritative URI
> > definition.
> >
> >>
> >> > If we're going to make progress on this
> >> > we cannot be hand-waving about what we mean by "meaning".
> >>
> >> I'm not hand-waving at all and I resent your describing it so.
> >
> > Please do not take offense, as no offense is intended.  We're all trying
> > to figure this stuff out, and we may bring different styles, experiences
> > and assumptions to the table, but I think we are all doing so with
> > sincere intent to work collaboratively toward figuring it out.  We may
> > disagree on some things, and if so, I think it is helpful if we identify
> > what they are.  In this case, I was trying to stress what I believe is
> > the importance of making our assumptions explicit.
> >
> >>
> >> > We need to
> >> > be very explicit, and that's what I'm trying to do.  We need to make
> >> > *all* relevant assumptions explicit -- such as Carol's implicit rules ri
> >> > -- and we need to be talking about *observable* facts -- not what is
> >> > hidden in Carol's head.
> >>
> >> I don't agree. People are very good at reasoning about what is in
> >> heads, and about correctness of applications and protocols.
> >>
> >> >>
> >> >> Carol's problem is *not* caused by combining the graphs
> >> >
> >> > Huh?  But the problem does not exist if those graphs are not combined.
> >> > There is no contradiction if those graphs are not combined.
> >>
> >> The question is not consistency, but correctness. There are lots of
> >> ways to be wrong without anyone detecting a contradiction.
> >>
> >> If I tell you one day that mercury is the closest planet to the sun,
> >> and then the next that it is liquid at room temperature, you need to
> >> figure out what each occurrence of "mercury" means, even if you've
> >> forgotten the second day what I said to you the first day. It has
> >> nothing whatsoever to do with graph combination. It has to do with
> >> correct interpretation.
> >
> > I disagree.  The whole notion of "correct interpretation" depends on
> > what I wish to *do* with the information.
> 
> I would say it depends on what you *actually* do with the information.
> If you misinterpret it, I will often be able to tell.

But you will only be able to tell if there is some kind of *observable*
difference.

> 
> > What goes on inside my brain
> > -- whether I experience the color red the way you experience the color
> > green -- is irrelevant as long as we both stop our cars when the light
> > turns red and go when it's green.  "Correct interpretation" is
> > irrelevant if it cannot be stated in terms of *observable* outcomes.
> > Perhaps we need to agree to disagree about this.
> 
> I'm not objecting to objectivity, I'm objecting to a mode of discourse
> that can't use ordinary language and insists on cluttering every story
> with details that are much more likely to be wrong than the simpler,
> more abstract version.

I'm not sure what to say to this.  Obviously I'm not trying to clutter
the story unnecessarily.  But you and I seem to have a different
tolerance for vagueness and common sense on this topic.  I am concerned
that the vagueness and common sense that you find acceptable may hide
important assumptions and flawed logic that would become visible if the
story were expressed more precisely in terms of observable outcomes.

> 
> >>
> >> >>  - it is caused
> >> >> by Alice and Bob using the same URI in different ways. She would have
> >> >> to figure out what they mean
> >> >
> >> > Please be more explicit about what you mean by "what they mean".  What
> >> > RDF assertions will be made?  What *observable* action will occur?
> >>
> >> The scenario has little to do with RDF. The question is what is being
> >> communicated and what knowledge the receiver will have after the
> >> communication. It doesn't matter that the sender and receiver are
> >> automata since they will have their own correctness criteria - with
> >> respect to *real* semantics, not hobbled RDF quasi-semantics - and
> >> will need to be subject to audit from agents who are not automata.
> >>
> >> >> even if she didn't do any graph
> >> >> combining, if she processed the two graphs separately. In particular
> >> >> she'd be confused about whether to apply the IR reference rule or not,
> >> >> in either case.
> >> >
> >> > Carol's mental confusion seems irrelevant to me, because it is not
> >> > observable.  If her application produces the wrong output, then that is
> >> > observable, and we should talk about how and why that happens.
> >>
> >> Carol writes lots of applications and says lots of things. If she's
> >> confused we may very well hear about it. This does not seem like a
> >> confusing point to me. It is just common sense.
> >>
> >> > Can you translate Carol's confusion into her application producing
> >> > incorrect output, so that it is observable?
> >>
> >> I probably could but I don't see the point. If the example included
> >> "likes" and the "application" had to make a list of information
> >> resources that were liked - something like that. But the main thing is
> >> that Carol (and the artifacts she's responsible for) mustn't say
> >> things that are not licensed by Alice or Bob, like that there exists
> >> an entity (either a canoe or a version or other IR) that has both the
> >> given title and the given mass.
> >>
> >> >>
> >> >> The rest seems at best unnecessary to me; and as you know I find your
> >> >> "application" idea to be wrong and harmful as meaning is not a
> >> >> function of application.
> >> >
> >> > The "application" idea is merely a device to enable us to talk about
> >> > observable facts.  There may be a better way to do that, and if so we
> >> > could switch.  But I don't believe we can make headway on these topics
> >> > if we just make claims about unobservable beliefs that are in people's
> >> > heads.  We need to make the discussion absolutely explicit, concrete and
> >> > observable -- no "then a miracle occurs" steps.
> >>
> >> I think we can make headway in exactly this way, and do.  This is how
> >> social processes work - you reason about what other people know. This
> >> is not miraculous or mysterious (any more than any other everyday
> >> human process).
> >>
> >> >> Cases in which there is no problem due to
> >> >> some coincidence are uninteresting and don't need to be presented.
> >> >
> >> > Which cases do you mean?
> >>
> >> The last three examples in your document
> >
> > I guess you are referring to sec 5.5.3 Erin, 5.5.4 Frank and 5.5.5 Gail
> > in the attached.  I think it is very misleading to say that there is no
> > problem due to coincidence.  Their applications are engineered to work
> > correctly within a certain range of capability.  It isn't *coincidence*
> > that they do not step outside of that range of capability.
> >
> > Imagine instead that their applications functioned correctly when given
> > data that happened to include some unneeded map data that modeled the
> > earth as flat.  Clearly the earth isn't flat.  But the fact that their
> > applications still function correctly even when given some obviously
> > wrong (but ignored) data is not a *coincidence*, it is by design.
> 
> We are talking about communication, not applications. It doesn't
> matter whether an application functions correctly on its own terms;
> that's not the business of an interchange language. What matters is
> whether what the application is doing is a correct interpretation of
> the given message (RDF graph in this case). Correctness of message
> processing has to do with the sender/receiver system, not either one
> of them.

I disagree.  There *is* no objective notion of "correctness" if we don't
talk about observable outcomes.  What matters is not whether the
receiver obtains the same interpretation of "red" that the sender
intended.  What matters is only that the *observable* outcome is
correct: the receiver stops when the sender transmits the "red" signal,
and goes when the sender transmits "green".
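
A minimal sketch of what I mean, in Python.  The two receivers and
their internal encodings are hypothetical; the only point is that they
differ internally and yet cannot be told apart by their behavior.

```python
# Two hypothetical receivers with different private encodings of the signal.
def receiver_a(signal):
    internal = {"red": 0, "green": 1}[signal]       # one private representation
    return "stop" if internal == 0 else "go"

def receiver_b(signal):
    internal = {"red": "X", "green": "Y"}[signal]   # a different private representation
    return "stop" if internal == "X" else "go"

# The only thing an observer can compare is the resulting action.
for signal in ("red", "green"):
    print(signal, receiver_a(signal), receiver_b(signal))
```

An observer who sees only the actions has no basis for saying that one
receiver interpreted "red" correctly and the other did not.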

We may have to agree to disagree about this.

> 
> >>
> >> >>
> >> >> I had been focusing on how to construct an RDF satisfying
> >> >> interpretation (i.e. proof of soundness) in this case, but I think
> >> >> this is a secondary problem. The first thing is to figure out how
> >> >> Carol would reconstruct the intent, in the best of circumstances.
> >> >
> >> > What intent?  Can you state the problem in terms of observable output
> >> > that is incorrect?
> >>
> >> I think this is silly. The receiver needs to distinguish between
> >> possible states of affairs known by the sender. It is obvious that if
> >> you have a properly functioning communication scenario involving a
> >> language with symbols {a, b} (among others), and then replace all b's
> >> with a's in each message, then there is a potential problem in that
> >> states of affairs that are distinguishable in the richer language
> >> might not be in the less rich language. Maybe there is no problem, but
> >> if there isn't that would have to be proved - you'd have to show that
> >> the replacement is harmless. It doesn't matter whether anything is
> >> observable or not - they could be talking about angels on pinheads,
> >> and the communication problem would be the same. It's a problem of
> >> information, not application behavior.
> >
> > I disagree.  People have already accused us of pointlessly arguing how
> > many angels can dance on the head of a pin.
> 
> Who?

Here's one from 2002:
http://lists.w3.org/Archives/Public/www-tag/2002Oct/0313.html
another from 2007:
http://lists.w3.org/Archives/Public/www-tag/2007Jul/0018.html
and another from 2009:
http://lists.w3.org/Archives/Public/www-tag/2009Jun/0101.html
I also remember Carol Goble making such a comment in person a few years
ago.

> 
> > If we cannot state our
> > assumptions explicitly and frame the problem in terms of *observable*
> > outcomes then I think we are wasting everyone's time.
> 
> If I am wasting someone's time they will figure that out quickly and
> go away or tell me to shut up. I think that someone who insists on
> eliminating perfectly useful abstract vocabulary (communication,
> meaning, belief, knowledge, judgment, truth, interpretation, etc.) not
> directly related to observation is wasting *my* time. I don't
> generally have much trouble converting these into scenarios that
> matter, and I think that someone who pretends that these words are
> unimportant or meaningless is being disingenuous.

Okay, well I guess we'll just muddle forward as best we can, since it
seems we're unlikely to agree on this.

David

> 
> Jonathan
> 
> >> >> If
> >> >> she can do this then I'm sure there'd be some clever formal
> >> >> construction leading to an interpretation. If there weren't, well, that
> >> >> would make the case against this approach quite a bit stronger, but
> >> >> saying so doesn't help in presenting this option, and the first
> >> >> responsibility here is to give it a fair shake - we're not obligated
> >> >> to analyze it in detail, and doing so might even hurt socially.
> >> >>
> >> >> Remember this document is meant to bring people into conversation
> >> >> about issue 57. By going on and on we'd only scare people away. For
> >> >> this section, the people to be engaged would be Harry and Ed Summers
> >> >> and others who think this way. They are not formalists and already
> >> >> have little patience with careful analysis. They should not be
> >> >> bombarded with details.
> >> >>
> >> >> The presentation has to be as brief as possible - just long enough to
> >> >> enable them to recognize that this is the solution that they're
> >> >> proposing, while allowing us to describe the solution in terms used
> >> >> elsewhere in the document to make comparisons possible.
> >> >
> >> > Yes, I agree that when it comes time to present this, we need to make it
> >> > as succinct as possible.  But I don't think we're anywhere near being
> >> > able to do that yet.  I think *we* first need to come to agreement on
> >> > what the problem is.  Once we have done that in enough detail to be
> >> > sure we have captured it, we can then try to simplify the presentation.
> >> > But thus far, I keep seeing too much hand waving and claims about things
> >> > that are not observable.
> >>
> >> I don't know where I'm being unclear. As far as I can tell I have
> >> described the problem pretty well, and since you and I have been
> >> arguing along these lines for years, I have little confidence that
> >> *we* will ever agree. But if you can point to particular points that
> >> you think are unclear or are open to misinterpretation, that might be
> >> helpful as it may give me a chance to sharpen the prose.
> >
> > I'll do my best.
> >
> > thanks,
> > David
> >
> >
> >>
> >> Jonathan
> >>
> >> >> As I said I rewrote 5.5 last week. I just now fixed a couple of
> >> >> problems with it and have tried to fix up a couple of things that might
> >> >> have confused you.
> >> >
> >> > I'll take a look at that also.  I haven't gone through your re-write
> >> > yet.
> >> >
> >> > thanks,
> >> >
> >> >
> >> > --
> >> > David Booth, Ph.D.
> >> > http://dbooth.org/
> >> >
> >> > Opinions expressed herein are those of the author and do not necessarily
> >> > reflect those of his employer.
> >> >
> >>
> >>
> >
> 
> 
> 

-- 
David Booth, Ph.D.
http://dbooth.org/

Opinions expressed herein are those of the author and do not necessarily
reflect those of his employer.

Received on Tuesday, 12 April 2011 03:39:49 UTC