- From: David Booth <david@dbooth.org>
- Date: Tue, 12 Apr 2011 09:21:48 -0400
- To: Jonathan Rees <jar@creativecommons.org>
- Cc: Harry Halpin <hhalpin@w3.org>, AWWSW TF <public-awwsw@w3.org>
Just a few comments inline . . . On Mon, 2011-04-11 at 17:38 -0400, Jonathan Rees wrote: > Thanks for your detailed comments, and hoping to hear from you tomorrow. > > On Mon, Apr 11, 2011 at 1:50 PM, Harry Halpin <hhalpin@w3.org> wrote: > > Late, but better than never. Will try to make telecon tomorrow - > > reviewing, although I started a few days ago and missed JAR's latest > > changes: > > > > http://www.w3.org/2001/tag/awwsw/issue57/latest/ > > > > The goal of this document should be to precisely define the problem, > > perhaps iterate through a few possible solutions, and then finally settle > > on a solution. I see we have just got to describing the problem and some > > possible solutions. My big executive summary is: > > > > - Add in IRW ontology or specialized vocabulary as one solution > > I did include :accessibleVia for exactly this purpose and inspired to > some extent by IRW. Would like to hear your other ideas regarding this > approach - particular properties that solve particular problems. From > where I stand I don't see any others needed at present, other than > maybe one relating a URI to its definition and maybe one relating an > IR to a version of the IR. (I have my own ontology somewhere, can dig > it up.) > > > - Add in the need for a Metadata protocol > > Good, this goes in 4.7 I think. (assuming by "metadata" you really > mean "URI data" or something of the sort) > > > - And I strongly support "application/rdf+xml" mimetype, and future RDF > > mime-types, just meaning that this URI denotes whatever things the RDF > > statements that use that URI accessible from the URI itself describe. > > Would this include RDFa? If so then all Creative Commons metadata > becomes wrong, doesn't it? > > I think there are now about 8 or 10 RDF mime types. I think we might > need a registry to keep track of them. > > Does the RDF representation take priority over all others? I.e. you > would have to conneg for RDF if you wanted to know what the URI meant? 
> What if RDF were succeeded by something better in the future? > > How would you know whether the document defined the URI, and what > would you do if it didn't? > > > - I'm becoming more partial to a quoting mechanism to describe the named > > graph, i.e. the document itself. Historically in languages like LISP > > quotes mean "do not interpret", which is precisely what we want them to > > mean here. > > Yes, but how is this relevant? > > > - Add in part about browser support in browser. > > Good. > > > - 1. Introduction - > > > > Upon first reading, you use the term "Peak XY". Why not just use a URI, > > like http://www.example.org/PeakXY? I think the problem space should be > > constrained to "If a user-agent is presented with a URI, how can that > > agent determine the *intended* meaning of the URI." Right now, Section > > reads like an introduction to a general theory of discovering the meaning > > of symbols, which is difficult - albeit related - waters. > > Hmm. Wonder why what it says isn't clear - it seemed straightforward > to me. It's just stating a general pattern, not trying to relate to > any general theory of anything. > > The danger we constantly run is that people think that "meaning" means > something different in RDF as compare to the real world. This may be > true of linked data but it's not true of the way I use RDF (which I > think agrees with TimBL, OWL, the KR folks, etc.). So this is why I > like to hammer home the point that there are real phenomena of > communication and meaning and that RDF/URI is just one vessel for > them. Perhaps you can correct me, or tell me how to say it better. 
Perhaps I am in the camp of believing that '"meaning" means something different in RDF as compare to the real world', because: (a) "meaning" can be precisely defined in RDF (as a set of assertions, or a set of satisfying interpretations per RDF semantics), whereas it seems doubtful that humans could agree on a precise definition; and (b) RDF applications can easily follow precise algorithms to ensure that they are following proper protocols in determining meaning, whereas humans are not so good at that. > > > Given that caveat, I would notice that (before the "nature of definitions" > > paragraph) "The primary ways terms agents in natural language determine > > the meaning of a term is by its use in context. However, on the Web the > > context in which a URI is presented can often be limited, and as to enable > > interoperability between user-agents there should be a clear algorithm to > > follow that lets the intended meaning of the URI be clear." > > I don't see how the two situations are different. In RDF you get > meaning from context, and in natural language you get meaning from > dictionaries (e.g. French and the Academy's dictionary), and there is > nothing normative about FYN, it's just there to be helpful. > > > - intention of who? > > ? > > > - I am not entirely sure about this term "dereference". I would prefer the > > term "access", as I think its a bit more obvious. Can you explain a bit > > better the > > difference between them? It seems when you access/dereference, you can > > successfully use a HTTP code to retrieve an associated information > > resource. > > "Dereference" is something you do to a URI. "Access" is something you > do to a resource. To call a URI "accessible" would be simply > incorrect. > > I get this term from RFC 3986 and it seems as good a term as any. It's > nice because it's not too objectionable, and it's normative. > > Dereference is more general than HTTP. > > > 2. Glossary > > > > - Put Glossary at end. 
Otherwise, I doubt anyone will get past it. > > Sounds reasonable > > > accessible via > > When a URI is dereferenceable, "the information resource accessible > > via a URI" (abbreviated IR(that URI), see below) is the information > > resource whose versions are the versions obtained by dereferencing > > that URI. > > > > definition: > > > > The "information" could be prose, RDF, OWL, or some combination. -> > > "The "information" could be human-readable prose in natural language, > > machine-readable RDF, OWL, or some combination." > > How about: > any human-readable or machine-readable language, > or combination of languages. > > > fixed information resource > > > > I thought the entire point of this according to TimBL was that it was just > > an information resource that *did* not change. I would merge this > > definition with that of information resource, with fixed being just the > > subset that is not intended to change, in particular over time. > > Hmm. I think I had it that way in an earlier draft. It seems pretty > important, so that's why I pulled it out. > > It does need its own definition, since this is where everything > grounds out - information resource is defined in terms of it. This > goes back to whether to call them "representations" or "fixed > information resources", which I've brought up about ten times on this > list to little response. > > If you try to combine the two definitions, the way it reads is: an > information resource is -- hey wait, here's what a fixed information > resource is -- and then an information resource is -- > which is pretty awkward. > > > term > > > > A URI, word, name, or phrase that can serve in subject or object position > > in a statement. -> To be pedantic, a URI can also serve as a predicate. > > Just say "that can serve in a position that forms a statement. On the > > Semantic Web, statements are RDF triples where a URI could be in the > > subject, object, or predicate position.." > > hmm. will fix. 
> > by the way, do you have a reference for 'semantic web'? > > > refer > > > > For the purposes of this report, reference is just one way to mean. > > There may be other ways to mean other than to refer, but none are > > specified here. -> This just confuses me a bit. > > Why? > > >I tried to present a > > more coherent theory in my dissertation distinguishing between > > meaning/reference, but you can also just state that "To refer to > > something, a term should be understood by an agent as "standing-in" > > for some object in the universe of discourse, where that object can be > > separated from the term in space or time." > > That's more of a theory than I wanted to have here. That's why I said > "for the purposes of this document". So not sure what to do. > > > version (of an information resource) > > > > This just confuses me. An information resource associated with another > > one? So is anything linked a "version"? I know you've done some > > deep-thinking on this Jonathan, but I'm not convinced by this definition > > quite yet. > > In order to be able to explain IRs, we need to be able to apply > metadata predicates at both the representation level and the IR level, > and relate the two levels via metadata. (This is the ONLY way, after > five years, that I've been able to make sense of 'information > resource', and I think it works very well - a longer conversation.) But I don't think that is needed in *this* document, because *this* document (AFAICT) is about the *mechanics* of providing and obtaining definitions -- not about how they are interpreted. > > "a version of" is the relation between the representation-like things > that have metadata, and the IRs that all have metadata. We could say > that you can apply metadata to both representations and to IRs, and > (if need be) have a bigger class that's the union of the two, but I > feel that the IR idea comes across much better if IRs and > representations are seen as the same kind of thing. 
> > Since many TAG members seem to hate the idea of considering > representations to be IRs, or applying metadata to representations, I > chose the "fixed IR having metadata is a version of an IR having > metadata" approach over the "representation having metadata is a > representation of an IR having metadata" approach, inspired to some > extent by what Roy Fielding is doing with HTTPbis (which I initially > found repulsive but have come to accept). > > If I can manage to get enough of your time so that you can understand > what I'm saying, I'd be happy to take your advice on which of the two > approaches is more likely to be understood. > > > A fixed information resource associated with an information resource > > is a version of the information resource. -> "When an information > > resource that is fixed as an octet-stream but this resource is > > associated with another information resource that changes, the fixed > > information resources can be considered versions of the original > > changing information resource. For example, a version is "snapshot" > > of a changing information resource at a given time, or via forks, and > > so on." > > OK, this just tells me that my attempt to be precise, concise and > consistent has left me with something hard to understand. Maybe I > should go back to "representation" - although that has a long > *history* of confusing people... Yes, I think it will be more easily understandable if existing terms are used. > > Time for gensyms maybe. > > > Use-cases > > > > 3 - General methods in current use. > > > > 3.1 Colocate definition and use: "Just collating definition and use is not > > enough, as one of the features of URIs is that they can be removed from a > > given context and then re-used in another one." > > I would say it this way: that this is fragile because the use can get > separated from the definition. (SPARQL is a good example.) > > This would of course apply to the "just be clear" idea that you've > proposed - i.e. 
for an "ambiguous" URI just provide additional triples > to clarify which sense is meant. > > Hmm, so you're with David that the critiques should be inline with the > method descriptions. > > > 3.2 Link to documents containing definitions > > > > One could say "Link to a URI with the definition using a special kind of > > link", as I think you want to separate linking from just having the > > definition accessible from the URI." > > hmm. > > > 3.3 Register a URI scheme or URN namespace > > > > I think the answer to this should be a strong "No" and should be > > discouraged, rather than heavily described as currently is. I feel too > > much space is used on this example. > > There are people advocating this and they need to feel that they are > heard, in this document at least. We can follow on with something that > takes a stand later. > > And I think many readers will not be adequately informed on this > subject, and not realize that a URI scheme registration is the same > kind of thing as a URI definition, which they are. > > However you are probably right about the space. > > I don't think we're in any danger of anyone actually doing this. > > > 3.4 Use the LSID getMetadata() method > > > > I understand why this is in here, but again, I'd say discourage it. > > Again there is a specific audience who, like you and me, need to feel > they're heard; and this report shouldn't be encouraging or > discouraging, but just presenting information. > > > 3.5 'Hash URI' > > > > You might want to add "Combined with content negotiation, which determines > > the media-type, there could be a problem where the hash URI is therefore > > context dependent. So a hash URI for "http://example/sale#p16" could mean > > a segment of a document (paragraph 16) if "text/html" was returned, and > > could mean a resource describing a canoe if "application/rdf+xml" was > > returned. This is obviously problematic, but seems to be ignored by the > > RDF community so far in practice." 
You might want to add this to the > > "Critique" bit of Section 4. > > Yes, this is the kind of text I was intending to write > > > > 3.6 'Hashless URI' with HTTP 303 See Other redirect > > > > I'm going to point out yet another giant hole in the 303 story. How do > > you get "back" from a URI pointed to by 303? See my comment to 4.6 > > Why do you need to? And what if there are multiple ways to get back? > > Certainly there should be a URI for the URI-defined-in-document > predicate, and then you could just use the inverse of the predicate. > > > 4.1 "Fragment identifiers are fragile" -> "fragment identifiers are > > context-dependent" > > > > See above at 3.5. > > There are many problems, and conneg/session/useragent/time sensitivity > is just one of them. I guess each can have its own section heading; > will review. > > I was hoping for more detail from you on this: > "People forget to put it there > when writing and cut and pasting URIs." > Because it's outside my experience I can't write this up very well. > > > 4.4 303 is difficult, sometimes impossible, to deploy > > > > As the person who originally brought this up (you might want to cite my > > email by > > URI), this is a total mess for people to deploy unless they are using tools > > or comfortable using .htaccess. Also, some server software does not support > > .htaccess, and many people do not have access to edit their server's .htaccess > > files. > > > > Another problem is connecting the document URI back to the URI about the > > "resource". So when one uses http://example/p16 one gets redirected to > > http://example/about-p16. However, how does one go BACK from > > http://example/about-p16 to http://example/p16? One could imagine a > > back-link (we provide this type in IRW), but it's not clearly part of the > > status-code and there's no natural back-link. > > Yes, there needs to be a way to express the relationship. I'm hoping > that will be part of the follow-on consensus effort. 
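The back-link problem raised under 4.4 can be sketched in a few lines. This is only a hypothetical illustration, not anything from the draft: the `RESPONSES` table stands in for an HTTP server (no network is used), the URIs are the example ones from the thread, and the `describes` property name is an assumption standing in for whatever back-link predicate (IRW's or another) might eventually be agreed on.

```python
# Hypothetical in-memory stand-in for an HTTP server; no network access.
# Each entry maps a URI to (status code, Location header, body triples).
RESPONSES = {
    "http://example/p16": (303, "http://example/about-p16", []),
    "http://example/about-p16": (200, None, [
        # Assumed back-link triple; "describes" is an illustrative name only.
        ("http://example/about-p16", "describes", "http://example/p16"),
    ]),
}


def dereference(uri):
    """Follow 303 redirects and return (final_uri, triples)."""
    seen = set()
    while uri not in seen:
        seen.add(uri)
        status, location, triples = RESPONSES[uri]
        if status != 303:
            return uri, triples
        uri = location
    raise RuntimeError("redirect loop")


def back_links(document_uri, triples, predicate="describes"):
    """Return the URIs that the document explicitly says it describes."""
    return [o for (s, p, o) in triples if s == document_uri and p == predicate]


# Forward direction: the 303 takes us from p16 to about-p16.
final_uri, triples = dereference("http://example/p16")
print(final_uri)

# Reverse direction: the 200 response for about-p16 carries no trace of p16
# in its status or headers; only an explicit triple in the body gets us back.
print(back_links(final_uri, triples))
```

If the `describes` triple is removed from the table, `back_links` returns an empty list, which is the situation described above: nothing in the status code itself leads from http://example/about-p16 back to http://example/p16.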
> > > On a referential level, I'm just going to point out that the reason that > > the use of the 303 status code can not possibly tell us that the resource > > redirected from was used for referring, arises because the 303 status code > > was specified before the advent of the Semantic Web. As an HTTP response, > > there is no reason why it can't be used simply to redirect from one > > information resource to another information resource, and in fact that > > can be and is done. > > It was only used for POST. The HTTP WG felt that redefining it in the > case of GET was safe, so that's what they're doing for HTTPbis. > > We can amplify this in the rec track document we're going to produce, > although I think HTTPbis pretty well has it covered. > > > As put by RFC 1738 "this method exists primarily to allow the output of a > > POST-activated script to redirect the user agent to a selected resource, > > not to solve a logical problem about URIs on the Semantic Web and > > information resources." > > > > I'd add a critique: > > > > 4.8 There is no metadata discovery protocol > > > > I would add "There is no easy way, given a multitude of possible ways to > > access RDF about something, such as RDFa in HTML, following 303, Link > > elements in HTML (i.e. Dublin Core), and following Link headers. > > Therefore, given a URI, a developer does not know how to get all the RDF > > accessible from that URI, much less sort out contradictions if they arise > > in OWL. Practically, this means that a developer cannot deploy RDF at a > > URI and be assured of what RDF a consuming application will actually > > find." > > Yes, I think the document ought to say something like this, although > it's not a critique of any method, it's just a statement that > interoperability is desirable and therefore standardization is > desirable. > > > 5.1 Use something other than a URI > > > > Not bad as a reminder, but I'd delete and scope us to working with URIs. 
> > Seemed like an important reminder - if the "take it at face value" > solution gets traction then Creative Commons will need to retool and I > think this would be the best approach. I agree with Harry on this. Better to just delete this and keep the scope of the document to URIs. > > > 5.2 'Hash URI' with fixed suffix > > > > I also do not really see how this solves anything, it just introduces yet > > another arbitrary convention, and it doesn't solve any of the problems > > with hash URIs. > > It does not introduce a new convention that clients need to know > about. It doesn't solve all of the problems, but it does address one > complaint which is that the namespaces don't scale. We've heard this a > lot. In conjunction with other remediations such as avoiding conneg > and use of CURIEs I think it would work pretty well. > > > 5.3 'Hashless URI' with site-specific discovery rules > > > > I like this approach, but would note that the addition of using > > .well-known and .host-meta will require a general metadata discovery > > protocol. > > Yes, this method would require people to know about it - but only > those who care about the performance benefit. I'm not sure why this > wasn't clear - I do say "this is a new protocol". > > Benefiting from this would not require a standard discovery protocol, > but it would be nice if it meshed well with information resource > discovery. Will need to think about that. > > (Sandro and I have been discussing this.) FWIW, I think this approach has potential also. > > > 5.4 'Hashless URI' with new HTTP request or response > > > > I agree this might work, but you still have the "reversability" problem > > noticed earlier, and it adds unnecessary complexity. > > > > 5.5 'Hashless' URI dereferences to its definition > > > > I agree with Ed basically. 
There is no reason why a URI cannot refer to > > both to an object and its description, see URI rule: If IR(u) has a > > version with media type 'application/rdf+xml', then take u to be defined > > by IR(u), otherwise take u to refer to IR(u). > > But not turtle, n3, OWL, RDFa, Manchester syntax, right? And with a > priority system so that you *must* check for application/rdf+xml > first? > > > This should just be part of the media-type definition. As RDF does not > > constrain reference to a single *thing* (see the paper on "In Defense of > > Ambiguity"), the best we can do with RDF is provide a description whose > > interpretation can be a number of things, some of which may be other URIs > > and others which may be things in the world outside the Web that we want > > to refer to. We can assume when someone is publishing RDF at a URI that > > their URI refers to *anything* that satisfies the interpretation of the > > RDF statements available at that URI that use that URI. > > Even if the URI doesn't occur in the document, right? Then it could > refer to anything at all. RDF statements can still indirectly constrain the interpretations, even if the URI does not appear in those statements. > > > If they want to refer to the document itself, they need to give that a > > distinct name, i.e. a named graph. > > No, I gave a solution in the draft, you just say [ :accessibleVia > "http://example/doc" ] > meaning the IR at that URI. > > Can you explain what named graphs have to do with this? > > I don't mean to be testy; I am really interested to understand your > view and am sort of desperate for interaction with someone who takes > this view - I've not been successful at getting others on this list to > represent it. 
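The URI rule quoted under 5.5 ("If IR(u) has a version with media type 'application/rdf+xml', then take u to be defined by IR(u), otherwise take u to refer to IR(u)") can be written down as a small function. This is a rough sketch under stated assumptions, not a proposal: the function name, the return labels, and the idea of passing in a list of media types for the versions of IR(u) are all hypothetical. Which media types count as "RDF" is exactly the open question in the reply above, so it is left as a parameter.

```python
# Media types treated as "RDF" for the purposes of the rule. The thread
# leaves open whether Turtle, N3, RDFa, etc. belong here, so this default
# is an assumption that callers can override.
RDF_MEDIA_TYPES = frozenset({"application/rdf+xml"})


def normalize(media_type):
    """Drop parameters such as '; charset=utf-8' and lowercase the type."""
    return media_type.split(";")[0].strip().lower()


def interpret(u, version_media_types, rdf_types=RDF_MEDIA_TYPES):
    """Apply the quoted rule to URI u.

    version_media_types: media types of the known versions of IR(u).
    Returns ("defined-by", u) if some version is RDF, else ("refers-to", u);
    in both cases the second element stands for IR(u).
    """
    if any(normalize(mt) in rdf_types for mt in version_media_types):
        return ("defined-by", u)
    return ("refers-to", u)


# An RDF/XML version is available, so u is taken to be defined by IR(u).
print(interpret("http://example/p16", ["text/html", "application/rdf+xml"]))

# Only HTML is available, so u is taken to refer to IR(u) itself.
print(interpret("http://example/doc", ["text/html; charset=utf-8"]))

# The same URI flips depending on whether Turtle is on the list.
print(interpret("http://example/t", ["text/turtle"]))
print(interpret("http://example/t", ["text/turtle"],
                rdf_types={"application/rdf+xml", "text/turtle"}))
```

The last two calls show why the choice of media types matters: with only 'application/rdf+xml' on the list, a Turtle-only resource is read as referring to the document, and adding 'text/turtle' turns the same URI into one defined by the document.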
> > > Then there should be some convention > > that says we are talking about the description itself, not its > > interpretation, which could be something as simple as using the URI of the > > named graph in quotes (finally, a good use for distinguishing strings from > > URIs). This is also done via quotes in N3. > > sorry, I don't get it. > > > 5.6 'Hashless' URI dereferences to its definition (incompatibly) > > > > This would also work for me, and I don't see the difference really between > > this and 5.5. > > Under 5.5 Creative Commons and Tabulator still work. Under 5.6 they > don't. So from my POV there is a huge difference. The Creative Commons license use case is a really excellent use case for demonstrating this ambiguity issue. Since the point of *this* document is to focus on the *mechanics* of providing and obtaining definitions -- not interpreting them -- then it seems to me that it falls outside of the scope of this document. But I do think it is a use case that we should exploit in another document. David > > > I'm going to point out two other solutions: > > > > 5.7 Get browsers to do something with RDF > > > > One of the reasons for Linked Data 303 has been that you can put the URI > > in the browser and get something resembling an human-readable HTML out via > > 303+conneg. However, it seems odd to use content negotiation and 303 when > > it seems like the real problem is that browser vendors do not support > > doing something interesting with RDF, so that when a page is uploaded > > I would like to include something like this but have not been able to > think of anything concrete enough. > > If we were to ask for a miracle from the browser folks, I wonder if a > new URI scheme might be the answer... > > Your sentence seems to have been cut off. I'd like to hear more. > > > 5.8 Use an ontology to describe the status of the resources > > > > My one request is to *please* add this. 
This is the entire point of the > > IRW ontology is to give people the options to do this. That people, if > > they wish to try to constrain interpretations in some meta-logical > > fashion, make distinguishments between IRs and NIRs, and so on - at least > > be given the option of making what they want *explicit* and they can do > > that in RDFa, RDF/XML available via 303, RDF/XML published directly > > without 303, Link Headers, and the like. > > As I said above this is something I want to do in the > consensus-building phase. It's sort of a meta-problem but I'll try to > say something about it. > > There is no reason to distinguish IR vs. NIR, by the way - I really > would like to squash this meme. What you need is the distinction > between u defined by IR(u) and u refers IR(u), which is different. > > > 5.9 Combine all the various approaches in a unified Metadata Discovery > > Protocol. > > > > See above comments, but something for RDF modeled on Eran's "Web Linking" > > draft would be ideal. To be honest, we really need to simplify the RDF > > stack to get it to take off, and I think the largest simplification would > > be "just publish RDF" and "here's a very clear protocol implementers can > > follow to get all RDF from a given URI" that then includes all the various > > cruft that the community has generated. > > I'd love to see clarity and consensus. This document is just the first > step, and building or altering consensus on httpRange-14 would be the > second. > > I actually thought there was a fairly well understood procedure for > getting a definition, but what do I know, I'm not in the trenches > these days. > > I can't do this alone. After this document goes out some volunteers > may appear, but who do you think would be interested in working on the > problem? > > Jonathan > > -- David Booth, Ph.D. http://dbooth.org/ Opinions expressed herein are those of the author and do not necessarily reflect those of his employer.
Received on Tuesday, 12 April 2011 13:22:13 UTC