- From: Jonathan Rees <jar@creativecommons.org>
- Date: Wed, 17 Mar 2010 14:15:01 -0400
- To: www-tag@w3.org
AWWSW status (F2F prep)

The AWWSW group was started because Alan Ruttenberg and I were doing quite a bit of ontology design and ontology advising and didn't understand the resource/representation relationship (and the "information resource" idea, which is intimately bound up with it) well enough to do our work or guide others. The question comes up when you have things that you want to give a URI to, and you want to use 200 responses (non-# non-303 URI), but want to be protected against someone coming along later and saying "hey, that's not an information resource," or "but you said it's an IR, and that implies xxx" where you don't mean to say xxx, or "that's an IR, but not the one you want it to be".

This is dual (equivalent) to the question: Suppose you get 200 responses, is it OK to then decide that the named resource is some particular thing or has certain properties? E.g., if I am the owner of dx.doi.org, can I say that the URI http://dx.doi.org/10.1093/bib/bbn051 names the journal article that's indicated in the representation (so that I can license others to use the URI when recording metadata)? (Note that this is a subtle example. The httpRange-14 rule by itself is not adequate to rule this in or out. In particular the representation might fail to be "of" the journal article even if we decide the journal article is an IR. Also there is redirection involved, which complicates things further.)

Alan and I approached the TAG, which said essentially "you figure it out." (Shortly thereafter I discovered that I was on the TAG.)

Some ontologies where this is an issue include FRBR, Dublin Core, Bibo, SWAN, CiTO, IAO, and IRW, but as the practice of metadata deployment, document and media annotation, etc. increases (perhaps with the help of the Link: header?), I expect there to be many more.
A broader motivation, which I share with TimBL, is that if we had a logical framework (perhaps expressible in RDF or OWL), we'd have a tool that we could use to help clear up a number of web architecture muddles. httpRange-14 is just an example; another recent one on www-tag was "are HTML elements information resources?"

A third motivation is that an RDF vocabulary for webarch could be useful in a number of application domains, e.g. testing and validation, or recording change logs (e.g. Memento), or "HTTP over SPARQL", or further developing Tim's generic resources ontology (genont).

Additional concerns have been raised in the group about how URIs might become bound to things, but I have not pursued this (yet). My current theory is that URI binding is a personal matter subject to your belief set, and how you come to that is your own business. You may choose to let what happens on the Web influence your beliefs, and there may be a recommended elective way to allow this to happen, and perhaps an outcome of this project, in the future, might be such a way.

I can't say we've made a lot of visible progress, but I think I do understand the problem better now than I did before.

First, Roy Fielding is right: We're not just talking about HTTP semantics, but rather the semantics of that part of web architecture that is expressible in HTTP. This includes the resource/representation relationship, the various redirects (including 303), and possibly existence (creation and deletion). I think webarch as deployed might include REST as a subset, but certainly there are resources deployed using GET+200 that do not obey REST discipline, and we need to account for these somehow.

Second, TimBL has provided more information about his view of what is and isn't an information resource, and he thinks they're like.
I have been unable so far (my inadequacy) to combine these use cases with other constraints (such as grandfathering all possible web pages) into an actionable definition that makes sense to me, but I continue to work at it.

Third, "authoritative" per the updated http: URI scheme in HTTPbis is, I think, orthogonal to the R/R problem. The "authoritative" responses do not determine the resource uniquely; they only say that it belongs to a class of resources that participate in the R/R relationships communicated by the responses. A contradiction between an "authoritative" response and other information believed about the resource might lead you to discount the "authoritative" response (as recommended by the GBIF persistent identifiers report) or to stop using that URI to name the resource, just as easily as it might lead you to doubt what you thought you knew about the resource.

Of course, the ability of an agent to speak HTTP-authoritatively about a resource may be due to the agent's ability to control the resource and therefore its "representations". For these particular resources, the R/R relationship holds because the agent says so. For others (such as Moby Dick) it might hold in spite of what the agent says.

I am concentrating on the resource/representation relationship. My ambition is that if we have a story about when this holds and doesn't hold - in particular how to falsify it - then answering the question "what is an information resource" will fall out as a side effect: an IR is simply something which happens to be able to participate in this relationship.

The best lead I've encountered so far in understanding the relationship is ABLP logic, as is being pursued by Dan Connolly. It may be that ABLP can't be used directly, as convincing someone that a web page is a principal, or that "principal" has any ontological consequence, might be a tough sell.
Or it may be that this, too, is an ontological wild goose chase, or that ABLP is about the URI/resource relationship instead of the resource/representation relationship. But it's worth pursuing.

Open issues on which these considerations impinge:

  ISSUE-50 URNs and registries - persistence vs. trust in "authority"
  ISSUE-57 HTTP redirections - consequences of 30x
  ISSUE-63 metadata architecture - metadata for http:-named resources
  ISSUE-53 generic resources (appears to be closeable)

Next step (for me): Look in more detail at the kinds of metadata, including class memberships, one might want to write using the abovementioned ontologies for some sample resources, and attempt to generalize from there.

I'll try to have slideware ready in time for the F2F.

Thanks to Michael Hausenblas and David Booth for their help. This email is in the first person because they haven't seen it to agree with it or not, but I am happy to expand "I" to "we" for anything they want to take credit for above. Thanks also to many others including Alan, Tim, Harry Halpin, Stuart Williams, and Noah for their contributions.

Jonathan

too pressed for time to look up URIs for all the things cited. here are the obscurest ones:

  memento: http://www.readwriteweb.com/archives/memento_protocol-based_time_travel_for_the_web.php
  gbif: http://www2.gbif.org/Persistent-Identifiers.pdf
  iao: http://code.google.com/p/information-artifact-ontology/
  genont: www.w3.org/DesignIssues/Generic.html

the others you should be able to get from google or tracker.
Received on Wednesday, 17 March 2010 18:15:35 UTC