- From: Sean B. Palmer <sean@mysterylights.com>
- Date: Wed, 28 Nov 2001 05:05:18 -0000
- To: <www-archive@w3.org>
To hash or to slash... here are some things to consider. This is also me doing my bit for rdfms-fragments [1].

Abstract: The question is whether or not "namespaces" on the Semantic Web should end with a hash "#", or a slash "/" (or even a question mark "?"). For example, the RDF Schema terms are all in the "namespace":-

   http://www.w3.org/2000/01/rdf-schema#

whereas the Dublin Core terms are in:-

   http://purl.org/dc/elements/1.1/

The difference? Terms with the former will necessarily be URI-references, that is, a URI with a FragmentID on the end, whereas terms with the latter will most likely be URIs.

TimBL's view is, apparently, that of the hashist. Quotable notables:-

[[[
It is important, on the Semantic Web, to be clear about what is identified. An http: URI (without fragment identifier) necessarily identifies a generic document. This is because the HTTP server response about a URI can deleiver a rendition of (or location of, or apologies for) a document which is identified by the URI requested. A client which understands the http: protocol can immediately conclude that the fragementid-less URI is a generic document. This is true even if the publisher (owner of the DNS name) has decided not to run a server. Even if it just records the fact that the document is not available online, still a client knows it refers to a document. This means that identifiers for arbitrary RDF concepts should have fragment identifiers. This, in turn, means that RDF namespaces should end with "#".
]]] - http://www.w3.org/DesignIssues/Fragment

[[[
20:48:31 <timbl> The HTTP implicitly says that http; URIs idenifiy works - abstract generic documents.
[...]
20:53:55 <timbl> Anything which uses the same URI for a document about Dn and for Dan himself.
20:54:12 <timbl> You can't use an HTTP URI for Dan because HTTp can't return you Dan.
[...]
20:54:43 <timbl> Now, if we had a "277 Here is some stuff ABOUT what you wanted" then we could have abstarct things.
[...]
21:12:49 <timbl> danbri: 1) if i trust any document which talks about ...#Person I will know thinsg about it and 2) if I do a HTTp GET then I go through a process which incldues the media type and I end up with more (rather definitive in this cae) infromation about ...#Person.
[...]
21:14:20 <timbl> To do this properly, RDF needs its won content type. because text/xml fragids refer to parts of a document, not to an abstract object described by th document.
[...]
21:21:15 <timbl> Now, when the document is XML, then you have to look to the namespace to find out what the thing means.
[...]
21:22:41 <timbl> because o this limitation, publishers cannot allocate URIs with fragment identifiers which could be construed difefrently for documents for whcih they support content negotiation.
]]] - http://ilrt.org/discovery/chatlogs/rdfig/2001-03-31.txt

A rather clear view unfolds...

[[[
The HTTP spec provides a whole protocol for giveing representations of documents. You can't change a few words and make it a protocol for getting information about abstract things described by documents.
[...]
The problem with the worldnet "Logo" URI ( .../Logo) is that it actually does identify, usefully, a document. We still need the URI for that document. Fortunately, the fragment ID allows us to refer to something defined or described by the document, and that can be quite abstract.
[...]
Documents are documents. They are powerful because (with HTTP and slew of existing and future languages) we can do a whole lot with them. We can argue about their contents logically. I don't mind the semantic web architecture being built on a infrastructure of documents ((and messages)).
]]] - http://lists.w3.org/Archives/Public/www-rdf-interest/2001Nov/0182

So, from this POV, we can conclude:-

1) Resources identified with HTTP URIs are necessarily "documents"
2) Concepts can be described by these documents.
   Content negotiation worries are a known issue
3) In XML, the namespace of the language is inherently bound to defining what the FragIDs mean

On the "content negotiation" worries, DanBri is often wary about it:-

[[[
20:55:30 <danbri> Aside from that, there is the architectural quirk that HTTP content negotiation, lang neg etc allow the same URI to be associated (in complex ways) with 'concrette' documents at various stages of their lifecycle.
[...]
21:12:48 <danbri> ie. when you're not in a retrieval contenxt, _something_ else determines what #foo means.
]]] - http://ilrt.org/discovery/chatlogs/rdfig/2001-03-31.txt

[[[
'#' is a downright broken bit of web architecture. The '#' fragment/view semantics are defined as being relative to the mime type of the object. Since mime types can be content-negotiated, that's hairy since a single URI plus '#' doesn't mean much without additional assumptions about mime types. For example, http://www.w3.org/Icons/WWW/w3c_main has both GIF and PNG mime-typed variants. So the semantics of http://www.w3.org/Icons/WWW/w3c_main#foo can't be considered outside the context of some HTTP transaction, since the mime type of the resource isn't an instrinsic property of the resource identified.
]]] - http://lists.w3.org/Archives/Public/www-rdf-interest/2000Mar/0028

Indeed, I have tried to sort this subject out myself:-

[[[
Another fact that comes into play is that persistence across FragID space should also be maintained. When we derference a URI in one browser, we expect the dereferenced result to be similar in functionality to the same URI having been dereferenced in another browser. We tend to get upset when it isn't [...].
This is similar to the fact that if we have different ACCEPT headers in our HTTP requests, we still expect resources that are similar in functionality, and I suggest that that functionality extends to FragIDs representing more or less the same "thing", notwithstanding the fact that the interpretation of a FragID depends upon the amount of information available, and that this clearly depends upon the method of derferencing, and the context.
]]] - http://lists.w3.org/Archives/Public/uri/2001Oct/0019

Aaron has argued strongly on many occasions that HTTP resources can be anything. His namespaces are steadfastly "#"-less, and he has even argued that URI-views don't belong in RDF. I asked him about rdfms-fragments:-

[[[
01:43:07 <AaronSw> well you know my argument... it's just not a resource
01:43:20 <sbp> It is according to RDF M&S
01:43:26 <AaronSw> pffft
[...]
01:44:26 <sbp> if you start using HTTP resources as RDF terms, you lose a way to address the HTTP resource as a network retrievable entity
01:44:39 <sbp> case in point: your logicerror.com stuff
[...]
01:45:47 <AaronSw> Well that needs to be sorted out, but # is not the solution.
01:46:04 <AaronSw> the network retrievable entity never had the URI
01:46:15 <sbp> Precisely, it needs to be sorted out. HTTP URIs can't identify two different resources. So, either "#" or "urn"
01:46:15 <AaronSw> HTTP is very clear on this: a URI represents a Resource, which can be anything
01:46:25 <AaronSw> the server just sends back a bag of bits which is somehow a resource.
01:46:30 <AaronSw> err related to the resource
[...]
01:51:17 <sbp> as TimBL said, you can't ask for Dan over HTTP
01:51:51 <AaronSw> Of course not!
01:52:14 <AaronSw> what's your point
01:52:30 <sbp> you can't identify Dan in HTTP space
01:52:41 <sbp> because he's not there
01:52:45 <AaronSw> That's not true.
01:53:03 <AaronSw> HTTP identifies resources, of any sort.
01:53:12 <AaronSw> But it can't return resources, no protocol can.
01:53:31 <AaronSw> Amazon is an implementation of the isbn: scheme
[...]
02:00:02 <sbp> Point me to the acceptable return code for a query on Dan
02:00:11 <sbp> http://www.w3.org/People/Connolly/Dan
02:00:34 <AaronSw> 200
02:00:49 <AaronSw> 200 OK
02:01:26 <sbp> 10.2.1 200 OK
02:01:27 <sbp> The request has succeeded. The information returned with the response
02:01:27 <sbp> is dependent on the method used in the request, for example:
02:01:27 <sbp> GET an entity corresponding to the requested resource is sent in
02:01:27 <sbp> the response;
02:01:33 <sbp> what entity should be returned?
02:01:49 <sbp> is has to be a representation of Dan. That's impossible
02:02:05 <AaronSw> A page that states, the URI you have requested represents Dan Connolly and gives a description of him, etc.
02:02:08 <AaronSw> a representation?
02:02:12 <AaronSw> it doesn't say that!
02:02:21 <AaronSw> it says: "an entity corresponding to the requested resource"
02:02:28 <AaronSw> that's what's being returned, my friend
02:02:58 <sbp> Well, I'm very unhappy about it. I'd say that it's information *about* Dan, not Dan himself
02:03:34 <AaronSw> Well it corresponds to the resource, no?
02:03:39 <AaronSw> I mean, I don't want to give you Dan.
02:03:47 <AaronSw> Perhaps I should send you a Not Authorized?
02:03:58 <AaronSw> Don't take away my Danny-boy!
02:05:22 <sbp> I think you're twisting what HTTP should be able to do... it's a hypertext transfer protocol: transferring data suitable for HyperText systems. That's just data, MIME an' all
02:05:35 <AaronSw> I never contradicted that.
02:05:42 <AaronSw> I don't disagree.
02:06:00 <sbp> er... so you agree?
02:06:11 <AaronSw> Yes, I agree that's what HTTP Is supposed to do.
02:06:22 <sbp> * sbp wonders why you didn't save a few characters
02:06:26 <AaronSw> But there's also a social contract, of sorts, involved.
02:06:36 <sbp> It's a very weak social contract
02:06:39 <AaronSw> If I request a resource, I want something related back.
02:06:52 <AaronSw> Otherwise URIs wouldn't be very useful.
]]] - http://blogspace.com/swhack/chatlogs/2001-08-03.txt

As you can tell, my opinion switches from one way to the other depending upon the weather... but Aaron remains quite set.

Onto Roy Fielding's stuff:-

[[[
In any case, since there is nothing that cannot be identified by an http URL, including the notion of "nothing" should someone be inclined to dedicate an identifier to it, I just cannot understand why this question keeps being raised. It is an identifier, pure and simple, and has all the mathematical properties of any other symbolic identifier. Enumerating those properties in every spec is a hopeless waste of time.
]]] - http://lists.w3.org/Archives/Public/uri/2001Nov/0027

[[[
> That's like saying that, because a 'mailto:' URI is a URI and
> URI's can identify anything, I can use a 'mailto:' URI to
> denote an abstract concept ...

Yes, you can. It is just an identifier. A variable. A mathematical symbol described by a sequence of characters in a syntax defined by the first part of that string leading up to the colon character.
]]] - http://lists.w3.org/Archives/Public/uri/2001Nov/0044

Speaks for itself. Roy and Mark Baker have done a lot of work talking about REST, which seems pertinent to the whole discussion. Mark's view?

[[[
Earlier you suggested that "brilliance" was abstract, yet I happen to have a URI for it here;

http://www.markbaker.ca/2001/11/Brilliance/
]]] - http://lists.w3.org/Archives/Public/uri/2001Nov/0038

At this point, we can quite easily go around in circles. The questions still remain, and are:-

* What do HTTP URIs necessarily identify?
  What are the semantics of these resources, and how do they differ from the broad set of all resources that may be denoted by a URI?
* What do fragment IDs identify, and how do they relate to the concepts of "resource" in both the URI and RDF senses of the word?

I would argue that the definition of "resource" is consistent across both the URI and RDF specifications.

RFC 2396:-

[[[
Resource
   A resource can be anything that has identity. Familiar examples include an electronic document, an image, a service (e.g., "today's weather report for Los Angeles"), and a collection of other resources. Not all resources are network "retrievable"; e.g., human beings, corporations, and bound books in a library can also be considered resources. The resource is the conceptual mapping to an entity or set of entities, not necessarily the entity which corresponds to that mapping at any particular instance in time. Thus, a resource can remain constant even when its content---the entities to which it currently corresponds---changes over time, provided that the conceptual mapping is not changed in the process.
]]] - http://www.ietf.org/rfc/rfc2396.txt

RDF M&S:-

[[[
Resources
   All things being described by RDF expressions are called resources. A resource may be an entire Web page; such as the HTML document "http://www.w3.org/Overview.html" for example. A resource may be a part of a Web page; e.g. a specific HTML or XML element within the document source. A resource may also be a whole collection of pages; e.g. an entire Web site. A resource may also be an object that is not directly accessible via the Web; e.g. a printed book. Resources are always named by URIs plus optional anchor ids (see [URI]). Anything can have a URI; the extensibility of URIs allows the introduction of identifiers for any entity imaginable.
]]] - http://www.w3.org/TR/REC-rdf-syntax/

Because URIs can denote anything with identity, it follows that *whatever* URI references + FragID (i.e.
with hash) denote, they denote a subClassOf the resources that URIs can denote (which may be equivalent). It's not a difficult piece of logic to grasp.

This is what RFC 2396 has to say about URI references:-

[[[
The semantics of a fragment identifier is a property of the data resulting from a retrieval action, regardless of the type of URI used in the reference. Therefore, the format and interpretation of fragment identifiers is dependent on the media type [RFC2046] of the retrieval result. The character restrictions described in Section 2 for URI also apply to the fragment in a URI-reference. Individual media types may define additional restrictions or structure within the fragment for specifying different types of "partial views" that can be identified within that media type.

A fragment identifier is only meaningful when a URI reference is intended for retrieval and the result of that retrieval is a document for which the identified fragment is consistently defined.
]]] - http://www.ietf.org/rfc/rfc2396.txt

Also of interest is the "Are URI-References bound to resources?" thread on uri@w3.org. Here's a sample excerpt (Roy):-

[[[
[...] how is access control assigned to "things" on the Web. By the resource. Is it possible to define separate access control to different fragments of the same resource? No. Therefore, a fragment is not a resource until it is bound as some other URI by a naming authority that can control access to the fragments as separate, identifiable resources. Because if we decided the other way -- that a fragment was a resource too -- then we'd have to define a new term for that subset of "old-style resources" that were actually subject to the Web behavioral model. The same logic applies to many other aspects of the Web design beyond access control. That's why I separated the definition of resource and representation, and hence why REST is an acronym for representational state transfer.
I needed to do that for HTTP/1.1 because the old model just didn't fit things like CGI, Apache modules, and URN indirection.
]]] - http://lists.w3.org/Archives/Public/uri/2001May/0021

DanC:-

[[[
That's one view. The RDF spec takes another view. Note that the view taken by TimBL's cwm.py code is this "a fragment is not a resource" view; he's taken the view that it's too confusing to have two different definitions of 'resource' around, and that the RDF specs (or at least: his RDF code) should use 'Thing' for the general case of something that's either an RFC2396-resource or something-identified-by-an-absolute-URI-with-fragment-identifier.
]]] - http://lists.w3.org/Archives/Public/uri/2001May/0024

Still, the question for me remains the wholly practical one: when I create a new namespace, should I end it with a hash or a slash? I don't feel that we're coming any closer to answering that question, and that is quite saddening. Flip a coin, perhaps?

[1] http://www.w3.org/2000/03/rdf-tracking/#rdfms-fragments

--
Kindest Regards,
Sean B. Palmer
@prefix : <http://webns.net/roughterms/> .
:Sean :hasHomepage <http://purl.org/net/sbp/> .
Received on Wednesday, 28 November 2001 00:05:41 UTC