- From: Tim Berners-Lee <timbl@w3.org>
- Date: Fri, 2 Sep 2011 15:41:36 -0400
- To: Ian Davis <me@iandavis.com>
- Cc: Jonathan Rees <jar@creativecommons.org>, Harry Halpin <hhalpin@ibiblio.org>, Manu Sporny <msporny@digitalbazaar.com>, www-tag@w3.org
On 2011-08-31, at 16:52, Ian Davis wrote:

> I think there are a number of contributing factors:
>
> [... see my previous red herring message]
>
> 2) Fragments are not sent to the server when they are dereferenced which means the server has to guess what information to send.
>

Well, this is an architectural decision. There are two valid architectures.

1. There is a very common architecture in which the document is a well-defined unit. In business, there are documents like catalogs, orders, delivery notes, invoices, and checks, which have specific provenance, trust, and role in various protocols. If I want to refer to a line on an invoice, then it is reasonable to get back the whole invoice, as the graph in the invoice is a considered set of triples, which were issued as a message on a specific date, by a specific author, and which only make sense together. Business protocols operate in terms of these documents, and their integrity and the ability to express data about them are crucial.

In law, similarly, an abstract concept is typically defined by reference to a particular document -- an act or regulation. The Act is the unit of information; it has (as invoices do) references to others, but it has well-defined bounds, provenance, and metadata. [1]

Many systems are built so that the documents, while they may get big, are constrained not to be massive, as they are the units of transport, and everything is hunky-dory.

So in these systems, it isn't a question of the server having to guess what information to send. It is the publisher; it knows what information goes in a document. It publishes various documents, some about similar things, and when people quote a URI they quote it to another person knowing what sort of info it will return. So when I say "Hi, I am <http://www.w3.org/People/Berners-Lee/card#i>" I am using an identifier which specifically refers to me as on my business card. That's useful.

One way of looking at it is that a document-based system has been defined, and the design of that system determines what goes into a document, what the server sends, and what the client expects.

2. There is another architecture where there is no concept of a document. This arises in the semantic web when we have aggregated large amounts of data and are running a query service behind which there is a large aggregation of data, with all manner of shapes of graphs. We find there is no typical size of node; in fact, to be precise, the whole graph becomes scale free. There is no natural division of the data into documents. In these systems the operation of a GET on an item is not so well defined. Indeed, you ask, rightly, how should the server know what information to send if I ask not for a document, but for a node?

In these systems it is natural not to use a hash; after all, there is no document, and so no document URI. In these systems we currently use 303. (I wish we had a 209 or something, as 303 is a terrible waste of roundtrips.) Of course one can access these things using SPARQL, which resolves the question. But suppose a client doesn't know a lot about the graph, and just wants to ask about the item itself?

In these systems, though, the problem that the server doesn't know what to send has not gone away; it has reappeared in a different form. The server, looking at a random node in a random graph, has to guess what the client wants to know. This is the same as the SPARQL DESCRIBE problem, of course. In fact, in many cases, giving the client the immediate arcs, but recursively including those on unidentified bnodes, will actually result in a graph of reasonable size. That is my favorite describe algorithm. But you probably want to have a limit on that in a real system containing arbitrary graphs.

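The describe approach mentioned above (immediate arcs, plus recursion through blank nodes, with a size limit) might be sketched roughly as follows. This is only an illustration under stated assumptions: triples are plain (subject, predicate, object) string tuples, blank nodes are written with a "_:" prefix, only outgoing arcs are followed, and the names `describe`, `is_bnode`, `max_triples`, and the sample `GRAPH` are all hypothetical rather than taken from any real store.

```python
from collections import deque

# Assumption: a blank node is any term written with a "_:" prefix.
def is_bnode(term):
    return term.startswith("_:")

def describe(graph, node, max_triples=1000):
    """Collect the arcs leaving `node`, then recursively the arcs leaving
    any blank-node objects, stopping once `max_triples` is reached."""
    # Index triples by subject so each lookup is cheap.
    by_subject = {}
    for triple in graph:
        by_subject.setdefault(triple[0], []).append(triple)

    result = []
    seen = {node}
    queue = deque([node])
    while queue:
        subject = queue.popleft()
        for triple in by_subject.get(subject, []):
            if len(result) >= max_triples:
                return result  # hard limit for arbitrary graphs
            result.append(triple)
            obj = triple[2]
            # Named nodes are left for the client to dereference itself;
            # only unidentified (blank) nodes are expanded in place.
            if is_bnode(obj) and obj not in seen:
                seen.add(obj)
                queue.append(obj)
    return result

# Hypothetical sample data, made up for this sketch.
GRAPH = [
    ("/id13498579#it", "name", '"Example item"'),
    ("/id13498579#it", "address", "_:b1"),
    ("_:b1", "city", '"Boston"'),
    ("_:b1", "geo", "_:b2"),
    ("_:b2", "lat", '"42.36"'),
    ("/id13498579#it", "knows", "/id2#it"),
    ("/id2#it", "name", '"Other item"'),
]

description = describe(GRAPH, "/id13498579#it")
```

Here the description of /id13498579#it picks up the nested blank-node data but stops at the named node /id2#it, and passing a small `max_triples` truncates the result.
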
Now technically on the web, you can use hash URIs fine here, where they are of the form /id13498579#it and the document /id13498579 contains the data just about id13498579#it. So there is a virtual document, the result of doing a SPARQL DESCRIBE id13498579#it query, whose URI is /id13498579. This is actually useful, as even if the server doesn't have a document concept, other people do, and someone can annotate it to say whether they trust it, etc. I assume the only issue with that is that it looks ugly.

So *my* summary of why people don't like hash URIs would be that while for the first sort of system they seem natural, for the second they look ugly. Which is not an insignificant issue!! It makes code which generates them more complicated. It makes them more difficult for people to remember, and so on.

> If you're storing data for that URI in a database you have to key it against the hashless version of the URI along with all other URIs that share that hashless part.

No, you can just generate the hash URI of the form table6/id-821374#it where the #it is a constant addition. You don't have to store the relationship in your table.

> Also the server can't log accesses to the full URI which means you don't get accurate analytics.
>

With the #it method above it can, as there is a 1-1 correspondence.

> 3) You can't use HTTP headers or status codes to refer to a hash URI. For example you can't 404 a hash URI or redirect it.

With the #it method above you can, as there is a 1-1 correspondence.

>
> 4) The role of the fragment is changing in modern web development practice. It's becoming a bearer of state and/or part of the interaction architecture of an application. See #! URLs or javascript techniques for tabbed pages.

The fact that people are building on the fragid in one way doesn't mean we shouldn't also build on it in this way.

>
> Ian
>

Tim

[1] "The residential status of a person is decided under two different Acts, one under Income Tax Act, 1961, ( I.T. Act) and another under Foreign Exchange Regulation Act, 1973 (FERA). The concept of Non-Resident under FERA is different as compared to that under Income Tax Act. Under Income Tax Act, the residential status of a person is determined on the basis of number of days he stays in India whereas under FERA, it is the intention of a person to be in India or outside India would be an important factor determining his residential status." - http://www.vakilno1.com/nri/taxation/definitions.htm

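The constant "#it" scheme from the reply above can be illustrated with a small sketch. It assumes a hypothetical URI layout of the form table/id-N, and the helper names (`thing_uri`, `document_uri`, `thing_for_document`) are invented for illustration. The point it tries to show is that, because the fragment is a fixed suffix, the document URI the server sees maps back to exactly one thing URI, so nothing extra needs to be stored or guessed.

```python
from urllib.parse import urldefrag

SUFFIX = "it"  # the constant fragment, as in the "#it" examples

def thing_uri(table, row_id):
    """Mint the hash URI for a row (hypothetical table/id-N layout)."""
    return f"{table}/id-{row_id}#{SUFFIX}"

def document_uri(uri):
    """The part a client actually sends to the server: no fragment."""
    return urldefrag(uri).url

def thing_for_document(doc_uri):
    """Recover the thing URI from a requested (or logged) document URI."""
    return f"{doc_uri}#{SUFFIX}"

uri = thing_uri("table6", 821374)   # 'table6/id-821374#it'
requested = document_uri(uri)       # 'table6/id-821374'
recovered = thing_for_document(requested)
```

Under this assumption, a log entry or a 404 for table6/id-821374 can be read as a statement about table6/id-821374#it, since each document URI corresponds to a single thing URI.
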
Received on Friday, 2 September 2011 19:41:41 UTC