- From: Kendall Clark <kendall@monkeyfist.com>
- Date: Wed, 14 Dec 2005 11:30:17 -0500
- To: Jeen Broekstra <jeen.broekstra@aduna.biz>
- Cc: RDF Data Access Working Group <public-rdf-dawg@w3.org>
On Dec 14, 2005, at 4:55 AM, Jeen Broekstra wrote:

> [[4.7 Bandwidth-efficient Protocol
>
> The access protocol design shall address bandwidth utilization
> issues; that is, it shall allow for at least one result format that
> does not make excessive use of network bandwidth for a given
> collection of results.
>
> Status: Accepted.]]
>
> Whether or not the LC design meets this requirement is subjective I
> guess (what is "excessive", exactly?), however it has been shown that
> more bandwidth-efficient variations are not only possible, but
> workable.

One way to gloss "excessive" -- though I'm not claiming this is
necessarily what anyone *intended* -- is this: an excessive use of
bandwidth is one where we choose to use bandwidth in such a way that is
(a) functionally equivalent to some other, (b) more efficient use of
bandwidth. That is, we take "excessive" to obligate us to choose the
most efficient from among otherwise functionally equivalent design
alternatives.

"Functionally equivalent" is a bit loose, granted, but for a data
format we'd say, at the least, that it means "conveys the same
information". I won't add "is a roughly equivalent processing burden"
because, well, we didn't adopt a requirement or objective that the
results format be easy to process. I don't remember anyone in the WG
ever even discussing that. We talked a lot about having an XML format
so that, for example, we could integrate with XQuery, but I don't
recall any worry about ease-of-processing. That has all come, near as I
can tell, after the fact of Ron Alford suggesting the existing format
was a bit bloated.

As for ease of processing, there is a sense in which we are quibbling
over trivialities. All of the formats, as people have demonstrated
repeatedly, are *roughly* equivalent in the processing burden they
impose on a competent programmer -- which is a cost borne by fewer
agents than the cost of an inefficient protocol or data format, which
is borne by *everyone* involved in some sense.
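
To make the "conveys the same information" gloss above a bit more
concrete, here is a minimal Python sketch. Everything in it is made up
for illustration: the sample rows, the element names, and the helper
functions are assumptions, and neither XML encoding is the last call
design or option c. It only shows two encodings (plus a JSON one) that
decode to the same bindings while differing in size, which is the
situation where "excessive" would oblige us to pick the smaller one.

```python
import gzip
import json
import xml.etree.ElementTree as ET

# Hypothetical result set: two solutions binding ?name and ?mbox.
ROWS = [
    {"name": "Alice", "mbox": "mailto:alice@example.org"},
    {"name": "Bob", "mbox": "mailto:bob@example.org"},
]


def to_verbose_xml(rows):
    """One <binding> element per variable, value in a child element."""
    root = ET.Element("results")
    for row in rows:
        result = ET.SubElement(root, "result")
        for var, value in row.items():
            binding = ET.SubElement(result, "binding", name=var)
            ET.SubElement(binding, "value").text = value
    return ET.tostring(root, encoding="utf-8")


def from_verbose_xml(doc):
    """Recover the list of variable -> value mappings."""
    root = ET.fromstring(doc)
    return [
        {b.get("name"): b.find("value").text for b in result.findall("binding")}
        for result in root.findall("result")
    ]


def to_compact_xml(rows):
    """Same information, with each binding folded into an attribute."""
    root = ET.Element("results")
    for row in rows:
        ET.SubElement(root, "r", row)
    return ET.tostring(root, encoding="utf-8")


def from_compact_xml(doc):
    """Recover the list of variable -> value mappings."""
    return [dict(r.attrib) for r in ET.fromstring(doc).findall("r")]


def to_json(rows):
    """A plain JSON rendering of the same rows, for comparison."""
    return json.dumps(rows, separators=(",", ":")).encode("utf-8")


if __name__ == "__main__":
    encodings = [
        ("verbose XML", to_verbose_xml(ROWS)),
        ("compact XML", to_compact_xml(ROWS)),
        ("JSON", to_json(ROWS)),
    ]
    for label, doc in encodings:
        # Report raw size and gzip-compressed size of each encoding.
        print(f"{label}: {len(doc)} bytes raw, {len(gzip.compress(doc))} gzipped")
```

This toy treats every value as a plain string, so it says nothing about
how URIs, literals, or blank nodes would be told apart, and on a result
set this small the gzip numbers are not representative.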

If or when programmers want *actually easy to process results*, they
tend not to use XML at all. At least, one can make a very strong
argument in that direction. So, for example, I've been working on a
document with some WG members for a JSON(.org) serialization of the
query results format, since that's *trivially easy* for a programmer to
process and integrates nicely with things like "Web 2.0", "AJAX", and
"lightweight REST web services". (Where "integrates nicely with" means
"is what people expect who build".)

Thus, I have to conclude that I shouldn't give much, if any, weight in
my own deliberations to the ease of processing argument. I think it
focuses too much on very tiny perceived gains, for a relatively small
number of people, at the expense of a cost that is imposed pretty much
across the board.

> For these reasons, I feel that currently we can not really claim that
> XSLT processing is made so much harder by going for this design, and
> therefore I think option c is the right way to go.

In my deliberations about all of this I simply *granted* that option c
did make processing more difficult to some degree. Even with that
increased difficulty, I don't find that consideration very weighty on
more general grounds. I'm glad to hear that there is some evidence to
suggest that it's not really any more difficult at all. That
strengthens the case for option c significantly.

> ==============
>
> Quite separately from this, there is the issue of having an *XML*-based
> result format in the first place. It has been shown that for purposes
> of bandwidth efficiency, the choice of XML is a limiting factor, and
> a dedicated (binary) format is much better.

My primary concern is that we don't *assume* that this is an either-or
situation. I think there are good reasons to prefer several query
result formats, even if only one is an actual Recommendation
eventually. I can see utility in having more than one result format
documented in various WG Notes.
I intend to submit my JSON work as such once it's finished. A binary,
non-XML textual, and (standards-blessed) XML results format seems a
helpful mix covering a diversity of use cases.

> As one can see, the performance gain on practically all fronts by
> using a binary format completely dwarfs any performance gain by
> optimizing the XML format.

Well, it's faster but I don't know about "completely dwarfs"! :> The
one advantage of compressed XML over binary, vis-a-vis the last call
design, is that it's *utterly trivial* to specify, requiring some
tweaked language in the existing last call design doc.

But, that having been said, the only way I would oppose a binary
results format is if it were to replace an XML format -- which I don't
believe anyone would suggest -- or if I had to do the work to specify
it! :>

> So a separate question is whether or not the WG wants to
> sanction (informally?) a specification of such a binary format (I know
> that Andy and I are at least interested in submitting such a format to
> W3C).

As I said above, I think this is an interesting niche for us to think
about post-SPARQL 1.0, after a recharter. But it may make sense to do
it before that. In either case, I think this is a good use of WG
resources.

> If we do decide to do this, it takes away part of the reason for
> changing the XML format, but of course still both options will be
> open.

I think they are completely orthogonal, especially given my preferred
way of glossing "excessive" above. I don't think a binary format and an
XML format (any XML format) are functionally equivalent. They scratch
different itches and serve different use cases.

> I do, however, believe that if we decide to stick with the LC design,
> we will _have_ to sanction this additional binary format, because
> otherwise we have not sufficiently provided for requirement 4.7.

Which is an excellent reason, all other things being equal, for
supporting option c.

Cheers,
Kendall Clark
Received on Wednesday, 14 December 2005 16:30:32 UTC