Re: Draft minutes of 15 March 2005 Telcon

Dear TAG,

allow me to offer some clarifications on this topic.

noah_mendelsohn@us.ibm.com wrote:
> VQ: Characterization WG chair reports their first two docs are quite 
> stable.

Indeed, of our four deliverables, three are published and the first two 
below will only see minor spelling corrections before the end of our 
charter.

   XML Binary Characterization Use Cases
     http://www.w3.org/TR/xbc-use-cases/

   XML Binary Characterization Properties
     http://www.w3.org/TR/xbc-properties/

   XML Binary Characterization Measurement Methodologies
     http://www.w3.org/TR/xbc-measurement/

The last of these three and our fourth document (XML Binary 
Characterization) are likely to be published next week. You can access 
editor copies from our member-only group page, but I wouldn't rely on 
those for reviewing purposes.

Please note that our intention is that one would first read the fourth 
document and then read the others, in any order, for the necessary 
supplemental information. However, it made more sense to write it last, 
so for now you will be reading the other documents without the context 
that we hope it will provide.

> NM: I had hoped that if they were going to propose actual work on a 
> binary standard (which they are about to propose), that it would be 
> backed not with general information about performance, but with concrete 
> measurements showing that text is too slow.

Showing that something is too slow is trivial: you put a human being in 
front of it and ask him what he thinks about it. If he says something 
along the lines of "it's too slow", you have your proof. I jest here, but 
in some ways it's not so far from the truth.

Speed is only one of the issues (and it is often expressed in terms of 
battery life rather than raw time); the other big one is compactness.

The fourth document

> TB: I share Noah's concern that there are too many use cases, and not 
> enough focus on truly important ones.
> ... If anything is done, it needs to meet truly common needs.

That is not how we elected to proceed. We accepted almost all use cases 
that were brought to the WG, with the goal of covering a large amount of 
ground where XML is being used and some level of performance issue has 
been identified (sometimes clearly due to XML, sometimes less clearly so, 
but if there were a way to make things faster, smaller, etc. it'd likely 
be used).

We tried to rate use cases so that the more important ones would stand 
out, but of course they all came out as "Very Important" since each of 
them describes what a given industry is doing, and all those industries 
were represented on the WG. It turned out that it wouldn't have been a 
useful thing to do anyway, based on what follows.

Then we looked at each use case and asked not "what features/properties 
would you like?" as that would have produced a very long laundry list, 
but "what properties, if absent, would prevent you from adopting a 
binary XML format?", the goal being to avoid creeping featuritis. That 
is where the constraint was made strong, that's where we placed the 
focus, and the list produced from that is a lot shorter.

> DO: See the BEA paper to the binary workshop. I was looking for more 
> justification that there truly are common mechanisms that would meet a 
> core set of needs.

The feasibility of a format that corresponds to the common requirements 
is covered in the fourth document. I wouldn't expect to find it in a 
document describing use cases, but if you would, we are open to comments.

> VQ: More comments before reviewing?
> ... We'll revisit after the first round of reviews are in from Ed and 
> Norm, but we need volunteers to review the 2nd doc. Anyone?
> 
> (silence)

You're missing out, it's an interesting document :) Besides, all the 
others (including Use Cases) make direct references to it so a reading 
of them without this one is likely moot.

It wasn't easy to write and there were many times when it made us wish 
you guys had written an "Architecture of the XML Family" document as it 
would have helped us a lot.

> HT: I'm tempted to say: "I won't support a charter that doesn't say that 
> a CR exit criteria is to provide X-times benefit in time or speed"

That wouldn't be a good exit criterion (it's too restrictive compared to 
the problem) but it has been my intention to include the notion of 
concrete measurements in the draft charter that I've promised W3M. The 
format that a possible follow-up WG would come up with would have to be 
measured against our list of core requirements and perform well on all.

> ED: They need some concrete criteria

We have them! You've apparently only looked at the use cases document; 
where do all those judgements come from?

> HT: Right, though I know I am being a bit too aggressive.

I would say not aggressive enough. Exit criteria for the follow-up WG, 
if there is one, need to ensure that whatever technology has been 
produced measures up really well on our list of MUSTs. If it doesn't, 
it'll be work for nothing.

> NM: Keep in mind that some of the popular parsers for text-based XML 
> don't come close to the performance that's possible with careful tuning. 
> The justification, if any, has to be against the best of what standard 
> text XML can do.

Multiple participants in the WG have worked on tuning XML parsers and 
not gotten them fast enough. That doesn't prove you can't do better, but 
at some point you have to make do with what you have. One could make 
hundreds of millions of euros selling an XML parser that is fast enough 
(that is, energy-saving enough) for the mobile industry, yet no one has. 
The theoretical best cycles per bit of information one can achieve with 
XML is hard to measure, as one first has to agree on what the payload of 
an XML document really is. Taking an empirical approach, I notice that 
not one single 
person I know (at least not one I'd admit to knowing) would want to use 
binary when XML cuts it, that there's a fortune to be made selling XML 
parsers that are fast enough, and yet that no such thing has happened. 
So either everyone is stupid and therefore deserves the binary 
pestilence, or it's not possible and we have to deal with it.

> DO: I thought that one of the interesting presentations at the workshop 
> from Sun analyzed not just message size (and thus network overhead) but 
> also what was happening in the processor.
> ... A lot of time was spent in the binding frameworks.
> ... Even if you came along and doubled the network performance by 
> halving the size, you might get only 1/3 of improvement

Yes, if you're doing a lot of other things that aren't XML, then 
speeding up XML won't help. But when you're rendering an SVG document 
and the vast majority of your time is spent waiting for the network and 
parsing the XML, then you know there's going to be a speedup.
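
To put rough numbers on that distinction, here is a back-of-the-envelope 
sketch using Amdahl's law. It is not a measurement: the time shares below 
are invented purely for illustration, and overall_speedup is just a 
hypothetical name for the usual formula.

```python
def overall_speedup(fraction: float, factor: float) -> float:
    """Amdahl's law: whole-task speedup when `fraction` of the total
    time is made `factor` times faster and the rest is unchanged."""
    if not 0.0 <= fraction <= 1.0:
        raise ValueError("fraction must be between 0 and 1")
    if factor <= 0.0:
        raise ValueError("factor must be positive")
    return 1.0 / ((1.0 - fraction) + fraction / factor)


# Hypothetical time shares, chosen only to illustrate the two situations.
# Server dominated by binding frameworks: XML handling is 25% of the time.
binding_heavy = overall_speedup(fraction=0.25, factor=2.0)

# Client rendering SVG: network transfer plus XML parsing is 90% of the time.
svg_client = overall_speedup(fraction=0.90, factor=2.0)

print(f"binding-heavy server: {binding_heavy:.2f}x")
print(f"SVG client:           {svg_client:.2f}x")
```

With those assumed shares, doubling the speed of the XML part barely 
moves the first case and nearly doubles the second, which is the whole 
point: what matters is how much of your time the XML actually accounts 
for.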

-- 
Robin Berjon
   Research Scientist
   Expway, http://expway.com/

Received on Wednesday, 16 March 2005 18:54:35 UTC