Re: HTML and XML

Sucked back in to clarify. Other replies to Elliotte have been off list.

On 17 Feb 2009, at 14:41, Elliotte Harold wrote:

> By the way, on a side note I just checked dblp.xml; all 500MB of it;

Sorry, this was historical, i.e., 3-5 years back. I would have been
surprised if it were still malformed.

[snip]
> I don't think you actually said what department your class was in.  
> By any chance were your Ph.D. students that had all the problems  
> philosophy  majors instead of C.S? If so, that would likely explain  
> why we see such radically different issues with the teaching and  
> usability of this material.

If you go back to my original note, you'll see I was at the University
of Maryland, and they were all CS students. Some were PhD, some were MSc.

BTW, I don't mean to say that these students were fundamentally  
incapable, incompetent or what have you. I was, at the time,  
surprised, but lots of people have trouble with lots of things for  
lots of reasons. Perhaps I was the incompetent one. Or perhaps I
didn't realize how much training is necessary to get people on board
with well-formedness (I recall spending a lot of time on training, but
perhaps my memory is wrong).

My point, which I think you agree with, is that well-formed XML is
non-trivial to produce and (faithfully) transmit in some fairly
general cases. (E.g., it requires fairly specific, dedicated
training, including on tooling.) It was not clear to me that this
thread was willing to acknowledge this fact. Similarly, writing
programs that output well-formed XML is non-trivial under a wide
variety of not uncommon circumstances. (E.g., blog feeds.)
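
To make the blog feed case concrete, here is a minimal sketch of how
it can go wrong. It is only an illustration: the function names, the
<entry>/<title> element names and the sample title are made up for
the example and are not taken from any real feed generator. It
contrasts gluing strings together with letting a serializer do the
escaping, and uses Python's standard xml.etree.ElementTree parser as
the well-formedness check.

```python
import xml.etree.ElementTree as ET


def naive_entry(title):
    # Hypothetical feed code: plain string concatenation, no escaping.
    # This breaks as soon as the title contains a bare & or <.
    return "<entry><title>" + title + "</title></entry>"


def serialized_entry(title):
    # Build a tree and let the serializer escape the text content.
    entry = ET.Element("entry")
    ET.SubElement(entry, "title").text = title
    return ET.tostring(entry, encoding="unicode")


def is_well_formed(xml_text):
    # Treat "the parser accepts it" as the well-formedness test.
    try:
        ET.fromstring(xml_text)
        return True
    except ET.ParseError:
        return False


# A title of the kind users type into blog software all the time.
title = "Tips & tricks for <b>bold</b> posts"

print(is_well_formed(naive_entry(title)))
print(is_well_formed(serialized_entry(title)))
```

The concatenating version is fine for plain titles, which is part of
why this sort of bug tends to survive testing and only shows up once
real user content arrives.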

If we agree on this fact, then what remains is what conclusions to  
draw from it and what actions are reasonable. One conclusion one  
might draw is that the costs are worth the benefits. Another might be  
that there are some things one might do to mitigate the costs.  
Another would be that the costs are essential to the benefits. (Which  
I think is your view.)

Similarly, one might look at well-formedness and the available
infrastructure and conclude that well-formedness errors are a matter
of shame and contempt-worthiness. Or conclude that the infrastructure
needs to be improved. Or conclude that the language needs improvement.
If one draws the last conclusion, then a key question is whether  
that's possible, all things considered, or worth the effort. It's  
entirely possible that we are stuck where we are. (Consider XML 1.1  
and 5th edition. Or consider the persistence of DOCTYPEs.)

I'm relatively agnostic about all this, at the moment (well, I reject
the stigma one, utterly). But I don't feel that these are questions
where the evidence for the conclusion or the possible future action  
is overwhelming.

Obviously, you feel differently.

Cheers,
Bijan.

Received on Tuesday, 17 February 2009 15:33:17 UTC