- From: Elliotte Harold <elharo@metalab.unc.edu>
- Date: Mon, 16 Feb 2009 15:11:06 -0800
- To: Bijan Parsia <bparsia@cs.man.ac.uk>
- Cc: www-tag@w3.org
Bijan Parsia wrote:

> I'll also point out that programming in these languages is a specialist
> activity with high rewards. Even then, it would be interesting to see
> how broken failed projects are and how much time goes into syntax
> management:

Not much, in the case of experienced programmers. I do think properly designed syntax is important. I do not think that makes non-draconian error handling a good idea. Nor do I find XML syntax to be particularly onerous compared to, for example, Haskell's, Ruby's, or C++'s. The basic syntax of angle bracketed tags, quoted attribute values, and start-tag-end-tag matches seems sound. I don't see anything to be gained by changing it at this point.

> The cognitive overhead of well formedness can be negligible or severe
> when creating XML documents. Well formedness, of course, is rarely the
> *point* or the *interesting* set of constraints. It seems quite possible
> that it's more difficult than it needs to be.

I don't think so. We've tried some alternatives in the SGML space--omitted end-tags, for example. They didn't carry their weight.

>> Consequently syntax errors rarely make it into production (except
>> among college students of questionable honesty).
>
> Blah. I'm not sure what the point of this comment was. However, in the
> context, it's not very nice.

Somebody--I forget who--brought up the issue of college students who turned in syntactically incorrect programs that somehow magically compiled into .class files. As a former professor myself, I'm a little surprised anyone could be naive about exactly how this situation arises.

> From experience? I would love to see the data. I know Interlisp's DWIM
> facility didn't "take off", but there could be many reasons. All I could
> easily find on this was:
> http://catless.ncl.ac.uk/Risks/7.13.html#subj3

AppleScript is the case I'm most familiar with, but there have been others. DWIM I'm not familiar with.
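The draconian model defended above is what every conforming XML parser implements. As a minimal sketch (my own illustration, not from the thread), using Python's standard-library ElementTree: a mismatched end-tag is a fatal error reported at parse time, never silently repaired.

```python
# Draconian, fail-fast parsing: the first well-formedness error
# aborts processing instead of being guessed around.
import xml.etree.ElementTree as ET

good = "<order><item>widget</item></order>"
bad = "<order><item>widget</order>"  # mismatched end-tag

root = ET.fromstring(good)           # parses cleanly
print(root.tag)

try:
    ET.fromstring(bad)
except ET.ParseError as e:
    print("rejected:", e)            # error reported immediately
```

The same behavior appears in any parser conforming to XML 1.0, which classifies well-formedness violations as fatal errors.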
But there's good reason the programming languages landscape is what it is. Functional vs. imperative we argue about. Compiled vs. interpreted we argue about. Garbage collected vs. manually managed we argue about. Precise vs. imprecise language we don't argue about any more. The answer's just too obvious. The only way you'd convince anyone otherwise would be by producing an imprecise, error-corrected language that worked. Short of that, don't expect anyone to waste much time on this any more.

>> Fixing syntax errors at the compiler level leads to far more serious,
>> far more costly, and far harder to debug semantic errors down the line.
>
> Really? I just don't know. Some interpreted language environments miss
> lots of syntax errors until you hit that line of code during a run.

True. The earlier errors are exposed, the better. Not all programming languages are equal in this regard.

>> Draconian error handling leads to fewer mistakes where the person
>> sitting at the keyboard meant one thing but typed another.
>
> I've no idea, really.
>
>> Syntax errors are one of the prices developers have to pay in order to
>> produce reliable, maintainable software. Languages have been developed
>> that attempt, to greater or lesser degrees, to avoid the possibility of
>> syntax error. They have uniformly failed.
>
> Of course, we're not talking about avoiding the possibility of syntax
> error, but of how to cope with error.

One way and one way only: fail fast, early, and hard. Reveal the error at the first opportunity. Do not allow the error to build up steam and cause bigger problems.

> One key difference between programs and data is that I often need to
> manipulate the data even if it has syntax errors. I usually end up doing
> that with text tools. How is that better than dealing with a structure
> that might be extracted? That's what ends up happening *anyway* a good
> deal of the time as I patch the errors so I can just *see* and *query*
> the thing.
Because human intelligence is far better at these problems than computers are. I don't know whether software might some day be good enough to handle this. It certainly isn't now.

> Isn't the question not which is easier to program against. I totally
> prefer well formed XML etc. etc. I thought the issue was how best to
> cope with problem data and the prevalence of that problem data. The
> claim has been advanced that people (some people) can always, more or
> less, with relative ease, produce well formed XML and transport it in
> various ways to consumers over the Web.
>
> This just doesn't seem to be true.

I disagree. A poor environment that has been excessively forgiving of bad syntax has provided no incentive for document producers to create well-formed documents. Change the environment, and the producers will change their output. Yes, this is something of a chicken-and-egg problem. I do not know if it will be resolved. I am certain it could be resolved given the will in any one of the right organizations.

> But given the reality of invalid HTML5 and non-well-formed XML...how do
> we minimize the cost of the errors? How do we distribute the costs where
> they can be effectively borne?

That's another permathread. See the archives of the whatwg group.

> I've not seen that quick learning, even within computer science, as my
> first message showed.

I have. I've taught XML multiple times at a perhaps somewhat less-elite university than you mentioned (or perhaps not--I'm not intimately familiar with yours) and well-formedness was simply not an issue or a concern. It was a significantly lower hurdle to jump over than syntax issues when I taught Java to the same students (and these were students who already knew C++). There's no reason any computer science student at any level can't handle this.

> Also, if we expect XML to be used by broader populations in wider
> contexts, then this seems unrealistic.
I'm not sure we do expect that, any more than we expect typical computer users to write VBA macros from scratch. However, those power users who do write VBA macros can certainly handle XML syntax.

> I don't know why y'all ignore my DBLP example. It was real. I never
> ended up using the data, alas. I don't recall if I reported it, but,
> frankly, it was clearly a significant challenge to fix. Perhaps that's
> just one price I must pay for people to have the colossal benefits.

I ignored it because I'm not familiar with it. There's a lot of XML data in the world.

> I'm not clear why one category of errors (well formedness ones) is so
> much worse than another (e.g., validity ones). They are all errors.
> One nice thing about XML is separating these classes of errors so that
> even if the document is not valid wrt the relevant schema, you can still
> work with it (transform it, etc.). What's so much worse about well
> formedness errors?

Another permathread: syntax is interoperable. Semantics are not. You and I can share syntax, and we can share it with 10,000 other disconnected individuals. We cannot, will not, and should not attempt to share semantics when we are working on different projects with different purposes and understandings.

> In a standards situation there are lots of different possible costs,
> including opportunity costs. Perhaps we'll have to live with XML as it
> is. Perhaps we can do better. But surely it's better to investigate
> carefully, rather than make rather unsupported claims with colossal
> confidence :)

You may be new here. There have been demonstrably better syntaxes over the last 10+ years. They've gone nowhere, and attracted casual interest at best. None of them offered sufficient improvements to justify the cost of switching over. (Maybe that's too harsh: I can think of one that clearly has gone somewhere, though it's certainly not a 1-1 mapping with XML or a full replacement for it.)
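The separation between well-formedness and validity mentioned above is easy to demonstrate. A hedged sketch (Python's ElementTree; the schema rule here is hypothetical, my own example): a document that breaks a schema constraint but is well-formed still parses, so it can still be queried and transformed.

```python
import xml.etree.ElementTree as ET

# Suppose a (hypothetical) schema requires every <item> to carry
# a <price> child. This document omits it: invalid, yet well-formed.
doc = ET.fromstring(
    "<catalog><item sku='a1'><name>widget</name></item></catalog>"
)

# Structural queries work regardless of schema validity, because
# the parser enforces only syntax.
names = [item.findtext("name") for item in doc.iter("item")]
print(names)
```

A well-formedness error, by contrast, would have stopped `fromstring` before any of this was possible.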
> To engage requires, at the very least, either acknowledging some common
> standards of evidence, or proposing some alternative ones, or critiquing
> the ones I've provided. That is, if we are interested in finding stuff
> out.

You seem to be coming at this from an academic approach that this community has not historically bought into. Most folks here, even those of us who come from academia, don't put much stake in academic studies of programming productivity and the like. We put a great deal of stake in what we've learned from our own experience and that of our colleagues, as well as what the market has accepted.

The only way you're going to convince anyone of anything here is by producing something better. If you have a better alternative to XML as it exists today, then build it. If it really is better enough, then--all arguments to the contrary--people will switch to it, and quickly. We've seen this happen before, more than once. However, for every new format and language that succeeded, there have been a hundred failures or more. Some of the failures were interesting and we learned from them. Most weren't even that.

-- 
Elliotte Rusty Harold
elharo@metalab.unc.edu
Refactoring HTML Just Published!
http://www.amazon.com/exec/obidos/ISBN=0321503635/ref=nosim/cafeaulaitA
Received on Monday, 16 February 2009 23:11:43 UTC