- From: Paul Prescod <papresco@calum.csclub.uwaterloo.ca>
- Date: Mon, 19 May 1997 14:46:48 -0400
- To: w3c-sgml-wg@w3.org
Steven J. DeRose wrote:
>
> But every step that helps for RDB-ish data hurts for document data (by
> complicating the parser, compromising error-detection possibilities,
> complicating the DTD and perhaps making it required, reducing redundancy,
> etc). The short-end-tag proposal is just one point along the continuum.

Thanks, Steven, for that well-reasoned post on documents and RDB data. Here's my take:

When we started talking about DTD-less documents, I had all kinds of interesting ideas about database records in XML, .ini/.rc files, catalog files, etc. The furor over XML-style DTDs and catalogs shows that others have these same ideas. But the more I think about it, the less I care about those other applications. How often do you really want to process a relational database or .ini file in an SGML editor? How often do you want to look at a relational database using the Grove model?

Why did I care back in the heady days of October? I think I had the kind of monopolistic ideas that Lisp programmers from the sixties had: code is data, data is code. If everything shares the same syntax, everything can be manipulated uniformly. It turns out not to be so interesting. Hardly any Lisp programmers care anymore about the fact that Lisp uses a data-like syntax. Hardly anyone builds code at runtime as parenthesized strings. I mean, there are major benefits to the fact that Lisp has a simple syntax, but not that it has a *uniform syntax that is the same as its data*.

That's just an analogy, but I think an important one. Simplicity is important but uniformity is not. .INI files and comma-delimited database files are simple: easy to parse and use. Why change them? Five years from now the world will not be a significantly different place whether or not CDF or OCF is XML-based. I would be interested to hear from the MS folks if I am wrong: are there significant technical benefits to uniting these syntaxes, or are we just playing buzzword games?
Now if Microsoft were thinking about redoing RTF or HTML, markup languages, then I would listen very carefully. Markup languages are our target audience. A failure in dealing with RTF will quite possibly indicate a failure in dealing with other markup languages.

The line between a "document" and a "database" is vague, but we can usually make the distinction by asking "what is the right formalism for this data?" If it is a grove, the thing may well be a document. If it is a relational or flat-file database, the thing is probably not.

I think it would be an interesting project to make a standard that unites all file formats under a machine-readable syntax description file. But I don't think that XML and XML DTDs are really the right starting place. Perhaps BNF, or ASN.1, or YACC, or even SGML (which already has features for minimizing/changing syntax).

Paul Prescod
Received on Monday, 19 May 1997 14:50:30 UTC