- From: Rick Jelliffe <ricko@allette.com.au>
- Date: Sat, 17 May 1997 19:08:23 +1000
- To: <w3c-sgml-wg@w3.org>, "James Clark" <jjc@jclark.com>
> From: James Clark <jjc@jclark.com>
>
> > Proposal: All machine-readable schemata, whatever their other
> > characteristics, are structured data, and so XML itself is a good
> > carrier syntax for schema expression. We should design a general
> > structure for writing schemata in XML.
>
> This was discussed *very* fully when we decided to stick with SGML's syntax
> for DTDs. Calling it a schema rather than a DTD doesn't change the basic
> issue. Reaching agreement on an instance syntax would require a lot of
> time, which I don't think we can afford at the moment, given the enormous
> amount we have still to do.

Two comments:

1) We have very clear goals and a timetable for XML 1.0. When the draft draws responses, like some of these (& I am certainly not saying they are not good ideas), that conflict with our goals and timetable, we should push ahead to finish XML 1.0, though certainly with the addition of namespaces.

(By the way, I am interested to know how, in the Microsoft namespace model, they intend that various content models from various DTDs intertwine and still retain validity. Does everything have an (implicit) declared content type of #ANY in their scheme?)

You cannot have a single markup syntax that is optimal for every purpose. So maybe XML 1.0 (1997) is optimised for Perl hackers and XML 2.0 (1998) will be optimised for database transfers. But people who need XML 2.0 can use XML 1.0 in the interim, with only a slight speed penalty.

In other words, I think XML is simple enough that it can be widely used in industry, but that wide deployment will be an enormous generator of new requirements and goals.
This is perhaps why we:

* need to be disciplined to get at least XML 1.0 finished,
* need to keep SGML compatibility, because otherwise XML will fragment into a million incompatible pieces: SGML provides at least a base by which we can say "this wheel has been already invented" or "if you need this extra feature, bump yourself up to some XML-like form of simple SGML",
* need to make sure that XML doesn't grow into SGML, but keeps its identity as a small language (using the 20 page rule).

2) There is an idea in the background that you cannot use DTDs to specify enough useful information about element types. I hope HTML people realise that in SGML (& XML?) you can add fixed attributes to element types that reference:

* PI entities, that allow you to pass any instructions that will help process the element.
* NOTATIONs, that allow you to specify the format of the contents of the element.

So there is no need to extend or replace DTDs with any new tags to tell you what is in an element, or how it should be processed. The infrastructure is already there. We are just missing agreed-on notations and PIs.

For example:

    <!NOTATION comma-delimited-data PUBLIC
        "IDN//W3.ORG//NOTATION XML comma delimited data//EN">
    <!NOTATION pipe-delimited-data PUBLIC
        "IDN//W3.ORG//NOTATION XML pipe delimited data//EN">
    <!ELEMENT database - - (#PCDATA)>
    <!ATTLIST database
        xml-notation NOTATION (comma-delimited-data | pipe-delimited-data) #IMPLIED>

lets you have documents like:

    <?XML version="1.0" ?>
    <DATABASE xml-notation="comma-delimited-data">
    dog,rover,stinks
    cat,happy,scratches
    </DATABASE>

In other words, you can, with nice simple XML as it now stands, embed your own particular highly efficient data. XML becomes more like a wrapper. You get the best of both worlds, maybe: you get a standard syntax for metadata, and a format as efficient (and proprietary) as you want for the contents.

SGML's main strength (and HTML's) is that it lets you embed other notations: you can use it for what it is good for.
I don't think we should expect XML to be otherwise.

Rick Jelliffe
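
As an illustration of the notation-attribute idea in the message above, here is a minimal sketch of how a receiving application might use the `xml-notation` attribute to decide how to read the element's contents. It is not part of the original post. The names `DELIMITERS` and `parse_database` are hypothetical, the mapping from notation name to delimiter character is an assumption, and the sample document uses a lowercase `database` element with no XML declaration so that a present-day parser (Python's standard `xml.etree.ElementTree`) accepts it.

```python
import xml.etree.ElementTree as ET

# Assumed mapping from the notation names in the example DTD to a field
# delimiter. The post only names the notations; the delimiters are a guess
# based on those names.
DELIMITERS = {
    "comma-delimited-data": ",",
    "pipe-delimited-data": "|",
}


def parse_database(xml_text):
    """Parse a <database> element, splitting its text per its declared notation."""
    root = ET.fromstring(xml_text)
    notation = root.get("xml-notation")
    if notation not in DELIMITERS:
        raise ValueError(f"unsupported or missing notation: {notation!r}")
    delimiter = DELIMITERS[notation]

    rows = []
    for line in (root.text or "").splitlines():
        line = line.strip()
        if line:  # skip the blank lines around the embedded data
            rows.append(line.split(delimiter))
    return rows


# Hypothetical sample documents modelled on the one in the post.
COMMA_DOC = """<database xml-notation="comma-delimited-data">
dog,rover,stinks
cat,happy,scratches
</database>"""

PIPE_DOC = """<database xml-notation="pipe-delimited-data">
dog|rover|stinks
</database>"""

comma_rows = parse_database(COMMA_DOC)
pipe_rows = parse_database(PIPE_DOC)
```

The XML layer here only carries the metadata (which notation is in use). The embedded data stays in its own compact format and is handed to whatever routine understands that notation, which is the "wrapper" role the message describes.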
Received on Saturday, 17 May 1997 05:07:59 UTC