- From: Peter Murray-Rust <Peter@ursus.demon.co.uk>
- Date: Thu, 24 Apr 1997 11:50:29 GMT
- To: w3c-sgml-wg@w3.org
In this discussion many contributors have seen XML as a natural extension of the way the SGML community works at present, and I appreciate the concerns that they have. It seems clear that a large number of the humans using SGML at present would be unhappy to be required to create WF documents - I accept this as a current fact. I take the optimistic view that XML will become ubiquitous and that most documents will therefore not be created by _humans in the current SGML tradition_.

In developing Chemical Markup Language I see five sources (and sinks) of XML documents. They are not all relevant to other disciplines, and I expect there are further possibilities I have missed. Assuming that CML becomes widely used, I will attempt to put percentages on them (correct to a few orders of magnitude :-):

Instruments (50%): Most data are now collected on instruments with digital outputs. There is no technical reason why these cannot be CMLised. (The current output formats are often awful. If you worry about legacy text data, legacy numeric data (especially binary or FORTRAN) is far worse.)

Calculations (5%): Many scientists don't do lab work; they compute what reality ought to be. This is good for those selling large numbercrunchers. It would be a low-cost option to adapt most of these programs to output CML.

Databases (40%): A large amount of data is transferred to and from databases. (Most of it comes from instruments, of course.)

Scientific publications (journals) (1%): The primary place where results are published, though this could change. Most publishers know that SGML exists and many use it. Normally the markup is very large-grained.

Humans (1%): They normally write the publications, but there are some disciplines (such as crystallography) where a short, fully marked-up journal article is output by a machine.

Apart from the humans, none of the others has any incentive not to produce WF documents - why should they? We agree that bandwidth is not a problem (article of faith). All these outputs are produced by a program of some sort which can count well enough to balance quotes and tags (a minimal sketch follows the signature). I'd hate to think that we were giving a signal to the authors of these programs that they should do anything other than produce WF XML. (>>99% of them wouldn't dream that anything else was potentially acceptable - they are used to having to work to given rules.)

In 1993 I imagine most HTML was hacked by humans. In 1997 I suspect only a very small amount is (apart from mine :-). Within two years at most the same will certainly be true for XML. I think we should try to look at 1999 and ask what people will really want _then_, rather than assuming it's driven by what we can see now.

P.

--
Peter Murray-Rust, domestic net connection
Virtual School of Molecular Sciences
http://www.vsms.nottingham.ac.uk/
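[A minimal sketch of the "balance quotes and tags" point, in Python, assuming a hypothetical instrument record; the element and attribute names are illustrative placeholders, not actual CML vocabulary. The point is only that a program which builds a small tree and hands it to a standard serializer gets balanced tags and quoted attributes - that is, well-formedness - essentially for free.]

    # Sketch: emitting a well-formed XML record from instrument-style data.
    # Element/attribute names are illustrative, not real CML vocabulary.
    import xml.etree.ElementTree as ET

    def reading_to_xml(sample_id, values):
        """Build a small XML document from one instrument reading."""
        root = ET.Element("measurement", {"sample": sample_id})
        for name, value in values.items():
            datum = ET.SubElement(root, "datum", {"name": name})
            datum.text = str(value)
        # The serializer balances every tag and quotes every attribute,
        # so the output is well-formed by construction.
        return ET.tostring(root, encoding="unicode")

    if __name__ == "__main__":
        print(reading_to_xml("X-42", {"temperature": 298.15, "intensity": 0.73}))

[Running it prints a single well-formed element; an instrument driver or database export would do essentially the same with its real data.]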
Received on Thursday, 24 April 1997 07:17:25 UTC