- From: C. M. Sperberg-McQueen <cmsmcq@blackmesatech.com>
- Date: Wed, 04 May 2022 20:54:45 +0200
- To: Steven Pemberton <steven.pemberton@cwi.nl>
- Cc: public-ixml@w3.org
Steven Pemberton writes:

> ixml is about taking implicitly structured (textual) data, recognising
> that implicit structure, and making it explicit in some way or another
> on output.
>
> XML is one of the targets for that explicit output, and currently the
> best for representing the abstractions. It need not be the only one,
> ...

I'm not quite sure what to do with an argument of this kind.

My understanding of hermeneutics (which can perhaps be thought of as the study of what is involved in sentences of the form "X is about Y"), such as it is, inclines me to believe that time will tell what ixml is about, and it won't start telling until ixml is actually finished. That does not, of course, prevent us from attempting to interpret ixml now, in its unfinished state. But it does suggest that some caution is advisable.

My understanding of collaborative work makes me think that when X is the prospective work product of a collective, "what X is about" is determined by the group, and not by any one member of the group. Again, that does not prevent anyone in the group from trying to explain to others what they think are the particular core ideas that make the project worthwhile, in the hopes of persuading others to set an equally high valuation on those ideas. But again, some caution may be advisable.

My understanding of collaborative work also tells me that I should not resent it when other members of the group appear to discount the collaborative nature of the effort and tell me what I should think instead of attempting to persuade me of a position -- but being human I fail often to do what I should or refrain from what I should not do, and this may be a case in which I fall short of perfection.

I will only observe that it does not appear that ixml has always been understood thus -- or, at least, that it has not always been described thus.
The paper Pemberton 2013, which I believe first introduced the idea of ixml to the world, describes ixml this way:

    Is it not possible to combine the best of both worlds, and have authorable formats, that can still use the XML tool chain? Couldn't XML become the underlying format for everything?

    The Approach

    The approach presented here is to add one more step to the XML processing chain, an initial one. This step takes any textual document, and a (reference to) a suitable syntax description, parses the document using the syntax description, and produces as output a parse tree that can be treated as an XML document with no further parsing necessary (or alternatively, the document can be serialised out to XML).

I see nothing here about making the structure visible "some way or another", or about XML being "one of" multiple output formats. Invisible-XML processors are described as a first step in an XML tool chain, not as a broader more general replacement for yacc and lex and similar tools.

> ... even though I recognise that some of you are involved only because
> of the XML aspect.

I have spent many happy parts of my career writing grammars, working with parsers, and learning about parsing. I might well be interested in an effort to make general parsing more easily accessible. But you are probably right to suspect that as a historical fact ixml attracted my interest because it was described as a step in an XML processing flow.

> input -> ixml -> output
> The real ixml is that middle bit.
> However, ixml is not XML, nor, contrary to what you may think, does it
> contain any XML-specific items:

I'm not quite sure what any of these sentences mean.

That "ixml is not XML" appears to be self-evidently true, and not worth saying if it bears that self-evident meaning. ixml is a notation for context-free grammars and a set of rules for using grammars in that notation to parse data streams into XML.
XML is a notation for documents which ensures that documents have a particular set of properties, can trivially be parsed, and can usefully be processed in various ways. I infer that the intended force of the utterance is to mean something different from the surface meaning of the words "ixml is not XML". But I do not know what that something might be.

Nor do I understand what an "XML-specific" item might be. Perhaps you are saying that nothing in ixml is designed to work well with XML. I refer you to Pemberton 2013, which seems to me to say something rather different.

> ^ represents structured data, and was initially chosen because it
> looks like a tree, and has the added benefit of looking like an XML
> bracket on its side.

So ... not 'insertion', then. That is a useful clarification, and makes complete nonsense of the arguments recently given for using ^ for injection of literals into the output.

> @ represents data that is made unstructured on output (you could say
> it is destructured). I had several candidates for the mark, such as
> =, which looks flattened and un-tree-like, but in the end I chose @
> as the symbol used in XML for flat data.

> Namespaces are not a concept anywhere within ixml, nor do they map to
> any concept within ixml. It is purely a feature of XML, and one that
> was not even originally in the design of ixml ("not for generating any
> particular version of XML"). Adding explicit notation for namespaces
> somewhat fouls the ixml nest, making it specifically about a
> particular output format.

I think you made it about a particular output format when you described IXML processing as producing XML documents.

The original description of the project could easily have been about replacing yacc and lex with something more general.
It could easily have been about a general-purpose format conversion tool, although in that case the proposal would have been exposed to awkward questions about round-tripping, and it would be easy to predict that the project would meet roughly the same fate as all the absolutely general-purpose anything-to-anything format conversion projects I have encountered over the last forty years. (Hint: I cannot remember most of their names.)

Defining ixml as a method of using context-free grammars to parse input into XML made it concrete enough to be interesting and small enough to be tractable.

If ixml is intended to produce XML output, I do not see why we should spend so much time and space in the ixml spec (and CG) re-litigating so many details of the XML specification, from name structure (nonterminal names could just be defined as NCNames, but we seem to be attached to the idea that it's better to design yet another identifier syntax, because Lord knows the world doesn't have enough of them yet), to the existence or non-existence of namespaces, to the existence or non-existence of processing instructions.

> I proposed a way of doing namespaces using the existing mechanisms:
>
> data: @xmlns:iso, iso:date+.
> @xmlns:iso: ^"http://example.com/ns/date".
> iso:date: ...etc...
>

Thank you for the reminder -- another topic to relitigate: structure of qualified names. XML has settled on allowing at most one colon. Making non-terminal names follow that rule would be trivially easy, but why should we do that? Let's allow multiple colons! Why not redesign XML names yet again? After all, none of us has anything better to do with our time.

Oh, wait. I do have better things to do. I should go do them.

--
C. M. Sperberg-McQueen
Black Mesa Technologies LLC
http://blackmesatech.com
Received on Wednesday, 4 May 2022 18:55:08 UTC