- From: C. M. Sperberg-McQueen <cmsmcq@blackmesatech.com>
- Date: Tue, 15 Feb 2022 22:01:06 -0700
- To: public-ixml@w3.org
In fulfillment of the action I took today, I have created an ixml/samples directory (the action said ixml/grammars but I don't think anyone cared much about the name, and 'samples' seemed better when I thought about it).

I have seeded the directory with a README.md file, a grammar for the IETF Augmented BNF notation and a grammar for ISBN-13 book numbers. (Strictly speaking, the language defined by ISBNs is regular and could be recognized by a regular expression, but writing the regular expression to require the correct check digit is a bit of a challenge. For that matter, I found writing an ixml grammar to require the correct check digit also more challenging than I had expected. If time allows, I expect to add ISBN-10 and ISSN to the grammar as well.)

At the moment, neither grammar has been tested.

I think it would be useful to have ixml grammars that work for as many well known notations or syntaxes as we can manage. Programming languages, Relax NG compact syntax, XPath, some flavor or other of SQL, CSS, CSS Selectors, URIs, all come to mind.

What do people think about:

  - Mail headers (RFC 822 and successors)
  - IETF dates, ISO dates, ...
  - The lexical spaces of the built-in datatypes of XSD
  - XPath and XSD regular expressions; other regex notations
  - XSLT match patterns (as distinct from XPath in general)
  - The subset of XPath which XSD processors are required to support for uniqueness constraints and assertions
  - REx grammar notation
  - Turtle, N3, other notations used in Semantic Web work
  - A grammar that can read XML and produce a kind of rudimentary representation of the XML (not, as things currently stand, the XML an XML parser would produce)
  - A rational form of CSV (if such a thing exists)
  - Some flavor of Markdown (there are so many to choose from) or one of its competitors. Given our use of GitHub, perhaps GitHub-flavored Markdown would be helpful.
Are those worth trying to find grammars for and/or create ixml grammars for?

Unfortunately, all of those seem rather computer-oriented; I am having trouble thinking of things for which there is something like an authoritative syntax that are not computer-oriented. The best I have managed so far are:

  - The syntax(es) used for formal logic by various theorem provers (no two seem to use the same syntax, so there is a wide range of choice)
  - The syntax used in Principia Mathematica (if someone can figure out how to express the rules for dots in a context-free grammar)
  - If anyone understands how legal citations are structured in the U.S. or in any other jurisdiction, that would be interesting, especially if accompanied by an explanation for those of us who don't.
  - Are there grammar rules for things like chemical formulas? For standard names of molecules?
  - The notation used for describing syntax trees in the SUSANNE corpus of English. (If there are other documented syntax notations, I'd be happy to work on them. My recollection is that the Hamburg Dependency Corpus has a non-XML notation. And my recollection is that the syntax notations in the SUSANNE corpus don't always parse according to the rules given in the documentation.)

The more examples we can think of, the better. I think people should get double points for suggestions in non-computer domains, and double points again if there is something like an authoritative definition of the language in question.

Michael

-- 
C. M. Sperberg-McQueen
Black Mesa Technologies LLC
http://blackmesatech.com
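A note on the ISBN-13 check digit mentioned above: the rule itself is simple arithmetic, which is part of why it is awkward to express in a regular expression or a grammar. The following is a minimal Python sketch of the standard check-digit calculation, not the ixml grammar in the samples directory; the function names `isbn13_check_digit` and `is_valid_isbn13` are hypothetical and chosen only for illustration.

```python
def isbn13_check_digit(first12: str) -> int:
    """Return the check digit for the first 12 digits of an ISBN-13."""
    if len(first12) != 12 or not (first12.isascii() and first12.isdigit()):
        raise ValueError("expected exactly 12 ASCII digits")
    # Weights alternate 1, 3, 1, 3, ... starting from the leftmost digit.
    total = sum(int(d) * (3 if i % 2 else 1) for i, d in enumerate(first12))
    # The check digit brings the weighted sum up to a multiple of 10.
    return (10 - total % 10) % 10


def is_valid_isbn13(text: str) -> bool:
    """Check an ISBN-13 written with or without hyphens or spaces."""
    digits = text.replace("-", "").replace(" ", "")
    if len(digits) != 13 or not (digits.isascii() and digits.isdigit()):
        return False
    return isbn13_check_digit(digits[:12]) == int(digits[12])


if __name__ == "__main__":
    print(is_valid_isbn13("978-0-306-40615-7"))
```

A grammar or regular expression cannot compute the sum directly; one way to get the same effect is to carry the running weighted sum modulo 10 in the names of the nonterminals (or the states of the automaton), which multiplies the number of rules considerably.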
Received on Wednesday, 16 February 2022 05:01:38 UTC