- From: L Peter Deutsch <lpd@major2nd.com>
- Date: Tue, 20 Oct 2015 07:51:47 +0200
- To: public-music-notation-contrib@w3.org
Please excuse this very lengthy posting. I have put a great deal of thought into its content in the years since my first contact with MusicXML in 2006, and this group is finally the forum in which I trust it will be received thoughtfully as well. To give some background for this posting, I've attached a summary of my relevant experience and qualifications at the end.

Agenda
------

What I would like to see this group accomplish is to develop at least one standard for digital score representation that will have the quality, success, and longevity of the specifications / standards for other digital document representations such as PostScript, PDF, HTML, and SVG. (I will use "specification" and "standard" interchangeably, because high quality of either one demands high quality of the other, regardless of whether the standard is de jure, as for SVG and HTML, or de facto, as for PostScript; and of course the best de facto standards become de jure over time.)

In my opinion, a high-quality standard must meet the following criteria, which apply to the other high quality digital format specifications that I've examined:

* It must provide a mechanically executable, unambiguous test for syntactic conformance that consumers can and should implement.

* It must provide a mechanically checkable, unambiguous test for semantic conformance that consumers can and should implement. This includes not only type and range specifications for all individual data items, but clear validity criteria for all relationships between items and structures.

* It must define the *meaning* of every construct that passes the above two tests. If a construct does not have visual or semantic meaning, such as a line with only one end, it must not be allowed.

* It must not be bent to cater to implementation bugs. However, non-conformances in leading implementations should be documented in an annex, to help embarrass implementors into fixing them.
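
To make the second and third criteria concrete, here is a minimal sketch in Python of what a mechanical semantic check might look like for one relationship: a slur must not be left with only one end. It is an illustration only, not part of any MusicXML version. The function name check_slur_pairing is hypothetical, and the rule it enforces (within one part, in file order, every slur "start" is closed by a "stop" carrying the same number) is a simplifying assumption about what a specification would have to state.

```python
import xml.etree.ElementTree as ET


def check_slur_pairing(part_xml: str) -> list[str]:
    """Return a list of problems; an empty list means the check passed.

    Assumption: file order equals time order, so a "stop" may never
    precede its "start". Slurs without a number default to "1".
    """
    problems = []
    open_slurs = set()  # slur numbers started but not yet stopped
    root = ET.fromstring(part_xml)
    for slur in root.iter("slur"):  # iter() visits elements in document order
        number = slur.get("number", "1")
        kind = slur.get("type")
        if kind == "start":
            if number in open_slurs:
                problems.append(f"slur {number} started twice")
            open_slurs.add(number)
        elif kind == "stop":
            if number not in open_slurs:
                problems.append(f"slur {number} stopped without a start")
            open_slurs.discard(number)
        elif kind == "continue":
            if number not in open_slurs:
                problems.append(f"slur {number} continued outside start/stop")
        else:
            problems.append(f"slur {number} has invalid type {kind!r}")
    for number in sorted(open_slurs):
        problems.append(f"slur {number} started but never stopped")
    return problems


# Two tiny hand-written fragments (hypothetical test data).
GOOD = (
    "<part>"
    "<note><notations><slur type='start' number='1'/></notations></note>"
    "<note><notations><slur type='stop' number='1'/></notations></note>"
    "</part>"
)
BAD = (
    "<part>"
    "<note><notations><slur type='start' number='1'/></notations></note>"
    "</part>"
)

print(check_slur_pairing(GOOD))
print(check_slur_pairing(BAD))
```

A consumer could run such a check on import and report any problems to the user rather than guessing at what a one-ended slur means.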
In my opinion, these goals have been achieved well for PostScript, PDF, HTML, and SVG, as well as for many simpler formats such as Standard MIDI files, and also for mup (www.arkkra.com), the score language that I use for my own work as a composer.

If a standard or specification leads to an endless stream of questions from consumer implementors about what some construct means, or whether some construct is valid, that is, in my opinion, a sign of poor quality. Likewise, widespread interoperability difficulties are an indication of quality problems in the specification. Some degree of vigilance, and an independent effort to monitor producer implementation quality, are needed to keep increasingly sloppy producer implementations from gaining ground (since, as for all data formats, consumers should always make reasonable efforts to compensate for sloppy producers), but I believe these costs have a tremendous payoff in long-lived usability of both software and (more importantly) human-produced data.

I cannot emphasize too strongly that achieving these goals is not simply a matter of "writing a specification." Whether they can be achieved at all, or with a reasonable amount of effort, depends greatly on the structure of what is being specified. I believe strongly that the degree to which a design can be specified clearly, succinctly, and fully is an important measure of the quality of the design itself.

MusicXML is an obvious starting point for such a standard, and because it has made such tremendous practical contributions to digital music and has been adopted so widely, I believe it is worth the very considerable effort that will be required to address its problems. The remainder of this posting is intended to lay out what I think are the major mindset and technical issues that need to be addressed in MusicXML, and to suggest how they might be addressed.

Mindset issues
--------------

Several mindset issues stand in the way of evolving MusicXML to a high quality standard.
The first, and most serious, concern is the misapplication of the concept of "selective encoding." Of course MusicXML producers are free to encode only those aspects of the score that they wish, and of course they can only encode those that MusicXML can represent. However, this concept has also been used to legitimize the idea that MusicXML consumers are free to ignore data ad lib even if their own semantic model can express what that data encodes. That is not "selective" anything: it is a bug, or at best, a missing feature. I encountered this personally with respect to Finale and the handling of margins: Finale's own model of margins is rich enough to handle most (perhaps all) of what MusicXML can encode, but it simply ignored margins in imported MusicXML data, and when I reported this, it was shrugged off with an implicit appeal to "selective encoding."

Of equal concern is the special pleading that "music is so complex that a score format cannot be specified completely or rigorously." I think this reflects a deep misunderstanding of the nature of both score formats and specifications. MusicXML is first of all (1) a format for representing printed scores, and (2) fundamentally oriented towards semantics, like HTML, rather than towards page description, like PostScript and PDF (more on this issue below). While no semantic specification should attempt to describe in detail *how an engraver must render* a score, I have seen no evidence that it cannot have clarity and completeness about the *semantic and general visual relationships* of the elements it names. The specificational problems I see in MusicXML, of which the most serious are discussed below, arise not from the nature of score notation but from inadequately considering specifiability and meaningfulness in the design.

A further concern is the tolerance of undocumented and untagged additions.
In an XML-based standard, I can think of four opportunities for producers to pollute the standard by addition (and have probably missed a few more): new contexts for existing element types, new element types, new attributes for existing element types, and new values for attributes or CDATA that have a specified range (including, in particular, those that only take on an enumerated list of possible values). A standard that is intended to be interoperable should, in general, state clearly that none of these are allowable: while consumers may tolerate them (as consumers should tolerate non-conformances in general), they should be flagged as non-conforming and reported to the user with an encouragement to report them to the producer. To what extent a young specification such as a cleaned-up MusicXML should allow for unsanctioned additions in limited, designated contexts is a legitimate subject for discussion, but the starting point should be not to allow them anywhere.

The final concern is the embodiment in the standard of tweaks to compensate for bugs in the two leading implementations. The ones I know about are tolerance of Sibelius's failure to produce both "alter" and "accidental" elements and to terminate beams properly. These are bugs, and they should not distort a standard.

I hope that any W3C work on MusicXML will address these issues thoroughly.

Technical issues
----------------

There are likewise a few technical issues in the MusicXML design so serious that they must be considered for fixing before proceeding.

The most serious issue is the relationship between flowed-location and fixed-location constructs, which manifests the tension between a semantic format like HTML and a concrete format like PDF. The first release of MusicXML was essentially the Humdrum format re-cast into XML syntax, and was completely semantic -- it had no constructs that referred to putting marks at specific positions on a page.
The page-level elements added later, in contrast, are completely concrete: unlike all other score formats known to me, MusicXML has no way to express the basic semantic concept of "this happens on all / odd / even pages." This, in turn, means that in order to have any non-flowed page-level marks at all (such as page numbers), a MusicXML file must indicate all page breaks explicitly -- destroying the ability of a reasonable consumer to reflow pages for smaller screens, for example. Even worse, MusicXML can express fixed placement of individual notes and auxiliary marks. As a result, producers have to choose between what amount to two completely different dialects of MusicXML: one that is HTML-like, in which these elements are placed only relative to other elements or to their "default" engraver-chosen positions, and one that is PDF-like, in which *every* element is given a fixed location and reflowing is impossible. If the two are mixed to the slightest degree, the result will render correctly only with engravers whose placement algorithms are identical to those of the original producer, making a mockery of interoperability.

As a practical instance of this problem, an earlier posting in this discussion observed that users (at least of Finale) inadvertently use fixed positioning for elements that should be coupled to flowed material, leading to interoperability problems. I have seen this myself in Finale files where lyrics have been entered with fixed positions on the page rather than linked to the music. To the extent that Finale makes this easy or natural, I would consider it poor design in Finale's user interface.

Fixing this problem in MusicXML will require a concerted redesign to clearly separate flowed and fixed material. One part of this must be to introduce all / odd / even page formatting, including a way to indicate insertion of page numbers and of sequential page numbering in a variety of forms.
However, fixing the user interface of Finale (I don't know whether Sibelius or other widely used score creators have the same issue) is beyond the scope of this discussion.

Nearly as serious as the mismatch of flowed and fixed elements are the two constructs that create friction between file text order and time sequence: <chord/> and <backup>/<forward>.

<chord/> raises two questions: what elements can intervene between the notes of a chord? What attributes and child elements must be the same between the notes of a chord, and what may be different? The answers create interactions between <chord/> and many other elements of the specification -- a red flag for a design. The obvious fix for <chord/> is to replace this element with a <chord> element at the measure level whose children are the notes of the chord, and to consider carefully which of the current attributes and elements of notes should be associated only with chords, only with notes, or (in as few cases as possible) with either.

<backup> and <forward> are much worse: they interact with *every* element that refers to the conceptual flow of time in the score *or* settings such as margins that may change in the course of the file. For every such element, the specification must state whether "before" and "after" refer to the time sequence as modified by backup/forward, or to the sequential position of the element in the file. Again, a very large red design flag.

Removing backup/forward requires finding another way to deal with multiple time flows (not just voices) within a part. A good simple approach would be to require every note, chord, rest, slur start/end, cresc/dim start/end, etc.
to specify an explicit starting time within the measure, subject to a constraint that time cannot flow backward (the start time of each element must not be less than the start time of the previous one); but there may be other equally good simple approaches that meet the essential criterion of no disconnect between time order and file order. Even with the explicit approach, the following simple abbreviation rule would allow eliding nearly all starting times: the default starting time of each element is the default starting time of the previous element (in the same part and measure) tagged with the same voice number, plus the duration of that element, or 0 if it is the first element in that part and voice in that measure; if not tagged with a voice number, it is the starting time plus duration of the previous element, or 0 at the start of the measure. However, the underlying model would still be one of explicit tagging.

On another topic, perhaps it is obvious from the introduction, but the specification must define, for every construct that uses start...stop tags, exactly what the constraints are on what elements can participate in the construct (limited to same measure? same part? same voice? adjacent notes/chords? etc.), and of course must require semantic well-formedness (every "start" must have a "stop" and vice versa, "continue" may only occur between "start" and "stop", etc.).

Finally and least seriously, there are a large number of different contexts in which text and/or symbol strings can appear in MusicXML files, and each one of them has different restrictions on what attributes can appear with the strings. Here is the table I derived from the MusicXML 3.0 DTD (please view this in a fixed-pitch font):

%position = default-x default-y relative-x relative-y
%font = font-family font-style font-size font-weight
%text-decoration = underline overline line-through

direction dynamics mn.beats lyr.text reh.
words m'nome lyrics lyr.el/ex
placement X X X
%position X X X X X
%font X X X X X X X
color X X X X X X X
%text-decoration X X X
justify X X
h/valign X
rot,dir X X X
letter-spacing * X X
line-height X
enclosure X X

* = in MusicXML 3.0 only

While the above covers all of the larger issues, I ran into a number of other concrete issues with the existing MusicXML definition when writing my own MusicXML producer and consumer software. I posted many of them on the then-existing MusicXML discussion list and received answers to almost all of those, but there are a few others documented only in comments in my code. I will be happy to provide any W3C committee with the complete list.

Recommendations
---------------

As should be clear from the above observations, I believe all existing versions of MusicXML have issues, both with respect to mindset and with respect to specific technical issues, so significant that no effort should be devoted to trying to write specifications for them, even in the short term. I advocate strongly that our efforts be directed first to updating MusicXML into a design that does have the potential of meeting the criteria for a high quality standard / specification.

As a "straw man," I suggest that this be the responsibility of a committee consisting of the following people:

* Someone who has at least as much experience as I do in reading and writing high-quality specifications and standards, preferably someone with a strong musical background. (I personally know one person other than myself who would fit this profile well, and I'm sure some other members of this discussion would qualify even better.)

* Michael Good, as originator of MusicXML and also to represent Finale's MusicXML import and export implementation.

* Three software developers who have implemented MusicXML import and export functions for score applications and who are not connected with MakeMusic or Finale, one of whom should be from the Sibelius team.
  Ideally, they should represent those applications that implement the largest fraction of MusicXML constructs.

* A person tasked with writing a fully automatic converter from all existing MusicXML formats to the new one. Doing this in parallel with the design discussion should greatly increase the chances of identifying omissions and unclarities in the existing designs.

Avoiding "second system syndrome" in this effort is of great importance, as is producing a result within a reasonable amount of time. The effort should take as its goals:

* Producing a design that is documented by a high-quality specification (as defined above). The design itself, the specification document, and a mechanical converter should be developed together.

* Fixing all of the identified *significant* problems with the current MusicXML design, including but not limited to those listed above.

* Not adding any other functionality unless it "falls out" of what should be a simpler and more orderly design. (See above re text attributes.)

* Changing as little else as possible, to reduce the effort of creating updated importers and exporters.

Again, this is a "straw man" in its details, but its motivation -- to clean up MusicXML before putting any effort into documenting or specifying it further -- is the main thesis of this long posting. I hope any W3C work on MusicXML will consider this thesis thoughtfully.

Conclusion
----------

MusicXML has been tremendously successful as a "first good enough" digital score representation for widespread use. We are at a turning point where we have the opportunity to take it to a new level of quality that would give it a much better chance of taking a place among the best multi-decade and eventually de jure successful standards. Please, let's not lose the opportunity.
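
Illustration: the abbreviation rule for default starting times described under "Technical issues" can be sketched in a few lines of Python. This is a minimal sketch under stated assumptions, not part of any MusicXML version. The names Event and resolve_starts are hypothetical, durations and start times are plain integers in arbitrary units, the list holds the elements of one part within one measure in file order, and "the previous element" in the untagged case is read as the immediately preceding element in the file.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Event:
    """One timed element (note, chord, rest, ...) in a single measure."""
    duration: int
    voice: Optional[int] = None   # None means "not tagged with a voice"
    start: Optional[int] = None   # None means "use the default start time"


def resolve_starts(measure: list[Event]) -> list[int]:
    """Resolve every start time and reject time that flows backward."""
    end_of_voice: dict[int, int] = {}  # voice -> start + duration of its last element
    end_of_previous = 0                # start + duration of the preceding element
    previous_start = 0
    starts = []
    for event in measure:
        if event.start is not None:
            start = event.start                       # explicit tagging wins
        elif event.voice is not None:
            start = end_of_voice.get(event.voice, 0)  # 0 if first in this voice
        else:
            start = end_of_previous                   # 0 at the start of the measure
        if start < previous_start:
            raise ValueError(
                f"start time {start} is earlier than the previous start {previous_start}"
            )
        starts.append(start)
        if event.voice is not None:
            end_of_voice[event.voice] = start + event.duration
        end_of_previous = start + event.duration
        previous_start = start
    return starts


# Voice 1 plays two notes of duration 1 against one note of duration 2 in voice 2.
two_voices = [Event(1, voice=1), Event(2, voice=2), Event(1, voice=1)]
print(resolve_starts(two_voices))  # [0, 0, 1]
```

Listing both voice 1 notes before the voice 2 note would raise ValueError under this reading, because the voice 2 note would default to time 0 after an element that starts at time 1. File order and time order stay in agreement without anything like backup/forward.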
================================================================

*Annex: Qualifications and experience*

I believe I have an unusually broad combination of experience and qualifications in this group: as a reader and writer of careful data specifications, as a system designer and implementor, as a student of digital score representations, and as a composer.

* I was the primary author of the RFCs (reference specifications) for DEFLATE compression (used in zip and gzip) and the gzip file format. I was one of the very few reviewers of the PostScript and PDF reference documentation outside Adobe. I was also a reviewer for the Java Language and Java Virtual Machine Specifications.

* I was the primary author of Ghostscript; the co-author of a seminal paper on just-in-time compilation, as well as the architect and primary implementor of the original just-in-time compiler for Smalltalk-80; and a co-recipient of the ACM Software System Award for my work on Interlisp.

* I have studied the syntax and semantics of MusicXML, Finale, Sibelius, and mup in careful detail. I have written software that to a substantial degree converts between all of these formats, limited mostly by my available time, by the issues I have found in MusicXML's specification and semantics, and by the deliberate efforts of Sibelius (and, since 2014, Finale) to lock up their data formats. In 2010, I wrote a graduate-school paper roughly comparing the four formats.

* I have a Music M.A. (composition) from Cal State Hayward, studying with Frank La Rocca. I have been a reasonably serious composer since 2003, including three small commissions for choral works and half a dozen performances of instrumental chamber music on San Francisco Bay Area NACUSA concerts.

================================================================
Received on Tuesday, 20 October 2015 07:26:52 UTC