Re: Getting Off The Ground

Hi all,

First, I need to give a full disclosure, as Andrew did: while I am the chair of the MEI Board, I write this email from my own perspective, so this is not an "official" response of MEI or anything like that.

It's great to hear that MEI seems to be recognized beyond its active community. However, besides the technical aspects that Glenn and Andrew discussed, I believe it's helpful to take a broader perspective on the respective qualities of MusicXML and MEI first, as requested in Glenn's first mail. While I won't elaborate on MusicXML – I believe most of you are sufficiently familiar with it – I'll try to give some background information on MEI. 

Most importantly, MEI is more of a framework than a standard. That means it offers a wide range of encoding solutions for music notation, metadata on music, and other aspects I'll come back to in a second. For instance, there are four different solutions for encoding beams, all of which operate at different levels of complexity and with different scopes. Not all of these solutions are actively used (actually, one or two may be deprecated in the future), but they all address different use cases. To use MEI properly, one should never use it out of the box, but instead create a customization (read: custom profile). This is possible through TEI's ODD language (http://www.tei-c.org/Guidelines/Customization/odds.xml), which is used to define MEI in a modular way. 
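
To make this concrete, here is a rough sketch of two of those beaming solutions side by side. The element and attribute names follow current MEI usage, but please treat the fragments as illustrative rather than normative:

   <!-- solution 1: a <beam> container wrapping the beamed notes -->
   <layer>
     <beam>
       <note pname="c" oct="4" dur="8"/>
       <note pname="d" oct="4" dur="8"/>
     </beam>
   </layer>

   <!-- solution 2: a stand-off <beamSpan> pointing at the notes by
        xml:id, which also covers cases (e.g. beams across barlines)
        where a container element cannot -->
   <layer>
     <note xml:id="n1" pname="c" oct="4" dur="8"/>
     <note xml:id="n2" pname="d" oct="4" dur="8"/>
   </layer>
   <beamSpan startid="#n1" endid="#n2" plist="#n1 #n2"/>

Both fragments describe the same two beamed eighth notes; they simply sit at different points on the trade-off between locality and flexibility.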

We're fully aware that this technical setup is not very beginner-friendly, and that it creates considerable technical overhead before one can get started. However, the flexibility offered by this setup is required for the main purposes of MEI. As someone pointed out, MusicXML has been designed as an interchange format (even though its origin in and baffling similarity to MuseData (http://www.ccarh.org/publications/books/beyondmidi/online/musedata/) seem to contradict this). MEI, on the other hand, is intended to cover various scholarly requirements. It doesn't only seek to encode everything necessary to describe a page of music coming out of Finale, Sibelius, or a human engraver from the 19th century. It is equally interested in the composer's manual additions and corrections in that 19th-century print. Every hand shift can be encoded, together with a description of the writing medium and the most likely writer (alongside possible alternatives). It is possible to point out the differences between multiple sources without duplicating the shared material. In essence, it seeks to cover all information contained in one or more manuscripts and prints, not only the aspects that directly influence the music (notation). 
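
To give a flavour of what such markup looks like, here are two schematic fragments. The element names (handShift, app, rdg) are genuine MEI, but the IDs (#composer, #print1859, #autograph) are invented for the example:

   <!-- a change of writer, with the writing medium identified -->
   <handShift new="#composer" medium="pencil"/>

   <!-- two sources diverge on a single note; everything around
        the <app> is shared and encoded only once -->
   <app>
     <rdg source="#print1859">
       <note pname="e" oct="4" dur="4"/>
     </rdg>
     <rdg source="#autograph">
       <note pname="f" oct="4" dur="4"/>
     </rdg>
   </app>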

With this in mind, the lack of software support for MEI in the past has actually been a benefit for its maturation. With no applications around, the focus could stay on the encoding itself, leading to very appropriate concepts in most cases. If an encoding model had to change, breaking backwards compatibility wasn't an issue, as there wasn't much that could be broken. In fact, such changes were most often requested by those who had data in the affected area and wanted to overcome some limitation. 

However, this situation has changed in the last couple of years, at least for the core areas of MEI (there is still sufficient room for exploration in more distant repertoires…). Applications like Verovio use MEI, and of course they can't support four different solutions for the same problem. In this regard, I expect that application support will define some kind of de-facto standard within MEI, and we're well advised to actively manage this process and make conscious decisions about what to keep and what to drop. At the same time, it is absolutely clear that no single renderer will ever be able to convey all the information potentially contained in an MEI file. Obviously, one could use color or overlays to illustrate the different writers of a manuscript, but there's no point in printing different readings from multiple sources simultaneously. Here it becomes clear that MEI requires interactive rendering, where the application shows one alternative at a time, together with a hint that others are available. A demo of this concept is available at http://zolaemil.github.io/meiView, but I'm sure we'll see more advanced approaches in this direction in the future. 

This mismatch between encoding model and application support isn't perceived as a restriction. First of all, many projects in the MEI context use digital facsimiles alongside their encodings. As MEI allows the encoding to be related to one or more facsimiles, there's no point in replicating the manuscript perfectly – in essence, you don't try to encode the (graphical) document, but instead the (logical) text it contains. At the same time, many projects pre-process their MEI instances before feeding them into a renderer like Verovio. A good example of this is http://beethovens-werkstatt.de/demo, where absolutely everything comes out of one single MEI file (which is available from the "Codierung" tab). We then use XQuery and XSLT to extract those parts which are required to display the different components. The message here is that software requirements are not the major design factor in the development of MEI, even though applications are starting to catch up and increasingly influence MEI itself. 
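
As a toy version of such pre-processing, an XSLT identity transform that resolves every <app> to the reading of one particular source could look roughly like this (the source ID is, again, invented, and a real project would need considerably more logic):

   <xsl:stylesheet version="2.0"
       xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
       xmlns:mei="http://www.music-encoding.org/ns/mei">

     <!-- copy everything by default -->
     <xsl:template match="@* | node()">
       <xsl:copy>
         <xsl:apply-templates select="@* | node()"/>
       </xsl:copy>
     </xsl:template>

     <!-- replace each <app> by the content of the requested reading -->
     <xsl:template match="mei:app">
       <xsl:apply-templates select="mei:rdg[@source = '#print1859']/node()"/>
     </xsl:template>

   </xsl:stylesheet>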

Obviously, it makes a difference for the encoding whether you seek to describe an existing source (which is often, but not always, the primary motivation for MEI-based projects) or intend to prescribe how a renderer should lay out a digital score. While MEI tries to cover both aspects, a single encoding normally takes one or the other approach, as the two may interfere. MusicXML is clearly designed with the second perspective in mind, and for good reason: the description of existing sources is a niche use case, which is – outside of the scholarly world – probably only relevant for Optical Music Recognition (OMR). Even there, it's clearly a very small part of the "market" for music encoding. 
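
A minimal illustration of the two perspectives, again with invented IDs and coordinates: a descriptive encoding typically anchors the notes to zones of a facsimile, while a prescriptive one dictates the layout to the renderer:

   <!-- descriptive: the note points at a region of a scanned page -->
   <facsimile>
     <surface>
       <zone xml:id="z1" ulx="120" uly="80" lrx="160" lry="120"/>
     </surface>
   </facsimile>
   <note facs="#z1" pname="g" oct="4" dur="4"/>

   <!-- prescriptive: the encoding tells the renderer where to break systems -->
   <measure n="8">...</measure>
   <sb/>
   <measure n="9">...</measure>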

In essence, MEI has been developed in a small niche, with edge cases of music encoding in mind. It has clearly benefitted from having no software support in the past, and is now mature enough to deal with software requirements, at least in its core areas. If its conceptual model and setup are of interest to a wider, potentially commercial community, this shows that the community's investment in proper modeling was fruitful. However, there might be conflicts between straightforward use cases and the more obscure scholarly requirements. They are certainly manageable, but we should all be well aware of them.

One thing that has been discussed every now and then in the MEI community is the creation of a "simplified" MEI customization / profile, which removes both the ambiguity in the model and the editorial / transcriptional features that make the main difference between MEI and any other music encoding scheme. This would very much reflect the purpose of TEI Lite (http://www.tei-c.org/Guidelines/Customization/Lite/index.xml), which intends to cover 90% of the needs of 90% of its user base. I could very well see how such an approach would interact with this W3C group, and how it could lead to an encoding standard built on and implementing the best of both MusicXML and MEI. I believe I speak not just for myself when I say that the MEI community is very interested in participating in such an effort, with no predetermined solution on either end. 
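
In ODD terms, such a profile would essentially be a schemaSpec that pulls in the core modules and then prunes the alternatives. Very roughly, and purely as a sketch rather than a worked-out proposal (the profile name and the choice of what to delete are made up here):

   <schemaSpec ident="mei-simple" start="mei">
     <!-- pull in the modules a "90%" profile needs -->
     <moduleRef key="MEI"/>
     <moduleRef key="MEI.shared"/>
     <moduleRef key="MEI.header"/>
     <moduleRef key="MEI.cmn"/>
     <!-- remove one of the redundant beaming solutions -->
     <elementSpec ident="beamSpan" module="MEI.cmn" mode="delete"/>
   </schemaSpec>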

Best,
Johannes

On 19 Sep 2015, at 01:29, Glenn Linderman <v+smufl@g.nevcal.com> wrote:

> On 9/18/2015 3:32 PM, Andrew Hankinson wrote:
> 
> Thanks for the pointers to the various projects and their goals. Will make interesting reading.
> 
>> I have experimented (successfully) with round-tripping MEI XML to JSON and back to XML, so there is no reason why this is not possible -- a hierarchical markup language is a hierarchical markup language. There's also no reason it could not go to YAML or any other representation. 
> 
> Agreed, and I can't presently think of a case where XML is insufficiently expressive, just that it is excessively wordy, and, for web/javascript use, requires an extra layer of parsing to obtain something useful to javascript. But you have a mental model below that makes JSON equivalence of XML appear more complex than it needs to be. We don't need to be encumbered with misperceptions of that sort.
> 
> The point I was trying to make is that if the standard wants to be web-friendly, maybe it needs to have its primary expression be web-friendly JSON.
> 
> That doesn't preclude having a standard translation between JSON and XML so that one can borrow and benefit from schema tools and validation, if those are more available and more robust than JSON schema tools, validators, and parsers.
> 
> I have seen what appear to be iron-clad JSON parsers for nearly every language, so XML doesn't seem to have an advantage there. While I have seen schema tools and validators "advertised" for JSON, I have no experience with them, nor how they compare to their XML equivalents, nor in how many different languages they may be available.
> 
>> The simplicity of JSON is somewhat deceptive, though. To get the same functionality as XML you need special treatment of attributes, text, and nested tags which are concepts that are not in many "markup" languages. So, an XML snippet like:
>> 
>> <foo bar="baz">This is some <b>bold</b>text</foo>
>> 
>> Would need to expand to something like:
>> 
>> {
>>   "foo": [{
>>     "@attributes": { "bar": "baz" },
>>     "@value": "This is some",
>>     "@tail": "text",
>>     "b": [{
>>        "@value": "bold"
>>     }]
>>   }]
>> }
>> 
>> Which quickly becomes rather unreadable.
>> 
> 
> But that is because it isn't done "correctly", not being normalized properly, and wouldn't easily support 
> 
> <foo bar="baz">This <snicklefritze>is</snicklefritze> some <b>bold</b>text</foo>
> 
> This is a rather deeper topic than I expected in this discussion, but the devil is in the details, so if there are misperceptions about details, it is better to clear them up early than to let them influence design choices.
> 
> An XML nested tag consists of 3 subitems: a tag name, a collection of attribute values indexed by key, and an array of content.
> 
> A simple convention is to represent an XML tag by an array, where the tag and attributes are the first and second items, and the remaining items are content, each of which is either a string or a child tag.
> 
>    [
>        "foo",
>        {"bar": "baz"},
>        "This is some",
>        [
>            "b",
>            {},
>            "bold"
>        ],
>        "text"
>    ]
> 
> 
> The above convention probably produces the most compact representation of XML in JSON, but it may or may not be perceived as friendly for programmatic access, or easy to keep track of when interpreting it mentally. A different convention would be to add noise words to name the three items associatively, with content still being either string or subtag:
> 
>    {
>        "tag": "foo",
>        "attr": {"bar": "baz"},
>        "data": [
>            "This is some",
>            {
>                "tag": "b",
>                "attr": {},
>                "data": [
>                    "bold"
>                ]
>            },
>            "text"
>        ]
>    }
> 
>> -Andrew
> 
> Glenn
