Re: Update on MusicXML Use Cases

Hello Joe et al.

Following your advice, I have gone through the exercise of writing concrete use cases, to help clarify what may look like OMR-specific concerns.

Re-reading use-case MC3, I think it does not belong in the "Music Creation" category. Strictly speaking, there is no "creation" per se (as by a composer) but rather a conversion from an inconvenient format (a sheet of paper) to a much more usable format (a digital encoding). I would suggest the word "transcription" instead; it is the precise word used for this case by the BnF (Bibliothèque nationale de France, the French national library).

Music transcription can be considered an activity in its own right. It opens paper-based music to the world of digitally encoded music. Transposing, as mentioned in MC3, is only one of the numerous possibilities that digital encoding brings. And after all, this whole "Music Notation" community activity we are discussing right now would simply not exist without digital encoding.

Should we define a specific "Music Transcription" category, distinct from music creation?

Below is a first attempt at the most obvious use cases.

MT0: Editor wants to manually transcribe sheet music to a digital encoding format

E has access to some sheet music, either as a physical sheet of paper or as a scanned image of it. The sheet can contain handwritten or printed music.

E reads the source music and manually enters a corresponding transcription. The digital output should of course preserve semantic data, and perhaps performance data as well. Visual data is not essential, especially if the source was handwritten music.
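
To make the semantic/visual distinction concrete, here is a minimal Python sketch of what purely semantic output could look like. The element names come from the published MusicXML format; the script itself is only an illustration, not part of any particular tool:

    import xml.etree.ElementTree as ET

    # Build a single MusicXML <note> carrying only semantic data:
    # pitch and duration, with no position, font, or layout details.
    note = ET.Element("note")
    pitch = ET.SubElement(note, "pitch")
    ET.SubElement(pitch, "step").text = "C"       # note name
    ET.SubElement(pitch, "octave").text = "4"     # middle C
    ET.SubElement(note, "duration").text = "4"    # assuming <divisions> = 4 per quarter
    ET.SubElement(note, "type").text = "quarter"  # notated duration

    print(ET.tostring(note, encoding="unicode"))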

Note: in a variant of this use case, multiple users playing the E role over the Internet are each provided with the image of, say, just one simple measure and prompted for its manual transcription (rather similar to Google's text reCAPTCHA).
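
As a purely hypothetical sketch of how such crowd contributions could be reconciled (nothing here refers to an existing service), a simple majority vote over independent transcriptions of the same measure might look like this:

    from collections import Counter

    def reconcile(transcriptions: list[str]) -> str | None:
        """Return the transcription most users agreed on, or None
        if no strict majority emerged for this measure."""
        if not transcriptions:
            return None
        candidate, votes = Counter(transcriptions).most_common(1)[0]
        # Require a strict majority before accepting the result.
        return candidate if votes > len(transcriptions) / 2 else None

    # Three users transcribe the same measure; two agree.
    print(reconcile(["C4 quarter", "C4 quarter", "C4 half"]))  # -> "C4 quarter"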
 
MT1: Editor wants to transcribe printed sheet music to a digital encoding format with the help of OMR software

E does not enter the whole transcription from scratch but works in two steps. In step 1, OMR software reads the source image and provides an annotated transcription. In step 2, E reviews the OMR output and manually corrects wrong or missing items in the final digital encoding.

This approach is worthwhile only when the effort spent in step 2 is significantly lower than the effort of MT0. This depends on many factors, notably the initial image quality, the music complexity, the OMR recognition rate, and the OMR interaction capabilities.

OMR output needs to provide hints for user review (confidence in each symbol, abnormal voice durations, detected incompatibilities, missing barlines, significant objects with no interpretation, etc.) in order to draw the user's attention to these points.
Visual data is key here, to easily relate any given music symbol to its precise location in the initial image.
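
As an illustration only (the field names below are hypothetical, not taken from any specific OMR format), an annotated OMR item could bundle the recognized shape, a confidence value, and its bounding box in the source image, so that a review UI can highlight doubtful symbols in place:

    from dataclasses import dataclass, field

    @dataclass
    class OMRItem:
        shape: str           # recognized symbol, e.g. "quarter-note"
        confidence: float    # classifier confidence in [0, 1]
        bbox: tuple[int, int, int, int]  # (x, y, width, height) in the source image
        flags: list[str] = field(default_factory=list)  # e.g. "abnormal-voice-duration"

    def items_needing_review(items: list[OMRItem], threshold: float = 0.9) -> list[OMRItem]:
        """Select the items the UI should call to the user's attention."""
        return [it for it in items if it.confidence < threshold or it.flags]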

E should use an "OMR UI" specifically meant for OMR validation and correction. As opposed to a standard music editing UI, such an OMR UI should focus on fidelity to the initial image, avoid any over-interpretation of user actions, and even switch off validation while a series of user actions has not been explicitly completed.

MT2: Editor wants to transcribe sheet music to a digital encoding format without manual intervention

This can be seen as a variant of MT1 without step 2 (review). However, it must be considered a use case in its own right because, for large libraries with millions of pages, having human beings spend several minutes reviewing each page is out of reach.
See the SIMSSA project regarding the use of (imperfect) OMR data as a hidden navigation layer on top of the source image display.
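
Reusing the hypothetical OMRItem from the sketch under MT1, such a hidden navigation layer could be as simple as mapping a click on the displayed page image to the OMR item whose bounding box contains it:

    def item_at(items: list[OMRItem], x: int, y: int) -> OMRItem | None:
        """Return the OMR item under a click on the source image, if any."""
        for it in items:
            bx, by, bw, bh = it.bbox
            if bx <= x < bx + bw and by <= y < by + bh:
                return it
        return None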

A side benefit of bypassing human review is that it allows a transcription campaign to be re-launched at minor cost whenever significant progress is made in OMR technology. Such progress is helped by the openness and modular architecture of the OMR pipeline software...

MT3: Many editors help improve an OMR service via collaborative OMR over the web

This use case extends MT1 to a shareable OMR service used over the web: each user review action, whether a validation or a correction, is linked back whenever possible to the underlying OMR item.

If we do have an identified OMR item, it can be recorded as a representative sample (physical appearance and assigned shape). Samples are accumulated and later used to asynchronously improve the training of shape classifiers. A commonly accepted figure in today's deep learning projects is at least 5000 samples per shape; such numbers would easily be reached with this collaborative approach.
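
As a sketch of what recording such samples might involve (all names hypothetical), each validated item contributes one (appearance, shape) pair to a per-shape training set:

    from collections import defaultdict

    # Per-shape training sets: shape name -> list of validated image patches.
    samples: dict[str, list[bytes]] = defaultdict(list)

    def record_sample(shape: str, patch: bytes) -> None:
        """Store one validated (appearance, shape) pair for later training."""
        samples[shape].append(patch)

    def shapes_ready_for_training(minimum: int = 5000) -> list[str]:
        """Shapes that have reached the sample count cited above."""
        return [s for s, ps in samples.items() if len(ps) >= minimum]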

If no clear OMR item is identified, the user could instead be prompted to select a case from a list of typical errors. That way we could increment a tally of typical errors together with their concrete context. Later, an OMR developer could pick one of the most common errors and have immediate access to all the related concrete examples.
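
A minimal sketch of such a tally (again with hypothetical names): each reported case increments a counter for its error type and keeps the concrete context for later inspection by a developer:

    from collections import Counter, defaultdict

    error_counts: Counter = Counter()
    error_contexts: dict[str, list[dict]] = defaultdict(list)

    def report_error(error_type: str, context: dict) -> None:
        """Record one user-reported error together with its concrete context."""
        error_counts[error_type] += 1
        error_contexts[error_type].append(context)

    def most_common_errors(n: int = 5) -> list[tuple[str, int]]:
        """The errors a developer might tackle first, with their counts."""
        return error_counts.most_common(n)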


Bye,
/Hervé

-- 
Hervé Bitteur
Audiveris OMR 
www.audiveris.com
