Re: Unique ids in MusicXML 3.1 and beyond from Joe Berkovitz on 2017-09-11 (public-music-notation-contrib@w3.org from September 2017)

From: Joe Berkovitz <joe@noteflight.com>
Date: Mon, 11 Sep 2017 12:43:44 -0400
To: Jeremy Sawruk <jeremy.sawruk@gmail.com>
Cc: James Sutton <jsutton@dolphin-com.co.uk>, Michael Good <mgood@makemusic.com>, public-music-notation-contrib@w3.org
Message-ID: <CA+ojG-YiWJOcEB-2LVQvTKtxvgJnSa++3JodSUOxWywc8_2=mw@mail.gmail.com>
Hi all,

The co-chairs just discussed this question on our call this morning. Here
are our thoughts:

- As a point of order, in the future let's always use the GitHub issues
mechanism to record discussions like this. That way we can properly
monitor, categorize, cross-reference and maintain clear status on all
issues in play. Michael Good is going to record the issue in the MusicXML
repo and reference this discussion so far in the mailing list archives.

- The chairs are in agreement that, as per the XML ID spec
https://www.w3.org/TR/xml-id/, the only "interoperable" aspect of xml:id is
its uniqueness within a document. We're not allowed to impose special
syntactic constraints, or assign domain-specific semantics to the actual
content of the ID, so we consider this question closed for MusicXML 3.1. An
xml:id must be an opaque label, and no more.

- In line with Jeremy Sawruk's last reply, we're not able to think of any
application functionality that is prevented by letting applications choose
IDs according to any scheme they like. Applications can always remap both
ID declarations and ID references in an incoming document to any desired
internal scheme, without any perceptible effect on the end user. The sole
function of the ID is to resolve a reference to some object, and that is
all. It does not matter if an element is labeled "measure1" or
"MyAuntFlossie".

- For MNX, going forward, anyone -- perhaps James Sutton -- can open an
issue regarding addressability or IDs that expresses an unmet need, and we
can continue the discussion on that thread.

- Indexing-based schemes like "bars[0].parts[0].voices[0].beats[1]" do not
address the need for stable addressability that James originally raised
(and we think xml:id suffices for this need). My hope is that a good choice
of schema would allow XPath selectors to provide this sort of support.
Anyway, if someone feels we'll need something like this, by all means let's
file an issue for that too. We'll look into EMA separately; Andrew, perhaps
we can connect on this topic when I see you in Charlottesville.
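
As a rough illustration of the remapping point above (a sketch only, not
part of any spec): the Python below rewrites every ID declaration and every
reference to it into an application-chosen sequential scheme. The function
name remap_ids, the "ref" reference attribute, and the sample element names
are hypothetical; a real application would substitute whichever attributes
actually carry IDs and references in its documents.

```python
import xml.etree.ElementTree as ET


def remap_ids(root, id_attr="id", ref_attrs=("ref",), prefix="n"):
    """Rewrite IDs to prefix+counter; return a new-ID -> old-ID lookup.

    id_attr and ref_attrs are assumptions for this sketch. The prefix is
    there because an XML ID must be a valid name and so cannot begin
    with a digit.
    """
    old_to_new = {}

    # Pass 1: assign a sequential ID to each declaration, in document order.
    for elem in root.iter():
        old = elem.get(id_attr)
        if old is not None:
            if old in old_to_new:
                raise ValueError(f"duplicate id: {old!r}")
            old_to_new[old] = f"{prefix}{len(old_to_new) + 1}"

    # Pass 2: rewrite declarations and any references that point at them.
    for elem in root.iter():
        old = elem.get(id_attr)
        if old is not None:
            elem.set(id_attr, old_to_new[old])
        for attr in ref_attrs:
            target = elem.get(attr)
            if target in old_to_new:
                elem.set(attr, old_to_new[target])

    # The inverse map recovers the original ID from the internal one.
    return {new: old for old, new in old_to_new.items()}


# Hypothetical sample document; element names are illustrative only.
doc = ET.fromstring(
    "<score>"
    '<measure id="MyAuntFlossie"><note id="abc"/></measure>'
    '<annotation ref="abc"/>'
    "</score>"
)
new_to_old = remap_ids(doc)
```

The returned dictionary is what lets an application write the original
IDs back out on export if it wants to preserve them.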

Best,

.            .       .    .  . ...Joe

Joe Berkovitz
Founder
Noteflight LLC

49R Day Street
Somerville MA 02144
USA

"Bring music to life"
www.noteflight.com

On Fri, Sep 8, 2017 at 6:14 PM, Jeremy Sawruk <jeremy.sawruk@gmail.com>
wrote:

> James asked "However if one program uses 1,2,3.. for ids and another uses
> "a", "b", "c".. another uses decimal numbers, another uses hex, and another
> uses "part1-bar1" how can they interoperate in their use of the id?"
>
> My response is that it doesn't matter what the ID is; you can still use
> them sequentially if you absolutely must do that. You would do this by
> using an associative array of ID -> Int. Every time you find an ID that
> isn't in your associative array, you store that value and increment your ID
> counter. Now if you need the document's ID given your sequential number,
> you just do a lookup in the inverse (Int -> ID) array: ID[1] might return
> "abc". Because this is an associative array, the lookup should be O(1) on
> average, though there is a memory overhead. Fortunately, most individual
> MusicXML files are relatively small (< 1MB), so the memory overhead isn't
> too much.
>
> If that isn't a viable solution, then you could just replace the IDs in
> the document with the IDs that you need in a preprocessing step. I don't
> know enough about James' software to know why this is useful, but there is
> nothing stopping him from doing this (nor should there be). The IDs will
> always have xml:id semantics when transmitted in documents, but there is
> nothing to stop a developer from reinterpreting them once they are inside
> of a MusicXML client. The MusicXML/MNX specifications cannot dictate HOW
> software is written, they merely specify the interchange format between
> different pieces of software.
>
> On Fri, Sep 8, 2017 at 4:31 PM, James Sutton <jsutton@dolphin-com.co.uk>
> wrote:
>
>> Hi Michael and all,
>>
>> comments inline..
>>
>> James Sutton
>> Dolphin Computing
>> http://www.dolphin-com.co.uk
>> http://www.seescore.co.uk
>> http://www.playscore.co
>>
>>
>>
>> On 8 Sep 2017, at 19:54, Michael Good <mgood@makemusic.com> wrote:
>>
>> Hi James and all,
>>
>>
>> ...
>>
>> I don’t see where that causes a problem though. What difference does it
>> make how a unique ID is formatted as long as it is unique within the
>> document, which any XML validator will check?
>>
>>
>> Not true. CodeSynthesis xsd/e (which SeeScore uses) does not check this.
>> This is a sensible approach as it can be expensive (time *and* space) to
>> check in the general case, especially for applications which don't care.
>>
>>
>> With many MusicXML applications, these id attributes will not be
>> preserved on a round trip. MusicXML applications tend to read in a MusicXML
>> file and convert to the application’s underlying data structures. Data that
>> doesn’t fit into the application’s data structures is ignored. When the
>> file is exported it is coming from the application’s internal data, not the
>> original MusicXML file. So many things may change between import and
>> export. Mogens has mentioned this many times, but it seems inherent in how
>> MusicXML is currently used for document interchange.
>>
>>
>> yes
>>
>>
>> With MNX we are working on a format that is better suited as a native
>> representation for applications, for use cases that go well beyond document
>> interchange. So the preservation of data across cooperating applications
>> becomes a more interesting issue for exploration there.
>>
>>
>> ok, but I am sure we are all apprehensive about the huge pile of work to
>> adopt MNX! Even more for those of us where MusicXML is the document format.
>>
>>
>> Of course there are applications already using MusicXML as a native
>> format, or the basis for a native format. Those applications can define the
>> format of the unique IDs however they wish. But I still don’t understand
>> how a standardized id format would enhance interoperability.
>>
>>
>> numbers are cheap for the sort of processing needed for uids. It is a
>> natural choice. However if one program uses 1,2,3.. for ids and another
>> uses "a", "b", "c".. another uses decimal numbers, another uses hex, and
>> another uses "part1-bar1" how can they interoperate in their use of the id?
>> In particular, if you edit the file (to add an annotation, say), how can the
>> editor generate a new uid in a file which uses different standards? The
>> only possibility is to regenerate all the ids in the file using the
>> standard that the editor uses, notwithstanding strictures from Johannes ;-).
>> If simple numbers are not seen as a good choice then we could agree some
>> other, but anything involving the index of the item in the
>> file/part/measure does not work as these could not be invariant on file
>> change.
>>
>>
>> For MusicXML 3.1, the main issue is if there is anything we need to
>> change with these new id attributes before release. I don’t think there is,
>> but I am not confident that I am really understanding what is behind
>> James’s request.
>>
>>
>> Nothing needs to change as far as I am concerned.
>> If you need more info we could go off this thread
>>
>>
>
Received on Monday, 11 September 2017 16:44:27 UTC