RE: Rights Automation Community Group - comments on draft Standard - Resources from Mark Bird on 2020-12-10 (public-md-odrl-profile@w3.org from December 2020)

From: Mark Bird <mark.bird@databp.com>
Date: Thu, 10 Dec 2020 20:15:50 +0000
To: "public-md-odrl-profile@w3.org" <public-md-odrl-profile@w3.org>
Message-ID: <ddc178e72d48466c9b50af5ffb1ac3ae@IRNLNDWINEXCH.databp.local>
Hi everyone,

We talked yesterday about how Resource is a generic term that applies to a data set in any phase of the life-cycle. We can be more specific by calling the Resource a Source when it's the informationally complete data set created by an Originator. We can also be more specific by calling the Resource an Asset when it's (potentially) a subset of the Source and constrained by one or more Rules.

But what about a data set that has been "sliced and diced" so that it's no longer informationally complete, but is also not yet constrained by a Rule? We talk about data in this state all the time, and so I think that having a word to refer specifically to it is important. Having thought about it a bit more, I'd like to suggest that we use the word Product for this purpose.

When an informationally complete data set has been refined in a way that makes it commercially useful, I think it makes a great deal of sense to say that it's been productized. Just as a reservoir of crude oil buried in the ground is a resource, but must be extracted and refined before it can be considered a product, so with data we can consider the Source to be complete, but unrefined, and a Product extracted from the Source to be commercially useful.

The knock against Product as a term is that it's used so pervasively in the industry that it often means whatever we want it to mean in whatever context we're referring to at the time. But I think that when we're thoughtful about using the term, we do mean a particular slice of a greater source that may be licensed, distributed, consumed, and used in many different ways.

So, I propose that we add the term Product to section 2.2 of the ODRL profile<https://w3c.github.io/market-data-odrl-profile/md-odrl-profile.html#Resources>, filling a place between Source and Asset, and referring to a Resource that is "packaged by timeliness, geography, or other dimension before Rules are applied to control its use."
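
As a rough illustration of where Product would sit, here is a minimal Python sketch of the three-way distinction. The names used (`Resource`, `informationally_complete`, `rules`, `stage`) are assumptions made for this example only and are not taken from the profile.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Resource:
    """Hypothetical model of a data set at any phase of the life-cycle."""
    name: str
    informationally_complete: bool  # True only for the Originator's full data set
    rules: List[str] = field(default_factory=list)  # Rules controlling its use, if any


def stage(resource: Resource) -> str:
    """Return the more specific term for a Resource, following the proposal above."""
    if resource.rules:
        # Constrained by one or more Rules: an Asset (possibly a subset of the Source).
        return "Asset"
    if resource.informationally_complete:
        # Complete but unrefined, as created by the Originator.
        return "Source"
    # Sliced and diced, but not yet constrained by a Rule.
    return "Product"


# Illustrative data sets; the descriptions are invented for this example.
full_feed = Resource("full feed", informationally_complete=True)
delayed = Resource("delayed, one region", informationally_complete=False)
licensed = Resource("delayed, one region", informationally_complete=False,
                    rules=["display use only"])
```

In this sketch all three objects remain Resources; `stage` only picks the narrower term.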

Best,
Mark



From: Mark Bird
Sent: December 9, 2020 7:53 AM
To: Benedict Whittam Smith <ben@deonticdata.com>; Phelan, Nigel <nigel.phelan@jpmorgan.com>; public-md-odrl-profile@w3.org
Subject: RE: Rights Automation Community Group - comments on draft Standard - Resources

Thanks, Ben! Looking forward to discussing more.

Best,
Mark


From: Benedict Whittam Smith <ben@deonticdata.com>
Sent: December 9, 2020 7:26 AM
To: Mark Bird <mark.bird@databp.com>; Phelan, Nigel <nigel.phelan@jpmorgan.com>; public-md-odrl-profile@w3.org
Subject: Re: Rights Automation Community Group - comments on draft Standard - Resources

Ack - this is tricky. My comments inline below for discussion later.

________________________________
From: Mark Bird <mark.bird@databp.com>
Sent: Wednesday, December 9, 2020 12:00 AM
To: Phelan, Nigel <nigel.phelan@jpmorgan.com>; public-md-odrl-profile@w3.org <public-md-odrl-profile@w3.org>
Subject: RE: Rights Automation Community Group - comments on draft Standard - Resources


This week our group focused on the Resources section of the profile. Most of our discussion focused on the concept of "Asset" and the way that it differs from the base term Resource. I'm typing out loud a bit here, but we settled our understanding on the definition of Asset (a Resource, a collection of Resources, or the part of a Resource controlled by a Rule) as meaning that an Asset has two fundamental components:



  1.  The data product itself
  2.  A specific right (and its accompanying obligations) that the Consumer has to that data product.

B> Yes, to the data product, but I'd put the second point a little differently: it's the fact that the data product is targeted by a right (aka rule), rather than the specific rule itself.



At first this was counterintuitive to me, and I was going to argue that Asset is the wrong word, because I'm used to thinking of an Asset as a tangible thing that I can do something with. One of my colleagues pointed out, though, that a firm that has these two license rights:



  1.  The right to distribute Product X internally to display terminals
  2.  The right to use that same Product X in a non-display trading application



in a very real sense has two distinct Assets, even though they're both based on the same data product.



B> No - so long as Product X has the same timeliness, book depth, and all the rest, it's the same Asset but two different Permissions.



So, to confirm my understanding, is it correct to say that an Asset comprises both a reference to a specific Resource and a particular license right the Consumer has to that Resource?



B> It's not that the specific Resource is controlled by a particular Rule, but the fact that the Resource is controlled by any Rule that makes it an Asset. In the example above, it's made available in two Permissions.
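
The reading in the inline comments (one Asset, two Permissions) can be sketched as below. This is only an illustration under assumed names: `Asset`, `Permission`, and their fields are hypothetical, and the timeliness and book-depth values are invented.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Asset:
    """Hypothetical: a Resource that is the target of at least one Rule."""
    product: str
    timeliness: str
    book_depth: str


@dataclass(frozen=True)
class Permission:
    """Hypothetical: a Rule that permits an action on a target Asset."""
    target: Asset
    action: str


# Same product, same timeliness, same book depth: one Asset.
product_x = Asset(product="Product X", timeliness="real-time", book_depth="top of book")

permissions = [
    Permission(target=product_x, action="distribute internally to display terminals"),
    Permission(target=product_x, action="use in a non-display trading application"),
]

# Collect the distinct Assets that the Permissions target.
distinct_assets = {p.target for p in permissions}
```

Here `permissions` has two entries while `distinct_assets` has one, which is the distinction the comments draw: being targeted by any Rule makes the Resource an Asset, and each right is a separate Permission.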



If that is correct, the introductory paragraph in the Resources section reads: The data supply chain provides resources (usually data resources). To track their progress and ensure compliance, we need to make three distinctions: the original data created by the Originator (1); the resource packaged by timeliness, geography, or other dimension before Rules are applied to control its use (2); and the controlled resource that is received by a Consumer (3). The first is a Source and the last is an Asset. All are Resources.



The sentence that reads "The first is a Source and the last is an Asset" is confusing, since there are three things referred to. I believe that Source refers to the original data created by the Originator, and Asset refers to the controlled resource received by a Consumer. So, shouldn't there be a specific term for the middle thing, "the resource packaged by timeliness, geography, or other dimension before Rules are applied to control its use?"



This term might actually be more useful, or at least more frequently used, than Source, since Source data applies only to a very specific case at the top of the distribution chain. The term I have in mind would describe the data product when a Provider is in the act of distributing it, but before it's been received and constrained by the specific rights of the Consumer.



One final and separate comment: In an Editorial note we read "[a Resource] only becomes a new Resource if the transformation is irreversible (i.e. the original data cannot be recovered) and non-substitutive (i.e. the altered data cannot be used in place of the original)." We in the community group discussed the definition of derivation and agreed that everybody in the industry understands it to be irreversible and non-substitutive. But the gadfly in my group said "well yes but what if some particular Originator wants to define it slightly differently?"



B> We're not forcing participants to use these terms! If there is an alternative pattern, we could model that too.



I've been trying to think of how to frame this comment as a suggestion. Maybe it's that certain notes could be prefaced with something like "It's generally understood in the industry that..." or something along those lines. This would add weight to the definition by pointing to common usage.



B> I think the meanings must be stable and predictable. If the terms don't fit a particular scenario, don't use them! Should we allow some kind of "wild-card" constraints where people could use free text to describe the criteria? Maybe, but we couldn't automate that.



Best,
Mark





From: Phelan, Nigel <nigel.phelan@jpmorgan.com>
Sent: December 4, 2020 10:39 AM
To: public-md-odrl-profile@w3.org
Subject: RE: Rights Automation Community Group - comments on draft Standard



We have some further feedback (slowly working through the draft...)



2.1.2.1.4 (Party Role/Administrator)



Based on the discussions in the last meeting, we think this is a licensing agent type role, which might be performed by an originator, or a provider, or might be outsourced by one of those to a third party; the comments below still apply; we think the role still exists even if the function is retained by the originator or the provider, and the duties and constraints would be similar (although additional duties could be imposed if the work was handled by a third party). We are trying to simplify the ODRL here by avoiding the need for duplicate duties in the case where a provider might be handling the licensing or it might have been handed to a third party - it feels cleaner to say "whoever is handling the licensing, regardless of other roles they might perform, has the following duties/constraints".



2.3 (Activities)



Firstly, we feel that "Activities" isn't the best name for this; these feel more like "Events".  You have a "Data Use" event, or an "Audit" event, for example, which may be used as triggers in certain rules.  How do others feel about this?  Are we missing the intent, here?



We feel that "Usage", "Audit" and "Derivation" would then be first class "Events", but that "Trading (and sub classes)", "Training", "Technical Support", "Quality Assurance", "Product development" and "Marketing" are types of "Usage" event (i.e. purposes for which the asset may be used).
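
One way to picture the suggested grouping is the Python sketch below. It is an assumption-laden illustration, not a proposal for the profile's vocabulary: the names `Event`, `USAGE_PURPOSES`, and `classify` are hypothetical.

```python
from enum import Enum


class Event(Enum):
    """First-class events, as suggested above."""
    USAGE = "Usage"
    AUDIT = "Audit"
    DERIVATION = "Derivation"


# Purposes for which an asset may be used; each is treated as a kind of Usage event.
USAGE_PURPOSES = {
    "Trading",
    "Training",
    "Technical Support",
    "Quality Assurance",
    "Product development",
    "Marketing",
}


def classify(activity: str) -> Event:
    """Map an activity name from the draft to its first-class event."""
    if activity in USAGE_PURPOSES:
        return Event.USAGE
    # Look up by value; raises ValueError for anything that is not an event.
    return Event(activity)
```

Under this sketch a name such as "Closed User Group" is rejected with a `ValueError`, since it is neither a first-class event nor a usage purpose.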



For "Control", we think that "Closed User Group", "Provider Managed" and "Consumer Managed" are relevant concepts, but don't really fit in this section.  There may be "Control Events" - like granting user access rights or revoking them, but a "Closed User Group" is a thing, not an activity or a state change.  Maybe Control belongs in "Other Things of Interest", or as a section in its own right?



2.3.1.3 (Reasonable Suspicion)

Isn't the actual event here a "Licensing Breach"?  The Licensor has "reasonable suspicion" that the contract terms are being violated and seeks to impose a remedy?



2.3.1.4.1 (Closed User Group)

We prefer the term "Restricted User Group" (closed somewhat implies not subject to change, which feels wrong).  Also the description is a bit too prescriptive about the form of controls on users - most contracts use more open-ended wording like a need for "adequate technical controls", without specifying the form they take.  Perhaps "uniquely identified users" would be better?  Furthermore, should this be users, or entities?  Could users be "applications"?



2.3.1.5 (Derivation)

Aren't "Irreversible" and "Non-Substitutive" attributes of Derivations, rather than sub-classes (or perhaps they are Constraints)?  Most contracts I've seen require that a derivation possess both properties to be acceptable.  The newly created asset would still be a derived work if it lacked these attributes, but the contract wouldn't actually grant you the ability to do anything with it.  You would, however, have some innate rights in the derived work if the calculations performed incorporated your own IP as well, so it could be the case that nobody else could do anything with it without a license grant from yourself, so you can't quite treat it as a copy of the original data, over which originator/provider have control.
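
A minimal Python sketch of the attribute-based view follows. The class `Derivation`, its fields, and `is_permitted` are hypothetical names chosen for this example; the both-properties test reflects the typical contract described above and is an assumption, not a rule taken from the profile.

```python
from dataclasses import dataclass


@dataclass
class Derivation:
    """Hypothetical: a derived work carrying the two properties as attributes."""
    irreversible: bool      # the original data cannot be recovered
    non_substitutive: bool  # the altered data cannot be used in place of the original


def is_permitted(derivation: Derivation) -> bool:
    """Typical contract test described above: both properties are required."""
    return derivation.irreversible and derivation.non_substitutive


# Illustrative cases, invented for this example.
compliant = Derivation(irreversible=True, non_substitutive=True)
recoverable = Derivation(irreversible=False, non_substitutive=True)
```

Both objects are still instances of `Derivation`; only `compliant` passes the contract test, which mirrors the point that a work lacking these attributes remains a derived work but carries no granted rights.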





Nigel & Michelle



________________________________

Nigel Phelan | Corporate & Investment Bank | Market Data Services | J.P. Morgan



From: Phelan, Nigel (CIB Tech, GBR)
Sent: 25 November 2020 15:08
To: 'public-md-odrl-profile@w3.org' <public-md-odrl-profile@w3.org>
Subject: Rights Automation Community Group - comments on draft Standard



Hi Ben - Michelle and I have been reviewing the standard this week; it's taking a while and we expect to continue providing feedback on it in the coming weeks as we get further through it, but we had some initial comments on content up to section 2.2.2



2.1.1 - Party Types



We feel that some contracts require further differentiation of party types than simply "Internal Party" and "External Party".  We see the following distinction:



Internal

*       Licensee consumers who are covered by the License entity

*       Licensee affiliates, who sometimes have lesser rights than the Licensee consumers

External

*       Professional consumers - described as using for business purposes, which can include commercial use

*       Non-professional consumers - described as using for personal or non-commercial use

*       Public website consumer - where there are typically no distinctions between the capacities in which an individual may be using the data



All of these are defined with different obligations on the use case, permissioning and commonly contracts require that you distinguish them in constraints or duties.  We think it is relevant to have these different categories to allow differentiation on the use cases.  How could we model this?  Does it make more sense to introduce qualifiers on Party Types or to subclass the Internal Party and External Party types?  The external case could potentially be handled by using duties that apply to external parties performing particular roles (assuming we define those roles), but the internal ones do look more like distinct subclasses of internal party.
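
To make the subclassing option concrete, here is a small Python sketch. All class names and the `rule_applies` helper are hypothetical; the alternative (a qualifier on the two existing Party Types) would instead be an extra field on `InternalParty` and `ExternalParty`.

```python
class Party:
    """Base class for any party (all names here are hypothetical)."""


class InternalParty(Party):
    """Existing party type in the draft."""


class ExternalParty(Party):
    """Existing party type in the draft."""


class LicenseeConsumer(InternalParty):
    """Consumer covered by the License entity."""


class LicenseeAffiliate(InternalParty):
    """Affiliate, sometimes with lesser rights than Licensee consumers."""


class ProfessionalConsumer(ExternalParty):
    """Uses the data for business purposes."""


class NonProfessionalConsumer(ExternalParty):
    """Uses the data for personal or non-commercial purposes."""


class PublicWebsiteConsumer(ExternalParty):
    """Capacity of use is typically not distinguished."""


def rule_applies(rule_party_type: type, party: Party) -> bool:
    """A rule written for a party type also covers that type's subclasses."""
    return isinstance(party, rule_party_type)
```

With this shape, a duty written against `InternalParty` covers both internal subclasses, while a duty written against `LicenseeAffiliate` covers affiliates only.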



2.1.2 Party Roles



We query the definition of "Service Facilitator".  The existing definition suggests that it refers to entities that are "assisting in the delivery of data services"; we feel that there are cases where it would be more appropriate to talk about them being an external organisation contracted by a Party to use the Party's data access rights to perform a business function (which might encompass generating derived data for the Party, fitting the existing definition, but might involve other activities).



On "Administrator" you state this must be an external entity.  We may be missing the intent here; we were thinking this is the function that controls downstream access to the data assets and potentially has reporting / notification duties.  If so, we would see this as potentially either an external or an internal party.  Some duties might only apply in the case where it was an external party, but that could be accommodated by qualifying the duty by party type.



2.2.1 Resource Types

The use of "former" and "latter" is slightly confusing here; you identify three cases:

*       the original data resource created by the Originator

*       the resource "at rest" in an organisation before Rules are applied to control its use

*       the contolled(typo) resource that is received by a Consumer



Is the intended interpretation that the first two are of type "Source" and the third "Asset", or is the at rest resource something other than a Source (in which case, what is it?).  We note that there are licensing terms related to the storage of historical time series data that might require you to distinguish between the first two cases.





That's all we have come up with to date.







Nigel



________________________________

Nigel Phelan | Corporate & Investment Bank | Market Data Services | J.P. Morgan



Received on Thursday, 10 December 2020 20:16:06 UTC