Re: Clarity on DID Resolution as it refers to DID WG from Justin Richer on 2020-04-21 (public-did-wg@w3.org from April 2020)

From: Justin Richer <jricher@mit.edu>
Date: Tue, 21 Apr 2020 10:54:23 -0400
To: Ivan Herman <ivan@w3.org>
Cc: Manu Sporny <msporny@digitalbazaar.com>, W3C DID Working Group <public-did-wg@w3.org>
Message-Id: <3489AFC5-AF8B-421A-A7E0-2FCD415F9BEB@mit.edu>
That makes sense to me, and therefore if we can drop the two statements from the bulleted list about how it is tested then I think we can move forward in agreement with everything else.

 — Justin

> On Apr 21, 2020, at 10:27 AM, Ivan Herman <ivan@w3.org> wrote:
> 
> Thank you Justin.
> 
> But that goes back to what I said. The question is whether it is in the scope of our charter to define
> 
>  resolve ( did , input-metadata ) -> ( did-document , did-document-metadata , resolution-metadata )
> 
> Indeed (I presume that is where Manu is coming from) if we decide that it is in scope and we add the contracts to the final spec, we also have the obligation to test the statement. But it is not a charter issue to define what the testing approach should be. As a consequence, there is no charter interpretation question related to the tests.
> 
> It is up to the WG to decide how to provide a convincing set of arguments to the Director on the goal of the CR testing (see my previous mail). There is no obligation to have any kind of automatic testing, for example. If there are at least two implementations that do implement this function, thereby testing _our specification_, then we are fine. If we do more, that is even better, of course, but that is beside the point at this moment.
> 
> To repeat myself: the (maybe somewhat counterintuitive) goal of the CR test is to validate the specification. It is not the goal of the CR test to validate the implementations. 
> 
> What this means is that, in my view, in the long list of bullet items on "It is in the scope…" I think that the one on testing, ie, the one you actually disagree upon, is not relevant at this point. If the WG agrees on the other bullet points, then we are fine. We will have to fight that out when we come to the CR phase, as part of the overall discussion on how we will test the CR.
> 
> Talk to you soon,
> 
> Ivan
> 
> 
> 
>> On 21 Apr 2020, at 15:37, Justin Richer <jricher@mit.edu> wrote:
>> 
>> Ivan, thanks for the insight on the charter and its interpretations within the W3C. 
>> 
>> The disagreement that Manu and I have is exactly about :how: to test a particular set of requirements implied by the text. So let’s take the resolve function that is in my pull request:
>> 
>>  resolve ( did , input-metadata ) -> ( did-document , did-document-metadata , resolution-metadata )
>> 
>> We are defining what this function looks like in terms of requirements on its inputs and outputs. There are two ways to test a function like this: you are either testing what’s on either side of the arrow, or you’re testing the arrow itself.
>> 
>> I am arguing that since we are defining things solely in terms of the inputs and outputs, it’s up to us to test those inputs and outputs. So you’d mock out an implementation and have it take in a bunch of generated DIDs and input metadata requests, both good and bad, and return a response for each, both good and bad. This tests the software that is calling the resolve() function.
>> 
>> The other way to test this, which Manu and I disagree about below, is to test the arrow, that is, to test implementations of this function. This is what testing a resolver does: the test suite hands the resolver inputs and processes the outputs. This is the inverse of having the test suite pretend to be a resolver, taking inputs and giving outputs. The two testing methods are complementary and both test the function definition. While they’re both valuable, we are more concerned with the former (testing the inputs and outputs against a mock resolver) than with the latter (testing a resolver implementation) at this time.
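As a rough illustration of the two directions described above, here is a minimal Python sketch. Everything in it is an assumption made for illustration: the dictionary-based types, the names mock_resolve, caller_under_test and check_resolver, the "did:example:" prefix check, and the "not-found" error value are hypothetical and are not taken from the pull request or any specification. Only the shape resolve(did, input-metadata) -> (did-document, did-document-metadata, resolution-metadata) comes from the thread.

```python
from typing import Callable, Optional, Tuple

# Hypothetical types: the thread only fixes the three-part result shape.
ResolveResult = Tuple[Optional[dict], dict, dict]
Resolver = Callable[[str, dict], ResolveResult]


def mock_resolve(did: str, input_metadata: dict) -> ResolveResult:
    """Test-suite stand-in for a resolver: returns canned good and bad results."""
    if not did.startswith("did:example:"):
        # Hypothetical error value, chosen only for this sketch.
        return None, {}, {"error": "not-found"}
    return {"id": did}, {}, {}


def caller_under_test(resolve: Resolver, did: str) -> str:
    """Stand-in for software that calls resolve() and processes its outputs."""
    document, _doc_metadata, resolution_metadata = resolve(did, {})
    if document is None:
        return "failed: " + resolution_metadata.get("error", "unknown")
    return "resolved: " + document["id"]


def check_resolver(resolve: Resolver, known_did: str) -> bool:
    """Testing the arrow: hand a resolver inputs and inspect its outputs."""
    document, doc_metadata, resolution_metadata = resolve(known_did, {})
    return (
        isinstance(doc_metadata, dict)
        and isinstance(resolution_metadata, dict)
        and (document is None or document.get("id") == known_did)
    )
```

In the first direction the test suite supplies mock_resolve and observes how caller_under_test handles good and bad responses. In the second, the suite passes a real resolver implementation to check_resolver. Both exercise the same function definition from opposite sides.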
>> 
>> The disagreement on this point, that I can see, is that Manu contends that the act of normatively defining the function above means we, as a group, now have a mandate to have automated tests for resolver implementations. It is my stance that this is not the case, and that automated testing of the function can be accomplished in different ways. In fact, that it :should: be accomplished in different ways given the context of the function definition. 
>> 
>> It’s like what we did in the OpenID Foundation: we created tests for IdPs first, and only later created tests for RPs. Both of these exercised the same specifications, but from different directions and for different code implementations. 
>> 
>>  — Justin
>> 
>>> On Apr 21, 2020, at 2:54 AM, Ivan Herman <ivan@w3.org> wrote:
>>> 
>>> 
>>> 
>>>> On 20 Apr 2020, at 23:37, Manu Sporny <msporny@digitalbazaar.com> wrote:
>>>> 
>>>> On 4/20/20 4:45 PM, Justin Richer wrote:
>>>>> I am hoping we’re not as far apart on our interpretation of
>>>>> things as it might seem, and so here is my counter proposal
>>>>> to your set of conclusions on the interpretation of the
>>>>> charter bullet point list
>>>> 
>>>> Thanks for the concrete counter-proposal, Justin. I'll go through each
>>>> item to clarify the proposal we may want to put in front of the group:
>>>> 
>>>>> * It is in scope for the DID WG to normatively define the
>>>>>   parameters of a concrete set of processes that take a DID as
>>>>>   input and provide a DID Document as output.
>>>> 
>>>> Agree.
>>>> 
>>>>> * It is in scope to normatively define the parameters
>>>>>  of a concrete set of processes that take a DID URL
>>>>>  as input and provide a resource result as output.
>>>> 
>>>> Agree.
>>>> 
>>>>> * It is in scope to make these processes take in options and
>>>>>  provide back a document along with different
>>>>>  classes of metadata (e.g., document metadata and
>>>>>  resolution metadata).
>>>> 
>>>> Agree.
>>>> 
>>>>> * It is in scope to add tests to the test suite that exercise
>>>>>  the callers of these processes and test the generation
>>>>>  of DIDs and DID URLs as well as the processing of
>>>>>  DID Documents and associated metadata. 
>>>> 
>>>> Close. Strike "the callers". There are multiple valid ways to test this
>>>> and we should avoid painting ourselves into a corner.
>>>> 
>>>>> * It is out of scope to add resolution tests to the test
>>>>>  suite that exercise the generalized DID resolution
>>>>>  process in the specification on concrete DID Method
>>>>>  implementations.
>>>> 
>>>> Disagree.
>>> 
>>> Coming in a bit late in the discussion, and not being present on the topic calls: I think we would need some more information on what these statements, and the disagreements around them, really mean.
>>> 
>>> From a process point of view: the charter, and indeed the W3C process, does _not_ give any restriction on _how_ a particular feature is tested. The goal of the CR testing, again per process, is to prove that the specification (more exactly, the normative part of the specification) is consistent, is implementable (usually required to be implemented by two independent implementations of some sort), and is interoperable, ie, independent implementations implement the features the same way based on the specification and based _only_ on the specification. How exactly this testing is done is up to the WG to define and the WG must convince the Director that the testing methodology is suited to these general goals.
>>> 
>>> Testing can be very different. Browser-dependent features (eg, various Web APIs) are usually tested these days via a complex and (semi-)automatic test suite that makes use, for example, of headless clients. Other testing approaches define some sort of a test manifest format accompanied by a large collection of test cases; implementations self-test and return a manifest instance of their testing results to the WG (that usually compiles a suitable report out of those). This is what is happening, for example, in the JSON-LD Working Group right now. The situation is even more complex when the specification is, for example, a vocabulary: there are no 'executable' tests, and the tests are more to prove that "implementations" (ie, users) of those vocabularies really make use of the defined terms. Etc.
>>> 
>>> Can someone describe to me, in light of that, what exactly is meant by these items and what the underlying disagreements are? Because my first reaction is that how tests are done is not a matter of the charter, i.e., the interpretation of the charter in the first place!
>>> 
>>> Thanks
>>> 
>>> Ivan
>>> 
>>> 
>>> 
>>>> 
>>>>> * It is out of scope to normatively define DID Method specific
>>>>>  details of implementing DID resolution.
>>>> 
>>>> Agree.
>>>> 
>>>>> * It is out of scope to normatively define DID Resolution
>>>>>  protocols.
>>>> 
>>>> Agree.
>>>> 
>>>>> * It is out of scope to test concrete DID Resolution
>>>>>  protocols and data formats beyond the necessary
>>>>>  process to demonstrate interoperability between the
>>>>>  test suite and an implementation.
>>>> 
>>>> Agree.
>>>> 
>>>> So, we're very close. :)
>>>> 
>>>> -- manu
>>>> 
>>>> -- 
>>>> Manu Sporny - https://www.linkedin.com/in/manusporny/
>>>> Founder/CEO - Digital Bazaar, Inc.
>>>> blog: Veres One Decentralized Identifier Blockchain Launches
>>>> https://tinyurl.com/veres-one-launches
>>>> 
>>> 
>>> 
>>> ----
>>> Ivan Herman, W3C 
>>> Home: http://www.w3.org/People/Ivan/
>>> mobile: +31-641044153
>>> ORCID ID: https://orcid.org/0000-0003-0782-2704
> 
> 
> ----
> Ivan Herman, W3C 
> Home: http://www.w3.org/People/Ivan/
> mobile: +31-641044153
> ORCID ID: https://orcid.org/0000-0003-0782-2704
>
Received on Tuesday, 21 April 2020 14:54:45 UTC