Re: Bodies resource from Benjamin from Doug Schepers on 2015-09-01 (public-annotation@w3.org from September 2015)

From: Doug Schepers <schepers@w3.org>
Date: Tue, 1 Sep 2015 13:06:35 -0400
To: Benjamin Young <bigbluehat@hypothes.is>, Robert Sanderson <azaroth42@gmail.com>
Cc: W3C Public Annotation List <public-annotation@w3.org>
Message-ID: <55E5DB1B.5050609@w3.org>

Hi, Benjamin–

I realize that you were probably just putting out a strawman for 
discussion, and that you were probably making a different point, but 
since you are talking in code, I thought it would be useful to make a 
specific point about your code.

Just a high-level response, inline…

On 9/1/15 11:40 AM, Benjamin Young wrote:
> On Tue, Sep 1, 2015 at 11:21 AM, Robert Sanderson wrote:
>
>
>         Where this is trending now in my head is that we *keep*
>         motivation on the annotation, but create classes for bodies.
>         What this *might* look like in JSON-LD is something like:
>
>         ```
>         {
>            "type": "Annotation"
>            "motivation": "editing",
>            "bodies": {
>              "tags": ["correction", "typo"],
>              "comment": "wow...I should learn to type...",
>              "edit": {
>                "original": "itinirary",
>                "replacement": "itinerary"
>              },

This should not be necessary, under any of the proposals we'd been 
considering thus far.

My immediate reaction was (I think) similar to Rob's:

>     * A pattern for extension that doesn't involve subProperties is what
>     we have now.

If I'm reading Rob correctly, this means that none of the bodies (or 
targets) should have special sub-properties (or sub-structures) of the 
same type (e.g. motives/motivations/roles) that require special parsing 
or processing.

(Note that Target does have Selectors each with idiosyncratic 
properties, but in this case, I think it's unavoidable and they are 
clearly defined.)


Without making any judgment for or against other aspects of your 
strawman, and keeping everything else the same to isolate this single 
point for discussion, here's how I'd reformulate your strawman:

  ```
  {
     "type": "Annotation",
     "motivation": "editing",
     "bodies": {
       "tags": ["correction", "typo"],
       "comment": "wow...I should learn to type...",
       "edit": "itinerary",
       "related": ["http://dictionary.reference.com/browse/itinerary"]
     },
     "target": {
       "source": "http://example.com/doc1",
       "selector": {
         "type": "oa:TextQuoteSelector",
         "exact": "itinirary"
       }
     }
  }
  ```

Yes, it's slightly longer, but it has the same functionality, and it 
avoids two crucial problems:

1) the needless duplication of information;
1a) you'd need a TextQuoteSelector in the target anyway to correctly 
anchor the selection;
1b) mechanisms that duplicate information in multiple places are prone 
to getting out of sync and causing problems;

2) the need for idiosyncratic and potentially unpredictable additional 
structures or properties within a known type of property;
2a) this makes processing more difficult even for known structures of 
this type;
2b) introducing such a structure into an extension point sets a pattern 
that makes graceful degradation very difficult.
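
To make point 1 concrete, here is a minimal sketch of how a consumer might 
read the reformulated annotation. It assumes the annotation has been parsed 
into a plain Python dict; the function name `describe_edit` is hypothetical 
and not part of any spec. The original text comes from the target's 
TextQuoteSelector and the replacement from the "edit" body, so neither value 
has to be stored twice.

```python
# Hypothetical consumer of the reformulated strawman; not part of any spec.
annotation = {
    "type": "Annotation",
    "motivation": "editing",
    "bodies": {
        "tags": ["correction", "typo"],
        "comment": "wow...I should learn to type...",
        "edit": "itinerary",
        "related": ["http://dictionary.reference.com/browse/itinerary"],
    },
    "target": {
        "source": "http://example.com/doc1",
        "selector": {"type": "oa:TextQuoteSelector", "exact": "itinirary"},
    },
}


def describe_edit(anno):
    """Return (original, replacement) for an editing annotation.

    The original text is read from the target selector, the replacement
    from the "edit" body; no nested "edit" structure is required.
    """
    original = anno["target"]["selector"]["exact"]
    replacement = anno["bodies"]["edit"]
    return original, replacement


original, replacement = describe_edit(annotation)
print(f"replace {original!r} with {replacement!r}")
```

Because "edit" is a plain string, a processor that doesn't understand it can 
still treat it like any other simple body value.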


And, again, it's not necessary. I think it's useful for us to talk 
about these edge cases (and central use cases) because it helps us 
validate that our design is practical and versatile. In this case, you 
wrote some strawman code that might well have been done by a developer 
unfamiliar with the data model's design principles, and we were able to 
reformulate it into something that easily avoids the problems.

This tells me 2 things:

1) the data model is strong and flexible;

2) we need to be really clear about how the model works, in terms the 
average developer can understand, and show explicitly how to add 
extensions (where they can be added, and how they should be structured); 
we can provide examples to make it clearer (like Rob's “antecedent” and 
“subsequent” motives).




On a related topic (which I'm putting here just to capture it)…
Note that my formulation has a somewhat interesting side effect. 
Since the TextQuoteSelector doesn't have a "prefix" or "suffix", it's 
ambiguous which instance of the "exact" quote value "itinirary" it's 
referring to, if there was more than one misspelling in the same 
document. Is it the first instance? The last instance? All instances? Is 
this a hack for spellcheck, or an abuse of the data model? Should this 
be expressed as multiple targets? Or should we define some "all 
instances" property? Or should we require a "prefix" and/or "suffix"? Is 
the Data Model the right place to define UA behavior for resolving 
selectors? Or should there be another spec, perhaps something that 
defines UA behavior for selectors in terms of RangeFinder and other APIs?
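
A rough sketch of the ambiguity, assuming a naive exact-string matcher (a 
real UA would likely need fuzzier matching, and nothing here is specified 
behavior). The function `find_quote` and the sample document text are 
hypothetical. Without "prefix" or "suffix" the quote matches more than once; 
adding a "prefix" narrows it to a single instance.

```python
def find_quote(text, exact, prefix="", suffix=""):
    """Return the start offsets of `exact` wherever it appears in `text`
    immediately preceded by `prefix` and followed by `suffix`.

    Naive exact matching only; an illustration, not specified UA behavior.
    """
    needle = prefix + exact + suffix
    offsets = []
    start = text.find(needle)
    while start != -1:
        # Report where `exact` itself begins, not where the prefix begins.
        offsets.append(start + len(prefix))
        start = text.find(needle, start + 1)
    return offsets


# Hypothetical document containing the same misspelling twice.
doc = "Our itinirary changed. See the new itinirary below."

all_matches = find_quote(doc, "itinirary")            # ambiguous: two matches
one_match = find_quote(doc, "itinirary", prefix="new ")  # narrowed to one

print(all_matches)
print(one_match)
```

Whether an empty "prefix"/"suffix" should mean "first match", "all matches", 
or an error is exactly the open question above; this sketch just returns 
every match and leaves the policy to the caller.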

Food for thought.

Regards–
–Doug
Received on Tuesday, 1 September 2015 17:06:40 UTC