Re: anno-ACTION-29: Create draft proposal for rendering state/selector from Robert Sanderson on 2015-11-25 (public-annotation@w3.org from November 2015)

From: Robert Sanderson <azaroth42@gmail.com>
Date: Wed, 25 Nov 2015 10:26:12 -0800
Cc: W3C Public Annotation List <public-annotation@w3.org>
Message-ID: <CABevsUH=4+GrG7f7V1u5eQi2LdyWHjCHCh9VdtO54e1qow16EQ@mail.gmail.com>
Meta -- is there a GitHub issue for this for tracking? If not, can we create
one please?

+1 to the use case and functionality addition. There doesn't seem to be
another way to accommodate it, and the implications for the model and
vocabulary are relatively minor.
I propose that we accept (if we haven't already) this as a feature to
include in the current specs.

I'm fine with a carefully worded MAY. We wouldn't want a validation system
to throw a warning for every SpecificResource that does not have a
renderedVia property, for example.
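
As a rough sketch of what that could mean for a validator (Python; the
function name and warning strings are made up for illustration, and
`renderedVia` is still only the key proposed in this thread), a missing
property would pass silently and only a present-but-empty one would be
flagged:

```python
def check_specific_resource(target):
    """Return warnings for a SpecificResource-like dict (hypothetical checker)."""
    warnings = []
    if "source" not in target:
        warnings.append("SpecificResource has no source")

    renderer = target.get("renderedVia")  # proposed key, treated as optional (MAY)
    if renderer is None:
        return warnings  # absent: deliberately no warning

    # Only complain when the property is present but identifies nothing.
    if not isinstance(renderer, dict) or not (renderer.get("id") or renderer.get("name")):
        warnings.append("renderedVia is present but names no software")
    return warnings
```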

I think there's a difference between Rendering Agent and State.  State
should get you the right representation, but then the client still needs to
render it.  Other use cases that have been discussed in this space are 3d
models, where you need to get the right representation (such as via conneg)
and then render the model appropriately such that the annotation is
visible.  Also formats that are not implemented in browsers, such as
Jpeg2000 or (previously?) WebP.
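
A small sketch of that split, in Python. Everything here is an assumption
for illustration: the helper names are invented, `renderedVia` is the key
proposed below, and I am assuming a state of type `HttpRequestState` whose
`value` is a single header line.

```python
def headers_from_state(target):
    """Step 1 (State): headers needed to fetch the right representation."""
    state = target.get("state")
    if not isinstance(state, dict) or state.get("type") != "HttpRequestState":
        return {}
    name, sep, value = state.get("value", "").partition(":")
    return {name.strip(): value.strip()} if sep else {}


def recorded_renderer(target):
    """Step 2 (rendering agent): the software the selectors were made against."""
    renderer = target.get("renderedVia")
    if not isinstance(renderer, dict):
        return None  # nothing recorded; the client has to guess
    return renderer.get("name"), renderer.get("softwareVersion")


# Hypothetical target carrying both pieces of information.
target = {
    "type": "SpecificResource",
    "source": "http://example.com/",
    "state": {"type": "HttpRequestState", "value": "Accept: application/pdf"},
    "renderedVia": {"type": "Software", "name": "PDF.js", "softwareVersion": "1.1.114"},
}
```

The first helper only tells the client how to ask for the representation;
the second tells it what the annotating client used to display it.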

Rob


On Wed, Nov 25, 2015 at 9:01 AM, Ivan Herman <ivan@w3.org> wrote:

>
> On 25 Nov 2015, at 17:44, Benjamin Young <bigbluehat@hypothes.is> wrote:
>
> On Wed, Nov 25, 2015 at 6:09 AM, Ivan Herman <ivan@w3.org> wrote:
>
>> Benjamin,
>>
>> I am fine with what you propose, and have only one minor remark. You say:
>>
>> “If a target is transformed by the user agent, information about the
>> software used SHOULD be recorded, so that future consuming applications
>> (and their developers) can make use of the information in order to optimize
>> selection and rendering.”
>>
>> I have the impression that SHOULD is too strong. This whole thing may
>> very well depend on the application setup and usage environment, and MAY
>> may (sic!) be enough. But I admit I do not have a very strong opinion on
>> this.
>>
>
> My concern with it being “lowered” to a MAY is that it’ll just get
> overlooked and not done. SHOULD is still less than MUST after all. :)
>
>
> Well… SHOULD, in our terminology (i.e., RFC 2119) means:
>
> > SHOULD  This word, or the adjective "RECOMMENDED", mean that there may
> exist valid reasons in particular circumstances to ignore a particular
> item, but the full implications must be understood and carefully weighed
> before choosing a different course.
>
> It *is* a fairly strong requirement. Whereas
>
> > MAY This word, or the adjective "OPTIONAL", mean that an item is truly
> optional.  One vendor may choose to include the item because a particular
> marketplace requires it or because the vendor feels that it enhances the
> product while another vendor may omit the same item. An implementation
> which does not include a particular option MUST be prepared to interoperate
> with another implementation which does include the option, though perhaps
> with reduced functionality. In the same vein an implementation which does
> include a particular option MUST be prepared to interoperate with another
> implementation which does not include the option (except, of course, for
> the feature the option provides.)
>
> Seems to fit the situation better. It is still stronger than what you
> describe above
>
> Ivan
>
>
>
> Without this information the (rash) assumption is made that the resource
> was viewed via the browser’s rendering engine. If that document were a PDF,
> EPUB, or even a CSV served by a system providing annotation today, that’d
> very likely not be the case.
>
> There are also scenarios, such as the Scalar CMS [1], which ships a
> “heavily” semantic RDFa HTML document to the browser and then fully
> transforms the thing on arrival into a different DOM. Such that if you were
> to annotate against the rendered thing, the selectors you stored would not
> match the resource you’d GET back.
>
> [1] http://scalar.usc.edu/scalar/
>
> If this got lowered to a MAY, my guess is that it would only be recorded
> when the self-focused consuming application cared to care about its future
> self—which, honestly, doesn’t even happen as often as one may expect. It
> certainly wouldn’t happen with a focus on providing value to others who are
> considering the selectors made against the resource.
>
> However, I will add that this render scenario may be better framed as a
> State class…in which case…your thought to lower this to MAY seems reasonable…
> http://w3c.github.io/web-annotation/model/wd/#states
>
> I’ll keep thinking about it. I’d prefer SHOULD, but…it’s not the biggest
> fish we’re frying here…
>
> Thanks!
> Benjamin
> —
> Developer Advocate
> http://hypothes.is/
>
>
>
>> Ivan
>>
>>
>> On 24 Nov 2015, at 20:45, Benjamin Young <bigbluehat@hypothes.is> wrote:
>>
>> On Tue, Nov 17, 2015 at 6:08 AM, Ivan Herman <ivan@w3.org> wrote:
>>
>>>
>>> On 16 Nov 2015, at 23:14, Benjamin Young <bigbluehat@hypothes.is> wrote:
>>>
>>> This one's been on my list for a while, and I've dug into the space
>>> several times, and here (at last) is the output.
>>>
>>> I dug through the following vocabularies:
>>>  - prov-o:SoftwareAgent
>>>  - dcterms:Software
>>>    - we already include `dcterms` as a prefix
>>>  - foaf:Agent
>>>    - `foaf` is also included already
>>>  - doap
>>>    -
>>> https://github.com/edumbill/doap/wiki/JSON-LD-@context-for-package.json
>>>    - non-normative afaik
>>>    - context doc is incomplete (at best), but DOAP is used at Apache &
>>> PyPI among others
>>>  - schema.org's SoftwareApplication
>>>    - by far the most thorough that I looked into
>>>    - according to some chats at TPAC, we can reference them now/soon
>>>
>>> PROV-O has the right terminology in the form of `prov:qualifiedUsage`
>>> and its `prov:Usage` class. Schema.org's
>>> schema:SoftwareApplication (and possibly schema:WebApplication) include a
>>> couple properties (and perhaps a few others) which are clearer in their
>>> specification than `dcterms`, `foaf`, etc. More info below.
>>>
>>> Here's an example of what I'm thinking based on the above--I'm using
>>> full CURIE names for the new keys. The others are specified in
>>> http://www.w3.org/ns/anno.jsonld
>>> ```
>>> {
>>>   "type": "Annotation",
>>>   "target": {
>>>     "type": "SpecificResource",
>>>     "source": "http://example.com/",
>>>     "selector": [{...}, {...}],
>>>     "prov:qualifiedUsage": {
>>>       "type": ["prov:Usage", "prov:EntityInfluence"],
>>>       "prov:entity": {
>>>         "id": "https://github.com/mozilla/pdf.js/releases/tag/v1.1.114",
>>>         "type": "Software",
>>>         "name": "PDF.js",
>>>         "schema:softwareVersion": "1.1.114",
>>>         "schema:browserRequirements": "HTML5, CSS, JavaScript"
>>>       }
>>>     }
>>>   },
>>>   "generator": {
>>>     "type": "Software",
>>>     "name": "Hypothes.is",
>>>     "homepage": "http://hypothes.is/"
>>>   }
>>> }
>>> ```
>>>
>>> What I'd like this to end up looking like, however, is this example:
>>> ```
>>> {
>>>   "type": "Annotation",
>>>   "target": {
>>>     "type": "SpecificResource",
>>>     "source": "http://example.com/",
>>>     "selector": [{...}, {...}],
>>>     "renderedVia": {
>>>       "id": "https://github.com/mozilla/pdf.js/releases/tag/v1.1.114",
>>>       "type": "Software",
>>>       "name": "PDF.js",
>>>       "softwareVersion": "1.1.114",
>>>       "browserRequirements": "HTML5, CSS, JavaScript"
>>>     }
>>>   },
>>>   "generator": {
>>>     "type": "Software",
>>>     "name": "Hypothes.is",
>>>     "homepage": "http://hypothes.is/"
>>>   }
>>> }
>>> ```
>>>
>>> Turning the `prov:qualifiedUsage a prov:Usage; prov:entity` chain into
>>> just `renderedVia` (or similar) is beyond my physic.
>>>
>>>
>>> I would be worried about bringing prov into this. There is a relatively
>>> complex model behind the Provenance ontology[1], and trying to fuse these
>>> things seems to be way more than what we need.
>>>
>>> [1] http://www.w3.org/TR/2013/REC-prov-dm-20130430/
>>>
>>>
>>> Is something like that even possible? If it's not, then I propose we
>>> define `renderedVia` to mean essentially that--the usage of this
>>> SpecificResource (in this case that target source + this selector list) is
>>> qualified by the use of this rendering system...such that these selectors
>>> may not work if the rendering scenario is different.
>>>
>>>
>>> I think that doing it partially ourselves, and using Schema and/or
>>> DOAP terms, is more than enough for what we need. Let us try to keep it simple.
>>>
>>> My only problem with DOAP is that it is not really maintained by anyone
>>> right now. The last time I talked to Edd Dumbill, he seemed to suggest that he
>>> had moved on; the repo has not been touched for years. That being said, there
>>> is a decent deployment of his terms, so this is a possibility.
>>>
>>> There are some projects (I would have to dig them out) that are
>>> concerned about the scholarly references of software which could be used
>>> here, but I do not think we should spend too much energy on this.
>>> Users/implementations may use their own vocabularies if we just provide
>>> some sort of elementary scaffolding: that is all we need to do imho. I.e.,
>>> more or less what you have in the example above with name, homepage,
>>> software versions, maybe 1-2 terms more, and stop there…
>>>
>>
>> Agree completely. I did the PROV “homework” only because we’d used it
>> elsewhere. But I’m happy to consider it beyond what we need for this case.
>>
>> Given that, here’s (essentially) what I’d propose for the “richest”
>> expression of a renderer:
>>
>> ```
>> {
>>   "type": "Annotation",
>>   "target": {
>>     "type": "SpecificResource",
>>     "source": "http://example.com/",
>>     "selector": [{...}, {...}],
>>     "renderedVia": {
>>       "id": "https://github.com/mozilla/pdf.js/releases/tag/v1.1.114",
>>       "type": "Software",
>>       "name": "PDF.js",
>>       "homepage": "https://github.com/mozilla/pdf.js",
>>       "softwareVersion": "1.1.114"
>>     }
>>   },
>>   "generator": {
>>     "type": "Software",
>>     "name": "Hypothes.is",
>>     "homepage": "http://hypothes.is/"
>>   }
>> }
>> ```
>>
>> The only addition, then, to what we have in `generator` is the
>> `schema:softwareVersion` from http://schema.org/SoftwareApplication
>>
>> I removed (since my earlier example) the `browserRequirements` field as
>> it’s not really programmatically actionable as defined (it’s just “prose”
>> for a human being really). `softwareVersion` is also just Text, but there’s
>> some expectation there about what will be stored.
>>
>> It’s a bit tricky (I’ll admit) to measure the possibility of how this can
>> and would be used by others outside of the producing system.
>>
>> In Hypothes.is, my hope / plan is that we’ll store
>> it for our own documentation to understand what’s changed. If we’re
>> consistent about it, then we’ll at least know that the XPath selectors we
>> stored are likely useless if we’re now rendering via a newer version of
>> PDF.js…but sadly, comparing “1.2.2” to “1.1.114” is…not uncomplicated…
>>
>> So. I’m completely open to suggestions, but I think *not* giving people a
>> place to put this data (however “rough” it may be) is worse, as it’s
>> anybody’s guess what those selectors were built with.
>>
>> The essence would be:
>> “If a target is transformed by the user agent, information about the
>> software used SHOULD be recorded, so that future consuming applications
>> (and their developers) can make use of the information in order to optimize
>> selection and rendering.”
>>
>> …that’s a bit rough, but you get the idea. :)
>>
>> Thanks!
>> Benjamin
>> —
>> Developer Advocate
>> http://hypothes.is/
>>
>>
>>>
>>> Ivan
>>>
>>>
>>> I toyed (for a bit) using prov:qualifiedDerivation, but that seemed to
>>> require stating more about this new generated resource (the PDF.js
>>> representation of the resource referenced by the `target`) rather than
>>> about the `SpecificResource`--such as "when you use this SpecificResource
>>> you should care about this situation" vs. "hey look! PDF.js made a new
>>> thing out of this other thing...and you should care." Especially given that
>>> things like Text Quote Selector are pretty resilient (assuming only the
>>> structure of the document changes between formats, not the content) such that
>>> this "qualified usage" may or may not be of value to the consumer of the
>>> annotation for its own rendering / display of the document + the
>>> annotation.
>>>
>>> Regardless of which terms we use or what we call the key, the best place
>>> I could find to put it was as a property on `SpecificResource` as having it
>>> on the annotation (as generator is) sends the wrong message about what's
>>> affected by this information.
>>>
>>> Also, `renderedVia` seemed more correct than `hasRenderer`.
>>>
>>> Thoughts welcome!
>>> Benjamin
>>> --
>>> Developer Advocate
>>> http://hypothes.is/
>>>
>>> On Wed, Oct 21, 2015 at 11:36 AM, Web Annotation Working Group Issue
>>> Tracker <sysbot+tracker@w3.org> wrote:
>>>
>>>> anno-ACTION-29: Create draft proposal for rendering state/selector
>>>>
>>>> http://www.w3.org/annotation/track/actions/29
>>>>
>>>> Assigned to: Benjamin Young
>>>>
>>>
>>>
>>> ----
>>> Ivan Herman, W3C
>>> Digital Publishing Lead
>>> Home: http://www.w3.org/People/Ivan/
>>> mobile: +31-641044153
>>> ORCID ID: http://orcid.org/0000-0003-0782-2704
>>>
>>>
>>>
>>>
>>>
>>
>>
>
>


-- 
Rob Sanderson
Information Standards Advocate
Digital Library Systems and Services
Stanford, CA 94305
Received on Wednesday, 25 November 2015 18:27:05 UTC