Re: OA in HTML (was Annotation Serializations) from Robert Sanderson on 2014-01-20 (public-openannotation@w3.org from January 2014)

From: Robert Sanderson <azaroth42@gmail.com>
Date: Mon, 20 Jan 2014 12:17:55 -0800
To: Doug Schepers <schepers@w3.org>
Cc: public-openannotation <public-openannotation@w3.org>
Message-ID: <CABevsUEZ8ObYU_zRZovuo0EmE1=gRksPLVs9XoRqpCp4S15UQA@mail.gmail.com>

Hi Doug,

As often, I think we're in violent agreement :)

On Mon, Jan 20, 2014 at 11:53 AM, Doug Schepers <schepers@w3.org> wrote:

>
>> * A simple HTML-based serialization would be valuable
>>    -- Embedding an annotation in a page by hitting an API and getting
>> the HTML back
>>
>>
>> I think we're in danger of mixing up a few topics here: UI, API and
>> serialization.  Is the requirement for an API that returns pre-formatted
>> HTML for direct inclusion into other OWP applications, or is it an HTML
>> serialization of the data model that will be interpreted and rendered in
>> some way by a User Agent, perhaps using completely different HTML?  The
>> former implies, but does not require, a particular look and feel, such
>> as "a few minutes ago" in the time part of Doug's strawman HTML.
>>
>
> I hope it was clear that the strawman I made was meant as sort of an
> "idealized" and minimalist example of an annotation, with only some
> essential features.
>

Yes, it was.  I meant here that any HTML representation intended for direct
inclusion (à la tweet streams) into another app or page will necessarily
include styling and design, and thus standardization of that across vendors
will be, in my opinion, impossible and unnecessary.


> A real annotation produced by an authoring tool would likely be full of
> <div>s and <span>s and other cluttered markup inserted for other reasons
> (often for styling, or artifacts of composite generation). For example,
> view the source of a tweet, a Disqus comment, or a Facebook post; this is
> what will be generated. The key is that no matter what other junk was found
> in the content of the root element, certain well-formatted bits would be
> extracted as specifically mapping to the OA model, while the rest would be
> treated as body (or ignored).
>

Yes, which is why I'm keen to explore the limits of RDFa first before
turning to a home grown solution.



> I'd like to bring up another point: while HTML semantics might seem very
> lax to RDF folks, they are treated very seriously by many web
> developers and designers. They like consistent patterns, and if we can
> provide them some, that will go a long way toward making them comfortable
> with producing distillable annotations.


+1.  And if there are recommendations as to providing a more clearly defined
set of usage patterns for representing annotations in HTML, I'm all for it
:)

>> The API providing pre-formatted HTML seems very community and situation
>> specific, and thus difficult to standardize directly or effectively. You
>> would likely not want to include the same HTML into an EPUB reading
>> system, as inline into a web page of the same text, or into a stream of
>> the user's annotations due to the different contexts in which that same
>> annotation is being used.  So my perspective is that while this is good
>> background, it's not itself a requirement that we need to address in
>> this CG (or a potential future WG) without vendors first coming to us
>> with a need to interoperate in this way. On the other hand, having a
>> best practice for HTML serialized annotations such that the contents are
>> able to be understood, regardless of the exact manner in which they were
>> obtained, would be very valuable and the scope is much clearer.
>>
>
> I think you may have misunderstood what I meant by an API; I was talking
> about a client-side JavaScript API for the <note> element, not a
> server-side API for outputting HTML... though I think that's something
> people will do, and in fact, already do (again, see Twitter).


Yes, I wasn't meaning client side here, just the notion of the Twitter-like
stream of annotations in HTML.  So for my points above, think Twitter.  For
client side, that's another matter entirely that will be essential to
discuss and work on with the input of the stakeholders.


>> The use of RDFa, as Tim and Ivan discuss, is more clearly a
>> serialization topic -- how can RDFa be leveraged to provide a
>> serialization in HTML that is friendly to web developers?  This also
>> addresses the completeness issue that Doug brings up in his original
>> email.  I think it would be extremely presumptuous not to first do the
>> full RDFa mapping and see what we can come up with, perhaps recruiting
>> additional expertise in the area if needed to help us.
>>
>> Then we can assess the utility and friendliness of the mapping towards
>> Doug's points of adoption.  It may be that the mapping is great, and
>> hence no need to go any further, or it may be that vendors come back and
>> say it's too arcane and there should be further work done. But that's
>> for the future to determine :)
>>
>
> I have already discussed this informally with at least one vendor, and I
> got the feedback I expected: they want us to address their use cases, and
> are less interested in the data model unless it's bundled with other parts
> of the larger puzzle that will make the ecosystem work.
>

For sure. But the first step in having that conversation be more focused
is, IMO, to produce the RDFa mapping such that it can be evaluated.  No one
wants to make additional work that's unnecessary, but there's a fine line
between rigour and adoption.  At this point, I feel we should err on the
side of rigour as adoption cannot be determined without the direct
input from potential adopters... and they need something to give their
input on rather than just "we want to use HTML".  See also the analogous
JSON/JSON-LD topic too.


> I don't think we have the luxury to put it off to the future; if we don't
> get some key stakeholders from the beginning, and set the right tone for
> the WG, I don't see us getting W3C support for forming an Annotations WG.
> The data model is great, but it's not enough.
>

Said stakeholders would be strongly encouraged to join the community group
and discuss their requirements, or if that's not possible for IP or other
legal logistics, it would be great to share the details in some anonymized
fashion.


> And to be honest, I think that is as it should be; I don't think there's
> much chance for success with a working group unless we involve a broader
> set of stakeholders, including browser vendors, JavaScript library authors,
> annotation webapps services, and others, as well as the data-centric folks
> already on this list. Just standardizing a data model is not going to
> interest those other players, because there's nothing for them to do there,
> and no win for them or their constituents; we also need to talk about
> things like serializations, DOM events, selection anchoring, styling, and
> other topics; in other words, things that get implemented in browsers, and
> which will make doing annotations in browsers easier.
>

Agreed completely.  The only point I'd like to make is that "web developers
want this to be easier" is not a design constraint or even useful feedback
-- of course people want things to be easier, that's perfectly clear and
something that we've known from the inception of the CG.  What we need to
know is where the pain points are in implementation, what the use cases and
requirements are, and so forth, such that we can evaluate proposals against
agreed upon criteria, not personal anecdotes and feelings.   [Ahem, literal
bodies]

Hence, my suggestion is to follow the rigorous path first of generating
and discussing a description of how Annotation-in-HTML would look in RDFa,
solicit feedback, and iterate.

Rob
Received on Monday, 20 January 2014 20:18:24 UTC