Re: OA and RDFa (Was: Annotation Serializations)

On 23 Jan 2014, at 17:20 , Paolo Ciccarese <paolo.ciccarese@gmail.com> wrote:

> 
> 
> 
> On Thu, Jan 23, 2014 at 11:06 AM, Robert Sanderson <azaroth42@gmail.com> wrote:
> 
> Hi Paolo, Leyla,
> 
> I agree. The first step towards consensus is: what can we do within existing standards and the current model, and what is the simplest thing we can arrive at that is almost acceptable to all parties?
> As Manu said in his post I referenced earlier, the consensus is for consensus :)
> 
> I wonder if we could re-engage Dan during the WG process towards furthering that consensus across the board? There are a couple of very simple constructs in schema.org, but maybe there's space to expand those a bit.  
> 
> And maybe (and I say maybe, as we will need to evaluate things better), if that is the case, just accept that not everything that is possible in OA is representable, or worth representing, in schema.org constructs. I am thinking of a 'lossy data serialization' of some sort... but still useful for search engines and for other purposes.

+1

ivan
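
A minimal sketch, in Python, of the 'lossy' profile idea quoted above: keep only what a search engine might use, plus a URI pointing back to the full annotation. The projection onto schema.org's Comment (text, about, url) is only an assumption for illustration, not an agreed mapping; search_profile and the example.com annotation URI are hypothetical.

```python
import json


def search_profile(annotation):
    """Project a full OA annotation (JSON-LD-like dict) onto a small, lossy record."""
    target = annotation.get("hasTarget", {})
    # A target may be a plain URI string or a SpecificResource object.
    source = target if isinstance(target, str) else target.get("hasSource")
    body = annotation.get("hasBody", {})
    text = body.get("cnt:chars") if isinstance(body, dict) else None
    profile = {
        "@type": "Comment",            # schema.org type mentioned in the thread
        "text": text,                  # embedded textual body, if any
        "about": source,               # the annotated document; selectors are dropped
        "url": annotation.get("@id"),  # pointer back to the full annotation
    }
    # Drop keys with no value so the record stays minimal.
    return {key: value for key, value in profile.items() if value is not None}


full = {
    "@id": "http://example.com/annotation/0001",  # hypothetical URI
    "@type": "Annotation",
    "hasBody": {
        "@type": "cnt:ContentAsText",
        "cnt:chars": "Annotations are at the Web's core.",
    },
    "hasTarget": {
        "@type": "SpecificResource",
        "hasSelector": {
            "@type": "TextQuoteSelector",
            "exact": "The process of tying two items together is the important thing.",
        },
        "hasSource": "http://example.com/sourcedoc.html",
    },
}

print(json.dumps(search_profile(full), indent=2))
```

The selector is deliberately lost here; a consumer that needs it would follow the url back to the full OA representation.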

> 
> Paolo
> 
> 
> 
> Rob
> 
> P.S. Also see Dan's comment to Manu's blog post.  And thanks Doug for the shout out to the CG :)
> 
> 
> 
> On Thu, Jan 23, 2014 at 7:15 AM, Paolo Ciccarese <paolo.ciccarese@gmail.com> wrote:
> The OA model was born for interoperability or "to provide a standard description mechanism for sharing Annotations between systems". We always talked about other possibilities like RDFa or even schema.org but that was not an immediate priority. As a curiosity, Dan Brickley (now Schema.org project lead) was present as invited expert at the very first Open Annotation CG face to face in Boston. We had some brief discussions but nothing more at the time. It was an early stage for us. 
> 
> In my mind, simplified serialization and full RDFa representation are just exercises for now and, even if related, are very different from talking about the relationship with schema.org. As Ivan said, that would require a different kind of evaluation/revision of the model. For instance, schema.org already has a 'Comment' feature (http://schema.org/Comment); does that overlap with OA, and if so, how?
> 
> Another way to approach these issues - just a morning idea and nothing more - would be to use the OA model as the interoperability model between systems and then provide simpler (and maybe incomplete) mechanisms/recipes/profiles that make it possible to convey *parts* of the annotation for different purposes. That way we can have, for instance, a profile for search engines with only the info that can be useful for search, whatever that is… If we make sure to inject a URI that points to the full annotation representation, even better.
> 
> If I inject an annotation back in the HTML and the target is a section of the document, there is no need for selectors and other parts of the model. Still, the simplified metadata could help in understanding the content better... and maybe point to the full representation... if that makes any sense...
> 
> 
> On Thu, Jan 23, 2014 at 7:28 AM, Ivan Herman <ivan@w3.org> wrote:
> 
> On 23 Jan 2014, at 08:08 , Doug Schepers <schepers@w3.org> wrote:
> 
> > Hi, Leyla–
> >
> > I think there's a lot of merit in that idea.
> >
> > That's actually the path I started down as part of my strawman, but then I realized that:
> >
> > 1) I was at risk of diverging so far from something recognizable as Open Annotations that it would probably be a bridge too far as a starting point for a conversation;
> >
> > 2) I actually preferred moving away from RDFa as much as possible into a mapping between OA entities and existing or possible HTML elements;
> >
> > 3) I was headed deep down the rabbit hole.
> >
> > So I popped my head back up and threw my strawman together with a bit of help from some RDF experts (butchering their advice so much that I won't implicate them).
> >
> > I have no objections to Rob, Paolo, and Ivan going through the exercise of making a proper RDFa characterization, though I'm uncertain I'd be skilled enough to contribute to that; but if nothing else, it could serve as a starting point to see if we can simplify to something more like what the average web developer might produce, and given the uptake of Schema.org vocabularies, I imagine that that could well be the end-point.
> 
> I think the usage of the schema.org vocabulary or not is a bit separate from RDFa.
> 
> Mapping OA onto schema.org is possibly doable, but may require changing some part of the OA model as a whole. I must admit I have not yet considered this, and I am not sure anybody did think about it in the OA CG; it is certainly something worth considering. I would expect that some portions of the OA model would be forced to go through some simplification by doing so.
> 
> Also, formally, an upcoming WG would not have the prerogative to define a standard mapping of OA to schema.org; that is the prerogative of the schema.org partners. They decide whether a schema.org vocabulary would be accepted into the main tree or not. That being said, our current contacts with the schema.org people are excellent, so working out something in practice is definitely doable.
> 
> The question of course is whether annotations are a worthy target for the search engines; after all, that is what schema.org is all about. If we decide to go down that route, we would have to contact them asap.
> 
> Leyla, thanks for having raised this.
> 
> Ivan
> 
> >
> > It might also be worthwhile getting some OA stuff into Schema.org, if it comes to that.
> >
> > Regards-
> > -Doug
> >
> > On 1/22/14 4:06 PM, Leyla Jael García Castro wrote:
> >> Hi all,
> >>
> >> Probably not related to what you have discussed in this thread, but to RDFa.
> >> I think schema.org <http://schema.org> is pretty common among RDFa
> >> users. Will we map OA to RDFa? There is no need from what I understand,
> >> but maybe it would be a good idea to check what is similar in there.
> >>
> >> Cheers,
> >> Leyla
> >>
> >>
> >> On Wed, Jan 22, 2014 at 1:54 PM, Paolo Ciccarese
> >> <paolo.ciccarese@gmail.com <mailto:paolo.ciccarese@gmail.com>> wrote:
> >>
> >>    Hi Ivan,
> >>    comments inline.
> >>
> >>
> >>    On Wed, Jan 22, 2014 at 5:30 AM, Ivan Herman <ivan@w3.org
> >>    <mailto:ivan@w3.org>> wrote:
> >>
> >>        Hey Paolo,
> >>
> >>        see below.
> >>
> >>        On 22 Jan 2014, at 04:48 , Paolo Ciccarese
> >>        <paolo.ciccarese@gmail.com <mailto:paolo.ciccarese@gmail.com>>
> >>        wrote:
> >>
> >>         > Dear Ivan and all,
> >>         > I would start from the example Ivan included in a previous
> >>        email
> >>        http://lists.w3.org/Archives/Public/public-openannotation/2014Jan/0038.html
> >>         > I also copied it to the Wiki
> >>        http://www.w3.org/community/openannotation/wiki/RDFa where we can
> >>        collect the results of the discussion.
> >>         >
> >>         > Here below a couple of initial considerations/exercises. Ivan
> >>        I am trying to understand this myself so please chime in and
> >>        explain where necessary.
> >>         >
> >>         > 1) Target
> >>         >
> >>         > Current:
> >>         > <blockquote property="hasTarget"
> >>         >           resource="http://example.com/sourcedoc.html"
> >>         >           cite="http://example.com/sourcedoc.html"
> >>         >           data-prefix="essential feature of the memex. "
> >>         >           data-suffix=" When the user is building a tra"
> >>        typeof="">
> >>         >     <p>The process of tying two items together is the
> >>        important thing.</p>
> >>         >     <footer>
> >>         >           - <cite>
> >>         >              <a
> >>        href="http://en.wikipedia.org/wiki/Vannevar_Bush">
> >>         >                 <span>Vannevar Bush</span>
> >>         >              </a>
> >>         >         </cite>
> >>         >     </footer>
> >>         > </blockquote>
> >>         >
> >>         > Using the distiller this becomes:
> >>         > ...
> >>         > "hasTarget": "http://example.com/sourcedoc.html"
> >>         > ...
> >>         >
> >>         > and not what I would have expected given Doug's example:
> >>         > "hasTarget": {
> >>         >     "@id": "http://example.com/specifictarget/0001",
> >>         >     "@type": "SpecificResource",
> >>         >     "hasSelector": {
> >>         >         "@id": "http://example.com/selector/0001",
> >>         >         "@type": "TextQuoteSelector",
> >>         >         "prefix": "essential feature of the memex. ",
> >>         >         "match": "The process of tying two items together is
> >>        the important thing.",
> >>         >         "suffix": " When the user is building a tra"
> >>         >     },
> >>         >     "hasSource": "http://example.com/sourcedoc.html"
> >>         > },
> >>         >
> >>         > Was that intentional? Were you (or Doug) thinking of using
> >>        'data-prefix' and 'data-suffix' as shortcuts?
> >>
> >>        Indeed. As I said back then, I was not sure what really the
> >>        intention of Doug was and how it could/should be expressed in OA
> >>        (ie, how to express the relationship to Bush.), so I skipped
> >>        over this. There were some separate mails on this since which I
> >>        did not follow in detail, unfortunately, but I guess that is
> >>        what you use below.
> >>
> >>         >
> >>         > This is an exercise for a more complete - and more complex! -
> >>        approach, which allows the user to add the
> >>        SpecificResource/Selector info if needed (only blockquote here):
> >>         >
> >>         > <blockquote property="hasTarget"
> >>        resource="http://example.com/specifictarget/0001"
> >>         >     typeof="SpecificResource"
> >>        cite="http://example.com/sourcedoc.html" >
> >>         >     <details>
> >>         >         <summary>Source</summary>
> >>         >         <a property="hasSource"
> >>        resource="http://example.com/sourcedoc.html"
> >>        href="http://example.com/sourcedoc.html">http://example.com/sourcedoc.html</a>
> >>         >     </details>
> >>         >     <div property="hasSelector"
> >>        resource="http://example.com/selector/0001"
> >>        typeof="TextQuoteSelector">
> >>         >         <p property="prefix" style="display: none;">essential
> >>        feature of the memex. </p>
> >>         >         <p property="exact">The process of tying two items
> >>        together is the important thing.</p>
> >>         >         <p property="suffix" style="display: none;"> When the
> >>        user is building a tra</p>
> >>         >     </div>
> >>         >     <footer>
> >>         >       - <cite>
> >>         >              <a
> >>        href="http://en.wikipedia.org/wiki/Vannevar_Bush">
> >>         >                 <span>Vannevar Bush</span>
> >>         >              </a>
> >>         >         </cite>
> >>         >     </footer>
> >>         > </blockquote>
> >>         >
> >>         > I am using 'details' defined as 'The details element
> >>        represents a disclosure widget from which the user can obtain
> >>        additional information or controls.' With the idea that the
> >>        source could be shown/hidden. Probably the 'summary' element can
> >>        be omitted.
> >>         >
> >>         > This markup - that could be simplified - generates the above
> >>        JSON-LD snippet.
> >>         > It is obviously more complicated.
> >>
> >>        Right. I have made it a little bit simpler below, but not much:
> >>
> >>                <blockquote property="hasTarget"
> >>                     typeof="SpecificResource"
> >>        cite="http://example.com/sourcedoc.html" >
> >>                     <details>
> >>                         <summary>Source</summary>
> >>                         <a property="hasSource"
> >>        resource="http://example.com/sourcedoc.html"
> >>        href="http://example.com/sourcedoc.html">http://example.com/sourcedoc.html</a>
> >>                     </details>
> >>                     <div property="hasSelector" typeof="TextQuoteSelector">
> >>                         <meta property="prefix" content="essential
> >>        feature of the memex.">
> >>                         <p property="exact">The process of tying two
> >>        items together is the important thing.</p>
> >>                         <meta property="suffix" content="When the user
> >>        is building a tra">
> >>                     </div>
> >>                     <footer>
> >>                       - <cite>
> >>                              <a
> >>        href="http://en.wikipedia.org/wiki/Vannevar_Bush">
> >>                                 <span>Vannevar Bush</span>
> >>                              </a>
> >>                         </cite>
> >>                     </footer>
> >>                 </blockquote>
> >>
> >>        One of the simplifications is a matter of choice: I would expect
> >>        blank nodes are perfectly all right for what you referred to as
> >>        .../001; unless other parts of the system need explicit URIs
> >>        for these, they are perfectly fine as blank nodes imho.
> >>
> >>
> >>    That looks a lot more readable.
> >>
> >>    In the spec we talk about blank nodes in regards to embedded textual
> >>    content.
> >>    The above would generate something like this:
> >>
> >>    "hasTarget": {
> >>         "@type": "SpecificResource",
> >>         "hasSelector": {
> >>             "@type": "TextQuoteSelector",
> >>             "exact": "The process of tying two items together is the
> >>    important thing.",
> >>
> >>             "suffix": "When the user is building a tra",
> >>             "prefix": "essential feature of the memex."
> >>         },
> >>         "hasSource": "http://example.com/sourcedoc.html"
> >>    },
> >>
> >>    Tim, Rob? Thoughts?
> >>
> >>
> >>        The other, more important change: when using RDFa, the <meta>
> >>        and <link> elements are valid everywhere in the document (much
> >>        as is the case in microdata), and I think that is a much cleaner
> >>        pattern than using a @style="display:none".
> >>
> >>
> >>    That is very good for 'quotes'. Sometimes (this is the case in my
> >>    application) prefix and suffix can be displayed as context for the
> >>    target match. In that case I would still probably stick to a <p> (or
> >>    maybe a <details> that can be shown or hidden).
> >>
> >>
> >>        Note that the extra complication here is conceptual and not OA
> >>        or RDFa specific: it is the stable URI/anchor issue for a quote
> >>        in another document. That issue, in general, should be part of
> >>        the WG (although I personally like the approach taken in OA,
> >>        which is pretty pragmatic...)
> >>
> >>
> >>    Agree.
> >>
> >>
> >>
> >>         >
> >>         > 2) Embedded textual body
> >>         >
> >>         > Current:
> >>         > <p property="hasBody" typeof=""><span
> >>        property="rdf:value">Annotations are at the Web's core.</span></p>
> >>         >
> >>         > According to specs - and forgetting dc:format for now - that
> >>        should be something like:
> >>         > <p property="hasBody" typeof="cnt:ContentAsText"><span
> >>        property="cnt:chars" >Annotations are at the Web's core.</span></p>
> >>         > or, assuming an RDFS-aware processor (as the domain of
> >>        cnt:chars is cnt:ContentAsText), just:
> >>         > <p property="hasBody" typeof=""><span
> >>        property="cnt:chars">Annotations are at the Web's core.</span></p>
> >>         >
> >>         > The above would generate something along the lines of:
> >>         > "hasBody": {
> >>         >     "@type": "cnt:ContentAsText",
> >>         >     "cnt:chars": "Annotations are at the Web's core."
> >>         > },
> >>         >
> >>         > Again, at this stage, these are incomplete exercises for
> >>        discussion purposes.
> >>         > Comments are more than welcome.
> >>         >
> >>
> >>        Yes, that is correct.
> >>
> >>        Unfortunately (see on the wiki) the @prefix="cnt: http://...."
> >>        is necessary in this case, which makes it a bit more complex
> >>        because the namespace has to be defined. This is not the case for
> >>        foaf, because that is part of RDFa's initial context, as a
> >>        widely used vocabulary, and therefore it can be used without
> >>        declaration. I wonder whether there are no alternatives to 'cnt'
> >>        that would make this simpler.
> >>
> >>
> >>    Clearly, in the case of RDFa, using pre-defined prefixes is convenient.
> >>    However, for now,  as the current spec is recommending using cnt, I
> >>    will just add this to the wiki (with a prefix declaration) and I
> >>    will write a note.
> >>
> >>
> >>        (The RDFa WG shied away from the definition of a @context of the
> >>        same power as in JSON-LD. We had it defined, there were actually
> >>        implementations for it at the time, but the issue of 'what
> >>        happens if the context is not available' became a stumbling
> >>        block, because it would have made the output of the RDFa
> >>        processing unusable. JSON-LD was probably less shy on this but,
> >>        also, I guess what prevailed was that the JSON content still
> >>        made some sense, so an unreachable @context may not have been
> >>        such a problem. I personally think it is a pity, but, well, that
> >>        is the way it is...)
> >>
> >>
> >>    I will update the wiki and shortly follow with other RDFa related
> >>    things,
> >>    thank you,
> >>    Paolo
> >>
> >>
> >>        Thanks!
> >>
> >>        Ivan
> >>
> >>
> >>         > Paolo
> >>         >
> >>         >
> >>
> >>
> >>        ----
> >>        Ivan Herman, W3C
> >>        Digital Publishing Activity Lead
> >>        Home: http://www.w3.org/People/Ivan/
> >>        mobile: +31-641044153 <tel:%2B31-641044153>
> >>        GPG: 0x343F1A3D
> >>        FOAF: http://www.ivan-herman.net/foaf
> >>
> >>
> >>
> >>
> >>
> >>
> >>
> >>
> >>    --
> >>    Dr. Paolo Ciccarese
> >>    http://www.paolociccarese.info/
> >>    Biomedical Informatics Research & Development
> >>    Instructor of Neurology at Harvard Medical School
> >>    Assistant in Neuroscience at Mass General Hospital
> >>    Member of the MGH Biomedical Informatics Core
> >>    +1-857-366-1524 <tel:%2B1-857-366-1524> (mobile) +1-617-768-8744
> >>    <tel:%2B1-617-768-8744> (office)
> >>
> >>    CONFIDENTIALITY NOTICE: This message is intended only for the
> >>    addressee(s), may contain information that is considered
> >>    to be sensitive or confidential and may not be forwarded or
> >>    disclosed to any other party without the permission of the sender.
> >>    If you have received this message in error, please notify the sender
> >>    immediately.
> >>
> >>
> >
> >
> 
> 
> ----
> Ivan Herman, W3C
> Digital Publishing Activity Lead
> Home: http://www.w3.org/People/Ivan/
> mobile: +31-641044153
> GPG: 0x343F1A3D
> FOAF: http://www.ivan-herman.net/foaf
> 
> 
> 
> 
> 
> 
> 
> 
> -- 
> Dr. Paolo Ciccarese
> http://www.paolociccarese.info/
> Biomedical Informatics Research & Development
> Instructor of Neurology at Harvard Medical School
> Assistant in Neuroscience at Mass General Hospital
> Member of the MGH Biomedical Informatics Core
> +1-857-366-1524 (mobile)   +1-617-768-8744 (office)
> 
> 
> CONFIDENTIALITY NOTICE: This message is intended only for the addressee(s), may contain information that is considered
> to be sensitive or confidential and may not be forwarded or disclosed to any other party without the permission of the sender. 
> If you have received this message in error, please notify the sender immediately.
> 

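For reference, the target structure discussed under "1) Target" in the quoted thread can be sketched in Python as below. This is only an illustration under stated assumptions: specific_target is a hypothetical helper, not part of any OA tooling, and it uses blank nodes (no "@id") for the SpecificResource and the selector, as suggested in the thread.

```python
import json


def specific_target(source, exact, prefix=None, suffix=None):
    """Build the JSON-LD-like 'hasTarget' structure for a quoted passage."""
    # The selector carries the quoted text and, optionally, its context.
    selector = {"@type": "TextQuoteSelector", "exact": exact}
    if prefix is not None:
        selector["prefix"] = prefix
    if suffix is not None:
        selector["suffix"] = suffix
    # No "@id" keys: both nodes are left as blank nodes.
    return {
        "hasTarget": {
            "@type": "SpecificResource",
            "hasSelector": selector,
            "hasSource": source,
        }
    }


target = specific_target(
    source="http://example.com/sourcedoc.html",
    exact="The process of tying two items together is the important thing.",
    prefix="essential feature of the memex.",
    suffix="When the user is building a tra",
)
print(json.dumps(target, indent=2))
```

If other parts of a system need explicit URIs for these nodes, an "@id" could be added to each dictionary; the sketch leaves that choice open.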

----
Ivan Herman, W3C 
Digital Publishing Activity Lead
Home: http://www.w3.org/People/Ivan/
mobile: +31-641044153
GPG: 0x343F1A3D
FOAF: http://www.ivan-herman.net/foaf

Received on Thursday, 23 January 2014 16:22:15 UTC