Re: Friendly JSON serialization (Was: Annotation Serializations) from David Wood on 2014-01-22 (public-openannotation@w3.org from January 2014)

From: David Wood <david@3roundstones.com>
Date: Wed, 22 Jan 2014 11:40:47 -0500
To: Robert Sanderson <azaroth42@gmail.com>
Cc: "W. Nema" <waleed.nema@gmail.com>, Paolo Ciccarese <paolo.ciccarese@gmail.com>, public-openannotation <public-openannotation@w3.org>
Message-Id: <EC68CA5C-0228-407F-8BB0-22B64DD18FFB@3roundstones.com>
Hi all,

This is an important conversation.  Thanks for having it.

JSON-LD was developed to address exactly these issues.  Web developers get a JSON serialization that works with their existing tools and techniques, whereas RDF folks get the ability to map directly to the RDF data model.  Everyone wins.
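
As a rough illustration of that mapping, here is a minimal Python sketch. It is not a real JSON-LD processor: the `CONTEXT` dictionary and the `expand_keys` helper are hypothetical names made up for this example, and the key choices are a few entries from the strawman mapping quoted below (where "about" was later proposed to become "on"). It only shows how friendly JSON keys could resolve to the same OA predicate IRIs that an RDF tool would see.

```python
# Hand-rolled illustration only; a real application would use a JSON-LD processor.
OA = "http://www.w3.org/ns/oa#"

# Hypothetical context: friendly JSON key -> OA predicate IRI (subset of the strawman).
CONTEXT = {
    "about": OA + "hasTarget",
    "body": OA + "hasBody",
    "reason": OA + "motivatedBy",
    "user": OA + "annotatedBy",
    "time": OA + "annotatedAt",
}


def expand_keys(node):
    """Recursively replace friendly keys with predicate IRIs; unknown keys are kept."""
    if isinstance(node, dict):
        return {CONTEXT.get(key, key): expand_keys(value) for key, value in node.items()}
    if isinstance(node, list):
        return [expand_keys(value) for value in node]
    return node


friendly = {
    "@type": "Annotation",
    "body": "http://www.example.org/body1",
    "about": "http://www.example.org/target1",
}
expanded = expand_keys(friendly)
```

The web developer only ever touches `friendly`; the RDF side works with the IRIs in `expanded`.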

Regards,
Dave
--
http://about.me/david_wood



On Jan 22, 2014, at 10:44, Robert Sanderson <azaroth42@gmail.com> wrote:

> 
> Hi Waleed,
> 
> That's extremely important feedback, thank you!  
> 
> The cost of the mapping is reduced (somewhat) by being a context document in JSON-LD.  So the RDF developer who sees the JSON-LD serialization will automatically have it parsed into the same triples as if she had retrieved it in RDF/XML or Turtle.  The cost is, as you say, when you're comparing serializations in their native form, and oa:hasTarget is suddenly "on".  I guess my follow-up question is how important you think consistency between serializations is?
> 
> 
> Also, while we're discussing serialization friendliness ... my experience is that developers (my previous self included too) prefer well-documented and coherent libraries that implement an API or specification, rather than reading the specification or serialization format itself.  If I can "import openannotation as oa" and then do anno = oa.fetch_annotation(id) that's much easier than worrying about exactly what gets sent across the wire.  At which point, should energy instead be spent in designing and implementing easy to use libraries on top of the existing spec?
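
A minimal sketch of what such a library layer might look like, in Python. Everything here is hypothetical: no such "openannotation" package is assumed to exist, `Annotation` and `parse_annotation` are names invented for this example, and the network fetch is left out so that the sketch stays self-contained. The point is only that the caller works with attributes and never needs to know which key went across the wire.

```python
import json
from dataclasses import dataclass


@dataclass
class Annotation:
    """Hypothetical in-memory view of an annotation."""
    body: object
    target: object


def parse_annotation(text):
    """Build an Annotation from a friendly JSON serialization (illustrative helper)."""
    data = json.loads(text)
    # Accept either key proposed in this thread for oa:hasTarget.
    target = data.get("on", data.get("about"))
    return Annotation(body=data.get("body"), target=target)


wire = (
    '{"@type": "Annotation",'
    ' "body": "http://www.example.org/body1",'
    ' "on": "http://www.example.org/target1"}'
)
anno = parse_annotation(wire)
```
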
> 
> Thanks!
> 
> Rob
> 
> On Tue, Jan 21, 2014 at 9:56 PM, W. Nema <waleed.nema@gmail.com> wrote:
> Hi Rob,
> 
> I happened to be part of that developer community, albeit amateurish since I'm usually focused on business usage. I may be off here and certainly am no authority on this but my humble opinion is that the cost of maintaining mappings of a serialization to another/spec will probably outweigh the convenience and defeat the purpose of standardization. I'm not sure if we'd be setting a precedent for this spec/serialization vocab mismatch or if it's been done and found useful.
> 
> As a developer, say I'm comparing values from two different serializations (JSON vs. RDF) why should I have to first go through a mapping look-up?
> 
> It does make sense to simplify and reach out to make the model more easily usable. RDFa is great for that but anyway I just wanted to share this thought.
> 
> Waleed
> 
> 
> On Wed, Jan 22, 2014 at 6:21 AM, Robert Sanderson <azaroth42@gmail.com> wrote:
> 
> Hi Waleed,
> 
> I hear you! :)
> 
> The distinction, in my opinion, is that there are two communities with very different expectations about what makes a "good" vocabulary.  The RDF mindset and a web developer who is completely unfamiliar with the semantic web, but does know JSON, are unlikely to agree on appropriate terms for the fields.
> 
> As we already have a very strong RDF oriented community, the next step is to bring in new voices and perspectives, so the JSON-LD Context is a great way to both have our cake and eat it -- we can have the full ontology using terms appropriate for RDF, and have keys in JSON that are friendly to web developers.
> 
> Does that make sense?
> 
> Also, this exercise is to see how friendly we can get it without changing the model, and similarly for the RDFa exercise and HTML.  We're under no requirement to change the CG specs, and the WG is under no obligation to do anything more than consider them as input.
> 
> Rob
> 
> 
> 
> On Tue, Jan 21, 2014 at 8:10 PM, W. Nema <waleed.nema@gmail.com> wrote:
> I thought one important aspect of standardization is vocabulary. Why change it at all? If simplification is needed why not do it in the OA language too?
> 
> 
> On Tue, Jan 21, 2014 at 11:43 PM, Paolo Ciccarese <paolo.ciccarese@gmail.com> wrote:
> 
> 
> 
> On Tue, Jan 21, 2014 at 3:41 PM, Robert Sanderson <azaroth42@gmail.com> wrote:
> 
> 1. How about "on"?
> Then the body is a comment on the target, or a tag is a tag on the target. 
> 
> +1 
>  
> 
> 2. Agreed. Atom uses the term "generator". Twitter uses "source", and only has one timestamp, "created_at".
> Previously we've used the terms generator and generatedAt in preference to serializedBy/At, which would be one option.
> I would prefer not to use "publisher" as that would be confusing to companies that consider themselves digital publishers.
> 
> Agree, 'publisher' would create confusion. We want to keep it for the "right" scenario.
>  
> 
> 
> 
> On Tue, Jan 21, 2014 at 12:33 PM, Paolo Ciccarese <paolo.ciccarese@gmail.com> wrote:
> Rob,
> feedback on these two:
> 1) oa:hasTarget :       about
> Because of the inversion with Tags and Semantic Tags. 
> I would prefer 'target' or 'on' even if I know it can be confused with a timestamp.
> 
> 2) oa:serializedBy :    client
> Because I serialize the annotation in OA format mostly from a server.
> Just to avoid confusion I would change that. Not sure what to though (serializer? publisher?)
> I would also keep oa:serializedBy and oa:serializedAt aligned terminologically speaking (serializer/serialized or publisher/published).
> 
> I feel pretty comfortable with all the rest.
> 
> Best,
> Paolo
> 
> 
> 
> On Tue, Jan 21, 2014 at 1:54 PM, Robert Sanderson <azaroth42@gmail.com> wrote:
> 
> To kick this off with a strawman mapping:
> 
> 
> Annotation:
> 
> oa:hasTarget :       about
> oa:hasBody :         body
> oa:motivatedBy :    reason
> oa:annotatedBy :    user
> oa:annotatedAt :     time
> oa:serializedBy :    client
> oa:serializedAt :     published
> oa:styledBy :         stylesheet
> oa:equivalentTo :    copyOf
> 
> Resources:
> 
> cnt:chars :         content
> cnt:bytes :         bytes
> dc:format :         format
> dc:language :     lang
> foaf:page :         page
> rdfs:label :         label
> 
> 
> Agents:
> 
> foaf:name :        name
> foaf:mbox :        email
> foaf:homepage : homepage
> foaf:openid :       openid
> foaf:nick :          nick
> 
> 
> SpecificResources:
> 
> oa:hasSource :      full
> oa:hasSelector :    selector
> oa:hasState :         state
> oa:hasScope :        seenIn
> oa:styleClass :       style
> oa:when :               when
> oa:cachedSource :  cachedAt
> dct:conformsTo :     spec
> 
> 
> Selectors:
> 
> oa:start :  start
> oa:end :    end
> oa:prefix : prefix
> oa:suffix : suffix
> oa:exact : quote
> rdf:value : value
> 
> 
> Multiplicity:
> 
> NB -- I think we can simplify this part of the spec a lot by JUST using rdf:Lists.  So I present two proposals, one for the current spec and then for the proposed change.
> 
> 1. Current Spec.
> 
> oa:default : default
> oa:item : items
> rdf:first : first
> rdf:rest : rest
> 
> 
> 2. Proposal for simplification
> 
> We can make better use of the JSON list construction (and other serializations) by not having a multi-class object, and instead only ever using the list as an anonymous object of a predicate, say oa:items.  After the discussion last time, the serialization algorithm for JSON-LD was updated so our current spec does now work at least without randomly dropping out information.
> 
> Thus oa:Choice becomes:
> { "@type" : "Choice",
>   "items" : [default, firstOption, secondOption]
> }
> 
> And oa:Composite and oa:List collapse to:
> { "@type" : "List",
>   "items" : [first, second, third]
> }
> 
> 
> Classes:
> 
> * We predefine them all without their prefixes, and drop super-class names (eg oa:FragmentSelector --> Fragment)
> * Make a simpler term for SpecificResource of "Segment" -- anyone who knows the model will understand that they can also use State and Styles with it, even if it's not a "segment" per se.
> 
> --------
> 
> Thus, the simple example from the spec:
> 
> {
>     "@context": "http://www.w3.org/ns/oa-context-new.json",
>     "@type": "Annotation",
>     "body": "http://www.example.org/body1",
>     "about": "http://www.example.org/target1"
> }
> 
> And the more complex example:
> 
> {
>     "@context": "http://www.w3.org/ns/oa-context-20130208.json", 
>     "@id": "http://www.example.org/annotations/anno1", 
>     "@type": "Annotation",
> 
>     "time": "2012-11-10T09:08:07", 
>     "user": {
>         "@id": "http://www.example.org/people/person1", 
>         "@type": "Person", 
>         "email": "mailto:person1@example.org", 
>         "name": "Person One"
>     },
> 
>     "reason" : "commenting",
>     "body": {
>         "@type": "Text", 
>         "content": "This is part of our logo"
>     }, 
>     "about": {
>         "@type": "Segment", 
>         "selector": {
>             "@type": "Fragment", 
>             "value": "xywh=10,10,5,5",
>             "spec": "http://www.w3.org/TR/media-frags/"
>         }, 
>         "full": {
>             "@id": "http://www.example.org/images/logo.jpg", 
>             "@type": "Image"
>         }
>     }
> }
> 
> 
> 
> Thoughts?
> 
> Rob
> 
> 
> On Tue, Jan 21, 2014 at 3:18 AM, Ivan Herman <ivan@w3.org> wrote:
> @context rules:-)
> 
> Ivan
> 
> On 20 Jan 2014, at 21:26 , Robert Sanderson <azaroth42@gmail.com> wrote:
> 
> >
> > Doug,
> >
> > My experience is the same.  In IIIF [1], we specifically did NOT use the OA context mapping and went with something less RDFy and more in line with the domain.   We went even further than your suggestion, with "hasTarget" being just "on".  This is one great advantage of JSON-LD, that the serialization can be very different from the (abstract) data model while still enabling semantic interoperability.
> >
> > I could easily imagine something like:
> >
> > {
> >   "@type" : "Annotation",
> >   "reason" : "commenting",
> >   "body" : "http://example.net/body",
> >   "about" : "http://example.org/target"
> > }
> >
> > as being more palatable than the current RDF centric context.  Again, I think this is something that we can derive some criteria for, redesign, solicit feedback and iterate on.
> >
> > Rob
> >
> > 1 -- http://www.shared-canvas.org/datamodel/iiif/metadata-api.html#Annotation
> >
> >
> >
> > On Mon, Jan 20, 2014 at 12:13 PM, Doug Schepers <schepers@w3.org> wrote:
> > Hi, Rob–
> >
> > I couldn't agree more with Randall and Blaine. I was struck by how clumsy the predicate-like syntax felt to me; doubtless this is just an aesthetic from my background in JS, HTML, CSS, SVG, etc., but I think it's how most web developers would react as well.
> >
> > If there's a way to have terms that map from JSON-friendly syntax to RDF-friendly syntax, that would be really great (e.g., "hasBody" and "body" are equivalent; similarly for "hasTarget" -> "target", "annotatedBy" -> "annotator", "annotatedAt" -> "timestamp", and so on).
> >
> > Regards-
> > -Doug
> >
> >
> > On 1/20/14 1:07 PM, Robert Sanderson wrote:
> >
> > Second item from the discussion seems to be the availability of a web
> > developer friendly JSON serialization.
> >
> > Some background -- we have been asked for JSON serializations for at
> > least 3 years. Here is one such request of many, from 2011:
> > https://groups.google.com/forum/#!topic/oac-discuss/CSq9Jsdd3zk where we
> > kicked the can down the road waiting for JSON-LD to come along.
> >
> > Which it has, and is the recommended serialization for Open Annotation.
> >
> > So the question is not the JSON serialization's existence, but its
> > developer friendliness, and whether we can do any better while remaining
> > conformant with the JSON-LD specification.
> >
> > I think that would be a great discussion to have, or rather to restart,
> > as it was brought up in point 2 of this thread:
> > http://lists.w3.org/Archives/Public/public-openannotation/2013Apr/0015.html
> >
> >
> > Rob
> >
> 
> 
> ----
> Ivan Herman, W3C
> Digital Publishing Activity Lead
> Home: http://www.w3.org/People/Ivan/
> mobile: +31-641044153
> GPG: 0x343F1A3D
> FOAF: http://www.ivan-herman.net/foaf
> 
> 
> 
> 
> 
> 
> 
> 
> 
> -- 
> Dr. Paolo Ciccarese
> http://www.paolociccarese.info/
> Biomedical Informatics Research & Development
> Instructor of Neurology at Harvard Medical School
> Assistant in Neuroscience at Mass General Hospital
> Member of the MGH Biomedical Informatics Core
> +1-857-366-1524 (mobile)   +1-617-768-8744 (office)
> 
> CONFIDENTIALITY NOTICE: This message is intended only for the addressee(s), may contain information that is considered
> to be sensitive or confidential and may not be forwarded or disclosed to any other party without the permission of the sender. 
> If you have received this message in error, please notify the sender immediately.
> 
> 
> 
> 
>
Attachments

application/pkcs7-signature attachment: smime.p7s
Received on Wednesday, 22 January 2014 16:41:17 UTC