Re: Social data /syntax/ vs Social data /vocabulary/

I agree with the principle. We should consider what kind of vocabulary 
we are defining; due to the JSON-LD alignment, I lean towards an RDF OWL 
vocab*, especially as that way we can use the same JSON-LD mapping to 
translate the other vocabs we import into JSON.

My main reason for raising this, though, is that the existing AS2 
syntax doesn't really provide a strong basis for a long-term (i.e. 
storable) vocabulary. In an attempt to simplify the syntax, it has 
diluted the concept of an Object, which was quite strong in AS1. An AS1 
object is a discrete, atomic object; it should be named (by a unique 
ID), and each one stands on its own.

The AS2 notion of an object is a lot weaker. Unnamed objects are much 
more common. Worse, I can see lots of different instances of objects 
with the same ID occurring, and, out of necessity in implementing its 
alternatives concept, AS2 introduces lots of duplicated objects which 
refer to the same actual thing.

This is the reason I raised issue 11[1] on bringing back the Media Link 
concept from AS1 (as Media Source Objects); it may increase the 
complexity of the spec, but it reduces the conceptual complexity for an 
implementer (keeping straight all these subtly different usages of the 
same Object type). It also brings back a very useful property of AS1: in 
AS1, it's possible to place each object as a discrete entry in a 
database and build up a graph structure of all of your data, as long as 
each is named (this is why, BTW, my social API/federation protocol 
proposal requires every object be named).
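
To make that processing model concrete, here is a rough sketch in Python. 
It is only an illustration under my own assumptions: the flatten() helper, 
the urn:example: IDs and the in-memory dict standing in for a database 
are all made up, and nothing here is mandated by AS1 itself.

```python
def flatten(obj, store):
    """Store every named object found in `obj` in `store`, keyed by its id.

    Named objects are replaced by a bare reference ({"id": ...}) in their
    parent; unnamed objects cannot stand alone, so they stay embedded.
    `store` is a plain dict standing in for a database (an assumption).
    """
    flat = {}
    for key, value in obj.items():
        if isinstance(value, dict):
            flat[key] = flatten(value, store)
        elif isinstance(value, list):
            flat[key] = [
                flatten(item, store) if isinstance(item, dict) else item
                for item in value
            ]
        else:
            flat[key] = value

    if "id" not in flat:
        return flat  # unnamed: remains part of its parent

    # Merge, so a later bare reference never erases known properties.
    store.setdefault(flat["id"], {}).update(flat)
    return {"id": flat["id"]}


# Hypothetical AS1-style activity; all IDs are invented for the example.
activity = {
    "id": "urn:example:activity:1",
    "verb": "post",
    "actor": {
        "id": "urn:example:person:alice",
        "objectType": "person",
        "displayName": "Alice",
    },
    "object": {
        "id": "urn:example:note:1",
        "objectType": "note",
        "content": "Hello",
        "author": {"id": "urn:example:person:alice"},
    },
}

store = {}
ref = flatten(activity, store)
for object_id in sorted(store):
    print(object_id, store[object_id])
```

Each named object ends up as exactly one entry, and the entries point at 
each other by ID, which is the graph structure I mean. Note that this 
only works because every object that matters is named.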

This is a very useful processing model, and it translates well to RDF, 
which would allow the reuse of tools like SPARQL. While I would never 
want to mandate RDF or RDF processing (because we would scare off an 
awful lot of people), and indeed I think we should be very cautious 
about our messaging there ("You can process ActivityStreams using RDF if 
you like", for example), it is incredibly useful for processing graphs 
and, well, there is a reason people say social graph.
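
As a rough illustration of why this maps onto RDF so easily, the sketch 
below turns a store of named objects into (subject, predicate, object) 
triples and runs a simple pattern match over them. This is plain Python, 
not a real RDF library or SPARQL; the to_triples() and match() helpers, 
the property names used as predicates and the urn:example: IDs are all 
assumptions made for the example.

```python
def to_triples(store):
    """Turn {id: properties} into a set of (subject, predicate, object)."""
    triples = set()
    for subject, props in store.items():
        for predicate, value in props.items():
            if predicate == "id":
                continue  # the id is the node itself, not an edge
            if isinstance(value, dict) and "id" in value:
                value = value["id"]  # reference to another named node
            triples.add((subject, predicate, value))
    return triples


def match(triples, s=None, p=None, o=None):
    """Return the triples matching a pattern; None acts as a wildcard."""
    return sorted(
        t for t in triples
        if (s is None or t[0] == s)
        and (p is None or t[1] == p)
        and (o is None or t[2] == o)
    )


# Hypothetical store of named objects, each keyed by its ID.
store = {
    "urn:example:activity:1": {
        "id": "urn:example:activity:1",
        "verb": "post",
        "actor": {"id": "urn:example:person:alice"},
        "object": {"id": "urn:example:note:1"},
    },
    "urn:example:activity:2": {
        "id": "urn:example:activity:2",
        "verb": "like",
        "actor": {"id": "urn:example:person:bob"},
        "object": {"id": "urn:example:note:1"},
    },
}

triples = to_triples(store)

# "Which activities have this note as their object?"
about_note = match(triples, p="object", o="urn:example:note:1")
for triple in about_note:
    print(triple)
```

A real deployment would presumably hand this job to an RDF store and 
SPARQL rather than a list scan, but the shape of the data is the same.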

Basically, what I'm saying is that I think the principles and basis of 
AS1 were right, and while it may have some shaky aspects (e.g. the 
re-use of URL in different and even overlapping contexts), I don't think 
we should throw them out.

They're an eminently usable and practical vocabulary.

* The other reason is that both are graph data structures, where nodes 
are identified by URIs.
[1] https://github.com/jasnell/w3c-socialwg-activitystreams/issues/11
> James M Snell <mailto:jasnell@gmail.com>
> 09 September 2014 00:52
> My $0.02 is that we ought to do both, really. Essentially, start with
> AS2 as we've already decided, then map that to a semantic model that's
> defined separately.
> Owen Shepherd <mailto:owen.shepherd@e43.eu>
> 09 September 2014 00:31
> Spurred by a conversation in [1]
>
> Our WG charter says that one of our deliverables is
>
> Now, there is an open question of should we be defining a /syntax/ 
> or a /vocabulary*/?
>
> The difference is that a syntax is purely a transport format, whilst a 
> vocabulary is a /data model/. In particular, it should be possible to 
> usefully place data in a vocabulary in a database and have each named 
> object stand on its own.
>
> ActivityStreams 1, intentionally or not, defines a vocabulary; social 
> protocols based upon it tend to use it as both a transport format, and 
> to define the model used by their internal database.
>
> ActivityStreams 2, per the current specification, defines a syntactic 
> model. It does not make sense to store ActivityStreams 2 objects in a 
> database as discrete objects - they only make sense in context. 
> Meaningfully storing said data involves manually decomposing them into 
> some internal representation (which may involve detailed knowledge of 
> all of the types involved).
>
> My opinion on this is that we should define a vocabulary. I say this 
> especially as someone interested in the upper layers of the stack we 
> are chartered to build - that is, the social API and federation 
> protocols. I have a proposal[2] I’d like to bring to the committee in 
> the future, based upon experience and existing practice with 
> ActivityStreams 1, which covers both with a small and compact 
> specification, but this depends upon ActivityStreams 2 being able to 
> fulfil the role of a data model.
>
> The trade-off here is that we make the AS2 specification slightly more 
> complex - the current spec abstracts nearly everything away as an 
> “Object”. We would probably need to bring back something like the 
> “Media link” concept from AS1 (I prefer the term Media Source, to more 
> clearly explain the intent).
>
> But I feel it would be worth it - this simplifies the data model for 
> everyone interacting with the protocol, and makes it useful as a data 
> model. It would make the data much easier to rationalise, and help 
> clarify what data “stands alone” vs being an integral part of some 
> other object.
>
> * Technically I suppose a syntax is a subset of a vocabulary. The 
> question is if we should define a syntax which is a vocabulary, or 
> just a syntax.
> [1] 
> https://github.com/jasnell/w3c-socialwg-activitystreams/issues/11#issuecomment-53518263
> [2] My current very early working draft of which can be found here: 
> http://oshepherd.github.io/activitypump/ActivityPump.html
>
> - Owen
> [sorry for the delay in sending this - I’ve been busy]

Received on Tuesday, 9 September 2014 18:35:16 UTC