Re: naming dataset syntax from Kingsley Idehen on 2012-09-26 (public-rdf-wg@w3.org from September 2012)

From: Kingsley Idehen <kidehen@openlinksw.com>
Date: Wed, 26 Sep 2012 15:51:58 -0400
To: public-rdf-wg@w3.org
Message-ID: <50635CDE.9020808@openlinksw.com>
On 9/26/12 3:20 PM, David Wood wrote:
> Hi Arnaud,
>
> I appreciate your need for marketing simplicity.  However, please 
> consider this:
>
> RDF used to have one standard format (RDF/XML) which was, as you say, 
> overly complicated for many potential users.  Now we have two standard 
> formats (RDF/XML and RDFa).  Those serve very different communities 
> (enterprise XML developers and some Web developers).  We are now in 
> the process of defining either two additional standard formats (Turtle 
> and JSON-LD) or three (if we add TriG).  Again, the potential users of 
> those formats are different, but in each case we can parse the formats 
> as RDF.
>
> To my mind, that is a feature, not a bug.  We do not need to explain 
> each format to all users.  Instead, we need to figure out which kind 
> of user is in front of us and tell them about the format that most 
> closely suits their needs.

+1

Kingsley
>
> Regards,
> Dave
>
>
>
>
> On Sep 26, 2012, at 15:12, Arnaud Le Hors <lehors@us.ibm.com> wrote:
>
>> I realize this group is more interested in technical purity than 
>> marketing and that from a technical point of view using two different 
>> formats and names can be totally justified but I'd like to ask 
>> everyone to think about the bigger picture here.
>>
>> RDF is already plagued with the image of being an overly complicated 
>> technology and this is hindering its uptake in the industry. We 
>> really don't want to make things worse by introducing a bunch of new 
>> formats and names.
>>
>> In a private email Andy wrote to me:
>>
>> > A collection of graphs isn't itself a graph.
>> >
>> > A syntax for a collection of graphs isn't a syntax for a graph.
>>
>> This certainly makes perfect sense and is very simply put. As an 
>> engineer I can certainly appreciate the difference but as someone 
>> interested in helping adoption of RDF in the industry I just don't 
>> think this is worth introducing a whole new format and name.
>>
>> Turtle is providing us with something everyone can understand (unlike 
>> RDF/XML) and the name has been out there for a while now. We should 
>> try to build on that rather than start confusing things (again) with 
>> the introduction of multiple formats.
>>
>> Could we not simply have two different versions of Turtle with a way 
>> for programs to differentiate the two so that we can still only talk 
>> about Turtle?
>>
>> Regards.
>> --
>> Arnaud  Le Hors - Software Standards Architect - IBM Software Group
>>
>>
>> Sandro Hawke <sandro@w3.org> wrote on 09/26/2012 11:18:34 AM:
>>
>> > From: Sandro Hawke <sandro@w3.org>
>> > To: David Wood <david@3roundstones.com>,
>> > Cc: Arnaud Le Hors/Cupertino/IBM@IBMUS, W3C RDF WG <public-rdf-wg@w3.org>
>> > Date: 09/26/2012 11:19 AM
>> > Subject: naming dataset syntax
>> >
>> > On 09/26/2012 01:58 PM, David Wood wrote:
>> > Hi Arnaud,
>> >
>> > We agreed quite early (Feb 2011) to "use http://www.w3.org/2010/01/Turtle/
>> > as the starting point for the Turtle work" [1] and in April 2011 to
>> > limit syntactic sugar additions to Turtle [2].
>> >
>> > IIRC, we had substantial conversations regarding the desirability of
>> > turning Turtle into a quad language, but we decided (without
>> > resolution) not to do that because:
>> > - Turtle is widely fielded already
>> > - We wished to minimize disruption, as per our charter
>> > - Issues around datasets/quads were (and are) less agreed upon
>> >
>> >
>> > Yes, we agreed to get Turtle out the door as a language for Triples.
>> >
>> > So, now, what do we call a language that's like Turtle except it can
>> > also include datasets (that is, the triples can be segmented into
>> > named sections)?
>> >
>> > Frankly I expect this language to supplant Turtle as soon as it is
>> > well supported, as long as it doesn't do anything to exclude simple
>> > usage.   I think the kind of people who use Turtle (or RDF) are the
>> > kind of people who will want to segment and manage their data. But
>> > (1) I could be wrong, and (2) it may be a long time before it is
>> > well-supported, given how confused we are about it within the WG.
>> >
>> > So, myself, I'm split about what to call it.  Compared to me,
>> > however, the WG tends to lean more toward existing users and
>> > experts, over new users and non-experts, so I expect the WG to just
>> > go with "trig" unless someone makes a strong case for something else.
>> >
>> > (In my prototype coding, I called the hypothetical trig-like
>> > language "mugl", for MultiGraphLanguage.    If we start from a blank
>> > slate, we can probably do better than mugl or trig.)
>> >
>> >        -- Sandro
>> >
>> >
>>
>> > Regards,
>> > Dave
>> >
>> > [1] http://www.w3.org/2011/rdf-wg/meeting/2011-02-23#resolution_1
>> > [2] http://www.w3.org/2011/rdf-wg/track/issues/34
>>
>> >
>> > On Sep 26, 2012, at 12:42, Arnaud Le Hors <lehors@us.ibm.com> wrote:
>> >
>> > Hi Sandro,
>> >
>> > This discussion had already started when I joined the WG and as I
>> > caught it midstream I thought it was about extending Turtle. I've
>> > since then realized that this wasn't the intent and everybody seems
>> > to agree with that but I must admit that I still don't know why.
>> > Could you please explain or point me to some reference I could read
>> > to catch up on that?
>> >
>> > I have to say that the proliferation of formats for RDF makes me a
>> > bit nervous. This doesn't go along with making RDF simpler for the
>> > masses/industry and facilitating adoption.
>> >
>> > Thanks.
>> > --
>> > Arnaud  Le Hors - Software Standards Architect - IBM Software Group
>> >
>> >
>> > Sandro Hawke <sandro@w3.org> wrote on 09/25/2012 04:14:25 PM:
>> >
>> > > From: Sandro Hawke <sandro@w3.org>
>> > > To: W3C RDF WG <public-rdf-wg@w3.org>,
>> > > Date: 09/25/2012 04:14 PM
>> > > Subject: Dataset Syntax - checking for consensus
>> > >
>> > > I'm not sure how much progress we'll be able to make on dataset
>> > > semantics tomorrow, so I thought I'd draft some proposals on dataset
>> > > syntax.   The chairs can put this on the agenda if they like (but it's
>> > > too short notice for these decisions to be binding yet).  I'm thinking
>> > > it would be useful to see how close we are to agreement on these issues.
>> > >
>> > > If you follow up with votes, please use -1 for Formal Objection, 0 for
>> > > abstain, +1 for approve.   Numbers in between are fine, too.
>> > >
>> > > PROPOSED: We will produce a W3C Recommendation for a dataset syntax,
>> > > similar to TriG and to SPARQL's named graph syntax.
>> > >
>> > > PROPOSED: We'll request a media-type for this syntax which is different
>> > > from the media-type for Turtle.  (That is, we will not consider this
>> > > language to supplant Turtle and take over the name, becoming the new
>> > > "Turtle", as was once proposed.)
>> > >
>> > > PROPOSED: Our dataset syntax will allow for the expression of empty
>> > > named graphs, whatever their semantics might be (to be decided). The
>> > > syntax is an empty curly-braces expression, as in "<g> { }".
>> > >
>> > > PROPOSED: Our dataset syntax will have some standard mechanism (to be
>> > > determined within the next few weeks) through which a Dataset
>> > > serialization can include some RDF data about the Dataset (that is, some
>> > > metadata in the form of an RDF graph).
>> > >
>> > >
>> > > Below, there are groups of proposals which are alternative solutions to
>> > > a design issue.   If you approve of more than one of the alternatives,
>> > > please vote "+2" for your favorite.
>> > >
>> > > * Name of the dataset syntax
>> > >
>> > > PROPOSED: We will call our recommended dataset syntax "trig",
>> > > capitalized to Trig as needed.
>> > > PROPOSED: We will call our recommended dataset syntax "TriG", but
>> > > informally and in the media type, "trig".
>> > > PROPOSED: We will call our recommended dataset syntax "TriG", and use
>> > > that capitalization everywhere.
>> > >
>> > > * Use of equals sign, like <g> = { <s> <p> <o> } .  This is not in
>> > > SPARQL but is in traditional TriG, for compatibility with N3.
>> > >
>> > > PROPOSED: In our dataset syntax, a "=" MAY appear between the name and
>> > > the graph.
>> > > PROPOSED: In our dataset syntax, a "=" MUST appear between the name and
>> > > the graph.
>> > > PROPOSED: In our dataset syntax, a "=" MUST NOT appear between the name
>> > > and the graph.
>> > >
>> > > * Use of the "graph" keyword, which MUST be used in SPARQL and MUST NOT
>> > > be used in traditional TriG.
>> > >
>> > > PROPOSED: In our dataset syntax, the case-insensitive keyword "graph"
>> > > MAY appear before the name, in a name-graph pair.
>> > > PROPOSED: In our dataset syntax, the case-insensitive keyword "graph"
>> > > MUST appear before the name, in a name-graph pair.
>> > > PROPOSED: In our dataset syntax, the case-insensitive keyword "graph"
>> > > MUST NOT appear before the name, in a name-graph pair.
>> > >
>> > > * Use of curly braces { <a> <b> <c> } around the default graphs.  They
>> > > MUST be used in traditional TriG, and MUST NOT be used in SPARQL.
>> > >
>> > > PROPOSED: In our dataset syntax, triples of the dataset's default graph
>> > > MAY be surrounded by curly braces.
>> > > PROPOSED: In our dataset syntax, triples of the dataset's default graph
>> > > MUST be surrounded by curly braces.
>> > > PROPOSED: In our dataset syntax, triples of the dataset's default graph
>> > > MUST NOT be surrounded by curly braces.
>> > >
>> > > * Some designs for carrying metadata
>> > >
>> > > PROPOSED: In our dataset syntax, we'll say that metadata goes in the
>> > > default graph
>> > > PROPOSED: In our dataset syntax, we'll say that the default graph goes
>> > > inside curly braces and the metadata goes outside curly braces
>> > > PROPOSED: In our dataset syntax, we'll say that metadata goes inside a
>> > > set of curly braces after a keyword "meta".
>> > > PROPOSED: In our dataset syntax, we'll have a keyword "meta" followed by
>> > > "default" or the name of a named graph, to indicate to readers where the
>> > > metadata is.
>> > >
>> > > 
>
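
For reference, the name-graph pair variants in the proposals quoted above (optional "graph" keyword, optional "=", empty graphs as in "<g> { }") can be sketched roughly as below. This is only an illustration, not a TriG or SPARQL parser: the pattern PAIR and the helper describe_pair are hypothetical names, and the sketch assumes the graph name is an IRI in angle brackets and that the graph body contains no nested braces.

```python
import re

# Hypothetical pattern for a single name-graph pair, e.g.
#   GRAPH <g> = { <s> <p> <o> } .
# Assumptions: IRI name in angle brackets, no nested braces in the body.
PAIR = re.compile(
    r"""^\s*
        (?P<keyword>graph\s+)?      # optional case-insensitive "graph" keyword
        (?P<name><[^<>\s]*>)        # graph name as an IRI reference
        \s*(?P<equals>=)?\s*        # optional equals sign between name and graph
        \{(?P<body>[^{}]*)\}        # graph in curly braces, possibly empty
        \s*\.?\s*$                  # optional trailing dot
    """,
    re.IGNORECASE | re.VERBOSE,
)


def describe_pair(text):
    """Report which optional pieces a name-graph pair uses, or None if no match."""
    m = PAIR.match(text)
    if m is None:
        return None
    return {
        "name": m.group("name"),
        "graph_keyword": m.group("keyword") is not None,
        "equals_sign": m.group("equals") is not None,
        "empty_graph": m.group("body").strip() == "",
    }


examples = [
    "<g> { }",                    # empty named graph
    "<g> = { <s> <p> <o> } .",    # traditional TriG style, with "="
    "GRAPH <g> { <s> <p> <o> }",  # SPARQL style, with the keyword
]
for text in examples:
    print(text, "->", describe_pair(text))
```

Under a MAY reading of the "=" and "graph" proposals, all three examples would be accepted. A MUST or MUST NOT reading would amount to making the corresponding group required or removing it from the pattern.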


-- 

Regards,

Kingsley Idehen	
Founder & CEO
OpenLink Software
Company Web: http://www.openlinksw.com
Personal Weblog: http://www.openlinksw.com/blog/~kidehen
Twitter/Identi.ca handle: @kidehen
Google+ Profile: https://plus.google.com/112399767740508618350/about
LinkedIn Profile: http://www.linkedin.com/in/kidehen
Received on Wednesday, 26 September 2012 19:52:20 UTC