Re: Deprecate most "native" RDF serializations from Manu Sporny on 2012-05-04 (public-rdf-wg@w3.org from May 2012)

From: Manu Sporny <msporny@digitalbazaar.com>
Date: Fri, 04 May 2012 11:08:12 -0400
To: public-rdf-wg@w3.org
Message-ID: <4FA3F0DC.1020304@digitalbazaar.com>
On 05/04/2012 04:23 AM, Andy Seaborne wrote:
> On 04/05/12 05:42, Manu Sporny wrote:
>> Please send a strong signal out to RDF authors deprecating the
>> following "native" RDF serializations:
>>
>> * RDF/XML
>> * N-Triples
>> * N3
>> * TriG
>> * N-Quads
>> * anything else that isn't TURTLE :)
>
> What is the current state? I seem to see Turtle and not RDF/XML these
> days but that isn't a systematic observation.

I don't know if this is the right question - I think the right question
is "What do we want the current state to be?". I'd imagine that we're
all fairly happy with TURTLE... and we all want a very clean message on
what people can use for native RDF markup - use TURTLE (or TURTLE Lite).

>> That is, something that we can point people to and say: W3C says
>> not to use that, don't use it for any future work, put a bullet in
>> the serialization.
>>
>> The only officially supported RDF serializations should be:
>>
>> * TURTLE
>> * TURTLE Lite
>
> Actually, I don't see N-Triples for "Turtle Lite" as causing
> problems. Developers get it and it stresses which subset of Turtle it
> is.

Developers don't understand that N-Triples and (possibly in the future)
N-Quads are subsets of TURTLE. If A is a subset of B, then why does A
have a completely different name from B? This is really bad marketing.

What we want is this: "What language do I use to write RDF"? "You use
TURTLE."

What would've been even better is having the core language and the data
model called the same thing, because developers rarely make the
distinction. You don't have the "PDQ data model, and you write it in
Python" - you just have the Python Language. Ideally, we would've had
the RDF Language.

>> TURTLE Lite would effectively be a subset of TURTLE - N-Quads, or
>> something that would be N-Quads-like (allowing for either "s p o"
>> or "s p o c" statements).
>
> We have discussed this. It is very helpful to know that a large set
> of data (10e6+ triples) is all triples, and not quads, ahead of time.
> Triples can be read into a single graph; quads can't. However it's
> done, a way to indicate that it's triples only is needed.

From an implementation perspective, yes. From a developer perspective
(which is where I'm coming from), I shouldn't have to care. Just suck
the data in and let me work with it.
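
To make the "just suck the data in" point concrete, here is a minimal
sketch of a reader that accepts either "s p o" or "s p o c" statements and
only afterwards reports whether everything landed in a single graph. It
assumes a deliberately simplified line grammar (not the full N-Triples or
N-Quads grammar), and the names parse_line, load and is_single_graph are
hypothetical, made up for this illustration.

```python
import re

# Simplified term pattern (an assumption, not the full grammar):
# an IRI, a blank node, or a literal with optional language tag or datatype.
TERM = re.compile(
    r'<[^>]*>'
    r'|_:[A-Za-z0-9]+'
    r'|"(?:[^"\\]|\\.)*"(?:@[A-Za-z-]+|\^\^<[^>]*>)?'
)


def parse_line(line):
    """Return a 3- or 4-tuple of terms, or None for blank/comment lines."""
    line = line.strip()
    if not line or line.startswith('#'):
        return None
    if not line.endswith('.'):
        raise ValueError(f"missing terminating '.': {line!r}")
    body = line[:-1].strip()
    terms = []
    pos = 0
    while pos < len(body):
        match = TERM.match(body, pos)
        if match is None:
            raise ValueError(f"unrecognised term at column {pos}: {line!r}")
        terms.append(match.group())
        pos = match.end()
        # Skip the whitespace separating terms.
        while pos < len(body) and body[pos].isspace():
            pos += 1
    if len(terms) not in (3, 4):
        raise ValueError(f"expected 3 or 4 terms, got {len(terms)}: {line!r}")
    return tuple(terms)


def load(lines):
    """Group statements by graph name; None stands for the default graph."""
    dataset = {}
    for line in lines:
        statement = parse_line(line)
        if statement is None:
            continue
        graph = statement[3] if len(statement) == 4 else None
        dataset.setdefault(graph, set()).add(statement[:3])
    return dataset


def is_single_graph(dataset):
    """True when every statement was a plain triple (default graph only)."""
    return set(dataset) <= {None}
```

The trade-off Andy describes is still visible here: is_single_graph can
only answer after the whole input has been read, which is exactly why an
up-front "triples only" signal helps implementations working at scale.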

> graphs != datasets.

This is going to be lost on most beginner developers. That is, people do
get this eventually... the 5% that you have left over after the 95% have
given up on RDF because it's so damn complicated.

>> The goal would be to simplify the dizzying array of options authors
>> have in front of them right now and send a clear message about what
>> they should be using and not using.
>
> That would be good. Your proposed mechanism ("only officially
> supported") would be OK if there wasn't so much deployed software,
> documentation and data.

I don't think this is a good reason not to do it. There were a plethora
of structured query languages in the 1970s for accessing a database...
SQL is the standard now. We shouldn't care about the amount of deployed
software today when we're planning for 20+ years down the road. I think
that we should be looking at where we want to be, and start moving in
that direction.

> From my viewpoint of linked data, there is a significant amount of
> data out there; a lot of people (not wg members) who have evangelised
> it to their organisations; and open source projects who implement the
> specs... That makes for a big cost of change and a big mess of a
> transition.

These sorts of transitions rarely happen quickly - change is almost
always costly and messy. While there is a good bit of Linked Data out
there right now, it pales in comparison to what will be published in the
future (if we get our collective RDF act together). Open source projects
do pay attention to direction from W3C and do take articles and advice
to heart when it comes from an expert community and that community is
saying: Use TURTLE - standardize on that.

To put it another way, the line of argumentation where we say "the state
of the art is already fragmented, so it's going to be hard to get
everybody on the same page" doesn't help us advance the state of RDF (or
any technology, really).

I think what we should be saying is this: we have a fragmentation
problem, there is a fairly clear answer to it, so let's start moving
toward that solution.

-- manu

-- 
Manu Sporny (skype: msporny, twitter: manusporny)
President/CEO - Digital Bazaar, Inc.
blog: PaySwarm Website for Developers Launched
http://digitalbazaar.com/2012/02/22/new-payswarm-alpha/
Received on Friday, 4 May 2012 15:08:56 UTC