Re: Is it the syntax? (was Re: RDF: XULing or Grueling) from Harry Halpin on 2007-10-08 (semantic-web@w3.org from October 2007)

From: Harry Halpin <hhalpin@ibiblio.org>
Date: Mon, 08 Oct 2007 07:56:23 -0400
To: Bijan Parsia <bparsia@cs.man.ac.uk>
Cc: 'Semantic Web' <semantic-web@w3.org>
Message-ID: <470A1AE7.5070208@ibiblio.org>
Bijan Parsia wrote:
>
> On Oct 6, 2007, at 12:01 PM, Harry Halpin wrote:
[snip]
>> Again, anything a desperate C hacker can't write a parser for quickly
>> is going to be a problem, and about 99 percent of people I know dislike
>> RDF (and the SemWeb in general) primarily because they find the RDF/XML
>> syntax user-hostile, and secondly because they view it as the use of
>> taxonomies to organize data.
> [snip]
>
> While I welcome a general discussion of marketing issues, again this
> doesn't seem to be on point at all. I mean, if it all were as simple
> as a syntax twiddle, I'm sure we'd all be happy. But I don't believe
> *anyone* *really* believes that. If we did, we'd be *INSANE* not to
> have fixed this by now.
But it's not really fixed now. It's mostly fixed - a la Turtle/N3.
But... if you're a desperate C hacker and you hear about this RDF thing,
and you go to the RDF Primer, yes - you see the frankly rather horrifying
RDF/XML syntax, and surrender immediately. To even understand the data
model issues you have to get beyond the syntax.
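
To make the "parser you can write quickly" point concrete, here is a
rough sketch for a deliberately tiny, N-Triples-like subset: one triple
per line, URIs in angle brackets, and plain double-quoted literals. It
assumes no escapes, language tags, datatypes, or blank nodes, so it is
not a conforming parser for N-Triples, Turtle, or N3. The function name
and the example.org data are made up for illustration.

```python
import re

# One triple per line: <subject> <predicate> (<object-uri> | "literal") .
TRIPLE = re.compile(
    r'^\s*<([^>]*)>\s+<([^>]*)>\s+(?:<([^>]*)>|"([^"]*)")\s*\.\s*$'
)


def parse_triples(text):
    """Parse the restricted line-based syntax described above.

    Returns a list of (subject, predicate, object) tuples, where object
    is tagged as ("uri", value) or ("literal", value).
    """
    triples = []
    for lineno, line in enumerate(text.splitlines(), start=1):
        # Skip blank lines and whole-line comments.
        if not line.strip() or line.lstrip().startswith("#"):
            continue
        match = TRIPLE.match(line)
        if match is None:
            raise ValueError(f"line {lineno}: not a triple: {line!r}")
        subj, pred, obj_uri, obj_lit = match.groups()
        obj = ("uri", obj_uri) if obj_uri is not None else ("literal", obj_lit)
        triples.append((subj, pred, obj))
    return triples


# Hypothetical sample data.
SAMPLE = '''
# two statements about one resource
<http://example.org/alice> <http://example.org/knows> <http://example.org/bob> .
<http://example.org/alice> <http://example.org/name> "Alice" .
'''

for triple in parse_triples(SAMPLE):
    print(triple)
```

The whole thing is one regular expression and a loop, which is roughly
the level of effort meant above; the real syntaxes need more than this.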

XML and JSON in no small part succeeded not only because they were
timely ideas (i.e. other communities were working on similar concepts,
but just didn't have a standard - a la Peter Buneman's work on
Semi-Structured Data [1]), but also because someone who wasn't even that
great of a coder could write up a parser pretty damn quickly, and
understand the general concept in about 10 minutes. Programmers are busy
people.

RDF can be explained in about 10 minutes (the basics), but the syntax is
really, really daunting, I would say for 99 out of 100 people. I have
rarely heard people say "directed graphs with URIs attached are the
wrong data model for my problem." They *could* - but many, I'm afraid,
give up beforehand after seeing the syntax. Syntax does matter - look at
Lisp - and Lisp is also a good poster-child for why standardization is
important.
>> Of course, RDF/XML came out the way it did
>> for good reason, and taxonomies are damn useful in some applications.
>> But still, I think better evangelism about RDF for data merger and
>> rubber-stamping the Turtle/N3 syntax would neatly solve 99 percent of
>> these complaints.
> [snip]
>
> I'll save a discussion of the weaknesses of RDF as a data merger
> solution (it's not a silver bullet; I had an interesting conversation
> with a CTO of a data merger consulting firm who uses XML, XSLT, and a
> home grown set of business concepts; his team goes in, dumps the
> relational db to xml; xslts it into the business concepts (for
> alignment), then xslts it out again into the new relational thingy;
> from 2 months or 2 years to 2 weeks; he *tried* doing the same thing
> with RDF and found it a mess; now, clearly something went wrong, but I
> hope this anecdote shows that it's possible to go wrong, perhaps very
> wrong, using RDF for data merging; it's also possible to go right with
> other technology for data merging; so, we run the risk with *blind*
> evangelism of setting up future big failures) for another day. But
> again, if we all believe that the syntax is 99 percent (or any
> significant fraction, say more than 10%) of the problem then we as a
> community are just broken.
I find RDF a pretty good data merger solution, in particular "graph
merge": to get two databases to talk together, converting them to RDF
and then graph merging is in general easier than any sort of
schema-mapping. You just can't do that with XML/JSON/relational
databases. The real problem I generally run into is moving the RDF back
to XML, i.e. one large problem is round-tripping between RDF, XML, and
any sort of relational database.
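
A minimal sketch of graph merge, assuming triples are plain Python
tuples and that both sources already use the same URI for the same
thing. The example.org URIs, the data, and the function names are made
up for illustration. It also ignores blank nodes, which have to be kept
apart when merging real RDF graphs.

```python
# Hypothetical data: two sources describing the same person under one URI.
# Each triple is (subject, predicate, object).
hr_db = {
    ("http://example.org/people/alice", "http://example.org/vocab/name", "Alice"),
    ("http://example.org/people/alice", "http://example.org/vocab/dept",
     "http://example.org/dept/sales"),
}
crm_db = {
    ("http://example.org/people/alice", "http://example.org/vocab/name", "Alice"),
    ("http://example.org/people/alice", "http://example.org/vocab/email",
     "alice@example.org"),
}


def graph_merge(*graphs):
    """Merge graphs of ground triples by set union; duplicates collapse."""
    merged = set()
    for graph in graphs:
        merged |= graph
    return merged


def describe(graph, subject):
    """Collect every (predicate, object) pair known about one subject."""
    return {(p, o) for (s, p, o) in graph if s == subject}


merged = graph_merge(hr_db, crm_db)
for predicate, obj in sorted(describe(merged, "http://example.org/people/alice")):
    print(predicate, obj)
```

No schema mapping step appears anywhere: the shared URI does the
joining, and the duplicated name statement simply collapses into one.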
> You don't need a WG to fix such a fundamental problem that would have
> such huge benefits.
>
> Finally, It's clear that you didn't read the things I pointed to.
> Neither one of those even MENTIONS the syntax. Not even in passing.
> They seem mostly concerned with implementation problems and with
> expressivity. My issues with using RDF as the uber syntax (or the uber
> abstract syntax) have *nothing* to do with RDF/XML and everything to
> do with the fact that it's very bad at representing syntax. (Note, I
> also HUGELY discourage people from using OWL as a grammar for syntax.
> That's not what it's for. Perhaps if we had integrity like constraints
> it would be better, but even then it's going to such in a lot of ways
> compared to something that has a better connection to concrete syntax
> issues.)
I pointed out the syntax because it's a huge problem that everyone
admits to inside the community (and we have our workarounds) but that
from outside appears to be totally broken. As for the nice post by
Bowers, I might add that he states: "I've worked with RDF several times now and each
time, I am confused as to how the RDF folk manage to take this simple
concept of a graph and make it so unbelievably complicated to use." [2]
His problem is not the data model of a graph itself, or URIs, but how
hard it was to use and a broken implementation.
>
> So, perhaps 99% of the people you encounter really are only hung up on
> the syntax. I don't know. Maybe those people are representative of the
> world. I don't know that, either. It seems probable, however, that the
> problems that Mozilla faced --- or RSS 1.0 faced --- have little to do
> with RDF/XML. Or, at least, I need some sort of evidence before I'd
> accept that. I'm trying to *reflect* on past evangelism gone wrong to
> learn what works and what doesn't. Indeed, we should want to know
> where RDF isn't the right choice so we don't look like fools for
> proposing it in the wrong circumstances.
If you can get past the syntax and look at the (simplified) data model,
I think it would be accurate to classify the data formats as:

1) RDF for directed graphs
2) XML for trees
3) JSON for associative arrays

You can hack XML to fit directed graphs (e.g. RDF/XML does precisely
that, and one could reinvent the wheel), you can hack XML to fit
associative arrays (it's just more syntax), and using JSON for trees is
just embedding the objects. Neither XML nor JSON can do the easy merger
RDF can. XML has better support for tools, and JSON came around at the
right time, when JavaScript was implemented correctly. XML is the only
one that can support mixed content.
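
As a rough illustration of why graphs need hacks to fit into trees, the
sketch below represents a graph as a set of edges and checks whether it
happens to be tree-shaped (one root, every other node with exactly one
parent, everything reachable from the root). The `ex:` names and the
`is_tree` helper are made up for this example.

```python
def is_tree(edges):
    """Return True if the (subject, predicate, object) edges form a rooted tree."""
    nodes = {s for s, _, _ in edges} | {o for _, _, o in edges}
    incoming = {n: 0 for n in nodes}
    children = {n: [] for n in nodes}
    for s, _, o in edges:
        incoming[o] += 1
        children[s].append(o)

    # A tree has exactly one root and no node with more than one parent.
    roots = [n for n in nodes if incoming[n] == 0]
    if len(roots) != 1 or any(count > 1 for count in incoming.values()):
        return False

    # Every node must be reachable from the root.
    seen, stack = set(), [roots[0]]
    while stack:
        node = stack.pop()
        if node not in seen:
            seen.add(node)
            stack.extend(children[node])
    return seen == nodes


# Tree-shaped data: nests directly as XML elements or JSON objects.
tree_like = {
    ("ex:doc", "ex:child", "ex:a"),
    ("ex:doc", "ex:child", "ex:b"),
    ("ex:a", "ex:child", "ex:c"),
}

# One extra edge gives ex:c two parents, so it is a graph and not a tree.
graph_like = tree_like | {("ex:b", "ex:child", "ex:c")}

print(is_tree(tree_like), is_tree(graph_like))
```

Once a node has two parents (or there is a cycle), a tree format has to
fall back on IDs and references, which is the kind of hack RDF/XML does.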

Another problem for RDF is the lack of any decent presentation and easy
"select then insert into XML", although Fresnel seems a good step in the
right direction [3].

Today, I'm working on an application that takes XML-centric data,
converts it to RDF using XSLT, merges it as RDF (which I find easier than
schema mapping), and then uses custom tools to convert it over to JSON
for presentation purposes - Simile [4] is barking up a useful tree there.
It's a lot of shims but I find it works well as a design pattern.
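
A toy sketch of that design pattern, under several assumptions: Python's
ElementTree stands in for the XSLT step, triples are plain tuples rather
than real RDF, and the XML vocabularies, the example.org namespace, and
all function names are invented for illustration. It is not the actual
application.

```python
import json
import xml.etree.ElementTree as ET

BASE = "http://example.org/"  # hypothetical namespace

# Two hypothetical XML sources that refer to the same book by id.
CATALOG_XML = """<books>
  <book id="b1"><title>Example Title</title></book>
</books>"""

REVIEWS_XML = """<reviews>
  <review book="b1"><rating>5</rating></review>
</reviews>"""


def catalog_to_triples(xml_text):
    """Shim 1: catalog XML to triples (stand-in for an XSLT transform)."""
    triples = set()
    for book in ET.fromstring(xml_text).iter("book"):
        subject = BASE + "book/" + book.get("id")
        triples.add((subject, BASE + "title", book.findtext("title")))
    return triples


def reviews_to_triples(xml_text):
    """Shim 2: review XML to triples, minting the same subject URIs."""
    triples = set()
    for review in ET.fromstring(xml_text).iter("review"):
        subject = BASE + "book/" + review.get("book")
        triples.add((subject, BASE + "rating", review.findtext("rating")))
    return triples


def triples_to_json(triples):
    """Shim 3: group triples by subject into a JSON object for presentation."""
    by_subject = {}
    for s, p, o in sorted(triples):
        by_subject.setdefault(s, {}).setdefault(p, []).append(o)
    return json.dumps(by_subject, indent=2)


# The merge itself is just a union of the two triple sets.
merged = catalog_to_triples(CATALOG_XML) | reviews_to_triples(REVIEWS_XML)
print(triples_to_json(merged))
```

The only coordination between the two conversion shims is that they mint
the same URI for the same book; everything else falls out of the union.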
> [snip]
>
> Cheers,
> Bijan.
>
I do think updating OWL with experience is a good idea, and the same
should be done for RDF once there's some agreement on the problems. "RDF:
Experience and Directions" Workshop anyone?

And I do thank Bijan for bringing up the issues.

[1]http://portal.acm.org/citation.cfm?id=263675
[2]http://www.jerf.org/resources/xblinjs/whyNotMozilla/notXulTemplates.html
[3]http://www.w3.org/2005/04/fresnel-info/
[4]http://simile.mit.edu/

-- 
		-harry

Harry Halpin,  University of Edinburgh 
http://www.ibiblio.org/hhalpin 6B522426
Received on Monday, 8 October 2007 11:56:36 UTC