- From: Thomas Baker <tbaker@tbaker.de>
- Date: Mon, 28 Feb 2011 22:19:52 -0500
- To: public-xg-lld <public-xg-lld@w3.org>
Dear all,

In our report, we should consider Dan's eloquent reflections on the reception of RDF (below). For example: "If you're used to XML or SQL schema structures, the schema designer is typically (not necessarily) in a much more authoritative role. With RDFS we stripped a lot of power away from schema designers: they can't tell you what to do any more! There's no "a shipping order *must* have an address" mechanism in RDFS/OWL"...

Tom

On Mon, Feb 28, 2011 at 03:13:16PM -0500, Thomas Baker wrote:
> From: Thomas Baker <tbaker@TBAKER.DE>
> Subject: DanBri about the RDF "message"
> To: DC-ARCHITECTURE@JISCMAIL.AC.UK
>
> Dear all,
>
> I'd like to share some insightful comments from Dan Brickley about what has made the Semantic Web message more difficult to convey than some of us had expected.
>
> As the comments were made on a closed list, I have with Dan's permission removed the context from the excerpts below.
>
> Tom
>
>
> Dan was asked why it has taken since 1998 to get the world to understand what can be achieved with URIs and 3-tuple data representations. Dan's reply:
>
> Part of our problem, I fear, is that we have collectively tended to approach the situation with an essentially evangelical style.
>
> Time and again, this has got smart people interested and intrigued, and so they go try out some RDF tools.
>
> Very often this is a frustrating experience. And there are good technical reasons why working with RDF (* or any other '3-tuple based Structured Data Representation' *) will often be frustrating. The 3-tuple approach thrives in chaotic situations where data flows around, with bits missing, bits added, extensions and gaps everywhere. This kind of data is intrinsically rather annoying to deal with. There are workarounds and strategies (details on request :) but that frustration is inevitably core to the experience, because it is a set of problems the RDF data model was designed to engage with.
>
> So http://www.w3.org/DesignIssues/LinkedData.html marked a turning point when TimBL took FOAF's RDF linking model, improved it by demanding URIs everywhere (rather than our earlier bNodes and seeAlsos), and inspired mass publication of RDF data. Until we had data, few were RDF-curious. Now we have data, we can disappoint more curious new people per month than ever before. Or on a good day, make them happy.
>
> The Semantic Web project has delivered four specific things to the world so far: data, tools, community and standards.
>
> Because it grew from a standards organization, the tendency has been to focus on the standards, and what they do to improve the world - the 3-tuple model as seen in RDF, and the specs that build on top of it (SPARQL, RDFS/OWL etc.).
>
> Now standards are great, but they're pretty distant from solving day-to-day problems. And there are good reasons to believe that 3-tuple data structures will typically be annoying to use, as well as useful. They only really shine when multiple parties are using them in complementary ways, so that data can be usefully mixed and merged and extended and overlaid and so forth.
>
> So getting those big public, link-friendly datasets out there was a foundation for RDFy 3-tuple data becoming more useful than it was annoying. But it's still annoying for developers, trust me! Having solid standards with test cases (the RDFCore 2004 revision of RDF) was a good step forward, but still standards alone are not enough. The missing ingredients are tooling and community. Both of which we have, both of which we can always benefit from more/better. So communities like the RDF/SW interest group at W3C, like Lotico, like the LOD group which bridged W3C's scene with the outside world, these help new adopters make the most of the 3-tuple model. I've seen quite a few efforts burned by mis-applying RDF in contexts where it just wasn't important or useful to use it.
> That's natural with a newish technology. And I've seen smart developers frustrated by the lack of documentation, polish and guidance around our tooling. But the growing suite of RDF-oriented tools can't be ignored, and that's a key part of the technology's appeal.
>
> We have data, now, and that's enough to attract people. But as seen in discussions around eg. data.gov.uk, many mainstream developers see RDF, SPARQL and 3-tuples and associated tools as a hurdle or barrier that stands between them and data. In a way, they're right. We have all these standards and tools as a means to an end (sharing information, the Web's founding slogan http://www.w3.org/Illustrations/LetsShare.ai.gif "Let's share what we know"). RDF is not an end in itself.
>
> So imho the message should not be "we've found the best technical model for sharing data on a global scale - URI-linked 3-tuples!", but rather, that we have a global community committed to sharing data, tools, standards and their own experience and time in pursuit of solving problems through information linking. This doesn't mean that all tools need be opensource, nor all data public, but that there are common architectural principles giving coherence to all this data, all those tools...
>
> All the time we frame this as "RDF is 'easier/better' than [wonder-technology X]" we will lose. It's not. And nor is any vaguer notion of "3-tuples with URI" [...]. What we have here in the Semantic Web effort that is unique is a special combination of data, tooling, standards and community that simply can't be found anywhere else...
>
> And to a follow-up question on exactly what problems people and developers have with 3-tuples, or what they would rather have in their place...:
>
> I think it's not so much the 'what they get back' (API/format/model), but the whole framework of how we structure our data.
>
> If you're used to XML or SQL schema structures, the schema designer is typically (not necessarily) in a much more authoritative role. With RDFS we stripped a lot of power away from schema designers: they can't tell you what to do any more! There's no "a shipping order *must* have an address" mechanism in RDFS/OWL. For example, as editor of the FOAF vocab's RDFS I can never say anything in an imperative style in the schema, all I can do is define the meaning of the classes and properties in the FOAF namespace. Same for the Dublin Core team, for SIOC, etc. This permissiveness encourages re-use in lots of different ways.
>
> This is simultaneously critical for scaling to the Web, but also, as I say, annoying to be on the receiving end of. For developers trained in the idea that schemas tell you what is or is not an acceptable instance, RDF is strangely passive. The only formal way of screwing up in RDF is contradicting yourself. Someone could publish a FOAF-based RDF/XML document that was simply a collection of triples using 'foaf:homepage'. Even with bNodes on either side of the property. Or someone else might publish a bunch of <foaf:Image about="uri" dc:title="...."/> triples. The FOAF vocabulary facilitates this, and that is useful, but it also means that knowing the vocabulary is not itself enough for interop. You only get interop when a bunch of folk do things in roughly the same way; using the same triple patterns. There's a whole layer to do with characterising more specific triple patterns, 'idioms', that is essentially missing from our collective practice. There have been experiments in various directions towards characterising such patterns (eg. using SPARQL, see Schemarama...) but as a community we seem to act as if schemas are all that's needed.
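To make the point about 'idioms' concrete, here is a minimal sketch in Python. It is an illustration only, not anything from Dan's message: triples are modelled as plain tuples rather than with an RDF library, and the "person profile" idiom (a foaf:Person carrying both foaf:name and foaf:homepage) is a hypothetical application requirement invented for the example, as are the function name and the example.org URIs.

```python
# Triples are plain (subject, predicate, object) tuples; no RDF library is assumed.
FOAF = "http://xmlns.com/foaf/0.1/"
RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"
DC_TITLE = "http://purl.org/dc/elements/1.1/title"


def matches_idiom(triples, cls, required_props):
    """Return True if at least one subject typed `cls` carries every required property."""
    subjects = {s for (s, p, o) in triples if p == RDF_TYPE and o == cls}
    for subject in subjects:
        present = {p for (s, p, o) in triples if s == subject}
        if set(required_props) <= present:
            return True
    return False


# Hypothetical idiom an application might expect (not defined by FOAF itself).
PERSON_IDIOM = (FOAF + "Person", [FOAF + "name", FOAF + "homepage"])

# Legal use of the vocabulary: a bare foaf:homepage triple between two bNodes.
homepage_only = [("_:a", FOAF + "homepage", "_:b")]

# Also legal: an image description with a Dublin Core title.
image_titles = [
    ("http://example.org/pic", RDF_TYPE, FOAF + "Image"),
    ("http://example.org/pic", DC_TITLE, "A picture"),
]

# A graph that happens to follow the idiom the application expects.
person_profile = [
    ("_:p", RDF_TYPE, FOAF + "Person"),
    ("_:p", FOAF + "name", "Alice"),
    ("_:p", FOAF + "homepage", "http://example.org/alice"),
]

for label, graph in [
    ("homepage_only", homepage_only),
    ("image_titles", image_titles),
    ("person_profile", person_profile),
]:
    print(label, matches_idiom(graph, *PERSON_IDIOM))
```

All three graphs use the vocabularies legitimately, yet only the last one matches the pattern this particular consumer needs. A check of this kind sits alongside the schema rather than inside it, which is roughly the role the SPARQL-based experiments mentioned above were exploring.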
>
> As Edd Dumbill put it (http://times.usefulinc.com/#13:13 via http://danbri.org/words/page/27?sioc_type=user&sioc_id=22 )
>
> "Processing RDF is therefore a matter of poking around in this graph. Once a program has read in some RDF, it has a ball of spaghetti on its hands. You may like to think of RDF in the same way as a hashtable data structure -- you can stick whatever you want in there, in whatever order you want."
>
> This loose nature is the key at once to our success and to our problems. The analogy is with developers who, being used to nice (if a little brittle/rigid) OO models, are not always happy replacing everything with a chaotic hashtable. At least not unless we have a good set of unit tests. And what we're missing, by analogy, is just that. Nobody knows when they've been passed a 'good' RDF graph, versus one so uninformative, or expressed in such alien terminology, that it can't be used for the task at hand. So some of the essential ideas from non-RDF development just don't really make sense when using unconstrained triples. That leads to headaches, frustrations etc.

--
Thomas Baker <tbaker@tbaker.de>
Received on Tuesday, 1 March 2011 03:20:31 UTC