Notes on Linked Data, terminology and marketing

"Linked Data" is a marketing term, a loose brand. "If you like
*links*, and you like *data*... you'll love what we're doing with
*linked* *data*! Check out our Formal Semantics for RDF!"

W3C has rebranded RDF at least 4 times now.

1. 1996's PICS spec was scheduled for improvements that led to the
Metadata Activity, PICS-NG.

See http://www.w3.org/TR/NOTE-pics-ng-metadata (check out the syntax,
http://www.w3.org/TR/NOTE-pics-ng-metadata#syntax)

"The broad goal is to define a metadata mechanism which makes no
assumptions about a particular application domain, nor defines the
semantics of any application domain. The definition of the mechanism
should be domain neutral, yet the mechanism should be suitable for
describing information about any domain."

2. In attempting this, the PICS-NG Working Group (see
http://www.w3.org/PICS/NG/ ) expanded beyond their original scope,
which was merely to "define a new format for labels in order to
better address [these issues]. This label format is based on the
original specification as defined in PICS-1.1. The new format will
permit non-numeric values (e.g., strings), structured values (e.g., an
author and corresponding affiliation) and element repeatability."

In doing so, they brought in additional ideas from other proposals,
most specifically MCF; http://www.w3.org/TR/NOTE-MCF-XML-970624/
http://www.w3.org/TR/NOTE-MCF-XML-970624/MCF-tutorial.html

MCF was a soup of assertions expressed as binary relations and written
in XML. The diagram in http://www.w3.org/TR/NOTE-MCF-XML-970624/#sec1.
should look familiar.

W3C couldn't call the new 1997 spec "PICS" since there was already a
big PR problem with PICS and anti-censorship critics (see e.g.
http://www.w3.org/PICS/PICS-FAQ-980126.html). Nor was 'MCF'
appropriate, since there were other proposals and ideas going into the
mix.

Hence "RDF": bland, but a new name, previously unclaimed and unused.

So by 1997, both PICS-NG and MCF are candidates for being "things that
were renamed RDF".

3. W3C launches RDF (1997-1999) to almost universal indifference from
the wider industry. It was widely disliked by XML enthusiasts, and
people were crazy for XML at the time, so it languished.

The Semantic Web Activity was then launched, to draw on research
funding to keep this work evolving and set it in a bigger context.

4. Semantic Web brand goes somewhat off the rails towards AI/KR.

A bunch of us who had been slogging along using RDF for more
data-oriented applications (DC, FOAF, SKOS, ...) felt increasingly
alienated from the "Semantic Web" brand. Conferences were rejecting
papers for not using enough complicated logic, journals were equating
"Semantic Web" with "complicated logic languages" rather than a
project to improve the Web. Even enthusiastic RDF implementors were
frustrated by this; and the situation wasn't doing a lot of good for
RDF's already poor image in the wider industry. An important part of
the technology (logic/inference) came to dominate its perception.

TimBL's http://www.w3.org/DesignIssues/LinkedData comes out, and it
was a great name, and a timely refocussing on the importance of
publishing and linking data in the Web.

So RDF effectively got rebranded again; this time as "Linked Data".

At that time, FOAF was by far the largest deployed use of linked RDF
documents: "This linking system was very successful, forming a
growing social network, and dominating, in 2006, the linked data
available on the web." ... and Tim's document gives a *lot* of space
to arguing for the importance of URIs everywhere, to make graph
merging easier. Some have even argued from time to time that data
without URIs for each node in the graph isn't really "Linked Data".
But we managed to move beyond technical nitpicking and Linked Data has
broadly settled down as something close to RDF without being a synonym
for it.

I think that's as good as we can get. We can't keep rebranding every
3-5 years. A good name helps, but we could've made a go of this thing
when it was called "PICS-NG", "Semantic Web", "RDF", or whatever. XML
and JSON and SQL are not successful because of their names, but
because they are useful.

It would be a mistake to try to fully rebrand RDF as "Linked Data", to
the extent of turning "Linked Data" into a new technical term, meaning
whatever "RDF" used to mean. Would RIF become the "Linked Data Logic
Language"? OWL the "Ontology Web Linkeddatalanguage"?

A lot of the energy in the Linked Data scene came from a sense of
"let's do this thing, ... get some data out there... see what happens
when we have lots of datasets that use the same standards". The
standards were a means to an end. Sometimes that energy came with an
inappropriately dismissive attitude towards the more mathematically-
and logically-inclined end of the Semantic Web technology spectrum.
Some of W3C's work on RDF and related standards
is necessarily going to be a bit complicated. Standards are like that.
"Linked Data" is a flag saying "hey, this can be fairly simple
too... just throw some data up there and structure it so it can be
merged again and queried...". We lose that appeal when we start adding
appendices about httpRange-14 and HTTP redirects and conneg and so
on.

When W3C calls a group, for example, "Data Access Working Group" rather
than "Knowledge Base Remote API Working Group", it is sending a signal
about the flavour of the work it's looking to standardise. Same for
http://www.w3.org/2012/ldp/wiki/Main_Page ... it's making a kind of
indirect promise, "we won't make this one too complex, honest"; "we
know some of you have found some of our specs a bit heavy sometimes,
but this time will be different". Which is why I see "Linked Data" as
a marketing / branding / public perception issue, rather than a
technical term. As soon as we start arguing earnestly about whether
some site is "really" Linked Data when it is only sending an HTTP 303
instead of a 302 (or vice-versa, I forget), then the game is already
lost.

Basically we're driving along a road somewhere. If we get too nerdy,
and tie "Linked Data" to some currently popular subset of RDF
deployment, we've lost it in detail and drive off the left side of the
road into the weeds. But if we get too broad, and happily endorse as
"Linked Data" anything that's publishing data in the Web using
identifiers to link records (MySQL dumps anyone?), we've lost it the
other way and end up in the complementary weeds on the other side of
the road. As technologies mature and get integrated into the
mainstream (e.g. the RDB2RDF work) then it becomes increasingly
plausible to talk about Linked Data in more contexts without the
phrase losing all meaning. Too nerdy, we lose mass appeal. Too
rhetorical, and we lose interoperability.

If I had to say what was "middle of the road" for me, w.r.t. Linked
Data, I guess I'd say:

 * emphasis on standards *and community* for practical data sharing,
where standards are a means to an end, not an end in themselves.
 * using basic RDF concepts - graph datamodel, URIs - and supporting
standards (RDFS/OWL to document vocabularies)
 * emphasis on getting stable, well-known identifiers into the data
as far as possible, to make data merging easier
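
To make the "shared identifiers make merging easier" point concrete,
here is a rough sketch in plain Python (no RDF library). It treats a
graph as a set of (subject, predicate, object) triples. The
example.org URIs and the helper names (merge, match, names_of_friends)
are made up for illustration; only the FOAF property names are real.
It also assumes every node has a URI or is a literal, so merging is
plain set union; real RDF merging needs extra care with blank nodes.

```python
# Hypothetical data: the example.org URIs are invented for this sketch.
FOAF = "http://xmlns.com/foaf/0.1/"
ALICE = "http://example.org/people/alice#me"
BOB = "http://example.org/people/bob#me"

# Two independently published datasets, each a set of triples.
site_a = {
    (ALICE, FOAF + "name", "Alice"),
    (ALICE, FOAF + "knows", BOB),
}
site_b = {
    (BOB, FOAF + "name", "Bob"),
    (BOB, FOAF + "homepage", "http://example.org/bob/"),
}


def merge(*graphs):
    """Merge graphs by set union; shared URIs make the nodes coincide."""
    merged = set()
    for g in graphs:
        merged |= g
    return merged


def match(graph, s=None, p=None, o=None):
    """Return triples matching a pattern; None acts as a wildcard."""
    return sorted(
        t for t in graph
        if (s is None or t[0] == s)
        and (p is None or t[1] == p)
        and (o is None or t[2] == o)
    )


def names_of_friends(graph, person):
    """Follow foaf:knows from person, then read each friend's foaf:name."""
    friends = [o for (_, _, o) in match(graph, s=person, p=FOAF + "knows")]
    return sorted(
        name
        for friend in friends
        for (_, _, name) in match(graph, s=friend, p=FOAF + "name")
    )


merged = merge(site_a, site_b)
print(names_of_friends(site_a, ALICE))  # site A alone has no name for Bob
print(names_of_friends(merged, ALICE))  # the merged graph does
```

Neither site knows the whole story, but because both use the same URI
for Bob, the union answers a question that neither dataset answers
alone. That is about as much machinery as the "just throw some data up
there" pitch needs.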

In those terms it's a marketing program for RDF-based information
sharing. This should not mean we get into "you're not Linked Data
because you've used xml:lang or rdf:datatype wrong" or httpRange-14
debates. It's rather our last, greatest chance of moving beyond our
nitpicking habits and refocussing on the bigger picture, which is
information sharing. And in the Web standards business, RDF is at the
heart of that picture.

Received on Saturday, 27 October 2012 20:23:30 UTC