- From: Dan Brickley <danbri@danbri.org>
- Date: Sat, 13 Feb 2010 09:32:36 +0100
- To: Ying Ding <dingying@indiana.edu>
- Cc: Semantic Web <semantic-web@w3.org>, public-lod@w3.org
On Fri, Feb 12, 2010 at 8:22 PM, Ying Ding <dingying@indiana.edu> wrote:
> Hi,
>
> If you are interested to know the Semantic Web: Who is who from the
> perspective of Scopus and Web Of Science, recently we conduct a bibliometric
> analysis in this field
> (http://info.slis.indiana.edu/~dingying/Publication/JIS-1098-v4.pdf), which
> might be interesting to you.

It's interesting to see what a traditional - i.e. essentially pre-Web - citation analysis comes up with; however, I wouldn't leap so quickly to claim that this results in 'identifying the most productive players'.

A lot of key SemWeb infrastructure came about through non-academic collaboration: either industrial, or what we might call collaborations conducted online informally, 'Internet-style'. In fact I'd argue that the needs of the academic publication process have often been a retarding factor on this collaborative work. The traditionally-published academic literature is of course a key part of the story, but if you look at it alone you will end up with both a misleading sense of how things got this way, and -worse- misleading intuitions about how to get more involved and help further the project. This is why I bother to make a little fuss here.

The phrase 'Semantic Web' from ~2000 was essentially a rebranding of the then-unfashionable RDF technology. Prior to calling it RDF, the project was called PICS-NG. These days many call it 'Linked Data' instead.

From http://lists.w3.org/Archives/Public/sw99/ -> http://www.w3.org/1999/11/SW/Overview.html (Member-only link): 'We propose to continue the W3C Metadata Activity as a Semantic Web Development Initiative'.

But by this point, the base technology was already out there, both as a W3C Recommendation and as something in use: Netscape - the Google of its time - was using RDF already.
For example, back in October 1998 http://web.archive.org/web/19991002043750/www.mailbase.ac.uk/lists/rdf-dev/1998-11/0004.html R.V.Guha, then at Netscape, wrote "I still see this as a big and important use of RDF. This server answers over 2 million requests in RDF every day." ... "I do plan to fix the RDF, but thats with the next version of the browser (I have about 6M browsers out there which are depending on this older format)."

Any narrative that puts the start of Semantic Web history in 2000/2001 will confuse people as to where it came from: we had major browser buy-in 2-3 years previously, after all. And any narrative that omits the role of MCF - simply because it didn't come through the academic publication process - risks misleading 'emerging stars' about how to make an impact on the world rather than just on the citation databases.

Netscape bought into RDF because it grew from MCF, acquired from Apple with Guha. A reformulation of MCF to use an XML notation was one of the key inputs into the RDF design; see http://www.w3.org/TR/NOTE-MCF-XML/ and the earlier MCF White Paper http://www.guha.com/mcf/wp.html

Now MCF had significant mind-share and presence in the tech world back in 1996 - http://web.archive.org/web/20000815212707/http://www.xspace.net/hotsauce/ - and even grassroots adoption on sites that wanted to have a '3d fly thru' using Apple's then-cool visualization plugin. MCF was a direct ancestor to RSS (also originally an RDF-based Netscape product); it was triples-based, written in XML, and quite recognisable as RDF's precursor to anyone who reads the spec. The grassroots, information-linking style of MCF was one of the inspirations behind FOAF too.

However, it did not leave any footprint in the academic literature. We might ask why. Like much of the work around W3C and tech industry standards, the artifacts it left behind don't often show up in the citation databases. A white paper here, a Web-based specification there, ...
its influence cannot easily be measured through academic citation patterns, despite the fact that without it, the vast majority of papers mentioned in http://info.slis.indiana.edu/~dingying/Publication/JIS-1098-v4.pdf would never have existed.

In my experience, many of the discussions that shaped the early RDF and Semantic Web efforts were conducted online, using email, often also IRC chat, and as the years went by, increasingly in blogs and now microblogs. And many of the people who got a lot done were not employed in an academic setting where there was an institutionalised pressure to publish in certain kinds of places. This is not to belittle the critically important contributions that came from those employed in academia, just to note that the wave of interest and research funding that followed 2000/1 served largely to polish and promote ideas (and tools, specs) that had already reached prominence via Internet/Web/industry means. Without that academic buy-in and associated research funding, the Semantic Web project would surely be dead by now.

However, there is a continuing danger of confusing the real project --- a global collaboration to improve the Web's information-linking facilities --- with the activity of writing about it. The two are not the same; we need both, and the lack of useful modern impact metrics makes it easy to conflate the two.

It is not appropriate to entitle an academic citation analysis of the SemWeb project "Who is who in the field", not because of the bruised egos of those it omits, but because it risks misleading younger developers about how to make an impact on the world, rather than just on the literature. "Who cites whose paper?" might be a more accurate characterisation.

This is not a problem unique to the Semantic Web scene. All kinds of scientific collaborations (the Web's founding use case) can be conducted with greater speed thanks to the Web.
But impact analysis lags behind, making it hard for those who work openly, rapidly and collaboratively to show the merits of their approach. The same holds in Web standards: any account of recent developments in HTML should pay a lot of attention to Web browsers, to organizations like Mozilla, Microsoft, Opera, Apple, KDE, WebKit and to fora like #whatwg (an IRC channel), the whatwg- and W3C- mailing lists, and countless blogs where the future of HTML is being passionately debated. If you scan the academic literature concerning HTML5, it is a pale and much-delayed echo of the real debates.

It is hardly surprising that a technology community - HTML5 - devoted to improving the Web is also using it to conduct its discussions. I think you'll find, although perhaps to a lesser degree, the same also to be true of the Semantic Web project...

cheers,

Dan
Received on Saturday, 13 February 2010 08:33:10 UTC