RFC Clusters

From: Bjoern Hoehrmann <derhoermi@gmx.net>
Date: Sun, 04 Dec 2011 19:37:30 +0100
To: www-archive@w3.org
Message-ID: <n5end7toggp67entqgitmu9lf83o3snse6@hive.bjoern.hoehrmann.de>
Hello.

  http://lists.w3.org/Archives/Public/www-archive/2010Sep/0009.html is a
graph of all RFCs based on cross-references among them. I've ported the
code that determined the clusters in the graph to C++ and made a wrapper
for it, http://search.cpan.org/dist/Graph-NewmanGirvan/, so it can be
used in Perl code. I used that to make the attached document.

It is based on the same cross-reference data I used to make the earlier
graph. It shows how the algorithm subdivides the RFCs: the first level
is the individual tables, the second level is the columns, and the third
is the groups within each column. Within groups the titles are sorted by
the sum of the weights of incoming edges (the edge weight is 1 for all
edges, so it is the number of times the document is referenced). The
algorithm internally uses the sum of outgoing edge weights; I am
thinking about making that configurable.
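
As a rough illustration of the within-group ordering described above,
here is a minimal Python sketch. It is not the code behind the attached
document: the document names, the edge list, the function names, and the
choice of descending order with ties broken by name are all assumptions
made for the example.

```python
from collections import defaultdict


def incoming_weight(edges):
    """Sum the weights of incoming edges for every cited document."""
    totals = defaultdict(float)
    for _source, target, weight in edges:
        totals[target] += weight
    return totals


def sort_group(group, edges):
    """Order a group by incoming weight (descending is an assumption),
    breaking ties alphabetically so the result is deterministic."""
    totals = incoming_weight(edges)
    return sorted(group, key=lambda node: (-totals.get(node, 0), node))


# Hypothetical cross-references: (citing document, cited document, weight).
# Every weight is 1, so the sum is simply the number of citations.
edges = [
    ("doc-a", "doc-c", 1),
    ("doc-b", "doc-c", 1),
    ("doc-a", "doc-b", 1),
    ("doc-d", "doc-b", 1),
    ("doc-d", "doc-c", 1),
]
group = ["doc-a", "doc-b", "doc-c", "doc-d"]

ranked = sort_group(group, edges)
print(ranked)
```

Summing on the source instead of the target in incoming_weight would
give the outgoing variant, which is one way the choice could be made
configurable.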

Note that the input data does not consider, say, the difference between
normative and non-normative references. I mainly use this for things I
have better input data for, but RFC linkage is a good test case; indeed,
I decided to port the code precisely because it worked quite well in the
earlier study.

regards,
-- 
Björn Höhrmann · mailto:bjoern@hoehrmann.de · http://bjoern.hoehrmann.de
Am Badedeich 7 · Telefon: +49(0)160/4415681 · http://www.bjoernsworld.de
25899 Dagebüll · PGP Pub. KeyID: 0xA4357E78 · http://www.websitedev.de/ 

Received on Sunday, 4 December 2011 18:38:24 UTC
