Re: [Information Gathering] data augmentation, ratings (was Re: [Information Gathering] next steps: syndication, good weblocation) from Lee Feigenbaum on 2007-03-31 (public-sweo-ig@w3.org from March 2007)

From: Lee Feigenbaum <feigenbl@us.ibm.com>
Date: Sat, 31 Mar 2007 18:33:20 -0400
To: "Danny Ayers" <danny.ayers@gmail.com>
Cc: public-sweo-ig@w3.org
Message-ID: <OF0FA3BE29.6258F528-ON852572AF.007B5689-852572AF.007BE650@us.ibm.com>

"Danny Ayers" <danny.ayers@gmail.com> wrote on 03/31/2007 06:17:58 PM:

> On 31/03/07, Lee Feigenbaum <feigenbl@us.ibm.com> wrote:
> 
> [snip]
> 
> > What I care about and think is important for our education and outreach
> > efforts is for us to do the work to identify what the cream of the crop
> > SemWeb information resources are, and then organize them based on which
> > ones are most useful for which types of people. To do this, I believe that
> > we need to augment the existing information resources with:
> >
> > a/ some way to identify the best (this could be digg.com-style ratings,
> > google-style rankings (don't think we need that level of complexity), or
> > even just simple "best of breed" flags)
> 
> That could roughly be split into three different approaches according
> to how the data's generated:
> 
>    manually ("best of breed" flags)
>    algorithmically (linkrank etc - there's probably some existing open
> service that could help)
>    user feedback (digg etc)

Exactly!

> I suspect that's more-or-less in order of how hard and/or
> time-consuming each would be. It'd be undesirable for work on the
> fancier approaches to hold up publication in the simpler form, but I
> guess it could be built incrementally.

Exactly, again.

> (If Tom Heath's http://revyu.com was rebranded a little it could serve
> to provide user feedback, though it might take a long time to get a
> useful quantity of scores in).

Agreed -- it would be a great goal, but I don't think it's close enough 
for us to get valuable data from it this year (say).

> Hmm, "best of breed" would have to rely on someone's value judgements,
> does that sound ok? Maybe there's also something fairly objective
> nearby - maturity (age in years), activity (1/time since last
> release)..?

Well, it could be "multiple someones", such as "SWEO members," which is 
somewhat what I've pictured in the past. (Multiple people's value 
judgements, combined by voting, say, are more likely to be reasonable 
than a single editor's.) Maturity is interesting, but I imagine there's a 
great deal of mature material around the Web about the Semantic Web that 
might be better categorized as "stale" rather than "mature." :-) 

This has always been the sticking point when I've asked that we identify 
best of breed resources in previous SWEO discussions -- I don't have a 
foolproof answer, other than to say that I think it's worth the risk of 
potentially slighting someone in order to generate materials that are 
useful (and therefore limited in quantity and high in quality) to 
non-SemWeb folks.
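
To make the "multiple someones" voting idea a bit more concrete, here is 
a minimal sketch in Python. Everything in it is an assumption made for 
illustration: the voter names, the resource names, and the 
simple-majority rule are hypothetical, and nothing like this has been 
agreed by SWEO.

```python
from collections import Counter


def best_of_breed(votes, min_share=0.5):
    """Return the resources flagged by more than min_share of the voters.

    votes maps a voter name to the set of resources that voter flagged.
    """
    if not votes:
        return set()
    # Count how many voters flagged each resource.
    tally = Counter(res for flagged in votes.values() for res in flagged)
    threshold = min_share * len(votes)
    return {res for res, count in tally.items() if count > threshold}


# Hypothetical voters and resources, for illustration only.
votes = {
    "editor_a": {"rdf-primer", "sparql-tutorial"},
    "editor_b": {"rdf-primer"},
    "editor_c": {"rdf-primer", "sparql-tutorial", "old-faq"},
}
print(sorted(best_of_breed(votes)))
```

A resource flagged by only one voter out of three does not get the flag 
here, which is the sense in which a combined judgement is less 
idiosyncratic than a single editor's.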

> > b/ appropriate predicates and editorial work to associate information
> > resources with the appropriate audience that each is aimed at (both on a
> > technical capability level and on an industry/domain level)
> 
> Sounds very desirable & not unreasonable, as long as the effort needed
> can be kept within sane limits. Again, maybe sophistication by
> increments would be a good idea. I don't see any way of avoiding the
> design and/or selection of suitable predicates.

Right.

> But perhaps the editorial workload could be reduced by creating a
> questionnaire, asking the tool developers to fill it in themselves
> (which could even be rigged up to generate triples fed directly into
> the store, say tweak DOAP-a-matic a little).

I think this is a great idea. We'd probably still want to exercise 
editorial review over it (to prevent people from potentially attempting to 
draw all attention to their materials by classifying them in unreasonable 
ways), but that would be something that could evolve incrementally. Heck, 
this approach could work incrementally backwards and forwards:

1/ SWEO editor/editors/task force begins classification of prominent 
resources
2/ SWEO/community developer(s)/task force builds survey that lets resource 
publishers (more general than "tool developers" in my mind) classify their 
materials
3/ SWEO editors keep an eye on classifications and refine anything that 
seems amiss
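
As a rough sketch of how step 2 could feed step 3, the Python below 
turns one publisher's survey answers into N-Triples lines that could be 
loaded into a store. The Dublin Core title property is real; the 
"http://example.org/sweo-survey#" namespace, its "audience" and 
"reviewed" predicates, and the shape of the answers dictionary are all 
hypothetical placeholders, since suitable predicates have not been 
chosen yet.

```python
DC_TITLE = "http://purl.org/dc/elements/1.1/title"  # real Dublin Core property
EX = "http://example.org/sweo-survey#"              # hypothetical namespace


def literal(text):
    """Quote a string as a plain N-Triples literal."""
    escaped = (text.replace("\\", "\\\\")
                   .replace('"', '\\"')
                   .replace("\n", "\\n"))
    return f'"{escaped}"'


def survey_to_ntriples(resource_url, answers):
    """Turn one publisher's survey answers into N-Triples lines."""
    lines = [f"<{resource_url}> <{DC_TITLE}> {literal(answers['title'])} ."]
    for audience in answers.get("audiences", []):
        lines.append(f"<{resource_url}> <{EX}audience> {literal(audience)} .")
    # Self-classified entries start out unreviewed, so editors can find them.
    lines.append(f"<{resource_url}> <{EX}reviewed> {literal('false')} .")
    return lines


# Hypothetical survey submission, for illustration only.
for line in survey_to_ntriples(
    "http://example.org/tutorial",
    {"title": "An RDF tutorial", "audiences": ["developers", "life sciences"]},
):
    print(line)
```

The "reviewed" marker is just one way to let editors find 
self-classified entries that still need a second look.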

Lee 

> Cheers,
> Danny.
> 
> -- 
> 
> http://dannyayers.com

Received on Saturday, 31 March 2007 22:33:32 UTC