Re: Size matters -- How big is the danged thing

On Fri, Nov 21, 2008 at 7:51 PM, Kingsley Idehen <kidehen@openlinksw.com> wrote:
>
> Yves Raimond wrote:
>>
>> On Fri, Nov 21, 2008 at 8:08 PM, Giovanni Tummarello
>> <giovanni.tummarello@deri.org> wrote:
>>
>>>>
>>>> Overall, that's about 17 billion.
>>>>
>>>>
>>>
>>> IMO considering myspace 12 billion triples as part of LOD, is quite a
>>> stretch (same with other wrappers) unless they are provided by the
>>> entity itself (E.g. i WOULD count in livejournal foaf file on the
>>> other hand, ok they're not linked but they're not less useful than the
>>> myspace wrapper are they? (in fact they are linked quite well if you
>>> use the google social API)
>>>
>>
>> Actually, I don't think I can agree with that. Whether we want it or
>> not, most of the data we publish (all of it, apart from specific cases
>> e.g. review) is provided by wrappers of some sort, e.g. Virtuoso, D2R,
>> P2R, web services wrapper etc. Hence, it makes no sense trying to
>> distinguish datasets on the basis they're published through a
>> "wrapper" or not.
>>
>> Within LOD, we only segregate datasets for inclusion in the diagram on
>> the basis they are published according to linked data principles. The
>> stats I sent reflect just that: some stats about the datasets
>> currently in the diagram.
>>
>> The origin of the data shouldn't matter. The fact that it is published
>> according to linked data principles and linked to at least one dataset
>> in the cloud should matter.
>>
>>
>>
>>>
>>> Giovanni
>>>
>>>
>>
>>
>>
>
> Yves,
>
> I agree. But I am sure you can also see the inherent futility in pursuing
> the size of the pure Linked Data Web :-)  The moment you arrive at a number
> it will be obsolete :-)
>
> I would frame the question this way: is LOD hub now dense enough for basic
> demonstrations of Linked Data Web utility to everyday Web users? For
> example, can we "Find" stuff on the Web with levels of precision and
> serendipity erstwhile unattainable? Can we now tag stuff on the Web in a
> manner that makes tagging useful? Can we alleviate the daily costs of Spam
> on mail inboxes? Can all of the aforementioned provide the basis for
> relevant discourse discovery and participation?

Sorry, this is getting too interesting to stay in lurker mode ;)

Kingsley, absolutely. We have got to that point. The fun part has begun.

To quote Jim, who started this thread:

http://blogs.talis.com/nodalities/2008/03/jim_hendler_talks_about_the_se.php

Go to about minute 28 (I can't listen to it here, I just blocked MP3s).
Jim touches on how a geo corpus can be used to disambiguate tags on Flickr.
This is one such use: low-hanging fruit given the huge amount of linked
data, and a first in terms of IT.

This was not possible last year!
It is now.

I guess that is THE question now: what can we do this year that we
couldn't do last year (thanks to the massive amount of available LOD)?

Best,
A

>
> --
>
>
> Regards,
>
> Kingsley Idehen       Weblog: http://www.openlinksw.com/blog/~kidehen
> President & CEO OpenLink Software     Web: http://www.openlinksw.com



-- 
Aldo Bucchi
U N I V R Z
Office: +56 2 795 4532
Mobile:+56 9 7623 8653
skype:aldo.bucchi
http://www.univrz.com/
http://aldobucchi.com


Received on Friday, 21 November 2008 23:03:30 UTC